RouteExplainer: An Explanation Framework for Vehicle Routing Problem

#Vehicle Routing Problem

#Explainability

#Counterfactual Explanation

#Large Language Models

PAKDD 2024

Daisuke Kikuta*, Hiroki Ikeuchi, Kengo Tajiri, Yuusuke Nakano
NTT Corporation
*Contact:

What is Vehicle Routing Problem (VRP)?

The Vehicle Routing Problem (VRP) is a combinatorial optimization problem that aims to find the optimal routes for a fleet of vehicles to serve customers. Since Dantzig and Ramser [1] introduced its concept, various VRPs that model real-world problems have been proposed, imposing constraints such as time windows [2], vehicle capacity [1], and minimum prize [3]. Concurrently, various solvers have been proposed, ranging from exact solvers [4] to (meta-)heuristics [5], Neural Network (NN) solvers [6-9], and their combinations [10-12].
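To make the problem setting concrete, below is a minimal, illustrative Python sketch (ours, not from the paper) of how a TSPTW instance could be represented; the class name and fields are our own assumptions.

from dataclasses import dataclass
import math

@dataclass
class TSPTWInstance:
    coords: list          # (x, y) coordinates of each node; node 0 is the depot
    time_windows: list    # (earliest, latest) allowed service start time per node

    def travel_time(self, i, j):
        (x1, y1), (x2, y2) = self.coords[i], self.coords[j]
        return math.hypot(x1 - x2, y1 - y2)  # Euclidean travel time

instance = TSPTWInstance(
    coords=[(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)],
    time_windows=[(0.0, 100.0), (2.0, 10.0), (5.0, 20.0)],
)
print(instance.travel_time(0, 1))  # travel time between the depot and node 1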

Why does VRP need explainability?

Conventional solvers such as exact solvers and heuristics inherently possess algorithmic transparency, yet it is difficult to construct interpretable explanations of their outputs that summarize the complicated optimization processes. On the other hand, NN solvers are naturally black-box methods, and their outputs are not explainable as is. In contrast to the success of developing VRPs and their solvers, however, explainability for VRP still remains unexplored. Here, we argue that explainability for VRP is essential for responsible/interactive VRP applications. In particular, explaining the influence of each edge in a route is significant. For example, in emergency power supply routing, when asked why a vehicle went to rescue one location instead of others at a certain time, the subsequent influence of that movement (i.e., edge) allows the person in charge to justify the decision. In interactive tourist route generation, the influence of an edge in a generated route provides hints to tourists who want to modify the edge based on their preferences.


Proposed framework: RouteExplainer

We introduce RouteExplainer, the first post-hoc explanation framework for VRP, which explains the influence of each edge in a route with counterfactual explanation. Rethinking a route as a sequence of actions, we extend counterfactual explanation based on the action influence model [13] to VRP. Given a generated route (i.e., the actual route), a user asks a why and why-not question like "why was edge B-C selected at this step, and why not edge B-D?". Here, we call edge B-C the actual edge and edge B-D the counterfactual (CF) edge. RouteExplainer takes the actual route and the question, and then generates an explanation (answer) to that question as follows:

  1. A VRP solver generates a CF route, an alternative route in which the CF edge is selected at that step instead of the actual edge.
  2. An edge classifier predicts the intention of each edge in both the actual and CF routes. The classifier is trained on datasets labeled by a rule-based annotation and determines the intentions much faster than the annotation itself.
  3. The influences of the actual and CF edges are quantitatively compared based on representative values such as route length and the ratio of each edge intention.
  4. A Large Language Model (LLM) generates an explanation text, taking the comparison results into account in its input context. The generated explanation justifies why the VRP solver selected the actual edge rather than the CF edge at that step by comparing their pros and cons.
Our proposal includes the above framework/pipeline (a minimal sketch follows), a many-to-many sequential edge classifier, a new class-balanced loss function, an edge annotation strategy, and LLM-based explanation generation.
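To summarize, a hedged Python sketch of the four-step pipeline is shown below; solver, edge_classifier, and llm stand in for the VRP solver, the trained edge classifier, and the LLM, and all method and field names are illustrative placeholders rather than our actual API.

def explain_edge(instance, actual_route, step, cf_edge, solver, edge_classifier, llm):
    # 1. CF route: keep the edges before `step`, force the CF edge there, and re-solve the rest.
    cf_route = solver.solve(instance, fixed_prefix=actual_route[:step], forced_edge=cf_edge)

    # 2. Predict the intention of every edge in both routes.
    actual_intents = edge_classifier.predict(instance, actual_route)
    cf_intents = edge_classifier.predict(instance, cf_route)

    # 3. Representative values (here only total route length and intention ratios).
    comparison = {
        "actual_length": instance.route_length(actual_route),
        "cf_length": instance.route_length(cf_route),
        "actual_intent_ratio": {c: actual_intents.count(c) / len(actual_intents)
                                for c in set(actual_intents)},
        "cf_intent_ratio": {c: cf_intents.count(c) / len(cf_intents)
                            for c in set(cf_intents)},
    }

    # 4. The LLM turns the structured comparison into a textual justification of the actual edge.
    return llm.generate_explanation(comparison)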

Fig.1: The pipeline of RouteExplainer.

A many-to-many sequential edge classifier

We formulate predicting the intention of each edge as a many-to-many sequential classification, where the edge classifier takes a route (i.e., a sequence of edges) and sequentially classifies each edge. We propose a Transformer-based edge classifier, which consists of a node encoder, an edge encoder, and a classification decoder (Fig.2). The node encoder first converts the input features into higher-dimensional embeddings with a linear projection and subsequent Transformer encoders [14] (without positional encoding). Then, the edge encoder generates edge embeddings by concatenating three vectors: the two node embeddings of the edge's endpoints and the global states (e.g., current time). In the classification decoder, causal Transformer encoders (without positional encoding) take the edge embeddings, arranged in route order, as input tokens and generate the final embeddings of each edge. After a linear projection adjusts the dimension of the final embeddings to the number of classes (intention types), the decoder outputs the class probabilities of each edge by applying the Softmax to the adjusted embeddings.
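For concreteness, below is a condensed PyTorch sketch of this architecture; the layer sizes, feature shapes, and hyperparameters are illustrative assumptions, not those of the released model.

import torch
import torch.nn as nn

class EdgeClassifier(nn.Module):
    def __init__(self, node_dim, state_dim, d_model=128, n_classes=2, n_layers=2, n_heads=8):
        super().__init__()
        # Node encoder: linear projection + Transformer encoders (no positional encoding).
        self.node_proj = nn.Linear(node_dim, d_model)
        self.node_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True), n_layers)
        # Edge encoder: concat(source node emb, target node emb, global states) -> d_model.
        self.edge_proj = nn.Linear(2 * d_model + state_dim, d_model)
        # Classification decoder: causal Transformer over the edges in route order.
        self.edge_decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True), n_layers)
        self.out = nn.Linear(d_model, n_classes)

    def forward(self, node_feats, src_idx, dst_idx, global_states):
        # node_feats: (B, N, node_dim); src_idx, dst_idx: (B, T); global_states: (B, T, state_dim)
        h = self.node_encoder(self.node_proj(node_feats))                      # (B, N, d_model)
        idx = src_idx.unsqueeze(-1).expand(-1, -1, h.size(-1))
        src = torch.gather(h, 1, idx)                                          # source node embeddings
        idx = dst_idx.unsqueeze(-1).expand(-1, -1, h.size(-1))
        dst = torch.gather(h, 1, idx)                                          # target node embeddings
        edges = self.edge_proj(torch.cat([src, dst, global_states], dim=-1))   # (B, T, d_model)
        T = edges.size(1)
        causal = torch.full((T, T), float("-inf"), device=edges.device).triu(diagonal=1)
        logits = self.out(self.edge_decoder(edges, mask=causal))               # (B, T, n_classes)
        return logits.softmax(dim=-1)  # class probabilities per edge (use raw logits for training)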

We prepare datasets labeled by a rule-based annotation and train the edge classifier on them in a supervised manner. Some of our datasets exhibit step-wise class imbalance: one class (intention) is the majority in early steps, whereas another class becomes the majority as the steps progress. Existing class-balanced loss functions cannot handle this because they consider only batch-level imbalance. To address this, we propose a step-wise class-balanced cross entropy, which accounts for step-wise class imbalance by computing the balancing weights separately at each step.
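A hedged sketch of the step-wise class-balanced cross entropy follows; inverse class frequency is used as the per-step balancing weight purely for illustration, and the exact weighting used in our paper may differ.

import torch
import torch.nn.functional as F

def stepwise_class_balanced_ce(logits, labels, n_classes, eps=1e-8):
    # logits: (B, T, C) raw scores; labels: (B, T) class indices.
    B, T, C = logits.shape
    loss = 0.0
    for t in range(T):
        # Balancing weights are recomputed from the label distribution at this step only.
        counts = torch.bincount(labels[:, t], minlength=n_classes).float()
        weights = (counts.sum() + eps) / (n_classes * (counts + eps))
        loss = loss + F.cross_entropy(logits[:, t], labels[:, t], weight=weights)
    return loss / T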

Fig.2: Overview of the edge classifier.

A rule-based edge annotation

In this work, we propose a rule-based edge annotation. In particular, we describe here the rule for the Traveling Salesman Problem with Time Windows (TSPTW), where we assume that each edge is labeled as either "prioritizing route length" or "prioritizing time windows". Fig.3 shows an example of the annotation in a TSPTW route (top of Fig.3) and the general edge annotation algorithm (bottom of Fig.3). Consider the example of labeling the red edge in the TSPTW route. A VRP solver first solves the problem with the constraint removed (i.e., TSPTW minus time windows = TSP), keeping the past edges (i.e., the edges before the red edge) fixed. Then, we compare the red edge in the TSPTW route with the edge at the same step in the TSP route (i.e., the green edge). If they are identical, the red edge is labeled as "prioritizing route length"; otherwise, "prioritizing time windows". Intuitively, the green edge is optimal when only route length is considered, so if the red edge matches it, the red edge is considered to prioritize route length. Otherwise, the red edge is considered to have deviated from the green edge to adhere to the time windows rather than to optimize route length. See our paper for the rules for other VRPs.
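The following Python sketch illustrates this rule for a single edge; the route is assumed to be a node sequence, and solve_relaxed_with_fixed_prefix is an illustrative placeholder for re-solving the relaxed problem (TSP, i.e. time windows removed) with the already-traversed prefix fixed.

def annotate_edge(instance, tsptw_route, step, solve_relaxed_with_fixed_prefix):
    # Re-solve the relaxed problem (no time windows), fixing the nodes visited before the edge at `step`.
    tsp_route = solve_relaxed_with_fixed_prefix(instance, tsptw_route[: step + 1])

    # Compare the edge at `step` in the constrained route with the edge at the same step in the relaxed route.
    actual_edge = (tsptw_route[step], tsptw_route[step + 1])
    relaxed_edge = (tsp_route[step], tsp_route[step + 1])
    return "prioritizing route length" if actual_edge == relaxed_edge else "prioritizing time windows"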

Fig.3: Edge annotation for TSPTW (top) and the general edge annotation algorithm (bottom).

Explanation generation

RouteExplainer justifies the actual edge by comparing the influences of the actual and CF edges. To quantify the influence of an edge while considering its impact on the subsequent edges, we introduce the Edge Influence Model (EIM), an extension of the action influence model (AIM) [13] to VRP. AIM is an extension of the Structural Causal Model (SCM) [15] to actions in reinforcement learning, which replaces the variables and directed edges in a Directed Acyclic Graph (DAG) with environmental states and actions, respectively (l.h.s. of Fig.4). The authors of [13] leverage AIM to generate counterfactual explanations for a specific action taken by a model, where that action is justified by comparing its subsequent states with those of a counterfactual action.

Rethinking a route as a sequence of actions (i.e., edges/movements between two nodes), we extend AIM to EIM, which replaces the environmental states and actions in AIM with the nodes and edges of a VRP instance (r.h.s. of Fig.4). Each node in EIM possesses environmental states at the time it is visited (we call them global states). For TSPTW, the global state is the intermediate route length/travel time, and the structural equation between the previous and next global state is defined as \(S_{(i, t+1)}=S_{(j, t)}+d(v_{i}, v_{j})\), where \(S_{(i, t)}\) is the global state of the \(i\)-th node when visited at step \(t\) and \(d(v_{i}, v_{j})\) is the route length/travel time of the edge between the \(i\)-th and \(j\)-th nodes. The set of causal paths in EIM (red paths in the r.h.s. of Fig.4) corresponds to a route.
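As a small numeric illustration (ours, not from the paper), the global states can be accumulated along the causal path (i.e., the route) by repeatedly applying this structural equation:

def global_states(route, dist):
    # route: node sequence [v_0, v_1, ...]; dist[j][i]: travel time of the edge between nodes j and i.
    states = [0.0]  # global state at the first node, visited at step 0
    for v_j, v_i in zip(route[:-1], route[1:]):
        states.append(states[-1] + dist[v_j][v_i])  # S_{(i, t+1)} = S_{(j, t)} + d(v_i, v_j)
    return states

dist = {0: {1: 3.0, 2: 5.0}, 1: {0: 3.0, 2: 2.0}, 2: {0: 5.0, 1: 2.0}}
print(global_states([0, 1, 2], dist))  # [0.0, 3.0, 5.0]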

Based on EIM, we define the influence of the edge at step \(t\) in a route as the tuple of the global states and edge intentions subsequent to that edge: \(\boldsymbol{I}_{e_t} = \left(S_{(\pi_{t+1}, t+1)}, \hat{y}_{t+1}, S_{(\pi_{t+2}, t+2)},\dots, \hat{y}_{T-1}, S_{(\pi_{T}, T)}\right)\), where \(\hat{y}_{t}\) is the intention of the edge at step \(t\). To make this more understandable to humans, we further define representative values that reflect \(\boldsymbol{I}_{e_t}\) with the minimum necessary information. For TSPTW, we select four representative values calculated from \(\boldsymbol{I}_{e_t}\): the short-term effect \(S_{(\pi_{t+1}, t+1)}\), the long-term effect \(S_{(\pi_{T}, T)}\), the intention ratio, and the infeasibility ratio. RouteExplainer explains the influence of the actual edge by comparing (taking the difference of) the representative values of the actual and CF edges. See our paper for the detailed formulation. Next, we describe how to generate the final explanation text with LLMs.
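As an illustration, the four representative values could be computed from the subsequent global states and predicted intentions as sketched below; the field names and the infeasibility definition used here (the share of subsequent nodes whose time-window due time is violated) are our own assumptions, not the paper's exact formulation.

def representative_values(states, intents, time_windows, route, step):
    # states[k]: global state at step k; intents[k]: predicted intention of the edge at step k.
    later_intents = intents[step + 1:]
    later_steps = range(step + 1, len(route))
    return {
        "short_term": states[step + 1],   # S_{(pi_{t+1}, t+1)}
        "long_term": states[-1],          # S_{(pi_T, T)}
        "intention_ratio": {c: later_intents.count(c) / max(len(later_intents), 1)
                            for c in set(later_intents)},
        "infeasibility_ratio": sum(states[k] > time_windows[route[k]][1] for k in later_steps)
                               / max(len(route) - step - 1, 1),
    }

def compare_influences(actual_vals, cf_vals):
    # Differences of the scalar representative values (actual minus CF); fed to the LLM as context.
    return {k: actual_vals[k] - cf_vals[k] for k in ("short_term", "long_term", "infeasibility_ratio")}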

Fig.4: Action Influence Model vs. Edge Influence Model (ours).

In-context learning on GPT-4

RouteExplainer leverages an LLM to generate the final explanation text. We use GPT-4 without additional finetuning and accomplish the explanation generation by in-context learning. Fig.5 shows the system prompt for generating explanations based on RouteExplainer. The comparison of the representative values of the actual and CF edges' influences is embedded into {comparison_results} in the system prompt, whose format is the same as the example in the system prompt. The system prompt first clarifies the task with a description of RouteExplainer and the terminology. It then instructs the LLM to justify the actual edge while mentioning the pros and cons of the actual and CF edges, which are determined by comparing the four representative values of those edges. For example, the smaller the short/long-term effect (i.e., intermediate/total travel time) and the number of unvisited nodes (i.e., infeasibility ratio), the better. The subsequent class ratio (i.e., intention ratio) is basically a neutral metric, but it gives insights into the edge's influence. The supplementary information allows the LLM to make the pros and cons more detailed by relating them to that information. Prompt-engineering techniques used here include role-play prompting (clarifying the task), one-shot prompting (showing an example of the expected output), and optimizing prompt placement (placing important prompts at the beginning or end of the whole prompt).
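As a minimal sketch of this step (assuming the openai>=1.0 Python client; the prompt template and variable names are placeholders), the generation call might look as follows:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_explanation(system_prompt_template, comparison_results, question):
    # Embed the comparison of representative values into the system prompt (Fig.5).
    system_prompt = system_prompt_template.format(comparison_results=comparison_results)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},  # the why / why-not question
        ],
        temperature=0.0,  # keep the explanation as deterministic as possible
    )
    return response.choices[0].message.content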

Fig.5: Our system prompt for generating explanation text.

Evaluation

We quantitatively evaluate our edge classifier on Actual-Route-Datasets and CF-Route-Datasets, which include the labels of each edge in actual routes and CF routes of different sizes for four VRPs: the Traveling Salesman Problem with Time Windows (TSPTW), the Prize Collecting TSP (PCTSP), PCTSP with Time Windows (PCTSPTW), and the Capacitated VRP (CVRP). Table 1 shows the Macro-F1 (MF1) and calculation time of edge classification for each problem, where LKH, Concorde, and ORTools (the annotation strategy with different solvers) are baselines and \(\rm{EC}^*_*\) is the family of our edge classifiers (the second block includes model ablations and the third includes variants of the loss function). In short, our edge classifier outperforms the baselines by a large margin in terms of calculation time while maintaining a reasonable MF1 (e.g., 85-90%) on most problems. The results demonstrate its capability to handle a huge number of requests in real-time applications. See our paper for more details of this evaluation, including the annotation strategy, dataset statistics, baselines, and ablations.

Table 1: Quantitative evaluation of the edge classifier (Macro-F1 and calculation time).


We also conduct a qualitative evaluation of RouteExplainer on a Kyoto tour (a case study of TSPTW). Fig.6 shows the explanation generated by RouteExplainer, which includes an explanation text generated by GPT-4 and visual information (i.e., the route and the intention of each edge in the route) for the actual and CF routes. It successfully explains the importance of the actual edge while mentioning the differences between the actual and CF edges regarding their short/long-term effects and their impact on the intentions of the subsequent edges. Furthermore, GPT-4 enables incorporating information about each destination into the explanation (e.g., the explanation points out the losses of attending a guided tour and having lunch that result from not visiting the Kyoto Geishinkan).
Please experience our framework for yourself at Hugging Face Spaces (an OpenAI API key is required)!

Fig.6: An explanation generated by RouteExplainer.

What's next?

In this work, we proposed the first explanation framework for VRP, opening the new field of "explainability for VRP". Some limitations remain, however, including the accuracy of the edge classifier on some VRPs and the simplicity of the rule-annotated edge intentions. We will address these limitations by increasing data/parameters and leveraging LLM-powered edge annotation. Furthermore, we plan to apply our framework to more general LLM applications, so please stay tuned! We also welcome your extensions of this work to various applications :)

Citation

If you find this work useful, please cite our paper as follows:
@article{dkiku2024routeexplainer,
  author = {Daisuke Kikuta and Hiroki Ikeuchi and Kengo Tajiri and Yuusuke Nakano},
  title = {RouteExplainer: An Explanation Framework for Vehicle Routing Problem},
  year = 2024,
  journal = {arXiv preprint arXiv:2403.03585},
  url = {https://arxiv.org/abs/2403.03585}
}

References

Here is a subset of the works related to our framework. See "Related Work" in our paper for the full list of related works.

  1. Dantzig, G.B., Ramser, J.H.: The truck dispatching problem. Management Science 6(1), 80–91 (1959).
  2. Dumas, Y., Desrosiers, J., Gelinas, E., Solomon, M.M.: An optimal algorithm for the traveling salesman problem with time windows. Operations Research 43(2), 367–371 (1995).
  3. Lopez, L., Carter, M.W., Gendreau, M.: The hot strip mill production scheduling problem: A tabu search approach. European Journal of Operational Research 106(2-3), 317–335 (1998).
  4. Pessoa, A.A., Sadykov, R., Uchoa, E., Vanderbeck, F.: A Generic Exact Solver for Vehicle Routing and Related Problems. Mathematical Programming 183, 483–523 (2020).
  5. Helsgaun, K.: An extension of the Lin-Kernighan-Helsgaun TSP solver for constrained traveling salesman and vehicle routing problems: Technical report (2017).
  6. Vinyals, O., Fortunato, M., Jaitly, N.: Pointer networks. In: Advances in Neural Information Processing Systems. vol. 28 (2015).
  7. Bello, I., Pham, H., Le, Q.V., Norouzi, M., Bengio, S.: Neural combinatorial optimization with reinforcement learning (2017).
  8. Kool, W., van Hoof, H., Welling, M.: Attention, learn to solve routing problems! In: International Conference on Learning Representations (2019).
  9. Joshi, C.K., Laurent, T., Bresson, X.: An efficient graph convolutional network technique for the travelling salesman problem (2019).
  10. Kool, W., van Hoof, H., Gromicho, J., Welling, M.: Deep policy dynamic programming for vehicle routing problems (2021).
  11. Li, S., Yan, Z., Wu, C.: Learning to delegate for large-scale vehicle routing. In: Thirty-Fifth Conference on Neural Information Processing Systems (2021).
  12. Xin, L., Song, W., Cao, Z., Zhang, J.: NeuroLKH: Combining deep learning model with Lin-Kernighan-Helsgaun heuristic for solving the traveling salesman problem. In: Advances in Neural Information Processing Systems. vol. 34 (2021).
  13. Madumal, P., Miller, T., Sonenberg, L., Vetere, F.: Explainable reinforcement learning through a causal lens. In: AAAI Conference on Artificial Intelligence (2019).
  14. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L.u., Polosukhin, I.: Attention is all you need. In: Advances in Neural Information Processing Systems. vol. 30 (2017).
  15. Halpern, J.Y., Pearl, J.: Causes and explanations: A structural-model approach. part i: Causes. The British Journal for the Philosophy of Science 56(4), 843–887 (2005).