Reinforcement learning vehicle routing

In this work, Reinforcement Learning Solver for Vehicle Routing Problem (RL SolVeR Pro) is proposed, wherein the optimal route learning problem is cast as a Markov Decision Process (MDP).
To address this problem, we employ a decomposition strategy based on weights to decompose the multi...
Apr 18, 2022 · Accurate and real-time tracking for real-world urban logistics has become a popular research topic in the field of intelligent transportation.
Feb 15, 2024 · This paper introduces a reinforcement learning approach to optimize the Stochastic Vehicle Routing Problem with Time Windows (SVRP), focusing on reducing travel costs in goods delivery.
The paper surveys the different RL approaches used to solve VRP and its variants.
To address this challenge, this article investigates the trajectory planning problem of...
Implementation of: Nazari, Mohammadreza, et al. "Deep Reinforcement Learning for Solving the Vehicle Routing Problem." arXiv preprint arXiv:1802.04240 (2018).
• Simultaneously solve emergency vehicle (EMV) routing and congestion alleviation.
For that reason, in this paper, a hyper-heuristic (HH) approach called Hyper-heuristic Adaptive Simulated Annealing with Reinforcement Learning (HHASA$_{RL}$) is proposed.
Simone Foa, Corrado Coppola, Giorgio Grani, Laura Palagi.
In recent years, many scholars have used DRL algorithms to solve a classic combinatorial optimization problem, i.e., the Vehicle Routing Problem (VRP).
Recent works have shown that attention-based RL models outperform recurrent neural network-based methods on these problems in terms of both effectiveness and efficiency.
Vehicle routing problem with time windows (VRPTW) is a practical and complex vehicle routing problem (VRP) which is faced by thousands of companies in logistics and transportation.
... (2019) applied reinforcement learning, more specifically the Q-Learning algorithm in combination with a metaheuristic framework, to solve vehicle routing and scheduling problems.
Jul 30, 2022 · Solving the vehicle routing problem with deep reinforcement learning.
• Vehicle routing problem with time window (VRPTW) is of great importance for a wide spectrum of services and real-life applications, such as online take-out and car-hailing platforms.
Jun 1, 2021 · Many successful applications of reinforcement learning have been reported, such as games (Silver et al., 2016, Silver et al., 2017), vehicle routing problem (James, Yu, & Gu, 2019), critical nodes identification (Fan, Zeng, Sun, & Liu, 2020), semiconductor fabs (Hwang & Jang, 2020), etc.
The Pointer Network (PtrNet) [4], originated from the encoder-decoder-structured sequence-to-sequence network in natural language processing (NLP), was em...
May 26, 2021 · Keywords: electric vehicle, energy management, Q-learning, reinforcement learning, vehicle routing.
It includes a vehicle selection decoder accounting for the...
Index Terms: reinforcement learning, supervised learning, neural combinatorial optimization, vehicle routing.
Sep 1, 2022 · We have applied our model to two classical routing problems, i.e., the Traveling Salesman Problem (TSP) and the Capacitated Vehicle Routing Problem (CVRP).
Firstly, an attention-based deep reinforcement learning framework is proposed to construct routes for HVRP with both min-max and min-sum objectives.
Specifically, we use a variant of the DQN algorithm that incorporates the target network and experience replay...
Jun 20, 2023 · Coordinated multi-agent hierarchical deep reinforcement learning to solve multi-trip vehicle routing problems with soft time windows.
Jan 1, 2021 · For example, using deep reinforcement learning technology to improve metaheuristic algorithms and utilizing deep reinforcement learning technology in hyper-heuristic algorithms could effectively...
Jul 15, 2020 · Experimental results demonstrate that the DRL model alone finds better solutions than construction algorithms and previous DRL approaches, while enabling a 5- to 40-fold speedup; combined with various local search methods, it yields excellent solutions at a superior generation speed compared with other initial solutions.
See the tasks/ folder for details.
Recently, VRP has been solved using deep reinforcement learning (DRL), with node sets represented as a graph structure.
However, the nature and scope of real-world urban logistics are highly dynamic, and the existing optimization technique cannot precisely...
The Vehicle Routing Problem (VRP) is a combinatorial optimization problem that has been studied in applied mathematics and computer science for decades.
This is essentially due to the nature of the traditional...
Jan 1, 2022 · Dynamic routing of electric commercial vehicles can be a challenging problem since, besides the uncertainty of energy consumption, there are also random customer requests.
Oct 22, 2020 · This work develops a framework for value-function-based deep reinforcement learning with a combinatorial action space, in which the action selection problem is explicitly formulated as a mixed-integer optimization problem.
Among this class of methods, the Proximal Policy Optimization algorithm (PPO) is discussed.
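For readers unfamiliar with PPO, the clipped surrogate objective it is generally known for can be written as below. This is the standard textbook form, not a formula taken from any of the works listed here; the symbols $r_t(\theta)$ (probability ratio), $\hat{A}_t$ (advantage estimate) and $\epsilon$ (clipping range) are the usual notation and are assumptions with respect to this page.

```latex
L^{\mathrm{CLIP}}(\theta)
  = \hat{\mathbb{E}}_t\left[
      \min\left( r_t(\theta)\,\hat{A}_t,\;
                 \operatorname{clip}\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t \right)
    \right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```

In a routing setting, $a_t$ would typically be the next node to visit and $s_t$ the partial route together with the remaining demands, although each paper defines these differently.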
Table 5 is a summary of the average indicators obtained by different algorithms for 200 different instances.
Chenhao Zhou, Jingxin Ma, Louis Douge, Ek Peng Chew, Loo Hay Lee. Volume 182, August 2023, 109443.
Feb 12, 2018 · We present an end-to-end framework for solving the Vehicle Routing Problem (VRP) using deep reinforcement learning. In this approach, we train a single model that finds near-optimal solutions for problem instances sampled from a given distribution, only by observing the reward signals and following feasibility rules.
Vehicle routing problem (VRP) is one of the classic combinatorial optimization problems where an optimal tour to visit customers is required with a minimum total cost in the presence of some constraints.
The vehicle routing problem with backhauls (VRPB) is a challenging problem commonly studied in computer science and operations research.
Aug 17, 2023 · We present a novel end-to-end framework for solving the Vehicle Routing Problem with stochastic demands (VRPSD) using Reinforcement Learning (RL).
Jul 15, 2020 · Rabe et al. ...
2.1 Deep Learning for Vehicle Routing Problems
In recent years, there has been extensive exploration of the use of deep (reinforcement) learning to solve vehicle routing problems.
Jun 25, 2023 · The Vehicle Routing Problem with Time Windows (VRPTW) is a challenging combinatorial optimization problem with many real-world applications.
May 9, 2024 · Existing deep reinforcement learning (DRL) methods for multi-objective vehicle routing problems (MOVRPs) always decompose an MOVRP into subproblems with respective preferences and then train...
Different variants of the Vehicle Routing Problem (VRP...
Dec 8, 2023 · Vehicle routing in VANET is an ever-growing challenge that seeks to identify the most efficient collection of routes for a given fleet of vehicles, although numerous efficient-routing protocols...
However, existing RL models simply aggregate node embeddings to generate the context embedding without taking into account the...
Dec 13, 2023 · The suggested RL-based VRP solver learns to make wise routing decisions based on observable environmental information by utilizing RL algorithms such as deep Q-networks (DQN) or proximal policy optimization (PPO).
... (2021) multi-objective stochastic VRP hypothesis vehicle management intelligence.
Nov 13, 2023 · This study addresses a gap in the utilization of Reinforcement Learning (RL) and Machine Learning (ML) techniques in solving the Stochastic Vehicle Routing Problem (SVRP) that involves the challenging task of optimizing vehicle routes under uncertain conditions.
Mar 29, 2024 · In this article, we propose a neural heuristic based on deep reinforcement learning (DRL) to solve the traditional and improved VRPB variants, with an encoder-decoder structured policy network trained to sequentially construct the routes for vehicles.
Usually, VRP is solved by traditional heuristic algorithms.
May 23, 2022 · Deep Reinforcement Learning (DRL) has been successfully applied to a number of fields.
In theory, reinforcement learning (RL) is an ideal solution method for dynamic combinatorial optimization and SDVRPs because (a) dynamic combinatorial optimization problems can naturally be modeled as Markov decision processes (MDP, see...
The routing of urban logistics services is usually accomplished via complex mathematical and analytical methods.
Received: 23 July 2020; Accepted...
Jun 26, 2022 · This paper explores the recent advancements in solving VRP using reinforcement learning (RL).
Jul 8, 2021 · In this article, we study reinforcement learning (RL) for vehicle routing problems (VRPs).
Jun 20, 2023 · MA-DRL, multi-agent deep reinforcement learning; CMA-HDRL, coordinated multi-agent hierarchical deep reinforcement learning; MTVRPTW, multi-trip vehicle routing problem with time windows.
• Formulate a model to capture the establishment of emergency lane for EMV passage.
Yuan Jiang, Zhiguang Cao, Yaoxin Wu, Wen Song, Jie Zhang.
Jan 1, 2022 · Keywords: vehicle routing problem; stochastic dynamic vehicle routing problem; multi-agent systems; deep reinforcement learning.
Crowdsourcing is a flexible and convenient way to reduce transportation costs and carbon emissions.
Many supply chains (SC) choose the joint distribution of multiple depots to cut transportation costs and delivery times.
In this section, we describe the vehicle routing problem from a mathematical optimization point of view and the basics of reinforcement learning (RL), focusing on one class of methods (the actor-critic methods).
Furthermore, a multi-agent reinforcement learning method with an unsupervised auxiliary network is developed for the model training.
Preliminaries and notation.
I. INTRODUCTION
Cost-effective logistics systems overall define the competitiveness of companies, and the ratio of logistics expenditure to GDP indicates the effectiveness of business operations in a country.
However, crowdshipping must adapt to real-time changes such as road conditions and customer demands, which heuristic algorithms are not well suited to handle.
... by the learning model, but also by the formulation of the underlying decision problem.
Dec 1, 2023 · Considering that the relationship between vehicle speed and carbon emissions follows a polynomial curve [38], EVs in Fig. 8 (a)–(b) make eco-routing behaviors with the objective of maintaining a reasonable vehicle speed range (e.g., 50–80 km/h) for the transport network, i.e., driving too slowly or too fast will cause high carbon emissions.
While performing favourably on the independent and identically distributed (i.i.d.) ...
1. Introduction
Modern society is significantly reliant on complex multi-modal logistic networks as part of a globalization trend.
Oct 6, 2021 · Existing deep reinforcement learning (DRL) based methods for solving the capacitated vehicle routing problem (CVRP) intrinsically cope with a homogeneous vehicle fleet, in which the fleet is assumed as repetitions of a single vehicle. Hence, their key to constructing a solution lies solely in the selection of the next node (customer) to visit, excluding the selection of the vehicle. However, vehicles in...
This paper introduces the Dynamic Stochastic Electric Vehicle Routing Problem (DS-EVRP).
In this paper, we propose a Deep Reinforcement Learning (DRL) approach for solving the VRPTW using the Deep Q-Network (DQN) algorithm.
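DQN variants of the kind mentioned above are usually described in terms of two mechanisms, experience replay and a target network. The sketch below illustrates both in plain Python. It is not the cited authors' implementation: a dictionary keyed by (state, action) stands in for the neural network, and all names (`ReplayBuffer`, `td_target`, `q_update`, `sync_target`) are hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)  # oldest transitions are evicted first

    def push(self, transition):
        self.items.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the temporal correlation between transitions.
        return rng.sample(list(self.items), batch_size)


def td_target(reward, next_state, done, q_target, actions, gamma=0.99):
    """One-step target r + gamma * max_a' Q_target(s', a'); no bootstrap at terminal states."""
    if done:
        return reward
    return reward + gamma * max(q_target.get((next_state, a), 0.0) for a in actions)


def q_update(q_online, transition, q_target, actions, lr=0.1, gamma=0.99):
    """Move the online estimate toward a target computed from the frozen copy."""
    s, a, r, s2, done = transition
    y = td_target(r, s2, done, q_target, actions, gamma)
    old = q_online.get((s, a), 0.0)
    q_online[(s, a)] = old + lr * (y - old)


def sync_target(q_online):
    """Periodic hard copy of the online table into the target table."""
    return dict(q_online)
```

In a VRPTW setting the state would encode the current node, the time and the unserved customers, and `actions` would be restricted to customers that can still be reached within their time windows; those details are left out of this sketch.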
In Proceedings of the 32nd International Conference on Neural Information Processing Systems (Montréal, Canada) (NIPS’18), 9861–9871.
The experimental results show that the optimization effect of our model on small and medium-sized TSP and CVRP surpasses the state-of-the-art DRL-based methods and some traditional algorithms.
Jan 1, 2023 · Propose a multi-agent reinforcement learning framework for traffic signal control.
We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows.
May 23, 2022 · A deep reinforcement learning algorithm for large-scale vehicle routing problems.
A promising method should generate high-quality solutions within limited inference time, and there are three major challenges: (a) directly optimizing the goal...
Their framework, like Bandit-VNS, utilized a multi-agent structure.
The fixed number of vehicles in the fleet and tighter time windows for customer demand have transformed the traditional Vehicle Routing Problem (VRP) into the Vehicle Routing Problem with Time Windows (VRPTW).
However, heuristic methods may fail to generate high-quality solutions for massive problems instantly.
Oct 22, 2020 · Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing.
We focus on the Capacitated Vehicle Routing Problem, a classical combinatorial problem from operations research [13], where a single capacity-limited vehicle must be assigned one or more routes to satisfy customer demands while minimizing total travel distance.
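The Capacitated Vehicle Routing Problem described above can be made concrete with a small solution evaluator. The following is a minimal sketch under simple assumptions (Euclidean distances, node 0 as the depot, one vehicle serving all routes in turn); the function names `route_length` and `evaluate_cvrp` are illustrative and do not come from any of the cited works.

```python
import math


def route_length(route, coords, depot=0):
    """Length of the tour depot -> customers in the given order -> depot."""
    stops = [depot] + list(route) + [depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(stops, stops[1:]))


def evaluate_cvrp(routes, coords, demand, capacity, depot=0):
    """Return (total_distance, feasible) for a list of routes.

    A solution is feasible when every customer is visited exactly once
    and no route carries more than the vehicle capacity.
    """
    visited = sorted(c for r in routes for c in r)
    expected = sorted(i for i in range(len(coords)) if i != depot)
    visits_ok = visited == expected
    load_ok = all(sum(demand[c] for c in r) <= capacity for r in routes)
    total = sum(route_length(r, coords, depot) for r in routes)
    return total, visits_ok and load_ok
```

An RL agent would typically receive the negative of `total_distance` as its return, with the feasibility check replaced by masking infeasible moves during route construction.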
We propose a novel end-to-end framework that comprehensively addresses the key sources of stochasticity in SVRP and utilizes an RL...
However, finding the best next action given a value function of arbitrary complexity is nontrivial when the...
Reinforcement Learning for Solving the Vehicle Routing Problem.
Currently, Traveling Salesman Problems and Vehicle Routing Problems are supported.
In this work, first, the problem is modeled as a Markov Decision Process (MDP) and then the PPO method (which belongs to the Actor-Critic class of Reinforcement learning...
In this study, the Vehicle Routing Problem (VRP) is solved using reinforcement learning (RL) approaches.
A Safe Reinforcement Learning method is proposed for solving the problem.
In this research, we propose an end-to-end deep reinforcement learning framework to solve the...
A Hybrid Reinforcement Learning-based Model for the Vehicle Routing Problem
... a VRP using MILP, the added penalties often result in task infeasibility, owing to its highly constrained nature.
Feb 1, 2023 · The solution in sequential decision problems is a decision policy instead of a single solution vector.
Jan 28, 2022 · Firstly, the form of multi-agent reinforcement learning for the multi-depot vehicle routing problem is defined, including state, action, reward, and transition function, so that the model can be...
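The MDP ingredients named above (state, action, reward, transition function) can be sketched for a single-vehicle capacitated problem as follows. This is an illustrative formulation under stated assumptions, not the one used in any cited paper: node 0 is the depot, every customer demand fits in an empty vehicle, the reward is the negative Euclidean distance travelled, and the class name `CvrpEnv` is hypothetical.

```python
import math


class CvrpEnv:
    """Single-vehicle CVRP as an MDP; state = (position, remaining load, unvisited set)."""

    def __init__(self, coords, demand, capacity):
        self.coords, self.demand, self.capacity = coords, demand, capacity
        self.reset()

    def reset(self):
        self.pos, self.load = 0, self.capacity
        self.unvisited = set(range(1, len(self.coords)))
        return self.state()

    def state(self):
        return (self.pos, self.load, frozenset(self.unvisited))

    def feasible_actions(self):
        """Customers whose demand fits, plus the depot when the vehicle is away from it."""
        acts = [c for c in sorted(self.unvisited) if self.demand[c] <= self.load]
        if self.pos != 0:
            acts.append(0)
        return acts

    def step(self, action):
        """Transition: move to `action`, then serve the customer or reload at the depot."""
        if action not in self.feasible_actions():
            raise ValueError("infeasible action")
        reward = -math.dist(self.coords[self.pos], self.coords[action])
        self.pos = action
        if action == 0:
            self.load = self.capacity
        else:
            self.load -= self.demand[action]
            self.unvisited.discard(action)
        done = not self.unvisited and self.pos == 0
        return self.state(), reward, done
```

A policy maps each state to one of `feasible_actions()`, which is why the solution of such a problem is a decision policy and not a single fixed route vector.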
Jan 8, 2024 · New extensions to existing encoder-decoder attention models are developed that allow them to handle multiple trucks and multi-leg routing requirements; the resulting algorithm is found to outperform Aisin's previous best solution.
• Empower the actor-critic method with policy sharing and spatially adjusted reward.
Recently, applying Reinforcement Learning (RL) methodologies to NP-hard combinatorial optimization problems has become a popular topic.
The paper also presents the issues and challenges that emerged with the use of RL to solve the VRP variants.
Jun 20, 2023 · Multi-Trip Vehicle Routing Problem with Time Windows (MTVRPTW), as a further evolved problem of VRP considering multiple departures from one depot and temporal constraints on visiting nodes, has developed into one of the critical issues in the scheduling of logistics, bus transit, railway, and aviation.
Specifically, one of its variants, the Vehicle Routing Problem with Time Windows (VRPTW), where the customer locations have time windows within which the deliveries should be made, has attracted wide attention.
To this end, this thesis aims to develop efficient DRL-based methods to address the above limitations.
The MOVRP considered in this study involves two objectives: travel distance and altitude difference.
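Weight-based decomposition of a two-objective problem such as the one above (travel distance and altitude difference) is often done with a weighted sum. The sketch below shows that idea only; it is an assumption that the cited works use this exact scalarization, and the helper names `weight_vectors`, `scalarize` and `best_per_weight` are hypothetical.

```python
def weight_vectors(n):
    """n evenly spaced preference vectors (w, 1 - w) for two objectives."""
    if n < 2:
        raise ValueError("need at least two weight vectors")
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def scalarize(objectives, weights):
    """Weighted sum of objective values, all of which are to be minimized."""
    return sum(w * f for w, f in zip(weights, objectives))


def best_per_weight(candidates, n):
    """For each weight vector, pick the candidate with the lowest weighted sum.

    candidates: list of (travel_distance, altitude_difference) tuples.
    """
    return [min(candidates, key=lambda c: scalarize(c, w)) for w in weight_vectors(n)]
```

Each weight vector defines one single-objective subproblem, and solving all of them approximates the Pareto front. In practice the two objectives would usually be normalized to comparable scales first, which this sketch omits.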
In order to service a group of consumers efficiently, the VRP involves identifying the shortest, most cost-effective, and fastest routes for a fleet of vehicles.
Computers & Industrial Engineering.
The results revealed that as the percentage of intelligent agents increases, it is more difficult to converge to an optimal solution.
Ensemble-based Deep Reinforcement Learning for Vehicle Routing Problems under Distribution Shift.
In this paper, we propose an end-to-end deep reinforcement learning framework to...
Jan 25, 2024 · Vehicle Routing Problem (VRP) is a long-standing research problem in both academia and industry because of its practical importance [4, 30].
Value-function-based methods have long played an important role in reinforcement learning.
An attention-based neural network trained through reinforcement learning is...
Recently, deep reinforcement learning has been frequently used to deal with vehicle routing problems (VRPs), where the learned policy guides the selection of the next node to be visited.
Dec 1, 2020 · Specifically, the vehicle routing problem is regarded as a vehicle tour generation process, and an encoder-decoder framework with attention layers is proposed to generate tours of multiple vehicles iteratively.
This paper proposes a pre-training mechanism for online shared networks in the dual-network reinforcement learning mode, and shows that the algorithm can obtain good solutions in terms of solution quality and offline solution efficiency.
VRP is known to be a computationally difficult problem for which many exact and heuristic algorithms have been proposed, but providing fast and reliable solutions is still a challenging task.
Furthermore, allowing multiple departures from depot builds up the Multi-Trip Vehicle Routing Problem with...
More precisely, we consider a Dynamic Vehicle Routing Problem (DVRP) with time windows and both known (i.e., fixed) and stochastic customers.
Oct 5, 2020 · The past decade has seen a rapid penetration of electric vehicles (EVs) in the market, and more and more logistics and transportation companies are starting to deploy EVs for service provision.
Citation: Dorokhova M, Ballif C and Wyrsch N (2021) Routing of Electric Vehicles With Intermediary Charging Stations: A Reinforcement Learning Approach. Front. Big Data 4:586481. doi: 10.3389/fdata.2021.586481
The system implements the action and provides a reward to the agent.
Jul 30, 2023 · This paper presents a novel approach for solving the multi-objective vehicle routing problem (MOVRP) using deep reinforcement learning.
For combinatorial optimization problems, DRL can learn the characteristics of the problem according to the solution...
Dec 24, 2021 · Her research interests focus on applying optimisation (LP/IP/MIP, Branch & Bound, Branch & Price, Nonlinear Programming) and machine learning techniques (heuristic algorithms and machine learning algorithms) to solve large-scale real-world problems, including vehicle routing, transportation scheduling, network design, etc.
... (2019) green logistic system online routing reinforcement learning, combinatorial optimization Niu et al. ...
Jun 27, 2022 · EMVLight, a decentralized reinforcement learning (RL) framework for simultaneous dynamic routing and traffic signal control, extends Dijkstra's algorithm to efficiently update the optimal route for the EMVs in real time as they travel through the traffic network.
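EMVLight is described above as extending Dijkstra's algorithm. As a reference point, the unmodified baseline algorithm can be sketched as below; this is the classical shortest-path procedure, not EMVLight's extension, and the adjacency-list format and the name `dijkstra` are choices made for this illustration.

```python
import heapq


def dijkstra(graph, source):
    """Shortest travel cost from `source` to every reachable node.

    graph: {node: [(neighbor, nonnegative_cost), ...]}
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

A dynamic routing scheme would rerun or incrementally update such a computation as link travel times change, which is the part a real-time extension has to make efficient.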
Jul 3, 2023 · The multi-depot vehicle routing problem (MDVRP) is one of the most essential and useful variants of the traditional vehicle routing problem (VRP) in supply chain management (SCM) and logistics studies.
Feb 1, 2023 · Multi-depot vehicle routing problem with soft time windows (MD-VRPSTW) is a valuable practical issue in urban logistics.
Arthur Delarue, Ross Anderson, Christian Tjandraatmadja.
In order to model the operations of a commercial EV fleet, we utilize the EV routing problem with time windows (EVRPTW).
Our method involves iteratively improving initial solutions using an enhanced heuristic algorithm and automatically learning the improved heuristic rules through deep reinforcement learning.
We present QuikRouteFinder that uses reinforcement learning (RL) to address the problem of routing for EVs with multi-service delivery by overcoming the above challenges.
In this paper, we explore an attention-based deep reinforcement learning approach for vehicle routing problems.
Aug 1, 2023 · Reinforcement Learning-based approach for dynamic vehicle routing problem with stochastic demand.
Existing attention-based models often treat city nodes merely as input features, overlooking...
With large-scale instances and dynamic situations, traditional methods for solving the VRP confront difficulties.
Featured by linehaul (or delivery) and backhaul (or pickup) customers, the VRPB has broad applications in real-world logistics.
Recently, researchers tend to apply deep reinforcement learning (DRL) to automatically learn the searching rules in...
Jan 26, 2023 · Based on a brief introduction to the conventional methods to solve the vehicle routing problem, this paper focuses on summarizing the algorithms for solving the vehicle routing problem based on reinforcement learning (RL) and deep reinforcement learning (DRL).
In RL, a controller agent observes a system state and decides a control action.
However, such a node-output strategy is not adequate for handling vehicle trajectory planning in complex road networks.
Recently, deep learning models under the reinforcement learning (RL) framework have been pro...
Although widely studied, it is still challenging for conventional methods, including exact, approximation, and heuristic methods, to address vehicle routing problems efficiently and with fast computation due to their NP-hard nature.
The effective management of vehicle routing helps companies reduce operational costs and increase their competitiveness.
... (2020) Methods James et al. ...
In the field of hyper-heuristics, reinforcement learning...
The curse of dimensionality of RL is also overcome by using a two-phase solver with geometric clustering.
The scale of the problems solved in the literature is small; thus, it is difficult to apply the algorithm in practice, where there are many large...
Jun 7, 2022 · However, there are still deficiencies in routing the trajectories of last-mile logistics that continue to impact social and economic sustainability.
Oct 10, 2023 · Ensemble-based Deep Reinforcement Learning for Vehicle Routing Problems under Distribution Shift.
Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track.
Deep reinforcement learning (RL) has been shown to be effective in producing approximate solutions to some vehicle routing problems (VRPs), especially when using...
Mar 1, 2023 · Additionally, Silva et al. ...
Finally, future research directions for this problem are discussed.
Jul 11, 2018 · The reinforcement learning algorithm is then applied to the multiagent system with different market penetration of autonomous vehicle agents.
Jul 30, 2022 · In this regard, this paper focuses on the application of RL for the Vehicle Routing Problem (VRP), a famous combinatorial problem that belongs to the class of NP-Hard problems.
Extending the vehicle routing problem (VRP), the crowdshipping VRP (CVRP) considers crowdsourcing logistics.
Deep Reinforcement Learning for Solving the Vehicle Routing Problem. Mohammadreza Nazari, Afshin Oroojlooy, Lawrence V. Snyder, Martin Takáč. arXiv:1802.04240v1 [cs.AI] 12 Feb 2018.
In this approach, we train a single policy model that finds near-optimal solutions for a broad range of problem instances of similar size, only by observing the reward signals and following feasibility rules.
Our model represents a parameterized stochastic policy, and by applying a policy...
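Training a parameterized stochastic policy from reward signals alone is commonly done with a policy gradient estimator. The expression below is the generic REINFORCE form with a baseline, given here as background; whether the works listed above use exactly this estimator is an assumption. The symbols are illustrative: $s$ is a problem instance, $\pi$ a constructed solution (sequence of visited nodes), $L(\pi \mid s)$ its cost, $p_\theta$ the policy, and $b(s)$ a baseline such as a critic's estimate.

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{\pi \sim p_\theta(\cdot \mid s)}
    \left[ \left( L(\pi \mid s) - b(s) \right) \nabla_\theta \log p_\theta(\pi \mid s) \right],
\qquad
p_\theta(\pi \mid s) = \prod_{t=1}^{T} p_\theta\left(\pi_t \mid s, \pi_{1:t-1}\right)
```

Because the policy factorizes over construction steps, feasibility rules can be enforced by assigning zero probability to infeasible next nodes at each step $t$.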
Our formulation incorporates the correlation between stochastic demands through other observable stochastic variables, thereby offering an experimental demonstration of the theoretical premise that non-i.i.d. stochastic demands provide opportunities...
Feb 1, 2023 · Particularly, deep reinforcement learning (DRL) has become increasingly prominent in solving complex sequential decision problems, such as dynamic routing choice [14], automated vehicle control [15], and emergency evacuation [16].
We model this problem as a route-based Markov Decision Process (MDP), and to solve the problem efficiently, we propose an approach that combines Deep Reinforcement Learning (RL) (specifically neu...
Nov 30, 2021 · Currently, the number of deliveries handled by transportation logistics is rapidly increasing because of the significant growth of the e-commerce industry, resulting in the need for improved functional vehicle routing measures for logistics companies.
Thus, this paper presents a novel reinforcement learning algorithm integrated with a graph attention network (GAT-RL) to efficiently solve the problem.