TY - JOUR
T1 - CuRL: A Generic Framework for Bi-Criteria Optimum Path-Finding Based on Deep Reinforcement Learning
AU - Chen, Chao
AU - Li, Lujia
AU - Li, Mingyan
AU - Li, Ruiyuan
AU - Wang, Zhu
AU - Wu, Fei
AU - Xiang, Chaocan
N1 - Publisher Copyright:
© 2000-2011 IEEE.
PY - 2023/2/1
Y1 - 2023/2/1
AB - Traditional path-finding studies mainly focus on planning the path with the shortest travel distance or the least travel time over city road networks. In recent years, with the increasing need for diverse routing services in smart cities, the bi-criteria optimum path-finding problem (i.e., minimizing path distance while optimizing an extra cost or utility according to users' preferences) has drawn wide attention. For instance, in addition to distance, previous studies further find routes with more scenery (utility) or less crime risk (cost). However, existing works are scenario-oriented and optimize a specific cost or utility, ignoring that the route planner should be universal enough to handle both costs and utilities in different real-life scenarios. To fill this gap, this paper proposes a generic bi-criteria optimum path-finding framework (cuRL) based on deep reinforcement learning (DRL). Specifically, we design a novel state representation and reward function for the DRL model of cuRL to overcome two challenges: 1) the cost and utility should be optimized together with the path distance in a unified manner; 2) the diverse distributions of cost and utility in various scenarios should be well addressed. Then, a transition preprocessing method is proposed to enable efficient training of the DRL model and to avoid detours. Finally, simulations are performed to verify the effectiveness of cuRL, where two criteria (i.e., solar radiation and crime risk) are modelled based on real-world data from downtown New York. Compared with a set of baseline algorithms, the evaluation results demonstrate the superiority of the proposed framework in terms of generality.
KW - cost and utility
KW - deep reinforcement learning
KW - intelligent transportation systems (ITS)
KW - route planning
UR - http://www.scopus.com/inward/record.url?scp=85142810567&partnerID=8YFLogxK
U2 - 10.1109/TITS.2022.3219543
DO - 10.1109/TITS.2022.3219543
M3 - Article
AN - SCOPUS:85142810567
SN - 1524-9050
VL - 24
SP - 1949
EP - 1961
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
IS - 2
ER -