CuRL: A Generic Framework for Bi-Criteria Optimum Path-Finding Based on Deep Reinforcement Learning

Chao Chen; Lujia Li; Mingyan Li; Ruiyuan Li; Zhu Wang; Fei Wu; Chaocan Xiang

doi:10.1109/TITS.2022.3219543

School of Computer Science

Chongqing University

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Traditional path-finding studies focus mainly on planning the path with the shortest travel distance or the least travel time over city road networks. In recent years, with the increasing need for diverse routing services in smart cities, the bi-criteria optimum path-finding problem (i.e., minimizing path distance while optimizing an extra cost or utility according to users' preferences) has drawn wide attention. For instance, in addition to distance, previous studies further find routes with more scenery (utility) or less crime risk (cost). However, existing works are scenario-oriented and optimize a specific cost or utility, ignoring that a route planner should be universal enough to deal with both cost and utility in different real-life scenarios. To fill this gap, this paper proposes a generic bi-criteria optimum path-finding framework (cuRL) based on deep reinforcement learning (DRL). Specifically, we design a novel state representation and reward function for the DRL model of cuRL to overcome two challenges: 1) the cost and utility should be optimized together with minimal path distance in a unified manner; 2) the diverse distributions of cost and utility in various scenarios should be well addressed. Then, a transition preprocessing method is proposed to enable efficient training of the DRL model and avoid detours. Finally, simulations are performed to verify the effectiveness of cuRL, where two criteria (i.e., solar radiation and crime risk) are modelled based on real-world data from downtown New York. Compared with a set of baseline algorithms, the evaluation results demonstrate the superiority of the proposed framework in terms of its generality.

Original language	English
Pages (from-to)	1949-1961
Number of pages	13
Journal	IEEE Transactions on Intelligent Transportation Systems
Volume	24
Issue number	2
DOI	https://doi.org/10.1109/TITS.2022.3219543
Publication status	Published - 1 Feb 2023

Cite this

@article{bf4ef45d7044416f8f4cfb5711036cb7,

title = "CuRL: A Generic Framework for Bi-Criteria Optimum Path-Finding Based on Deep Reinforcement Learning",

abstract = "Traditional path-finding studies basically focus on planning the path with the shortest travel distance or the least travel time over city road networks. In recent years, with the increasing needs of diverse routing services in smart cities, the bi-criteria optimum path-finding problem (i.e., minimizing path distance and optimizing extra cost or utility according to users' preference) has drawn wide attention. For instance, in addition to distance, the previous studies further find routes with more scenery (utility) or less crime risk (cost). However, existing works are scenario-oriented which optimize specific cost or utility, ignoring that the routing planner should be universal to deal with both cost and utility in different real-life scenarios. To fill this gap, this paper proposes a generic bi-criteria optimum path-finding framework (cuRL) based on deep reinforcement learning (DRL). Specifically, we design a novel state representation and reward function for the DRL model of cuRL to overcome the challenges that 1) the cost and utility should be optimized with minimal path distance in a unified manner; 2) the diverse distributions of cost and utility in various scenarios should be well-addressed. Then, a transition preprocessing method is proposed to enable the efficient training of DRL and avoid detours. Finally, simulations are performed to verify the effectiveness of cuRL, where two criteria (i.e., solar radiation and crime risk) are modelled based on the real-world data in downtown New York. Comparing with a set of baseline algorithms, the evaluation results demonstrate the priority of the proposed framework for its generality.",

keywords = "cost and utility, deep reinforcement learning, intelligent transportation systems (ITS), route planning",

author = "Chao Chen and Lujia Li and Mingyan Li and Ruiyuan Li and Zhu Wang and Fei Wu and Chaocan Xiang",

note = "Publisher Copyright: {\textcopyright} 2000-2011 IEEE.",

year = "2023",

month = feb,

day = "1",

doi = "10.1109/TITS.2022.3219543",

language = "English",

volume = "24",

pages = "1949--1961",

journal = "IEEE Transactions on Intelligent Transportation Systems",

issn = "1524-9050",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "2",

}

TY - JOUR

T1 - CuRL

T2 - A Generic Framework for Bi-Criteria Optimum Path-Finding Based on Deep Reinforcement Learning

AU - Chen, Chao

AU - Li, Lujia

AU - Li, Mingyan

AU - Li, Ruiyuan

AU - Wang, Zhu

AU - Wu, Fei

AU - Xiang, Chaocan

PY - 2023/2/1

Y1 - 2023/2/1

N2 - Traditional path-finding studies basically focus on planning the path with the shortest travel distance or the least travel time over city road networks. In recent years, with the increasing needs of diverse routing services in smart cities, the bi-criteria optimum path-finding problem (i.e., minimizing path distance and optimizing extra cost or utility according to users' preference) has drawn wide attention. For instance, in addition to distance, the previous studies further find routes with more scenery (utility) or less crime risk (cost). However, existing works are scenario-oriented which optimize specific cost or utility, ignoring that the routing planner should be universal to deal with both cost and utility in different real-life scenarios. To fill this gap, this paper proposes a generic bi-criteria optimum path-finding framework (cuRL) based on deep reinforcement learning (DRL). Specifically, we design a novel state representation and reward function for the DRL model of cuRL to overcome the challenges that 1) the cost and utility should be optimized with minimal path distance in a unified manner; 2) the diverse distributions of cost and utility in various scenarios should be well-addressed. Then, a transition preprocessing method is proposed to enable the efficient training of DRL and avoid detours. Finally, simulations are performed to verify the effectiveness of cuRL, where two criteria (i.e., solar radiation and crime risk) are modelled based on the real-world data in downtown New York. Comparing with a set of baseline algorithms, the evaluation results demonstrate the priority of the proposed framework for its generality.

AB - Traditional path-finding studies basically focus on planning the path with the shortest travel distance or the least travel time over city road networks. In recent years, with the increasing needs of diverse routing services in smart cities, the bi-criteria optimum path-finding problem (i.e., minimizing path distance and optimizing extra cost or utility according to users' preference) has drawn wide attention. For instance, in addition to distance, the previous studies further find routes with more scenery (utility) or less crime risk (cost). However, existing works are scenario-oriented which optimize specific cost or utility, ignoring that the routing planner should be universal to deal with both cost and utility in different real-life scenarios. To fill this gap, this paper proposes a generic bi-criteria optimum path-finding framework (cuRL) based on deep reinforcement learning (DRL). Specifically, we design a novel state representation and reward function for the DRL model of cuRL to overcome the challenges that 1) the cost and utility should be optimized with minimal path distance in a unified manner; 2) the diverse distributions of cost and utility in various scenarios should be well-addressed. Then, a transition preprocessing method is proposed to enable the efficient training of DRL and avoid detours. Finally, simulations are performed to verify the effectiveness of cuRL, where two criteria (i.e., solar radiation and crime risk) are modelled based on the real-world data in downtown New York. Comparing with a set of baseline algorithms, the evaluation results demonstrate the priority of the proposed framework for its generality.

KW - cost and utility

KW - deep reinforcement learning

KW - intelligent transportation systems (ITS)

KW - route planning

UR - http://www.scopus.com/inward/record.url?scp=85142810567&partnerID=8YFLogxK

U2 - 10.1109/TITS.2022.3219543

DO - 10.1109/TITS.2022.3219543

M3 - Article

AN - SCOPUS:85142810567

SN - 1524-9050

VL - 24

SP - 1949

EP - 1961

JO - IEEE Transactions on Intelligent Transportation Systems

JF - IEEE Transactions on Intelligent Transportation Systems

IS - 2

ER -
