Abstract
The container pre-marshalling problem (CPMP) seeks to minimise the number of reshuffling moves needed, during the non-loading phase, to reach a stacking arrangement in each bay that respects container priorities. Given the sequential nature of these decisions, we formulate the CPMP as a Markov decision process (MDP) that captures the states and actions of the reshuffling process. Because relocating a container may trigger a chain effect on subsequent reshuffling moves, this paper develops an improved policy-based Monte Carlo tree search (P-MCTS) to solve the CPMP, in which eight composite reshuffling rules and modified upper confidence bounds are employed in the selection phase and a heuristic algorithm is used in the simulation phase. In addition, considering the effectiveness of reinforcement learning methods for solving MDP models, an improved Q-learning algorithm is proposed as a comparison method. Numerical results show that P-MCTS outperforms all compared methods, both in scenarios where all containers have different priorities and in scenarios where containers can share the same priority.
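To make the MDP view concrete, the following is a minimal sketch of one possible CPMP state and action encoding, together with the standard UCB1 score used in textbook MCTS selection. It is not the paper's P-MCTS: the bay encoding, the convention that a smaller number means earlier retrieval, and the names `misplaced`, `legal_moves`, `apply_move`, and `ucb1` are all illustrative assumptions, and the paper's modified upper confidence bounds and eight composite reshuffling rules are not reproduced here.

```python
import math
from typing import List, Tuple

# Assumed encoding: a bay is a list of stacks, each listed bottom to top.
# Assumed convention: a smaller number means the container is retrieved earlier.
Bay = List[List[int]]
Move = Tuple[int, int]  # (source stack index, destination stack index)


def misplaced(bay: Bay) -> int:
    """Count containers sitting above a container that must be retrieved earlier."""
    count = 0
    for stack in bay:
        lowest_below = math.inf
        for container in stack:
            if container > lowest_below:
                count += 1
            lowest_below = min(lowest_below, container)
    return count


def legal_moves(bay: Bay, max_height: int) -> List[Move]:
    """Actions: move the top container of a non-empty stack to another non-full stack."""
    return [
        (src, dst)
        for src, s in enumerate(bay) if s
        for dst, d in enumerate(bay) if dst != src and len(d) < max_height
    ]


def apply_move(bay: Bay, move: Move) -> Bay:
    """State transition: return a new bay with one container relocated."""
    src, dst = move
    new_bay = [list(stack) for stack in bay]
    new_bay[dst].append(new_bay[src].pop())
    return new_bay


def ucb1(total_reward: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score (not the paper's modified bound)."""
    if visits == 0:
        return math.inf  # unvisited children are explored first
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)
```

Under these assumptions, a bay is fully pre-marshalled when `misplaced(bay)` is zero, and a tree search would expand nodes via `legal_moves` and `apply_move` while ranking children by a score such as `ucb1`.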
| Original language | English |
|---|---|
| Pages (from-to) | 4776-4792 |
| Number of pages | 17 |
| Journal | International Journal of Production Research |
| Volume | 62 |
| Issue number | 13 |
| State | Published - 2024 |
Keywords
- Automated container terminal
- Container pre-marshalling problem
- Markov decision process
- Monte Carlo tree search
- Q-learning algorithm
Fingerprint
Dive into the research topics of 'A policy-based Monte Carlo tree search method for container pre-marshalling'. Together they form a unique fingerprint.