Optimal Policy Replay: A Simple Method to Reduce Catastrophic Forgetting in Target Incremental Visual Navigation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Visual navigation is a critical task in robotics and artificial intelligence. In recent years, reinforcement learning-based approaches to visual navigation have gained popularity. However, existing methods are inflexible when learning multiple navigation targets and suffer from catastrophic forgetting. To address these challenges, we propose a novel paradigm called 'target incremental visual navigation' and introduce a method called Optimal Policy Replay (OPR). Target incremental visual navigation studies how visual navigation performs when navigation targets are learned continually. OPR enables continual learning of new navigation targets without relearning all targets. Our method divides the learning process into on-policy and off-policy stages and stores only the optimal experiences in memory. Experimental results show that OPR effectively alleviates catastrophic forgetting and achieves good performance with a small memory size.
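The abstract states that only the optimal experiences are stored in memory but does not say how "optimal" is decided. The following is a minimal sketch of one possible reading, not the authors' implementation: it assumes "optimal" means the highest-return episode seen so far for each navigation target. The names `Episode` and `OptimalEpisodeMemory`, the return-based criterion, and the toy transitions are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Episode:
    target: str            # navigation target this episode was collected for
    transitions: list      # (observation, action, reward) tuples
    episode_return: float  # total reward of the episode


@dataclass
class OptimalEpisodeMemory:
    """Hypothetical memory that keeps one best episode per target.

    Assumption: "optimal" is approximated by the highest episode return.
    """
    best: dict = field(default_factory=dict)

    def offer(self, episode):
        """Store the episode only if it beats the stored one for its target."""
        current = self.best.get(episode.target)
        if current is None or episode.episode_return > current.episode_return:
            self.best[episode.target] = episode
            return True
        return False

    def sample(self, batch_size, rng=random):
        """Draw transitions without replacement across all stored targets."""
        pool = [(ep.target, t) for ep in self.best.values() for t in ep.transitions]
        return rng.sample(pool, min(batch_size, len(pool)))


# Toy usage: episodes would come from an on-policy stage, and the stored
# transitions would be replayed in an off-policy stage.
memory = OptimalEpisodeMemory()
memory.offer(Episode("chair", [("o0", 1, -0.01), ("o1", 0, 1.0)], 0.99))
memory.offer(Episode("chair", [("o0", 2, -0.01)] * 5, -0.05))  # worse, rejected
memory.offer(Episode("sofa", [("o0", 1, 1.0)], 1.0))
batch = memory.sample(10)
```

Because at most one episode is kept per target, the memory grows with the number of targets rather than with the amount of training, which is one way a small memory could still cover every previously learned target.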

Original language: English
Title of host publication: Proceedings - 2023 China Automation Congress, CAC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 9201-9206
Number of pages: 6
ISBN (Electronic): 9798350303759
DOIs
State: Published - 2023
Externally published: Yes
Event: 2023 China Automation Congress, CAC 2023 - Chongqing, China
Duration: 17 Nov 2023 to 19 Nov 2023

Publication series

Name: Proceedings - 2023 China Automation Congress, CAC 2023

Conference

Conference: 2023 China Automation Congress, CAC 2023
Country/Territory: China
City: Chongqing
Period: 17/11/23 to 19/11/23

Keywords

  • catastrophic forgetting
  • continual learning
  • reinforcement learning
  • visual navigation
