Research progress of unmanned mobile vision technology for complex dynamic scenes

Research output: Contribution to journal › Article › peer-review

Abstract

In today’s era of rapid automation and technological advancement, unmanned systems are increasingly becoming a key area of strategic competition among major global powers. These new domains and capabilities of unmanned systems are not only key to supporting national security and strategic interests but also serve as the core force driving future technological innovation and application development. Unmanned systems are reshaping the boundaries of national security and redefining the connotations of strategic advantage. As a key component of unmanned systems, unmanned mobile visual technology is demonstrating its immense potential in helping humans gain a deeper understanding of the physical world. The advancement of this technology not only equips unmanned systems with richer and more precise perceptual capabilities but also offers humans new perspectives from which to observe, analyze, and ultimately master the complex and dynamic physical environment. In the early stages of unmanned mobile visual technology development, researchers mainly relied on traditional machine learning methods for image processing. These methods focused on manual feature extraction, which depended heavily on the experience and knowledge of domain experts. For instance, feature descriptors such as the scale-invariant feature transform (SIFT) and the histogram of oriented gradients (HOG) played crucial roles in tasks such as image matching and target detection. Although traditional visual analysis methods still hold value in specific situations, their dependence on manual feature extraction and professional knowledge limits efficiency and accuracy. With the advent of deep neural network technology, unmanned mobile visual technology has ushered in revolutionary progress.
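The hand-crafted feature pipeline described above can be illustrated with a minimal NumPy sketch of the HOG idea (simplified: finite-difference gradients and per-cell orientation histograms with L2 normalization, omitting the overlapping block normalization of the full descriptor):

```python
import numpy as np

def hog_descriptor(img, cell=8, bins=9):
    """Simplified HOG: per-cell histograms of unsigned gradient orientation,
    weighted by gradient magnitude, L2-normalized per cell."""
    gy, gx = np.gradient(img.astype(float))          # finite-difference gradients
    mag = np.hypot(gx, gy)                           # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0     # unsigned orientation, [0, 180)
    h, w = img.shape
    ch, cw = h // cell, w // cell
    bin_idx = (ang / (180.0 / bins)).astype(int) % bins
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            b = bin_idx[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            hist[i, j] = np.bincount(b.ravel(), weights=m.ravel(), minlength=bins)
    # Per-cell L2 normalization gives the descriptor some robustness to lighting
    norm = np.linalg.norm(hist, axis=2, keepdims=True) + 1e-6
    return (hist / norm).reshape(-1)

# A 16x16 image with a vertical edge yields a 2x2 grid of 9-bin histograms
img = np.zeros((16, 16)); img[:, 8:] = 1.0
desc = hog_descriptor(img)
print(desc.shape)  # (36,)
```

The hand-engineered choices here (cell size, bin count, normalization scheme) are exactly the expert-dependent design decisions that deep networks later replaced with learned features.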
Deep neural networks, through automatic feature extraction and hierarchical structures, can learn feature representations ranging from simple to complex, allowing them not only to capture local image features but also to understand and interpret higher-level semantic information. Thus, these networks notably enhance the fitting and discriminative capabilities of models, offering advantages that traditional methods cannot match. Consequently, deep neural networks have become the benchmark for unmanned mobile visual technology. However, in practical applications, unmanned systems often encounter complex, diverse, and dynamically changing application scenarios, which present considerable challenges for the deployment and effectiveness of deep learning models. First, the complexity and dynamics of the imaging environment present notable problems for unmanned systems. Drastic changes in lighting, unpredictable weather conditions, and interference from other moving objects can degrade image quality, thereby affecting subsequent processing and analysis. Second, the high-speed maneuverability and camouflage strategies of imaging targets add another layer of difficulty for unmanned mobile visual systems. The rapid movement of targets complicates stable tracking, while camouflage and concealment make detection notably more difficult. These factors collectively reduce the accuracy of scene reconstruction, interpretation, and target identification in deep neural network-based unmanned mobile visual models. Furthermore, the diversity of imaging tasks introduces additional challenges. Different tasks often require tailored visual processing strategies, and the system must possess sufficient flexibility and adaptability to effectively handle different tasks. However, current deep neural network models are often tailored for specific tasks, limiting their adaptability across diverse applications.
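The simple-to-complex hierarchy can be sketched in a toy NumPy example (illustrative only, with arbitrary hand-set filters standing in for learned weights): stacking convolutions and a nonlinearity means each successive layer responds to a larger input region, so later layers can compose local edge responses into more abstract patterns.

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid' 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

# Layer 1: a 3x3 edge filter responds to local intensity changes.
edge = np.array([[-1., 0., 1.]] * 3)
x = np.random.default_rng(0).random((9, 9))
h1 = np.maximum(conv2d_valid(x, edge), 0)       # ReLU nonlinearity
# Layer 2: a second 3x3 filter aggregates layer-1 responses, so each
# layer-2 output "sees" a 5x5 patch of the input (the receptive field grows).
h2 = conv2d_valid(h1, np.ones((3, 3)) / 9.0)
print(h1.shape, h2.shape)  # (7, 7) (5, 5)
```

In a trained network these filters are learned from data rather than fixed by hand, which is precisely what removes the dependence on expert-crafted descriptors.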
The uncertainty and unpredictability of environmental factors impose demanding requirements on unmanned mobile visual systems. These systems need to offer precise perception and in-depth analysis to provide decision support, enabling automated systems to respond quickly and accurately to environmental changes, thus improving overall efficiency and reliability. In response to the visual challenges of unmanned systems in complex, dynamic environments, this article delves into the current state of development of unmanned mobile visual technology in addressing these challenges, focusing on five key technical areas: image enhancement, 3D reconstruction, scene segmentation, object detection, and anomaly detection. Image enhancement, being the first step, is crucial for improving the quality of visual data. This process improves the contrast, clarity, and color of images, providing highly reliable input for subsequent analysis and processing, which enhances the performance of unmanned systems under various environmental conditions. 3D reconstruction technology facilitates the recovery of three-dimensional structures from two-dimensional images, enabling unmanned systems to gain a more comprehensive understanding of the environment, making them better suited for tasks in complex settings. Scene segmentation involves partitioning an image into semantically meaningful regions or objects, providing a basis for precise environmental perception and target recognition. Object detection is central to unmanned mobile visual technology, enabling the system to locate and identify specific targets within images or video streams. In contrast, anomaly detection focuses on identifying anomalies or events in the scene, allowing unmanned systems to promptly identify and respond to potential threats.
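As a minimal sketch of the image-enhancement step (one classical technique among many the survey covers, not the article's specific method): global histogram equalization stretches a low-contrast image so its intensities cover the full dynamic range, giving downstream modules a more reliable input.

```python
import numpy as np

def equalize_hist(img):
    """Global histogram equalization for an 8-bit grayscale image.

    Each intensity is remapped through the normalized cumulative
    histogram, spreading the used intensity range over [0, 255]."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[np.nonzero(cdf)][0]              # first occupied bin
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255.0)
    return lut.astype(np.uint8)[img]               # apply lookup table

# A low-contrast ramp confined to [100, 150] is stretched to [0, 255]
img = np.tile(np.linspace(100, 150, 64, dtype=np.uint8), (64, 1))
out = equalize_hist(img)
print(out.min(), out.max())  # 0 255
```

Adaptive variants (e.g., tile-based equalization with contrast limiting) handle the spatially varying lighting of real outdoor scenes better than this global form, which is the kind of trade-off the survey's image-enhancement section examines.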
This article provides an in-depth exploration of the research ideas, current status, and the advantages and disadvantages of typical algorithms for these key technologies, while also analyzing their performance in practical applications. The integration and collaboration of these technologies have substantially enhanced the visual perception capabilities of unmanned systems in dynamic and complex scenes, enabling them to perform tasks more intelligently and autonomously. Although some progress has been made in unmanned mobile visual technology, it still encounters numerous problems in practical applications within complex dynamic scenes. This review aims to provide a comprehensive perspective, systematically examining and analyzing the latest research advancements in unmanned mobile visual technology for such scenes. This paper explores the advantages and limitations of the above key tasks in practical applications. In addition, this paper discusses the gaps and challenges in current research and proposes possible future research directions. Through in-depth exploration of these research directions, unmanned mobile visual technology will continue to advance, offering more robust and flexible solutions to address the challenges posed by complex dynamic scenes. This progress will lay a solid foundation for the long-term development and practical application of unmanned systems in the fields of automation and intelligence.

Translated title of the contribution: 面向复杂动态场景的无人移动视觉技术研究进展
Original language: English
Pages (from-to): 1828-1871
Number of pages: 44
Journal: Journal of Image and Graphics
Volume: 30
Issue number: 6
State: Published - Jun 2025

Keywords

  • 3D reconstruction
  • anomaly detection
  • complex dynamic scenes
  • image enhancement
  • object detection
  • scene segmentation
  • unmanned mobile vision
