TY - GEN
T1 - IMS-YOLOv8
T2 - 15th IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2025
AU - Yang, Fei
AU - Lian, Baowang
AU - Li, Yanda
AU - Dan, Zesheng
AU - Tang, Chengkai
AU - Liu, Yangyang
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - We propose IMS-YOLOv8, a lightweight model built on the YOLOv8n architecture for fine-grained detection of multiple drone types and joint discrimination between drones and birds. Three novel modules are introduced in the backbone: MV2Block, a lightweight inverted residual unit that substantially reduces parameters and computational cost; MobileViTBlock, which integrates convolution with local self-attention to enhance global context awareness; and MSCAAttention, a multi-scale convolutional attention module with parallel receptive-field branches that strengthens feature extraction for small targets and complex backgrounds. The neck employs a multi-scale fusion structure, and the detection head makes parallel predictions at three scales (P3, P4, and P5), outputting bounding boxes and confidence scores for seven drone categories plus birds. At the finest scale, an auxiliary segmentation head produces a binary mask to assist detection. Experiments on a mixed dataset containing seven drone models and birds demonstrate that IMS-YOLOv8 achieves the best Precision, Recall, mAP@0.5, and mAP@0.5:0.95 among YOLOv8, YOLOv9, YOLOv10, and various improved variants, while reducing the parameter count to just 1.73 M (42.6% fewer than the baseline), thereby validating its balance of lightweight design and high accuracy for resource-constrained environments.
AB - We propose IMS-YOLOv8, a lightweight model built on the YOLOv8n architecture for fine-grained detection of multiple drone types and joint discrimination between drones and birds. Three novel modules are introduced in the backbone: MV2Block, a lightweight inverted residual unit that substantially reduces parameters and computational cost; MobileViTBlock, which integrates convolution with local self-attention to enhance global context awareness; and MSCAAttention, a multi-scale convolutional attention module with parallel receptive-field branches that strengthens feature extraction for small targets and complex backgrounds. The neck employs a multi-scale fusion structure, and the detection head makes parallel predictions at three scales (P3, P4, and P5), outputting bounding boxes and confidence scores for seven drone categories plus birds. At the finest scale, an auxiliary segmentation head produces a binary mask to assist detection. Experiments on a mixed dataset containing seven drone models and birds demonstrate that IMS-YOLOv8 achieves the best Precision, Recall, mAP@0.5, and mAP@0.5:0.95 among YOLOv8, YOLOv9, YOLOv10, and various improved variants, while reducing the parameter count to just 1.73 M (42.6% fewer than the baseline), thereby validating its balance of lightweight design and high accuracy for resource-constrained environments.
KW - Fine-grained UAV model recognition
KW - Lightweight detection
KW - Lightweight vision Transformer
KW - Multi-scale convolutional attention
UR - https://www.scopus.com/pages/publications/105021488869
U2 - 10.1109/ICSPCC66825.2025.11194433
DO - 10.1109/ICSPCC66825.2025.11194433
M3 - Conference contribution
AN - SCOPUS:105021488869
T3 - Proceedings of 2025 IEEE 15th International Conference on Signal Processing, Communications and Computing, ICSPCC 2025
BT - Proceedings of 2025 IEEE 15th International Conference on Signal Processing, Communications and Computing, ICSPCC 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 July 2025 through 21 July 2025
ER -