Abstract
Efficient and accurate apple detection is critical for deploying harvesting robots in orchards. Because robotic platforms have limited memory, detection algorithms must be lightweight enough to run in real time. To address this challenge, we propose U-DPnet, an ultralight convolutional neural network based on depthwise separable convolution. The backbone incorporates a cross-stage deep separable module (CDM) and a multi-cascade deep separable module (MDM), which add nonlinear units and attention mechanisms and reduce the network volume while improving feature representation capability. A simplified bi-directional feature pyramid network (BiFPN) is constructed in the neck for multi-scale feature fusion, and adaptive feature propagation (AFP) is designed between the neck and the backbone for smooth feature transitions across scales. To further reduce the network volume, we develop a uniform channel downsampling and network weight-sharing strategy. Multiple loss functions and label assignment strategies are used to optimize training. U-DPnet is evaluated on a self-built apple dataset. Experimental results show that U-DPnet achieves detection accuracy and speed comparable to those of seven state-of-the-art (SOTA) models. Moreover, U-DPnet has a clear advantage in model size and computational cost: only 1.067M parameters and 0.563G FLOPs, 39.79% and 36.36% fewer than YOLOv5-n, respectively.
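As a rough illustration of why a network built on depthwise separable (depth-separable) convolution can be so small, the sketch below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1x1 pointwise convolution. The layer sizes (64 input channels, 128 output channels, 3x3 kernel) and the function names are assumptions chosen for illustration; they are not taken from U-DPnet.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Weight count of a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Weight count of a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = c_in * k * k + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
standard = conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
reduction = 1 - separable / standard

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"reduction: {reduction:.1%}")
```

Without bias terms, the ratio of separable to standard weights is 1/c_out + 1/k², so the saving grows with both the kernel size and the number of output channels.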
| Original language | English |
|---|---|
| Article number | 76 |
| Journal | Journal of Real-Time Image Processing |
| Volume | 20 |
| Issue number | 4 |
| State | Published - Aug 2023 |
| Externally published | Yes |
Keywords
- Apple detection
- DP structure
- Lightweight network
- Localization