Real-Time Target Detection in Visual Sensing Environments Using Deep Transfer Learning and Improved Anchor Box Generation

Zhenbo Ren; Edmund Y. Lam; Jianlin Zhao

doi:10.1109/ACCESS.2020.3032955

School of Physical Science and Technology

Research output: Contribution to journal › Article › Peer-reviewed

11 citations (Scopus)

Abstract

Visual perception is essential for understanding the phenomena and environments of the world. Pervasively deployed devices such as cameras are key to dynamic status monitoring, object detection, and recognition. As a result, visual sensing environments that use one or more cameras must handle a huge volume of high-resolution images, videos, and other multimedia. In this paper, to promote smart, fast detection in visual environments, we propose a deep transfer learning strategy for real-time target detection in situations where acquiring large-scale data is complicated and challenging. By employing transfer learning and pre-training the network on established datasets, outstanding performance in target localization and recognition can be achieved, and the time needed to train a deep model is also significantly reduced. In addition, the original clustering method in the You Only Look Once (YOLOv3) detection model, k-means, is sensitive to the initial cluster centers when estimating the initial width and height of the predicted bounding boxes, and processing large-scale data with it is extremely time-consuming. To address these problems, an improved clustering method, mini-batch k-means++, is incorporated into the detection model to improve the clustering accuracy. We examine this consistent outperformance in three typical applications of vision-based sensing environments: digital pathology, smart agriculture, and remote sensing.
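
The anchor-generation idea in the abstract (seed the cluster centers with k-means++, then refine them on small random batches) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the procedure, so the function names, the batch size, the iteration count, and the use of squared Euclidean distance on (width, height) pairs are all assumptions made for illustration. YOLO-style anchor clustering is often done with a 1 − IoU distance instead.

```python
import random


def kmeanspp_init(boxes, k, rng):
    """Choose k initial centers with k-means++ seeding.

    Assumption: squared Euclidean distance on (width, height) pairs.
    """
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        # Squared distance from every box to its nearest chosen center.
        d2 = [min((w - cw) ** 2 + (h - ch) ** 2 for cw, ch in centers)
              for w, h in boxes]
        if sum(d2) == 0:
            # Every box coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(boxes))
        else:
            # Far-away boxes are proportionally more likely to be picked.
            centers.append(rng.choices(boxes, weights=d2)[0])
    return [list(c) for c in centers]


def minibatch_kmeans_anchors(boxes, k, batch_size=32, iters=200, seed=0):
    """Estimate k anchor (width, height) pairs with mini-batch k-means.

    Hypothetical helper: each iteration assigns a random batch to the
    nearest centers, then moves each center toward its assigned boxes
    with a per-center step size of 1 / (number of boxes seen so far).
    """
    rng = random.Random(seed)
    centers = kmeanspp_init(boxes, k, rng)
    counts = [0] * k
    for _ in range(iters):
        batch = rng.sample(boxes, min(batch_size, len(boxes)))
        # Assignment step, using the centers as they stand for this batch.
        nearest = [
            min(range(k),
                key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2)
            for w, h in batch
        ]
        # Update step: running mean of the boxes assigned to each center.
        for (w, h), j in zip(batch, nearest):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centers[j][0] += eta * (w - centers[j][0])
            centers[j][1] += eta * (h - centers[j][1])
    # Return anchors sorted by area, smallest first.
    return sorted((tuple(c) for c in centers), key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Synthetic box sizes in pixels, made up for this demo only.
    demo_rng = random.Random(42)
    demo_boxes = [(demo_rng.uniform(8, 16), demo_rng.uniform(8, 16)) for _ in range(100)]
    demo_boxes += [(demo_rng.uniform(90, 110), demo_rng.uniform(60, 80)) for _ in range(100)]
    print(minibatch_kmeans_anchors(demo_boxes, k=2))
```

Because each iteration touches only `batch_size` boxes rather than the full set, the cost per iteration stays fixed as the dataset grows, which is the usual motivation for the mini-batch variant; the k-means++ seeding reduces the dependence on the initial centers.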

Original language	English
Article number	9235577
Pages (from-to)	193512-193522
Number of pages	11
Journal	IEEE Access
Volume	8
DOI	https://doi.org/10.1109/ACCESS.2020.3032955
Publication status	Published - 2020

Access to document

10.1109/ACCESS.2020.3032955

Cite this

@article{957f33136dc44854a8efef7784f065b2,

title = "Real-Time Target Detection in Visual Sensing Environments Using Deep Transfer Learning and Improved Anchor Box Generation",

abstract = "Visual perception is critical and essential to understand phenomenon and environments of the world. Pervasively configured devices like cameras are key in dynamic status monitoring, object detection and recognition. As such, visual sensor environments using one single or multiple cameras must deal with a huge amount of high-resolution images, videos or other multimedia. In this paper, to promote smart advancement and fast detection of visual environments, we propose a deep transfer learning strategy for real-time target detection for situations where acquiring large-scale data is complicated and challenging. By employing the concept of transfer learning and pre-training the network with established datasets, apart from the outstanding performance in target localization and recognition can be achieved, time consumption of training a deep model is also significantly reduced. Besides, the original clustering method, {k}-means, in the You Only Look Once (YOLOv3) detection model is sensitive to the initial cluster centers when estimating the initial width and height of the predicted bounding boxes, thereby processing large-scale data is extremely time-consuming. To handle such problems, an improved clustering method, mini batch {k}-means++ is incorporated into the detection model to improve the clustering accuracy. We examine the sustainable outperformance in three typical applications, digital pathology, smart agriculture and remote sensing, in vision-based sensing environments.",

keywords = "Clustering methods, machine learning algorithms, machine vision, object detection",

author = "Zhenbo Ren and Lam, {Edmund Y.} and Jianlin Zhao",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2020",

doi = "10.1109/ACCESS.2020.3032955",

language = "English",

volume = "8",

pages = "193512--193522",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Real-Time Target Detection in Visual Sensing Environments Using Deep Transfer Learning and Improved Anchor Box Generation

AU - Ren, Zhenbo

AU - Lam, Edmund Y.

AU - Zhao, Jianlin

PY - 2020

Y1 - 2020

AB - Visual perception is critical and essential to understand phenomenon and environments of the world. Pervasively configured devices like cameras are key in dynamic status monitoring, object detection and recognition. As such, visual sensor environments using one single or multiple cameras must deal with a huge amount of high-resolution images, videos or other multimedia. In this paper, to promote smart advancement and fast detection of visual environments, we propose a deep transfer learning strategy for real-time target detection for situations where acquiring large-scale data is complicated and challenging. By employing the concept of transfer learning and pre-training the network with established datasets, apart from the outstanding performance in target localization and recognition can be achieved, time consumption of training a deep model is also significantly reduced. Besides, the original clustering method, k-means, in the You Only Look Once (YOLOv3) detection model is sensitive to the initial cluster centers when estimating the initial width and height of the predicted bounding boxes, thereby processing large-scale data is extremely time-consuming. To handle such problems, an improved clustering method, mini batch k-means++ is incorporated into the detection model to improve the clustering accuracy. We examine the sustainable outperformance in three typical applications, digital pathology, smart agriculture and remote sensing, in vision-based sensing environments.

KW - Clustering methods

KW - machine learning algorithms

KW - machine vision

KW - object detection

UR - http://www.scopus.com/inward/record.url?scp=85096036661&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2020.3032955

DO - 10.1109/ACCESS.2020.3032955

M3 - Article

AN - SCOPUS:85096036661

SN - 2169-3536

VL - 8

SP - 193512

EP - 193522

JO - IEEE Access

JF - IEEE Access

M1 - 9235577

ER -
