An investigation of CNN models for differentiating malignant from benign lesions using small pathologically proven datasets

Shu Zhang; Fangfang Han; Zhengrong Liang; Jiaxing Tan; Weiguo Cao; Yongfeng Gao; Marc Pomeroy; Kenneth Ng; Wei Hou

doi:10.1016/j.compmedimag.2019.101645

Research output: Contribution to journal › Article › peer-review

44 Scopus citations

Abstract

Cancer is one of the most serious threats to human health. Many efforts have been devoted to advancing radiology and transformative tools (e.g., non-invasive computed tomography, or CT, imaging) to detect cancer in its early stages. One of the major goals is to distinguish malignant from benign lesions. In recent years, deep learning (DL), e.g., the convolutional neural network (CNN), has shown encouraging classification performance on medical images. However, DL algorithms typically need large datasets with ground truth, and in medical imaging, especially cancer imaging, it is difficult to collect such large volumes of images with pathological information. Therefore, strategies are needed to learn effectively from small datasets via CNN models. Toward that goal, this paper explores two CNN models, focusing extensively on the expansion of training samples from two small pathologically proven datasets (a colorectal polyp dataset and a lung nodule dataset), and then differentiates malignant from benign lesions. Experimental outcomes indicate that even in very small datasets of fewer than 70 subjects, malignant lesions can be successfully differentiated from benign ones via the proposed CNN models: the average AUCs (area under the receiver operating characteristic curve) for differentiating colorectal polyps and pulmonary nodules are 0.86 and 0.71, respectively. Our experiments further demonstrate that, for these two small datasets, feeding additional image features, such as the local binary pattern of the lesions, into the CNN models, instead of only the original raw CT images, can significantly improve classification performance. In addition, we find that the voxel-level CNN model we explored performs better on small and unbalanced datasets.
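For readers unfamiliar with the AUC figures quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It is not the authors' evaluation code: the function name `roc_auc` and the toy labels and scores are invented for illustration, and the sketch uses the rank-based interpretation of AUC (the probability that a randomly chosen malignant case scores higher than a randomly chosen benign one, with ties counted as one half).

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0 (benign) or 1 (malignant); hypothetical encoding.
    scores: iterable of classifier outputs, higher meaning more likely malignant.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not results from the paper).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported averages of 0.86 and 0.71.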

Original language	English
Article number	101645
Journal	Computerized Medical Imaging and Graphics
Volume	77
DOIs	https://doi.org/10.1016/j.compmedimag.2019.101645
State	Published - Oct 2019
Externally published	Yes

Keywords

Cancer imaging
Convolutional neural network
Machine learning
Nodule characterization
Pathologically proven datasets
Polyp characterization

Cite this

@article{e23ef0f2fb0f4b138308617c461b99a1,

title = "An investigation of CNN models for differentiating malignant from benign lesions using small pathologically proven datasets",

abstract = "Cancer has been one of the most threatening diseases to human health. There have been many efforts devoted to the advancement of radiology and transformative tools (e.g. non-invasive computed tomographic or CT imaging) to detect cancer in early stages. One of the major goals is to identify malignant from benign lesions. In recent years, machine deep learning (DL), e.g. convolutional neural network (CNN), has shown encouraging classification performance on medical images. However, DL algorithms always need large datasets with ground truth. Yet in the medical imaging field, especially for cancer imaging, it is difficult to collect such large volume of images with pathological information. Therefore, strategies are needed to learn effectively from small datasets via CNN models. To forward that goal, this paper explores two CNN models by focusing extensively on expansion of training samples from two small pathologically proven datasets (colorectal polyp dataset and lung nodule dataset) and then differentiating malignant from benign lesions. Experimental outcomes indicate that even in very small datasets of less than 70 subjects, malignance can be successfully differentiated from benign via the proposed CNN models, the average AUCs (area under the receiver operating curve) of differentiating colorectal polyps and pulmonary nodules are 0.86 and 0.71, respectively. Our experiments further demonstrate that for these two small datasets, instead of only studying the original raw CT images, feeding additional image features, such as the local binary pattern of the lesions, into the CNN models can significantly improve classification performance. In addition, we find that our explored voxel level CNN model has better performance when facing the small and unbalanced datasets.",

keywords = "Cancer imaging, Convolutional neural network, Machine learning, Nodule characterization, Pathologically proven datasets, Polyp characterization",

author = "Shu Zhang and Fangfang Han and Zhengrong Liang and Jiaxing Tan and Weiguo Cao and Yongfeng Gao and Marc Pomeroy and Kenneth Ng and Wei Hou",

note = "Publisher Copyright: {\textcopyright} 2019 Elsevier Ltd",

year = "2019",

month = oct,

doi = "10.1016/j.compmedimag.2019.101645",

language = "English",

volume = "77",

journal = "Computerized Medical Imaging and Graphics",

issn = "0895-6111",

publisher = "Elsevier Ltd",

}

TY - JOUR

T1 - An investigation of CNN models for differentiating malignant from benign lesions using small pathologically proven datasets

AU - Zhang, Shu

AU - Han, Fangfang

AU - Liang, Zhengrong

AU - Tan, Jiaxing

AU - Cao, Weiguo

AU - Gao, Yongfeng

AU - Pomeroy, Marc

AU - Ng, Kenneth

AU - Hou, Wei

PY - 2019/10

Y1 - 2019/10

N2 - Cancer has been one of the most threatening diseases to human health. There have been many efforts devoted to the advancement of radiology and transformative tools (e.g. non-invasive computed tomographic or CT imaging) to detect cancer in early stages. One of the major goals is to identify malignant from benign lesions. In recent years, machine deep learning (DL), e.g. convolutional neural network (CNN), has shown encouraging classification performance on medical images. However, DL algorithms always need large datasets with ground truth. Yet in the medical imaging field, especially for cancer imaging, it is difficult to collect such large volume of images with pathological information. Therefore, strategies are needed to learn effectively from small datasets via CNN models. To forward that goal, this paper explores two CNN models by focusing extensively on expansion of training samples from two small pathologically proven datasets (colorectal polyp dataset and lung nodule dataset) and then differentiating malignant from benign lesions. Experimental outcomes indicate that even in very small datasets of less than 70 subjects, malignance can be successfully differentiated from benign via the proposed CNN models, the average AUCs (area under the receiver operating curve) of differentiating colorectal polyps and pulmonary nodules are 0.86 and 0.71, respectively. Our experiments further demonstrate that for these two small datasets, instead of only studying the original raw CT images, feeding additional image features, such as the local binary pattern of the lesions, into the CNN models can significantly improve classification performance. In addition, we find that our explored voxel level CNN model has better performance when facing the small and unbalanced datasets.

AB - Cancer has been one of the most threatening diseases to human health. There have been many efforts devoted to the advancement of radiology and transformative tools (e.g. non-invasive computed tomographic or CT imaging) to detect cancer in early stages. One of the major goals is to identify malignant from benign lesions. In recent years, machine deep learning (DL), e.g. convolutional neural network (CNN), has shown encouraging classification performance on medical images. However, DL algorithms always need large datasets with ground truth. Yet in the medical imaging field, especially for cancer imaging, it is difficult to collect such large volume of images with pathological information. Therefore, strategies are needed to learn effectively from small datasets via CNN models. To forward that goal, this paper explores two CNN models by focusing extensively on expansion of training samples from two small pathologically proven datasets (colorectal polyp dataset and lung nodule dataset) and then differentiating malignant from benign lesions. Experimental outcomes indicate that even in very small datasets of less than 70 subjects, malignance can be successfully differentiated from benign via the proposed CNN models, the average AUCs (area under the receiver operating curve) of differentiating colorectal polyps and pulmonary nodules are 0.86 and 0.71, respectively. Our experiments further demonstrate that for these two small datasets, instead of only studying the original raw CT images, feeding additional image features, such as the local binary pattern of the lesions, into the CNN models can significantly improve classification performance. In addition, we find that our explored voxel level CNN model has better performance when facing the small and unbalanced datasets.

KW - Cancer imaging

KW - Convolutional neural network

KW - Machine learning

KW - Nodule characterization

KW - Pathologically proven datasets

KW - Polyp characterization

UR - http://www.scopus.com/inward/record.url?scp=85071093088&partnerID=8YFLogxK

U2 - 10.1016/j.compmedimag.2019.101645

DO - 10.1016/j.compmedimag.2019.101645

M3 - Article

C2 - 31454710

AN - SCOPUS:85071093088

SN - 0895-6111

VL - 77

JO - Computerized Medical Imaging and Graphics

JF - Computerized Medical Imaging and Graphics

M1 - 101645

ER -
