An investigation of CNN models for differentiating malignant from benign lesions using small pathologically proven datasets

Shu Zhang, Fangfang Han, Zhengrong Liang, Jiaxing Tan, Weiguo Cao, Yongfeng Gao, Marc Pomeroy, Kenneth Ng, Wei Hou

Research output: Contribution to journal › Article › peer-review

44 Scopus citations

Abstract

Cancer remains one of the most serious threats to human health. Much effort has been devoted to advancing radiology and transformative tools (e.g. non-invasive computed tomography, or CT, imaging) to detect cancer at an early stage. One of the major goals is to differentiate malignant from benign lesions. In recent years, deep learning (DL), e.g. the convolutional neural network (CNN), has shown encouraging classification performance on medical images. However, DL algorithms generally need large datasets with ground truth, and in the medical imaging field, especially in cancer imaging, it is difficult to collect such a large volume of images with pathological information. Strategies are therefore needed to learn effectively from small datasets via CNN models. Toward that goal, this paper explores two CNN models, focusing extensively on the expansion of training samples from two small pathologically proven datasets (a colorectal polyp dataset and a lung nodule dataset), and then differentiates malignant from benign lesions. Experimental outcomes indicate that even in very small datasets of fewer than 70 subjects, malignant lesions can be successfully differentiated from benign ones via the proposed CNN models: the average AUCs (area under the receiver operating characteristic curve) for differentiating colorectal polyps and pulmonary nodules are 0.86 and 0.71, respectively. Our experiments further demonstrate that, for these two small datasets, feeding additional image features, such as the local binary pattern of the lesions, into the CNN models instead of studying only the original raw CT images can significantly improve classification performance. In addition, we find that the voxel-level CNN model we explored performs better when facing small and unbalanced datasets.
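
The idea of feeding the local binary pattern (LBP) alongside the raw image can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state which LBP variant, neighbourhood, normalization, or input layout the authors used, and the function names `local_binary_pattern_8` and `stack_raw_and_lbp` are hypothetical. The sketch computes the basic 3x3, 8-neighbour LBP of a 2D slice with NumPy and stacks it with the normalized raw intensities as a two-channel array that a CNN could take as input.

```python
import numpy as np


def local_binary_pattern_8(image):
    """Basic 3x3 LBP (assumed variant): one bit per neighbour >= centre."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    # Replicate border pixels so every pixel has eight neighbours.
    padded = np.pad(img, 1, mode="edge")
    # Neighbour offsets, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        # Set this neighbour's bit wherever it is at least as bright as the centre.
        codes |= (neighbour >= img).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def stack_raw_and_lbp(image):
    """Return a (2, H, W) array: min-max scaled raw slice plus scaled LBP map."""
    img = np.asarray(image, dtype=np.float64)
    span = img.max() - img.min()
    raw = (img - img.min()) / span if span > 0 else np.zeros_like(img)
    lbp = local_binary_pattern_8(img).astype(np.float64) / 255.0
    return np.stack([raw, lbp], axis=0)


if __name__ == "__main__":
    # Synthetic stand-in for a lesion patch; not data from the paper.
    rng = np.random.default_rng(0)
    patch = rng.normal(size=(32, 32))
    print(stack_raw_and_lbp(patch).shape)
```

Stacking the texture map as an extra channel is only one plausible way to combine it with the raw image; separate input branches or feature concatenation at a later layer would be equally consistent with the abstract's wording.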

Original language: English
Article number: 101645
Journal: Computerized Medical Imaging and Graphics
Volume: 77
State: Published - Oct 2019
Externally published: Yes

Keywords

  • Cancer imaging
  • Convolutional neural network
  • Machine learning
  • Nodule characterization
  • Pathologically proven datasets
  • Polyp characterization
