Efficient tree classifiers for large scale datasets

Research output: Contribution to journal › Article › peer-review

42 Scopus citations

Abstract

Classification plays a significant role in industrial production and everyday life. In the era of big data, it is especially important to design efficient classifiers that achieve high accuracy on large-scale datasets. In this paper, we propose two multivariate decision tree classifiers, one randomly partitioned and one partitioned by Principal Component Analysis (PCA), both of which train quickly and classify accurately. Each tree is built as an approximately balanced full binary tree: the multivariate combination weights are generated in one of two simple ways, and the split value is chosen by a median-based method, which together ensure the efficiency and effectiveness of the proposed algorithms. Extensive experiments on a series of large datasets demonstrate that the proposed methods outperform other classifiers in most cases.
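The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: project the samples onto a weight vector, split at the median of the projections, and recurse. All function names, the `min_size` and purity stopping rules, the majority-vote leaves, and the use of power iteration to approximate the first principal component are assumptions made for illustration, not the authors' algorithm.

```python
import random
import statistics
from collections import Counter


def random_weights(rows, rng):
    """Assumed 'random partition': one uniform random weight per feature."""
    return [rng.uniform(-1.0, 1.0) for _ in rows[0]]


def pca_weights(rows, rng, iters=50):
    """Assumed 'PCA partition': approximate the first principal component
    of the rows by power iteration on the centered data."""
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    w = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        # One step of w <- X^T (X w), followed by normalization.
        proj = [sum(a * b for a, b in zip(r, w)) for r in centered]
        new = [sum(p * r[j] for p, r in zip(proj, centered)) for j in range(d)]
        norm = sum(v * v for v in new) ** 0.5
        if norm == 0.0:  # no variance left; keep the current direction
            break
        w = [v / norm for v in new]
    return w


def project(row, w):
    """Multivariate combination: weighted sum of the features."""
    return sum(a * b for a, b in zip(row, w))


def build_tree(rows, labels, weight_fn, rng, min_size=5):
    """Recursively split at the median projection (hypothetical stopping rules)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(rows) <= min_size or len(set(labels)) == 1:
        return {"label": majority}
    w = weight_fn(rows, rng)
    scores = [project(r, w) for r in rows]
    threshold = statistics.median(scores)  # median keeps the halves near equal
    left = [i for i, s in enumerate(scores) if s <= threshold]
    right = [i for i, s in enumerate(scores) if s > threshold]
    if not left or not right:  # degenerate split, e.g. tied projections
        return {"label": majority}
    return {
        "w": w,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left],
                           weight_fn, rng, min_size),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right],
                            weight_fn, rng, min_size),
    }


def predict(node, row):
    """Follow the splits down to a leaf and return its majority label."""
    while "label" not in node:
        go_left = project(row, node["w"]) <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]
```

Under these assumptions, `build_tree(rows, labels, random_weights, random.Random(0))` would grow the randomly partitioned variant and passing `pca_weights` instead would grow the PCA-partitioned one. Because each split sends roughly half of the samples to each side, the tree depth grows about logarithmically with the number of samples, which is one plausible reading of why the abstract reports short training times.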

Original language: English
Pages (from-to): 70-79
Number of pages: 10
Journal: Neurocomputing
Volume: 284
State: Published - 5 Apr 2018

Keywords

  • Big data
  • Classification
  • Multivariate decision tree
