A MapReduce-Based Parallel Random Forest Approach for Predicting Large-Scale Protein-Protein Interactions

Bo Ya Ji, Zhu Hong You, Long Yang, Ji Ren Zhou, Peng Wei Hu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Protein-protein interactions (PPIs) are important for understanding cellular mechanisms. A number of computational approaches for predicting PPIs have been proposed recently, but most existing methods are suitable only for relatively small-scale PPI prediction. In this study, we propose a MapReduce-based parallel Random Forest model for predicting large-scale PPIs using only protein sequence information. Specifically, the Moran autocorrelation descriptor is first used to extract local features from each protein sequence. The MapReduce-based parallel Random Forest model is then used to perform PPI prediction. In the experiments, the proposed method greatly reduces the time required to train the model while maintaining high accuracy in predicting potential PPIs. These promising results indicate that the method can serve as an efficient tool for large-scale PPI prediction.
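The feature-extraction step named in the abstract can be illustrated with a minimal sketch of the Moran autocorrelation descriptor in its usual textbook form. This is not the authors' implementation: the function name `moran_autocorrelation`, the `max_lag` parameter, and the choice of the Kyte-Doolittle hydropathy index as the only property scale are assumptions made for illustration, since the abstract does not say which physicochemical properties or lag values were used.

```python
from statistics import mean

# Kyte-Doolittle hydropathy index. Used here only as an example property
# scale (assumption: the abstract does not name the scales the authors used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def moran_autocorrelation(sequence, scale=KYTE_DOOLITTLE, max_lag=5):
    """Return [I(1), ..., I(max_lag)] for one protein sequence.

    I(d) is the mean product of mean-centered property values of residues
    d positions apart, divided by the variance of the property over the
    whole sequence. Residues missing from `scale` raise KeyError.
    """
    values = [scale[residue] for residue in sequence.upper()]
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")

    average = mean(values)
    centered = [v - average for v in values]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        # A sequence with a constant property value has no spatial signal.
        return [0.0] * max_lag

    descriptor = []
    for lag in range(1, max_lag + 1):
        covariance = sum(
            centered[i] * centered[i + lag] for i in range(n - lag)
        ) / (n - lag)
        descriptor.append(covariance / variance)
    return descriptor
```

Because the values are mean-centered and divided by the variance, rescaling or shifting the property scale does not change the result. In a pipeline like the one the abstract describes, a protein pair could then be represented by, for instance, concatenating the descriptors of its two sequences before passing them to the classifier; that pairing scheme is likewise an assumption, not a detail stated in the abstract.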

Original language: English
Title of host publication: Intelligent Computing Methodologies - 16th International Conference, ICIC 2020, Proceedings
Editors: De-Shuang Huang, Prashan Premaratne
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 400-407
Number of pages: 8
ISBN (Print): 9783030607951
State: Published - 2020
Externally published: Yes
Event: 16th International Conference on Intelligent Computing, ICIC 2020 - Bari, Italy
Duration: 2 Oct 2020 to 5 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12465 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Conference on Intelligent Computing, ICIC 2020
Country/Territory: Italy
City: Bari
Period: 2/10/20 to 5/10/20

Keywords

  • MapReduce
  • Protein sequence
  • Protein-protein interactions
  • Random forest
