ConeSSD: A Novel Policy to Optimize the Performance of HDFS Heterogeneous Storage

Xiao Zhang, Liang Wang, Zhijie Huang, Huiru Xie, Yuchen Zhang, Michael Ngulube

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

HDFS (Hadoop Distributed File System) is the core storage service of Hadoop, which stores and processes large datasets efficiently. The performance of the storage service therefore significantly affects big data processing applications. Replacing HDDs with SSDs improves performance, but because of the high unit price of SSDs, replacing every HDD in HDFS is too expensive. Heterogeneous storage has therefore been widely used in HDFS. However, the existing hierarchical storage strategy of HDFS is simple and cannot achieve better cost performance. This paper presents a novel heterogeneous storage policy named Cone_SSD, which makes full use of the SSD media on each node to bring I/O performance close to that of SSD. Experimental results show that the proposed policy improves read and write performance by 48.11% and 47.45%, respectively, and reduces the fluctuation of read performance by 54.92%, compared with other policies under the same price-performance ratio.

Original language: English
Title of host publication: Proceedings - 24th IEEE International Conference on High Performance Computing and Communications, 8th IEEE International Conference on Data Science and Systems, 20th IEEE International Conference on Smart City and 8th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 876-881
Number of pages: 6
ISBN (Electronic): 9798350319934
DOIs
State: Published - 2022
Event: 24th IEEE International Conference on High Performance Computing and Communications, 8th IEEE International Conference on Data Science and Systems, 20th IEEE International Conference on Smart City and 8th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2022 - Chengdu, China
Duration: 18 Dec 2022 to 20 Dec 2022

Publication series

Name: Proceedings - 24th IEEE International Conference on High Performance Computing and Communications, 8th IEEE International Conference on Data Science and Systems, 20th IEEE International Conference on Smart City and 8th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2022

Conference

Conference: 24th IEEE International Conference on High Performance Computing and Communications, 8th IEEE International Conference on Data Science and Systems, 20th IEEE International Conference on Smart City and 8th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2022
Country/Territory: China
City: Chengdu
Period: 18/12/22 to 20/12/22

Keywords

  • Data migration
  • HDFS
  • Heterogeneous storage
  • Hierarchical storage policy
  • Hierarchical storage system
