TY - GEN
T1 - ConeSSD
T2 - 24th IEEE International Conference on High Performance Computing and Communications, 8th IEEE International Conference on Data Science and Systems, 20th IEEE International Conference on Smart City and 8th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2022
AU - Zhang, Xiao
AU - Wang, Liang
AU - Huang, Zhijie
AU - Xie, Huiru
AU - Zhang, Yuchen
AU - Ngulube, Michael
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - HDFS (Hadoop Distributed File System) is the core storage service of Hadoop, which stores and processes large datasets efficiently. The performance of this storage service therefore significantly affects big data processing applications. Replacing HDDs with SSDs improves performance, but because of the high unit price of SSDs, replacing every HDD in HDFS is too expensive. Heterogeneous storage is therefore widely used in HDFS. However, the existing hierarchical storage strategy of HDFS is simple and cannot achieve better cost-performance. This paper presents a novel heterogeneous storage policy named Cone_SSD, which makes full use of the SSD media on each node to bring I/O performance close to that of SSD. Experimental results show that the proposed policy improves read and write performance by 48.11% and 47.45%, respectively, and reduces the fluctuation of read performance by 54.92% compared with other policies at the same price-performance ratio.
AB - HDFS (Hadoop Distributed File System) is the core storage service of Hadoop, which stores and processes large datasets efficiently. The performance of this storage service therefore significantly affects big data processing applications. Replacing HDDs with SSDs improves performance, but because of the high unit price of SSDs, replacing every HDD in HDFS is too expensive. Heterogeneous storage is therefore widely used in HDFS. However, the existing hierarchical storage strategy of HDFS is simple and cannot achieve better cost-performance. This paper presents a novel heterogeneous storage policy named Cone_SSD, which makes full use of the SSD media on each node to bring I/O performance close to that of SSD. Experimental results show that the proposed policy improves read and write performance by 48.11% and 47.45%, respectively, and reduces the fluctuation of read performance by 54.92% compared with other policies at the same price-performance ratio.
KW - Data migration
KW - HDFS
KW - Heterogeneous storage
KW - Hierarchical storage policy
KW - Hierarchical storage system
UR - http://www.scopus.com/inward/record.url?scp=85152239368&partnerID=8YFLogxK
U2 - 10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00142
DO - 10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00142
M3 - Conference contribution
AN - SCOPUS:85152239368
T3 - Proceedings - 24th IEEE International Conference on High Performance Computing and Communications, 8th IEEE International Conference on Data Science and Systems, 20th IEEE International Conference on Smart City and 8th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2022
SP - 876
EP - 881
BT - Proceedings - 24th IEEE International Conference on High Performance Computing and Communications, 8th IEEE International Conference on Data Science and Systems, 20th IEEE International Conference on Smart City and 8th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 December 2022 through 20 December 2022
ER -