TY - GEN
T1 - Deep Learning Inference on Heterogeneous Mobile Processors
T2 - 2024 Workshop on Adaptive AIoT Systems, AdaAIoTSys 2024
AU - Liu, Sicong
AU - Zhou, Wentao
AU - Zhou, Zimu
AU - Guo, Bin
AU - Wang, Minfan
AU - Fang, Cheng
AU - Lin, Zheng
AU - Yu, Zhiwen
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2024/6/3
Y1 - 2024/6/3
N2 - There is a growing demand to deploy computation-intensive deep learning (DL) models on resource-constrained mobile devices for real-time intelligent applications. Equipped with a variety of processing units such as CPUs, GPUs, and NPUs, mobile devices hold the potential to accelerate DL inference via parallel execution across heterogeneous processors. Various efficient parallel methods have been explored to optimize computation distribution, achieve load balancing, and minimize communication cost across processors. Yet their practical effectiveness in dynamic and diverse real-world mobile environments remains underexplored. This paper presents a holistic empirical study to assess the capabilities and challenges associated with parallel DL inference on heterogeneous mobile processors. Through carefully designed experiments covering various DL models, mobile software/hardware environments, workload patterns, and resource availability, we identify the limitations of existing techniques and highlight opportunities for cross-level optimization.
AB - There is a growing demand to deploy computation-intensive deep learning (DL) models on resource-constrained mobile devices for real-time intelligent applications. Equipped with a variety of processing units such as CPUs, GPUs, and NPUs, mobile devices hold the potential to accelerate DL inference via parallel execution across heterogeneous processors. Various efficient parallel methods have been explored to optimize computation distribution, achieve load balancing, and minimize communication cost across processors. Yet their practical effectiveness in dynamic and diverse real-world mobile environments remains underexplored. This paper presents a holistic empirical study to assess the capabilities and challenges associated with parallel DL inference on heterogeneous mobile processors. Through carefully designed experiments covering various DL models, mobile software/hardware environments, workload patterns, and resource availability, we identify the limitations of existing techniques and highlight opportunities for cross-level optimization.
KW - Heterogeneous processors
KW - parallel DL inference
UR - http://www.scopus.com/inward/record.url?scp=85196550753&partnerID=8YFLogxK
U2 - 10.1145/3662007.3663881
DO - 10.1145/3662007.3663881
M3 - Conference contribution
AN - SCOPUS:85196550753
T3 - AdaAIoTSys 2024 - Proceedings of the 2024 Workshop on Adaptive AIoT Systems
SP - 1
EP - 6
BT - AdaAIoTSys 2024 - Proceedings of the 2024 Workshop on Adaptive AIoT Systems
PB - Association for Computing Machinery, Inc
Y2 - 3 June 2024 through 7 June 2024
ER -