TY - GEN
T1 - WMDRS
T2 - 28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022
AU - Liu, Chong
AU - Yao, Yuan
AU - Dang, Yi
AU - Yang, Gang
AU - Jia, Wei
AU - Tian, Xinyu
AU - Zhou, Xingshe
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - To improve the capacity of airborne embedded systems to handle deep learning (DL) applications and to reduce overall power consumption, it is necessary to equip them with Neural Processing Units (NPUs). Compared with cloud systems, an airborne embedded system usually has a fixed application set but strict real-time constraints. Unfortunately, the built-in NPU scheduler does not consider application priority and therefore cannot provide sufficient real-time capability for airborne embedded systems. At present, there is little research on multi-task real-time scheduling for NPUs. Therefore, we propose WMDRS, a multi-task dynamic-quota real-time scheduling approach for Neural Processing Units based on a workload-aware performance model. The workload-aware NPU performance model can accurately predict the remaining execution time of a task that is running concurrently with other tasks on the NPU. The multi-task dynamic-quota real-time scheduling algorithm provides approximate preemption by dynamically adjusting the NPU computing resources allocated to active applications. In addition, we implement a prototype NPU scheduler without any hardware extension. Furthermore, the proposed NPU performance model and real-time scheduling algorithm are evaluated on realistic application sets. Experimental results demonstrate that WMDRS achieves low prediction error and a high scheduling success ratio.
AB - To improve the capacity of airborne embedded systems to handle deep learning (DL) applications and to reduce overall power consumption, it is necessary to equip them with Neural Processing Units (NPUs). Compared with cloud systems, an airborne embedded system usually has a fixed application set but strict real-time constraints. Unfortunately, the built-in NPU scheduler does not consider application priority and therefore cannot provide sufficient real-time capability for airborne embedded systems. At present, there is little research on multi-task real-time scheduling for NPUs. Therefore, we propose WMDRS, a multi-task dynamic-quota real-time scheduling approach for Neural Processing Units based on a workload-aware performance model. The workload-aware NPU performance model can accurately predict the remaining execution time of a task that is running concurrently with other tasks on the NPU. The multi-task dynamic-quota real-time scheduling algorithm provides approximate preemption by dynamically adjusting the NPU computing resources allocated to active applications. In addition, we implement a prototype NPU scheduler without any hardware extension. Furthermore, the proposed NPU performance model and real-time scheduling algorithm are evaluated on realistic application sets. Experimental results demonstrate that WMDRS achieves low prediction error and a high scheduling success ratio.
KW - NPU performance model
KW - embedded system
KW - preemptive scheduling
KW - real-time scheduling
UR - http://www.scopus.com/inward/record.url?scp=85152916900&partnerID=8YFLogxK
U2 - 10.1109/ICPADS56603.2022.00063
DO - 10.1109/ICPADS56603.2022.00063
M3 - Conference contribution
AN - SCOPUS:85152916900
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 435
EP - 442
BT - Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022
PB - IEEE Computer Society
Y2 - 10 January 2023 through 12 January 2023
ER -