WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units

Chong Liu; Yuan Yao; Yi Dang; Gang Yang; Wei Jia; Xinyu Tian; Xingshe Zhou

doi:10.1109/ICPADS56603.2022.00063

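School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

2 Citations (Scopus)

Abstract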

To further improve the capacity of airborne embedded systems to handle deep learning (DL) applications and to reduce overall power consumption, it is necessary to equip them with Neural Processing Units (NPUs). Compared with cloud systems, an airborne embedded system usually has a fixed application set but strict real-time constraints. Unfortunately, the inherent NPU scheduler does not consider application priority and therefore cannot provide sufficient real-time capability for airborne embedded systems. At present, there is little research on multi-task real-time scheduling for NPUs. Therefore, we propose WMDRS, a workload-aware performance model based multi-task dynamic-quota real-time scheduling approach for Neural Processing Units. The workload-aware NPU performance model can accurately predict the remaining execution time of a task that is running concurrently with other tasks on the NPU. The multi-task dynamic-quota real-time scheduling algorithm provides approximate preemption by dynamically adjusting the NPU computing resources allocated to active applications. In addition, we implement a prototype NPU scheduler without any hardware extension. Furthermore, the proposed NPU performance model and real-time scheduling algorithm are evaluated on realistic application sets. Experimental results demonstrate that WMDRS achieves low prediction error and a high scheduling success ratio.

Original language	English
Title of host publication	Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022
Publisher	IEEE Computer Society
Pages	435-442
Number of pages	8
ISBN (Electronic)	9781665473156
DOI	https://doi.org/10.1109/ICPADS56603.2022.00063
Publication status	Published - 2023
Event	28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022 - Nanjing, China; Duration: 10 Jan 2023 → 12 Jan 2023

Publication series

Name	Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
Volume	2023-January
ISSN (Print)	1521-9097

Conference

Conference	28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022
Country/Territory	China
City	Nanjing
Period	10/01/23 → 12/01/23

Cite this

Liu, C., Yao, Y., Dang, Y., Yang, G., Jia, W., Tian, X., & Zhou, X. (2023). WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units. In Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022 (pp. 435-442). (Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS; Vol. 2023-January). IEEE Computer Society. https://doi.org/10.1109/ICPADS56603.2022.00063

Liu, Chong ; Yao, Yuan ; Dang, Yi et al. / WMDRS : Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units. Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022. IEEE Computer Society, 2023. pp. 435-442 (Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS).

@inproceedings{80d931425b194126bd0a62868822f9a7,

title = "WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units",

abstract = "To further improve the capacity of airborne embedded system for dealing with deep learning (DL) applications and reduce overall power consumption, it is necessary to equip Neural Processing Units (NPUs). Comparing with the cloud system, the airborne embedded system usually has a fixed application set, but strict real-time constraints. Unfortunately, the inherent NPU scheduler does not consider the application priority, which cannot provide the sufficient real-time capability for the airborne embedded system. At present, there are few researches on multi-task real-time scheduling for NPUs. Therefore, we propose WMDRS, a workload-aware performance model multi-task dynamic-quota real-time scheduling for Neural Processing Units. The NPU performance model based on workload-awareness can accurately predict the remaining execution time of a task, which is running concurrently with other tasks on NPU. The multi-task dynamic-quota real-time scheduling algorithm can provide the approximate preemption by dynamically adjusting NPU computing resources for active applications. In addition, we implement a prototype NPU scheduler without any hardware extension. Furthermore, the proposed NPU performance model and real-time scheduling algorithm are evaluated in realistic application sets. Experimental results demonstrate that WMDRS can achieve low prediction error and high scheduling success ratio.",

keywords = "NPU performance model, embedded system, preemptive scheduling, real-time scheduling",

author = "Chong Liu and Yuan Yao and Yi Dang and Gang Yang and Wei Jia and Xinyu Tian and Xingshe Zhou",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022 ; Conference date: 10-01-2023 Through 12-01-2023",

year = "2023",

doi = "10.1109/ICPADS56603.2022.00063",

language = "English",

series = "Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS",

publisher = "IEEE Computer Society",

pages = "435--442",

booktitle = "Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022",

}

Liu, C, Yao, Y, Dang, Y, Yang, G, Jia, W, Tian, X & Zhou, X 2023, WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units. In Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022. Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS, vol. 2023-January, IEEE Computer Society, pp. 435-442, 28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022, Nanjing, China, 10/01/23. https://doi.org/10.1109/ICPADS56603.2022.00063

WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units. / Liu, Chong; Yao, Yuan; Dang, Yi et al.
Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022. IEEE Computer Society, 2023. pp. 435-442 (Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS; Vol. 2023-January).

TY - GEN

T1 - WMDRS

T2 - 28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022

AU - Liu, Chong

AU - Yao, Yuan

AU - Dang, Yi

AU - Yang, Gang

AU - Jia, Wei

AU - Tian, Xinyu

AU - Zhou, Xingshe

PY - 2023

Y1 - 2023

N2 - To further improve the capacity of airborne embedded system for dealing with deep learning (DL) applications and reduce overall power consumption, it is necessary to equip Neural Processing Units (NPUs). Comparing with the cloud system, the airborne embedded system usually has a fixed application set, but strict real-time constraints. Unfortunately, the inherent NPU scheduler does not consider the application priority, which cannot provide the sufficient real-time capability for the airborne embedded system. At present, there are few researches on multi-task real-time scheduling for NPUs. Therefore, we propose WMDRS, a workload-aware performance model multi-task dynamic-quota real-time scheduling for Neural Processing Units. The NPU performance model based on workload-awareness can accurately predict the remaining execution time of a task, which is running concurrently with other tasks on NPU. The multi-task dynamic-quota real-time scheduling algorithm can provide the approximate preemption by dynamically adjusting NPU computing resources for active applications. In addition, we implement a prototype NPU scheduler without any hardware extension. Furthermore, the proposed NPU performance model and real-time scheduling algorithm are evaluated in realistic application sets. Experimental results demonstrate that WMDRS can achieve low prediction error and high scheduling success ratio.

AB - To further improve the capacity of airborne embedded system for dealing with deep learning (DL) applications and reduce overall power consumption, it is necessary to equip Neural Processing Units (NPUs). Comparing with the cloud system, the airborne embedded system usually has a fixed application set, but strict real-time constraints. Unfortunately, the inherent NPU scheduler does not consider the application priority, which cannot provide the sufficient real-time capability for the airborne embedded system. At present, there are few researches on multi-task real-time scheduling for NPUs. Therefore, we propose WMDRS, a workload-aware performance model multi-task dynamic-quota real-time scheduling for Neural Processing Units. The NPU performance model based on workload-awareness can accurately predict the remaining execution time of a task, which is running concurrently with other tasks on NPU. The multi-task dynamic-quota real-time scheduling algorithm can provide the approximate preemption by dynamically adjusting NPU computing resources for active applications. In addition, we implement a prototype NPU scheduler without any hardware extension. Furthermore, the proposed NPU performance model and real-time scheduling algorithm are evaluated in realistic application sets. Experimental results demonstrate that WMDRS can achieve low prediction error and high scheduling success ratio.

KW - NPU performance model

KW - embedded system

KW - preemptive scheduling

KW - real-time scheduling

UR - http://www.scopus.com/inward/record.url?scp=85152916900&partnerID=8YFLogxK

U2 - 10.1109/ICPADS56603.2022.00063

DO - 10.1109/ICPADS56603.2022.00063

M3 - Conference contribution

AN - SCOPUS:85152916900

T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS

SP - 435

EP - 442

BT - Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022

PB - IEEE Computer Society

Y2 - 10 January 2023 through 12 January 2023

ER -

Liu C, Yao Y, Dang Y, Yang G, Jia W, Tian X et al. WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units. In Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022. IEEE Computer Society. 2023. p. 435-442. (Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS). doi: 10.1109/ICPADS56603.2022.00063
