WMDRS: Workload-Aware Performance Model Based Multi-Task Dynamic-Quota Real-Time Scheduling for Neural Processing Units

Chong Liu, Yuan Yao, Yi Dang, Gang Yang, Wei Jia, Xinyu Tian, Xingshe Zhou

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

Equipping airborne embedded systems with Neural Processing Units (NPUs) is necessary to further improve their capacity for deep learning (DL) applications and to reduce overall power consumption. Compared with a cloud system, an airborne embedded system usually runs a fixed application set but under strict real-time constraints. Unfortunately, the built-in NPU scheduler does not consider application priority, so it cannot provide sufficient real-time capability for airborne embedded systems. At present, there is little research on multi-task real-time scheduling for NPUs. Therefore, we propose WMDRS, a workload-aware performance model based multi-task dynamic-quota real-time scheduling approach for Neural Processing Units. The workload-aware NPU performance model accurately predicts the remaining execution time of a task that is running concurrently with other tasks on the NPU. The multi-task dynamic-quota real-time scheduling algorithm provides approximate preemption by dynamically adjusting the NPU computing resources allocated to active applications. In addition, we implement a prototype NPU scheduler that requires no hardware extension. Furthermore, the proposed NPU performance model and real-time scheduling algorithm are evaluated on realistic application sets. Experimental results demonstrate that WMDRS achieves low prediction error and a high scheduling success ratio.
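The abstract does not give the performance model or the scheduling algorithm, so the following is only a minimal illustrative sketch of the general idea of quota-based approximate preemption, not the WMDRS method. Every name here (`Task`, `predict_remaining_time`, `assign_quotas`) is hypothetical. The sketch also assumes that execution time scales inversely with the fraction of NPU capacity a task receives; a real workload-aware model would account for interference between concurrent tasks.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int          # larger value = more urgent (assumed convention)
    remaining_work: float  # ms left if the task had the whole NPU (assumed unit)
    deadline: float        # absolute deadline in ms


def predict_remaining_time(task, quota):
    """Toy stand-in for a performance model: time scales as 1 / quota."""
    if quota <= 0:
        return float("inf")
    return task.remaining_work / quota


def assign_quotas(tasks, now):
    """Give each task, in priority order, the smallest NPU share that
    still meets its deadline; leftover capacity goes to the top task."""
    quotas = {}
    available = 1.0  # total NPU capacity, normalized to 1
    for task in sorted(tasks, key=lambda t: t.priority, reverse=True):
        slack = task.deadline - now
        # A task already past its deadline simply takes whatever is left.
        needed = task.remaining_work / slack if slack > 0 else available
        quota = min(needed, available)
        quotas[task.name] = quota
        available -= quota
    if tasks and available > 0:
        top = max(tasks, key=lambda t: t.priority)
        quotas[top.name] += available
    return quotas


if __name__ == "__main__":
    tasks = [
        Task("detect", priority=2, remaining_work=10.0, deadline=20.0),
        Task("track", priority=1, remaining_work=5.0, deadline=50.0),
    ]
    for name, quota in assign_quotas(tasks, now=0.0).items():
        print(name, round(quota, 3))
```

Re-running `assign_quotas` whenever a task arrives or finishes shrinks the share of lower-priority tasks instead of evicting them from the NPU. This is one way to read "approximate preemption" under the stated assumptions.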

Original language: English
Title of host publication: Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022
Publisher: IEEE Computer Society
Pages: 435-442
Number of pages: 8
ISBN (Electronic): 9781665473156
State: Published - 2023
Event: 28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022 - Nanjing, China
Duration: 10 Jan 2023 to 12 Jan 2023

Publication series

Name: Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
Volume: 2023-January
ISSN (Print): 1521-9097

Conference

Conference: 28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022
Country/Territory: China
City: Nanjing
Period: 10/01/23 to 12/01/23

Keywords

  • NPU performance model
  • embedded system
  • preemptive scheduling
  • real-time scheduling
