TY - JOUR
T1 - Contextual Dependency Vision Transformer for spectrogram-based multivariate time series analysis
AU - Yao, Jieru
AU - Han, Longfei
AU - Yang, Kaihui
AU - Guo, Guangyu
AU - Liu, Nian
AU - Huang, Xiankai
AU - Zheng, Zhaohui
AU - Zhang, Dingwen
AU - Han, Junwei
N1 - Publisher Copyright: © 2023 Elsevier B.V.
PY - 2024/3/1
Y1 - 2024/3/1
N2 - Multivariate time series (MTS) analysis plays an important role in various real-world applications. Existing Transformer-based methods address this problem using hierarchical semantic representations across different scales. However, most of them fail to exploit the multiple temporal and variable relationships within these hierarchical semantic representations. To this end, this paper proposes a novel method named Contextual Dependency Vision Transformer (CD-ViT), which generates multi-grained semantic information from spectrograms and explores mutual dependencies between multi-variable and multi-temporal representations. CD-ViT contains two key modules, i.e., the Hierarchical Variable-dependency Transformer (HVT) module and the Bidirectional Temporal-dependency Interaction (BTI) module. Specifically, the HVT module progressively establishes mutual dependencies between multiple variables, from fine to coarse scales, with shared parameters. The BTI module employs two bidirectional flows to fuse multi-temporal tokens through zoom-in and zoom-out operations. Comprehensive experiments on widely used datasets, including UEA, Olszewski, UCI, MIMIC III, and ETT, demonstrate that the proposed approach achieves significant improvements on three popular tasks, i.e., classification, regression, and forecasting. The code is available at https://github.com/Kali-github/CD-ViT.
KW - Hierarchical vision-style transformer
KW - Multi-temporal dependency
KW - Multi-variable dependency
KW - Multivariate time series analysis
KW - Spectrogram-based contextual interaction
UR - http://www.scopus.com/inward/record.url?scp=85182259311&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2023.127215
DO - 10.1016/j.neucom.2023.127215
M3 - Article
AN - SCOPUS:85182259311
SN - 0925-2312
VL - 572
JO - Neurocomputing
JF - Neurocomputing
M1 - 127215
ER -