TY - GEN
T1 - Context-Adaptive Online Reinforcement Learning for Multi-view Video Summarization on Mobile Devices
AU - Hao, Jingyi
AU - Liu, Sicong
AU - Guo, Bin
AU - Ding, Yasan
AU - Yu, Zhiwen
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - The huge amount of video data produced by ubiquitous cameras imposes significant challenges for users to efficiently obtain useful video information. Multi-view video summarization (MVS) aggregates multi-view videos into information-rich video summaries by considering content correlations within each view and between multiple views. Existing MVS methods fail to concentrate on performance across scenarios and usually achieve satisfactory performance only on specific training datasets. However, when faced with unseen video scenarios, the quality of the summaries generated by existing methods may degrade. Moreover, they usually only use cameras for data acquisition, which requires a large amount of network bandwidth to transfer the data to the server for processing. To bridge this gap, we propose a context-adaptive online reinforcement learning multi-view video summarization framework (COORS) that meets the low response latency requirements of context adaptation while ensuring camera hardware compatibility. Specifically, COORS enables retraining in new contexts by extracting context-independent rewards, while improving model convergence speed based on representation learning and replica playback. Extensive experiments show that COORS achieves better performance than the state-of-the-art baselines.
AB - The huge amount of video data produced by ubiquitous cameras imposes significant challenges for users to efficiently obtain useful video information. Multi-view video summarization (MVS) aggregates multi-view videos into information-rich video summaries by considering content correlations within each view and between multiple views. Existing MVS methods fail to concentrate on performance across scenarios and usually achieve satisfactory performance only on specific training datasets. However, when faced with unseen video scenarios, the quality of the summaries generated by existing methods may degrade. Moreover, they usually only use cameras for data acquisition, which requires a large amount of network bandwidth to transfer the data to the server for processing. To bridge this gap, we propose a context-adaptive online reinforcement learning multi-view video summarization framework (COORS) that meets the low response latency requirements of context adaptation while ensuring camera hardware compatibility. Specifically, COORS enables retraining in new contexts by extracting context-independent rewards, while improving model convergence speed based on representation learning and replica playback. Extensive experiments show that COORS achieves better performance than the state-of-the-art baselines.
KW - context-adaptive
KW - multi-view video summarization
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85152948388&partnerID=8YFLogxK
U2 - 10.1109/ICPADS56603.2022.00060
DO - 10.1109/ICPADS56603.2022.00060
M3 - Conference contribution
AN - SCOPUS:85152948388
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 411
EP - 418
BT - Proceedings - 2022 IEEE 28th International Conference on Parallel and Distributed Systems, ICPADS 2022
PB - IEEE Computer Society
T2 - 28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022
Y2 - 10 January 2023 through 12 January 2023
ER -