TY - JOUR
T1 - Learning Semantics-Guided Representations for Scoring Figure Skating
AU - Du, Zexing
AU - He, Di
AU - Wang, Xue
AU - Wang, Qing
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - This paper explores semantic-aware representations for scoring figure skating videos. Most existing approaches to sports video analysis only focus on reasoning action scores based on visual input, limiting their ability to depict high-level semantic representations. Here, we propose a teacher-student-based network with an attention mechanism to realize an adaptive knowledge transfer from the semantic domain to the visual domain, which is termed semantics-guided network (SGN). Specifically, we use a set of learnable atomic queries in the student branch to mimic the semantic-aware distribution in the teacher branch, which is represented by the visual and semantic inputs. In addition, we propose three auxiliary losses to align features in different domains. With aligned feature representations, the adapted teacher is capable of transferring the semantic knowledge to the student. To verify the effectiveness of our method, we collect a new dataset OlympicFS for scoring figure skating. Besides action scores, OlympicFS also provides professional comments on actions for learning semantic representations. By evaluating four challenging datasets, our method achieves state-of-the-art performance.
AB - This paper explores semantic-aware representations for scoring figure skating videos. Most existing approaches to sports video analysis only focus on reasoning action scores based on visual input, limiting their ability to depict high-level semantic representations. Here, we propose a teacher-student-based network with an attention mechanism to realize an adaptive knowledge transfer from the semantic domain to the visual domain, which is termed semantics-guided network (SGN). Specifically, we use a set of learnable atomic queries in the student branch to mimic the semantic-aware distribution in the teacher branch, which is represented by the visual and semantic inputs. In addition, we propose three auxiliary losses to align features in different domains. With aligned feature representations, the adapted teacher is capable of transferring the semantic knowledge to the student. To verify the effectiveness of our method, we collect a new dataset OlympicFS for scoring figure skating. Besides action scores, OlympicFS also provides professional comments on actions for learning semantic representations. By evaluating four challenging datasets, our method achieves state-of-the-art performance.
KW - Figureskatingvideos
KW - action quality assessment
KW - multimodality representation learning
KW - sportsvideoanalysis
KW - teacher-student network
UR - http://www.scopus.com/inward/record.url?scp=85181805246&partnerID=8YFLogxK
U2 - 10.1109/TMM.2023.3328180
DO - 10.1109/TMM.2023.3328180
M3 - 文章
AN - SCOPUS:85181805246
SN - 1520-9210
VL - 26
SP - 4987
EP - 4997
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -