End-to-end sound field reproduction based on deep learning

Xi Hong; Bokai Du; Shuang Yang; Menghui Lei; Xiangyang Zeng

doi:10.1121/10.0019575

School of Marine Science and Technology

Research output: Contribution to journal › Article › Peer-reviewed

8 citations (Scopus)

Abstract

Sound field reproduction, which attempts to create a virtual acoustic environment, is a fundamental technology in the achievement of virtual reality. In sound field reproduction, the driving signals of the loudspeakers are calculated by considering the signals collected by the microphones and working environment of the reproduction system. In this paper, an end-to-end reproduction method based on deep learning is proposed. The inputs and outputs of this system are the sound-pressure signals recorded by microphones and the driving signals of loudspeakers, respectively. A convolutional autoencoder network with skip connections in the frequency domain is used. Furthermore, sparse layers are applied to capture the sparse features of the sound field. Simulation results show that the reproduction errors of the proposed method are lower than those generated by the conventional pressure matching and least absolute shrinkage and selection operator methods, especially at high frequencies. Experiments were performed under conditions of single and multiple primary sources. The results in both cases demonstrate that the proposed method achieves better high-frequency performance than the conventional methods.
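The abstract names pressure matching as one of the conventional baselines. The following is a minimal sketch of that baseline only, not of the paper's deep-learning method. It assumes free-field point-source loudspeakers, a single frequency, and Tikhonov-regularized least squares; the array geometry, the regularization weight, and all function names are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def free_field_green(src, rcv, k):
    """Free-field Green's function exp(-jkr)/(4*pi*r) from each source to each receiver.

    src: (L, 3) source positions, rcv: (M, 3) receiver positions, k: wavenumber.
    Returns an (M, L) complex matrix. The exp(-jkr) sign is an assumed convention.
    """
    r = np.linalg.norm(rcv[:, None, :] - src[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)


def pressure_matching(G, p_des, reg=1e-3):
    """Loudspeaker driving signals d minimizing ||G d - p_des||^2 + reg * ||d||^2."""
    GH = G.conj().T
    A = GH @ G + reg * np.eye(G.shape[1])
    return np.linalg.solve(A, GH @ p_des)


def normalized_error_db(G, d, p_des):
    """Normalized squared reproduction error at the microphones, in dB."""
    err = np.linalg.norm(G @ d - p_des) ** 2 / np.linalg.norm(p_des) ** 2
    return 10.0 * np.log10(err)


# Hypothetical setup: 16 loudspeakers on a 1.5 m circle, 12 microphones on a
# 0.3 m circle, and one primary point source at 3 m, evaluated at 500 Hz.
c, f = 343.0, 500.0
k = 2.0 * np.pi * f / c
ang_l = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
ang_m = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
spk = 1.5 * np.stack([np.cos(ang_l), np.sin(ang_l), np.zeros(16)], axis=1)
mic = 0.3 * np.stack([np.cos(ang_m), np.sin(ang_m), np.zeros(12)], axis=1)
primary = np.array([[3.0, 0.0, 0.0]])

G = free_field_green(spk, mic, k)              # (12, 16) transfer matrix
p_des = free_field_green(primary, mic, k)[:, 0]  # desired pressures at the mics
d = pressure_matching(G, p_des)
print("reproduction error (dB):", normalized_error_db(G, d, p_des))
```

A LASSO baseline would replace the squared-norm penalty on `d` with an L1 penalty, which needs an iterative solver rather than the closed form above.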

Original language	English
Pages (from-to)	3055-3064
Number of pages	10
Journal	Journal of the Acoustical Society of America
Volume	153
Issue number	5
DOI	https://doi.org/10.1121/10.0019575
Publication status	Published - 1 May 2023

Cite this

@article{8c119ca66f0d4ef3b7c2261439b9f262,

title = "End-to-end sound field reproduction based on deep learning",

author = "Xi Hong and Bokai Du and Shuang Yang and Menghui Lei and Xiangyang Zeng",

note = "Publisher Copyright: {\textcopyright} 2023 Acoustical Society of America.",

year = "2023",

month = may,

day = "1",

doi = "10.1121/10.0019575",

language = "English",

volume = "153",

pages = "3055--3064",

journal = "Journal of the Acoustical Society of America",

issn = "0001-4966",

publisher = "Acoustical Society of America",

number = "5",

}

TY - JOUR

T1 - End-to-end sound field reproduction based on deep learning

AU - Hong, Xi

AU - Du, Bokai

AU - Yang, Shuang

AU - Lei, Menghui

AU - Zeng, Xiangyang

PY - 2023/5/1

Y1 - 2023/5/1

AB - Sound field reproduction, which attempts to create a virtual acoustic environment, is a fundamental technology in the achievement of virtual reality. In sound field reproduction, the driving signals of the loudspeakers are calculated by considering the signals collected by the microphones and working environment of the reproduction system. In this paper, an end-to-end reproduction method based on deep learning is proposed. The inputs and outputs of this system are the sound-pressure signals recorded by microphones and the driving signals of loudspeakers, respectively. A convolutional autoencoder network with skip connections in the frequency domain is used. Furthermore, sparse layers are applied to capture the sparse features of the sound field. Simulation results show that the reproduction errors of the proposed method are lower than those generated by the conventional pressure matching and least absolute shrinkage and selection operator methods, especially at high frequencies. Experiments were performed under conditions of single and multiple primary sources. The results in both cases demonstrate that the proposed method achieves better high-frequency performance than the conventional methods.

UR - http://www.scopus.com/inward/record.url?scp=85160010557&partnerID=8YFLogxK

U2 - 10.1121/10.0019575

DO - 10.1121/10.0019575

M3 - Article

C2 - 37219493

AN - SCOPUS:85160010557

SN - 0001-4966

VL - 153

SP - 3055

EP - 3064

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

IS - 5

ER -
