A block-based blind source separation approach with equilateral triangular microphone array

Jian Zhang; Zhonghua Fu; Lei Xie

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to conference › Paper › peer-review

Abstract

In this paper we describe a method for separating multiple speech sources using an equilateral triangular microphone array. First, the azimuths of the horizontal plane are divided into many units, and the spatial features of some directions observed by the microphone array are modeled precisely. Second, the input mixture signals are segmented into blocks, and the number of active speakers and their directions are estimated in each block. Third, for each speaker, the pre-trained model with the nearest azimuth is adapted to obtain a precise model, which is then used for time-frequency binary mask estimation. Finally, we separate every source that appears in each block and concatenate the sounds from the same unit to reproduce the whole stream. The experiments are conducted in a real meeting room. The results show that our method can separate multiple speech sources correctly with low distortion and is competitive with fully non-blind separation.
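The block-wise pipeline summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the 10-degree unit width, and the use of generic per-bin scores in place of the adapted spatial model are all assumptions made for illustration. The sketch covers three pieces only: mapping an estimated direction to the nearest azimuth unit, turning per-source scores into time-frequency binary masks, and concatenating per-block outputs that share a unit into one stream.

```python
import numpy as np


def nearest_unit(azimuth_deg, unit_width_deg=10.0):
    """Index of the azimuth unit whose centre is nearest to azimuth_deg.

    Unit centres are assumed to sit at 0, unit_width_deg, 2*unit_width_deg, ...
    (the unit width is a hypothetical choice, not taken from the paper).
    """
    n_units = int(round(360.0 / unit_width_deg))
    return int(round((azimuth_deg % 360.0) / unit_width_deg)) % n_units


def binary_masks(scores):
    """One-hot time-frequency masks from per-source scores.

    scores has shape (n_sources, n_freq, n_frames); each bin is assigned to
    the source with the highest score. The scores stand in for whatever the
    adapted spatial model would produce.
    """
    scores = np.asarray(scores)
    winner = np.argmax(scores, axis=0)
    source_ids = np.arange(scores.shape[0])[:, None, None]
    return (source_ids == winner[None]).astype(float)


def stitch_streams(blocks, block_len):
    """Concatenate per-block separated signals that share an azimuth unit.

    blocks is a list of dicts {unit_index: 1-D array of length block_len}.
    Blocks in which a unit is inactive are left as silence (zeros).
    """
    units = sorted({u for b in blocks for u in b})
    streams = {u: np.zeros(len(blocks) * block_len) for u in units}
    for i, block in enumerate(blocks):
        for u, sig in block.items():
            streams[u][i * block_len:(i + 1) * block_len] = sig
    return streams
```

In use, each block would yield a set of estimated directions, each direction would be mapped with `nearest_unit`, the masks from `binary_masks` would be applied to the mixture spectrogram, and the resynthesized block signals would be passed to `stitch_streams`. Keying the streams by unit index is one simple way to keep a speaker's output continuous across blocks, under the assumption that the speaker stays within one unit.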

Original language	English
Pages	1126-1130
Number of pages	5
State	Published - 2011
Event	Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 - Xi'an, China
Duration	18 Oct 2011 → 21 Oct 2011

Conference

Conference	Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011
Country/Territory	China
City	Xi'an
Period	18/10/11 → 21/10/11

Keywords

Blind source separation
Directions of arrival estimation
Equilateral triangular microphone array
Time-frequency mask

Cite this

@conference{cc3d066867894f2081a2cc78d31a9b63,
title = "A block-based blind source separation approach with equilateral triangular microphone array",
abstract = "In this paper we describe a method for separating multiple speech sources using an equilateral triangular microphone array. First, the azimuths of the horizontal plane are divided into many units, and the spatial features of some directions observed by the microphone array are modeled precisely. Second, the input mixture signals are segmented into blocks, and the number of active speakers and their directions are estimated in each block. Third, for each speaker, the pre-trained model with the nearest azimuth is adapted to obtain a precise model, which is then used for time-frequency binary mask estimation. Finally, we separate every source that appears in each block and concatenate the sounds from the same unit to reproduce the whole stream. The experiments are conducted in a real meeting room. The results show that our method can separate multiple speech sources correctly with low distortion and is competitive with fully non-blind separation.",
keywords = "Blind source separation, Directions of arrival estimation, Equilateral triangular microphone array, Time-frequency mask",
author = "Jian Zhang and Zhonghua Fu and Lei Xie",
year = "2011",
language = "English",
pages = "1126--1130",
note = "Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 ; Conference date: 18-10-2011 Through 21-10-2011",
}

TY - CONF
T1 - A block-based blind source separation approach with equilateral triangular microphone array
AU - Zhang, Jian
AU - Fu, Zhonghua
AU - Xie, Lei
PY - 2011
Y1 - 2011
AB - In this paper we describe a method for separating multiple speech sources using an equilateral triangular microphone array. First, the azimuths of the horizontal plane are divided into many units, and the spatial features of some directions observed by the microphone array are modeled precisely. Second, the input mixture signals are segmented into blocks, and the number of active speakers and their directions are estimated in each block. Third, for each speaker, the pre-trained model with the nearest azimuth is adapted to obtain a precise model, which is then used for time-frequency binary mask estimation. Finally, we separate every source that appears in each block and concatenate the sounds from the same unit to reproduce the whole stream. The experiments are conducted in a real meeting room. The results show that our method can separate multiple speech sources correctly with low distortion and is competitive with fully non-blind separation.
KW - Blind source separation
KW - Directions of arrival estimation
KW - Equilateral triangular microphone array
KW - Time-frequency mask
UR - http://www.scopus.com/inward/record.url?scp=84866856799&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:84866856799
SP - 1126
EP - 1130
T2 - Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011
Y2 - 18 October 2011 through 21 October 2011
ER -
