Learning optimal features for music transcription

Huaiping Ming; Dongyan Huang; Lei Xie; Haizhou Li

doi:10.1109/ChinaSIP.2014.6889211

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

This paper aims to design time-frequency representation (TFR) functions for automatic music transcription. It is desirable that the decomposition of those TFR functions be suitable for notes whose pitch and spectral envelope both vary over time. The Harmonic Adaptive Latent Component Analysis (HALCA) model adopted in this paper allows both kinds of variation to be considered simultaneously. We evaluate the influence of three TFR functions, namely the IIR and FIR filter bank semigrams (FBSG) and the constant-Q transform (CQT) semigram, on the automatic music transcription task, using a database of popular and polyphonic classical music. The experimental results show that the filter-bank-based representations are suitable for multiple-instrument recordings, while the CQT-based representation provides very accurate transcription for solo-instrument recordings.

Original language	English
Title of host publication	2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	105-109
Number of pages	5
ISBN (Electronic)	9781479954032
DOI	https://doi.org/10.1109/ChinaSIP.2014.6889211
Publication status	Published - 3 Sep 2014
Event	2nd IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Xi'an, China. Duration: 9 Jul 2014 → 13 Jul 2014

Publication series

Name	2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings

Conference

Conference	2nd IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014
Country/Territory	China
City	Xi'an
Period	9/07/14 → 13/07/14

Access to Document

10.1109/ChinaSIP.2014.6889211


Cite this

Ming, H., Huang, D., Xie, L., & Li, H. (2014). Learning optimal features for music transcription. In 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings (pp. 105-109). Article 6889211 (2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ChinaSIP.2014.6889211

Ming, Huaiping ; Huang, Dongyan ; Xie, Lei et al. / Learning optimal features for music transcription. 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 105-109 (2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings).

@inproceedings{d5bec719321e4f51884636a77fabd449,

title = "Learning optimal features for music transcription",

abstract = "This paper aims to design time-frequency representation (TFR) functions for automatic music transcription. It is desirable that the decomposition of those TFR functions are suitable for notes having variation of both pitch and spectral envelop over time. The Harmonic Adaptive Latent Component Analysis (HALCA) model adopted in this paper allows considering those two kinds of variations simultaneously. We evaluate the influence of three TFR functions including IIR, FIR filter bank semigram (FBSG) and constant-Q transform semigram in automatic music transcription task, on a database of popular and polyphonic classic music. The experiment results show that the filter bank based representations are suitable for multiple-instrument recordings and a CQT-based representation turns out to provide very accurate transcription for solo-instrument recordings.",

keywords = "constant-Q transform, filter bank, logarithmic compression, music transcription, Semigram features",

author = "Huaiping Ming and Dongyan Huang and Lei Xie and Haizhou Li",

note = "Publisher Copyright: {\textcopyright} 2014 IEEE.; 2nd IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 ; Conference date: 09-07-2014 Through 13-07-2014",

year = "2014",

month = sep,

day = "3",

doi = "10.1109/ChinaSIP.2014.6889211",

language = "English",

series = "2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "105--109",

booktitle = "2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings",

}

Ming, H, Huang, D, Xie, L & Li, H 2014, Learning optimal features for music transcription. in 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings, 6889211, 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings, Institute of Electrical and Electronics Engineers Inc., pp. 105-109, 2nd IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014, Xi'an, China, 9/07/14. https://doi.org/10.1109/ChinaSIP.2014.6889211

Learning optimal features for music transcription. / Ming, Huaiping; Huang, Dongyan; Xie, Lei et al.
2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 105-109 6889211 (2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings).

TY - GEN

T1 - Learning optimal features for music transcription

AU - Ming, Huaiping

AU - Huang, Dongyan

AU - Xie, Lei

AU - Li, Haizhou

PY - 2014/9/3

Y1 - 2014/9/3

N2 - This paper aims to design time-frequency representation (TFR) functions for automatic music transcription. It is desirable that the decomposition of those TFR functions are suitable for notes having variation of both pitch and spectral envelop over time. The Harmonic Adaptive Latent Component Analysis (HALCA) model adopted in this paper allows considering those two kinds of variations simultaneously. We evaluate the influence of three TFR functions including IIR, FIR filter bank semigram (FBSG) and constant-Q transform semigram in automatic music transcription task, on a database of popular and polyphonic classic music. The experiment results show that the filter bank based representations are suitable for multiple-instrument recordings and a CQT-based representation turns out to provide very accurate transcription for solo-instrument recordings.

AB - This paper aims to design time-frequency representation (TFR) functions for automatic music transcription. It is desirable that the decomposition of those TFR functions are suitable for notes having variation of both pitch and spectral envelop over time. The Harmonic Adaptive Latent Component Analysis (HALCA) model adopted in this paper allows considering those two kinds of variations simultaneously. We evaluate the influence of three TFR functions including IIR, FIR filter bank semigram (FBSG) and constant-Q transform semigram in automatic music transcription task, on a database of popular and polyphonic classic music. The experiment results show that the filter bank based representations are suitable for multiple-instrument recordings and a CQT-based representation turns out to provide very accurate transcription for solo-instrument recordings.

KW - constant-Q transform

KW - filter bank

KW - logarithmic compression

KW - music transcription

KW - Semigram features

UR - http://www.scopus.com/inward/record.url?scp=84929403772&partnerID=8YFLogxK

U2 - 10.1109/ChinaSIP.2014.6889211

DO - 10.1109/ChinaSIP.2014.6889211

M3 - Conference contribution

AN - SCOPUS:84929403772

T3 - 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings

SP - 105

EP - 109

BT - 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2nd IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014

Y2 - 9 July 2014 through 13 July 2014

ER -

Ming H, Huang D, Xie L, Li H. Learning optimal features for music transcription. In 2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2014. p. 105-109. 6889211. (2014 IEEE China Summit and International Conference on Signal and Information Processing, IEEE ChinaSIP 2014 - Proceedings). doi: 10.1109/ChinaSIP.2014.6889211
