TY - JOUR
T1 - Multigroup classification of audio signals using time-frequency parameters
AU - Umapathy, Karthikeyan
AU - Krishnan, Sridhar
AU - Jimaa, Shihab
N1 - Funding Information: The authors thank MICRONET and NSERC for funding this project. They would also like to thank the LastWave software group for providing the software for signal decomposition.
PY - 2005
Y1 - 2005/4
N2 - Ongoing advancements in multimedia technologies drive the need for efficient classification of audio signals to make content-based retrieval from huge databases easier and more accurate. The challenge of this task lies in accurately extracting signal characteristics so as to derive a strong discriminatory feature suitable for classification. In this paper, a time-frequency (TF) approach for audio classification is proposed. Audio signals are nonstationary in nature, and a TF approach is the best way to analyze them. The audio signals were decomposed using an adaptive TF decomposition algorithm, and the signal decomposition parameter based on octave (scaling) was used to generate a set of 42 features over three frequency bands within the auditory range. These features were analyzed using linear discriminant functions and classified into six music groups (rock, classical, country, jazz, folk, and pop). An overall classification accuracy as high as 97.6% was achieved by linear discriminant analysis of 170 audio signals.
AB - Ongoing advancements in multimedia technologies drive the need for efficient classification of audio signals to make content-based retrieval from huge databases easier and more accurate. The challenge of this task lies in accurately extracting signal characteristics so as to derive a strong discriminatory feature suitable for classification. In this paper, a time-frequency (TF) approach for audio classification is proposed. Audio signals are nonstationary in nature, and a TF approach is the best way to analyze them. The audio signals were decomposed using an adaptive TF decomposition algorithm, and the signal decomposition parameter based on octave (scaling) was used to generate a set of 42 features over three frequency bands within the auditory range. These features were analyzed using linear discriminant functions and classified into six music groups (rock, classical, country, jazz, folk, and pop). An overall classification accuracy as high as 97.6% was achieved by linear discriminant analysis of 170 audio signals.
KW - Content-based retrieval
KW - Linear discriminant analysis
KW - Matching pursuit
KW - Music classification
KW - Time-frequency
UR - http://www.scopus.com/inward/record.url?scp=16244420091&partnerID=8YFLogxK
U2 - 10.1109/TMM.2005.843363
DO - 10.1109/TMM.2005.843363
M3 - Article
AN - SCOPUS:16244420091
SN - 1520-9210
VL - 7
SP - 308
EP - 315
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
IS - 2
ER -