TY - GEN
T1 - Empirical investigation of consensus clustering for large ECG data sets
AU - Kelarev, Andrei
AU - Stranieri, Andrew
AU - Yearwood, John
AU - Jelinek, Herbert
PY - 2012
Y1 - 2012
N2 - This article investigates a novel machine learning approach applying consensus clustering in conjunction with classification for the data mining of very large and high-dimensional ECG data sets. To obtain robust and stable clusterings, consensus functions can be applied to clustering ensembles that combine a multitude of independent initial clusterings. Direct applications of consensus functions to high-dimensional ECG data sets remain computationally expensive and impracticable. We introduce a multistage scheme that includes various procedures for dimensionality reduction and consensus clustering of randomized samples, followed by the use of a fast supervised classification algorithm. Applying the Hybrid Bipartite Graph Formulation combined with rank ordering and SMO, we obtained an area under the receiver operating characteristic curve of 0.987. The performance of the classification algorithm at the final stage is crucial for the effectiveness of this technique. This performance can be regarded as an indication of the reliability, quality and stability of the combined consensus clustering.
AB - This article investigates a novel machine learning approach applying consensus clustering in conjunction with classification for the data mining of very large and high-dimensional ECG data sets. To obtain robust and stable clusterings, consensus functions can be applied to clustering ensembles that combine a multitude of independent initial clusterings. Direct applications of consensus functions to high-dimensional ECG data sets remain computationally expensive and impracticable. We introduce a multistage scheme that includes various procedures for dimensionality reduction and consensus clustering of randomized samples, followed by the use of a fast supervised classification algorithm. Applying the Hybrid Bipartite Graph Formulation combined with rank ordering and SMO, we obtained an area under the receiver operating characteristic curve of 0.987. The performance of the classification algorithm at the final stage is crucial for the effectiveness of this technique. This performance can be regarded as an indication of the reliability, quality and stability of the combined consensus clustering.
UR - http://www.scopus.com/inward/record.url?scp=84867289167&partnerID=8YFLogxK
U2 - 10.1109/CBMS.2012.6266364
DO - 10.1109/CBMS.2012.6266364
M3 - Conference contribution
AN - SCOPUS:84867289167
SN - 9781467320511
T3 - Proceedings - IEEE Symposium on Computer-Based Medical Systems
BT - Proceedings of the 25th IEEE International Symposium on Computer-Based Medical Systems, CBMS 2012
T2 - 25th IEEE International Symposium on Computer-Based Medical Systems, CBMS 2012
Y2 - 20 June 2012 through 22 June 2012
ER -