TY - GEN
T1 - Selecting Optimal Trace Clustering Pipelines with Meta-learning
AU - Tavares, Gabriel Marques
AU - Barbon Junior, Sylvio
AU - Damiani, Ernesto
AU - Ceravolo, Paolo
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Trace clustering has been extensively used to discover aspects of the data from event logs. Process Mining techniques guide the identification of sub-logs by grouping traces with similar behaviors, producing more understandable models and improving conformance indicators. Nevertheless, little attention has been paid to the relationship among event log properties, the pipeline of encoding and clustering algorithms, and the quality of the obtained outcome. The present study contributes to the understanding of the aforementioned relationships and provides an automatic selection of a proper combination of algorithms for clustering a given event log. We propose a Meta-Learning framework to recommend the most suitable pipeline for trace clustering, which encompasses the encoding method, clustering algorithm, and its hyperparameters. Our experiments were conducted using a thousand event logs, four encoding techniques, and three clustering methods. Results indicate that our framework sheds light on the trace clustering problem and can assist users in choosing the best pipeline considering their environment.
KW - Meta-learning
KW - Pipeline design
KW - Process mining
KW - Recommendation
KW - Trace clustering
UR - http://www.scopus.com/inward/record.url?scp=85144827048&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-21686-2_11
DO - 10.1007/978-3-031-21686-2_11
M3 - Conference contribution
AN - SCOPUS:85144827048
SN - 9783031216855
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 150
EP - 164
BT - Intelligent Systems - 11th Brazilian Conference, BRACIS 2022, Proceedings
A2 - Xavier-Junior, João Carlos
A2 - Rios, Ricardo Araújo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 11th Brazilian Conference on Intelligent Systems, BRACIS 2022
Y2 - 28 November 2022 through 1 December 2022
ER -