Evaluating Trace Encoding Methods in Process Mining

Sylvio Barbon Junior, Paolo Ceravolo, Ernesto Damiani, Gabriel Marques Tavares

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

13 Scopus citations


Encoding methods affect the performance of process mining tasks, but little work in the literature has focused on quantifying their impact. In this paper, we compare 10 different encoding methods from three different families (trace replay and alignment, graph embeddings, and word embeddings) using measures to evaluate the overlaps in the feature space, the accuracy obtained, and the computational resources (time) consumed in a classification task. Across hundreds of event logs representing four variations of five scenarios and five anomalies, it was possible to identify the edge2vec method as the most accurate and effective in reducing class overlapping in the feature space.
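As a rough illustration of the evaluation setup the abstract describes (encode each trace as a feature vector, train a classifier, then record accuracy and elapsed time), the following minimal sketch may help. It is not one of the ten methods compared in the paper: the activity-count encoding, the nearest-centroid classifier, the function names, and the toy traces and labels are all assumptions made for this example.

```python
import time
from collections import Counter


def encode_counts(trace, vocabulary):
    """Encode a trace as a vector of activity frequencies (hypothetical baseline encoding)."""
    counts = Counter(trace)
    return [counts[activity] for activity in vocabulary]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def evaluate(train, test, vocabulary):
    """Return (accuracy, elapsed seconds) for a nearest-centroid classifier."""
    start = time.perf_counter()

    # Group the encoded training traces by class label.
    by_label = {}
    for trace, label in train:
        by_label.setdefault(label, []).append(encode_counts(trace, vocabulary))
    centroids = {label: centroid(vectors) for label, vectors in by_label.items()}

    # Assign each test trace to the class with the closest centroid.
    correct = 0
    for trace, label in test:
        x = encode_counts(trace, vocabulary)
        predicted = min(centroids, key=lambda c: squared_distance(x, centroids[c]))
        correct += predicted == label

    elapsed = time.perf_counter() - start
    return correct / len(test), elapsed


# Toy event log: "normal" traces run all four activities, "skip" traces omit one.
vocabulary = ["a", "b", "c", "d"]
train = [
    (["a", "b", "c", "d"], "normal"),
    (["a", "c", "b", "d"], "normal"),
    (["a", "b", "d"], "skip"),
    (["a", "c", "d"], "skip"),
]
test = [
    (["a", "b", "c", "d"], "normal"),
    (["a", "b", "d"], "skip"),
]

accuracy, seconds = evaluate(train, test, vocabulary)
print(f"accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

Swapping encode_counts for another encoder while keeping the classifier and data fixed is one way to compare encodings on equal terms; the overlap measures mentioned in the abstract are not covered by this sketch.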

Original language: British English
Title of host publication: From Data to Models and Back - 9th International Symposium, DataMod 2020, Revised Selected Papers
Editors: Juliana Bowles, Giovanna Broccia, Mirco Nanni
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 16
ISBN (Print): 9783030706494
State: Published - 2021
Event: 9th International Symposium on From Data Models and Back, DataMod 2020 - Virtual, Online
Duration: 20 Oct 2020 to 20 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12611 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 9th International Symposium on From Data Models and Back, DataMod 2020
City: Virtual, Online


  • Classification
  • Graph embeddings
  • Process Mining
  • Trace encoding
  • Word embeddings

