Comparison of generality based algorithm variants for automatic taxonomy generation

Andreas Henschel, Wei Lee Woon, Thomas Wächter, Stuart Madnick

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

We compare a family of algorithms for the automatic generation of taxonomies, obtained by adapting the Heymann algorithm in various ways. The core algorithm determines the generality of terms and iteratively inserts them into a growing taxonomy. Variants of the algorithm are created by altering how, and how often, the generality of terms is calculated. We analyse the performance and the complexity of the variants, combined with a systematic threshold evaluation, on seven manually created benchmark sets. We find that betweenness centrality calculated on unweighted similarity graphs often performs best, but it requires threshold fine-tuning and is computationally more expensive than closeness centrality. Finally, we show how an entropy-based filter can lead to more precise taxonomies.
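The core loop described in the abstract (score each term's generality, then insert terms one by one into a growing taxonomy) can be sketched as follows. This is only an illustrative sketch in the spirit of a Heymann-style algorithm, not the authors' implementation: the function names, the two thresholds, the tie-breaking rule, and the toy similarity values are all assumptions made for the example. It uses closeness centrality on an unweighted similarity graph as the generality measure and computes generality once, up front; the paper's variants also consider betweenness centrality and different recalculation frequencies.

```python
from collections import deque

def similarity_graph(terms, sim, threshold):
    """Unweighted graph: link two terms when their similarity exceeds the threshold."""
    graph = {t: set() for t in terms}
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if sim(a, b) > threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph

def closeness(graph, source):
    """Closeness centrality via BFS: reachable nodes divided by the sum of distances."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def build_taxonomy(terms, sim, graph_threshold, attach_threshold, root="ROOT"):
    """Return a child -> parent mapping (hypothetical helper, not from the paper)."""
    graph = similarity_graph(terms, sim, graph_threshold)
    # Most general terms first; ties broken alphabetically (an assumption).
    ordered = sorted(terms, key=lambda t: (-closeness(graph, t), t))
    parent = {}
    for term in ordered:
        best, best_sim = root, attach_threshold
        for placed in parent:
            s = sim(term, placed)
            if s > best_sim:  # attach under the most similar node already placed
                best, best_sim = placed, s
        parent[term] = best  # falls back to the root if nothing is similar enough
    return parent

# Toy, made-up similarity values purely for illustration.
TERMS = ["energy", "solar", "wind", "photovoltaic"]
PAIRS = {
    frozenset(("energy", "solar")): 0.6,
    frozenset(("energy", "wind")): 0.6,
    frozenset(("energy", "photovoltaic")): 0.2,
    frozenset(("solar", "photovoltaic")): 0.8,
    frozenset(("solar", "wind")): 0.2,
    frozenset(("wind", "photovoltaic")): 0.1,
}

def toy_sim(a, b):
    return PAIRS.get(frozenset((a, b)), 0.0)

taxonomy = build_taxonomy(TERMS, toy_sim, graph_threshold=0.25, attach_threshold=0.25)
```

On this toy input, "energy" is placed under the root, "solar" and "wind" under "energy", and "photovoltaic" under "solar". Both thresholds strongly affect the result, which is consistent with the abstract's remark that threshold fine-tuning is needed.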

Original language: British English
Title of host publication: 2009 International Conference on Innovations in Information Technology, IIT '09
Pages: 160-164
Number of pages: 5
State: Published - 2009
Event: 2009 International Conference on Innovations in Information Technology, IIT '09 - Al-Ain, United Arab Emirates
Duration: 15 Dec 2009 → 17 Dec 2009

Publication series

Name: 2009 International Conference on Innovations in Information Technology, IIT '09

Conference

Conference: 2009 International Conference on Innovations in Information Technology, IIT '09
Country/Territory: United Arab Emirates
City: Al-Ain
Period: 15/12/09 → 17/12/09
