Correlated Topic Modeling for Short Texts in Spherical Embedding Spaces

Research output: Contribution to journal › Article › peer-review

Abstract

With the prevalence of short texts in forms such as news headlines, tweets, and reviews, short text analysis has attracted significant interest in recent years. However, modeling short texts remains challenging because of their sparse and noisy nature. In this paper, we propose a new Spherical Correlated Topic Model (SCTM), which takes into account the correlation between topics. Our model integrates word embeddings and knowledge graph embeddings to better capture the semantic relationships among short texts. We adopt the von Mises-Fisher distribution to model the high-dimensional word and entity embeddings on a hypersphere, which better preserves the angular relationships between topic vectors. The knowledge graph embeddings further enrich the semantic meaning of short texts. Experimental results on several datasets demonstrate that the proposed SCTM outperforms existing models in terms of both topic coherence and document classification. In addition, our model provides interpretable topics and reveals meaningful correlations among short texts.
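The von Mises-Fisher (vMF) distribution named in the abstract is a density over unit vectors, f(x; mu, kappa) = C_d(kappa) exp(kappa mu^T x), with normalizer C_d(kappa) = kappa^(d/2-1) / ((2 pi)^(d/2) I_(d/2-1)(kappa)). The following is a minimal sketch of that log-density in plain Python. It is not the authors' implementation. The function names (`normalize`, `log_bessel_i`, `vmf_log_density`) and the fixed 200-term series truncation are assumptions made for illustration, and the truncation is only adequate for moderate concentration values.

```python
import math


def normalize(v):
    """Project a vector onto the unit hypersphere (L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_bessel_i(order, x, terms=200):
    """Log of the modified Bessel function of the first kind, I_order(x).

    Uses the power series I_v(x) = sum_m (x/2)^(2m+v) / (m! * Gamma(m+v+1)),
    summed in log space to avoid underflow for large orders.
    Assumption: 200 terms suffice, which holds for moderate x only.
    """
    log_half_x = math.log(x / 2.0)
    logs = [
        (2 * m + order) * log_half_x
        - math.lgamma(m + 1)
        - math.lgamma(m + order + 1)
        for m in range(terms)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(t - peak) for t in logs))


def vmf_log_density(x, mu, kappa):
    """Log-density of a vMF distribution at unit vector x.

    mu is the unit mean direction, kappa > 0 the concentration.
    """
    d = len(x)
    log_norm = (
        (d / 2 - 1) * math.log(kappa)
        - (d / 2) * math.log(2 * math.pi)
        - log_bessel_i(d / 2 - 1, kappa)
    )
    dot = sum(a * b for a, b in zip(x, mu))  # cosine similarity on the sphere
    return log_norm + kappa * dot


# Hypothetical toy vectors standing in for a topic direction and two embeddings.
topic = normalize([1.0, 1.0, 0.0])
aligned = normalize([0.9, 1.1, 0.1])
opposed = normalize([-1.0, -1.0, 0.0])
print(vmf_log_density(aligned, topic, 5.0) > vmf_log_density(opposed, topic, 5.0))
```

Because the density depends on x only through the inner product with mu, an embedding that points in nearly the same direction as the topic vector scores higher than one pointing away, regardless of the original vector lengths. This is the sense in which a spherical model works with angular relationships.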

Original language: British English
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
State: Accepted/In press - 2025

Keywords

  • Knowledge graph embedding
  • short text analysis
  • topic correlation
  • topic modeling
  • von Mises-Fisher
  • word embedding
