Knowledge-enhanced Spherical Representation Learning for Text Classification

Hafsa Ennajari, Nizar Bouguila, Jamal Bentahar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

We introduce Knowledge-enhanced Spherical Representation Learning (K-SRL), a generative probabilistic model of text documents that combines word embeddings and knowledge graph embeddings to effectively encode the semantic information of text and the related background knowledge into a low-dimensional representation. More specifically, the proposed model represents each text document as a combination of both words and entities linked to an external large knowledge graph and models them as points on the unit hypersphere using the von Mises-Fisher distribution. Furthermore, we develop an efficient variational Bayesian inference algorithm to learn unsupervised text embeddings in the spherical space. Experimental results on multiple benchmark datasets demonstrate that our model outperforms existing probabilistic models on common text classification tasks, including text categorization and sentiment analysis.
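As a rough illustration of the spherical modelling idea in the abstract, the sketch below projects toy vectors onto the unit sphere and evaluates a von Mises-Fisher log-density. It is not the authors' K-SRL implementation: the function names and the toy 3-dimensional "embeddings" are assumptions made for this example, and the closed-form normalizer used here holds only in three dimensions (the general case involves a modified Bessel function). The variational inference algorithm from the paper is not reproduced.

```python
import math

def normalize(v):
    """Project a vector onto the unit hypersphere by L2 normalization."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]

def mean_direction(points):
    """Maximum-likelihood estimate of the vMF mean direction:
    the normalized sum of the unit vectors."""
    dim = len(points[0])
    total = [sum(p[i] for p in points) for i in range(dim)]
    return normalize(total)

def vmf_log_density_3d(x, mu, kappa):
    """Log-density of a von Mises-Fisher distribution on the unit sphere in R^3.

    Uses the 3-D normalizer C_3(kappa) = kappa / (4 * pi * sinh(kappa)),
    written in a form that stays stable for large kappa. Requires kappa > 0.
    """
    if kappa <= 0.0:
        raise ValueError("kappa must be positive")
    dot = sum(a * b for a, b in zip(x, mu))
    # log sinh(k) = k + log(1 - exp(-2k)) - log 2
    log_norm = (math.log(kappa) - math.log(2.0 * math.pi)
                - kappa - math.log1p(-math.exp(-2.0 * kappa)))
    return log_norm + kappa * dot

# Toy stand-ins for word and entity embeddings of one document (hypothetical values).
word_vecs = [normalize(v) for v in ([0.9, 0.1, 0.2], [0.8, 0.3, 0.1])]
entity_vecs = [normalize(v) for v in ([0.7, 0.2, 0.4],)]

mu = mean_direction(word_vecs + entity_vecs)
scores = [vmf_log_density_3d(v, mu, kappa=10.0) for v in word_vecs + entity_vecs]
```

A larger concentration parameter `kappa` makes the density peak more sharply around `mu`, so vectors pointing in a similar direction score higher than dissimilar ones.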

Original language: British English
Title of host publication: Proceedings of the 2022 SIAM International Conference on Data Mining, SDM 2022
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 639-647
Number of pages: 9
ISBN (Electronic): 9781611977172
State: Published - 2022
Event: 2022 SIAM International Conference on Data Mining, SDM 2022 - Virtual, Online
Duration: 28 Apr 2022 to 30 Apr 2022

Publication series

Name: Proceedings of the 2022 SIAM International Conference on Data Mining, SDM 2022

Conference

Conference: 2022 SIAM International Conference on Data Mining, SDM 2022
City: Virtual, Online
Period: 28/04/22 to 30/04/22

Keywords

  • Bayesian inference
  • Knowledge graph embedding
  • Representation learning
  • Von Mises-Fisher
