TY - GEN
T1 - Embedded Spherical Topic Models for Supervised Learning
AU - Ennajari, Hafsa
AU - Bouguila, Nizar
AU - Bentahar, Jamal
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Probabilistic topic models are powerful techniques for analyzing and understanding large collections of text documents to learn meaningful patterns of words. Their supervised extensions also capture topics conditioned on the response metadata associated with each document, such as user ratings. However, inferring such information from data often comes at the detriment of topic quality, leading to uninterpretable and meaningless topics. In this paper, we propose a novel Supervised-Embedded Spherical Topic Model (S-ESTM) that balances two goals: interpretable and coherent topics that explain the data, and accurate prediction of the associated response values. Our model combines word embeddings and knowledge graph embeddings to effectively encode the semantic information of text and the related background knowledge to guide the inference of supervised topics. In S-ESTM, document constituents and topics are drawn as points on spherical manifolds using the von Mises-Fisher distribution. Efficient variational inference methods for posterior approximation and latent parameter estimation are derived, and various empirical studies on real-world datasets are provided. Our experiments demonstrate that the model can discover discriminative and coherent topical patterns associated with regression tasks while achieving improved prediction quality.
AB - Probabilistic topic models are powerful techniques for analyzing and understanding large collections of text documents to learn meaningful patterns of words. Their supervised extensions also capture topics conditioned on the response metadata associated with each document, such as user ratings. However, inferring such information from data often comes at the detriment of topic quality, leading to uninterpretable and meaningless topics. In this paper, we propose a novel Supervised-Embedded Spherical Topic Model (S-ESTM) that balances two goals: interpretable and coherent topics that explain the data, and accurate prediction of the associated response values. Our model combines word embeddings and knowledge graph embeddings to effectively encode the semantic information of text and the related background knowledge to guide the inference of supervised topics. In S-ESTM, document constituents and topics are drawn as points on spherical manifolds using the von Mises-Fisher distribution. Efficient variational inference methods for posterior approximation and latent parameter estimation are derived, and various empirical studies on real-world datasets are provided. Our experiments demonstrate that the model can discover discriminative and coherent topical patterns associated with regression tasks while achieving improved prediction quality.
UR - http://www.scopus.com/inward/record.url?scp=85143628345&partnerID=8YFLogxK
U2 - 10.1109/ICPR56361.2022.9956503
DO - 10.1109/ICPR56361.2022.9956503
M3 - Conference contribution
AN - SCOPUS:85143628345
T3 - Proceedings - International Conference on Pattern Recognition
SP - 1650
EP - 1656
BT - 2022 26th International Conference on Pattern Recognition, ICPR 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 26th International Conference on Pattern Recognition, ICPR 2022
Y2 - 21 August 2022 through 25 August 2022
ER -