Topic modeling: a comprehensive review

P Kherwa, P Bansal - EAI Endorsed transactions on scalable information …, 2019 - eudl.eu
Topic modelling is the new revolution in text mining. It is a statistical technique for revealing
the underlying semantic structure in a large collection of documents. After analysing …

An overview of topic modeling and its current applications in bioinformatics

L Liu, L Tang, W Dong, S Yao, W Zhou - SpringerPlus, 2016 - Springer
Background With the rapid accumulation of biological datasets, machine learning methods
designed to automate data analysis are urgently needed. In recent years, so-called topic …

Latent Dirichlet allocation (LDA) and topic modeling: models, applications, a survey

H Jelodar, Y Wang, C Yuan, X Feng, X Jiang… - Multimedia tools and …, 2019 - Springer
Topic modeling is one of the most powerful techniques in text mining for data mining, latent
data discovery, and finding relationships among data and text documents. Researchers …

A model of text for experimentation in the social sciences

ME Roberts, BM Stewart, EM Airoldi - Journal of the American …, 2016 - Taylor & Francis
Statistical models of text have become increasingly popular in statistics and computer
science as a method of exploring large document collections. Social scientists often want to …

Baselines and bigrams: Simple, good sentiment and topic classification

SI Wang, CD Manning - Proceedings of the 50th Annual Meeting …, 2012 - aclanthology.org
Abstract Variants of Naive Bayes (NB) and Support Vector Machines (SVM) are often used
as baseline methods for text classification, but their performance varies greatly depending …

Topical word embeddings

Y Liu, Z Liu, TS Chua, M Sun - Proceedings of the AAAI Conference on …, 2015 - ojs.aaai.org
Most word embedding models typically represent each word using a single vector, which
makes these models indiscriminative for ubiquitous homonymy and polysemy. In order to …

Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora

D Ramage, D Hall, R Nallapati… - Proceedings of the 2009 …, 2009 - aclanthology.org
A significant portion of the world's text is tagged by readers on social bookmarking websites.
Credit attribution is an inherent problem in these corpora because most pages have multiple …

Improving topic models with latent feature word representations

DQ Nguyen, R Billingsley, L Du… - Transactions of the …, 2015 - direct.mit.edu
Probabilistic topic models are widely used to discover latent topics in document collections,
while latent feature vector representations of words have been used to obtain high …

Short text topic modelling approaches in the context of big data: taxonomy, survey, and analysis

BAH Murshed, S Mallappa, J Abawajy… - Artificial Intelligence …, 2023 - Springer
Social media platforms such as Twitter, Facebook, and Weibo are being increasingly
embraced by individuals, groups, and organizations as a valuable source of information …

Incorporating lexical priors into topic models

J Jagarlamudi, H Daumé III… - Proceedings of the 13th …, 2012 - aclanthology.org
Topic models have great potential for helping users understand document corpora. This
potential is stymied by their purely unsupervised nature, which often leads to topics that are …