SMS spam filtering: Methods and data

SJ Delany, M Buckley, D Greene - Expert Systems with Applications, 2012 - Elsevier
Mobile or SMS spam is a real and growing problem primarily due to the availability of very
cheap bulk pre-pay SMS packages and the fact that SMS engenders higher response rates …

Named entity recognition in tweets: an experimental study

A Ritter, S Clark, O Etzioni - … of the 2011 conference on empirical …, 2011 - aclanthology.org
People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes
informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented …

A review of shorthand systems: From brachygraphy to microtext and beyond

R Satapathy, E Cambria, A Nanetti, A Hussain - Cognitive Computation, 2020 - Springer
Human civilizations have performed the art of writing across continents and over different
time periods. In order to speed up the writing process, the art of shorthand (brachygraphy) …

Neural models of text normalization for speech applications

H Zhang, R Sproat, AH Ng, F Stahlberg… - Computational …, 2019 - direct.mit.edu
Machine learning, including neural network techniques, have been applied to
virtually every domain in natural language processing. One problem that has been …

Colloquial Indonesian lexicon

NA Salsabila, YA Winatmoko… - … Conference on Asian …, 2018 - ieeexplore.ieee.org
Nikmatun Aliyah Salsabila, Yosef Ardhito Winatmoko, Ali Akbar Septiandri, Ade Jamal.
Faculty of …

A broad-coverage normalization system for social media language

F Liu, F Weng, X Jiang - Proceedings of the 50th Annual Meeting …, 2012 - aclanthology.org
Social media language contains huge amount and wide variety of nonstandard tokens,
created both intentionally and unintentionally by the users. It is of crucial importance to …

RNN approaches to text normalization: A challenge

R Sproat, N Jaitly - arXiv preprint arXiv:1611.00068, 2016 - arxiv.org
This paper presents a challenge to the community: given a large corpus of written text
aligned to its normalized spoken form, train an RNN to learn the correct normalization …

Phonetic-based microtext normalization for Twitter sentiment analysis

R Satapathy, C Guerreiro, I Chaturvedi… - … conference on data …, 2017 - ieeexplore.ieee.org
The proliferation of Web 2.0 technologies and the increasing use of computer-mediated
communication resulted in a new form of written text, termed microtext. This poses new …

An unsupervised model for text message normalization

P Cook, S Stevenson - Proceedings of the workshop on …, 2009 - aclanthology.org
Cell phone text messaging users express themselves briefly and colloquially using a variety
of creative forms. We analyze a sample of creative, non-standard text message word forms …

Insertion, deletion, or substitution? Normalizing text messages without pre-categorization nor supervision

F Liu, F Weng, B Wang, Y Liu - … of the 49th Annual Meeting of the …, 2011 - aclanthology.org
Most text message normalization approaches are based on supervised learning and rely on
human labeled training data. In addition, the nonstandard words are often categorized into …