From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …
Statistical language models for information retrieval: A critical review
CX Zhai - Foundations and Trends® in Information Retrieval, 2008 - nowpublishers.com
Statistical language models have recently been successfully applied to many information
retrieval problems. A great deal of recent work has shown that statistical language models …
Domain-specific language model pretraining for biomedical natural language processing
Pretraining large neural language models, such as BERT, has led to impressive gains on
many natural language processing (NLP) tasks. However, most pretraining efforts focus on …
Large-scale evidence for logarithmic effects of word predictability on reading time
During real-time language comprehension, our minds rapidly decode complex meanings
from sequences of words. The difficulty of doing so is known to be related to words' …
Good-enough compositional data augmentation
J Andreas - arXiv preprint arXiv:1904.09545, 2019 - arxiv.org
We propose a simple data augmentation protocol aimed at providing a compositional
inductive bias in conditional and unconditional sequence models. Under this protocol …
Applications of topic models
How can a single person understand what's going on in a collection of millions of
documents? This is an increasingly common problem: sifting through an organization's e …
Automatic language identification in texts: A survey
Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …
Composition in distributional models of semantics
J Mitchell, M Lapata - Cognitive science, 2010 - Wiley Online Library
Vector‐based models of word meaning have become increasingly popular in cognitive
science. The appeal of these models lies in their ability to represent meaning simply by …
An empirical study of smoothing techniques for language modeling
SF Chen, J Goodman - Computer Speech & Language, 1999 - Elsevier
We survey the most widely-used algorithms for smoothing models for language n-gram
modeling. We then present an extensive empirical comparison of several of these smoothing …
Automating the construction of internet portals with machine learning
Domain-specific internet portals are growing in popularity because they gather
content from the Web and organize it for easy access, retrieval and search. For example …