Concept decompositions for large sparse text data using clustering

Information retrieval on the web

M Kobayashi, K Takeda - ACM computing surveys (CSUR), 2000 - dl.acm.org

In this paper we review studies of the growth of the Internet and technologies that are useful
for information search and retrieval on the Web. We present data on the Internet from several …

Cited by 1042 · Related articles · All 21 versions

Recent advances in directional statistics

A Pewsey, E García-Portugués - Test, 2021 - Springer

Mainstream statistical methodology is generally applicable to data observed in Euclidean
space. There are, however, numerous contexts of considerable scientific interest in which …

Cited by 128 · Related articles · All 10 versions

Characterizing long COVID in an international cohort: 7 months of symptoms and their impact

HE Davis, GS Assaf, L McCorkell, H Wei, RJ Low… - …, 2021 - thelancet.com

Background A significant number of patients with COVID-19 experience prolonged
symptoms, known as Long COVID. Few systematic studies have investigated this …

Cited by 2598 · Related articles · All 25 versions

Fast, sensitive and accurate integration of single-cell data with Harmony

I Korsunsky, N Millard, J Fan, K Slowikowski, F Zhang… - Nature …, 2019 - nature.com

The emerging diversity of single-cell RNA-seq datasets allows for the full transcriptional
characterization of cell types across a wide variety of biological and clinical conditions …

Cited by 5402 · Related articles · All 11 versions

[Book] Machine learning for text: An introduction

CC Aggarwal - 2018 - Springer

The extraction of useful insights from text with various types of statistical algorithms is
referred to as text mining, text analytics, or machine learning from text. The choice of …

Cited by 463 · Related articles · All 9 versions

Semantic encoding during language comprehension at single-cell resolution

M Jamali, B Grannan, J Cai, AR Khanna, W Muñoz… - Nature, 2024 - nature.com

From sequences of speech sounds, or letters, humans can extract rich and nuanced
meaning through language. This capacity is essential for human communication. Yet …

Cited by 14 · Related articles · All 13 versions

Stop using the elbow criterion for k-means and how to choose the number of clusters instead

E Schubert - ACM SIGKDD Explorations Newsletter, 2023 - dl.acm.org

A major challenge when using k-means clustering often is how to choose the parameter k,
the number of clusters. In this letter, we want to point out that it is very easy to draw poor …

Cited by 87 · Related articles · All 5 versions

Improving word representations via global context and multiple word prototypes

EH Huang, R Socher, CD Manning… - Proceedings of the 50th …, 2012 - aclanthology.org

Unsupervised word representations are very useful in NLP tasks both as inputs to learning
algorithms and as extra word features in NLP systems. However, most of these models are …

Cited by 1649 · Related articles · All 18 versions

Learning feature representations with k-means

A Coates, AY Ng - Neural Networks: Tricks of the Trade: Second Edition, 2012 - Springer

Many algorithms are available to learn deep hierarchies of features from unlabeled data,
especially images. In many cases, these algorithms involve multi-layered networks of …

Cited by 976 · Related articles · All 10 versions

Unsupervised grouped axial data modeling via hierarchical Bayesian nonparametric models with Watson distributions

W Fan, L Yang, N Bouguila - IEEE Transactions on Pattern …, 2021 - ieeexplore.ieee.org

This paper aims at proposing an unsupervised hierarchical nonparametric Bayesian
framework for modeling axial data (i.e., observations are axes of direction) that can be …

Cited by 74 · Related articles · All 4 versions