ccImpute: an accurate and scalable consensus clustering based algorithm to impute dropout events in the single-cell RNA-seq data

M Malec, H Kurban, M Dalkilic - BMC bioinformatics, 2022 - Springer
Background: In recent years, the introduction of single-cell RNA sequencing (scRNA-seq)
has enabled the analysis of a cell's transcriptome at an unprecedented granularity and …

[HTML] DCEM: An R package for clustering big data via data-centric modification of Expectation Maximization

P Sharma, H Kurban, M Dalkilic - SoftwareX, 2022 - Elsevier
Clustering is intractable, so techniques exist to give a best approximation. Expectation
Maximization (EM), initially used to impute missing data, is among the most popular …

[PDF] Data expressiveness and its use in data-centric AI

H Kurban, P Sharma, M Dalkilic - Proceedings of NeurIPS Data …, 2021 - researchgate.net
To deal with the unimaginable continual growth of data and the focus on its use rather than
its governance, the value of data has begun to deteriorate seen in lack of reproducibility …

Designing a parallel Feel-the-Way clustering algorithm on HPC systems

W Zheng, D Wang, F Song - The International Journal of …, 2021 - journals.sagepub.com
This paper introduces a new parallel clustering algorithm, named Feel-the-Way clustering
algorithm, that provides better or equivalent convergence rate than the traditional clustering …

[PDF] ccImpute: an accurate and scalable consensus clustering based algorithm to impute dropout events in the single-cell RNA-seq data

M Malec, H Kurban, M Dalkilic - 2022 - researchgate.net
Background: In recent years, the introduction of single-cell RNA sequencing (scRNA-seq)
has enabled the analysis of a cell's transcriptome at an unprecedented granularity and …

[HTML] Scalable Parallel Machine Learning on High Performance Computing Systems–Clustering and Reinforcement Learning

W Zheng - 2022 - hammer.purdue.edu
High-performance computing (HPC) and machine learning (ML) have been widely adopted
by both academia and industries to address enormous data problems at extreme scales …

[PDF] What Data-Centric AI Can Do For k-means: a Faster, Robust kmeans-d

P Sharma, H Kurban, M Dalkilic - researchgate.net
Data-centric AI (DCAI) is an emerging paradigm that prioritizes the quality, diversity, and
representation of data over model architecture and hyperparameter tuning. DCAI …

[PDF] Design and Implementation of an Efficient Parallel Feel-the-Way Clustering Algorithm on High Performance Computing Systems

W Zheng, F Song, D Wang - 2018 - docs.lib.purdue.edu
This paper proposes a Feel-the-Way clustering method, which reduces the synchronization
and communication overhead, meanwhile providing as good as or better convergence rate …