Authors
Weiping Ma, Sunkyu Kim, Shrabanti Chowdhury, YANG Mi, Seungyeul Yoo, Francesca Petralia, Jeremy Jacobsen, Jingyi Jessica Li, Xinzhou Ge, Kexin Li, Thomas Yu, Nathan John Edwards, Samuel Payne, Paul C Boutros, Henry Rodriguez, Gustavo A Stolovitzky, Jaewoo Kang, David Fenyo, Julio Saez, Pei Wang
Publication date
2020/1/1
Journal
bioRxiv
Publisher
Cold Spring Harbor Laboratory
Description
Deep proteomics profiling using labeled LC-MS/MS experiments has proven powerful for studying complex diseases. However, due to the dynamic nature of discovery mass spectrometry, the resulting data contain a substantial fraction of missing values. This poses great challenges for data analysis, as many tools, especially those for high-dimensional data, cannot handle missing values directly. To address this problem, the NCI-CPTAC Proteogenomics DREAM Challenge was carried out to develop effective imputation algorithms for labeled LC-MS/MS proteomics data through crowd learning. The resulting algorithm, DreamAI, is based on an ensemble of six different imputation methods. The imputation accuracy of DreamAI, as measured by Pearson correlation, is about 15%-50% greater than that of existing tools among less abundant proteins, which are more likely to be missing from proteomics data sets. This new tool notably enhances data analysis capabilities in proteomics research.
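The idea of ensemble imputation scored by Pearson correlation can be illustrated with a minimal sketch. This is not the DreamAI implementation: the two imputers below (row mean and column median) are simple stand-ins chosen for illustration, not the six methods in the ensemble, and the data matrix, masking rate, and all function names are hypothetical.

```python
import numpy as np

def impute_row_mean(X):
    """Fill each missing entry with the mean of its row (protein)."""
    filled = X.copy()
    row_means = np.nanmean(X, axis=1, keepdims=True)
    missing = np.isnan(X)
    filled[missing] = np.broadcast_to(row_means, X.shape)[missing]
    return filled

def impute_col_median(X):
    """Fill each missing entry with the median of its column (sample)."""
    filled = X.copy()
    col_medians = np.nanmedian(X, axis=0, keepdims=True)
    missing = np.isnan(X)
    filled[missing] = np.broadcast_to(col_medians, X.shape)[missing]
    return filled

def ensemble_impute(X, imputers):
    """Average the completed matrices produced by several imputers."""
    return np.mean([impute(X) for impute in imputers], axis=0)

def masked_pearson(truth, imputed, mask):
    """Pearson correlation between true and imputed values on masked entries."""
    return float(np.corrcoef(truth[mask], imputed[mask])[0, 1])

# Synthetic proteins-by-samples matrix (assumption: additive row and column effects).
rng = np.random.default_rng(0)
truth = (rng.normal(size=(50, 1))
         + rng.normal(size=(1, 20))
         + 0.1 * rng.normal(size=(50, 20)))

# Hide about 20% of the entries to mimic missing values.
mask = rng.random(truth.shape) < 0.2
observed = truth.copy()
observed[mask] = np.nan

imputed = ensemble_impute(observed, [impute_row_mean, impute_col_median])
score = masked_pearson(truth, imputed, mask)
print(f"Pearson correlation on held-out entries: {score:.3f}")
```

Scoring only the artificially hidden entries, where the true values are known, is what makes the correlation a measure of imputation accuracy rather than of agreement on values that were never missing.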
Total citations
[Chart: citations per year, 2020 to 2024]