Challenges of big data analysis

J Fan, F Han, H Liu - National science review, 2014 - academic.oup.com
Big Data bring new opportunities to modern society and challenges to data scientists. On the
one hand, Big Data hold great promises for discovering subtle population patterns and …

Cancer transcriptome profiling at the juncture of clinical translation

M Cieślik, AM Chinnaiyan - Nature Reviews Genetics, 2018 - nature.com
Methodological breakthroughs over the past four decades have repeatedly revolutionized
transcriptome profiling. Using RNA sequencing (RNA-seq), it has now become possible to …

ArrayExpress update–from bulk to single-cell expression data

A Athar, A Füllgrabe, N George, H Iqbal… - Nucleic acids …, 2019 - academic.oup.com
Abstract ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional
genomics data from a variety of technologies assaying functional modalities of a genome …

A genome-wide transcriptomic analysis of protein-coding genes in human blood cells

M Uhlen, MJ Karlsson, W Zhong, A Tebani, C Pou… - Science, 2019 - science.org
INTRODUCTION: Blood is the predominant source for molecular analyses in humans, both in
clinical and research settings, and is the target for many therapeutic strategies, emphasizing …

Random forest versus logistic regression: a large-scale benchmark experiment

R Couronné, P Probst, AL Boulesteix - BMC bioinformatics, 2018 - Springer
Abstract Background and goal: The Random Forest (RF) algorithm for regression and
classification has considerably gained popularity since its introduction in 2001. Meanwhile, it …

Massive mining of publicly available RNA-seq data from human and mouse

A Lachmann, D Torre, AB Keenan, KM Jagodnik… - Nature …, 2018 - nature.com
RNA sequencing (RNA-seq) is the leading technology for genome-wide transcript
quantification. However, publicly available RNA-seq data is currently provided mostly in raw …

ClustVis: a web tool for visualizing clustering of multivariate data using Principal Component Analysis and heatmap

T Metsalu, J Vilo - Nucleic acids research, 2015 - academic.oup.com
Abstract The Principal Component Analysis (PCA) is a widely used method of reducing the
dimensionality of high-dimensional data, often followed by visualizing two of the …

Tissue-based map of the human proteome

M Uhlén, L Fagerberg, BM Hallström, C Lindskog… - Science, 2015 - science.org
INTRODUCTION: Resolving the molecular details of proteome variation in the different
tissues and organs of the human body would greatly increase our knowledge of human …

Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics

L Fagerberg, BM Hallström, P Oksvold, C Kampf… - Molecular & cellular …, 2014 - ASBMB
Global classification of the human proteins with regards to spatial expression patterns
across organs and tissues is important for studies of human biology and disease. Here, we …

Standardization of sample collection, isolation and analysis methods in extracellular vesicle research

KW Witwer, EI Buzás, LT Bemis, A Bora… - Journal of …, 2013 - Taylor & Francis
The emergence of publications on extracellular RNA (exRNA) and extracellular vesicles
(EV) has highlighted the potential of these molecules and vehicles as biomarkers of disease …