Challenges of big data analysis
Big Data bring new opportunities to modern society and challenges to data scientists. On the
one hand, Big Data hold great promises for discovering subtle population patterns and …
High-dimensional statistics with a view toward applications in biology
We review statistical methods for high-dimensional data analysis and pay particular
attention to recent developments for assessing uncertainties in terms of controlling false …
Development of a stacked ensemble model for forecasting and analyzing daily average PM2.5 concentrations in Beijing, China
B Zhai, J Chen - Science of the Total Environment, 2018 - Elsevier
A stacked ensemble model is developed for forecasting and analyzing the daily average
concentrations of fine particulate matter (PM 2.5) in Beijing, China. Special feature extraction …
Non-negative least squares for high-dimensional linear models: Consistency and sparse recovery without regularization
M Slawski, M Hein - 2013 - projecteuclid.org
Least squares fitting is in general not useful for high-dimensional linear models, in which the
number of predictors is of the same or even larger order of magnitude than the number of …
Higher criticism for large-scale inference, especially for rare and weak effects
In modern high-throughput data analysis, researchers perform a large number of statistical
tests, expecting to find perhaps a small fraction of significant effects against a predominantly …
Exact post model selection inference for marginal screening
We develop a framework for post model selection inference, via marginal screening, in
linear regression. At the core of this framework is a result that characterizes the exact …
Matrix factorization techniques in machine learning, signal processing, and statistics
Compressed sensing is an alternative to Shannon/Nyquist sampling for acquiring sparse or
compressible signals. Sparse coding represents a signal as a sparse linear combination of …
Ordered weighted l1 regularized regression with strongly correlated covariates: Theoretical aspects
M Figueiredo, R Nowak - Artificial Intelligence and Statistics, 2016 - proceedings.mlr.press
This paper studies the ordered weighted L1 (OWL) family of regularizers for sparse linear
regression with strongly correlated covariates. We prove sufficient conditions for clustering …
Maximin effects in inhomogeneous large-scale data
N Meinshausen, P Bühlmann - 2015 - projecteuclid.org
Large-scale data are often characterized by some degree of inhomogeneity as data are
either recorded in different time regimes or taken from multiple sources. We look at …