The energy of data
GJ Székely, ML Rizzo - Annual Review of Statistics and Its …, 2017 - annualreviews.org
The energy of data is the value of a real function of distances between data in metric spaces.
The name energy derives from Newton's gravitational potential energy, which is also a …
A new coefficient of correlation
S Chatterjee - Journal of the American Statistical Association, 2021 - Taylor & Francis
Is it possible to define a coefficient of correlation which is (a) as simple as the classical
coefficients like Pearson's correlation or Spearman's correlation, and yet (b) consistently …
Gene networks in plant biology: approaches in reconstruction and analysis
Y Li, SA Pearl, SA Jackson - Trends in Plant Science, 2015 - cell.com
Even though vast amounts of genome-wide gene expression data have become available in
plants, it remains a challenge to effectively mine this information for the discovery of genes …
Equitability, mutual information, and the maximal information coefficient
JB Kinney, GS Atwal - … of the National Academy of Sciences, 2014 - National Acad Sciences
How should one quantify the strength of association between two random variables without
bias for relationships of a specific form? Despite its conceptual simplicity, this notion of …
Rates of estimation of optimal transport maps using plug-in estimators via barycentric projections
Optimal transport maps between two probability distributions $\mu$ and $\nu$ on $\mathbb{R}^d$
have found extensive applications in both machine learning and statistics. In practice, these …
Multivariate rank-based distribution-free nonparametric testing using measure transportation
In this article, we propose a general framework for distribution-free nonparametric testing in
multi-dimensions, based on a notion of multivariate ranks defined using the theory of …
Deep knockoffs
This article introduces a machine for sampling approximate model-X knockoffs for arbitrary
and unspecified data distributions using deep generative models. The main idea is to …
Efficient estimation of mutual information for strongly dependent variables
We demonstrate that a popular class of non-parametric mutual information (MI) estimators
based on k-nearest-neighbor graphs requires number of samples that scales exponentially …
A comparative study of statistical methods used to identify dependencies between gene expression signals
S de Siqueira Santos, DY Takahashi… - Briefings in …, 2014 - academic.oup.com
One major task in molecular biology is to understand the dependency among genes to
model gene regulatory networks. Pearson's correlation is the most common method used to …
Distribution-free consistent independence tests via center-outward ranks and signs
This article investigates the problem of testing independence of two random vectors of
general dimensions. For this, we give for the first time a distribution-free consistent test. Our …