The energy of data

GJ Székely, ML Rizzo - Annual Review of Statistics and Its …, 2017 - annualreviews.org
The energy of data is the value of a real function of distances between data in metric spaces.
The name energy derives from Newton's gravitational potential energy, which is also a …

A new coefficient of correlation

S Chatterjee - Journal of the American Statistical Association, 2021 - Taylor & Francis
Is it possible to define a coefficient of correlation which is (a) as simple as the classical
coefficients like Pearson's correlation or Spearman's correlation, and yet (b) consistently …

Gene networks in plant biology: approaches in reconstruction and analysis

Y Li, SA Pearl, SA Jackson - Trends in plant science, 2015 - cell.com
Even though vast amounts of genome-wide gene expression data have become available in
plants, it remains a challenge to effectively mine this information for the discovery of genes …

Equitability, mutual information, and the maximal information coefficient

JB Kinney, GS Atwal - … of the National Academy of Sciences, 2014 - National Acad Sciences
How should one quantify the strength of association between two random variables without
bias for relationships of a specific form? Despite its conceptual simplicity, this notion of …

Rates of estimation of optimal transport maps using plug-in estimators via barycentric projections

N Deb, P Ghosal, B Sen - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Optimal transport maps between two probability distributions $\mu$ and $\nu$ on $\mathbb{R}^d$
have found extensive applications in both machine learning and statistics. In practice, these …

Multivariate rank-based distribution-free nonparametric testing using measure transportation

N Deb, B Sen - Journal of the American Statistical Association, 2023 - Taylor & Francis
In this article, we propose a general framework for distribution-free nonparametric testing in
multi-dimensions, based on a notion of multivariate ranks defined using the theory of …

Deep knockoffs

Y Romano, M Sesia, E Candès - Journal of the American Statistical …, 2020 - Taylor & Francis
This article introduces a machine for sampling approximate model-X knockoffs for arbitrary
and unspecified data distributions using deep generative models. The main idea is to …

Efficient estimation of mutual information for strongly dependent variables

S Gao, G Ver Steeg, A Galstyan - Artificial intelligence and …, 2015 - proceedings.mlr.press
We demonstrate that a popular class of non-parametric mutual information (MI) estimators
based on k-nearest-neighbor graphs requires number of samples that scales exponentially …

A comparative study of statistical methods used to identify dependencies between gene expression signals

S de Siqueira Santos, DY Takahashi… - Briefings in …, 2014 - academic.oup.com
One major task in molecular biology is to understand the dependency among genes to
model gene regulatory networks. Pearson's correlation is the most common method used to …

Distribution-free consistent independence tests via center-outward ranks and signs

H Shi, M Drton, F Han - Journal of the American Statistical …, 2022 - Taylor & Francis
This article investigates the problem of testing independence of two random vectors of
general dimensions. For this, we give for the first time a distribution-free consistent test. Our …