On measuring and correcting the effects of data mining and model selection

J Ye - Journal of the American Statistical Association, 1998 - Taylor & Francis
In the theory of linear models, the concept of degrees of freedom plays an important role.
This concept is often used for measurement of model complexity, for obtaining an unbiased …

On the distributional properties of model selection criteria

P Zhang - Journal of the American Statistical Association, 1992 - Taylor & Francis
It is commonly accepted that statistical modeling should follow the parsimony principle;
namely, that simple models should be given priority whenever possible. But little quantitative …

Checking normality and homoscedasticity in the general linear model using diagnostic plots

A Schützenmeister, U Jensen… - … in Statistics-Simulation …, 2012 - Taylor & Francis
Inference for the general linear model makes several assumptions, including independence
of errors, normality, and homogeneity of variance. Departure from the latter two of these …

Consistent variable selection in linear models

X Zheng, WY Loh - Journal of the American Statistical Association, 1995 - Taylor & Francis
A method of estimating linear model dimension and variable selection is proposed. This new
criterion, which generalizes the Cp criterion, the Akaike information criterion (AIC), the Bayes …

The relationship between variable selection and data augmentation and a method for prediction

DM Allen - Technometrics, 1974 - Taylor & Francis
We show that data augmentation provides a rather general formulation for the study of
biased prediction techniques using multiple linear regression. Variable selection is a limiting …

Assessment of local influence

RD Cook - Journal of the Royal Statistical Society Series B …, 1986 - academic.oup.com
Statistical models usually involve some degree of approximation and therefore are nearly
always wrong. Because of this inexactness, an assessment of the influence of minor …

The statistics of linear models: back to basics

JA Nelder - Statistics and Computing, 1994 - Springer
Inference from the fitting of linear models is basic to statistical practice, but the development
of strategies for analysis has been hindered by unnecessary complexities in the descriptions …

Generalized collinearity diagnostics

J Fox, G Monette - Journal of the American Statistical Association, 1992 - Taylor & Francis
Working in the context of the linear model y = Xβ + ε, we generalize the concept of variance
inflation as a measure of collinearity to a subset of parameters in β (denoted by β₁, with the …

Evaluation of regression models: Model assessment, model selection and generalization error

F Emmert-Streib, M Dehmer - Machine Learning and Knowledge Extraction, 2019 - mdpi.com
When performing a regression or classification analysis, one needs to specify a statistical
model. This model should avoid the overfitting and underfitting of data, and achieve a low …

A coefficient of determination for generalized linear models

D Zhang - The American Statistician, 2017 - Taylor & Francis
The coefficient of determination, aka R², is well defined in linear regression models, and
measures the proportion of variation in the dependent variable explained by the predictors …