The revival of the Gini importance?

S Nembrini, IR König, MN Wright - Bioinformatics, 2018 - academic.oup.com
Motivation Random forests are fast, flexible and represent a robust approach to analyze high
dimensional data. A key advantage over alternative machine learning algorithms are …

Bias in random forest variable importance measures: Illustrations, sources and a solution

C Strobl, AL Boulesteix, A Zeileis, T Hothorn - BMC bioinformatics, 2007 - Springer
Background Variable importance measures for random forests have been receiving
increased attention as a means of variable selection in many classification tasks in …

Conditional permutation importance revisited

D Debeer, C Strobl - BMC bioinformatics, 2020 - Springer
Background Random forest based variable importance measures have become popular
tools for assessing the contributions of the predictor variables in a fitted random forest. In this …

Exploitation of surrogate variables in random forests for unbiased analysis of mutual impact and importance of features

LF Voges, LC Jarren, S Seifert - Bioinformatics, 2023 - academic.oup.com
Motivation Random forest is a popular machine learning approach for the analysis of high-
dimensional data because it is flexible and provides variable importance measures for the …

Unbiased variable importance for random forests

M Loecher - Communications in Statistics-Theory and Methods, 2022 - Taylor & Francis
The default variable-importance measure in random forests, Gini importance, has been
shown to suffer from the bias of the underlying Gini-gain splitting criterion. While the …

Conditional variable importance for random forests

C Strobl, AL Boulesteix, T Kneib, T Augustin… - BMC …, 2008 - Springer
Background Random forests are becoming increasingly popular in many scientific fields
because they can cope with "small n large p" problems, complex interactions and even …

An AUC-based permutation variable importance measure for random forests

S Janitza, C Strobl, AL Boulesteix - BMC bioinformatics, 2013 - Springer
Background The random forest (RF) method is a commonly used tool for classification with
high dimensional data as well as for ranking candidate predictors based on the so-called …

A computationally fast variable importance test for random forests for high-dimensional data

S Janitza, E Celik, AL Boulesteix - Advances in Data Analysis and …, 2018 - Springer
Random forests are a commonly used tool for classification and for ranking candidate
predictors based on the so-called variable importance measures. These measures attribute …

The behaviour of random forest permutation-based variable importance measures under predictor correlation

KK Nicodemus, JD Malley, C Strobl, A Ziegler - BMC bioinformatics, 2010 - Springer
Background Random forests (RF) have been increasingly used in applications such as
genome-wide association and microarray studies where predictor correlation is frequently …

Trees, forests, and impurity-based variable importance in regression

E Scornet - Annales de l'Institut Henri Poincaré (B) Probabilités et …, 2023 - projecteuclid.org
Tree ensemble methods such as random forests (Mach. Learn. 45 (2001) 5–32) are very
popular to handle high-dimensional tabular data sets, notably because of their ability to …