Sampling design optimization for soil mapping with random forest

AMJC Wadoux, DJ Brus, GBM Heuvelink - Geoderma, 2019 - Elsevier
Abstract: Machine learning techniques are widely employed to generate digital soil maps.
The map accuracy is partly determined by the number and spatial locations of the …

A debiased MDI feature importance measure for random forests

X Li, Y Wang, S Basu, K Kumbier… - Advances in Neural …, 2019 - proceedings.neurips.cc
Tree ensembles such as Random Forests have achieved impressive empirical success
across a wide variety of applications. To understand how these models make predictions …

Unraveling correlations between molecular properties and device parameters of organic solar cells using machine learning

H Sahu, H Ma - The Journal of Physical Chemistry Letters, 2019 - ACS Publications
Understanding the relationships between molecular properties and device parameters is
highly desired not only to improve the overall performance of an organic solar cell but also to …

Machine learning-based predictive modeling of surgical intervention in glaucoma using systemic data from electronic health records

SL Baxter, C Marks, TT Kuo, L Ohno-Machado… - American journal of …, 2019 - Elsevier
Purpose: To predict the need for surgical intervention in patients with primary open-angle
glaucoma (POAG) using systemic data in electronic health records (EHRs). Design …

Identification of discriminatory antibiotic resistance genes among environmental resistomes using extremely randomized tree algorithm

S Gupta, G Arango-Argoty, L Zhang, A Pruden… - Microbiome, 2019 - Springer
Background: The interconnectivities of built and natural environments can serve as conduits
for the proliferation and dissemination of antibiotic resistance genes (ARGs). Several studies …

Machine learning prediction on properties of nanoporous materials utilizing pore geometry barcodes

X Zhang, J Cui, K Zhang, J Wu… - Journal of chemical …, 2019 - ACS Publications
In this work, we propose a computational framework for machine learning prediction on
structural and performance properties of nanoporous materials for methane storage …

Surrogate minimal depth as an importance measure for variables in random forests

S Seifert, S Gundlach, S Szymczak - Bioinformatics, 2019 - academic.oup.com
Motivation: It has been shown that the machine learning approach random forest can be
successfully applied to omics data, such as gene expression data, for classification or …

Package 'ranger'

MN Wright, S Wager, P Probst, MMN Wright - Version 0.11, 2019 - mirror.las.iastate.edu
Description: A fast implementation of Random Forests, particularly suited for high
dimensional data. Ensembles of classification, regression, survival and probability prediction …

Personalised analytics for rare disease diagnostics

D Anderson, G Baynam, JM Blackwell… - Nature …, 2019 - nature.com
Whole genome and exome sequencing is a standard tool for the diagnosis of patients
suffering from rare and other genetic disorders. The interpretation of the tens of thousands of …

Unbiased measurement of feature importance in tree-based methods

Z Zhou, G Hooker - arXiv preprint arXiv:1903.05179, 2019 - arxiv.org
We propose a modification that corrects for split-improvement variable importance measures
in Random Forests and other tree-based methods. These methods have been shown to be …