Benchmark of filter methods for feature selection in high-dimensional gene expression survival data

A Bommert, T Welchowski, M Schmid… - Briefings in …, 2022 - academic.oup.com
Feature selection is crucial for the analysis of high-dimensional data, but benchmark studies
for data with a survival outcome are rare. We compare 14 filter methods for feature selection …

mlr: Machine Learning in R

B Bischl, M Lang, L Kotthoff, J Schiffner… - Journal of Machine …, 2016 - jmlr.org
The mlr package provides a generic, object-oriented, and extensible framework for
classification, regression, survival analysis and clustering for the R language. It provides a …

ranger: A fast implementation of random forests for high dimensional data in C++ and R

MN Wright, A Ziegler - arXiv preprint arXiv:1508.04409, 2015 - arxiv.org
We introduce the C++ application and R package ranger. The software is a fast
implementation of random forests for high dimensional data. Ensembles of classification …

To tune or not to tune the number of trees in random forest

P Probst, AL Boulesteix - Journal of Machine Learning Research, 2018 - jmlr.org
The number of trees T in the random forest (RF) algorithm for supervised learning has to be
set by the user. It is unclear whether T should simply be set to the largest computationally …

A unifying framework for parallel and distributed processing in R using futures

H Bengtsson - arXiv preprint arXiv:2008.00553, 2020 - arxiv.org
A future is a programming construct designed for concurrent and asynchronous evaluation
of code, making it particularly useful for parallel processing. The future package implements …

ASlib: A benchmark library for algorithm selection

B Bischl, P Kerschke, L Kotthoff, M Lindauer… - Artificial Intelligence, 2016 - Elsevier
The task of algorithm selection involves choosing an algorithm from a set of algorithms on a
per-instance basis in order to exploit the varying performance of algorithms over a set of …

systemPipeR: NGS workflow and report generation environment

TWH Backman, T Girke - BMC Bioinformatics, 2016 - Springer
Background: Next-generation sequencing (NGS) has revolutionized how research is carried
out in many areas of biology and medicine. However, the analysis of NGS data remains a …

mlrMBO: A modular framework for model-based optimization of expensive black-box functions

B Bischl, J Richter, J Bossek, D Horn, J Thomas… - arXiv preprint arXiv …, 2017 - arxiv.org
We present mlrMBO, a flexible and comprehensive R toolbox for model-based optimization
(MBO), also known as Bayesian optimization, which addresses the problem of expensive …

Integrative analysis from the epigenome to translatome uncovers patterns of dominant nuclear regulation during transient stress

TA Lee, J Bailey-Serres - The Plant Cell, 2019 - academic.oup.com
Gene regulation is a dynamic process involving changes ranging from the remodeling of
chromatin to preferential translation. To understand integrated nuclear and cytoplasmic …

Learning the high-dimensional immunogenomic features that predict public and private antibody repertoires

V Greiff, CR Weber, J Palme, U Bodenhofer… - The Journal of …, 2017 - journals.aai.org
Recent studies have revealed that immune repertoires contain a substantial fraction of
public clones, which may be defined as Ab or TCR clonal sequences shared across …