Navigating the pitfalls of applying machine learning in genomics

S Whalen, J Schreiber, WS Noble… - Nature Reviews Genetics, 2022 - nature.com
The scale of genetic, epigenomic, transcriptomic, cheminformatic and proteomic data
available today, coupled with easy-to-use machine learning (ML) toolkits, has propelled the …

A spectrum of explainable and interpretable machine learning approaches for genomic studies

AM Conard, A DenAdel… - Wiley Interdisciplinary …, 2023 - Wiley Online Library
The advancement of high‐throughput genomic assays has led to enormous growth in the
availability of large‐scale biological datasets. Over the last two decades, these increasingly …

Tree-based QTL mapping with expected local genetic relatedness matrices

V Link, JG Schraiber, C Fan, B Dinh, N Mancuso… - The American Journal of …, 2023 - cell.com
Understanding the genetic basis of complex phenotypes is a central pursuit of genetics.
Genome-wide association studies (GWASs) are a powerful way to find genetic loci …

Gene regulatory effects of a large chromosomal inversion in highland maize

T Crow, J Ta, S Nojoomi, MR Aguilar-Rangel… - PLoS …, 2020 - journals.plos.org
Chromosomal inversions play an important role in local adaptation. Inversions can capture
multiple locally adaptive functional variants in a linked block by repressing recombination …

Genome wide association mapping for agronomic, fruit quality, and root architectural traits in tomato under organic farming conditions

P Tripodi, S Soler, G Campanelli, MJ Díez, S Esposito… - BMC Plant …, 2021 - Springer
Background: Opportunity and challenges of the agriculture scenario of the next decades will
face increasing demand for secure food through approaches able to minimize the input to …

MegaLMM: mega-scale linear mixed models for genomic predictions with thousands of traits

DE Runcie, J Qu, H Cheng, L Crawford - Genome biology, 2021 - Springer
Large-scale phenotype data can enhance the power of genomic prediction in plant and
animal breeding, as well as human genetics. However, the statistical foundation of multi-trait …

An adaptive teosinte mexicana introgression modulates phosphatidylcholine levels and is associated with maize flowering time

AC Barnes, F Rodríguez-Zapata… - Proceedings of the …, 2022 - National Acad Sciences
Native Americans domesticated maize (Zea mays ssp. mays) from lowland teosinte
parviglumis (Zea mays ssp. parviglumis) in the warm Mexican southwest and brought it to …

Efficient variance components analysis across millions of genomes

A Pazokitoroudi, Y Wu, KS Burch, K Hou, A Zhou… - Nature …, 2020 - nature.com
While variance components analysis has emerged as a powerful tool in complex trait
genetics, existing methods for fitting variance components do not scale well to large-scale …

Efficient ReML inference in variance component mixed models using a Min-Max algorithm

F Laporte, A Charcosset… - PLoS computational …, 2022 - journals.plos.org
Since their introduction in the 1950s, variance component mixed models have been widely
used in many application fields. In this context, ReML estimation is by far the most popular …

Matrix sketching framework for linear mixed models in association studies

M Burch, A Bose, G Dexter, L Parida… - Genome …, 2024 - genome.cshlp.org
Linear mixed models (LMMs) have been widely used in genome-wide association studies to
control for population stratification and cryptic relatedness. However, estimating LMM …