Authors
Digna R Velez, Bill C White, Alison A Motsinger, William S Bush, Marylyn D Ritchie, Scott M Williams, Jason H Moore
Publication date
2007/5
Journal
Genetic Epidemiology: the Official Publication of the International Genetic Epidemiology Society
Volume
31
Issue
4
Pages
306-315
Publisher
Wiley Subscription Services, Inc., A Wiley Company
Description
Multifactor dimensionality reduction (MDR) was developed as a method for detecting statistical patterns of epistasis. The overall goal of MDR is to change the representation space of the data to make interactions easier to detect. It is well known that machine learning methods may not provide robust models when the class variable (e.g. case‐control status) is imbalanced and accuracy is used as the fitness measure. This is because most methods learn patterns that are relevant for the larger of the two classes. The goal of this study was to evaluate three different strategies for improving the power of MDR to detect epistasis in imbalanced datasets. The methods evaluated were: (1) over‐sampling that resamples with replacement the smaller class until the data are balanced, (2) under‐sampling that randomly removes subjects from the larger class until the data are balanced, and (3) balanced accuracy …
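The three strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MDR implementation: the function names (`split_by_class`, `over_sample`, `under_sample`, `balanced_accuracy`) are hypothetical, labels are assumed to be coded 1 for cases and 0 for controls with both classes present, and, because the abstract is truncated before it describes the third strategy, balanced accuracy is assumed here to mean the usual arithmetic mean of sensitivity and specificity.

```python
import random


def split_by_class(labels):
    """Return (minority_indices, majority_indices) for 0/1 labels."""
    idx = {0: [], 1: []}
    for i, y in enumerate(labels):
        idx[y].append(i)
    minority, majority = sorted(idx.values(), key=len)
    return minority, majority


def over_sample(labels, rng):
    """Resample the smaller class with replacement until the classes are balanced."""
    minority, majority = split_by_class(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return sorted(minority + majority + extra)


def under_sample(labels, rng):
    """Randomly drop subjects from the larger class until the classes are balanced."""
    minority, majority = split_by_class(labels)
    kept = rng.sample(majority, len(minority))
    return sorted(minority + kept)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (assumed definition, see lead-in)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    labels = [1] * 2 + [0] * 8          # 2 cases, 8 controls (toy data)
    rng = random.Random(0)
    print(len(over_sample(labels, rng)))   # indices of a balanced, larger sample
    print(len(under_sample(labels, rng)))  # indices of a balanced, smaller sample
    # A classifier that always predicts "control" on this toy data:
    print(balanced_accuracy(labels, [0] * 10))
```

On the toy data, a model that labels every subject a control is right for 8 of 10 subjects, yet its sensitivity is 0 and its specificity is 1, so its balanced accuracy is only 0.5. That gap is the kind of majority-class bias the abstract attributes to using plain accuracy as the fitness measure.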
Total citations
[Bar chart: citations per year, 2007–2024]