Authors
Mousumi Banerjee, Ying Ding, Anne-Michelle Noone
Publication date
2012/7/10
Journal
Statistics in Medicine
Volume
31
Issue
15
Pages
1601-1616
Publisher
John Wiley & Sons, Ltd
Description
Tree-based methods have become popular for analyzing complex data structures where the primary goal is risk stratification of patients. Ensemble techniques improve prediction accuracy and address the instability of a single tree by growing an ensemble of trees and aggregating them. However, in the process, the individual trees get lost. In this paper, we propose a methodology for identifying the most representative trees in an ensemble on the basis of several tree distance metrics. Although our focus is on binary outcomes, the methods are applicable to censored data as well. For any two trees, the distance metrics are chosen to (1) measure similarity of the covariates used to split the trees; (2) reflect similar clustering of patients in the terminal nodes of the trees; and (3) measure similarity in predictions from the two trees. Whereas the last focuses on prediction, the first two metrics focus on the architectural similarity …
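The three kinds of distance listed in the description can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so every definition below is an assumption made for illustration: covariate disagreement is taken as the share of covariates used by exactly one tree, clustering disagreement as the share of patient pairs that one tree places in the same terminal node and the other does not, and prediction disagreement as the mean squared difference in predicted probabilities. The function names, and the rule of picking the tree with the smallest total distance to the others, are likewise hypothetical.

```python
from itertools import combinations


def covariate_distance(vars_a, vars_b, n_covariates):
    """Share of covariates used for splitting by exactly one of the two trees.

    Assumed definition for illustration; vars_a and vars_b are sets of
    covariate indices, n_covariates is the total number of covariates.
    """
    return len(set(vars_a) ^ set(vars_b)) / n_covariates


def clustering_distance(leaves_a, leaves_b):
    """Share of patient pairs on which the two trees disagree about
    whether the pair falls in the same terminal node.

    leaves_a[i] and leaves_b[i] are the terminal-node labels of patient i.
    """
    pairs = list(combinations(range(len(leaves_a)), 2))
    disagreements = sum(
        (leaves_a[i] == leaves_a[j]) != (leaves_b[i] == leaves_b[j])
        for i, j in pairs
    )
    return disagreements / len(pairs)


def prediction_distance(pred_a, pred_b):
    """Mean squared difference between the two trees' predicted probabilities."""
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)


def most_representative(dist_matrix):
    """Index of the tree with the smallest total distance to all other trees.

    dist_matrix is a square list of lists of pairwise tree distances.
    """
    totals = [sum(row) for row in dist_matrix]
    return totals.index(min(totals))
```

With a fitted ensemble, the inputs would come from each tree's split variables, terminal-node assignments, and predicted probabilities on a common set of patients; the pairwise distances under any one of the three definitions then fill `dist_matrix`.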
Total citations
[Chart: citations per year, 2012 to 2024]