A combined M5P tree and hazard-based duration model for predicting urban freeway traffic accident durations
Abstract
The duration of freeway traffic accidents is an important factor that affects traffic congestion, environmental pollution, and secondary accidents. In previous studies, the M5P algorithm has been shown to be an effective tool for predicting incident duration. M5P builds a tree-based model, like the traditional classification and regression tree (CART) method, but with multiple linear regression models as its leaves. The problem with M5P for accident duration prediction, however, is that whereas linear regression assumes that the conditional distribution of accident durations is normal, the distribution of a “time-to-event” variable is almost certainly asymmetric. A hazard-based duration model (HBDM) is a better choice for this kind of “time-to-event” modeling scenario, and HBDMs have accordingly been applied to analyze and predict traffic accident duration. Previous research, however, has not yet applied HBDMs for accident duration prediction in combination with clustering or classification of the dataset to minimize data heterogeneity. The current paper proposes a novel approach for accident duration prediction that improves on the original M5P tree algorithm by constructing an M5P-HBDM model, in which the leaves of the M5P tree are HBDMs instead of linear regression models. Such a model offers the advantage of minimizing data heterogeneity through dataset classification, and avoids the incorrect assumption of normality for traffic accident durations. The proposed model was tested on two freeway accident datasets. For each dataset, the first 500 records were used to train the following three models: (1) an M5P tree; (2) an HBDM; and (3) the proposed M5P-HBDM, and the remainder of the data was used for testing. The results show that the proposed M5P-HBDM identified more significant and meaningful variables than either the M5P tree or the HBDM alone. Moreover, the M5P-HBDM had the lowest overall mean absolute percentage error (MAPE).
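As a rough illustration of the evaluation described in the abstract, the following Python sketch splits a dataset into its first 500 records for training and the remainder for testing, and computes MAPE using its standard definition. The abstract does not give the formula or any implementation details, so the function names (`split_first_n`, `mape`) and the example durations are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def split_first_n(records: Sequence, n_train: int = 500) -> Tuple[List, List]:
    """Use the first n_train records for training and the rest for testing.

    Assumption: records are already in the order used by the study.
    """
    return list(records[:n_train]), list(records[n_train:])


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (standard definition).

    MAPE = (100 / n) * sum(|actual_i - predicted_i| / |actual_i|)
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical accident durations in minutes, for illustration only.
    observed = [10.0, 20.0, 40.0]
    forecast = [12.0, 18.0, 30.0]
    print(f"MAPE: {mape(observed, forecast):.2f}%")
```

Because MAPE divides by the observed duration, it is undefined for zero-length durations; the sketch raises an error in that case rather than guessing how such records were handled.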
Elsevier