Using OVA modeling to improve classification performance for large datasets
Expert Systems with Applications
Department of Computer Science, University of Pretoria, South Africa
One-Versus-All (OVA) classification is a classifier construction method in which a k-class prediction task is decomposed into k two-class sub-problems. One base model is constructed for each sub-problem, and the base models are then combined into a single model. Aggregate modeling is the process of constructing several base models and combining them into a single model for prediction, so OVA classification is, in essence, a form of aggregate modeling. This paper reports studies conducted to establish whether OVA classification can provide predictive performance gains when large volumes of data are available for modeling, as is commonly the case in data mining. Four results are demonstrated. First, OVA modeling can increase the amount of training data used while keeping each base model's training set much smaller than the total amount of available training data. Second, OVA models created from large datasets provide a higher level of predictive performance than single k-class models. Third, boosted OVA base models can provide higher predictive performance than un-boosted OVA base models. Fourth, when the algorithm that combines base model predictions is able to resolve tied predictions, the resulting aggregate models provide a higher level of predictive performance. © 2011 Elsevier Ltd. All rights reserved.
Aggregate model; Base models; Boosting; Classification performance; Construction method; Data sets; Ensemble classification; Large datasets; Prediction tasks; Predictive performance; ROC analysis; Sub-problems; Training data; Forecasting; Classification (of information)
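
The OVA decomposition described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the centroid-based base learner, the function names (`train_binary_centroid`, `score`, `train_ova`, `predict_ova`), and the toy data are all hypothetical. The sketch combines base models by choosing the class with the largest real-valued score, which is one possible way to avoid tied predictions and is not necessarily the combination algorithm the authors used.

```python
def train_binary_centroid(X, y_binary):
    """Hypothetical base learner: store the centroids of the positive
    and negative examples of one 2-class sub-problem."""
    def centroid(rows):
        n = len(rows)
        return [sum(col) / n for col in zip(*rows)]

    pos = [x for x, t in zip(X, y_binary) if t == 1]
    neg = [x for x, t in zip(X, y_binary) if t == 0]
    return centroid(pos), centroid(neg)


def score(model, x):
    """Real-valued confidence that x is in the positive class
    (larger means closer to the positive centroid than the negative one)."""
    pos_c, neg_c = model
    d_pos = sum((a - b) ** 2 for a, b in zip(x, pos_c))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, neg_c))
    return d_neg - d_pos


def train_ova(X, y):
    """Decompose a k-class task into k 2-class sub-problems,
    training one base model per class (class c versus all others)."""
    classes = sorted(set(y))
    return {
        c: train_binary_centroid(X, [1 if t == c else 0 for t in y])
        for c in classes
    }


def predict_ova(models, x):
    """Combine the base models: pick the class whose base model
    gives the highest score (an assumed combination rule)."""
    return max(models, key=lambda c: score(models[c], x))


# Toy 3-class data set (made up for illustration).
X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y = ["a", "a", "b", "b", "c", "c"]

models = train_ova(X, y)
predictions = [predict_ova(models, p) for p in ([0.2, 0.5], [5, 5.2], [9.5, 0.5])]
```

With k = 3 classes, `train_ova` builds three base models, one per class. Because each sub-problem is trained independently, each base model could in principle be fitted on its own smaller sample of a large dataset, which is the setting the abstract describes.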