African Evaluation Database

2513

Source

Scopus

Electronic ID

2-s2.0-82255179058

Authors

Lutu P.E.N., Engelbrecht A.P.

Title

Using OVA modeling to improve classification performance for large datasets

Publication Year

2012

Source Title

Expert Systems with Applications

Volume

Issue

Citations

DOI

10.1016/j.eswa.2011.09.156

URL

https://www.scopus.com/inward/record.uri?eid=2-s2.0-82255179058&partnerID=40&md5=4e7f739c12268d50cf8bdb8dfc2b0f1b

Affiliations

Department of Computer Science, University of Pretoria, South Africa

Authors with Affiliations

Lutu, P.E.N., Department of Computer Science, University of Pretoria, South Africa; Engelbrecht, A.P., Department of Computer Science, University of Pretoria, South Africa

Abstract

One-Versus-All (OVA) classification is a classifier construction method where a k-class prediction task is decomposed into k 2-class sub-problems. One base model is constructed for each sub-problem and the base models are then combined into one model. Aggregate model implementation is the process of constructing several base models which are then combined into a single model for prediction. In essence, OVA classification is a method of aggregate modeling. This paper reports studies that were conducted to establish whether OVA classification can provide predictive performance gains when large volumes of data are available for modeling as is commonly the case in data mining. It is demonstrated in this paper that firstly, OVA modeling can be used to increase the amount of training data while at the same time using base model training sets whose size is much smaller than the total amount of available training data. Secondly, OVA models created from large datasets provide a higher level of predictive performance compared to single k-class models. Thirdly, the use of boosted OVA base models can provide higher predictive performance compared to un-boosted OVA base models. Fourthly, when the combination algorithm for base model predictions is able to resolve tied predictions, the resulting aggregate models provide a higher level of predictive performance. © 2011 Elsevier Ltd. All rights reserved.
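
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid base learner, the function names (`train_binary`, `train_ova`, `predict_ova`), and the tie-resolution rule (fall back to the highest base-model score when zero or several base models claim an instance) are all assumptions chosen for brevity. The paper's actual base models, boosting procedure, and combination algorithm are not specified in this record.

```python
from math import dist


def centroid(points):
    """Mean point of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def train_binary(X, labels):
    """Hypothetical base learner: a nearest-centroid scorer for one 2-class sub-problem.

    Returns a function whose output is positive when x lies closer to the
    positive-class centroid than to the negative-class centroid.
    """
    pos = centroid([x for x, l in zip(X, labels) if l == 1])
    neg = centroid([x for x, l in zip(X, labels) if l == 0])

    def score(x):
        return dist(x, neg) - dist(x, pos)

    return score


def train_ova(X, y):
    """Decompose a k-class task into k 2-class sub-problems, one base model per class."""
    return {
        c: train_binary(X, [1 if label == c else 0 for label in y])
        for c in sorted(set(y))
    }


def predict_ova(models, x):
    """Combine base-model outputs into a single prediction.

    If exactly one base model claims x, that class wins. Otherwise the
    prediction is tied (no claim or several claims), and the tie is resolved
    by taking the highest score among the candidates (an assumed rule).
    """
    scores = {c: model(x) for c, model in models.items()}
    claimed = [c for c, s in scores.items() if s > 0]
    if len(claimed) == 1:
        return claimed[0]
    candidates = claimed or list(scores)
    return max(candidates, key=scores.get)


# Toy data: three well-separated clusters in two dimensions.
X = [(0, 0), (0, 1), (1, 0), (5, 0), (5, 1), (6, 0), (0, 5), (1, 5), (0, 6)]
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

models = train_ova(X, y)
predictions = [predict_ova(models, p) for p in [(0.2, 0.3), (5.5, 0.2), (0.5, 5.5)]]
```

In this sketch every base model is trained on the full toy dataset. The abstract's point about base-model training sets much smaller than the total available data would correspond to sampling a separate subset for each sub-problem before calling `train_binary`.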

Author Keywords

Boosting; Dataset partitioning; Dataset sampling; Dataset selection; Ensemble classification; Model aggregation; OVA classification; ROC analysis

Index Keywords

Aggregate model; Base models; Boosting; Classification performance; Construction method; Data sets; Ensemble classification; Large datasets; Prediction tasks; Predictive performance; ROC analysis; Sub-problems; Training data; Forecasting; Classification (of information)

Funding Details

None