TY - JOUR
T1 - DCSVM
T2 - fast multi-class classification using support vector machines
AU - Don, Duleep Rathgamage
AU - Iacob, Ionut E.
N1 - Publisher Copyright:
© 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2020/2/1
Y1 - 2020/2/1
N2 - Using binary classification techniques to perform multi-class classification of data is still of great practical interest due to the robustness and simplicity of binary classifiers. These techniques produce a single multi-class classification decision based on many binary decisions. Our work relies on the simple observation that as dimensionality increases so does the data sparsity and, consequently, a single binary classifier may separate multiple classes. Therefore, we claim that the number of binary decisions can be significantly reduced. We present Divide and Conquer Support Vector Machines (DCSVM), an efficient algorithm for multi-class classification using Support Vector Machines. DCSVM is a divide and conquer algorithm which relies on data sparsity in high dimensional space and performs a smart partitioning of the whole training data set into disjoint subsets that are easily separable. A single prediction performed between two partitions eliminates at once one or more classes in one partition, leaving only a reduced number of candidate classes for subsequent steps. The algorithm continues recursively, reducing the number of classes at each step, until a final binary decision is made between the last two classes left in the competition. In the best case scenario, our algorithm makes a final decision between k classes in O(log k) decision steps and in the worst case scenario DCSVM makes a final decision in k- 1 steps, which is not worse than the existent techniques.
AB - Using binary classification techniques to perform multi-class classification of data is still of great practical interest due to the robustness and simplicity of binary classifiers. These techniques produce a single multi-class classification decision based on many binary decisions. Our work relies on the simple observation that as dimensionality increases so does the data sparsity and, consequently, a single binary classifier may separate multiple classes. Therefore, we claim that the number of binary decisions can be significantly reduced. We present Divide and Conquer Support Vector Machines (DCSVM), an efficient algorithm for multi-class classification using Support Vector Machines. DCSVM is a divide and conquer algorithm which relies on data sparsity in high dimensional space and performs a smart partitioning of the whole training data set into disjoint subsets that are easily separable. A single prediction performed between two partitions eliminates at once one or more classes in one partition, leaving only a reduced number of candidate classes for subsequent steps. The algorithm continues recursively, reducing the number of classes at each step, until a final binary decision is made between the last two classes left in the competition. In the best case scenario, our algorithm makes a final decision between k classes in O(log k) decision steps and in the worst case scenario DCSVM makes a final decision in k- 1 steps, which is not worse than the existent techniques.
KW - Divide and conquer
KW - Multiclass classification
KW - SVM
UR - http://www.scopus.com/inward/record.url?scp=85069505445&partnerID=8YFLogxK
U2 - 10.1007/s13042-019-00984-9
DO - 10.1007/s13042-019-00984-9
M3 - Article
AN - SCOPUS:85069505445
SN - 1868-8071
VL - 11
SP - 433
EP - 447
JO - International Journal of Machine Learning and Cybernetics
JF - International Journal of Machine Learning and Cybernetics
IS - 2
ER -