Column generation based heuristic for learning classification trees

Murat Firat, Guillaume Crognier, Adriana F. Gabor, C. A.J. Hurkens, Yingqian Zhang

Research output: Contribution to journalArticlepeer-review

21 Scopus citations

Abstract

This paper explores the use of Column Generation (CG) techniques in constructing univariate binary decision trees for classification tasks. We propose a novel Integer Linear Programming (ILP) formulation, based on root-to-leaf paths in decision trees. The model is solved via a Column Generation based heuristic. To speed up the heuristic, we use a restricted instance data by considering a subset of decision splits, sampled from the solutions of the well-known CART algorithm. Extensive numerical experiments show that our approach is competitive with the state-of-the-art ILP-based algorithms. In particular, the proposed approach is capable of handling big data sets with tens of thousands of data rows. Moreover, for large data sets, it finds solutions competitive to CART.

Original languageBritish English
Article number104866
JournalComputers and Operations Research
Volume116
DOIs
StatePublished - Apr 2020

Keywords

  • CART
  • Classification
  • Column generation
  • Decision trees
  • Integer linear programming
  • Machine learning

Fingerprint

Dive into the research topics of 'Column generation based heuristic for learning classification trees'. Together they form a unique fingerprint.

Cite this