Journal of Artificial Intelligence Research 2 (1995) 369-409
Submitted 10/94; published 3/95
(c) 1995 National Research Council Canada. All rights
reserved. Published by permission.
Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic
Decision Tree Induction Algorithm
Peter D. Turney
(peter@ai.iit.nrc.ca)
Knowledge Systems Laboratory
Institute for Information Technology
National Research Council Canada
Ottawa, Ontario, Canada, K1A 0R6
This paper introduces ICET, a new algorithm for cost-sensitive
classification. ICET uses a genetic algorithm to evolve a population of biases
for a decision tree induction algorithm. The fitness function of the genetic
algorithm is the average cost of classification when using the decision tree,
including both the costs of tests (features, measurements) and the costs of
classification errors. ICET is compared here with three other algorithms for
cost-sensitive classification -- EG2, CS-ID3, and IDX -- and also with C4.5,
which classifies without regard to cost. The five algorithms are evaluated
empirically on five real-world medical datasets. Three sets of experiments are
performed. The first set examines the baseline performance of the five
algorithms on the five datasets and establishes that ICET performs significantly
better than its competitors. The second set tests the robustness of ICET under a
variety of conditions and shows that ICET maintains its advantage. The third set
looks at ICET's search in bias space and discovers a way to improve the search.
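
The fitness measure described above, the average cost of classification including both test costs and error costs, can be illustrated with a small sketch. This is only one simple reading of that sentence, not the implementation evaluated in the paper. The names average_cost, toy_tree, test_cost, and error_cost, the toy data, and the rule that each distinct test is charged once per case are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Case = Dict[str, float]
# Hypothetical classifier interface: returns (predicted_label, tests_performed).
Classifier = Callable[[Case], Tuple[str, List[str]]]


def average_cost(
    classify: Classifier,
    cases: Sequence[Case],
    labels: Sequence[str],
    test_cost: Dict[str, float],
    error_cost: Dict[Tuple[str, str], float],
) -> float:
    """Mean per-case cost: tests performed plus any misclassification penalty."""
    total = 0.0
    for case, actual in zip(cases, labels):
        predicted, tests_used = classify(case)
        # Assumption: each distinct test is paid for once per case.
        total += sum(test_cost[t] for t in set(tests_used))
        # Assumption: error_cost is keyed by (actual, predicted); correct answers cost 0.
        if predicted != actual:
            total += error_cost[(actual, predicted)]
    return total / len(cases)


def toy_tree(case: Case) -> Tuple[str, List[str]]:
    """A hand-written two-level tree standing in for an induced decision tree."""
    if case["cheap_test"] > 0.5:
        return "sick", ["cheap_test"]
    if case["costly_test"] > 0.5:
        return "sick", ["cheap_test", "costly_test"]
    return "healthy", ["cheap_test", "costly_test"]


# Made-up data, for illustration only.
cases = [
    {"cheap_test": 0.9, "costly_test": 0.1},
    {"cheap_test": 0.2, "costly_test": 0.8},
    {"cheap_test": 0.1, "costly_test": 0.3},
    {"cheap_test": 0.7, "costly_test": 0.2},
]
labels = ["sick", "sick", "sick", "healthy"]
test_cost = {"cheap_test": 1.0, "costly_test": 10.0}
error_cost = {("sick", "healthy"): 50.0, ("healthy", "sick"): 20.0}

fitness = average_cost(toy_tree, cases, labels, test_cost, error_cost)
print(fitness)  # 23.5
```

In a genetic search of the kind the abstract describes, a value like this would be computed for the tree induced under each candidate bias, and lower average cost would count as higher fitness.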