Open vs. Close Source Decision Tree Algorithms: Comparing Performance Measures of Accuracy, Sensitivity and Specificity

Sushmita Khan, Hayden Wimmer, Loreen Marie Powell

Research output: Contribution to book or proceeding > Chapter

Abstract

Data science research is trending due to the abundance of publicly available data and the availability of open source and closed source (proprietary) tools. An abundant amount of research currently exists on data science techniques, tools, and the mining of medical data and big data. However, there is little to no research that directly compares closed and open source algorithms. This research compared the performance of a closed source algorithm (Microsoft Decision Tree) with that of open source algorithms (CART and C4.5) on accuracy, sensitivity, and specificity, using data from the U.S. government’s Surveillance, Epidemiology, and End Results (SEER) Program. Data were downloaded, converted from raw to structured form using a custom-designed Python script, and transformed by removing missing data, irrelevant data, and outliers. Predictive modeling results indicated that the closed source algorithm had the best accuracy and specificity.
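For readers unfamiliar with the three performance measures named in the abstract, the following is a minimal sketch of how they are conventionally derived from a binary confusion matrix. It is not the authors' code: the function names (`confusion_counts`, `performance_measures`) and the sample labels are hypothetical and chosen only for illustration, and the sketch assumes a two-class prediction task with one class treated as "positive".

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def performance_measures(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Share of actual positives that were detected (true positive rate).
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of actual negatives that were rejected (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
measures = performance_measures(y_true, y_pred)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Comparing several classifiers in this way amounts to running each model's predictions on the same held-out cases through `performance_measures` and comparing the resulting values side by side.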

Original language: American English
Title of host publication: Proceedings of the CONISAR
State: Published - Jan 1 2017

Disciplines

  • Computer Sciences

Keywords

  • Accuracy
  • Close
  • Comparing
  • Open
  • Performance measures
  • Sensitivity
  • Source decision tree algorithms
  • Specificity
