APPLICATION OF DATA MINING TO EXPLORE THE PATTERN OF TUBERCULOSIS: THE CASE OF DEBIRE BIRHAN HOSPITAL, NORTH SHOA, ETHIOPIA

YILMA, MENGISTU

URI: http://hdl.handle.net/123456789/848

Date: 2012-06-30

Abstract:

Background: Tuberculosis is the leading cause of mortality among infectious diseases worldwide. Evaluation of treatment outcome is used as a major indicator of the quality of programs run by health institutions. Because data mining can extract interesting, useful and task-oriented knowledge from large amounts of data, this study applied data mining to explore the pattern of tuberculosis and to develop a predictive model of treatment outcome.

Objective: To explore patterns in tuberculosis data and develop a predictive model using data mining technology.

Methods: The open source data mining tool WEKA was used in this study. The study followed the standard data mining procedure known as the Cross Industry Standard Process for Data Mining (CRISP-DM). A total of 4780 patient records were taken from the registration book of tuberculosis patients registered for treatment in Debire Birhan Hospital from October 2001 to June 2011.

Result: Of the 4780 registered patients, 1320 (27.6%) were tested for HIV, and 468 (35.6%) of those tested were reactive for HIV. Among pulmonary smear-positive tuberculosis cases, 668 (51.5%) had a sputum follow-up test at the 7th month. The outcomes were cured 649 (50%), completed 1813 (37.9%), died 370 (7.7%), failed 4 (0.3%), defaulted 458 (9.58%) and transferred out 1486 (31.1%). The multilayer perceptron achieved the highest accuracy, 85.8%. All attributes used in this study were treated as predictor attributes for exploring the pattern.

Conclusion and recommendation: All algorithms tested in this study showed promising results. The 7th-month sputum test result for smear-positive patients was the most decisive predictor attribute for the cured and failed classes. The multilayer perceptron (MLP) was the best algorithm for classifying and predicting tuberculosis data. Together, the died, defaulted and failed classes accounted for 17.4% of outcomes, which is a serious public health concern. Further research is recommended on larger-scale data and with additional attributes such as the signs and symptoms of the patients.
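The outcome figures above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original study; it takes the six outcome counts from the abstract and assumes all registered patients as the denominator throughout. The names `outcomes`, `share` and `unfavourable_pct` are illustrative. Note that the 50% and 0.3% reported for the cured and failed outcomes appear to use a different denominator, which the abstract does not state, so the sketch will not reproduce those two figures.

```python
# Outcome counts as reported in the abstract.
outcomes = {
    "cured": 649,
    "completed": 1813,
    "died": 370,
    "failed": 4,
    "defaulted": 458,
    "transferred out": 1486,
}

# The six outcomes should add up to the number of registered patients.
total = sum(outcomes.values())


def share(count, denominator):
    """Return count as a percentage of denominator, rounded to one decimal."""
    return round(100 * count / denominator, 1)


# Unfavourable outcomes as grouped in the conclusion: died, defaulted, failed.
unfavourable = outcomes["died"] + outcomes["defaulted"] + outcomes["failed"]
unfavourable_pct = share(unfavourable, total)

for name, count in outcomes.items():
    # Assumption: every share is taken over all registered patients.
    print(f"{name:16s}{count:6d}{share(count, total):7.1f}%")
print(f"{'unfavourable':16s}{unfavourable:6d}{unfavourable_pct:7.1f}%")
```

Under this assumption the six counts sum to the 4780 registered patients, and the combined died, defaulted and failed share matches the 17.4% given in the conclusion.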
