dc.contributor.author |
Dereje, Tizita |
|
dc.date.accessioned |
2024-02-12T07:43:57Z |
|
dc.date.available |
2024-02-12T07:43:57Z |
|
dc.date.issued |
2024-02-12 |
|
dc.identifier.uri |
http://hdl.handle.net/123456789/7190 |
|
dc.description.abstract |
Embedded random forest is used as the feature selection technique. Binning is used to discretize numerical variables into fewer categorical counterparts. Different experiments are then carried out using homogeneous ensemble methods of machine learning algorithms. Preprocessing and model building are done in the Python 3.7.4 programming language with the Anaconda distribution. The accuracy of the decision tree and naïve Bayes classifiers is 91.79% and 85.25%, respectively. Ensemble classification yields a maximum accuracy increase of 4% for the weak classifiers. The decision tree achieves an accuracy of 94.34% with bagging and 94.79% with AdaBoost, with an area under the ROC curve of 86% and 87%, respectively. Naïve Bayes achieves 87.60% with bagging and 89.5% with AdaBoost. The best-performing classifier is the decision tree with the AdaBoost ensemble method, which has the highest F-measure and recall (true positive rate), at 97.19% and 99.92%, respectively. |
en_US |
dc.description.sponsorship |
uog |
en_US |
dc.language.iso |
en_US |
en_US |
dc.subject |
Embedded random forest is used as the feature selection technique. Binning is used to discretize numerical variables into fewer categorical counterparts. Different experiments are then carried out using homogeneous ensemble methods of machine learning algorithms. Preprocessing and model building are done in the Python 3.7.4 programming language with the Anaconda distribution. The accuracy of the decision tree and naïve Bayes classifiers is 91.79% and 85.25%, respectively. Ensemble classification yields a maximum accuracy increase of 4% for the weak classifiers. The decision tree achieves an accuracy of 94.34% with bagging and 94.79% with AdaBoost, with an area under the ROC curve of 86% and 87%, respectively. Naïve Bayes achieves 87.60% with bagging and 89.5% with AdaBoost. The best-performing classifier is the decision tree with the AdaBoost ensemble method, which has the highest F-measure and recall (true positive rate), at 97.19% and 99.92%, respectively. |
en_US |
dc.title |
Predict Neonatal and Infant Mortality Based on Maternal Determinants using Homogenous Ensemble Methods |
en_US |
dc.type |
Thesis |
en_US |