Abstract:
Chronic non-communicable diseases are becoming more prevalent every day, especially in
developing countries such as Ethiopia. These disorders must be treated as soon as possible. Machine
learning-based models can be used to predict chronic diseases before medical diagnosis. Numerous
prediction models have been developed for single diseases. However, they overlook the many
people who suffer from multiple related diseases that have comparable effects due to
physiological irregularities. So far, there is a paucity of information regarding multi-label
predictive models for hypertension and diabetes. Therefore, this study aimed to design and develop
a multi-label predictive machine learning model and application that predict diabetes and
hypertension simultaneously. The study used a physical examination dataset from the Ethiopian
Public Health Institute, split into 70% for training and 30% for testing. The linked diseases
were predicted using common risk variables and label correlation. Multi-label feature selection
was performed using a chi-square test, a correlation-based method, a combination of Fisher's score
and the chi-square score, and the featurewiz Python library. The multi-label synthetic minority
oversampling technique was used to handle the class imbalance problem. Moreover, two problem
transformation approaches, binary relevance and classifier chain, were integrated with five common machin