Hepatitis C Virus (HCV) Prediction by Machine Learning Techniques

Satish CR Nandipati, Chew XinYing, Khaw Khai Wah


Hepatitis C is a prevalent disease worldwide, especially in countries such as Egypt. An estimated 3-4 million new cases occur every year, which makes it a public health problem that should be addressed through identification and treatment policies. The infection is asymptomatic in its initial stage; as it progresses, however, it leads to chronic conditions such as liver cirrhosis and hepatocellular carcinoma. Various non-invasive serum biochemical markers are used to identify the disease. Using a dataset of Egyptian patients, this study compares the performance of multiclass and binary class labels on the same dataset, compares tools (Python and R), and examines which selected features play a key role in the prediction of Hepatitis C Virus (HCV). The highest accuracy is achieved by KNN (51.06%, R) for the multiclass label and by random forest (54.56%, Python) for the binary class label. The overall comparison of evaluation metrics shows R to be the better tool for this case. The binary class label yields better performance scores than the multiclass label. The multiple feature selection methods did not produce any similar arrangement/topology in the ranking order of the selected features. Finally, the 12 features selected by principal component analysis perform similarly to the complete dataset and to the 21 selected features, suggesting that these features may play a role in prediction on the HCV dataset.
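The comparison between multiclass and binary class labels described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the k-nearest-neighbours classifier is a small pure-Python stand-in for the library implementations used in the study, and the grouping of four stages into two classes in `to_binary` is a hypothetical choice made only for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training points closest to x."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=5):
    """Fraction of test points whose predicted label matches the true one."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y
        for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def to_binary(stage):
    # Hypothetical grouping (an assumption, not taken from the paper):
    # stages 1-2 become class 0, stages 3-4 become class 1.
    return 0 if stage <= 2 else 1


def make_synthetic(n, n_features=4, seed=0):
    """Synthetic stand-in data: noisy features loosely tied to a stage 1-4."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        stage = rng.randint(1, 4)
        X.append([rng.gauss(stage, 1.5) for _ in range(n_features)])
        y.append(stage)
    return X, y


X, y = make_synthetic(300)
train_X, test_X = X[:200], X[200:]
train_y, test_y = y[:200], y[200:]

# Same features, two labelings of the same records.
multi_acc = accuracy(train_X, train_y, test_X, test_y)
binary_acc = accuracy(
    train_X, [to_binary(s) for s in train_y],
    test_X, [to_binary(s) for s in test_y],
)

print(f"multiclass accuracy: {multi_acc:.3f}")
print(f"binary accuracy:     {binary_acc:.3f}")
```

Because a binary problem has fewer classes to confuse, its accuracy is often higher than the multiclass accuracy on the same records, so the two scores are not directly comparable without reference to their respective chance levels.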


Classification; Feature selection; Hepatitis C virus; Machine learning; Prediction multi and binary class labels; Python and R tools





