Machine Learning-Based Classification and Statistical Analysis of Liver Cancer: A Comprehensive Study of Model Performance and Clinical Significance

Pratyush Kumar MAHARANA; Tapan Kumar BEHERA; Pradeep Kumar NAIK

Authors

Pratyush Kumar MAHARANA, Department of Biotechnology and Bioinformatics, Sambalpur University, Jyoti Vihar, Burla, Sambalpur-768019, Odisha, India
Tapan Kumar BEHERA, Department of Biotechnology and Bioinformatics, Sambalpur University, Jyoti Vihar, Burla, Sambalpur-768019, Odisha, India; ORCID: https://orcid.org/0009-0001-9601-3898
Pradeep Kumar NAIK, Department of Biotechnology and Bioinformatics, Sambalpur University, Jyoti Vihar, Burla, Sambalpur-768019, Odisha, India

Keywords:

Hepatocellular Carcinoma (HCC), Machine Learning, Multilayer Perceptron, Receiver Operating Characteristic (ROC) curve

Abstract

Background: The liver is an internal organ located in the upper right section of the abdomen, just beneath the diaphragm and near the stomach. It performs numerous functions essential for metabolism, digestion, detoxification, and nutrient storage. Several types of liver cancer are known. The most common is hepatocellular carcinoma (HCC), which arises from hepatocytes, the main type of liver cell. A less common type is cholangiocarcinoma, which originates in the bile ducts of the liver. This study aimed to evaluate and compare machine-learning-based models for the early detection of liver cancer in order to improve diagnostic accuracy. Method: The following models were compared: SVM, decision tree, random forest, logistic regression, K-neighbor, Gaussian NB, AdaBoost classifier, MLP classifier, passive aggressive, ridge classifier, extra tree, bagging classifier, extra trees, gradient boosting, SGD classifier, linear SVC, voting classifier, and stacking classifier. Five performance metrics (accuracy, precision, recall, F1 score, and Cohen’s kappa) were used to evaluate them. Results: The dataset comprised 12 instances. Across all the models tested, the extra tree classifier was selected for the early detection of liver cancer; it achieved an accuracy of 85.8%, a precision of 75.5%, a high recall of 92.2%, and an F1 score of 83.2%. These metrics suggest that the extra tree classifier has significant potential to improve early detection strategies for liver cancer and that it warrants further investigation. Conclusion: Of the 17 models evaluated, the extra tree classifier is the most suitable machine learning algorithm for liver cancer prediction.
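For readers unfamiliar with the five metrics named in the abstract, the following minimal sketch shows how each can be computed from a binary confusion matrix. It is an illustration only and is not the authors' code: the `Confusion` class, the `metrics` and `f1_from` helpers, and the counts in `example` are hypothetical and do not come from the study. The final line recomputes F1 as the harmonic mean of the reported precision (75.5%) and recall (92.2%), which gives roughly 0.83, close to the reported 83.2%.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (hypothetical, not from the study)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def metrics(c: Confusion) -> dict:
    """Compute accuracy, precision, recall, F1, and Cohen's kappa.

    Assumes every denominator is non-zero (no degenerate confusion matrix).
    """
    n = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / n
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    f1 = 2 * precision * recall / (precision + recall)

    # Cohen's kappa compares observed agreement (accuracy) with the
    # agreement expected by chance from the marginal frequencies.
    p_pos = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_neg = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


def f1_from(precision: float, recall: float) -> float:
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen only to exercise the formulas.
example = metrics(Confusion(tp=40, fp=10, fn=5, tn=45))

# F1 recomputed from the precision and recall reported in the abstract.
reported_f1 = f1_from(0.755, 0.922)

if __name__ == "__main__":
    for name, value in example.items():
        print(f"{name}: {value:.3f}")
    print(f"F1 from reported precision/recall: {reported_f1:.3f}")
```

Reporting Cohen's kappa alongside accuracy is useful because kappa discounts the agreement a classifier would reach by chance, which matters when the classes are imbalanced.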
