Current Status and Analysis of Machine Learning in Hepatocellular Carcinoma

Sijia Feng, Jianhua Wang, Liheng Wang, Qixuan Qiu, Dongdong Chen, Huo Su, Xiaoli Li, Yao Xiao and Chiayen Lin*
Journal of Clinical and Translational Hepatology   2023;11(5):1184-1191

doi: 10.14218/JCTH.2022.00077S


Hepatocellular carcinoma (HCC) is a common tumor. Although the diagnosis and treatment of HCC have made great progress, the overall prognosis remains poor. As the core component of artificial intelligence, machine learning (ML) has developed rapidly in the past decade. In particular, ML has become widely used in the medical field, and it has helped in the diagnosis and treatment of cancer. Different algorithms of ML have different roles in diagnosis, treatment, and prognosis. This article reviews recent research, explains the application of different ML models in HCC, and provides suggestions for follow-up research.


Keywords: Machine learning, Hepatocellular carcinoma, Artificial intelligence, Prognosis


Hepatocellular carcinoma (HCC) is the most common primary liver malignancy and one of the four most common causes of cancer-related death worldwide.1,2 HCC is the fastest-growing cause of cancer-related death in the USA and may become the third leading cause by 2030.3 In recent years, hepatitis B vaccination and antiviral treatment have been widely adopted,4,5 and many treatments for HCC are available, including surgery, ablation, transcatheter arterial chemoembolization (TACE), chemotherapy, targeted therapy, immunotherapy, and others.3,6 However, the symptoms of HCC are not easy to detect, and the overall prognosis is poor.7 HCC therefore requires early detection, accurate prediction, individualized treatment, and follow-up.

Artificial intelligence (AI) is a discipline that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. Machine learning (ML) is the core of AI. ML enables computers to simulate or realize human learning behavior, acquiring new knowledge or skills and reorganizing existing knowledge to improve their own performance.8 In the past decade, ML has been gradually applied to medical research and has made progress in many areas.9 In particular, ML-based cancer research, including research on lung cancer, breast cancer, and HCC, is increasing. ML studies in HCC cover diagnosis, treatment, prognosis, and other aspects, and use a variety of algorithms, including decision trees, support vector machines (SVMs), random forests, and deep learning.10 The application of ML in HCC can reveal the relationship between AI and HCC and can also be instrumental in the prevention and treatment of HCC. This review focuses on the application of ML to the diagnosis, treatment, and prognosis of HCC.

ML for the diagnosis of HCC

The diagnosis of HCC depends on pathology; for patients with chronic hepatitis B and liver cirrhosis, radiology can also help with diagnosis.3 However, radiologic diagnosis requires typical imaging features,11 and more than 10% of tumors lack the imaging hallmarks of HCC. If the imaging is not typical, a biopsy or a second contrast-enhanced study should be performed.12 Biopsy is an invasive procedure with a sensitivity of about 70%, which is lower for tumors with a diameter <2 cm. It is sometimes difficult to distinguish well-differentiated HCC from dysplastic nodules. A diagnostic model of HCC established through ML can help clinicians diagnose and treat the disease earlier and more easily. Clinical data are convenient to obtain for diagnosis, including albumin, platelet (PLT) count, total bilirubin, alpha-fetoprotein (AFP), alkaline phosphatase (ALP), γ-glutamyl transferase (GGT), aspartate transaminase (AST), portal vein thrombosis, and others. Phan et al.13 established a convolutional neural network (CNN) model to predict the occurrence of HCC in patients infected with hepatitis B virus (HBV), using clinical data from a Taiwanese health database. The area under the curve (AUC) of the model was 0.886 and the accuracy was 0.980. Nam et al.14 constructed a deep neural network to predict the incidence of HCC in patients with HBV-related cirrhosis who received entecavir antiviral treatment. The c-index of the model was 0.782, which was significantly better than those of six traditional scores (PAGE-B, CU-HCC, HCC-RESCUE, ADRESS-HCC, mPAGE-B, and THRI).
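The AUC and the accuracy reported for such models measure different things: the AUC reflects how well the model ranks patients who develop HCC above those who do not, whereas accuracy depends on a single decision threshold and on how common HCC is in the cohort. The following Python sketch illustrates the distinction. It is not the evaluation code of any cited study; the ten-patient cohort, the labels, and the risk scores are hypothetical values chosen for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: 1 for patients who developed HCC, 0 otherwise.
    scores: model-predicted risk, higher meaning HCC is more likely.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at a fixed risk threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical, imbalanced toy cohort: 2 HCC cases among 10 patients.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.20, 0.15, 0.30, 0.25, 0.45, 0.40, 0.35]

print(roc_auc(labels, scores))   # 0.875
print(accuracy(labels, scores))  # 0.8
```

In this toy cohort no patient crosses the 0.5 threshold, so the accuracy of 0.8 comes entirely from the non-HCC majority, while the AUC of 0.875 shows that the scores still rank most HCC cases above most non-HCC cases. This is why the two metrics are best read together when HCC is rare in the study population.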

The accuracy of ML models also varies considerably. Sato et al.15 collected clinical data from patients diagnosed with HCC at the first visit and from HBV-infected patients who developed HCC during follow-up. They built HCC diagnosis prediction models from these data using logistic regression for linear classification and SVM, gradient boosting, random forest, neural network, deep learning, and other algorithms for nonlinear classification. All the models were then verified in the test set, and the gradient boosting model had the highest accuracy. Similarly, Angelis et al.16 built models from clinical data with six algorithms: decision tree, random forest, SVM, k-nearest neighbor (KNN) classification, AdaBoost, and gradient boosting. Gradient boosting again performed best, with an accuracy of 84% and a sensitivity of 92%. Kim et al.17 used a gradient boosting machine (GBM), one of the boosting algorithms, to model the follow-up outcomes of patients with hepatitis B treated with entecavir or tenofovir. The model classified these patients as being at high or low risk of HCC, and it has been externally validated in Western cohorts. However, in a study of HCC prediction models for HBV and hepatitis C virus (HCV) patients in Hong Kong, Wong et al.18 reported that among logistic regression, ridge regression, AdaBoost, decision tree, and random forest models, ridge regression [area under the receiver operating characteristic curve (AUROC): 0.844] and random forest (AUROC: 0.837) performed stably and better than traditional scores (CU-HCC, GAG-HCC, REACH-B, PAGE-B, and REAL-B) (Table 1).19–27
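The comparisons above all follow the same pattern: several candidate models are scored on the same held-out test set, and metrics such as accuracy and sensitivity are read from the confusion matrix. A minimal Python sketch of that pattern is shown here. The test labels, the model names `model_a` and `model_b`, and their predictions are hypothetical placeholders, not data from the cited studies.

```python
def confusion_counts(labels, predictions):
    """Return (true positives, true negatives, false positives, false negatives)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(labels, predictions):
    """Accuracy, sensitivity and specificity for binary predictions.

    Assumes the labels contain at least one positive and one negative case.
    """
    tp, tn, fp, fn = confusion_counts(labels, predictions)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),  # share of HCC cases that are detected
        "specificity": tn / (tn + fp),  # share of non-HCC cases correctly cleared
    }


# Hypothetical held-out test set (1 = HCC) and predictions from two candidate models.
test_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
candidate_predictions = {
    "model_a": [1, 1, 1, 0, 0, 0, 0, 0, 1, 1],
    "model_b": [1, 1, 1, 1, 0, 0, 0, 0, 0, 1],
}

results = {
    name: classification_metrics(test_labels, preds)
    for name, preds in candidate_predictions.items()
}
best_model = max(results, key=lambda name: results[name]["accuracy"])

for name, metrics in results.items():
    print(name, metrics)
print("highest accuracy:", best_model)
```

Selecting by accuracy alone is only one possible design choice. For a screening task, a study might instead prioritize sensitivity, because a missed HCC case is usually more costly than a false alarm.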

Table 1

Details of traditional scores mentioned in this article

| Score | Author and year | Function | Indicators | Research center | Results |
|---|---|---|---|---|---|
| PAGE-B | Papatheodoridis et al. 2016 [19] | Score for prediction of the 5-year HCC risk in Caucasian chronic hepatitis B (CHB) patients under entecavir/tenofovir | Age, sex, and platelets | Multicenter | c-index: 0.82 |
| CU-HCC | Wong et al. 2010 [20] | Clinical score for predicting the risk of HCC among HBV carriers | Age, albumin, bilirubin, HBV DNA, and cirrhosis | Multicenter | Negative predictive value: 97.8% and 97.3% in the training and validation cohorts |
| HCC-RESCUE | Sohn et al. 2017 [21] | Prediction model for the development of HCC in treatment-naïve patients receiving oral antiviral treatment for CHB | Age, sex, and cirrhosis | Multicenter | AUROCs at 1, 3, and 5 years: 0.798, 0.788, and 0.817 in the testing cohort; 0.817, 0.810, and 0.809 in the validation cohort |
| ADRESS-HCC | Flemming et al. 2014 [22] | Risk prediction model to estimate the 1-year probability of HCC | Age, diabetes, race, etiology of cirrhosis, sex, and severity of liver dysfunction | Multicenter | |
| mPAGE-B | Kim et al. 2018 [23] | Modified PAGE-B score to improve predictive performance | Age, sex, platelet counts, and serum albumin levels | Multicenter | c-index: 0.704 and 0.691 in the testing and validation cohorts |
| THRI | Sharma et al. 2017 [24] | Scoring system to predict HCC risk for patients with cirrhosis | Age, sex, etiology, and platelets | Multicenter | AUROC: 0.82 and 0.72 in the testing and validation cohorts |
| GAG-HCC | Yuen et al. 2009 [25] | Score to identify high-risk CHB patients for treatment and screening of HCC | Age, sex, HBV DNA levels, core promoter mutations, and cirrhosis | Multicenter | c-index: 0.77 |
| REACH-B | Yang et al. 2011 [26] | Score to estimate the risk of developing HCC at 3, 5, and 10 years in patients with CHB | Sex, age, serum alanine aminotransferase concentration, HBeAg status, and serum HBV DNA level | Multicenter | AUROC: 0.902, 0.783, and 0.806 |
| REAL-B | Yang et al. 2020 [27] | HCC risk score using routine clinical variables among a treated Asian cohort | Sex, age, alcohol use, diabetes, baseline cirrhosis, platelet count, and alpha-fetoprotein | Multicenter | AUROC: >0.80 |

For high-risk patients with chronic hepatitis B and cirrhosis, the diagnosis can be established by imaging. However, lesions are difficult to identify when the imaging characteristics are not typical. ML is well suited to processing images, so it has advantages in imaging identification. Bharti et al.28 obtained 754 regions of interest (ROI) based on the echotexture and roughness of the liver surface in ultrasound images, and constructed a CNN model to distinguish normal liver, chronic hepatitis, cirrhosis, and HCC. The classification accuracy of the model was 96.6%. Similarly, other studies have suggested that models built on ultrasound imaging features distinguish benign from malignant liver nodules with good accuracy.29,30 Moreover, Brehar et al.31 compared ultrasound-based HCC detection models. They compared a CNN model with the traditional multilayer perceptron, SVM, random forest, and AdaBoost algorithms, and found that the CNN had good accuracy, sensitivity, and specificity and was significantly better than the traditional ML algorithms. Recently, Jin et al.32 established a deep learning model using two-dimensional shear wave elastography and the corresponding ultrasound images to predict the probability that patients with hepatitis B will develop HCC within 5 years. This provides an important reference for the treatment and follow-up of patients with chronic hepatitis B.

HCC and intrahepatic cholangiocarcinoma (ICC) both occur in the liver, but their biological behavior, treatment methods, and prognosis are very different. The overall prognosis of ICC is poor. Most patients present with advanced tumors, and only 15% of patients with ICC undergo resection.33 Even for patients who are candidates for surgical resection, the probability of cure has been reported to be about 10%.34 The resection mode, chemotherapy, and targeted treatment of ICC are very different from those of HCC.35,36 Therefore, it is important to be able to distinguish HCC from ICC in a noninvasive manner. Ren et al.37 established an SVM model by selecting the ROI of the lesion on ultrasound images to differentiate HCC from ICC. The accuracy, specificity, and sensitivity of the model were all above 0.800, and it had good generalization ability.

Enhanced CT is of great significance in the diagnosis of HCC. When the nature of nodules on CT images is uncertain, a good ML model can improve the reliability of diagnosis. The CNN model established by Yasaka et al.38 effectively identified the types of liver masses on enhanced CT and divided them into five categories: category A, classic HCCs; category B, malignant liver tumors other than classic and early HCCs; category C, indeterminate masses or mass-like lesions, including early HCCs and dysplastic nodules, and rare benign liver masses other than hemangiomas and cysts; category D, hemangiomas; and category E, cysts. Mokrane et al.39 extracted quantitative imaging features from CT images and used three ML algorithms (KNN, SVM, and random forest) to build candidate models for diagnosing indeterminate liver nodules in patients with liver cirrhosis. They selected the best model using the AUC and the Youden index. The model helped to assess indeterminate liver nodules in a noninvasive manner. MRI has a similar role. Hamm et al.40 established a CNN model using MRI images that divided liver lesions into six categories: simple cyst, cavernous hemangioma, focal nodular hyperplasia (FNH), HCC, ICC, and colorectal cancer metastasis. The model outperformed the radiologists, with specificity and sensitivity greater than 90%. Liu et al.41 built an SVM model to distinguish combined hepatocellular cholangiocarcinoma, ICC, and HCC using radiological features from MRI and CT. They also found that enhanced-phase MRI and nonenhanced-phase and portal-venous-phase CT were more helpful for differentiation.

With the development of ML, it is also possible to predict the pathological grade of HCC noninvasively from imaging. Mao et al.42 manually extracted radiomics features, selected features using recursive feature elimination, and then established a prediction model of HCC pathological grade with an AUC of 0.8014 using XGBoost. Nebbia et al.43 established an ML model using multiparametric MRI images to predict microvascular infiltration (MVI) status preoperatively. The researchers also compared the effects of extracting features only from the tumor region, only from the peritumoral margin region, and from both combined. The results showed that predicting MVI from preoperative MRI is feasible and that multiparametric MRI sequences are complementary.
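The Youden index mentioned above for model selection is defined as J = sensitivity + specificity - 1, and the threshold that maximizes J is a common choice for turning a continuous model score into a binary call. The Python sketch below shows the calculation under simple assumptions. The labels and scores are hypothetical, and the function name `youden_best_threshold` is introduced here for illustration only; it does not reproduce the procedure of any cited study.

```python
def youden_best_threshold(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its score is >= threshold.
    Assumes the labels contain both positive (1) and negative (0) cases.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_threshold, best_j = None, -1.0
    # Every distinct score is a candidate cut-off.
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # strict comparison keeps the lowest threshold on ties
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical nodule scores from a classifier (1 = malignant nodule).
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

threshold, j = youden_best_threshold(labels, scores)
print(threshold, round(j, 3))  # 0.5 0.8
```

At the selected threshold of 0.5 all three malignant nodules are detected (sensitivity 1.0) and four of five benign nodules are correctly cleared (specificity 0.8), which gives J = 0.8.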

The results of pathological examination depend to a certain extent on the selection of specimens and on the pathologist's judgment. ML can both reduce errors and shorten the time to diagnosis. Lin et al.44 established a CNN model using multiphoton microscopy images of unstained specimens to assess the degree of HCC differentiation. Chen et al.45 established a CNN model using hematoxylin and eosin (HE)-stained pathological images. The accuracy of the model in distinguishing benign from malignant tissue was 96.0%, and it predicted mutated genes in HCC (including CTNNB1, FMN2, TP53 and ZFX4) from the images. Kiani et al.46 built a CNN model based on HE-stained specimen images that effectively helped in the pathological differentiation of HCC and ICC. In addition, at the molecular level, ML models trained on mutations in related genes can also assist in HCC diagnosis. Zhang et al.47 established an SVM model based on 11 selected genes that distinguished HCC, adjacent noncancerous tissues, and hepatitis cirrhosis.

Chen et al.48 explored the significance of the HBV reverse transcriptase (RT) gene for HCC patients. They used four ML methods to build models that predict HCC from HBV RT sequences. The random forest model based on 10 combined features had the best predictive performance, and the individual HCC risk score it produced distinguished HCC patients from HBV patients. Circulating tumor DNA (ctDNA) detection makes it possible to detect tumors early and noninvasively and helps to match suitable targeted drugs. Tao et al.49 performed low-depth whole-genome sequencing of plasma samples from HBV-related HCC patients and cancer-free HBV patients, and established a random forest model that distinguished the two groups by somatic copy number aberrations in ctDNA. The diagnosis of HCC is not limited to “diagnosis.” Diagnosis, including etiology and disease severity, greatly affects the treatment plan and prognosis, and in clinical practice diseases are often atypical. The application of ML has a huge impact on the diagnosis and differential diagnosis of HCC (Table 2).13–15,17,18,29–32,37–49

Table 2

Details of machine learning for the diagnosis of hepatocellular carcinoma

| Author and year | Data type | Sample number | Machine learning model/algorithm | Results |
|---|---|---|---|---|
| Phan et al. 2020 [13] | Clinical data | N: 6,052 (training set: 70%; test set: 30%) | Convolutional neural network | AUC: 0.886 |
| Nam et al. 2020 [14] | Clinical data | Training set: 424; validation set (independent external cohort): 316 | Deep neural network | c-index: 0.782 |
| Sato et al. 2019 [15] | Clinical data | N: 1,580 (training set: 80%; development set and test set: 20%) | SVM, gradient boosting, random forest, neural network, deep learning, and other algorithms | Gradient boosting model had the highest accuracy (87.34%); AUC: 0.94 |
| Kim et al. 2021 [17] | Clinical data | Training set: 6,051; external validation cohorts: 5,817 patients from Korean centers and 1,640 from Western centers | GBM | c-index: 0.79 |
| Wong et al. 2022 [18] | Clinical data | N: 124,006 (training set: 70%; test set: 30%) | AdaBoost, decision tree, and random forest | Accuracy of random forest (AUROC: 0.837) was stable |
| Schmauch et al. 2019 [29] | Imaging | Training set: 367; validation set: 177 | Deep learning | Weighted mean ROC-AUC score: 0.891 |
| Li et al. 2021 [30] | Imaging | N: 226 (training set: 80%; test set: 20%) | SVM | AUC: 0.86 |
| Brehar et al. 2020 [31] | Imaging | N: 268 (training set: 66%; test set: 20%; validation set: 14%) | CNN, SVM, random forest, and AdaBoost | CNN was the best (accuracy of 91% with AUC of 95%) |
| Jin et al. 2021 [32] | Imaging | Training set: 262; validation set: 86; testing set: 86 | Deep learning | AUCs: 0.981, 0.942, and 0.900 in training, validation, and testing cohorts |
| Ren et al. 2021 [37] | Imaging | Training set: 149; test set: 38; validation set: 39 | SVM | AUC: 0.936 |
| Yasaka et al. 2018 [38] | Imaging | Training set: 460; test set: 100 | CNN | AUC: 0.92 |
| Mokrane et al. 2020 [39] | Imaging | Discovery set: 142; validation set: 36 | KNN, SVM, and random forest | AUC: 0.70 and 0.66 in discovery and validation cohorts |
| Hamm et al. 2019 [40] | Imaging | Training set: 434; test set: 60 | CNN | AUC: 0.992 |
| Liu et al. 2021 [41] | Imaging | N: 86 | SVM | AUC: 0.77 |
| Mao et al. 2020 [42] | Imaging | Training set: 237; test set: 60 | XGBoost | AUC: 0.8014 |
| Nebbia et al. 2020 [43] | Imaging | N: 99 | SVM | Highest AUC: 0.8669 (multiparametric MRI combination) |
| Lin et al. 2019 [44] | Pathology | N: 113 | CNN | Accuracy: >90% |
| Chen et al. 2020 [45] | Pathology | Training set: 261; test set: 50; internal validation set: 155; external validation set: 101 | CNN | Accuracy: 96.0% |
| Kiani et al. 2020 [46] | Pathology | Training set: 70; test set: 80; validation set: 26 | CNN | Accuracy: 0.885 |
| Zhang et al. 2020 [47] | Gene | Training set: 1,333; test set: 336 | SVM | Sensitivity: 91.93%, specificity: 100%, and AUC: 0.9597 |
| Chen et al. 2021 [48] | Genes | Training set: 361; validation set: 183 | Random forest, SVM, KNN | Best predictive performance: random forest (AUC: 0.96; accuracy: 0.90) |
| Tao et al. 2020 [49] | Genes | Training set: 209; validation sets: 76/99 | Random forest | AUC: >0.800 |
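Many of the studies in Table 2 divide a single cohort into training and test sets, for example 70%/30% or 80%/20%. When HCC cases are a minority, splitting each class separately (a stratified split) keeps the case proportion similar in both sets. The Python sketch below illustrates the idea. It is not the splitting procedure of any listed study, and the cohort of 20 HCC and 80 non-HCC patients, the function name `stratified_split`, and the seed are hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_indices, test_indices) that preserve class proportions."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical cohort: 20 HCC patients (label 1) and 80 non-HCC patients (label 0).
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels, test_fraction=0.3)

print(len(train_idx), len(test_idx))     # 70 30
print(sum(labels[i] for i in test_idx))  # 6 HCC cases in the test set
```

A split of this kind only estimates performance on patients similar to the original cohort, which is why several studies in the table additionally report results on an independent external validation cohort.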

ML for the treatment of HCC

The preferred treatment for HCC is surgical resection, and R0 resection should be performed in patients who can undergo surgery. TACE or radiofrequency ablation (RFA) is recommended for patients with nonresectable HCC, and targeted therapy, immunotherapy, or other systemic treatments can be used for patients who cannot undergo the above treatments.3 In clinical practice, doctors encounter patients for whom the choice of treatment is difficult. For an individual patient only one option can be chosen, so the decision needs to be made carefully. Properly used, ML can help in choosing a treatment for the patient.

Choi et al.50 established a clinical decision support system based on 20 clinical indicators selected using a random forest model. The system recommended an initial treatment plan for HCC patients and predicted overall survival under each treatment method. Liu et al.51 established a radiomics model using ultrasound images of HCC patients to predict the efficacy of TACE. The model had an AUC of 0.93; it predicted the progression-free survival of patients and helped optimize their treatment. Dong et al.52 compared six ML models for predicting the response of HCC patients to the first TACE treatment in order to select the most appropriate one. The random forest model performed best and accurately predicted the early response to the first TACE treatment. With the development of targeted therapy and immunotherapy, the application of ML to treatment may shift toward selecting targeted and immunologic drugs for HCC patients, providing a reference for choosing suitable drugs in the future (Table 3).50–52

Table 3

Details of machine learning for the treatment of hepatocellular carcinoma

| Author and year | Data type | Sample number | Machine learning model/algorithm | Results |
|---|---|---|---|---|
| Choi et al. 2020 [50] | Clinical data | Training set: 813; validation set: 208 | Random forest | c-index: 0.725 (RFA/PEIT), 0.695 (resection), 0.803 (TACE), 0.676 (TACE + EBRT), 0.684 (sorafenib), 0.710 (supportive care), 0.959 (transplantation), 0.850 (other therapies) |
| Liu et al. 2020 [51] | Imaging | N: 419 (training and validation cohorts at a ratio of 2:1) | CNN | AUC: 0.93 |
| Dong et al. 2021 [52] | Clinical data & imaging | N: 110 (training set: 80%; validation set: 20%) | XGBoost, decision tree, SVM, random forest, KNN, fully convolutional networks | Best performance: random forest (AUC: 0.802, accuracy: 0.784, sensitivity: 0.904, and specificity: 0.480) |

ML for the prognosis of HCC

Since the beginning of the 21st century, HCC has been the fastest-growing cause of cancer-related death in the USA, and it is expected to become the third leading cause by 2030.53 The long-term prognosis after liver transplantation is better than that after hepatectomy, which has a recurrence rate of 70% and a 10-year survival rate of 7–15%.54 Liver transplantation is an ideal surgical method for HCC patients, but it is still limited by the small number of donors and high medical costs. How should one choose between these two treatments for patients with appropriate indications? Schoenberg et al.55 established a random forest model based on clinical data. The AUC of the model for predicting early disease-free survival was 0.788. The model divides patients into high-risk and low-risk groups. Low-risk patients are considered suitable for liver resection and high-risk patients for liver transplantation, which guides the selection of treatment.

Ji et al.56 established a model to predict prognosis after resection in patients with tumors ≤5 cm and no evidence of extrahepatic disease or large vessel invasion. The model used eight clinical characteristics (age, race, AFP, tumor size, tumor number, vascular invasion, histological grade, and fibrosis score) to determine cutoff values, and divided patients into low-risk, medium-risk, and high-risk groups. There was no significant difference in prognosis between low-risk patients undergoing tumor resection and those undergoing liver transplantation. The model provided a reference for whether patients should undergo neoadjuvant therapy. Huang et al.57 also used the clinical data of patients after hepatectomy to establish a model, but they compared multiple models (DeepSurv, XGBoost, and random survival forest) and found that XGBoost performed best. They used a heat map to individualize the recurrence risk. The study also analyzed the importance of prognostic variables by time period. Within 1 year after surgery, tumor thrombus was the most important variable. At 1 to 2 years after surgery, the number of tumors was the most important variable, followed by the type of resection, tumor thrombus, and tumor diameter. At 2 to 3 years and 3 to 5 years, in addition to the number of tumors, HBV infection was a relatively important variable. Smoking was also associated with late recurrence. A model established by Jiang et al.58 using CT radiomics features not only predicted the MVI status of patients before surgery but also stratified patients into groups with different recurrence-free survival. Regarding RFA, an SVM model established by Liang et al.59 effectively identified HCC patients with a relatively high risk of recurrence after ablation, which is helpful for postoperative follow-up and management.

Many other studies have used clinical data, pathological information, radiomics characteristics, and other data to establish ML models.60–65 These models effectively predicted the prognosis of patients and helped with the selection of treatment methods, the scheduling of postoperative review, and the avoidance of high-risk factors. For patients with Barcelona Clinic Liver Cancer (BCLC) stage B disease, international guidelines recommend TACE. However, patients at that stage are highly heterogeneous, and the efficacy of TACE varies. Lin et al.66 selected the clinical data of patients with BCLC stage B and used five indicators (tumor size, tumor number, BCLC-B substage, AFP, and albumin) to establish a random forest model. The model predicts the prognosis of patients after TACE and identifies the intermediate-stage HCC patients who are suitable for TACE. The CNN model established by Peng et al.67 also effectively predicted the efficacy of TACE. A model established by Jin et al.68 using features extracted from enhanced CT effectively predicted the probability of extrahepatic spread or vascular invasion after initial TACE treatment (EVIT).

At the genetic level, many studies have explored models for predicting the prognosis of HCC patients. Chaudhary et al.69 were the first to use deep learning to explore differences in the survival of HCC patients. They established a model using RNA sequencing (RNA-Seq), microRNA sequencing (miRNA-Seq), and methylation data that reliably predicted survival in six different cohorts. Liu et al.70 selected immune genes that differed between normal and HCC samples, and the model established with those genes predicted the 5-year survival of HCC patients. Bedon et al.71 constructed a model based on methylation profiles that classified HCC patients by progression-free survival into groups at high and low risk of early progression. Prognosis is a common concern of patients and doctors. The extensive application of ML makes prognosis prediction more specific and provides great help for the follow-up guidance of patients (Table 4).55–59,66,67,69–71

Table 4

Details of machine learning for the prognosis of hepatocellular carcinoma

| Author and year | Data type | Sample number | Machine learning model/algorithm | Results |
|---|---|---|---|---|
| Schoenberg et al. 2020 [55] | Clinical data | Training set: 127; test set: 53 | Random forest | AUC: 0.788 |
| Ji et al. 2021 [56] | Clinical data | Training/validation set: 1,899; test set: 879 | GBM | c-index: >0.72 |
| Huang et al. 2021 [57] | Clinical data | Training set: 5,928; internal validation set: 1,483; external validation set: 508 | DeepSurv, XGBoost, random survival forest | Best performance: XGBoost (c-index: 0.713) |
| Jiang et al. 2021 [58] | Clinical data & imaging | Training set: 324; validation set: 81 | XGBoost, 3D-CNN | AUROCs: training set 0.952 and 0.980; validation set 0.887 and 0.906 |
| Liang et al. 2014 [59] | Clinical data | N: 83 | SVM | AUC: 0.69 |
| Lin et al. 2021 [66] | Clinical data | Training set: 602; internal validation set: 301; external validation set: 343 | Random forest | c-index: 0.69; AUROC: >0.71 |
| Peng et al. 2020 [67] | Imaging | Training set: 562; validation sets: 89/138 | CNN | AUC: >0.95 |
| Chaudhary et al. 2018 [69] | Gene | Training set: 360; validation sets (5 external datasets): 230/221/166/40/27 | Deep learning | c-index: 0.68 |
| Liu et al. 2021 [70] | Gene | N (3 databases): TCGA 365; ICGC 232; GSE14520 209 | Random forest | AUC: >0.7 |
| Bedon et al. 2021 [71] | Gene | Training set: 300; test set: 74 | Random forest | Accuracy: 0.80 |
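Several results in Table 4 are reported as a c-index (concordance index), which extends the idea of the AUC to survival data with censoring: among pairs of patients whose outcomes can be ordered, it is the fraction in which the patient with the earlier event was given the higher predicted risk. The Python sketch below implements Harrell's version of this measure. The follow-up times, event indicators, and risk scores are hypothetical, and the code is an illustration rather than the evaluation code of any listed study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored survival data.

    times: follow-up time of each patient (for example, months).
    events: 1 if the event (recurrence or death) was observed, 0 if censored.
    risk_scores: model output, higher meaning higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before the end of patient j's follow-up.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data for five patients.
times = [5, 12, 20, 30, 36]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.6, 0.3, 0.1]

print(concordance_index(times, events, risk_scores))  # 0.875
```

In this example eight pairs are comparable and seven are ordered correctly, giving 0.875. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the c-index values of roughly 0.68 to 0.72 in the table.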


The study of ML in HCC involves a variety of data, such as patient clinical information, imaging information, pathological information, and gene loci. ML can provide guidance in the diagnosis of HCC, the selection of treatment methods, and prognosis prediction. It is especially useful for noninvasive diagnosis, where its advantages include accuracy in processing images. By supporting noninvasive diagnosis, ML can avoid the contraindications and complications of biopsy, as well as the possibility of tumor rupture and disseminated metastasis. Because HCC and ICC are treated differently, preoperative differentiation of HCC and ICC by ML can help in assessing whether surgery can be performed and which surgical procedure to use.

ML has provided valuable guidance for the diagnosis and treatment of HCC in many respects. In particular, it bases diagnosis and treatment decisions on actual data and measured accuracy rather than on subjective assessment and experience. As described above, many types of data are available for the application of ML to HCC, including basic clinical information (sex, hepatitis history, blood biochemical examination, and others), imaging data (ultrasound, CT, and MRI), pathology data, and gene data. Moreover, many models and algorithms can be used, for example random forest, SVM, and deep learning. It is not always certain which model suits a given research problem, but ML models can be selected according to the type of research data. SVM, random forest, artificial neural network, boosting, and bagging algorithms are common models in ML; they follow the traditional “learning mode” and so are more suitable for processing numerical data. Deep learning, including CNN and others, involves more complex learning and training, and its algorithms are closer to models of the human brain. Deep learning may be more suitable for processing complex data types. However, ML has shortcomings. The learning process is still a black box whose workings are hard to understand, which may cause potential harm. The interpretability of AI is still a problem that needs to be solved. In addition, models are always based on a part of the population, so their wider application faces a major test, and they need to be constantly improved.

There are many types of ML algorithms, data types, and research methods. Researchers have been exploring suitable algorithms and models, and have achieved much. With the continuous development of AI and ML, ML models for HCC-related research are expected to improve further and to benefit HCC patients.





Abbreviations

AI: artificial intelligence
AUROC: area under the receiver operating characteristic curve
CNN: convolutional neural network
GBM: gradient boosting machine
HCC: hepatocellular carcinoma
ICC: intrahepatic cholangiocarcinoma
KNN: k-nearest neighbor
ML: machine learning
MVI: microvascular infiltration
RFA: radiofrequency ablation
ROI: regions of interest
SVM: support vector machine
TACE: transcatheter arterial chemoembolization



Funding

This work was supported by the Natural Science Foundation of Hunan Province (2022JJ30939) and the Science and Technology Innovation Leading Project for High-tech Industry of Hunan Province (2020SK2009).

Conflict of interest

The authors have no conflicts of interest related to this publication.

Authors’ contributions

Study conception and design (SF, XY, CL), acquisition of data (SF, JW, LW, QQ, DC, HS, XL), analysis and interpretation of data (SF, JW, LW, QQ, DC, HS), drafting of the manuscript (SF), critical revision of the manuscript for important intellectual content (SF, JW, LW, QQ, DC, HS, XL, XY, CL), project administration (XY, CL), and study supervision (XL, XY, CL). All authors have made significant contributions to this study and have approved the final manuscript.


  1. Yang JD, Hainaut P, Gores GJ, Amadou A, Plymoth A, Roberts LR. A global view of hepatocellular carcinoma: trends, risk, prevention and management. Nat Rev Gastroenterol Hepatol 2019;16(10):589-604
  2. Chidambaranathan-Reghupaty S, Fisher PB, Sarkar D. Hepatocellular carcinoma (HCC): Epidemiology, etiology and molecular classification. Adv Cancer Res 2021;149:1-61
  3. Llovet JM, Kelley RK, Villanueva A, Singal AG, Pikarsky E, Roayaie S, et al. Hepatocellular carcinoma. Nat Rev Dis Primers 2021;7(1):6
  4. Terrault NA, Lok ASF, McMahon BJ, Chang KM, Hwang JP, Jonas MM, et al. Update on prevention, diagnosis, and treatment of chronic hepatitis B: AASLD 2018 hepatitis B guidance. Hepatology 2018;67(4):1560-1599
  5. Liu J, Liang W, Jing W, Liu M. Countdown to 2030: eliminating hepatitis B disease, China. Bull World Health Organ 2019;97(3):230-238
  6. Yang JD, Heimbach JK. New advances in the diagnosis and management of hepatocellular carcinoma. BMJ 2020;371:m3544
  7. Anwanwan D, Singh SK, Singh S, Saikam V, Singh R. Challenges in liver cancer and possible treatment approaches. Biochim Biophys Acta Rev Cancer 2020;1873(1):188314
  8. Haenlein M, Kaplan A. A Brief History of Artificial Intelligence: On the Past, Present, and Future of Artificial Intelligence. Calif Manage Rev 2019;61(4):5-14
  9. Bhinder B, Gilvary C, Madhukar NS, Elemento O. Artificial Intelligence in Cancer Research and Precision Medicine. Cancer Discov 2021;11(4):900-915
  10. Bansal M, Goyal A, Choudhary A. A comparative analysis of K-Nearest Neighbor, Genetic, Support Vector Machine, Decision Tree, and Long Short Term Memory algorithms in machine learning. Decision Analytics Journal 2022;3:100071
  11. van der Pol CB, Lim CS, Sirlin CB, McGrath TA, Salameh JP, Bashir MR, et al. Accuracy of the Liver Imaging Reporting and Data System in Computed Tomography and Magnetic Resonance Image Analysis of Hepatocellular Carcinoma or Overall Malignancy-A Systematic Review. Gastroenterology 2019;156(4):976-986
  12. Marrero JA, Kulik LM, Sirlin CB, Zhu AX, Finn RS, Abecassis MM, et al. Diagnosis, Staging, and Management of Hepatocellular Carcinoma: 2018 Practice Guidance by the American Association for the Study of Liver Diseases. Hepatology 2018;68(2):723-750
  13. Phan DV, Chan CL, Li AA, Chien TY, Nguyen VC. Liver cancer prediction in a viral hepatitis cohort: A deep learning approach. Int J Cancer 2020;147(10):2871-2878
  14. Nam JY, Sinn DH, Bae J, Jang ES, Kim JW, Jeong SH. Deep learning model for prediction of hepatocellular carcinoma in patients with HBV-related cirrhosis on antiviral therapy. JHEP Rep 2020;2(6):100175
  15. Sato M, Morimoto K, Kajihara S, Tateishi R, Shiina S, Koike K, et al. Machine-learning Approach for the Development of a Novel Predictive Model for the Diagnosis of Hepatocellular Carcinoma. Sci Rep 2019;9(1):7704
  16. Angelis I, Exarchos T. Hepatocellular Carcinoma Detection Using Machine Learning Techniques. Adv Exp Med Biol 2021;1338:21-29
  17. Kim HY, Lampertico P, Nam JY, Lee HC, Kim SU, Sinn DH, et al. An artificial intelligence model to predict hepatocellular carcinoma risk in Korean and Caucasian patients with chronic hepatitis B. J Hepatol 2022;76(2):311-318
  18. Wong GL, Hui VW, Tan Q, Xu J, Lee HW, Yip TC, et al. Novel machine learning models outperform risk scores in predicting hepatocellular carcinoma in patients with chronic viral hepatitis. JHEP Rep 2022;4(3):100441
  19. Papatheodoridis G, Dalekos G, Sypsa V, Yurdaydin C, Buti M, Goulis J, et al. PAGE-B predicts the risk of developing hepatocellular carcinoma in Caucasians with chronic hepatitis B on 5-year antiviral therapy. J Hepatol 2016;64(4):800-806
  20. Wong VW, Chan SL, Mo F, Chan T, Loong HH, Wong GL, et al. Clinical scoring system to predict hepatocellular carcinoma in chronic hepatitis B carriers. J Clin Oncol 2010;28(10):1660-1665
  21. Sohn W, Cho JY, Kim JH, Lee JI, Kim HJ, Woo M, et al. Risk score model for the development of hepatocellular carcinoma in treatment-naïve patients receiving oral antiviral treatment for chronic hepatitis B. Clin Mol Hepatol 2017;23(2):170-178
  22. Flemming JA, Yang JD, Vittinghoff E, Kim WR, Terrault NA. Risk prediction of hepatocellular carcinoma in patients with cirrhosis: the ADRESS-HCC risk model. Cancer 2014;120(22):3485-3493
  23. Kim JH, Kim YD, Lee M, Jun BG, Kim TS, Suk KT, et al. Modified PAGE-B score predicts the risk of hepatocellular carcinoma in Asians with chronic hepatitis B on antiviral therapy. J Hepatol 2018;69(5):1066-1073
  24. Sharma SA, Kowgier M, Hansen BE, Brouwer WP, Maan R, Wong D, et al. Toronto HCC risk index: A validated scoring system to predict 10-year risk of HCC in patients with cirrhosis. J Hepatol 2018;68(1):92-99
  25. Yuen MF, Tanaka Y, Fong DY, Fung J, Wong DK, Yuen JC, et al. Independent risk factors and predictive score for the development of hepatocellular carcinoma in chronic hepatitis B. J Hepatol 2009;50(1):80-88
  26. Yang HI, Yuen MF, Chan HL, Han KH, Chen PJ, Kim DY, et al. Risk estimation for hepatocellular carcinoma in chronic hepatitis B (REACH-B): development and validation of a predictive score. Lancet Oncol 2011;12(6):568-574
  27. Yang HI, Yeh ML, Wong GL, Peng CY, Chen CH, Trinh HN, et al. Real-World Effectiveness From the Asia Pacific Rim Liver Consortium for HBV Risk Score for the Prediction of Hepatocellular Carcinoma in Chronic Hepatitis B Patients Treated With Oral Antiviral Therapy. J Infect Dis 2020;221(3):389-399
  28. Bharti P, Mittal D, Ananthasivan R. Preliminary Study of Chronic Liver Classification on Ultrasound Images Using an Ensemble Model. Ultrason Imaging 2018;40(6):357-379
  29. Schmauch B, Herent P, Jehanno P, Dehaene O, Saillard C, Aubé C, et al. Diagnosis of focal liver lesions from ultrasound using deep learning. Diagn Interv Imaging 2019;100(4):227-233
  30. Li W, Lv XZ, Zheng X, Ruan SM, Hu HT, Chen LD, et al. Machine Learning-Based Ultrasomics Improves the Diagnostic Performance in Differentiating Focal Nodular Hyperplasia and Atypical Hepatocellular Carcinoma. Front Oncol 2021;11:544979
  31. Brehar R, Mitrea DA, Vancea F, Marita T, Nedevschi S, Lupsor-Platon M, et al. Comparison of Deep-Learning and Conventional Machine-Learning Methods for the Automatic Recognition of the Hepatocellular Carcinoma Areas from Ultrasound Images. Sensors (Basel) 2020;20(11):3085
  32. Jin J, Yao Z, Zhang T, Zeng J, Wu L, Wu M, et al. Deep learning radiomics model accurately predicts hepatocellular carcinoma occurrence in chronic hepatitis B patients: a five-year follow-up. Am J Cancer Res 2021;11(2):576-589
  33. Amini N, Ejaz A, Spolverato G, Kim Y, Herman JM, Pawlik TM. Temporal trends in liver-directed therapy of patients with intrahepatic cholangiocarcinoma in the United States: a population-based analysis. J Surg Oncol 2014;110(2):163-170
  34. Spolverato G, Vitale A, Cucchetti A, Popescu I, Marques HP, Aldrighetti L, et al. Can hepatic resection provide a long-term cure for patients with intrahepatic cholangiocarcinoma? Cancer 2015;121(22):3998-4006
  35. Mazzaferro V, Gorgen A, Roayaie S, Droz Dit Busset M, Sapisochin G. Liver resection and transplantation for intrahepatic cholangiocarcinoma. J Hepatol 2020;72(2):364-377
  36. Kelley RK, Bridgewater J, Gores GJ, Zhu AX. Systemic therapies for intrahepatic cholangiocarcinoma. J Hepatol 2020;72(2):353-363
  37. Ren S, Li Q, Liu S, Qi Q, Duan S, Mao B, et al. Clinical Value of Machine Learning-Based Ultrasomics in Preoperative Differentiation Between Hepatocellular Carcinoma and Intrahepatic Cholangiocarcinoma: A Multicenter Study. Front Oncol 2021;11:749137
  38. Yasaka K, Akai H, Abe O, Kiryu S. Deep Learning with Convolutional Neural Network for Differentiation of Liver Masses at Dynamic Contrast-enhanced CT: A Preliminary Study. Radiology 2018;286(3):887-896
  39. Mokrane FZ, Lu L, Vavasseur A, Otal P, Peron JM, Luk L, et al. Radiomics machine-learning signature for diagnosis of hepatocellular carcinoma in cirrhotic patients with indeterminate liver nodules. Eur Radiol 2020;30(1):558-570
  40. Hamm CA, Wang CJ, Savic LJ, Ferrante M, Schobert I, Schlachter T, et al. Deep learning for liver tumor diagnosis part I: development of a convolutional neural network classifier for multi-phasic MRI. Eur Radiol 2019;29(7):3338-3347
  41. Liu X, Khalvati F, Namdar K, Fischer S, Lewis S, Taouli B, et al. Can machine learning radiomics provide pre-operative differentiation of combined hepatocellular cholangiocarcinoma from hepatocellular carcinoma and cholangiocarcinoma to inform optimal treatment planning? Eur Radiol 2021;31(1):244-255
  42. Mao B, Zhang L, Ning P, Ding F, Wu F, Lu G, et al. Preoperative prediction for pathological grade of hepatocellular carcinoma via machine learning-based radiomics. Eur Radiol 2020;30(12):6924-6932
  43. Nebbia G, Zhang Q, Arefan D, Zhao X, Wu S. Pre-operative Microvascular Invasion Prediction Using Multi-parametric Liver MRI Radiomics. J Digit Imaging 2020;33(6):1376-1386
  44. Lin H, Wei C, Wang G, Chen H, Lin L, Ni M, et al. Automated classification of hepatocellular carcinoma differentiation using multiphoton microscopy and deep learning. J Biophotonics 2019;12(7):e201800435
  45. Chen M, Zhang B, Topatana W, Cao J, Zhu H, Juengpanich S, et al. Classification and mutation prediction based on histopathology H&E images in liver cancer using deep learning. NPJ Precis Oncol 2020;4:14
  46. Kiani A, Uyumazturk B, Rajpurkar P, Wang A, Gao R, Jones E, et al. Impact of a deep learning assistant on the histopathologic classification of liver cancer. NPJ Digit Med 2020;3:23
  47. Zhang ZM, Tan JX, Wang F, Dao FY, Zhang ZY, Lin H. Early Diagnosis of Hepatocellular Carcinoma Using Machine Learning Method. Front Bioeng Biotechnol 2020;8:254
  48. Chen S, Zhang Z, Wang Y, Fang M, Zhou J, Li Y, et al. Using Quasispecies Patterns of Hepatitis B Virus to Predict Hepatocellular Carcinoma With Deep Sequencing and Machine Learning. J Infect Dis 2021;223(11):1887-1896
  49. Tao K, Bian Z, Zhang Q, Guo X, Yin C, Wang Y, et al. Machine learning-based genome-wide interrogation of somatic copy number aberrations in circulating tumor DNA for early detection of hepatocellular carcinoma. EBioMedicine 2020;56:102811
  50. Choi GH, Yun J, Choi J, Lee D, Shim JH, Lee HC, et al. Development of machine learning-based clinical decision support system for hepatocellular carcinoma. Sci Rep 2020;10(1):14855
  51. Liu F, Liu D, Wang K, Xie X, Su L, Kuang M, et al. Deep Learning Radiomics Based on Contrast-Enhanced Ultrasound Might Optimize Curative Treatments for Very-Early or Early-Stage Hepatocellular Carcinoma Patients. Liver Cancer 2020;9(4):397-413
  52. Dong Z, Lin Y, Lin F, Luo X, Lin Z, Zhang Y, et al. Prediction of Early Treatment Response to Initial Conventional Transarterial Chemoembolization Therapy for Hepatocellular Carcinoma by Machine-Learning Model Based on Computed Tomography. J Hepatocell Carcinoma 2021;8:1473-1484
  53. Rahib L, Smith BD, Aizenberg R, Rosenzweig AB, Fleshman JM, Matrisian LM. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res 2014;74(11):2913-2921
  54. Franssen B, Jibara G, Tabrizian P, Schwartz ME, Roayaie S. Actual 10-year survival following hepatectomy for hepatocellular carcinoma. HPB (Oxford) 2014;16(9):830-835
  55. Schoenberg MB, Bucher JN, Koch D, Börner N, Hesse S, De Toni EN, et al. A novel machine learning algorithm to predict disease free survival after resection of hepatocellular carcinoma. Ann Transl Med 2020;8(7):434
  56. Ji GW, Fan Y, Sun DW, Wu MY, Wang K, Li XC, et al. Machine Learning to Improve Prognosis Prediction of Early Hepatocellular Carcinoma After Surgical Resection. J Hepatocell Carcinoma 2021;8:913-923
  57. Huang Y, Chen H, Zeng Y, Liu Z, Ma H, Liu J. Development and Validation of a Machine Learning Prognostic Model for Hepatocellular Carcinoma Recurrence After Surgical Resection. Front Oncol 2020;10:593741
  58. Jiang YQ, Cao SE, Cao S, Chen JN, Wang GY, Shi WQ, et al. Preoperative identification of microvascular invasion in hepatocellular carcinoma by XGBoost and deep learning. J Cancer Res Clin Oncol 2021;147(3):821-833
  59. Liang JD, Ping XO, Tseng YJ, Huang GT, Lai F, Yang PM. Recurrence predictive models for patients with hepatocellular carcinoma after radiofrequency ablation using support vector machines with feature selection methods. Comput Methods Programs Biomed 2014;117(3):425-434
  60. Ji GW, Zhu FP, Xu Q, Wang K, Wu MY, Tang WW, et al. Machine-learning analysis of contrast-enhanced CT radiomics predicts recurrence of hepatocellular carcinoma after resection: A multi-institutional study. EBioMedicine 2019;50:156-165
  61. Shan QY, Hu HT, Feng ST, Peng ZP, Chen SL, Zhou Q, et al. CT-based peritumoral radiomics signatures to predict early recurrence in hepatocellular carcinoma after curative tumor resection or ablation. Cancer Imaging 2019;19(1):11
  62. Saillard C, Schmauch B, Laifa O, Moarii M, Toldo S, Zaslavskiy M, et al. Predicting Survival After Hepatocellular Carcinoma Resection Using Deep Learning on Histological Slides. Hepatology 2020;72(6):2000-2013
  63. Yamashita R, Long J, Saleem A, Rubin DL, Shen J. Deep learning predicts postsurgical recurrence of hepatocellular carcinoma from digital histopathologic images. Sci Rep 2021;11(1):2047
  64. Liao H, Xiong T, Peng J, Xu L, Liao M, Zhang Z, et al. Classification and Prognosis Prediction from Histopathological Images of Hepatocellular Carcinoma by a Fully Automated Pipeline Based on Machine Learning. Ann Surg Oncol 2020;27(7):2359-2369
  65. Saito A, Toyoda H, Kobayashi M, Koiwa Y, Fujii H, Fujita K, et al. Prediction of early recurrence of hepatocellular carcinoma after resection using digital pathology images assessed by machine learning. Mod Pathol 2021;34(2):417-425
  66. Lin H, Zeng L, Yang J, Hu W, Zhu Y. A Machine Learning-Based Model to Predict Survival After Transarterial Chemoembolization for BCLC Stage B Hepatocellular Carcinoma. Front Oncol 2021;11:608260
  67. Peng J, Kang S, Ning Z, Deng H, Shen J, Xu Y, et al. Residual convolutional neural network for predicting response of transarterial chemoembolization in hepatocellular carcinoma from CT imaging. Eur Radiol 2020;30(1):413-424
  68. Jin Z, Chen L, Zhong B, Zhou H, Zhu H, Zhou H, et al. Machine-learning analysis of contrast-enhanced computed tomography radiomics predicts patients with hepatocellular carcinoma who are unsuitable for initial transarterial chemoembolization monotherapy: A multicenter study. Transl Oncol 2021;14(4):101034
  69. Chaudhary K, Poirion OB, Lu L, Garmire LX. Deep Learning-Based Multi-Omics Integration Robustly Predicts Survival in Liver Cancer. Clin Cancer Res 2018;24(6):1248-1259
  70. Liu J, Chen Z, Li W. Machine Learning for Building Immune Genetic Model in Hepatocellular Carcinoma Patients. J Oncol 2021;2021:6676537
  71. Bedon L, Dal Bo M, Mossenta M, Busato D, Toffoli G, Polano M. A Novel Epigenetic Machine Learning Model to Define Risk of Progression for Hepatocellular Carcinoma Patients. Int J Mol Sci 2021;22(3):1075
  • Journal of Clinical and Translational Hepatology
  • pISSN 2225-0719
  • eISSN 2310-8819