Introduction
Acute-on-chronic liver failure (ACLF) is a clinically critical illness characterized by acute exacerbations of underlying chronic liver diseases with short-term high mortality.1,2 The etiology of underlying chronic liver diseases and precipitating events are distinct between Eastern and Western ACLF, which contributes to the heterogeneity of this syndrome.3 In Eastern ACLF, especially in China, hepatitis B virus related acute-on-chronic liver failure (HBV-ACLF) is the most common type.4
There are a variety of emerging therapies for HBV-ACLF, such as extracorporeal liver support devices,5,6 glucocorticoids,7,8 granulocyte colony-stimulating factor (G-CSF),9 and cell therapies,10,11 but their efficacy requires further validation. Liver transplantation (LT) remains the only definitive treatment to reduce the mortality of advanced HBV-ACLF12 but is limited by a lack of organ donors, the high financial cost of the procedure, and high mortality on the waiting list. In the European Association for the Study of the Liver–Chronic Liver Failure (EASL-CLIF) Acute-on-Chronic Liver Failure in Cirrhosis (CANONIC) study, ACLF patients had a 28-day mortality of 33.9%, and only 7.6% received LT.13 As a result, it is critical to precisely predict the short-term outcome of HBV-ACLF at the early stage of disease to make an accurate and prompt clinical decision on LT.
A number of clinical prediction models (CPMs) have been used to predict the short-term prognosis of HBV-ACLF utilizing laboratory and clinical variables that can be easily obtained in clinical practice. Some were specifically developed for HBV-ACLF, while others were originally developed for end-stage liver diseases [for instance, the model for end-stage liver disease (MELD) score,14 MELD-sodium (MELD-Na) score15 and Child-Turcotte-Pugh (CTP) score16], acute liver failure [King’s College Criteria (KCC)17], and other critical illnesses with organ failures [sequential organ failure assessment (SOFA)18]. Despite the number of available CPMs, there is no consensus on the optimal model for predicting HBV-ACLF outcome. In addition, there are major concerns about the heterogeneity of study populations as well as model quality. Therefore, in this study, we systematically assessed both the performance and quality of available HBV-ACLF CPMs. We also analyzed the factors associated with heterogeneity in their predictive performance among different studies.
Methods
Study search and selection
A keyword search was carried out on articles related to HBV-ACLF published in PubMed from January 1995 to April 2020. The search strategy was developed as follows: (HBV OR hepatitis B) AND (severe flares of chronic hepatitis B OR chronic severe hepatitis B OR severe flare-up, chronic hepatitis B OR hepatic failure OR severe hepatitis B OR severe acute chronic hepatitis B (CHB) exacerbation OR hepatic decompensation OR severe acute exacerbation OR liver failure OR acute-on-chronic liver failure OR ACLF OR acute liver failure) AND (mortality OR prognosis OR outcome). Two reviewers (YX and LY) independently screened the searched articles based on the title, abstract, and full text sequentially. Disputes were resolved by negotiation between the two reviewers.
We included articles that reported the development of an HBV-ACLF-specific CPM or that assessed the predictive performance of previously established, non-HBV-ACLF-specific CPMs in patients with HBV-ACLF.
In addition, included studies had to have clearly defined endpoints and, if an HBV-ACLF-specific CPM was developed, had to report the statistical modeling approach. For inclusion, the CPM had to contain at least two independent variables.
The exclusion criteria were as follows: (1) other types of publications, such as letters and reviews; (2) samples including patients younger than 18 years of age or pregnant women; (3) reports of biomarker-based prediction models; (4) reports of cost-benefit models; (5) experimental studies; or (6) decision-analysis studies.
Data extraction
We extracted the following information for each of the included articles: (1) year of publication; (2) study design; (3) study registration if reported; (4) diagnostic criteria for HBV-ACLF; (5) baseline characteristics of the study population; (6) sample size; (7) number of deaths or LT if reported; (8) variables included in the new CPMs; (9) statistical approaches for model development; and (10) model validation.
All information was independently extracted by the two reviewers, and disputes were resolved by negotiation between them.
Model assessment
Quality of HBV-ACLF-specific models
As shown in Supplementary Table 1, a scoring system was established by weighting study design, number of patient-recruiting centers, sample size, adjustment for confounding factors, reporting of LT, and model validation. Studies with scores of 5–6 were considered high quality, 3–4 medium quality, and 1–2 low quality.
Performance of the CPMs
The performance of the CPMs was evaluated by discrimination and calibration.19 Discrimination referred to how well the model distinguished individuals at high risk of an event from those at low risk of an event.19 Calibration referred to the accuracy of absolute risk estimation.19 To measure model discrimination, we extracted the area under the receiver operating characteristic curve (AUROC) from each study. Quantitative pooled analysis of the discrimination performance of a specific model reported in several studies was performed by summary receiver operating characteristic (SROC) curves using Review Manager 5.3. To measure calibration, information on the Hosmer-Lemeshow test was extracted.
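The two performance measures defined above can be made concrete with a minimal sketch. It assumes each patient has a model score (or predicted probability) and a binary outcome (1 = death, 0 = survival); the function names `auroc` and `hosmer_lemeshow` are illustrative and do not come from any of the reviewed studies. The AUROC is computed as the proportion of non-survivor/survivor pairs in which the non-survivor has the higher score, and the Hosmer-Lemeshow statistic compares observed and expected deaths within risk-ordered groups.

```python
from typing import Sequence


def auroc(scores: Sequence[float], died: Sequence[int]) -> float:
    """Discrimination: chance that a non-survivor outscores a survivor (ties count 0.5)."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(probs: Sequence[float], died: Sequence[int], groups: int = 10) -> float:
    """Calibration: Hosmer-Lemeshow chi-square statistic over risk-ordered groups."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        observed = sum(died[i] for i in idx)
        expected = sum(probs[i] for i in idx)
        mean_p = expected / len(idx)
        # Groups with a mean predicted risk of exactly 0 or 1 are skipped
        # to avoid division by zero.
        if 0.0 < mean_p < 1.0:
            stat += (observed - expected) ** 2 / (len(idx) * mean_p * (1.0 - mean_p))
    return stat
```

The statistic is conventionally compared with a chi-square distribution with (groups − 2) degrees of freedom; that step is omitted here because it requires a statistics library.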
Ethics approval and consent to participate
The ethics committee of the First Affiliated Hospital of Zhejiang University reviewed and approved this study. Written consent from patients or their authorized representatives was waived.
Results
Characteristics of all CPMs
A total of 4,261 related studies were retrieved from PubMed based on the keyword search. After screening by title, abstract, and full text against the inclusion and exclusion criteria, 52 studies were selected (Fig. 1), of which 31 developed HBV-ACLF-specific CPMs and the other 21 assessed previously established CPMs. As shown in Fig. 2, the number of publications has increased rapidly year over year. The studies were published in 30 academic journals, the most frequent being Chinese Journal of Hepatology [n=5 (9.62%)], followed by Medicine (Baltimore) [n=4 (7.69%)].
The diagnosis of HBV-ACLF in these studies was made mainly based on the Asian Pacific Association for the Study of the Liver (APASL) consensus for ACLF (51.92%) or the Chinese Medical Association (CMA) liver failure guidelines (40.38%). Among all studies, the sample size ranged from 46 to 1,202 patients. Significant heterogeneity was observed in patient characteristics among the different studies, as shown by the sex proportion (male/female) (ranging from 2.96 to 12.19), incidence of cirrhosis (24–100%), incidence of hepatic encephalopathy (10–51%), incidence of ascites (36–91%), and mean MELD score (20.97–29.00). The type of precipitating event was reported in seven studies (13.5%), with flare-up of hepatitis B being the major event in each study. Mortality varied among the different studies, with 3-month mortality ranging from 26% to 87%.
Regarding reporting of LT, 18 studies did not mention LT (34.62%), 21 excluded patients receiving LT (40.38%), and 5 defined LT and death as a composite endpoint (9.62%). LT was regarded as a censored event in six studies (11.54%). Patients with LT were defined as survivors in one study (1.92%). In one further study (1.92%), patients who received LT within 3 months were considered dead, whereas those who received LT after more than 3 months were considered survivors.
In 8 studies (15.4%), dynamic parameters were used for modeling. ΔMELD or ΔMELD-Na calculated as the difference between MELD or MELD-Na at two time points was most frequent. One parameter was constructed based on the daily levels of predictive variables for 7 days after diagnosis combined with baseline risk factors. In the other studies, only baseline parameters were used.
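As a minimal sketch of the dynamic parameter described above, ΔMELD can be computed as the difference between the MELD score at two time points. The MELD formula used here is the commonly cited form (coefficients 3.78, 11.2, and 9.57, constant 6.43, inputs floored at 1.0, creatinine capped at 4.0 mg/dL); it is not stated in this article, and individual studies may have used variants or different time points. The function names are hypothetical.

```python
import math
from typing import Tuple

# (bilirubin mg/dL, INR, creatinine mg/dL) measured at one time point
Labs = Tuple[float, float, float]


def meld(bilirubin_mg_dl: float, inr: float, creatinine_mg_dl: float) -> float:
    """Commonly cited MELD formula (an assumption; not given in the source text)."""
    bili = max(bilirubin_mg_dl, 1.0)                 # values below 1.0 are set to 1.0
    inr_floor = max(inr, 1.0)
    creat = min(max(creatinine_mg_dl, 1.0), 4.0)     # creatinine capped at 4.0 mg/dL
    return (3.78 * math.log(bili)
            + 11.2 * math.log(inr_floor)
            + 9.57 * math.log(creat)
            + 6.43)


def delta_meld(baseline: Labs, follow_up: Labs) -> float:
    """Change in MELD between two time points (follow-up minus baseline)."""
    return meld(*follow_up) - meld(*baseline)
```

A positive ΔMELD indicates that the score rose between the two measurements. Bilirubin reported in μmol/L would first need conversion to mg/dL (dividing by roughly 17.1) before being passed to this sketch.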
Characteristics of HBV-ACLF-specific CPMs
Thirty-one CPMs were established specifically for HBV-ACLF (Table 1).
Table 1. Patient characteristics of HBV-ACLF-specific CPMs (columns from Age through MELD score are basic characteristics of the study population at admission)
References† | Model | ACLF diagnostic criteria | Sample | Death events | Endpoint time | Age in years | Sex, male/female | Cirrhosis, n/total | Ascites, n/total | HE, n/total | TB in μmol/L | INR | MELD score |
---|
[1] | Ke’s model | CMA | 205 | 104 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
[2] | Li’s model | CMA | 409 | 215 | NA | 42±12 | 378/31 | NA | NA | NA | NA | NA | NA |
[3] | Sun’s model | CMA | 204 | 118 | 90-day | 46.8±13.2 | 170/34 | 110/204 | NA | 86/204 | 318.6±175.8 | NA | 26.0±9.0 |
[4] | LRM | APASL | 452 | 175 | 90-day | 45.6±11.5 | 361/91 | 138/452 | 334/452 | 119/452 | NA | NA | NA |
[5] | He’s model | CMA | 172 | 75 | 90-day | 45.16±11.21 | 144/28 | 132/172 | 96/172 | NA | 297.8±109.3 | 2.4±0.7 | 26.4±4.2 |
[6] | TPPM | APASL | 248 | 133 | 90-day | 42.27±11.98 | 225/23 | 68/248 | 152/248 | 95/248 | 270.9±140.3 | 2.0±0.5 | 20.97±5.83 |
[7] | Zheng’s model | APASL | 726 | 371 | 90-day | 43.5±11.6 | 635/91 | NA | 530/726 | 251/726 | NA | NA | NA |
[8] | ALPH-Q | APASL | 214 | 81 | 90-day | NA | 160/54 | 99/214 | 123/214 | 45/214 | NA | NA | NA |
[9] | Yan’s model | APASL | 432 | 209 | 90-day | 46.9±13.3 | 329/103 | 239/432 | 348/432 | 115/432 | 351 (210) | 2.8 (1.6) | 27.8 (8.3) |
[10] | Yi’s model | APASL | 392 | 218 | 90-day | NA | 323/69 | NA | NA | 165/392 | NA | NA | NA |
[11] | Li’s model | CMA | 338 | 129 | 90-day | 44.7±10.1 | 268/70 | 222/338 | 220/338 | 54/338 | NA | NA | NA |
[12] | HBV-ACLFs | EASL-ACLF | 300 | 150 | 28-day | 46.5±11.3 | 233/67 | 300/300 | 229/300 | 71/300 | 453.2±278.7 | 3.2±2.1 | NA |
[13] | HAM | APASL | 530 | 190 | 90-day | 41 (median) | 489/41 | 246/530 | 264/530 | 95/530 | NA | NA | NA |
[14] | Chen’s model | APASL | 551 | 241 | 90-day | NA | 465/86 | 217/551 | NA | NA | NA | NA | NA |
[15] | MELD-LAC | AASLD | 236 | 106 | 90-day | NA | 197/39 | 131/236 | NA | NA | NA | NA | NA |
[16] | HINAT ACLF | APASL | 573 | 153 (28-day), 219 (90-day) | 28-day, 90-day | 43.5±11.5 | 478/98 | NA | 374/573 | 117/573 | 313.0±144.7 | 2.3±0.8 | NA |
[17] | Lei’s model | CMA | 138 | NA | the time of discharge or in-hospital death of the patient | 45.80±11.01 | 111/27 | 51/138 | 96/138 | NA | NA | NA | NA |
[18] | Lin’s model | APASL | 456 | 176 | 90-day | NA | 383/73 | NA | 228/456 | 46/456 | NA | NA | NA |
[19] | Shi’s model | APASL | 384 | 75 (30-day), 106 (60-day), 125 (90-day), 127 (180-day) | 30-day, 60-day, 90-day, 180-day | NA | 303/81 | 177/384 | 236/384 | 93/384 | NA | NA | NA |
[20] | Xue’s model | APASL | 305 | 87 | 30-day | NA | 257/48 | 89/305 | 212/305 | 92/305 | NA | NA | NA |
[21] | Gong’s model | CMA | 184 | 75 | 90-day | NA | 157/27 | NA | 122/184 | NA | NA | NA | NA |
[22] | Lin’s model | APASL | 370 | 110 | 90-day | NA | 314/56 | 88/370 | 248/370 | 103/370 | NA | NA | NA |
[23] | HINT | APASL | 635 | 204 | 30-day | 46.31±11.87 | 538/97 | 455/635 | 239/635 | 108/635 | 319.1 (220.9, 421.0) | 2.02 (1.71, 2.55) | 23.07±5.95 |
[24] | COSSH-ACLF | EASL-ACLF | 657 | 233 (28-day), 313 (90-day) | 28-day, 90-day | NA | 586/71 | 466/657 | 366/657 | 130/657 | NA | NA | NA |
[25] | CTP-ABIC | CMA | 222 | 80 | 90-day | NA | 197/25 | 168/222 | 151/222 | 44/222 | NA | NA | NA |
[26] | Gao’s model | APASL | 1,202 | 329 (28-day), 456 (90-day) | 28-day, 90-day | NA | 980/222 | 382/1,202 | 772/1,202 | 282/1,202 | NA | NA | NA |
[27] | APM | APASL | 405 | NA | 28-day | NA | 358/47 | 176/405 | 144/405 | 52/405 | NA | NA | NA |
[28] | ANN | APASL | 402 | 160 | 90-day | 47.2±13.3 | 316/86 | NA | NA | NA | 297.5±169.3 | 2.9±1.7 | 28.2±6.2 |
[29] | ANN | APASL | 684 | 175 (28-day), 251 (90-day) | 28-day, 90-day | 43.9±11.6 | 582/102 | NA | 405/684 | 122/684 | 323.5±148.4 | 2.3±0.8 | 22.9 (20.0, 26.5) |
[30] | CART | NA | 777 | 316 | 90-day | NA | 610/167 | 371/777 | NA | NA | NA | NA | NA |
[31] | CART | EASL-CLIF | 489 | 191 (28-day) | 28-day | NA | 424/65 | 234/489 | 234/489 | 63/489 | NA | NA | NA |
The diagnosis of HBV-ACLF in these studies was made mainly based on the APASL consensus [n=18 (58.06%)] or the CMA liver failure guidelines [n=8 (25.81%)]. EASL-ACLF criteria were used in two studies [n=2 (6.45%)] and the Chinese Group on the Study of Severe Hepatitis B-acute-on-chronic liver failure (COSSH-ACLF) criteria in one study [n=1 (3.23%)]. One study [n=1 (3.23%)] adopted the diagnostic criteria for acute liver failure proposed by the American Association for the Study of Liver Diseases (AASLD). One study did not mention specific diagnostic criteria [n=1 (3.23%)].
As shown in Supplementary Table 1, 17 studies had a quality score of 0–2 (low quality), 12 had a score of 3–5 (medium quality), and only 2 had a score of 6–8 (high quality). Most were retrospective [n=26 (83.87%)] and single-center [n=30 (96.77%)], and only one was pre-registered. In terms of variable screening, most studies used regression approaches [n=26 (83.87%)]. The logistic regression model [n=14 (45.16%)] and the Cox proportional hazards model [n=12 (38.71%)] were the two methods most frequently used to identify risk variables. Two studies (6.45%) did not mention a clear variable screening method. Among the clinical variables included in the CPMs, serum bilirubin (67.74%), international normalized ratio (INR) (54.84%), and hepatic encephalopathy (51.61%) were most frequent (Table 2). In terms of model formula, most CPMs [n=19 (61.29%)] were calculated from the results of multivariate logistic regression or a Cox proportional hazards model as follows: (regression coefficient β1)×(variable 1)+(regression coefficient β2)×(variable 2)+(regression coefficient β3)×(variable 3)+…+constant (if logistic regression). Three (9.68%) were calculated based on the sum of a series of categorical variables, the values of which were equally assigned [such as the Child-Turcotte-Pugh (CTP) score]; moreover, 5 (16.13%) were represented in the form of a nomogram, 2 (6.45%) were represented as an artificial neural network, and 2 (6.45%) were represented as a classification and regression tree.
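The regression-based model formula described above can be sketched as a weighted sum of variables plus a constant. The coefficient values, variable names, and function names in this example are hypothetical placeholders and are not taken from any of the reviewed models. For a logistic model, the resulting linear predictor can be converted to a predicted probability with the logistic function; for a Cox model, the weighted sum (without a constant) gives a relative risk score only.

```python
import math
from typing import Mapping


def linear_predictor(coefs: Mapping[str, float],
                     values: Mapping[str, float],
                     constant: float = 0.0) -> float:
    """Sum of (regression coefficient x variable) plus an optional constant."""
    return constant + sum(beta * values[name] for name, beta in coefs.items())


def logistic_risk(lp: float) -> float:
    """Convert a logistic-regression linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical coefficients for illustration only (not from any published CPM).
example_coefs = {"ln_bilirubin": 1.2, "inr": 0.9, "he_present": 1.5}
example_patient = {"ln_bilirubin": 3.0, "inr": 2.4, "he_present": 1.0}
example_lp = linear_predictor(example_coefs, example_patient, constant=-7.0)
example_risk = logistic_risk(example_lp)
```

A cut-off on the linear predictor or on the predicted probability would then separate predicted non-survivors from predicted survivors.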
Table 2. Variables included in each model and screening approaches
New CPMs | Variables | Methods |
---|
Ke’s model | TB; PTA; WBC; serum creatinine; maximum depth of ascites; HE score; singultus score; digestive tract hemorrhage score | Not mentioned |
Li’s model | HE; serum creatinine; PTA; TB; infection; liver size; ascites fluid level | Clinical experience |
Sun’s model | HR; LC; hepatitis B e antigen; ALB; PTA | Logistic regression |
LRM | HE; HR; LC; hepatitis B e antigen; PTA; Age | Logistic regression |
He’s model | HE; serum creatinine; INR; TB at the end of 2 weeks of treatment; cholinesterase | Logistic regression |
TPPM | TB; INR; complications; HBV DNA | Logistic regression |
Zheng’s model | TB; serum creatinine; PTA; HE; the maximum depth of ascites; WBC | Not mentioned |
ALPH-Q | age; LC; PT; HE; QTc | Cox regression |
Yan’s model | age; HE score; MELD | Cox regression |
Yi’s model | HE; lnPTA2; lnINR2; lnTB2 (PTA2, INR2 and TB2 corresponded to those parameters at two weeks of treatment) | Logistic regression |
Li’s model | age; Family history of HBV; HE; HR; WBC; PLT; INR; TB; TBA; CHE; serum creatinine; serum sodium; HBV DNA; hepatitis B e antigen | Logistic regression |
HBV-ACLFs | age; serum creatinine; WBC | Cox regression |
HAM | MELD; HE; AFP; WBC; age | Logistic regression |
Chen’s model | MELD, age, sodium | Logistic regression |
MELD-LAC | LAC, MELD | Logistic regression |
HINAT ACLF | HE, INR, NLR | Cox regression |
Lei’s model | NLR; serum levels of gamma-glutamyltransferase; ALB; sodium; artificial liver support therapy | Logistic regression |
Lin’s model | age; LAAR; MELD | Cox regression |
Shi’s model | age; TB; serum sodium; PTA | Cox regression |
Xue’s model | TB; ALB; INR; Blood neutrophils percentage count; HE; Suspicion of infection | Logistic regression |
Gong’s model | NLR; age; TB | Cox regression |
Lin’s model | TB; evolution of bilirubin; PTA; PLT; anti-HBe | Logistic regression |
HINT | HE; INR; neutrophil count; TSH | Cox regression |
COSSH-ACLF | INR; HBV-SOFA; Age; TB | Cox regression |
CTP-ABIC | CTP; ABIC | Cox regression |
Gao’s model | age; TB; ALB; INR; HE | Cox regression |
APM | AFP; HE score; serum sodium; INR | Cox regression |
ANN | serum sodium; TB; age; PTA; Hb; hepatitis B e antigen | Univariate analysis and Artificial neural network |
ANN | TB, PTA, serum sodium, HE, hepatitis B e antigen, GGT, ALP, age | Univariate analysis and Artificial neural network |
CART | TB, age, serum sodium, INR | Univariate Logistic regression and Classification and regression tree |
CART | HE, PT, TB | Logistic regression and Classification and regression tree |
A total of 19 CPMs (61.29%) were validated, including 1 model that was validated by two cohorts. Single-center and multicenter validation cohorts were used in 14 and 6 studies, respectively (a single-center cohort and a multicenter cohort were used for the CPM with two validation cohorts). Eight of fourteen single-center validation cohorts were derived from the same center as the modeling cohorts, and the other six cohorts were derived from external centers. The validation cohort was prospective in five studies (26.32%) and retrospective in fourteen studies (73.68%). The patients in the model cohort and validation cohort were recruited during the same period in two studies but not in the other sixteen studies; one study did not mention the timing of recruitment. The sample size of the validation cohort was generally smaller than the derivation cohort and ranged from 88 to 300 patients.
Characteristics of non-HBV-ACLF-specific CPMs
A total of 21 studies evaluated the performance of CPMs that were non-specific for HBV-ACLF. Eighteen were single-center studies (85.7%) and three were multicenter studies (14.3%). Ten models developed for other diseases were evaluated, including KCC for acute liver failure, the age-bilirubin-INR-creatinine (ABIC) score for alcoholic liver disease, the albumin-bilirubin (ALBI) score for liver cancer, and the CTP, modified Child-Turcotte-Pugh (mCTP), MELD, MELD-Na, updated MELD (UpMELD), and MELD excluding the international normalized ratio (MELD-XI) scores for end-stage liver diseases.
Model performance
Among the 52 selected studies, 50 evaluated model predictive performance. Forty-six studies reported the AUROC, four studies reported the C-Index, and only five studies reported the Hosmer-Lemeshow test to assess model calibration.
Table 3 presents the discriminative performance of each CPM. The AUROC of all CPMs varied between 0.521 and 0.970, the sensitivity between 34% and 100%, and the specificity between 2.60% and 93.31%. The AUROC of 31 CPMs specific for HBV-ACLF ranged from 0.63 to 0.97, the sensitivity from 44.44% to 92.6%, and the specificity from 42.3% to 95.31%. As shown in Table 3, the MELD score was the most widely used CPM (44 studies), followed by the MELD-Na score (21 studies) and the CTP score (19 studies). The discriminative capacity of MELD varied widely among different studies, as indicated by the AUROC (between 0.58 and 0.94), sensitivity (between 43.70% and 100%), specificity (between 63.8% and 90.2%), and optimal cut-off point (between 21 and 32 points). Likewise, a large variation in predictive performance was seen in the MELD-Na score [AUROC (between 0.563 and 0.922), sensitivity (between 41.90% and 86.4%), specificity (between 61.9% and 86.7%), and optimal cut-off point (between 22.35 and 34.28)] and the CTP score [AUROC (between 0.553 and 0.878), sensitivity (between 34% and 99.35%), specificity (between 39.71% and 84%), and optimal cut-off point (between 9 and 12.5 points)].
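Sensitivity, specificity, and the optimal cut-off point are linked: each cut-off on a score yields one sensitivity/specificity pair. The sketch below shows one common way to pick a cut-off, namely maximizing Youden's index (sensitivity + specificity − 1). This is an assumption for illustration; the reviewed studies may have derived their cut-offs differently. The function names and the example scores are hypothetical.

```python
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[float], died: Sequence[int], cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff predicts death."""
    tp = sum(1 for s, d in zip(scores, died) if d == 1 and s >= cutoff)
    fn = sum(1 for s, d in zip(scores, died) if d == 1 and s < cutoff)
    tn = sum(1 for s, d in zip(scores, died) if d == 0 and s < cutoff)
    fp = sum(1 for s, d in zip(scores, died) if d == 0 and s >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one death and one survivor")
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores: Sequence[float], died: Sequence[int]) -> float:
    """Observed score value that maximizes sensitivity + specificity - 1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sens_spec(scores, died, c)) - 1.0)


# Hypothetical MELD-like scores and outcomes (1 = death), for illustration only.
example_scores = [18, 22, 25, 30, 33]
example_died = [0, 0, 1, 1, 1]
example_best = youden_cutoff(example_scores, example_died)
```

Because the chosen cut-off depends on the score distribution of each cohort, differences in case mix alone can shift the reported optimal cut-off between studies.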
Table 3. Discriminative performance of CPMs
Model | AUROC/C-Index | Sensitivity | Specificity | Cut-off | References† |
---|
MELD | 0.58–0.94 | 43.70–100% | 63.8–90.2% | 21–32 | [3–6,8–10,12,13,15–46,51] |
Ke’s model | NA | NA | NA | NA | [1] |
KCC | 0.642–0.783 | 41–59% | 2.6–87.7% | 0–0.5 | [32,36] |
CTP | 0.553–0.878 | 34–99.35% | 39.71–84% | 9–12.5 | [4,8–10,16–18,20,23,24,29,32,36,42,45–48] |
MELD-Na | 0.563–0.922 | 41.9–86.4% | 61.9–86.7% | 22.35–34.28 | [5,13,14,16–18,20,22,24–29,34,37,39,46,47,49,52] |
Li’s model | 0.953 | 97% | 82% | 9.5 | [2] |
Sun’s model | 0.647–0.891 | 68.6–72.3% | 52.1–52.5% | −2.554 | [3,4,13] |
Zhang’s model (LRM) | 0.68–0.914 | 64–92.6% | 42.3–95.1% | –0.3264–0.5176 | [3,4,8,13,30,36,41] |
MELD-Na | 0.521–0.886 | 41.9–78.21% | 50.5–90.16% | 25.6–32 | [10,12,13,14,28,36,42,49,50] |
He’s model | 0.85±0.03 | NA | NA | NA | [5] |
iMELD | 0.540–0.864 | 54.7–89.58% | 56.16–85% | 34.705–52 | [5,10,13,14,17,28,31,36,37,39,42] |
MESO | 0.571–0.905 | 38.7–80.77% | 75.25–91.80% | 1.986–21.61 | [5,10,13,28,42] |
TPPM | 0.786–0.970 | 84.09–89.6% | 61.54–94.7% | 0.22 | [6,25,38] |
Zheng’s model | 0.900–0.970 | NA | NA | NA | [7] |
UpMELD | 0.687 | 44.7% | 87.2% | 5.5 | [39] |
MELD-XI | 0.647 | 55.3% | 71.8% | 20.5 | [39] |
UKMELD | 0.766 | 57.6% | 81.6% | 45.5 | [39] |
ALPH-Q | 0.837–0.896 | 78–78.7% | 85.1% | 6.778 | [8] |
Yan’s model | 0.853–0.867 | 72–76% | 84.8–89.2% | 4.66 | [9] |
SOFA | 0.705–0.751 | 54.2–60% | 80.4–84.7% | 6.5 | [9,16] |
CLIF-SOFA | 0.711–0.876 | 54.3–80.14% | 64.56–91.1% | 7–8.5 | [9,16,23,44,50] |
Yi’s model | 0.930±0.016 | NA | NA | NA | [10] |
iMELD-C | 0.776–0.862 | 69.23–89.58% | 78.71–80.33% | 49.306–52.157 | [10] |
LRM | 0.93 | 86% | 87.1% | 3.16 | [11] |
HBV-ACLFs | 0.704 (C-Index) | NA | NA | NA | [12] |
CLIF-C ACLFs | 0.632–0.873 | 61.86–93.65% | 63.7–78.6% | 36.78–43.76 | [12,16,23–27,29,31,44,46] |
HAM | 0.868–0.894 | 84.9–91.5% | 70.9–75% | −1.191 | [13] |
mCTP | 0.74 | 91% | 48.8% | 14 | [42] |
ALBI | 0.583–0.784 | 62.2–65.9% | 67.2–81.4% | –1.119–0.95 | [17,43,45] |
ALBI+MELD | 0.912 | 76.7% | 90.9% | NA | [43] |
Chen’s model | 0.867 | NA | NA | NA | [14] |
MELD-LAC | 0.859 | 91.5% | 80.1% | −0.4741 | [15] |
HINAT ACLF | 0.839–0.855 | 82% | 74.5% | 4.6 | [16] |
CLIF-C OF | 0.656–0.906 | 53.9–92.6% | 72.9–78.8% | 8.5–10.5 | [16,24,25,44,45,46,50] |
Lei’s model | 0.656 | 62.2% | 64.1% | NA | [17] |
Lin’s model | 0.854–0.890 | NA | NA | NA | [18] |
Shi’s model | 0.790–0.799 (C-Index) | NA | NA | NA | [19] |
Xue’s model | 0.813–0.848 | 44.44% | 93.63% | NA | [20] |
ABIC | 0.695–0.829 | 54.4–73.8% | 81.7% | 9.16–9.44 | [45,48] |
Gong’s model | 0.63–0.742 | NA | NA | NA | [21] |
Lin’s model | 0.79–0.86 | 67.3% | 91% | −0.73 | [22] |
HINT | 0.889–0.917 | 74.60–79.43% | 84.56–95.31% | −0.77 | [23] |
COSSH-ACLF | 0.718–0.898 | 54.9–89.04% | 55.56–91.78% | 3.7–6.4 | [23–27,31,50] |
CLIF AD | 0.775 | NA | NA | NA | [46] |
CTP-ABIC | 0.927 | 90% | 80.3% | 9.08 | [48] |
AARC-ACLFs | 0.790 | NA | NA | NA | [25] |
Gao’s model | 0.58–0.80 (C-Index) | NA | NA | NA | [26] |
APM | 0.747–0.790 | 73.2% | 71.5% | 2.56 | [27] |
ANN | 0.765–0.869 | NA | NA | NA | [28] |
ANN | 0.754–0.913 | NA | NA | NA | [29] |
CART | 0.896–0.905 | 69.7–85.2% | 80.1–93.5% | NA | [30] |
CART | 0.820–0.824 | 88.2–88.6% | 62.7–68.5% | NA | [31] |
In addition, we performed a pooled analysis of the diagnostic accuracy of several common CPMs. As shown by the summary receiver operating characteristic (SROC) curves in Figure 3, the overall discriminative performance of the MELD score and the chronic liver failure-sequential organ failure assessment (CLIF-SOFA) score appeared to be higher than that of the CTP score and the MELD-Na score.
Impact of ACLF severity and diagnostic criteria on model performance
To further analyze the factors contributing to the large variation in the predictive performance of a specific model among different studies, we compared the accuracy of MELD in HBV-ACLF defined by different diagnostic criteria. In APASL-defined ACLF patients, the AUROC of the MELD score was between 0.580 and 0.940, the sensitivity was between 43.7% and 88.9%, the specificity was between 67.2% and 90.2%, and the best cut-off point was between 21.57 and 29.6 points. In CMA-defined ACLF patients, the AUROC was between 0.612 and 0.906, the sensitivity was between 51% and 100%, the specificity was between 70.2% and 91.4%, and the best cut-off point was between 21 and 32 points.
Next, we assessed the relationship between the mean MELD value of patients at admission and the AUROC value of the MELD score. As shown in Figure 4, we found that the lower the mean MELD value of HBV-ACLF patients at admission, the greater the AUROC value. This suggested a negative correlation between disease severity at admission and the discriminative capacity of the MELD score.
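A study-level relationship of this kind can be quantified with a correlation coefficient. The sketch below computes Pearson's r between each study's mean MELD at admission and the AUROC it reported for MELD. It is an illustration only: the article does not state which statistic, if any, was used for Figure 4, and the paired values below are invented placeholders, not data extracted from the reviewed studies.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Placeholder study-level values for illustration only (not from the article).
mean_meld_at_admission = [21.0, 23.0, 26.0, 28.0]
reported_auroc = [0.88, 0.82, 0.71, 0.66]
r = pearson_r(mean_meld_at_admission, reported_auroc)  # negative for these placeholders
```

With so few studies, such a coefficient would be descriptive at best; a rank-based coefficient could be substituted if the relationship is not linear.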
Discussion
In this study, we systematically summarized the available clinical prediction models for HBV-ACLF and performed an extensive review of each study with regard to modeling data, modeling approach and model performance. Although the number of HBV-ACLF-specific CPMs has increased rapidly in the past 10 years, there are major concerns about the quality and reproducibility of most of them. Our analysis showed that the development of most HBV-ACLF-specific CPMs was flawed in the quality of modeling data. Most studies were retrospective in nature, recruited patients from a single center, and had limited sample sizes. The model proposed by the Chinese Group on the Study of Severe Hepatitis B (COSSH) consortium is the only CPM that was developed on the basis of national, multicenter, and prospective cohort data. Nevertheless, the COSSH HBV-ACLF model is not fully validated, as the validation cohort is single center and not from external study centers. Another frequent weakness is the absence of information on LT or inappropriate handling of LT data. Generally, LT is regarded as a competing event with death. However, a competing risk model in survival analysis has seldom been used. Few of the studies reported the indication of LT when adopting the use of a composite endpoint that combined death and LT. Either using an LT-free cohort or defining LT as a censored event may underestimate the mortality of the overall population and introduce bias in model development.
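To illustrate what treating LT as a competing event means in practice, the following minimal sketch computes a nonparametric cumulative incidence function in which death and LT are distinct event types and only true loss to follow-up is censored. It is a generic textbook estimator written for illustration, not the method of any reviewed study; the event codes and function name are assumptions.

```python
from typing import List, Sequence, Tuple

CENSORED, DEATH, TRANSPLANT = 0, 1, 2  # hypothetical event codes


def cumulative_incidence(times: Sequence[float],
                         events: Sequence[int],
                         cause: int = DEATH) -> List[Tuple[float, float]]:
    """Cumulative incidence of `cause` at each event time, with other events competing."""
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0   # probability of having had no event of any type so far
    cif = 0.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = d_any = d_cause = 0
        while i < len(data) and data[i][0] == t:   # group tied times
            n_at_t += 1
            if data[i][1] != CENSORED:
                d_any += 1
            if data[i][1] == cause:
                d_cause += 1
            i += 1
        if d_any:
            cif += event_free * d_cause / at_risk
            event_free *= 1.0 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= n_at_t
    return curve
```

In this estimator, a transplanted patient leaves the risk set without being counted as a death or as censored, so the cumulative incidences of death and of LT together never exceed 1.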
The MELD score is recognized as the mainstay for evaluating end-stage liver disease.20 It was originally developed to predict the short-term prognosis of cirrhotic patients undergoing transjugular intrahepatic portosystemic shunt (TIPS).14 The present analysis showed that MELD is the most commonly used CPM for predicting HBV-ACLF outcome. However, a large variation in the discriminative performance of MELD, as indicated by AUROC, sensitivity and specificity, was observed across studies. This variation raises the concern that the heterogeneity of the study populations may affect model performance. The population heterogeneity may be due to the use of different diagnostic criteria in various studies (Table 4). The current analysis suggests that MELD achieves comparable discriminative performance in APASL- and CMA-defined HBV-ACLF patients because both diagnostic criteria identify ACLF patients characterized by high bilirubin and coagulopathy. On the other hand, our findings reveal a wide range of AUROC values for the MELD score even among studies using the same inclusion criteria for HBV-ACLF. Even when specific criteria are used, HBV-ACLF cases represent a heterogeneous population. Defining the population is confounded by the type of precipitating events (for instance, flare-up of hepatitis, use of hepatotoxic drugs, and heavy alcohol consumption) and the severity of underlying chronic liver diseases (non-cirrhotic chronic liver disease or compensated cirrhosis).3,21,22 Our findings showed that the MELD score has higher predictive power in HBV-ACLF cohorts with a lower mean MELD at admission, and the use of MELD in those with ultra-high MELD scores achieves high predictive performance as well.23 These findings suggest that the severity of HBV-ACLF is another important confounding factor of model performance and that preferential inclusion of patients at both ends of the severity spectrum would overestimate the predictive capacity of models.
In addition, both 28-day and 90-day mortality were used as primary endpoints in different studies, thus contributing to varying degrees of predictive performance. Death events occurred frequently between 28 days and 90 days post-admission but were less frequent after 90 days in APASL-defined ACLF.24–26 The CANONIC study, which defined 28-day mortality as the primary endpoint, also reported much higher mortality at 90 days in patients with ACLF grade 1 or 2.13 Therefore, the use of 90-day mortality as the primary endpoint better fits the natural history of ACLF.
The present study identified common variables used in CPMs, in addition to the components of MELD. The presence of hepatic encephalopathy (HE) was frequently reported to be an independent variable associated with poor outcome.27 In addition, indicators of systemic inflammation, such as white blood cell (WBC) count, neutrophil percentage, and neutrophil-to-lymphocyte ratio (NLR), are common risk factors for short-term death.28 Other common variables included age, presence of ascites, serum sodium and hepatitis B e antigen presence. On the other hand, one of the MELD parameters, serum creatinine, was less frequently reported as an independent risk factor in HBV-ACLF. As a result, the overall predictive performance of MELD in HBV-ACLF is not satisfactory, and consistent with this finding, recent studies have shown limited capacity of MELD-Na in identifying ACLF patients at high risk of death on LT waiting lists.29–31 By contrast, a MELD-based scoring system that integrates HE and age outperforms the MELD score in predicting 90-day mortality of HBV-ACLF.32 In addition to the variables constituting the CPMs, model performance is determined by the weighting of specific variables. For example, although MELD does not include important criteria such as HE and ascites, the CTP score, which includes these parameters but weights each variable equally, performed less well overall than the MELD score.
In conclusion, a growing number of HBV-ACLF-specific CPMs have been developed in recent years, but most are flawed in the quality of the modeling data, the integrity of the modeling approach, or external validation. The MELD score is the most commonly used CPM, although it is not specific to HBV-ACLF. However, there is significant heterogeneity in the predictive performance of the MELD score among different studies due to the confounding effect of disease severity. Therefore, the clinical utility of CPMs in predicting the short-term prognosis of HBV-ACLF remains undefined. There is redundancy in the current HBV-ACLF CPMs, and there is an urgent need to establish high-quality prognostic models to better guide clinical practice. The development of future HBV-ACLF-specific CPMs should include the following elements to ensure the reliability of the model: (1) unified HBV-ACLF diagnostic criteria with a defined endpoint; (2) high-quality and unbiased modeling and validation data from prospective, large-sample, multicenter cohorts, as well as real-world validation; (3) selection of a small number of non-redundant and easily accessible variables for inclusion in the model via a well-adjusted process; (4) appropriate handling of events competing with death; (5) assessment of model discrimination and calibration; and (6) appropriate presentation of clinical utility.
Table 4. Similarities and differences of ACLF diagnostic criteria
| CMA | APASL | EASL-CLIF | NACSELD | COSSH |
---|
Definition | Severe liver damage caused by various insults on the basis of chronic liver disease, representing a clinical syndrome mainly manifesting as coagulopathy, jaundice, hepatic encephalopathy, ascites, etc. | Acute hepatic insult manifesting as jaundice and coagulopathy, complicated within 4 weeks by ascites and/or encephalopathy in a patient with previously diagnosed or undiagnosed chronic liver disease, and associated with high mortality. | An acute deterioration of pre-existing chronic liver disease usually related to a precipitating event and associated with increased mortality at 3 months due to multisystem organ failure. | A syndrome characterized by acute deterioration in a patient with cirrhosis due to infection presenting with two or more extrahepatic organ failures. | A complicated syndrome with a high short-term mortality rate that develops in patients with HBV-related chronic liver disease regardless of the presence of cirrhosis and is characterized by acute deterioration of liver function and hepatic and/or extrahepatic organ failure. |
Proposing time | 2006 (updated in 2014) | 2009 (updated in 2019) | 2013 | 2014 | 2017 |
Chronic liver disease | Compensated chronic liver disease | Non-cirrhotic chronic liver disease and previously compensated cirrhosis | Decompensated cirrhosis | Decompensated cirrhosis | Non-cirrhotic chronic liver disease and cirrhosis |
Acute precipitating events | Acute hepatic insults | Acute hepatic insults | Any and frequently without identifiable events | Infection | Any and frequently without identifiable events |
Etiology | All | All | All | All | HBV |
Definition of liver failure | PTA ≤40% and serum bilirubin ≥10 mg/dL or daily rise ≥1 mg/dL | INR ≥1.5 and serum bilirubin ≥5 mg/dL | Serum bilirubin ≥12 mg/dL | None | Serum bilirubin ≥12 mg/dL |
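To make the differences in the "Definition of liver failure" row of Table 4 concrete, the thresholds can be written as a simple rule check. This is an illustrative sketch only, not a diagnostic tool. The function name and parameters are hypothetical, and the CMA row is read here as "PTA ≤40% and (serum bilirubin ≥10 mg/dL or daily rise ≥1 mg/dL)", which is an assumption about how the table entry groups its conditions.

```python
from typing import Optional


def meets_liver_failure(
    criteria: str,
    bilirubin_mg_dl: float,
    inr: Optional[float] = None,
    pta_percent: Optional[float] = None,
    bilirubin_daily_rise_mg_dl: Optional[float] = None,
) -> Optional[bool]:
    """Apply the liver-failure thresholds listed in Table 4.

    Returns True or False, or None when the criteria set lists no
    liver-failure threshold (NACSELD).
    """
    name = criteria.upper()
    if name == "CMA":
        if pta_percent is None:
            raise ValueError("CMA requires prothrombin activity (PTA)")
        rising = (
            bilirubin_daily_rise_mg_dl is not None
            and bilirubin_daily_rise_mg_dl >= 1.0
        )
        # Assumed grouping: PTA <= 40% AND (bilirubin >= 10 OR daily rise >= 1).
        return pta_percent <= 40.0 and (bilirubin_mg_dl >= 10.0 or rising)
    if name == "APASL":
        if inr is None:
            raise ValueError("APASL requires INR")
        return inr >= 1.5 and bilirubin_mg_dl >= 5.0
    if name in ("EASL-CLIF", "COSSH"):
        return bilirubin_mg_dl >= 12.0
    if name == "NACSELD":
        return None  # Table 4 lists no liver-failure definition.
    raise ValueError(f"unknown criteria: {criteria}")


# Hypothetical patient: bilirubin 6 mg/dL, INR 1.6.
for criteria_name in ("APASL", "EASL-CLIF", "COSSH", "NACSELD"):
    print(criteria_name, meets_liver_failure(criteria_name, 6.0, inr=1.6))
```

The same hypothetical patient meets the APASL threshold but not the EASL-CLIF or COSSH threshold, which shows how the choice of diagnostic criteria changes which patients enter a study population.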
Supporting information
Supplementary Table 1
(A) Scoring system criteria for assessing quality of HBV-ACLF-specific models; (B) Quality score of each HBV-ACLF-specific model.
(DOCX)
Supplementary File 1
Supplementary reference list.
(DOCX)
Abbreviations
- AARC-ACLFs:
APASL ACLF research consortium-ACLF
- AASL:
American Association for the Study of Liver Failure
- ABIC:
age-bilirubin-INR-creatinine
- ACLF:
acute-on-chronic liver failure
- AFP:
alpha-fetoprotein
- ALB:
albumin
- ALBI:
albumin-bilirubin
- ALP:
alkaline phosphatase
- ANN:
artificial neural network
- APASL:
Asian Pacific Association for the Study of the Liver
- APLH-Q:
age-PT-LC-HE-QTc
- APM:
artificial liver support system prognosis model
- AUROC:
area under the receiver operating characteristic curve
- CART:
classification and regression tree
- CHE:
cholinesterase
- CLIF:
chronic liver failure
- CLIF AD:
chronic liver failure-consortium acute decompensation
- CLIF-C ACLFs:
chronic liver failure-consortium acute-on-chronic liver failure score
- CLIF-C OF:
chronic liver failure-consortium organ failure
- CLIF-SOFA:
chronic liver failure-sequential organ failure assessment
- CMA:
Chinese Medical Association
- COSSH:
Chinese Group on the Study of Severe Hepatitis B
- CPMs:
clinical prediction models
- CTP:
Child-Turcotte-Pugh
- EASL-CLIF:
European Association for the Study of the Liver–Chronic Liver Failure
- G-CSF:
granulocyte colony-stimulating factor
- GGT:
γ-glutamyltransferase
- HAM:
HBV-ACLF MELD
- HB:
hemoglobin
- HBV:
hepatitis B virus
- HBV-ACLF:
hepatitis B virus related acute-on-chronic liver failure
- HE:
hepatic encephalopathy
- HINAT ACLF:
HE-INR-NLR-age-TB ACLF
- HINT:
HE-INR-neutrophil count-thyroid stimulating hormone
- HR:
hepatorenal syndrome
- ICU:
intensive care unit
- iMELD:
integrated MELD model
- iMELD-C:
iMELD plus complications
- INR:
international normalized ratio
- KCC:
King’s College Criteria
- LAAR:
liver to abdominal area ratio
- LAC:
lactic acid
- LC:
liver cirrhosis
- LRM:
logistic regression model
- LRM-Z:
Z logistic regression model
- LT:
liver transplantation
- mCTP:
modified Child-Turcotte-Pugh
- MELD:
model for end-stage liver disease
- MELD-LAC:
MELD-lactate
- MELD-Na:
MELD-sodium
- MELD-XI:
MELD excluding the international normalized ratio
- MESO:
model for end-stage liver disease score to serum sodium ratio index
- NACSELD:
North American Consortium for the Study of End-Stage Liver Disease
- NLR:
neutrophil–lymphocyte ratio
- PLT:
platelet
- PT:
prothrombin time
- PTA:
prothrombin activity
- QTc:
QT interval corrected for heart rate
- SOFA:
sequential organ failure assessment
- SROC:
summary receiver operating characteristic curve
- TB:
total bilirubin
- TBA:
total bile acid
- TPPM:
Tongji prognostic predictor model
- TSH:
thyroid-stimulating hormone
- UpMELD:
updated MELD
- UKMELD:
United Kingdom MELD
- WBC:
white blood cells
Declarations
Data sharing statement
No additional data are available.
Funding
This work was supported by grants from the Chinese National Natural Science Foundation (Nos. 81670567 and 81870425) and the Fundamental Research Funds for the Central Universities.
Conflict of interest
The authors have no conflicts of interest related to this publication.
Authors’ contributions
Conceptualization of the idea and design of the study (JS, YS), drafting of the manuscript (XY, YL), revision of the manuscript (YS, JS), and search and selection, data extraction, analysis, and interpretation (XY, YL, HT, XX, KG, JY). All authors read and approved the final manuscript.