Introduction
Severe acute liver injury (SLI) represents a critical stage in the progression of acute liver disease. It is generally defined as acute liver injury characterized by a marked elevation of serum transaminases accompanied by severe coagulopathy (international normalized ratio [INR] ≥ 1.5) in patients without pre-existing chronic liver disease.1–3 The majority of patients with SLI may recover with appropriate treatment.4,5 However, once SLI progresses to hepatic encephalopathy (HE), the disease enters the stage of acute liver failure (ALF), which is associated with an extremely high short-term mortality.6,7 Consequently, early identification of high-risk patients during the SLI stage is of comparable clinical importance to prognostic assessment after the onset of ALF.
Currently, the main prognostic assessment tools for patients with ALF in clinical practice include the Acute Liver Failure Study Group Prognostic Index (ALFSG-PI), King’s College Criteria (KCC), and the Model for End-Stage Liver Disease (MELD) score.8 Although no single biomarker has consistently outperformed prognostic models such as the KCC and MELD score,9 all of these models have clinical limitations. The MELD score was not originally developed for ALF and does not incorporate HE, a key determinant of prognosis.10 The KCC, while highly specific in non–acetaminophen-related ALF, suffers from low sensitivity, potentially delaying liver transplantation in high-risk patients.10–12 Although the ALFSG-PI demonstrates good predictive performance,13 its complexity and reliance on multiple variables limit its applicability in emergency settings and primary care institutions. Moreover, most existing prognostic models14–17 were developed exclusively for ALF and are not designed to identify risk factors at the earlier SLI stage, during which timely clinical intervention may alter the disease trajectory. Given the dynamic progression from SLI to ALF after hospital presentation, we enrolled patients with SLI, including those who fulfilled diagnostic criteria for ALF, to develop an integrated and simplified prognostic model applicable across both stages. This study aimed to establish a practical tool based on readily available clinical parameters to facilitate early risk stratification, dynamic prognostic assessment, and timely decision-making regarding intensive management and liver transplantation.
Methods
Patients and data collection
This study was conducted as a retrospective investigation divided into two phases. In the first phase (training cohort), we consecutively enrolled patients admitted to our center between July 1, 2020, and May 31, 2025. Included patients had no known history of chronic liver disease and presented with a prothrombin activity (PTA) ≤ 40% or an INR ≥ 1.5, with or without HE. The second phase (validation cohort) included patients who presented to our center between June 1, 2025, and December 25, 2025, and met the same inclusion criteria in order to further validate the predictive performance of the model.
The exclusion criteria were as follows: (1) malignancy; (2) human immunodeficiency virus infection; (3) hematologic disorders affecting coagulation or active bleeding; (4) concomitant end-stage chronic pulmonary, cardiac, or renal disease; (5) pregnancy in female patients; and (6) incomplete baseline clinical and laboratory data. The diagnosis of ALF was established according to the criteria proposed by the American College of Gastroenterology (ACG) in 202318 and the 2024 Chinese Medical Association (CMA) guidelines for liver failure.19
Baseline data on admission were extracted from the electronic medical record system, including demographic characteristics (age and sex), etiology, HE grade assessed using the West Haven criteria,20 and relevant laboratory parameters. The use of vasoactive agents was recorded. The etiology of ALF was determined based on a comprehensive evaluation of medical history, incorporating laboratory findings, imaging studies, and histopathological examination (if available). Cases with an unidentified etiology were classified as having unknown etiology. Patients were followed until recovery, death, liver transplantation, or 90 days after admission, whichever came first. The MELD score was calculated as follows:21 3.78 × ln (bilirubin [mg/dL]) + 11.2 × ln (INR) + 9.57 × ln (creatinine [mg/dL]) + 6.43 (constant for liver disease etiology). The KCC22 and ALFSG-PI13 were calculated according to previously published methods.
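The MELD formula above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function names and the example inputs are hypothetical, and the creatinine conversion factor (88.4 µmol/L per mg/dL) is the standard unit conversion rather than something stated in the Methods. Some MELD implementations raise values below 1.0 to 1.0 before taking logarithms; the Methods do not mention such bounds, so the sketch applies none.

```python
import math

# Standard unit conversion for creatinine (assumption: not stated in the Methods).
UMOL_L_PER_MG_DL_CREATININE = 88.4


def creatinine_umol_to_mgdl(creatinine_umol_l):
    """Convert serum creatinine from µmol/L to mg/dL."""
    return creatinine_umol_l / UMOL_L_PER_MG_DL_CREATININE


def meld_score(bilirubin_mgdl, inr, creatinine_mgdl):
    """MELD as written in the Methods; all inputs must be positive."""
    return (
        3.78 * math.log(bilirubin_mgdl)
        + 11.2 * math.log(inr)
        + 9.57 * math.log(creatinine_mgdl)
        + 6.43
    )


# Illustrative inputs only, not patient data.
example_meld = meld_score(
    bilirubin_mgdl=18.4,
    inr=2.24,
    creatinine_mgdl=creatinine_umol_to_mgdl(57.0),
)
```

Because creatinine is reported in µmol/L in Table 1 while the formula expects mg/dL, the conversion step is kept explicit rather than hidden inside the score function.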
This study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of Beijing YouAn Hospital, Capital Medical University (No. 2020-079). Written informed consent was obtained from all patients.
Statistical analysis
All statistical analyses were performed using R software (version 4.2.2). Continuous variables were presented as mean ± standard deviation or median (interquartile range), as appropriate, and were compared using the independent-samples t-test or the Mann–Whitney U-test. Categorical variables were expressed as counts (percentages) and were compared using the chi-square test or Fisher’s exact test. The 90-day transplant-free survival (90d-TFS) was set as the primary endpoint. In the training set, variables associated with 90d-TFS in univariate Cox proportional hazards regression analysis (P < 0.05) were entered into a multivariate Cox regression model using a forward stepwise approach. Hazard ratios (HRs) and 95% confidence intervals (CIs) were calculated. Optimal cutoff values for INR and platelet count (PLT) were determined using X-tile software (Version 3.6.1, Yale University, New Haven, CT, USA). This software systematically evaluated all potential tertile split points and assessed survival differences among subgroups using the log-rank test. The cutoff value corresponding to the maximum χ² value (minimum P-value) was selected as the optimal threshold, thereby achieving optimized grouping of continuous variables based on the 90d-TFS outcomes. After grouping, survival curves were generated using the Kaplan–Meier method, and the statistical significance of survival differences between groups was verified by the log-rank test. Model discrimination was assessed by the area under the receiver operating characteristic curve (AUC), and comparisons between AUCs were performed using the DeLong test. A two-sided P < 0.05 was considered statistically significant.
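To make the discrimination measure concrete, the AUC for a binary outcome can be computed as the fraction of (event, non-event) patient pairs in which the patient with the event has the higher score, with ties counted as one half. The sketch below is an illustration in Python rather than the R workflow used in the study; the function name and the toy data are hypothetical, and the DeLong comparison is not implemented here.

```python
def auc_by_pairs(scores, events):
    """AUC as the share of (event, non-event) pairs ranked correctly.

    scores: risk scores, higher meaning higher predicted risk.
    events: 1 if the adverse outcome occurred, 0 otherwise.
    Ties between an event and a non-event score count as 0.5.
    """
    event_scores = [s for s, e in zip(scores, events) if e]
    other_scores = [s for s, e in zip(scores, events) if not e]
    if not event_scores or not other_scores:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for s_event in event_scores:
        for s_other in other_scores:
            if s_event > s_other:
                concordant += 1.0
            elif s_event == s_other:
                concordant += 0.5
    return concordant / (len(event_scores) * len(other_scores))


# Toy data for illustration only, not taken from the cohort.
toy_scores = [3, 4, 5, 7, 9, 6]
toy_events = [0, 0, 1, 1, 1, 0]
toy_auc = auc_by_pairs(toy_scores, toy_events)
```

In the toy data, eight of the nine event/non-event pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.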
Results
Baseline clinical characteristics
The training cohort comprised 302 patients with SLI/ALF, and the validation cohort included 38 patients. Baseline clinical characteristics of the two cohorts are shown in Table 1. In the training cohort, the mean age was 50.15 ± 17.22 years; 128 patients (42.4%) were male and 174 (57.6%) were female. On admission, 96 patients (31.8%) met the ACG criteria for ALF, and 74 (24.5%) met the CMA criteria. During the 90-day follow-up period, 190 patients (62.9%) achieved transplant-free survival, whereas 112 patients (37.1%) died or underwent liver transplantation. Comparisons of clinical characteristics between patients stratified by 90-day outcomes (transplant-free survival [TFS] vs. death or liver transplantation) in the training cohort are presented in Table 1. There were no statistically significant differences in age or gender distribution between the two groups (P > 0.05). Compared with the TFS group, the death/transplantation group had significantly higher levels of total bilirubin (TBIL), INR, creatinine (CR), and white blood cell count (WBC), whereas alanine aminotransferase (ALT), albumin (ALB), PTA, hemoglobin (HB), and PLT levels were significantly lower (all P < 0.05). With respect to prognostic indicators, the MELD score and the proportion of patients meeting the KCC were significantly higher in the death/transplantation group than in the TFS group (both P < 0.001), whereas the ALFSG-PI score was significantly lower (P < 0.001). In addition, the proportion of patients with HE grade ≥ 2 was markedly higher in the death/transplantation group than in the TFS group (57.1% vs. 8.4%, P < 0.001).
Table 1. Baseline clinical features of patients with SLI/ALF
| Parameter | Training cohort: Total (n = 302) | Training cohort: TFS (n = 190) | Training cohort: Death/LT (n = 112) | P-value | Validation cohort (n = 38) |
|---|---|---|---|---|---|
| Gender(M/F) | 128 (42.4%)/174 (57.6%) | 85 (44.7%)/105 (55.3%) | 43 (38.4%)/69 (61.6%) | 0.281 | 16 (42.1%)/22 (57.9%) |
| Age(years) | 50.15 ± 17.22 | 50.42 ± 18.42 | 49.70 ± 15.03 | 0.596 | 48.13 ± 15.64 |
| ALT(U/L) | 558.50 (194.00–1,612.75) | 788.50 (236.75–2,088.00) | 436.00 (155.25–1,039.25) | 0.006 | 787.50 (282.25–3,055.50) |
| AST(U/L) | 512.00 (194.25–1,252.25) | 539.50 (250.75–1,359.75) | 386.50 (174.00–1,024.50) | 0.073 | 586.00 (197.5–1,955.00) |
| TBIL (mg/dL) | 18.40 ± 8.99 | 17.28 ± 9.06 | 20.29 ± 8.59 | 0.003 | 15.53 ± 7.92 |
| DBIL (mg/dL) | 12.80 ± 6.46 | 12.66 ± 6.64 | 13.03 ± 6.18 | 0.502 | 10.63 ± 6.03 |
| ALB(g/L) | 32.18 ± 5.40 | 33.01 ± 5.17 | 30.78 ± 5.51 | 0.001 | 32.40 ± 5.46 |
| CR (µmol/L) | 57.00 (44.00–76.00) | 56.00 (44.00–69.00) | 60.0 (45.25–96.50) | 0.032 | 56.50 (45.50–78.00) |
| WBC (×10⁹/L) | 7.69 (5.68–10.47) | 7.05 (5.46–9.48) | 8.79 (6.34–11.38) | 0.003 | 8.36 (6.00–13.49) |
| HB (g/L) | 126.00 (108.00–140.00) | 130.00 (115.00–142.25) | 114.00 (102.00–134.75) | <0.001 | 124.50 (96.50–146.50) |
| PLT (×10⁹/L) | 141.5 (93.25–203.25) | 161.00 (110.75–214.25) | 111.00 (55.75–164.75) | <0.001 | 148.50 (106.00–187.00) |
| PTA (%) | 31.75 (20.00–40.00) | 36.05 (28.00–43.00) | 21.00 (13.32–33.00) | <0.001 | 26.50 (15.75–42.00) |
| INR | 2.24 (1.81–3.34) | 2.04 (1.73–2.58) | 3.23 (2.20–4.84) | <0.001 | 2.60 (1.86–4.04) |
| Vasopressor Y/N | 44 (14.6%)/258 (85.4%) | 9 (4.7%)/181 (95.3%) | 35 (31.2%)/77 (68.8%) | <0.001 | 8 (21.1%)/30 (78.9%) |
| MELD | 22.70 (18.87–28.24) | 20.61 (17.68–24.97) | 27.83 (22.72–35.81) | <0.001 | 23.01 (19.03–29.85) |
| KCC Y/N | 163 (54.0%)/139 (46.0%) | 72 (37.9%)/118 (62.1%) | 91 (81.2%)/21 (18.8%) | <0.001 | 20 (52.6%)/18 (47.4%) |
| ALFSG-PI | 0.38 (0.18–0.48) | 0.43 (0.32–0.54) | 0.16 (0.05–0.34) | <0.001 | 0.34 (0.19–0.47) |
| ALF (ACG criteria) Y/N | 96 (31.8%)/206 (68.2%) | 27 (14.2%)/163 (85.8%) | 69 (61.6%)/43 (38.4%) | <0.001 | 16 (42.1%)/22 (57.9%) |
| ALF (CMA criteria) Y/N | 74 (24.5%)/228 (75.5%) | 14 (7.4%)/176 (92.6%) | 60 (53.6%)/52 (46.4%) | <0.001 | 12 (31.6%)/26 (68.4%) |
| HE grade | | | | <0.001 | |
| Non-HE | 206 (68.2%) | 163 (85.8%) | 43 (38.4%) | | 22 (57.9%) |
| Grade 1 | 16 (5.3%) | 11 (5.8%) | 5 (4.5%) | | 2 (5.3%) |
| Grade 2 | 28 (9.3%) | 7 (3.7%) | 21 (18.8%) | | 5 (13.2%) |
| Grade 3 | 17 (5.6%) | 4 (2.1%) | 13 (11.6%) | | 4 (10.5%) |
| Grade 4 | 35 (11.6%) | 5 (2.6%) | 30 (26.8%) | | 5 (13.2%) |
Etiological distribution
The etiologies of SLI/ALF were classified into six categories. Drug-induced liver injury/failure was the most common cause, accounting for 109 cases (36.1%), of which 57% were attributed to traditional Chinese medicine. This was followed by unknown etiology (105 cases, 34.8%) and viral hepatitis (79 cases, 26.2%), with hepatitis E virus infection (40 cases) and hepatitis B virus infection (31 cases) being the predominant viral etiologies. Autoimmune, alcoholic, and inherited metabolic liver diseases together constituted only 2.9% of cases (n = 9), with detailed breakdowns presented in Supplementary Table 1.
Prognostic factor analysis and construction of the HIP model
Univariate Cox proportional hazards regression analysis identified several variables that were significantly associated with 90d-TFS, including TBIL, ALB, CR, WBC, HB, PLT, PTA, INR, and the presence of HE grade ≥ 2 (all P < 0.05). Variables with P < 0.05 in the univariate analysis were subsequently entered into a multivariate Cox regression model.
Multivariate analysis demonstrated that ALB (HR = 0.953, 95% CI: 0.920–0.988), PLT (HR = 0.995, 95% CI: 0.993–0.998), INR (HR = 1.118, 95% CI: 1.050–1.191), and HE grade ≥ 2 (HR = 5.187, 95% CI: 3.403–7.907) were independent prognostic factors. Detailed results are shown in Table 2. Although ALB remained statistically significant in the multivariate analysis, it was not included in the final model because it was readily influenced by exogenous ALB supplementation and did not significantly improve the overall model performance.
Table 2. Univariate and multivariate Cox regression analysis of 90d-TFS in patients with SLI/ALF
| Parameters | Univariable HR (95% CI) | Univariable P-value | Multivariable HR (95% CI) | Multivariable P-value |
|---|---|---|---|---|
| ALT(U/L) | 1.000 (1.000–1.000) | 0.216 | | |
| TBIL (mg/dL) | 1.025 (1.005–1.045) | 0.016 | | |
| ALB(g/L) | 0.942 (0.910–0.976) | 0.001 | 0.953 (0.920–0.988) | 0.008 |
| CR (µmol/L) | 1.002 (1.000–1.003) | 0.045 | | |
| WBC (×10⁹/L) | 1.060 (1.030–1.096) | <0.001 | | |
| HB (g/L) | 0.990 (0.983–0.998) | 0.011 | | |
| PLT (×10⁹/L) | 0.993 (0.990–0.996) | <0.001 | 0.995 (0.993–0.998) | <0.001 |
| PTA (%) | 0.939 (0.921–0.950) | <0.001 | | |
| INR | 1.053 (1.035–1.071) | <0.001 | 1.118 (1.050–1.191) | <0.001 |
| HE grade ≥ 2 | 7.836 (5.345–11.489) | <0.001 | 5.187 (3.403–7.907) | <0.001 |
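The HRs for continuous variables in Table 2 are expressed per one unit of the predictor, which can make small-looking values such as 0.995 for PLT hard to interpret. Under the log-linear assumption of the Cox model, the HR for a change of several units is the per-unit HR raised to the size of that change. The Python sketch below illustrates this arithmetic; the function name is hypothetical, and reading the HRs as hazards of death or transplantation is inferred from their direction rather than stated explicitly.

```python
import math


def hazard_ratio_for_change(hr_per_unit, delta_units):
    """Scale a per-unit Cox HR to a change of delta_units.

    Assumes the predictor enters the Cox model linearly on the log-hazard
    scale, so log(HR) scales in proportion to the change.
    """
    return math.exp(math.log(hr_per_unit) * delta_units)


# Multivariable HRs taken from Table 2.
hr_plt_50_lower = hazard_ratio_for_change(0.995, -50)  # PLT lower by 50 ×10⁹/L
hr_inr_2_higher = hazard_ratio_for_change(1.118, 2)    # INR higher by 2 units
```

With these inputs, a platelet count lower by 50 × 10⁹/L corresponds to a hazard roughly 1.28 times higher, and an INR higher by 2 units to a hazard roughly 1.25 times higher, each with the other covariates held fixed.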
Based on the three independent predictors (PLT, INR, and HE), a prognostic model, designated the HIP model (named for the initials of HE, INR, and PLT), was established. The detailed scoring algorithm for this model is presented in Table 3. According to clinical stratification and model fitting, patients with HE grade < 2 and those with HE grade ≥ 2 differed significantly in prognosis and in response to treatment. Therefore, HE was scored on two levels (1 point or 3 points), with no intermediate 2-point level, to improve prognostic discrimination. The total score of the HIP model ranged from 3 to 9 points, with higher scores indicating a lower likelihood of 90d-TFS.
Table 3. Risk factor scoring assignment
| Risk factor | Categories | Points |
|---|---|---|
| INR | <2.1 | 1 |
| | 2.1–3.2 | 2 |
| | ≥3.2 | 3 |
| PLT (×10⁹/L) | ≥200 | 1 |
| | 80–200 | 2 |
| | <80 | 3 |
| HE grade | <2 | 1 |
| | ≥2 | 3 |
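The scoring in Table 3 can be written as a short function. The following Python sketch is an illustration of the table, not the authors' implementation: the function names are hypothetical, and coding the absence of HE as grade 0 is an assumption. Boundary values follow the table as printed, so an INR of exactly 3.2 scores 3 points, a PLT of exactly 200 × 10⁹/L scores 1 point, and a PLT of exactly 80 × 10⁹/L scores 2 points.

```python
def hip_score(inr, plt, he_grade):
    """Sum the Table 3 points for INR, PLT (×10⁹/L), and West Haven HE grade."""
    if inr < 2.1:
        inr_points = 1
    elif inr < 3.2:
        inr_points = 2
    else:
        inr_points = 3

    if plt >= 200:
        plt_points = 1
    elif plt >= 80:
        plt_points = 2
    else:
        plt_points = 3

    # Assumption: patients without HE are coded as grade 0.
    he_points = 3 if he_grade >= 2 else 1

    return inr_points + plt_points + he_points


def hip_risk_band(score):
    """Map a total HIP score to the bands reported in the Results."""
    if not 3 <= score <= 9:
        raise ValueError("HIP score must be between 3 and 9")
    if score <= 4:
        return "3-4"
    if score <= 6:
        return "5-6"
    return "7-9"
```

For example, a patient with an INR of 2.5, a PLT of 80 × 10⁹/L, and no HE would score 2 + 2 + 1 = 5 points and fall in the 5–6 band.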
Performance of the HIP model
Kaplan–Meier survival analyses demonstrated significant differences in 90d-TFS among patient subgroups stratified by PLT, INR, and HE. When stratified by HE severity using grade 2 as the cutoff, patients with HE grade < 2 had a 90d-TFS of 78.38%, whereas those with HE grade ≥ 2 had a markedly lower 90d-TFS of 20.00% (P < 0.001; Fig. 1A). For stratification by INR using cutoff values of 2.1 and 3.2, the 90d-TFS rates were 82.31% (INR < 2.1), 64.84% (INR 2.1–3.2), and 29.63% (INR ≥ 3.2) (all P < 0.001; Fig. 1B). Correspondingly, PLT stratification (cutoffs: 80 × 10⁹/L, 200 × 10⁹/L) showed a positive correlation between PLT levels and 90d-TFS, with rates of 79.27% (PLT ≥ 200 × 10⁹/L), 65.03% (PLT 80–200 × 10⁹/L), and 33.33% (PLT < 80 × 10⁹/L) across subgroups (P < 0.001; Fig. 1C).
Furthermore, stratification according to the HIP model showed that patients with scores of 3–4, 5–6, and 7–9 had 90d-TFS rates of 84.78%, 67.50%, and 22.62%, respectively (all P < 0.001; Fig. 1D). Receiver operating characteristic curve analyses for the prediction of 90d-TFS demonstrated that, in the overall study population, the HIP model achieved an AUC of 0.82, which was significantly higher than that of the MELD score (AUC: 0.76, P = 0.019) and the KCC (AUC: 0.72, P = 0.002), and comparable to that of the ALFSG-PI (AUC: 0.80, P = 0.429) (Fig. 2A).
For patients who met the ACG criteria for ALF, the AUC of the HIP model (0.80) remained significantly superior to that of the MELD score (0.67, P = 0.021) and the KCC (0.45, P < 0.001). By contrast, no statistically significant difference was observed between the HIP model and the ALFSG-PI (AUC: 0.72, P = 0.115) in this cohort (Fig. 2B). For patients who met the CMA criteria for ALF, the HIP model yielded an AUC of 0.76, which did not differ significantly from that of the MELD score (AUC: 0.71, P = 0.351) or the ALFSG-PI (AUC: 0.72, P = 0.584), but was significantly higher than that of the KCC (AUC: 0.45, P < 0.001) (Fig. 2C). To further evaluate the predictive performance of the HIP model across different etiologies, subgroup analyses were conducted in patients with drug-induced and viral hepatitis-induced SLI/ALF. The HIP model demonstrated consistent and robust predictive power for 90d-TFS in both subgroups, with an AUC of 0.81 for drug-induced etiology (P > 0.05; Fig. 2D) and 0.88 for viral hepatitis-induced etiology (P < 0.05; Fig. 2E). These findings suggest that the predictive power of the HIP model remains stable irrespective of the underlying etiology. The predictive performance of the above models was further validated in an independent validation cohort, and the results demonstrated that the HIP model maintained robust efficacy, with an AUC of 0.85 (Fig. 2F).
Discussion
Based on clinical data from patients with SLI/ALF, we constructed a simplified prognostic scoring system—the HIP model—incorporating three readily accessible parameters: INR, PLT, and HE grade. The HIP model demonstrated good discriminative ability for predicting 90d-TFS, with an AUC of 0.82. Its predictive performance was significantly superior to that of the MELD score and KCC, and comparable to the more complex ALFSG-PI model. Importantly, the HIP model maintained consistent prognostic validity across ALF subgroups stratified by diagnostic criteria, supporting its generalizability in diverse clinical settings.
The severity of HE is increasingly recognized as a critical prognostic determinant in ALF,23,24 a finding further corroborated by the results of our study. In the present cohort, patients with HE grade ≥ 2 exhibited a substantially lower 90d-TFS rate (20.00%) compared to those with HE grade < 2 (78.38%), reflecting a marked escalation in short-term mortality risk beyond grade 1 HE. Multivariate Cox regression identified HE grade ≥ 2 as an independent predictor of 90-day mortality in SLI/ALF patients, aligning with prior reports25,26 and reinforcing its prognostic significance.
Current guidelines from the ACG and the European Association for the Study of the Liver define ALF by the presence of any degree of HE in severe liver injury.2,18 However, grade 1 HE, as per the West Haven criteria, is often subjective with considerable interobserver variability27 and is associated with a significantly lower mortality risk compared to higher grades.28 Consequently, the inclusion of all grade 1 HE cases within the ALF definition may lead to overestimation of disease severity and potential overtreatment. In contrast, Chinese liver failure guidelines specify HE grade ≥ 2 for the diagnosis of ALF,19 a threshold that offers improved risk stratification. HE grade ≥ 2 is characterized by more definitive neurological impairment and correlates strongly with intracranial hypertension and elevated short-term mortality.29,30 Importantly, it has been identified as the leading independent predictor of mortality risk in hospitalized cirrhotic patients.31 Our results support the clinical rationale for this diagnostic cutoff. Thus, integrating HE grade ≥ 2 into the HIP model not only enhances its statistical reliability but also improves its clinical utility by enabling more precise identification of high-risk patients requiring urgent liver transplantation assessment.
In addition to HE, thrombocytopenia emerged as an independent predictor of adverse 90-day outcomes in SLI/ALF patients, highlighting its potential as an early risk stratification marker. This finding is consistent with previous work by Mu et al.32 in viral hepatitis-induced ALF, which demonstrated an association between low PLTs and increased short-term mortality. Furthermore, a large-scale retrospective analysis of 1,598 ALF patients linked thrombocytopenia to the development of multiple organ failure and poorer clinical outcomes.33 The pathophysiological mechanisms underlying thrombocytopenia in acute liver injury are multifaceted, potentially involving systemic inflammation, endothelial dysfunction, and complex interplay between coagulation and inflammatory pathways.34,35 Further studies incorporating inflammatory and coagulation-related biomarkers are warranted to elucidate the dynamic role of platelets in disease progression and their potential as therapeutic targets.
Elevated INR directly reflects impaired hepatic synthetic function and represents a well-established indicator of hepatocellular injury severity. Accordingly, INR constitutes a key component of several prognostic models, including the KCC and ALFSG-PI.13,22 Compared with the MELD score and the KCC, the HIP model exhibited superior clinical applicability in the SLI/ALF context. The MELD score,36 originally developed for end-stage liver disease, does not incorporate HE, a pivotal prognostic variable in ALF, as a predictor, which limits its utility in this acute setting.10,37 Although the KCC38 was developed specifically for ALF, its derivation primarily from acetaminophen overdose cohorts in Western populations restricts its generalizability to etiologically distinct populations, and its performance in non-acetaminophen ALF remains inconsistent across studies.12,39 Comparative evaluations of MELD and KCC in ALF have yielded conflicting results,40–42 with MELD demonstrating limited prognostic accuracy in viral hepatitis-induced ALF, while KCC shows suboptimal sensitivity and unsatisfactory predictive performance.42 A recent meta-analysis further suggests that the performance of both scores may be etiology-dependent, with KCC favoring acetaminophen-related cases and MELD performing better in non-acetaminophen ALF.11
The ALFSG-PI model43 showed commendable predictive performance in our cohort and shares key predictors, including HE and INR, with the HIP model. However, its reliance on multiple variables and its computational complexity may restrict its routine application in time-sensitive or resource-limited settings. In contrast, the HIP model, with only three simple parameters, exhibited robust predictive ability in SLI patients, performing comparably to ALFSG-PI and significantly better than MELD and KCC.
Early risk stratification by the HIP model provides valuable guidance for clinical management. Patients with a HIP score of 3–4 can be managed primarily with conservative medical treatment. For those with a score of 5–6, conservative medical management should be combined with dynamic clinical evaluation and consideration of liver transplantation if necessary. In patients with a score of 7–9, listing for urgent liver transplantation is strongly recommended in addition to standard medical care. Throughout the clinical course, repeated assessment, score monitoring, and prompt adjustment of therapeutic strategies are essential.
Although the HIP model performed well in predicting 90d-TFS in patients with SLI/ALF, several limitations should be acknowledged. The primary limitation of this study is the relatively small sample size of the validation cohort, which underscores the need for further external validation in larger, more ethnically diverse patient populations. Additionally, the model was developed in a Chinese cohort, whose etiological profile differs from that of Western populations, where acetaminophen overdose is the predominant cause. This discrepancy may limit the generalizability and applicability of the HIP model to other geographic regions. Therefore, future multicenter, prospective studies encompassing diverse geographic regions and etiological backgrounds are warranted to confirm the model’s broader applicability and clinical utility.