Introduction
Oxaliplatin (OXA) is a third-generation platinum-based antitumor drug with a broad antitumor spectrum. It is often used in combination with other antitumor agents such as 5-fluorouracil and irinotecan, and is recommended by the National Comprehensive Cancer Network Clinical Practice Guidelines in Oncology for the first-line treatment of colorectal cancer, gastric cancer, and other digestive system tumors.1 With the widespread use of OXA in clinical practice, its adverse drug reactions have become increasingly prominent, and current research mainly focuses on peripheral neurotoxicity, myelosuppression, gastrointestinal reactions, and hypersensitivity.2–5 Similar findings were obtained in a multicenter post-marketing safety evaluation of OXA covering 3,687 patients, in which the effects of OXA on liver function were also found to be of particular concern.6
Since 2004, several clinical studies have reported that patients treated with OXA frequently experienced liver injury (LI), typically characterized by hepatic sinusoidal injury, splenomegaly, decreased platelet count, and noncirrhotic portal hypertension, which can progress to nodular regenerative hyperplasia with long-term treatment.7–10 LI also decreased hepatic functional reserve and aggravated the postoperative course of colorectal cancer patients after hepatectomy, and may affect intraoperative bleeding, postoperative morbidity, and overall survival.9,11,12 LI can further progress to liver fibrosis and liver cirrhosis, both of which are detrimental to patients’ health.13 The risk of OXA-induced LI (OILI) could be greatly reduced if patients at high risk of OILI were identified in advance and given appropriate preventive measures and treatment.
However, clinical studies on OILI are mainly case reports or short-term retrospective analyses with limited samples. Although there is a preliminary understanding of its clinical features and disease characteristics, studies on its prediction or risk factors are rare and mainly concentrate on examination indicators such as platelet count, hyaluronic acid in blood, spleen volume, and ATP7B polymorphism.14–17 Patient and medication characteristics remain largely unexplored, and clinical prediction tools for OILI are lacking.
In recent years, artificial neural networks (ANNs) have been increasingly used in medical research for disease classification, diagnosis, and prediction. They have the advantages of good fault tolerance, high adaptivity, self-learning, and the ability to handle high nonlinearity, and can therefore effectively model the complex relationships among factors and between factors and LI. Therefore, building on a previous safety evaluation of OXA, we explored OILI in depth.6 ANN and logistic regression (LR) models were selected to predict the risk of OILI, and the performance of the two models was evaluated and compared, with the aim of identifying patients at high risk of OILI, achieving timely intervention and appropriate management, and improving the safety of OXA administration.
Methods
Research design and data sources
This multicenter observational study was conducted in 10 tertiary hospitals in Hubei Province. The clinical data of all patients receiving OXA-based chemotherapy between May and November 2016 were prospectively registered by the central monitor method. The data included demographic information, health status, disease history, comorbidities, medication, pre- and post-chemotherapy medical examination, and adverse events. The investigators received uniform and standardized training prior to the study to ensure registration integrity and data quality. In addition, an oncologist and a clinical pharmacist were designated in each subcenter for data collection and integration. This study conformed to the Declaration of Helsinki and was approved by the Medical Ethics Committee of Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology (No. TJIRB20160504).
Case selection and identification
Patients that fit the following criteria were included: (1) with OXA-based chemotherapy regimens; (2) ≥18 years of age; and (3) with Karnofsky performance status (KPS) scores ≥70 (able to take care of themselves and above). The exclusion criteria were: (1) pretreatment diagnosis of LI or liver insufficiency; (2) liver-related diseases and severe manifestations during chemotherapy including hepatitis other than hepatitis B, nonalcoholic fatty liver, liver cancer, etc.; (3) other severe organ dysfunction; and (4) incomplete information in the medical record.
A total of 3,315 of the 3,687 cases obtained from the prospective registry met the inclusion and exclusion criteria. Of the 3,315 cases, those with liver function indicators that deviated from the normal range were judged by attending physicians as having abnormal liver function (n=186). According to the Common Terminology Criteria for Adverse Events (CTCAE) version 5.0, the study patients were suspected of having LI of grade I or above when any of the liver function indicators (aspartate aminotransferase, alanine aminotransferase, alkaline phosphatase, and total bilirubin) on medical examination were abnormal. Two clinical pharmacists used the updated Roussel Uclaf causality assessment method (RUCAM) to assess the causality between OXA and LI in suspected cases of LI.18 Cases with a total score ≥3 (possible) were considered to have OILI (n=121). The remaining cases were considered as not experiencing OILI (n=3,194). A flow chart of case selection and identification is shown in Figure 1. Of the 3,315 cases, 2,116 were men (63.83%) and 1,199 were women (36.17%), a male to female ratio of 1.76:1. They were mainly middle-aged, with ages ranging from 20 to 82 years. The baseline characteristics of the patients are listed in Supplementary Table 1.
Establishment of ANN and LR models
Because of the imbalance of the patient data, the data were divided into training set (70%) and test set (30%) by stratified randomization based on the incidence of OILI. For any of the 22 variables studied, there was no significant difference between the training and test sets (p>0.05), implying that the two data sets were well-balanced in the distribution of factors (see Supplementary Table 2).
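The stratified 70/30 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure actually run in SPSS; the function name `stratified_split`, the seed, and the rounding rule are assumptions made for the example. Only the cohort sizes (121 OILI cases, 3,194 non-cases) come from the text.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each outcome class separately."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Assumed rounding rule: nearest integer within each class.
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])
        test_idx.extend(members[cut:])
    return sorted(train_idx), sorted(test_idx)

# Cohort sizes reported in the text: 121 OILI cases and 3,194 non-cases.
labels = [1] * 121 + [0] * 3194
train_idx, test_idx = stratified_split(labels)
train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(f"train OILI rate {train_rate:.4f}, test OILI rate {test_rate:.4f}")
```

Splitting within each outcome class keeps the OILI incidence nearly identical in the two sets, which matters when the positive class is as rare as it is here.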
In this study, we constructed a three-layer ANN model that took the 22 predictive variables as inputs. The optimal number of hidden units was determined by repeated trials, and the hidden-layer activation function was the hyperbolic tangent. The output layer took the occurrence of OILI as the output unit, with softmax as its activation function. The LR model was built using the training set to predict the risk of OILI. All 22 factors were entered as candidate predictive variables and then screened by the forward conditional method, with the occurrence of OILI as the outcome variable. The omnibus and Hosmer–Lemeshow tests were used to evaluate the overall significance and the goodness-of-fit of the model, respectively.
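To make the architecture concrete, the forward pass of such a network (one hyperbolic tangent hidden layer followed by a softmax output layer) can be sketched as below. The layer sizes of 41, 8, and 2 units are those reported in the Results. The weights are random placeholders rather than the fitted SPSS weights, and the function name `forward` is hypothetical.

```python
import math
import random

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tanh hidden layer followed by a softmax output layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
n_in, n_hidden, n_out = 41, 8, 2  # unit counts reported in the Results
# Placeholder weights: random values, NOT the trained model.
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
b_out = [0.0] * n_out
x = [rng.random() for _ in range(n_in)]  # hypothetical encoded patient record
probs = forward(x, w_hidden, b_hidden, w_out, b_out)
print(probs)  # two probabilities (no OILI, OILI) that sum to 1
```

Because softmax normalizes the two outputs, the second output can be read directly as the predicted probability of OILI.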
Model performance evaluation
The developed ANN and LR models were used to make predictions for each patient in the 30% test set, and the performance of both models was evaluated on that set. The predicted results were compared with the actual results in a confusion matrix. Validity evaluation indicators, including sensitivity (true positive rate), specificity (true negative rate), and accuracy, were first computed to evaluate and compare the performance of the two predictive models. The receiver operating characteristic (ROC) curves were plotted, and the area under the ROC curve (AUC) and the integrated discrimination improvement (IDI) were calculated to assess the ability of each model to distinguish between patients with and without OILI. Calibration plots for both models were drawn to evaluate the consistency of the observed probability with the predicted probability of OILI.
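The validity indicators and the AUC described above can be computed as in the following sketch. The data are a small hypothetical example, not study data, and treating a probability of exactly 0.5 as a positive prediction is an assumption. The AUC is computed from its rank interpretation: the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) at the given classification threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0  # assumed convention: >= is positive
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def validity_metrics(tp, fp, tn, fn):
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

def auc(y_true, y_prob):
    """AUC as the fraction of case/non-case pairs ranked correctly (ties = 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.80, 0.40, 0.30, 0.60, 0.20, 0.10, 0.35, 0.05]
metrics = validity_metrics(*confusion_counts(y_true, y_prob))
auc_value = auc(y_true, y_prob)
print(metrics, auc_value)
```

In this toy example the sensitivity is only 1/3 at the 0.5 threshold while the AUC is 0.8, which illustrates how a model can rank patients well yet flag few of them when the outcome is rare.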
Statistical analysis
For continuous variables, normality was assessed with the one-sample Kolmogorov–Smirnov test. Variables with normal distributions were reported as means ± standard deviations and compared with t-tests. Those without normal distributions were reported as medians (lower quartile–upper quartile) and compared with Mann–Whitney U tests. Categorical variables were reported as frequencies and proportions and compared with Pearson’s chi-square or Fisher’s exact tests. To evaluate model performance, McNemar’s and chi-square tests were used to compare the validity indicators of the two models, and AUC and IDI were compared with the z statistic. The statistical analysis, development of the ANN and LR models, and performance evaluation were performed with SPSS software (version 25.0; IBM Corp., Armonk, NY, USA) and MedCalc (version 20.106). Because the proportion of missing data was very low, records with missing data were excluded from all analyses. A p-value <0.05 was considered significant.
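McNemar’s test compares two classifiers evaluated on the same patients by looking only at the discordant pairs, that is, patients classified correctly by one model and incorrectly by the other. The sketch below implements the exact (binomial) two-sided form. The text does not state whether the exact or the chi-square form was used, so this choice is an assumption, and the discordant counts are hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair falls either way with p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical counts: 5 cases detected only by model A, 1 only by model B.
p_value = mcnemar_exact(5, 1)
print(round(p_value, 3))
```

With so few discordant pairs the test has little power, so a non-significant result for sensitivity does not by itself show that two models are equivalent.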
Results
OILI
In this study, the incidence of OILI was 3.65% (121 of 3,315 patients). The median RUCAM score for cases of OILI was 6 (4–9). Detailed information about patients with OILI is shown in Supplementary Table 1. The characteristics of patients with OILI were compared with those of patients without OILI. There were no significant differences between the two groups in sex, height, body mass index (BMI), KPS score, history of hepatitis B, diabetes, hypertension, cerebral infarction, single dose of OXA, or use of 5-HT3 receptor antagonists and liver protective drugs. Compared with patients without OILI, those with OILI were older (64.00 vs. 57.00 years, p<0.001), weighed less (55.00 vs. 58.00 kg, p=0.023), and had higher proportions of renal calculi (3.31% vs. 0.63%, p=0.010), gastritis (4.13% vs. 1.35%, p=0.033), and duodenal ulcer (2.48% vs. 0.31%, p=0.010). The incidence of OILI differed among chemotherapy regimens. Patients with OILI had received more chemotherapy cycles (4.00 vs. 3.00, p=0.007) and higher total OXA doses (600.00 mg vs. 440.00 mg, p=0.016) than those without OILI. Notably, patients with OILI were more likely to have received prophylactic proton pump inhibitors (72.73% vs. 58.42%, p=0.002), glucocorticoid drugs (43.80% vs. 16.06%, p<0.001), and antihistamine drugs (29.75% vs. 16.34%, p<0.001). The chemotherapy regimens and types of prophylactic drugs are shown in Supplementary Table 3.
ANN model of the risk of OILI
All 22 predictive variables were included in the ANN model. The model contained an input layer, a hidden layer, and an output layer. Excluding the bias units, the model contained 41 input units, eight hidden units, and two output units. The structure of the ANN model is shown in Supplementary Figure 1. Variable importance analysis showed that age, BMI, chemotherapy regimen, weight, single dose of OXA, total dose of OXA, and height (the top seven) were relatively important variables in the ANN model, each with a normalized importance exceeding 50%. The normalized importance of the variables is shown in Figure 2. We compared the accuracy of prediction for patients with and without OILI using a cumulative gains chart, which showed that the two groups were separated to a good extent (Fig. 3).
LR model for the risk of OILI
The results of fitting the LR model to the training set are shown in Table 1. The 22 predictive variables were entered into the LR model and screened by the forward conditional method. The omnibus test of the LR model gave p<0.001, indicating that the model as a whole was significant. The goodness-of-fit of the model was assessed with the Hosmer–Lemeshow test, and the result was p=0.181, indicating that the information in the patient data was adequately extracted and that the model fit well. The structure of the LR model is shown in Supplementary Figure 2. Seven factors (age, chemotherapy regimen, number of chemotherapy cycles, single dose of OXA, total dose of OXA, glucocorticoid drugs, and antihistamine drugs) were found to be significantly associated with OILI.
Table 1. Results of the LR model (training set)
Factor | β | S.E. | Wald | DF | Sig. | Exp(β) | 95% CI for Exp(β) |
---|---|---|---|---|---|---|---|
Age | 0.076 | 0.016 | 22.105 | 1 | 0.000 | 1.079 | 1.045–1.114 |
Chemotherapy regimens^a | | | 81.109 | 6 | 0.000 | | |
XELOX | 2.135 | 0.333 | 41.211 | 1 | 0.000 | 8.453 | 4.406–16.221 |
GEMOX | 3.816 | 0.559 | 46.553 | 1 | 0.000 | 45.429 | 15.179–135.961 |
SOX | 0.892 | 0.551 | 2.623 | 1 | 0.105 | 2.440 | 0.829–7.183 |
OXA and raltitrexed | 1.615 | 0.562 | 8.272 | 1 | 0.004 | 5.029 | 1.673–15.119 |
OXA | 3.128 | 0.535 | 34.155 | 1 | 0.000 | 22.840 | 7.999–65.215 |
Other | 1.972 | 0.489 | 16.283 | 1 | 0.000 | 7.182 | 2.756–18.712 |
Chemotherapy cycles, n | −0.390 | 0.180 | 4.677 | 1 | 0.031 | 0.677 | 0.475–0.964 |
Single dose of OXA in mg | −0.027 | 0.006 | 24.049 | 1 | 0.000 | 0.973 | 0.962–0.984 |
Total dose of OXA in mg | 0.004 | 0.001 | 10.207 | 1 | 0.001 | 1.004 | 1.001–1.006 |
Glucocorticoid drugs | 1.170 | 0.274 | 18.207 | 1 | 0.000 | 3.223 | 1.883–5.517 |
Antihistamine drugs | 0.790 | 0.302 | 6.831 | 1 | 0.009 | 2.204 | 1.219–3.987 |
Constant | −5.766 | 1.327 | 18.884 | 1 | 0.000 | 0.003 | |
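As an illustration of how the coefficients in Table 1 translate into a predicted risk, the sketch below evaluates the logistic function p = 1/(1 + exp(−lp)), where lp is the constant plus the sum of each β multiplied by its variable. Several points are assumptions rather than facts from the table: FOLFOX is taken as the reference regimen (as the Discussion suggests), the binary drug variables are coded 1 for use and 0 for non-use, continuous variables enter in the units shown, and the patient is hypothetical. It is not a validated clinical calculator.

```python
import math

# Coefficients (β) copied from Table 1; FOLFOX is ASSUMED to be the reference regimen.
INTERCEPT = -5.766
REGIMEN_BETA = {
    "FOLFOX": 0.0,
    "XELOX": 2.135,
    "GEMOX": 3.816,
    "SOX": 0.892,
    "OXA and raltitrexed": 1.615,
    "OXA": 3.128,
    "Other": 1.972,
}

def oili_probability(age, regimen, cycles, single_dose_mg, total_dose_mg,
                     glucocorticoid, antihistamine):
    """Predicted probability from the logistic model: p = 1 / (1 + exp(-lp))."""
    lp = (INTERCEPT
          + 0.076 * age
          + REGIMEN_BETA[regimen]
          - 0.390 * cycles
          - 0.027 * single_dose_mg
          + 0.004 * total_dose_mg
          + 1.170 * int(glucocorticoid)   # assumed coding: 1 = used, 0 = not used
          + 0.790 * int(antihistamine))
    return 1.0 / (1.0 + math.exp(-lp))

def odds(p):
    return p / (1.0 - p)

# Hypothetical patient, not taken from the study data.
p_without = oili_probability(60, "FOLFOX", 4, 150, 600, False, False)
p_with = oili_probability(60, "FOLFOX", 4, 150, 600, True, False)
odds_ratio = odds(p_with) / odds(p_without)
print(f"{p_without:.4f} {p_with:.4f} {odds_ratio:.3f}")
```

Changing only the glucocorticoid indicator multiplies the odds by exp(1.170), which is how the Exp(β) column for that row should be read: an odds ratio with all other variables held fixed.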
Model performance evaluation and comparison
Both models were applied to the test set for prediction, with a classification threshold of 0.5. The confusion matrices (Fig. 4) and the ROC curves (Fig. 5) of the two models were plotted. Comparison of the validity evaluation indicators and AUC values (Table 2) showed no significant differences in sensitivity, specificity, or accuracy between the ANN model and the LR model. The ANN model had a higher AUC (p=0.019), and the IDI was 0.129 (z=3.481, p<0.001), indicating that the ANN model was stronger in discriminating OILI. The calibration plots are shown in Figure 6. In general, the crosses were slightly closer to the 45° line than the dots, meaning that the predicted and observed probabilities were slightly better matched for the ANN model, which indicates a slight improvement in calibration with the ANN model.
Table 2. Model performance comparison of ANN and LR (test set)
Indicators | ANN model | LR model | p |
---|---|---|---|
Sensitivity | 27.78% | 16.67% | 0.219^a |
Specificity | 99.64% | 99.37% | 0.688^a |
Accuracy | 96.63% | 96.37% | 0.768^b |
AUC | 0.920 (0.899–0.937)^c | 0.833 (0.806–0.857)^c | 0.019^d* |
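The IDI reported above can be understood as the difference in discrimination slopes between two models, where the discrimination slope is the mean predicted probability among patients with the outcome minus the mean among those without it. The following sketch uses this standard definition with hypothetical predictions. The function names are illustrative, and the z statistic used for significance testing is not reproduced.

```python
def mean(values):
    return sum(values) / len(values)

def discrimination_slope(y_true, y_prob):
    """Mean predicted probability in events minus that in non-events."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    nonevents = [p for y, p in zip(y_true, y_prob) if y == 0]
    return mean(events) - mean(nonevents)

def idi(y_true, prob_new, prob_old):
    """Integrated discrimination improvement of the new model over the old one."""
    return discrimination_slope(y_true, prob_new) - discrimination_slope(y_true, prob_old)

# Hypothetical predictions from two models for the same six patients.
y_true = [1, 1, 0, 0, 0, 0]
prob_new = [0.7, 0.5, 0.2, 0.1, 0.1, 0.2]
prob_old = [0.5, 0.3, 0.2, 0.2, 0.1, 0.3]
idi_value = idi(y_true, prob_new, prob_old)
print(round(idi_value, 3))
```

A positive IDI means the new model assigns, on average, higher risks to patients who develop the outcome and lower risks to those who do not, relative to the old model.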
Discussion
Current studies of OILI often focus on sinusoidal obstruction syndrome (SOS). Several studies have found that the incidence of SOS or sinusoidal dilatation caused by OXA in patients with colorectal liver metastasis (CRLM) was 18.9–79.0%, and that it may be accompanied by increased liver transaminases and may lead to acute liver failure.10,19–21 Diagnosis of SOS requires invasive procedures or postoperative histopathological examination of the liver. However, except for patients with liver metastases requiring surgical resection, most cancer patients do not undergo such examinations, so OILI is still mainly judged by blood tests. The incidence of OILI in this study was 3.65%, which was lower than that reported in current clinical research. One possible reason is that some patients with early, mild hepatic sinusoidal injury do not show significant hepatocyte destruction and transaminase release. In addition, this study covered a broader population of patients treated with OXA-based chemotherapy regimens, including patients with colorectal, gastric, and esophageal cancers at various stages, whereas current studies have focused on patients with CRLM (stage IV), who are more prone to liver function damage.
In this study, an ANN model and an LR model were developed to predict the risk of OILI using patient and medication characteristics, and their predictive performance was compared. In terms of discrimination, the two models had similar sensitivity, specificity, and accuracy, and both had good discriminative ability, with an AUC>0.8. Sensitivity, specificity, and accuracy describe performance only at a single classification threshold (0.5 in this study), whereas the AUC summarizes performance across all thresholds and is therefore a more robust measure of overall predictive performance. According to its AUC, the ANN model had better discriminative ability than the LR model. An IDI>0 also supports that view. As for calibration, the calibration plots indicated that the predictions of the ANN model were more consistent with the observations than those of the LR model, suggesting better calibration.
In contrast to LR models, ANN models can detect complex nonlinear relationships between predictive and outcome variables and all possible interactions. The establishment of ANN models requires few a priori assumptions, little knowledge about data distribution, and less professional judgment in variable selection. LR models, on the other hand, have clear advantages mainly in terms of variable interpretation, assessing the causal relationship between predictive and outcome variables, and providing regression coefficients and odds ratios. ANNs act as black box models with no direct realistic explanation for the weights in the network, making it difficult to determine how the predictive variables act.22 The correlations and effects among the 22 variables involved in this study may be complex, multidimensional, and nonlinear. Therefore, weighing the advantages and disadvantages of both models, the ANN model developed in this study significantly outperformed the traditional LR model in predicting the risk of OILI when discrimination and calibration were given priority.
The top seven important variables of the ANN model overlapped substantially with the seven risk factors finally included in the LR model, implying that age, chemotherapy regimen, chemotherapy cycles, single dose of OXA, and total dose of OXA may be associated with the occurrence of OILI. The LR model thus complemented the ANN model by providing information on the relationship between predictive and outcome variables for the clinical interpretation of the variables.
Kopanoff et al.23 and Nolan et al.24 have long suggested that the risk of drug-induced liver injury (DILI) increases with age. Similarly, this study found that age was a risk factor for OILI, which may be related to the decline of physical function and of liver and kidney metabolism in older patients. However, some studies have reported different findings. Both Sobrane et al.16 and Wakiya et al.25 investigated patients who received OXA-based chemotherapy before hepatectomy for CRLM and concluded that the occurrence of LI did not differ significantly among middle-aged patients. That may be because the patients included in their studies were older than ours. This study included patients aged 20 to 82 years, a wide age range, and had a large sample size, so this significant difference is plausible.
This study found that, compared with the most commonly used FOLFOX regimen, patients receiving XELOX, GEMOX, SOX, OXA plus raltitrexed, or OXA alone had a higher risk of OILI, although the difference for SOX was not statistically significant (p=0.105). Kim et al.26 compared splenomegaly, liver enzyme levels, and hepatic parenchymal heterogeneity in gastric cancer patients (n=151) receiving XELOX or SOX, and concluded that SOX exacerbated OILI. A study in China (n=90) comparing hepatic dysfunction between XELOX and FOLFOX groups found eight cases in the XELOX group and two cases in the FOLFOX group (p=0.044), which was similar to the results of this study.27 In addition, Degirmencioglu et al.28 compared hepatotoxicity in colon cancer patients receiving FOLFOX and XELOX (n=243). There were three cases in the XELOX group and seven in the FOLFOX group (p=0.520). Most current studies have reported differences among OXA-based chemotherapy regimens in terms of hematological toxicity and neurotoxicity. Only a few studies have focused on LI or hepatotoxicity, and their results have been inconsistent. Clinical data from some studies have suggested a possible association between patterns of LI and specific chemotherapy drugs.29 For example, fluorouracil and irinotecan may promote nonalcoholic fatty liver disease, and OXA may cause sinusoidal injury. Chemotherapy drugs other than OXA included in chemotherapy regimens can have adverse effects on the liver, and thus may influence the manifestation of OILI. There were too few cases of GEMOX and OXA alone in this study, so these two results may be less reliable. Overall, however, it can be concluded that the manifestation of OILI varied with chemotherapy regimen. The influence of the other chemotherapy drugs in these regimens on OILI, and the underlying mechanisms, are still unclear and deserve further exploration.
In terms of dosage, the apparent interpretation of the LR model was that the number of chemotherapy cycles and the single dose of OXA were protective factors for OILI, whereas the total dose of OXA increased the risk of OILI. That may not be the case. We suspect that OILI may be an idiosyncratic drug-induced LI (iDILI), the occurrence of which is dose-independent.30,31 There are several hypotheses on the pathogenesis of iDILI.32,33 The inflammatory stress hypothesis is related to the activation of cell death signaling pathways by inducing oxidative stress.34,35 That hypothesis is supported by several studies reporting that oxidative stress was associated with OILI.36–38 In addition, a case of LI after OXA-induced thrombocytopenia was reported in 2020, which the authors attributed to iDILI.39 In this study, chemotherapy cycles, single dose of OXA, and total dose of OXA had small regression coefficients in the LR model, with signs that were difficult to explain clinically, and they ranked in the middle or lower part of the importance analysis of the ANN model. Given the above conjecture, the dosage-related results are understandable. Few studies have explored the relationship between dosage and OILI, so a suitable cross-comparison is difficult.25 For the time being, we retain this conjecture.
The advantage of this study was that the model was trained and tested using information from more than 3,000 multicenter cancer patients, which greatly improved the predictive ability and stability of the model. This study also made full use of the information of patient and medication characteristics, which was convenient and accessible. Before the overall chemotherapy regimens are determined, physicians only need to input relevant items of patient characteristics acquired during the process and possible medication plans into the ANN model. The risk of OILI can be automatically calculated, and potentially high-risk groups can be identified before chemotherapy. It is important to note that the model is a comprehensive approach to analysis, given the factors included in the model as well as the common criteria employed. For some atypical individuals, such as patients with severe liver function damage or severe liver disease, this model might not be accurate. The predictive results and the relatively important predictors of this model only serve as a reference for clinical decision making, reminding physicians that patients might be at risk of OILI. The treatment regimen might be modified before chemotherapy, or active measures such as liver function monitoring should be taken after chemotherapy. Model predictions cannot completely replace physicians in making the final decision.
This study had some limitations. The criteria for LI selected in the study may not cover all LI. For example, changes in liver function indicators in SOS patients may not meet the threshold of the criteria. Regarding the sample data, the data came from only a single province and the sample size was limited, so larger and more comprehensive studies are needed. Additionally, the imbalance of the data was prominent. The suboptimal learning ability of ANNs for unbalanced outcome samples resulted in a relatively low positive predictive value. In addition, the limited number of positive samples and the lack of external validation of the models may limit their external validity. As for predictive factors, the models considered only patient and medication characteristics, while some factors known to have predictive value but poor availability were not taken into account, such as genes, hyperglycemia, spleen volume, thrombocytopenia, and liver volume.39–42 Moreover, the modeling excluded the complex effects of liver diseases other than the common hepatitis B, so the models may not be applicable to such groups. Although the history of hepatitis B was not significant, possibly because there were few such cases in this study, chronic liver diseases are considered to be important risk factors contributing to DILI.43 Future studies should seek more valid and comprehensive predictors and collect larger and more comprehensive samples to establish models with reliability, efficiency, and operability.