Methods
Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)16 was adopted as a guide for performing the meta-analysis.
Literature search
The PICOS standard was used to guide the search.17 We conducted a comprehensive search for studies on the diagnostic validity of all diffusion-related parameters for the preoperative evaluation of MVI in HCC in Web of Science, EMBASE, PubMed/MEDLINE, and the Cochrane Library until January 17, 2021. EndNote X9 software (Thomson Reuters, NY, USA) was used for efficient filtering. Details of the search strategy are shown in Supplement 1. Additionally, all references shown in the listed literature were manually checked.
Inclusion and exclusion criteria
The inclusion criteria were as follows: (1) the original quantitative study evaluated the diagnostic performance of ADC parameters derived from DWI or intravoxel incoherent motion (IVIM) for MVI in HCC; (2) the data provided by the study were sufficient to construct a diagnostic 2 × 2 table; (3) the article was published in English; and (4) at least 30 HCC patients were included. The exclusion criteria were as follows: (1) nonhuman research; and (2) literature published in formats including reviews, patents, guidelines, chapters, case reports, conference abstracts, letters, or editorials.
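As an illustration of why a complete 2 × 2 table is required, the sketch below derives sensitivity and specificity from the four cell counts. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a diagnostic 2 x 2 table (index test vs. pathology)."""
    tp: int  # test positive, MVI positive
    fp: int  # test positive, MVI negative
    fn: int  # test negative, MVI positive
    tn: int  # test negative, MVI negative

    def sensitivity(self) -> float:
        # Proportion of MVI-positive lesions that the test flags as positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of MVI-negative lesions that the test flags as negative.
        return self.tn / (self.tn + self.fp)


# Hypothetical counts for illustration only.
example = TwoByTwo(tp=30, fp=12, fn=10, tn=48)
print(f"sensitivity = {example.sensitivity():.2f}")  # 0.75
print(f"specificity = {example.specificity():.2f}")  # 0.80
```

A study that reports only a sensitivity or an AUROC, without enough information to recover all four cells, cannot be entered into this calculation, which is the rationale for inclusion criterion (2).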
Data extraction and quality assessment
Two observers with more than 6 years of experience in liver imaging performed the data extraction and quality assessment. Any disagreements were resolved by consensus with the help of a third radiologist with 16 years of liver imaging experience, as needed. The extracted data included basic data (true positives, false positives, false negatives, and true negatives) and additional data (patient characteristics, imaging characteristics, and study characteristics) for meta-regression.18 If multiple diagnostic performance data were provided in the original study, we chose the best outcome. All extracted data were entered into Microsoft Excel 2016 for further analysis. Quality assessment was performed with Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2).19 The bias risk was rated as low, unclear, or high, and the clinical applicability concern was rated as low, unclear, or high, with converted scores of 1, 2, or 3, respectively.20 In subsequent meta-regression analyses, the total scores of each study served as covariates to quantitatively represent the general risk of bias and applicability.
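The score conversion described above can be sketched as follows. This is a minimal illustration that assumes the total is a simple sum over the four QUADAS-2 risk-of-bias domains and the three applicability domains; the ratings shown are hypothetical and the function name `total_score` is ours.

```python
# Conversion stated in the text: low = 1, unclear = 2, high = 3.
SCORE = {"low": 1, "unclear": 2, "high": 3}


def total_score(ratings):
    """Sum the converted scores of a list of QUADAS-2 domain ratings."""
    return sum(SCORE[rating.lower()] for rating in ratings)


# Hypothetical ratings for one study (not taken from the included studies).
# Risk of bias: patient selection, index test, reference standard, flow and timing.
risk_ratings = ["unclear", "low", "low", "high"]
# Applicability: patient selection, index test, reference standard.
applicability_ratings = ["low", "low", "unclear"]

print(total_score(risk_ratings))           # 7
print(total_score(applicability_ratings))  # 4
```

Under this assumption the lowest possible totals are 4 for risk of bias and 3 for applicability, which correspond to a study rated low in every domain.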
Pathological MVI in HCC
MVI is defined as the presence of tumor cells, visible only by microscopy, within vessels lined by endothelial cells, including the portal vein and hepatic vein.21 In all included studies, MVI status was dichotomized as positive or negative.
Statistical analysis
QUADAS-2 evaluations of the included articles were performed with Review Manager 5.3 software (Cochrane Collaboration, Copenhagen, Denmark). Threshold effect assessment was conducted using Meta-Disc 1.4 software provided by Ramon y Cajal Hospital, Madrid, Spain.22 The pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio and their corresponding 95% confidence intervals (CIs) were determined by STATA 14 (StataCorp, College Station, Texas, USA) software using the MIDAS command.23,24 Both the I2 statistic (I2>50%) and the Cochran Q test (p<0.05) were used to detect between-study heterogeneity.25,26 Meta-regression was conducted to explore the sources of heterogeneity.
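For readers unfamiliar with the two heterogeneity measures, the sketch below computes the Cochran Q statistic and the I2 statistic from per-study effect estimates using their textbook definitions with inverse-variance weights. It is an illustration only and does not reproduce the MIDAS output; the effect values (hypothetical log diagnostic odds ratios) and their variances are invented for the example.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, I2 in percent) for per-study effects and their variances."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists with at least two studies")
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of total variation beyond chance, truncated at 0%.
    i2 = 0.0 if q <= df else 100.0 * (q - df) / q
    return q, i2


# Hypothetical log diagnostic odds ratios and variances (illustration only).
log_dor = [1.2, 1.9, 1.5, 2.4]
var_log_dor = [0.10, 0.20, 0.15, 0.25]

q, i2 = cochran_q_and_i2(log_dor, var_log_dor)
print(f"Q = {q:.2f} on {len(log_dor) - 1} df, I2 = {i2:.1f}%")
print("I2 above the 50% cutoff:", i2 > 50.0)
```

The p value for Q comes from a chi-square distribution with k − 1 degrees of freedom, where k is the number of studies; it is omitted here to keep the sketch free of third-party dependencies.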
Results
Literature search
After a comprehensive search, 435 records were retrieved. Initially, 148 records were removed by the automatic find-duplicate function embedded in EndNote X9 software, and 44 conference abstracts, 21 reviews, five meta-analyses, two case reports, two letters, and two editorials were excluded because of their improper publication type. For a more accurate screening, we read the full texts of the remaining 211 records. After excluding seven non-English studies, those without sufficient data to extract, and 191 unrelated studies, nine studies eventually remained (Fig. 1).
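The screening counts can be tallied as a quick consistency check. In the sketch below, every number is taken from the paragraph above except the count of studies excluded for insufficient data, which the text does not state; the value computed for it is an inference from the other figures, not a reported number.

```python
retrieved = 435
duplicates = 148
excluded_by_type = {
    "conference abstracts": 44,
    "reviews": 21,
    "meta-analyses": 5,
    "case reports": 2,
    "letters": 2,
    "editorials": 2,
}

# Records left for full-text reading after the first screening step.
full_text = retrieved - duplicates - sum(excluded_by_type.values())
print(full_text)  # 211

non_english = 7
unrelated = 191
included = 9

# Not stated in the text: inferred as the remainder of the full-text stage.
insufficient_data_inferred = full_text - non_english - unrelated - included
print(insufficient_data_inferred)  # 4
```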
Quality assessment and data extraction
The detailed data of the included articles are shown in Table 1 (refs. 13–15, 27–32) and Supplement 2, including patient characteristics (country/region, year, average age, number of lesions, average size), research characteristics (research design, number of readers, recruitment methods, time interval between the index test and the reference standard, blindness to the index test during the reference test, blindness to the reference test during the index test), and imaging characteristics (MR manufacturer, MR field, MRI sequence, quantitative parameters, details of b values). The methodologic quality of all studies based on QUADAS-2 is shown in Figure 2. Generally, there was a low bias risk and minimal concern regarding clinical applicability in most studies. Five studies had an unclear bias risk because it was not certain whether consecutive patients were recruited. Unclear applicability concerns were present in four studies because of the relative simplicity of the patient inclusion and exclusion criteria. Only one study had an unclear bias risk because it was not clear whether the readers were blinded to the reference results during the index test. Five studies had an unclear bias risk because they did not report whether the index test results were blinded during the reference test. Additionally, there were concerns about the applicability of two studies because of insufficient information on pathological MVI. Regarding flow and timing, one study that did not report the interval between the reference standard and MRI examinations was regarded as having an unclear bias risk, and three studies with time intervals greater than 1 month were regarded as having a high bias risk.
Table 1. Characteristics of the included studies
Columns 1-4, patient characteristics; columns 5-12, study characteristics; columns 13-17, imaging characteristics; column 18, reference.
Region | Age, years | Size, mm | Lesions, n | Study design | Consecutive | Readers, n | Blind 1 | Blind 2 | Time interval, days | Risk score | Applicability score | Sequence | Parameter | Field (T) | Manufacturer | b-values, s/mm2 | Reference |
---|
China | 51.97 | 63.1 | 135 | p | Yes | 2 | Yes | Yes | 14 | 4 | 3 | IVIM | ADC/D | 3 | GE | 0, 10, 20, 40, 80, 100, 150, 200, 400, 600, 800, 1,000, and 1,200 | Wei et al.27 |
Japan | 66.7 | 20.72 | 73 | R | Yes | 2 | Yes | Unclear | 85 | 7 | 4 | DWI | ADC | 1.5/3.0 | Siemens/GE | 0, 1,000 | Okamura et al.15 |
China | 52 | 19 | 94 | R | Unclear | 2 | Yes | Unclear | 14 | 6 | 3 | DWI | ADC | 1.5 | Siemens | 0, 500 | Rao et al.28 |
Korea | 56 | 34.06 | 67 | R | Unclear | 1 | Yes | Yes | 45 | 7 | 4 | DWI | ADC | 3 | Siemens | 50, 400, 800 | Suh et al.14 |
China | 54.14 | 39.43 | 100 | unclear | Yes | Unclear | Unclear | Unclear | Unclear | 7 | 4 | DWI | ADC | 3 | GE/Philips | 0, 100, 600 | Wang et al.32 |
China | 51.51 | 27.9 | 41 | p | Yes | 2 | Yes | Unclear | 30 | 5 | 3 | IVIM | ADC/D | 3 | Philips | 0, 10, 20, 40, 80, 200, 400, 600, 1000 | Li et al.29 |
China | 53.2 | 14.4 | 109 | R | Unclear | 2 | Yes | Yes | 16 | 5 | 4 | DWI | ADC | 1.5 | Siemens | 0, 500 | Xu et al.13 |
China | 59 | 57 | 318 | R | Unclear | 2 | Yes | Unclear | 7 | 6 | 5 | DWI | ADC | 1.5 | GE | 0, 800 | Zhao J et al.30 |
China | 50.6 | 56.7 | 51 | R | Unclear | 2 | Yes | Yes | 58 | 7 | 3 | IVIM | ADC/D | 3 | GE | 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 1,000 | Zhao W et al.31 |
Diagnosis of MVI in HCC
Overall, the accuracy of the ADC value in predicting the presence of MVI was evaluated in nine studies including 988 HCCs. The pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio are shown in Table 2. The area under the receiver operating characteristic curve (AUROC) of the ADC value was 0.78 for diagnosing MVI in HCC (Fig. 3). The forest plots showed substantial between-study heterogeneity for specificity (p<0.01, I2=70.59%) but not for sensitivity (p=0.10, I2=40.64%) (Fig. 4).
Table 2. Diagnostic accuracy of the apparent diffusion coefficient value for microvascular invasion of hepatocellular carcinoma
| Studies, n (patients, n) | AUROC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | Positive likelihood ratio (95% CI) | Negative likelihood ratio (95% CI) | Diagnostic odds ratio (95% CI) |
---|
ADC | 9 (988) | 0.78 (0.74, 0.81) | 0.73 (0.68, 0.78) | 0.70 (0.62, 0.77) | 2.4 (2.0, 3.1) | 0.38 (0.32, 0.46) | 6 (5, 9) |
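The likelihood ratios and the diagnostic odds ratio are linked to sensitivity and specificity by standard identities. The sketch below applies those identities to the pooled point estimates in Table 2 as a rough plausibility check. It is not the bivariate model fitted by MIDAS, so small differences from the tabulated values are expected; the function name `likelihood_ratios` is ours.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a sensitivity/specificity pair."""
    lr_pos = sensitivity / (1.0 - specificity)  # LR+ = sens / (1 - spec)
    lr_neg = (1.0 - sensitivity) / specificity  # LR- = (1 - sens) / spec
    dor = lr_pos / lr_neg                       # DOR = LR+ / LR-
    return lr_pos, lr_neg, dor


# Pooled point estimates from Table 2.
lr_pos, lr_neg, dor = likelihood_ratios(0.73, 0.70)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
# LR+ = 2.43, LR- = 0.39, DOR = 6.3
```

These values agree closely with the tabulated positive likelihood ratio of 2.4, negative likelihood ratio of 0.38, and diagnostic odds ratio of 6.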
Begg’s funnel plot for ADC in predicting MVI (Z=2.40, p=0.016) is presented in Figure 5 and suggests slight asymmetry in the data. Therefore, in evaluating the funnel plot, we can report only that there may have been some publication bias, which is difficult to quantify, but that no major publication bias was detected. There was no significant threshold effect (p=1.0) for ADC in diagnosing MVI in HCC.
Meta-regression
Subgroup meta-regression analyses were performed with the following nine covariates: number of lesions (<90 or ≥90), average age (<54 or ≥54 years), average size (<30 or ≥30 mm), interval between the index test and the reference test (<30 or ≥30 days), blinding of the index test during the reference test (yes or unclear), blinding of the reference test during the index test (yes or unclear), concern of applicability score (<4 or ≥4), risk of bias score (≤6 or >6) and MR parameter (DWI or IVIM). The results are shown in Table 3. The time interval (p<0.01 by the joint model) was a significant cause of heterogeneity. Studies with time intervals ≥30 days had lower sensitivity and lower specificity than those with time intervals <30 days (72% vs. 75% and 70% vs. 72%, respectively; joint-model p<0.01). In contrast, mean age (p=0.21), mean size (p=0.08), number of lesions (p=0.91), blinding of the reference standard during the index test (p=0.70), blinding of the index test during the reference test (p=0.49), QUADAS risk of bias score (p=0.35), QUADAS applicability concern score (p=0.11) and MR sequence (p=0.57) were not significant sources of heterogeneity.
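The dichotomization of the continuous covariates can be illustrated with the values reported in Table 1. The sketch below applies the stated cutoffs and counts the studies on each side; the short study labels and the helper name `split` are ours, and the one study with an unclear time interval is left out of that covariate.

```python
# Values transcribed from Table 1; None marks a time interval reported as unclear.
# Fields: (study, lesions, mean age in years, mean size in mm, time interval in days)
studies = [
    ("Wei", 135, 51.97, 63.1, 14),
    ("Okamura", 73, 66.7, 20.72, 85),
    ("Rao", 94, 52, 19, 14),
    ("Suh", 67, 56, 34.06, 45),
    ("Wang", 100, 54.14, 39.43, None),
    ("Li", 41, 51.51, 27.9, 30),
    ("Xu", 109, 53.2, 14.4, 16),
    ("Zhao J", 318, 59, 57, 7),
    ("Zhao W", 51, 50.6, 56.7, 58),
]


def split(values, cutoff):
    """Count studies below the cutoff and at or above it, skipping unknowns."""
    known = [v for v in values if v is not None]
    below = sum(v < cutoff for v in known)
    return below, len(known) - below


lesions = [s[1] for s in studies]
print(sum(lesions))                         # 988 lesions in total
print(split(lesions, 90))                   # (4, 5)
print(split([s[2] for s in studies], 54))   # (5, 4)
print(split([s[3] for s in studies], 30))   # (4, 5)
print(split([s[4] for s in studies], 30))   # (4, 4)
```

The resulting subgroup sizes match the "Studies, n" column of Table 3, including the eight studies available for the time-interval covariate.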
Table 3. Covariate meta-regression results of apparent diffusion coefficient value for microvascular invasion of hepatocellular carcinoma
Covariate | Subgroup | Studies, n | p (joint model) | Summary sensitivity (95% CI) | p1 | Summary specificity (95% CI) | p2 |
---|
Age, years | <54 | 5 | 0.21 | 0.77 (0.72, 0.83) | 0.05 | 0.69 (0.59, 0.78) | 0.05 |
| ≥54 | 4 | | 0.68 (0.61, 0.76) | | 0.72 (0.62, 0.82) | |
Size, mm | <30 | 4 | 0.08 | 0.79 (0.73, 0.84) | 0.04 | 0.65 (0.55, 0.76) | 0.01 |
| ≥30 | 5 | | 0.69 (0.62, 0.76) | | 0.74 (0.65, 0.82) | |
Included lesions, n | <90 | 4 | 0.91 | 0.72 (0.64, 0.80) | 0.00 | 0.70 (0.59, 0.81) | 0.12 |
| ≥90 | 5 | | 0.74 (0.68, 0.81) | | 0.70 (0.60, 0.80) | |
Blind to reference | Yes | 8 | 0.70 | 0.74 (0.68, 0.79) | 0.18 | 0.71 (0.63, 0.78) | 0.72 |
| Unclear | 1 | | 0.71 (0.57, 0.85) | | 0.65 (0.44, 0.86) | |
Blind to index test | Yes | 4 | 0.49 | 0.73 (0.66, 0.80) | 0.00 | 0.66 (0.56, 0.77) | 0.02 |
| Unclear | 5 | | 0.74 (0.67, 0.82) | | 0.73 (0.64, 0.82) | |
Time interval, days | <30 | 4 | 0.00 | 0.75 (0.68, 0.82) | 0.03 | 0.72 (0.61, 0.83) | 0.24 |
| ≥30 | 4 | | 0.72 (0.64, 0.80) | | 0.70 (0.58, 0.82) | |
Risk score | ≤6 | 5 | 0.35 | 0.75 (0.69, 0.82) | 0.02 | 0.73 (0.63, 0.82) | 0.32 |
| >6 | 4 | | 0.71 (0.63, 0.78) | | 0.67 (0.56, 0.78) | |
Applicability concern score | <4 | 4 | 0.11 | 0.78 (0.69, 0.87) | 0.14 | 0.74 (0.65, 0.84) | 0.29 |
| ≥4 | 5 | | 0.71 (0.63, 0.78) | | 0.67 (0.58, 0.76) | |
Sequence | DWI | 6 | 0.57 | 0.74 (0.68, 0.80) | 0.03 | 0.67 (0.59, 0.75) | 0.01 |
| IVIM | 3 | | 0.72 (0.62, 0.83) | | 0.75 (0.65, 0.86) | |
In addition, studies with a large number of lesions had the same specificity (70% vs. 70%, p=0.12) and a significantly higher sensitivity (74% vs. 72%, p<0.01) than those with a small number of lesions. Studies with a low risk of bias reported a significantly higher sensitivity than studies with a high risk of bias (75% vs. 71%, p=0.02). Studies with smaller tumors (<30 mm) reported a significantly higher sensitivity (79% vs. 69%, p=0.04) and a significantly lower specificity (65% vs. 74%, p=0.01) than those with larger tumors (≥30 mm). Studies with unclear blinding of the index test reported significantly higher sensitivity and specificity than studies with blinding of the index test (74% vs. 73%, p<0.01 and 73% vs. 66%, p=0.02, respectively). Studies using the DWI parameter reported a significantly higher sensitivity (74% vs. 72%, p=0.03) and a significantly lower specificity (67% vs. 75%, p=0.01) than those using the IVIM parameter.
Discussion
This meta-analysis included nine original articles with 988 HCCs and assessed the diagnostic performance of the ADC value for predicting MVI, with a pooled sensitivity, specificity, and AUROC of 73%, 70% and 0.78, respectively. Our meta-analysis indicated that the ADC value had moderate accuracy in predicting MVI in HCC, which was consistent with the findings of several high-impact original studies,27–30 with a sensitivity and specificity of 71–76% and 65–66%, respectively. Several hypothetical reasons may contribute to the relation between the ADC value and the MVI status in HCC. First, MVI is more common in HCCs with higher histologic grades,33 and the ADC value has been reported to accurately assess the histologic grade of HCC.18 Therefore, it is possible that the ADC value could be used to predict MVI in HCC. Second, tumor embolism in MVI-positive hepatic vascular branches, such as the portal vein, hepatic vein and intracapsular vessel, can limit the diffusion of water molecules to some extent.34 Additionally, the presence of MVI can further increase the infiltration of tumor cells, provide more nutrients needed for proliferation, increase the tumor-cell density, and further limit the diffusion of water molecules.27 However, because the ADC value cannot intrinsically separate the effects of capillary perfusion from those of molecular diffusion, its diagnostic performance in predicting MVI in HCC remains only moderate.
Several studies have investigated the diagnostic accuracy of other diffusion parameters, including the mean apparent kurtosis coefficient and tissue diffusivity (D-value). Wang et al.35 and Cao et al.36 found that higher mean kurtosis values were potential predictors of MVI in HCC, with sensitivity, specificity and AUROC values of 70%, 77%, and 0.784 and 68.4%, 75%, and 0.77, respectively, which were comparable to the diagnostic performance of the ADC value calculated in this study. Diffusion kurtosis imaging (DKI), which reflects the heterogeneity and irregularity of tissue components, is an extension of diffusion tensor imaging for the detection of non-Gaussian water diffusion.37 The higher mean kurtosis values may be caused by a more complex microenvironment with denser cellular structures and more irregular and heterogeneous lesions introduced by MVI, such as neoplastic cells, necrosis, and inflammation.35 In theory, IVIM can distinguish true water molecule diffusion from microcapillary perfusion. Therefore, the true tissue molecular diffusivity (D) calculated by the IVIM technique should in principle be more effective than the ADC value in probing the small differences in water molecule diffusion induced by MVI.38,39 To date, three published studies have examined the diffusion parameter of the D-value, and the results are inconsistent.27,29,31 Wei et al.27 found that the D-value was better than the ADC value for assessing MVI in HCC, with sensitivity, specificity and AUROC values of 78.2%, 75%, and 0.815, respectively. Zhao et al.31 showed that the D-value had a moderate diagnostic performance for assessing MVI in HCC, with sensitivity, specificity and AUROC values of 66.7%, 88.9%, and 0.753, respectively. Because the available studies are few and their results vary, more studies are needed to evaluate and confirm the diagnostic performance of the D-value and apparent kurtosis coefficient for MVI in HCC.
In addition to quantitative parameters, many qualitative parameters are available for evaluating MVI, such as non-smooth tumor margins, irregular rim-like enhancement in the arterial phase, peritumoral arterial phase hyperenhancement, and peritumoral hepatobiliary phase hypointensity on MRI. In a high-quality meta-analysis, Hong et al.40 summarized multiple MRI features and concluded that rim arterial enhancement, arterial peritumoral enhancement, peritumoral hypointensity in the hepatobiliary phase (HBP), and non-smooth margins were significant predictors of MVI of HCC, with sensitivities and specificities of 36.4% and 87.9%, 49.7% and 81.5%, 44.2% and 91.1%, and 67.1% and 60.7%, respectively. A recent meta-analysis41 of the evaluation of non-smooth tumor margins and peritumoral hypointensity in the HBP to preoperatively diagnose the presence of MVI in HCC obtained similar results, with sensitivity, specificity and AUROC values of 73%, 61%, and 0.74 and 43%, 90%, and 0.76, respectively. Those parameters were similar to the ADC value in predicting MVI in HCC with moderate accuracy, but qualitative parameters are subjective, and their interpretation differs even among experienced radiologists.42 The ADC value can be assessed quantitatively or qualitatively. As a quantitative parameter, it is not limited by subjective differences of interpretation and is therefore more practical than qualitative parameters. In addition, the ADC value is one of the most commonly used clinical indicators and can be obtained without the use of contrast agents. Accordingly, we believe this work is necessary and useful, especially for patients who cannot undergo contrast-enhanced MRI.
With regard to deep learning and radiomics, the diagnostic performance of the ADC value alone was inferior to that of the fusion-deep supervision net based on ADC, with sensitivity, specificity and AUROC values of 67.06% vs. 75.29%, 70.43% vs. 79.13%, and 71.24 vs. 79.69, respectively.32 Additionally, the diagnostic performance of the ADC value was inferior to that of radiomics models based on MRI. Feng et al.43 found that a radiomics model based on HBP images had a relatively high performance in predicting MVI in HCC, with sensitivity, specificity and AUROC values of 90%, 75%, and 0.83, respectively, in the validation cohort. A recent study44 found that a radiomics model based on DWI combined with multiple phases of gadoxetate disodium-enhanced MRI had a higher performance in diagnosing MVI in HCC, with sensitivity, specificity and AUROC values of 96%, 86%, and 0.918 in the validation cohort. Therefore, the use of deep learning and radiomics can improve the diagnostic performance for MVI in HCC and is a direction for future research.
In this study, meta-regression analyses of nine covariates showed that only the time interval was a significant source of heterogeneity. Studies with time intervals <30 days exhibited higher sensitivity and specificity than those with time intervals ≥30 days. The shorter the time interval between MRI examination and surgical resection, the more closely the imaging features reflect the tumor parenchyma, and the better the diagnostic effectiveness. Therefore, an appropriate time interval should be considered in future studies. Additionally, studies with tumor sizes of <30 mm exhibited significantly higher sensitivity and lower specificity than studies with tumors ≥30 mm. Generally, the larger the diameter of HCCs, the greater the possibility of the existence of MVI45–47 and the easier it is to observe the imaging features; thus, the diagnostic efficiency for MVI would be expected to be better. However, that result was not obtained in this study. A possible reason is that the larger the HCC is, the more prone it is to bleeding and necrosis, which affect the ADC measurement. In the meta-regression for bias risk, studies with low risk had higher sensitivity than studies with high risk (75% vs. 71%, p=0.02), which suggests that improving the quality of diagnostic test studies is essential. In the meta-regression for blinding to the index test, studies with unclear blinding had better sensitivity and specificity than those with blinding (74% vs. 73%, p<0.01 and 73% vs. 66%, p=0.02). Measurement bias in the pathological results probably occurred because reviewers had some knowledge of the signal intensity. In the meta-regression for the number of included lesions, studies with a large number of lesions had higher sensitivity than those with a small number of lesions (74% vs. 72%, p<0.01), which indicates that increasing the sample size is essential. These covariates should be taken into account to reduce heterogeneity as much as possible in further studies.
There were several limitations. First, most of the studies included in the meta-analysis were retrospective, which may cause patient and imaging technique selection biases. Second, we discussed the diagnostic accuracy of the D-value in predicting MVI in HCC in our discussion but did not include it in our meta-analysis because of the limited number of studies. Finally, only original articles in English were included in the meta-analysis. In conclusion, our meta-analysis found that the ADC value had moderate accuracy in noninvasively predicting pathological MVI in HCC. In future studies, artificial intelligence such as radiomics studies and deep learning based on a combination of multiple MRI sequences or more MRI features, including the D-value of IVIM, could be performed to investigate and verify potential improvements for predicting MVI in HCC.