Introduction
Primary liver cancer is currently the seventh most frequently occurring cancer and the second most common cause of cancer mortality in the world.1,2 Hepatocellular carcinoma (HCC) accounts for >80% of primary liver cancers worldwide.3 Early diagnosis of HCC can significantly improve survival, and liver imaging, especially contrast-enhanced magnetic resonance imaging (MRI), plays a critical role in early detection and diagnosis.4 Several clinical practice guidelines address HCC, including those endorsed by the American Association for the Study of Liver Diseases (AASLD), the European Association for the Study of the Liver (EASL), and the National Comprehensive Cancer Network (NCCN).4–6
The Liver Imaging Reporting and Data System (LI-RADS) is a comprehensive system endorsed by the American College of Radiology (ACR) for standardizing the terminology, interpretation and reporting of liver imaging in patients at risk for or with HCC.7 The LI-RADS v2018 computed tomography (CT)/MRI manual addresses in detail the entire spectrum of hepatic lesions and pseudolesions that may occur in patients at high risk of HCC, each LI-RADS category, and the major and ancillary features visible on CT and MRI, with basic concepts, systematic descriptions, and numerous schematic diagrams and examples.8 LI-RADS can therefore be used for radiologist education and training in addition to clinical care, as it is designed to increase radiologists’ knowledge, improve their diagnostic skills, and reduce variability and errors in imaging interpretation.8 Consequently, the dissemination and application of LI-RADS are very important for the diagnosis of HCC. However, few studies have examined the value of systematic LI-RADS training for HCC diagnosis, and little is known about the necessity and effect of such training.
Therefore, the goal of this study was to explore whether systematic LI-RADS v2018 MRI training can effectively improve the performance of radiologists with different levels of experience in diagnosing HCC in high-risk patients. In addition, we assessed the interobserver agreement on the LR category among all participants before and after systematic LI-RADS training.
Methods
Ethics statement
This prospective single-center study was approved by the Institutional Review Board of our hospital (2020-P2-220-01), and informed consent was obtained from all enrolled radiologists. The requirement for informed consent from patients was waived because the patients were reviewed and enrolled retrospectively. The study was conducted over 6 months, from August 2019 to January 2020, at the hospital of the lead author.
Patient selection
Consecutive liver MRI reports from August 2016 to July 2017 were reviewed and filtered using the terms “LI-RADS” or “LR” in our picture archiving and communication system (PACS) (DJ Health Union Systems Corporation, Shanghai, China). The inclusion criteria were as follows: 1) patients at high risk of HCC, including those with cirrhosis or chronic hepatitis B viral infection; and 2) patients with at least one hepatic observation assigned an LR category. The exclusion criteria were as follows: 1) patients without the above risk factors, those <18 years old, and those with cirrhosis due to congenital hepatic fibrosis or vascular disorder; 2) patients who had received any locoregional or systemic treatment for hepatic observations; 3) patients with more than three hepatic observations; and 4) MR examinations that did not satisfy the technical recommendations of LI-RADS v2018 or that had poor image quality as assessed by three experienced radiologists (with 11 [AHR], 15 [HX] and 32 [ZHY] years of experience in abdominal imaging).8 As these consecutive cases had been reported according to LI-RADS v2014 or v2017, all hepatic observations were first recategorized according to LI-RADS v2018 by two experienced radiologists working together (AHR and HX). In cases of disagreement on the LI-RADS category, the third radiologist (ZHY) decided the final category. Finally, 70 hepatic MRI observations from 61 patients at high risk of HCC were enrolled in this study, with 10 observations per LR category (LR-1/2/3/4/5/LR-M/LR-TIV) (Fig. 1). All three radiologists were specialists in the LI-RADS CT/MRI algorithm, had used it in routine work for more than 5 years, and were very familiar with the updates and revisions in v2018.
Subjects
A total of 30 residents or fellows with different levels of experience in abdominal MRI diagnosis, who had come to our department from other hospitals/institutions in China as visiting scholars for at least 6 months, were included in this study. All participants were asked to complete a questionnaire collecting baseline demographic information at the beginning of the study. The questionnaire covered the classification and category of their hospitals/institutions, experience in abdominal MRI (years), number of abdominal MRI reports reviewed per day, and extent of knowledge about LI-RADS before training. A total of 10 participants who failed to complete the entire training procedure were excluded. Finally, 20 participants with different levels of experience were enrolled in this study (Fig. 2).
MRI protocol
All patients underwent MR examinations with 1.5-T (Signa HDxt 1.5T; GE Healthcare, Chicago, IL, USA) or 3.0-T (Discovery 750w from GE Healthcare; MAGNETOM Prisma from Siemens AG, Munich, Germany; Ingenia from Philips Healthcare, Amsterdam, The Netherlands) MRI scanners with an 8/16-element phased array coil. The liver MRI technique is summarized in Supplemental Table 1 (online). All patients underwent MRI using gadobenate dimeglumine (Magnevist; Bayer Schering Pharma AG, Berlin, Germany), which was intravenously injected at a dose of 0.1 mmol/kg and a rate of 2 mL/s followed by a normal saline flush. After the administration of contrast agent, dynamic T1-weighted imaging (T1WI) was obtained in the late arterial phase (30–40 s after injection), portal venous phase (60–70 s after injection), equilibrium phase (3–4 min after injection), and delayed phase (5–8 min after injection).
Systematic LI-RADS MRI v2018 training procedure
The CT/MRI LI-RADS algorithm has been used daily at the Radiology Department of our institution since October 2015, from v2014 to v2018. The LI-RADS CT/MRI v2018 training procedure included three thematic lectures given by a professor (ZHY, PhD, MD) with 32 years of experience, who specialized in imaging diagnosis of liver neoplasms and was well versed in the application of the LI-RADS CT/MRI algorithm. The major topics of the lectures included an introduction to the LI-RADS categories, explanations of the major and ancillary features, and the typical manifestations of each category and feature, illustrated with numerous cases (Supplementary File 1). Of note, the three lectures were almost the same, except for a few subtle changes made in response to reader feedback. Electronic instructional materials, including slideshows, journal articles, and recorded lectures, were shared with the participants to facilitate the training process. Each seminar lasted for 2.5 h, and the seminars were held 1 month apart. After each of the first two seminars, the participants had a month to learn, practice, and adopt the LI-RADS MRI v2018 algorithm in daily work. During these 2 months, they reported the MR examinations of routine patients, applying LI-RADS in appropriate patients, and this was also a part of the training. Moreover, formal discussions concerning LI-RADS in specific cases were held twice per week during the 2 months, and each discussion lasted for ≥30 min. In addition, informal discussions were held whenever necessary during the training procedure. The flow chart of the systematic LI-RADS training procedure is displayed in Figure 2.
Imaging interpretations
All MRI data were transferred to the workstations, and imaging analyses were performed anonymously on PACS. All MR images were interpreted independently by each of the 20 participants twice according to the LI-RADS v2018 algorithm, once before the training and once after the third systematic LI-RADS training session.8 The participants were informed of the location and size of the hepatic observations, which had been provided in advance by one of our radiologists with 11 years of experience (AHR). All image interpretations, both before and after training, were recorded as structured LI-RADS template reports (Supplemental Table 2, online), which were designed before training. All participants were blinded to any clinical information, the number of observations in each LR category, the imaging reports, and the pathological results. The order of MRI exams to be reviewed was randomized for each participant. However, for the assessment of threshold growth, a prior examination (CT or MRI) was used when available. All hepatic observations were interpreted based on major and ancillary features in combination according to LI-RADS v2018.8 The ancillary feature of ultrasound visibility as a discrete nodule was not used, while the tiebreaking rules were used at the participants’ discretion if needed.
Reference standard
The risk factors for HCC and final clinical diagnoses of the 61 patients with 70 liver observations are displayed in Table 1. Of these, 52 patients had a single observation, while the other nine patients had two observations. For observations with histopathological diagnoses, pathological diagnoses were used as the gold standard. For those diagnosed with HCC without histopathology, follow-up imaging demonstrated substantial growth associated with arterial phase hyperenhancement and washout or enhancement of the capsule.9 The reference standards for LR-1/2/3 observations were based on typical imaging findings or the absence of progression to a malignant category (LR-4, LR-5, LR-M or LR-TIV) during the follow-up period.10,11 These patients were followed up for at least 2 years. The LR category and diagnostic methods of all hepatic observations are displayed in Table 2.
Table 1. Characteristics of hepatic observations and risk factors of HCC
| Characteristic | Category | Total (61 patients, 70 observations) |
|---|---|---|
| Age in years | – | 37–84, average 59.5±10.1 |
| Sex | Male | 47 (77.1%) |
| | Female | 14 (22.9%) |
| Risk factors | HBV | 45 (73.8%) |
| | HCV | 3 (4.9%) |
| | HBV+HCV | 2 (3.4%) |
| | Alcoholic liver cirrhosis | 3 (4.9%) |
| | HBV+alcoholic liver cirrhosis | 3 (4.9%) |
| | NAFLD/NASH | 1 (1.6%) |
| | PBC | 1 (1.6%) |
| | Cryptogenic cirrhosis | 3 (4.9%) |
| Observation characteristic | HCC | 36 (51.4%) |
| | iCCA | 5 (7.2%) |
| | HChC | 1 (1.4%) |
| | Epithelioid hemangioendothelioma | 1 (1.4%) |
| | Benign lesions | 27 (38.6%) |
Table 2. LR category and diagnostic method of enrolled hepatic observations
| LR category | Diagnostic method | Diagnosis | Number | Total |
|---|---|---|---|---|
| LR-1 | Imaging+clinical | Cyst/perfusion alteration/hemangioma | 8/1/1 | 10 |
| | Pathology | – | – | |
| LR-2 | Imaging+clinical | RN/DN/cyst/hemangioma | 5/2/1/2 | 10 |
| | Pathology | – | – | |
| LR-3 | Imaging+clinical | RN/DN/HCC/coagulative necrosis/chronic fibrosis | 1/3/2/1/1 | 10 |
| | Pathology | DN/HCC | 1/1 | |
| LR-4 | Imaging+clinical | HCC | 7 | 10 |
| | Pathology | HCC | 3 | |
| LR-5 | Imaging+clinical | HCC | 1 | 10 |
| | Pathology | HCC | 9 | |
| LR-M | Imaging+clinical | HCC/iCCA | 1/1 | 10 |
| | Pathology | HCC/iCCA/HChC | 3/3/2 | |
| LR-TIV | Imaging+clinical | HCC | 5 | 10 |
| | Pathology | HCC/iCCA/epithelioid hemangioendothelioma | 1/3/1 | |
Statistical analysis
Raw data and cleaned data were stored in Excel, and statistical analysis was performed with Stata statistical software version 13.1 (https://www.stata.com/ ). The distribution of ordinal categorical data between groups was compared by the Wilcoxon rank-sum (Mann-Whitney) test after rank transformation. Proportions were compared using the chi-squared test. Indicators of diagnostic accuracy were calculated for each participant before and after training using standard formulas.12 In this study, LR-5 was used as a predictor of HCC.8,13 By comparison with the final diagnosis of the sampled MRIs, the numbers of true positive (TP), false positive (FP), true negative (TN), and false negative (FN) findings were extracted, and a 2×2 table was constructed. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), coincidence rate, positive likelihood ratio (+LR) and negative likelihood ratio (−LR) were calculated (Supplementary File 1). The means with 95% confidence intervals (95% CIs) of these indicators were calculated separately for all participants. The hierarchical summary receiver operating characteristic package was used to calculate the pooled estimates of the different operators. The interobserver agreement among radiologists with different experience levels was calculated directly. If all radiologists assigned the same LR category to a hepatic observation, it was considered a consensus. If any one of the radiologists assigned a different LR category, it was considered inconsistent. The percentage of consensus observations and its 95% CI were then calculated. A p-value of less than 0.05 was regarded as statistically significant.
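The accuracy indicators named above all derive from the four cells of the 2×2 table. The following is a minimal Python sketch of those conventional definitions, not the study's own code (the analysis was run in Stata, and the exact formulas used are those in Supplementary File 1). The class and function names and the reader counts are hypothetical, and the coincidence rate is assumed here to mean overall accuracy, (TP+TN)/(TP+FP+TN+FN).

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one reader, with LR-5 treated as a positive call for HCC."""
    tp: int  # HCC observations rated LR-5
    fp: int  # non-HCC observations rated LR-5
    tn: int  # non-HCC observations not rated LR-5
    fn: int  # HCC observations not rated LR-5


def _ratio(num: float, den: float) -> float:
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def diagnostic_indicators(t: TwoByTwo) -> dict[str, float]:
    """Conventional accuracy indicators derived from a 2x2 table."""
    sens = _ratio(t.tp, t.tp + t.fn)
    spec = _ratio(t.tn, t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": _ratio(t.tp, t.tp + t.fp),
        "npv": _ratio(t.tn, t.tn + t.fn),
        # Assumption: "coincidence rate" is taken to be overall accuracy.
        "coincidence_rate": _ratio(t.tp + t.tn, t.tp + t.fp + t.tn + t.fn),
        "lr_pos": _ratio(sens, 1 - spec),
        "lr_neg": _ratio(1 - sens, spec),
    }


# Hypothetical reader (illustrative counts only, not study data):
# 36 HCC and 34 non-HCC observations, 70 in total.
reader = TwoByTwo(tp=18, fp=4, tn=30, fn=18)
indicators = diagnostic_indicators(reader)
for name, value in indicators.items():
    print(f"{name}: {value:.3f}")
```

Because the indicators are computed per reader and then averaged, the mean +LR across readers need not equal the ratio formed from the mean sensitivity and mean specificity.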
Results
Characteristics of all participants
The demographic information of the participants enrolled in this training program is described in Table 3. In China, hospitals are classified into three tiers, each with three sub-levels (A, B and C), and the highest ranking is 3A. All 20 participants were general radiologists, and none of them were primarily liver specialists. These participants were further classified into junior and senior subgroups according to their seniority (Supplemental Table 3).
Table 3. Baseline characteristics and relevant experience of the 20 enrolled participants
| Characteristics | Total |
|---|---|
| Sex | |
| Male | 12 (60%) |
| Female | 8 (40%) |
| Age in years | 36.75 ± 4.99 |
| ≤35 | 10 (50%) |
| 35–50 | 10 (50%) |
| Post-graduate year | |
| ≤5 | 7 (35%) |
| 5–10 | 7 (35%) |
| >10 | 6 (30%) |
| Classification of hospitals/institutions | |
| 3A | 13 (65%) |
| 3B | 3 (15%) |
| 2A | 4 (20%) |
| Experience of abdominal MRI in years | |
| <5 | 9 (45%) |
| 5–10 | 9 (45%) |
| ≥10 | 2 (10%) |
| Number of abdominal MRI reports per day | |
| <5 | 12 (60%) |
| 5–10 | 6 (30%) |
| ≥10 | 2 (10%) |
| Extent of knowledge about LI-RADS before training | |
| Very familiar, adopt in MRI reports | 0 (0%) |
| General understanding, did not use in MRI reports | 8 (40%) |
| Heard of, did not use in MRI reports | 10 (50%) |
| Not familiar at all | 2 (10%) |
Interobserver agreements of the LR category before and after systematic training
The interobserver agreement on the LR category for overall, junior and senior radiologists before and after systematic LI-RADS training is compared in Supplemental Table 4. Before LI-RADS training, the participants had a relatively low level of agreement on the diagnosis of the 70 hepatic observations on MRI. Only 17 hepatic observations were assigned the same category by all 20 radiologists, giving an interobserver agreement of 0.243 (0.148–0.360). After systematic LI-RADS training, a total of 33 hepatic observations reached a consensus, giving an interobserver agreement of 0.471 (0.351–0.594) for all participants; these included 24 observations that the participants did not regard as HCC and 9 observations that they agreed on as HCC. After systematic LI-RADS training, the interobserver agreement on the LR category was significantly increased for overall, junior and senior participants (p<0.001).
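The agreement figures above are proportions of consensus observations (17/70 and 33/70) with 95% CIs. The manuscript does not state which interval method was used; the sketch below assumes an exact binomial (Clopper-Pearson) interval, which is only one plausible choice, so its output should be compared against the reported values rather than taken as their derivation. All function names are hypothetical, and the ratings in the consensus example are invented for illustration.

```python
from math import comb


def is_consensus(categories: list[str]) -> bool:
    """True when every reader assigned the same LR category to an observation."""
    return len(set(categories)) == 1


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target: float, increasing: bool, tol: float = 1e-10) -> float:
    """Solve f(p) = target on [0, 1] for a monotone f by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson interval for k successes out of n (assumed method)."""
    # Lower limit: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper limit: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


# Invented ratings for two observations, three readers each.
print(is_consensus(["LR-5", "LR-5", "LR-5"]))  # all readers agree
print(is_consensus(["LR-4", "LR-5", "LR-5"]))  # one reader differs

# Consensus counts reported in the text: 17/70 before and 33/70 after training.
for label, k in (("before", 17), ("after", 33)):
    low, high = exact_ci(k, 70)
    print(f"{label}: {k / 70:.3f} ({low:.3f}-{high:.3f})")
```

Under this all-or-nothing definition, a single dissenting reader makes an observation inconsistent, so the proportion falls as the number of readers grows; this is worth keeping in mind when comparing it with kappa-type statistics from other studies.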
Diagnostic performance for HCC before and after systematic training
The diagnostic performance of the overall, junior and senior participants for HCC before and after systematic LI-RADS training is compared in Table 4 and Supplemental Table 5. For all participants, the sensitivity for HCC improved from 0.43 (0.37–0.50) to 0.54 (0.51–0.56), and the PPV improved from 0.74 (0.70–0.78) to 0.81 (0.79–0.84) (p<0.001). The diagnostic performance of both junior and senior radiologists improved after systematic LI-RADS training (junior, p=0.037; senior, p=0.004). The area under the curve also improved significantly among overall, junior and senior participants after training (p<0.001) (Figs. 3–5; Supplemental Figs. 1, 2).
Table 4. Comparison of diagnostic performance of all participants for HCC before and after systematic LI-RADS training
| Indicator | Before training | After training |
|---|---|---|
| Sensitivity | 0.43 (0.37–0.50) | 0.54 (0.51–0.56) |
| Specificity | 0.86 (0.82–0.89) | 0.88 (0.86–0.90) |
| PPV | 0.74 (0.70–0.78) | 0.81 (0.79–0.84) |
| NPV | 0.62 (0.60–0.64) | 0.67 (0.66–0.68) |
| Coincidence rate | 0.65 (0.62–0.67) | 0.71 (0.70–0.73) |
| +LR | 3.60 (2.62–4.58) | 5.14 (4.28–6.01) |
| −LR | 0.66 (0.60–0.72) | 0.53 (0.50–0.55) |
| AUC | 0.64 (0.62–0.67) | 0.71 (0.70–0.72) |
| p-value | <0.001 | |
Discussion
In this study, 20 participants with different levels of abdominal imaging experience, serving as visiting scholars in our department, underwent systematic training with the newest version of the LI-RADS algorithm, and their interobserver agreement and diagnostic performance for HCC on MRI before and after training were compared. Our results showed that both the interobserver agreement on the LR category and the diagnostic performance for HCC increased significantly for all participants after systematic training.
In this study, we performed systematic LI-RADS v2018 training with 20 participants at an academic radiology department. Our institution has a national key cultivation discipline of gastroenterology and hepatology and a liver transplant center, with sufficient patients with focal liver lesions undergoing MR examinations. In addition, LI-RADS has been used in daily work in our radiology department for 5 years, with updates to the newest version of the algorithm.8,14–16 Therefore, our lead radiologists have considerable experience with LI-RADS and have devoted much effort to disseminating LI-RADS in China. In addition, our institution is a teaching hospital, so we attach great importance to the training of residents and the continuing medical education of visiting scholars.
Davenport et al.17 compared the repeatability of diagnostic features and different scoring systems for HCC on MRI between five fellowship-trained radiologists and five novice radiology residents at a liver transplantation center. They reported a fair overall inter-reader agreement (0.35 [95% CI: 0.34, 0.37]) for LI-RADS v2013.1, which was slightly lower than our results after training. However, they did not perform systematic training for radiologists, and the participants were given only 1 h of lecture-based and hands-on instructions concerning each liver observation scoring system.17 They did not compare the diagnostic performance outcomes of the experts and novice radiologists. LI-RADS is currently consistent with the AASLD and NCCN guidelines and fully integrated into AASLD clinical practice guidance.14 AASLD does not have an official definite scoring system, and the Organ Procurement and Transplantation Network (commonly referred to as OPTN) is a unique system for transplantation adopted in the USA.4,18 Therefore, we only evaluated the systematic training effect of LI-RADS in this study. In our study, the interobserver agreements of all radiologists for the LR category were increased after systematic training. Our results of the inter-reader agreement for the LR category are slightly greater than those of Kang et al.9 and Fowler et al.19
There are relatively few studies concerning the dissemination of HCC diagnosis guidelines. Elmohr et al.20 discussed the feasibility and efficacy of the concept of teaching teachers in disseminating and motivating the application of the LI-RADS v2018 clinical practice guideline. They used different teaching methods for different continents and countries, with a total of 8,342 attendees participating in their study. We implemented a systematic training program with 20 participants with different experience levels using the hybrid method of classroom training combined with a one-on-one model. Our results reveal that the systematic training model can effectively improve the diagnostic performance of the attendees for HCC. All these visiting scholars came from different provinces and districts in China, and they may subsequently disseminate the LI-RADS v2018 clinical practice guideline in their own hospitals/institutions. Unfortunately, we did not study how much of this training was retained at 6 months or 1 year after the training.
In this study, the improvement in interobserver agreement and diagnostic performance for HCC after training was real and expected for all participants, and it can reasonably be attributed to learning and training. However, the improvement was rather modest. Less than half of the observations were correctly classified, possibly because all participants were general radiologists and not liver specialists, although LI-RADS is supposed to be a standard and straightforward tool. Another reason may be that the participants came from hospitals of different classifications and reported an average of only 3.9±3.31 abdominal MRI examinations per day.
Limitations
This study had several limitations. First, the number of lesions assessed was small. We included only 10 hepatic observations per LR category, which is not enough for robust analysis. We will include more cases in future investigations to verify and improve the reliability of the results. Second, this was a single-center retrospective study, and selection bias of patients inevitably exists. Third, the three expert radiologists did not review the enrolled cases independently, and the inter-reader agreement between them was not assessed. However, our previous study showed a good intraclass correlation coefficient (0.965 [95% CI: 0.956–0.972]) for the LR category among these three radiologists using LI-RADS v2018. Fourth, 85.2% (52/61) of the patients had a single observation in this study. This may not represent the daily routine in other hospitals; perhaps this is a bias related to the local recruitment of a transplantation center. Fifth, in terms of real-life applicability, most radiology departments may not have 7.5 h available to devote to formal LI-RADS didactic training.
Conclusions
In conclusion, systematic LI-RADS training can effectively improve the diagnostic performance for HCC and the interobserver agreement of both junior and senior radiologists.
Supporting information
Supplemental Table 1
MR parameters.
(DOCX)
Supplemental Table 2
LI-RADS v2018 MR report template.
(DOCX)
Supplemental Table 3
Basic characteristics in related experience of 20 radiologists enrolled, stratified by their seniority.
(DOCX)
Supplemental Table 4
Interobserver agreements of LR category for overall, junior and senior radiologists before and after systematic LI-RADS training.
(DOCX)
Supplemental Table 5
Comparison of diagnostic performances of overall, junior and senior radiologists for HCC between before and after systematic LI-RADS training.
(DOCX)
Supplemental Fig. 1
Hierarchical SROC curves for MRI diagnosis by junior radiologists (A, B).
Circles with numbers represent each participant, and dotted lines represent the credible interval. SROC, summary receiver operating characteristic curve; MRI, magnetic resonance imaging.
(TIF)
Supplemental Fig. 2
Hierarchical SROC curves for MRI diagnosis by senior radiologists (A, B).
Circles with numbers represent each participant, and dotted lines represent the credible interval. SROC, summary receiver operating characteristic curve; MRI, magnetic resonance imaging.
(TIF)
Supplementary File 1
CT/MRI LI-RADS algorithm.
(DOCX)
Abbreviations
- AASLD: American Association for the Study of Liver Diseases
- CT: computed tomography
- HCC: hepatocellular carcinoma
- LI-RADS: Liver Imaging Reporting and Data System
- MRI: magnetic resonance imaging
Declarations
Data sharing statement
The data concerning LI-RADS v2018 training used in support of the findings of this study are included within the supplementary information file(s) accompanying this publication in the Journal of Clinical and Translational Hepatology (Xia & He Publishing Inc.).
Funding
This work was supported by funds from the National Natural Science Foundation of China (Nos. 61871276 and 82071876), Beijing Municipal Administration of Hospitals Clinical Medicine Development of Special Funding Support (No. ZYLX202101), Research Foundation of Beijing Friendship Hospital (No. yyqdkt2019-30), and Cultivation Scientific Research Foundation of Capital Medical University (No. 1210020247).
Conflict of interest
The authors have no conflicts of interest related to this publication.
Authors’ contributions
Study concept and design (AHR, ZHY, ZCW), acquisition of data (AHR, NZ, TB), analysis and interpretation of data (AHR, HX, DWY), drafting of the manuscript (AHR), critical revision of the manuscript for important intellectual content (ZHY, ZCW), and administrative, technical, or material support, study supervision (ZHY).