Introduction
Hepatocellular carcinoma (HCC) is one of the most common and fatal malignancies worldwide.1,2 Its grim prognosis can be attributed to various risk factors,3,4 especially because HCC is frequently asymptomatic until it becomes large and affects the liver capsule or other intra-abdominal structures.5 As a result, screening has been important for facilitating early detection.5 Tumor marker expression has long been used to help with early HCC detection. Some reportedly useful tumor markers include α-fetoprotein (AFP), Golgi protein 73, osteopontin, abnormal prothrombin, phosphatidylinositol proteoglycan, and heat shock protein.6 Despite the wide range of available markers, no single tumor marker has been shown to be of superior diagnostic value. For many years, AFP and abdominal ultrasound (US) were used to screen patients at risk for HCC development.7 AFP is a carcinoembryonic glycoprotein whose coding gene lies on the long arm of chromosome 4.6 As with other markers, several studies have reported poor specificity and sensitivity of AFP in HCC screening when used alone.8 This is in part because a third of HCC patients, especially those with an early tumor stage, have negative AFP expression as determined by serum cut-off levels.6 Another consideration is that AFP serum levels are elevated during hepatic inflammation. Thus, hepatocyte damage and regeneration, even in the absence of HCC, may contribute to the observed poor specificity. To increase specificity, a high diagnostic cut-off of AFP expression in serum (400 ng/ml) has been adopted in screening recommendations. However, the high cut-off has also resulted in low sensitivity. Consequently, the American Association for the Study of Liver Diseases (AASLD) and European Association for the Study of the Liver no longer recommend routine use of AFP as a single screening test.
The AASLD 2018 guidelines noted, however, that there are data to suggest that longitudinal changes in AFP hold promise to increase sensitivity and specificity, but no specific guidance about AFP trend levels was published.9 We sought to understand whether a specific slope cut-off value for AFP expression could be of value for early diagnosis of HCC.
Methods
Patients/materials
A retrospective review of charts between July 7th, 2004 and June 1st, 2021 was performed at the University of Connecticut Health Center. The study received approval from the university’s IRB, granting an exempt status for the use of de-identified data, and adhered to the ethical guidelines of the Declaration of Helsinki (as revised in 2013). Study inclusion criteria were as follows: patients older than 18 years confirmed to have HCC, cirrhosis, or both, as determined either by imaging or biopsy according to AASLD diagnostic criteria for HCC. Exclusion criteria were as follows: vulnerable patients (<18 years old, prisoners, mentally handicapped), those with insufficient documentation of HCC,10 HCC therapy, liver surgery, ablation or embolization, a lack of necessary clinical data, and those without any elevated AFP values. Two groups of patients were formed: patients with diagnosed HCC (HCC group), and patients with cirrhosis but no confirmed HCC (control group), as established by definitive imaging or biopsy and by survival for at least 12 months following the original search without evidence of HCC by routine US surveillance.
Data collection
The following data were collected: age, AFP, aspartate aminotransferase (AST) and alanine aminotransferase (ALT) levels, liver histopathology, imaging records, dates of HCC diagnosis, and blood testing. An elevation was defined as a serum value that exceeded the upper limit of normal according to the hospital laboratory standards. A trend in values was defined as a set of at least two consecutive elevated levels at least three months apart. A peak value was defined as the highest available value for that case, and a peak trend was defined as the peak value plus the two consecutive values immediately preceding the peak. Single point peak values were excluded from trend analyses.
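The definitions above can be sketched in code. The following is a minimal illustration, not the authors' procedure: the upper limit of normal (`ULN_NG_ML`), the 90-day reading of "three months", and the function names are all assumptions introduced here, and a trend is read as two consecutive measurements that are both elevated and sufficiently far apart.

```python
from datetime import date

# Hypothetical upper limit of normal; the study used the hospital laboratory's own standard.
ULN_NG_ML = 9.0
# "At least three months apart" is approximated here as 90 days (an assumption).
MIN_GAP_DAYS = 90


def has_trend(measurements, uln=ULN_NG_ML, min_gap_days=MIN_GAP_DAYS):
    """Return True if two consecutive measurements are both elevated
    and at least min_gap_days apart. measurements: list of (date, value)."""
    ordered = sorted(measurements)  # chronological order
    for (d1, v1), (d2, v2) in zip(ordered, ordered[1:]):
        if v1 > uln and v2 > uln and (d2 - d1).days >= min_gap_days:
            return True
    return False


def peak_trend(measurements):
    """Return the peak value plus the two measurements immediately preceding it,
    or None when the peak has fewer than two preceding measurements."""
    ordered = sorted(measurements)
    peak_idx = max(range(len(ordered)), key=lambda i: ordered[i][1])
    if peak_idx < 2:
        return None
    return ordered[peak_idx - 2 : peak_idx + 1]


# Invented example series (not study data).
example = [(date(2019, 1, 10), 6.0), (date(2019, 7, 15), 12.0), (date(2020, 1, 20), 25.0)]
print(has_trend(example), peak_trend(example))
```

In this invented series the second and third values are both above the assumed limit and about six months apart, so the series would qualify as a trend under this reading.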
ICD-9 and -10 codes for primary liver cancer and hepatic cirrhosis were used to identify 2,456 unique patient charts of which 444 patients had a diagnosis of HCC, and 2,012 patients had cirrhosis but no HCC as determined by imaging and outcome.
For the HCC group, patients were further selected for having at least two elevated AFP values recorded separated by at least three months, but no more than 24 months from first to last values. This decreased the HCC patient pool to 34 patients. A detailed chart review of the 34 patients revealed that some of the patients could not be included in the study for the following reasons: six patients had HCC diagnosis entered in error; six patients had imaging findings equivocal for HCC, and no follow up imaging was done; three patients had an initial diagnosis of HCC which was ruled out later; three patients had confirmed HCC but had no AFP data around the time of the diagnosis; and one patient had radioablation. The remaining 15 patients had HCC confirmed by imaging and/or biopsy and were included in the study (Fig. 1).
For the control group, 2,012 patients with cirrhosis were identified. From these, 34 patients were randomly selected for the control group. Of these, patients were excluded for the following reasons: 12 patients had no radiographic evidence of cirrhosis, four patients lacked imaging or pathology records, and two patients had radiologic findings equivocal for HCC with no follow up imaging done. The 16 remaining patients were included because they had imaging and/or biopsy confirmed cirrhosis and documentation of no HCC at least 12 months following the original search by routine US surveillance (Fig. 1).
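As a quick arithmetic check of the selection flow described above, the exclusion counts can be tallied as follows. The dictionary keys are shorthand labels introduced here; the numbers are those given in the text.

```python
# Chart counts reported in the text.
total_charts = 2456
hcc_charts = 444
cirrhosis_charts = 2012

# HCC group: 34 patients met the AFP criteria before detailed chart review.
hcc_exclusions = {
    "diagnosis entered in error": 6,
    "equivocal imaging, no follow-up": 6,
    "HCC later ruled out": 3,
    "no AFP data near diagnosis": 3,
    "radioablation": 1,
}
hcc_included = 34 - sum(hcc_exclusions.values())

# Control group: 34 randomly selected cirrhosis patients before exclusions.
control_exclusions = {
    "no radiographic evidence of cirrhosis": 12,
    "no imaging or pathology records": 4,
    "equivocal for HCC, no follow-up": 2,
}
control_included = 34 - sum(control_exclusions.values())

print(hcc_charts + cirrhosis_charts == total_charts, hcc_included, control_included)
```

The tallies reproduce the 15 HCC cases and 16 controls stated in the text.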
To focus on AFP trends prior to HCC diagnosis, a pre-diagnosis subgroup of nine HCC cases was selected from the original HCC group, each having three or more consecutive AFP values immediately prior to the date of diagnosis (with no other intervening values), of which two or more exceeded the upper limit of normal. In this pre-diagnosis HCC subgroup, no patients had AFP levels that reached or exceeded the diagnostic cut-off (400 ng/ml). Controls for this HCC subgroup consisted of cirrhotic patients who had two or more consecutive elevated AFP levels at any time. To assess inflammation in the HCC pre-diagnosis group, two or three consecutive AST and ALT values were recorded on the same dates as the AFP tests or as close to them as possible.
Statistical analysis
The AFP data were log-transformed to correct for distributional skewness. Longitudinal AFP values were averaged within individuals and compared between HCC and cirrhotic non-HCC control groups using a Wilcoxon rank-sum test. Individual slopes of log (AFP) versus time from the HCC diagnosis or the last measurement in non-HCC cirrhotic controls were estimated using linear regression models and were compared between HCC and non-HCC control groups using a Wilcoxon rank-sum test. The slope estimation and comparison above were conducted using all available AFP levels, but only pre-diagnosis AFP levels were used in the HCC cases.
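To make the per-patient slope concrete, the sketch below fits an ordinary least-squares line to log-transformed AFP against time. It is an illustration only: the text does not state the logarithm base or the time unit, so the natural logarithm and time in years relative to diagnosis are assumptions here, and the function name and sample values are invented.

```python
import math


def log_afp_slope(times, afp_values):
    """Least-squares slope of ln(AFP) against time for one patient.

    times: measurement times (assumed here to be years relative to diagnosis).
    afp_values: AFP levels in ng/ml, all > 0.
    """
    logs = [math.log(v) for v in afp_values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return sxy / sxx


# Invented example: AFP doubling every six months before diagnosis.
example_slope = log_afp_slope([-1.5, -1.0, -0.5], [10.0, 20.0, 40.0])
print(example_slope)
```

Under these assumptions, a series that doubles every half year has a slope of 2·ln 2 per year, which shows how a slope on the log scale corresponds to a constant fold change per unit time.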
HCC cases and cirrhotic non-HCC controls then were modeled jointly using a linear mixed effects model with a random subject intercept, time, and group-time interaction to test the slope difference between the groups. As the slope estimates significantly differed, we further estimated the slope for log (AFP) in HCC cases prior to the diagnosis and that for cirrhotic non-HCC controls using linear mixed effects models with a random subject intercept and the fixed effect of time.
The discriminative power of the slope of log (AFP) for the status of HCC versus cirrhotic non-HCC controls was measured using area under the curve (AUC) analysis. Diagnostic values (sensitivity, specificity, positive predictive value, and negative predictive value) were calculated given a cut-off in the range from 0.1 to 5.3 (covering the whole range of slopes) with an increment of 0.1. Lastly, the association between log (AST) or log (ALT) and log (AFP) over time in HCC cases was assessed using a linear mixed effects model with a random subject intercept and time from the HCC diagnosis. p-values <0.05 were considered statistically significant. All statistical analyses were performed using R version 4.1.2.
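The cut-off sweep and AUC described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the R analysis actually performed: the AUC is computed by pairwise comparison (the Mann-Whitney formulation), the function names are invented, and the slope values are hypothetical, not study data.

```python
def auc(case_slopes, control_slopes):
    """AUC as the fraction of case/control pairs in which the case slope
    exceeds the control slope (ties count as one half)."""
    wins = 0.0
    for c in case_slopes:
        for k in control_slopes:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_slopes) * len(control_slopes))


def diagnostic_values(case_slopes, control_slopes, cutoff):
    """Classify as HCC when slope > cutoff and return the four diagnostic values."""
    tp = sum(s > cutoff for s in case_slopes)
    fn = len(case_slopes) - tp
    fp = sum(s > cutoff for s in control_slopes)
    tn = len(control_slopes) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if (tp + fp) else None,
        "npv": tn / (tn + fn) if (tn + fn) else None,
    }


# Cut-offs from 0.1 to 5.3 in increments of 0.1, as described in the text.
cutoffs = [round(0.1 * i, 1) for i in range(1, 54)]

# Hypothetical slopes for illustration only.
demo_cases = [0.2, 0.5, 1.0, 2.0]
demo_controls = [0.05, 0.1, 0.3, 0.6]
sweep = {c: diagnostic_values(demo_cases, demo_controls, c) for c in cutoffs}
print(auc(demo_cases, demo_controls), sweep[0.3])
```

Scanning `sweep` for the cut-off with the preferred balance of sensitivity and specificity mirrors the selection step described in the text.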
Results
Mean AFP levels in HCC and cirrhotic non-HCC controls
To compare the natural history of AFP expression for our HCC and cirrhotic non-HCC control cases over many years versus previously published results, we examined all available AFP values for each case, calculated the means, and compared them between HCC and cirrhotic non-HCC controls. Both HCC and cirrhotic non-HCC control groups had varying etiologies of hepatitis that led to the development of cirrhosis (Table 1). There was no statistically significant difference in hepatitis etiologies between the two groups (p > 0.999). HCC cases had three to 14 AFP measurements and a mean follow-up of 4.09 years compared to two to 18 measurements with a mean follow-up of 6.97 years in cirrhotic non-HCC controls. For the 15 HCC cases, the mean AFP ranged from 2.60 to 12,200 ng/ml with a group mean of 1,200 and standard deviation (SD) of 31,200. The number of cases that reached diagnostic AFP expression (>400 ng/ml) at any time was 5 (33%). For the 16 cirrhotic non-HCC cases in the control group, the mean AFP ranged from 2.08 to 54.43 ng/ml, with a group mean of 8.15 and SD of 12.96. The difference in mean AFP between the HCC and the control groups was statistically significant (p < 0.05).
Table 1. Hepatitis etiologies in HCC and non-HCC cirrhotic control groups
Hepatitis Etiology | HCC | Non-HCC Cirrhotic Controls |
---|---|---|
Hepatitis C | 8 (53.33%) | 7 (43.75%) |
Alcoholic Hepatitis | 3 (20%) | 3 (18.75%) |
Hepatitis B | 2 (13.33%) | 2 (12.5%) |
Hepatitis B and hemochromatosis | 1 (6.67%) | 0 (0%) |
NASH | 1 (6.67%) | 1 (6.25%) |
Autoimmune Hepatitis | 0 (0%) | 1 (6.25%) |
Hepatitis B and C | 0 (0%) | 1 (6.25%) |
Hepatitis C/NASH | 0 (0%) | 1 (6.25%) |
Mean AFP slopes in HCC and cirrhotic non-HCC controls
For the HCC group, the mean estimated slope of log (AFP) was calculated to be 0.65 (SD = 1.15; range: −0.73 to 3.31) versus −0.16 (SD = 0.42; range: −1.70 to 0.17) for the non-HCC group. There was a significant difference in the slope distribution between groups with a p-value of 0.011. The positive mean AFP slope of the HCC cases indicates that the AFP levels, even those only mildly elevated, trended upward with time. This is consistent with previous publications on the natural history of AFP expression in HCC patients.11,12 The fact that the AFP slopes of HCC and non-HCC cirrhotic controls were significantly different is not surprising as the HCC group included data from HCC cases with very high levels of AFP after diagnosis. However, the clinical utility of total mean slope as an indicator of the development of HCC is limited by a bias of normal values early in the screening process especially when the durations prior to diagnosis of HCC were long. To minimize this bias, we considered whether a model in which AFP expression taken immediately prior to the date of diagnosis of HCC (pre-diagnosis) might be useful. The idea is based on the assumption that most HCC patients will have had some trend of increased serum AFP expression, albeit often at non-diagnostic levels, before HCC is diagnosed. To more closely approximate the common situation in which HCC cases are diagnosed in the absence of diagnostic cut-off AFP expression, all HCC patients with AFP expression levels above the diagnostic cut-off were excluded from the pre-diagnosis subgroup described below.
Mean AFP slopes in pre-diagnosis HCC cases compared to cirrhotic non-HCC controls with any elevated AFPs
As a subgroup of the HCC cases, nine patients were selected for having consecutive elevated AFP values immediately prior to the date of diagnosis as defined previously in the Methods section. For the HCC group, the time between pre-diagnostic AFP dates ranged from 5.8 months to 15.3 months, and the time between the first and the last pre-diagnostic dates ranged from 12.7 to 22.3 months. The mean estimated log (AFP) slope for the HCC group was 1.49 (SD = 1.62, range: 0.20 to 5.37) while that of the controls was 0.31 (SD = 0.32, range: 0.01 to 0.91). The difference in the slope distribution between groups was statistically significant (p = 0.013). A linear mixed effects model showed a significant difference in the slope estimate between HCC and the cirrhotic non-HCC control group (group and time interaction p = 0.001, Table 2). Following this finding, linear mixed effects models were fitted for HCC cases and controls, separately. The slope estimate for log (AFP) in the HCC group prior to the diagnosis was 1.21 (95% CI 0.44 to 1.97, p = 0.005) while that of the cirrhotic non-HCC control was 0.15 (95% CI 0.08 to 0.23, p < 0.001).
Table 2. Linear mixed effects model results for the combined HCC plus cirrhotic non-HCC control group, and for the HCC and control groups separately
Term | Beta | 95% CI Lower | 95% CI Upper | p-value |
---|---|---|---|---|
Combined HCC Plus Control Group | | | | |
(Intercept) | 3.36 | 2.89 | 3.83 | <0.001 |
Time | 0.15 | 0.05 | 0.26 | 0.005 |
HCC group (ref = controls) | 1.25 | 0.55 | 1.94 | 0.002 |
HCC group × time | 1.07 | 0.46 | 1.67 | 0.001 |
HCC Group Only | | | | |
(Intercept) | 4.60 | 4.02 | 5.19 | <0.001 |
Time | 1.21 | 0.44 | 1.97 | 0.005 |
Control Group Only | | | | |
(Intercept) | 3.36 | 2.91 | 3.81 | <0.001 |
Time | 0.15 | 0.08 | 0.23 | <0.001 |
The AUC of the curve plotting sensitivity against the false positive rate (1 - specificity) for the log (AFP) slope of pre-diagnosis HCC (Fig. 2) was 0.844, indicating acceptable discrimination. Using a slope cut-off in the range of 0.25 to 0.32 to determine HCC (HCC if the slope > cut-off; otherwise, non-HCC) gave a sensitivity of 0.89 and a specificity of 0.70 (positive predictive value 0.73 and negative predictive value 0.88).
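As a consistency check, the four reported diagnostic values can be reproduced from a single two-by-two table. The counts below are an inference made here, not figures stated in the text: they assume nine pre-diagnosis HCC cases, of which eight exceed the cut-off, and ten controls, of which three exceed it.

```python
# Inferred (assumed) counts; the text reports only the resulting proportions.
true_pos, false_neg = 8, 1   # HCC cases above / at or below the slope cut-off
false_pos, true_neg = 3, 7   # controls above / at or below the slope cut-off

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)
npv = true_neg / (true_neg + false_neg)

print(sensitivity, specificity, ppv, npv)
```

Under these assumed counts the proportions are 8/9, 7/10, 8/11, and 7/8, which agree with the reported values to two decimal places.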
Aminotransferase slopes compared to AFP slopes in pre-diagnostic HCC cases
To determine whether inflammation, and not HCC, was primarily responsible for the observed trend in log (AFP) in the HCC cases, AST and ALT values were determined at the closest time points to the dates of the AFP tests in the pre-diagnosis HCC cases. The mean log (AFP) was found to be decreased by 0.12 SD per SD increase in log (AST), while the mean log (AFP) was decreased by 0.04 SD per SD increase in log (ALT). Neither log (AST) nor log (ALT) slopes were significantly associated with log (AFP) slope (p = 0.825 and 0.825, respectively). The results support the notion that the observed AFP trend in pre-diagnosis HCC cases could not have been solely due to inflammation, and therefore, the evidence indicates that the majority of observed AFP expression was due to HCC.
Discussion
We previously attempted to evaluate AFP trend data using published AFP results from various reports in the literature.13 In that report, important ancillary data including the dates of HCC diagnosis, corresponding dates and values of aminotransferases, and follow up of controls were not available. Nevertheless, those AFP slope analyses showed a large difference in AFP slope in HCC cases compared to non-HCC controls, but failed to show statistical significance or an increase in sensitivity and specificity compared to those published for AFP cut-off values. In the current study, additional important patient data were available, including controls with cirrhosis and evidence of inflammation in both groups. This enabled us to determine a specific AFP slope cut-off value that was associated with the presence of early HCC before detection by US.
Tayob et al. conducted a prospective study on AFP expression following patients every 3–6 months for a median of 80 months.9 They used the parametric empirical Bayes screening algorithm to analyze AFP data, and were able to detect HCC 1.7–1.9 years earlier in the cirrhosis group and 1.4–1.7 years earlier in an advanced fibrosis group. Overall, the parametric empirical Bayes improved sensitivity of AFP from 60.4% to 77.1% in patients with cirrhosis and from 72.5% to 87.5% in patients with advanced fibrosis.8 This was a prospective study with 1,050 hepatitis C patients, so the focus on HCV cases may make their conclusions less applicable to cirrhosis of other etiologies.
Lee et al. used a longitudinal analysis approach to study AFP expression in a nested case-control study focused on 82 hepatitis C patients with HCC.14 They investigated AFP expression, standard deviations (SDs), and rates of AFP increase, and used a multiple logistic regression that included patient-specific risk factors such as age, platelet count, and smoking status. They found that the SD of AFP and the rate of AFP increase, along with patient-specific risk factors, improved diagnostic sensitivity to 0.81 compared to 0.76 when a single AFP level was used. The study was a nested case-control study of 82 patients, all with hepatitis C.
The current results are consistent with those of Ricco et al., who also performed a thorough and well-designed retrospective analysis of AFP trends in a large sample of 418 patients with cirrhosis undergoing surveillance for HCC.15 They included 124 patients documented to have HCC and 294 patients serving as controls who did not develop HCC for at least 12 months following the last recorded serum levels.15 They also recorded data points around the time of diagnosis but included some data points post-HCC diagnosis, and had two or three data points recorded per patient with testing intervals ranging from three to 96 months. Their conclusion that AFP trends of HCC cases as a group differed significantly from controls is consistent with, and supports, our conclusions. However, they did not address possible contributions of inflammation to AFP expression, nor did they present a specific slope cut-off value for early diagnosis.
In the current study, the maximum time interval between the first and the last pre-diagnostic sampling dates was much smaller (22.3 months), and while a minimum of two consecutive elevated values was required to be considered a trend, only one case had two elevated values, and the rest had three or more in the pre-diagnosis HCC group.
It is important to note that the current study collected AFP data immediately prior to the date of HCC diagnosis while Ricco et al. recorded data points taken up to 4.7 months after HCC diagnosis. Their AFP testing intervals ranged from three to 96 months, whereas the current study required that the AFP values of a trend be separated by no more than 24 months from first to last value, which is more consistent with previous guideline sampling intervals and therefore more easily compared to published results. It should be emphasized that in the current pre-diagnosis HCC subgroup, patients who had diagnostic levels of AFP were excluded in order to more closely simulate the common clinical scenario in which guideline diagnostic AFP cut-off levels are not reached at the time of diagnosis. Those patients were diagnosed solely by imaging or biopsy.
Ricco et al. combined all the AFP data points from their patients and used a curve-fit analysis.15 They observed that the differences in AFP values between HCC and controls peaked at 24 months prior to last sample, and then declined. In the current study, because of the small sample size and few data points per patient, a curve-fitting analysis was not possible. Nevertheless, the current results suggest that patients who develop two or more consecutive AFP elevations at least three months apart, but within 22.3 months, and who generate log (AFP) slopes greater than the slope cut-off of 0.32 may be at increased risk for HCC and may benefit from definitive diagnostic imaging. In other words, a doubling or greater of an AFP value six months after a previously elevated AFP level should be viewed with suspicion for the presence of early HCC even if the levels are well below guideline cutoff values.
Weaknesses of the current study include its retrospective design, small sample size, and a database from a single academic hospital. Strengths of the study include inclusion of cirrhotic controls having a variety of etiologies, inclusion of hepatitis controls, focus on cases diagnosed in the absence of cut-off AFP levels, determination of an AFP slope cut-off value associated with high risk of HCC, follow up data from up to 18 years of observation, and confirmation of a lack of appreciable contribution of inflammation to AFP expression.
Conclusions
The results of our current study suggest that measuring AFP values during HCC screening indeed has value if AFP longitudinal trends rather than single value elevations are used. The mean slope of log (AFP) for the pre-diagnosis HCC group was estimated to be 1.21 while that of the controls was 0.15, and that difference was statistically significant. A mean log (AFP) slope cut-off of 0.32 gave a sensitivity of 89% and specificity of 70%. Because AFP testing is inexpensive and readily available, routine regular AFP expression monitoring with AFP trend analyses could still be useful for the early detection of HCC. Although these results are highly statistically significant, a prospective multi-center study should be undertaken to confirm these conclusions.
Abbreviations
- AASLD: American Association for the Study of Liver Diseases
- AFP: alpha-fetoprotein
- ALT: alanine aminotransferase
- AST: aspartate aminotransferase
- AUC: area under the curve
- HCC: hepatocellular carcinoma
- US: ultrasound
Declarations
Acknowledgement
The support of the Herman Lopata Chair in Hepatitis Research (GYW) is gratefully acknowledged.
Ethical statement
The study received approval from the University of Connecticut Health Center IRB, granting an exempt status for the use of de-identified data, and adhered to the ethical guidelines of the Declaration of Helsinki (as revised in 2013).
Data sharing statement
No other data are available.
Funding
None to declare.
Conflict of interest
The authors have no conflicts of interest related to this publication.
Authors’ contributions
Proposed concept for the study, analyzed data, and revised the manuscript with critical revisions (GYW), collected and analyzed data, drafted the manuscript (AT), analyzed data (LDCG, CLK).