Introduction
Liver biopsy with subsequent histological scoring is the best measure of fibrosis; and, in patients with chronic hepatitis C (CHC), it provides important information relevant to treatment decisions.1–6 Fibrosis assessment can also determine whether the presence of significant fibrosis/cirrhosis warrants additional screening measures for varices and hepatocellular carcinoma (HCC). Despite its potential value, fibrosis assessment by liver biopsy has been declining. The advent of direct acting antivirals (DAA) for the treatment of hepatitis C virus (HCV) and their accompanying minimal side effects and high response rates have decreased the number of patients with CHC who need biopsies to identify advanced liver disease. In addition, liver stiffness, as evaluated by vibration controlled transient elastography (Fibroscan®), and lower cost noninvasive assays are increasingly being used as alternatives to liver biopsy.7,8 Selecting a reliable noninvasive test to evaluate hepatic fibrosis in CHC patients remains an ongoing challenge to clinicians.9–12 Serum markers have been evaluated to predict hepatic fibrosis, and, as compared to liver biopsy, they are safer, faster, and less expensive.13 While multiple serum marker based tests have been developed and evaluated, it remains difficult to define a perfect surrogate assay. An important observation for all assays (including biopsy based assays) is that errors in fibrosis scoring are less likely to occur at the two extremes (minimal fibrosis and maximum fibrosis/cirrhosis) than at the intermediate scoring range. Since the surrogate fibrosis markers can be reasonably used to define minimal fibrosis (F0-F1) or significant fibrosis (F2-4), they can also be used to determine which patients need a biopsy to more accurately stage their liver disease.
There are two categories of surrogate serum markers: panels using indirect markers of fibrosis that are typically associated with liver function and panels using direct serum proteins associated with fibrosis.14–18 Examples of the former are aspartate aminotransferase (AST) platelet ratio index (APRI) and Fibrosis-4 (FIB-4), and examples of the latter are FibroSpect® II (FSII) and FibroTest (Fibrosure in the USA). An important distinction between the two types of tests is that the indirect marker tests (APRI and FIB-4) can be calculated using standard laboratory values, while the FSII and Fibrosure assays require additional serum protein assays and patented algorithms that increase their cost relative to APRI and FIB-4. In addition, the use of assays from both categories has been proposed to improve scoring of fibrosis in patients.19–22
Most of the surrogate assays are evaluated using a representative population and do not take into account ethnic/racial disparity, which could lead to variable performance in a subpopulation of patients. Based on several instances where high FSII results led to biopsies in patients with minimal fibrosis, we evaluated the performance of fibrosis assays specifically in African Americans (AA).
Methods
Using data from the electronic medical records (EMR) at the Wayne State University Physician Group (WSUPG) Gastroenterology Clinic, patients who had FSII ordered were identified. The study protocol was approved by the Institutional Review Board at Wayne State University. FSII values were obtained between January 2008 and June 2013 for 319 individuals (limited to AA and Cau only), and, of those, 160 also had results from liver biopsy. Race was typically self-reported, and we used AA to indicate individuals who were classified as Black and Caucasian (Cau) to indicate those who were classified as white. All other races, including Asian, Hispanic, and Middle Eastern, were a minority, and those individuals were excluded from this study. Patient characteristics, the results of laboratory, imaging, and endoscopic studies, histological studies, medical history, and risk factors for fibrosis were obtained from the EMR. Patients with CHC were both antibody and HCV RNA positive. Alcohol intake information was not collected due to the variability in the EMR data. Liver biopsy histological scoring for fibrosis was by Metavir (F0-F4). Data incorporated in the collection sheet were selected on the basis of chronological proximity to FSII (closest dates to FSII date were selected). The APRI and FIB-4 were calculated using published formulas.23 APRI = ((AST (IU/mL) / 40 IU/mL) / platelets (count × 10³/mL)) × 100. FIB-4 = (age (years) × AST (IU/mL)) / (platelets (count × 10³/mL) × square root of ALT (IU/mL)). Both assays are continuous variables, and the literature based cutoffs to define minimal (F0-F1) vs. significant (F2-F4) fibrosis were 0.7 for APRI and 1.45 for FIB-4. Statistical evaluation was performed using a SAS based program (JMP®). Differences were evaluated using Student's t test, analysis of variance (ANOVA), or chi-square, depending on the variable. For evaluating assay performance, 2 × 2 contingency tables and receiver operating characteristic (ROC) curves were also used.
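The two index formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the software used in the study: the function names are hypothetical, the rule that a score at or above the cutoff counts as significant fibrosis is an assumption (the text gives only the cutoff values), and the inputs are taken on the same scale as the values reported in Table 1.

```python
import math

AST_UPPER_LIMIT = 40.0  # denominator applied to AST in the Methods formula
APRI_CUTOFF = 0.7       # literature cutoff quoted in Methods
FIB4_CUTOFF = 1.45      # literature cutoff quoted in Methods


def apri(ast: float, platelets: float) -> float:
    """APRI = ((AST / 40) / platelets) x 100."""
    return (ast / AST_UPPER_LIMIT) / platelets * 100.0


def fib4(age_years: float, ast: float, alt: float, platelets: float) -> float:
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast) / (platelets * math.sqrt(alt))


def fibrosis_category(score: float, cutoff: float) -> str:
    # Assumption: a score at or above the cutoff is read as significant fibrosis.
    return "F2-F4" if score >= cutoff else "F0-F1"


# Illustration only: group means for AA patients with HCV from Table 1,
# used here as if they were one patient's laboratory values.
example_apri = apri(ast=61, platelets=213)
example_fib4 = fib4(age_years=59, ast=61, alt=62, platelets=213)
print(f"APRI  = {example_apri:.2f} -> {fibrosis_category(example_apri, APRI_CUTOFF)}")
print(f"FIB-4 = {example_fib4:.2f} -> {fibrosis_category(example_fib4, FIB4_CUTOFF)}")
```

Because group means are not an individual patient's values, the printed categories illustrate only the arithmetic and say nothing about any patient in the cohort.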
Results
The majority of the 319 patients with an FSII measurement were AA (275; 86%). The two groups were similar in gender (AA = 56% male; Cau = 60% male), and patients overall had an average body mass index (BMI) of 25-29 kg/m² and an average age of 58 years (Table 1). Liver relevant parameters, such as average alanine aminotransferase (ALT) and AST levels, were elevated, while average platelets and albumin were at the low range of normal. Patients with HCV had high levels of viremia, as indicated by an average value greater than 4 × 10⁶ copies/mL. CHC was the primary etiology for liver disease and was more likely to be the etiology in AA than in Cau (AA = 250, 91%; Cau = 29, 66%; p < 0.005).
Initial analysis compared all patients (regardless of etiology) with biopsy (n = 160) to all patients with an FSII result (n = 319) by race. Regardless of etiology, biopsy Metavir scores showed less fibrosis in AA than in Cau (1.5 ± 0.1 AA vs. 2.1 ± 0.3 Cau; p < 0.05 as a continuous variable by Student's t test; p < 0.005 as a nominal variable by Pearson chi-square). In contrast, FSII results showed that AA had more fibrosis than Cau (60 ± 2 AA vs. 46 ± 4 Cau; p < 0.005 by Student's t test). This discrepancy suggests that fibrosis scores in AA are higher by the FibroSpect II assay than by liver biopsy.
Given the racial difference in fibrosis scoring between the liver biopsy and the FSII assay, additional analyses were performed that were restricted to patients with matched biopsies and surrogate measurements and limited to the population of interest (AA HCV patients). Two additional surrogate assays (APRI and FIB-4) based on liver function relevant serum markers were included in the analysis to evaluate alternative assays to FSII. Rather than using numerical scores, the analysis was modified to reflect whether the scores indicated minimal fibrosis (F0-F1) or significant fibrosis (F2-F4). This dichotomous scoring is preferred for FSII and is predicted to improve the accuracy of all assays, regardless of whether an assay is more accurate at either end of the scale than in the middle (Fig. 1). The results are presented by race and compare AA to Cau patients. The numbers in parentheses in the figure represent the number of patients in each group. The liver biopsy Metavir scores in Figure 1 indicated that similar numbers of patients were present in both categories, although AA were more likely to have lower scoring fibrosis than Cau. Thus, the population was not biased significantly towards either end of the fibrosis scoring. The greatest divergence in scores from the biopsy results was in the FSII assay for AA, where 70% of the patients scored as significant fibrosis (F2-F4). The corresponding proportion in Cau was identical. The APRI was closest to the biopsy in AA with 38% advanced fibrosis, while the FIB-4 was closer to the FSII with 65% significant fibrosis. Although the numbers were small, in Cau as compared to AA, the APRI score was the most likely to be different from the biopsy results.
Patient results for the three noninvasive assays were then plotted as individual numerical scores against each stage of the liver biopsy score in order to identify patients with discordant scores (Fig. 2). For all three assays, scores increased with increasing Metavir score, regardless of race, and the fit line was highly significant (p < 0.001). The FSII assay had the most over-scored AA patients (42/134 = 31%). This was different from the Cau patients, where only 8% were over-scored. The APRI was the best performing assay with respect to accurately scoring minimal fibrosis in AA (17/125 = 14% over-scored), while the FIB-4 assay was similar to the FSII (42/125 = 34% over-scored). In contrast, under-scoring (i.e., failing to identify patients with significant fibrosis) was lowest for FSII (10/134 = 7%) as compared to APRI (27/125 = 22%) and FIB-4 (17/125 = 14%).
Since the appropriate cutoffs for the noninvasive assays may differ in AA from those derived in the literature from predominantly Caucasian populations, ROC curves were plotted, and the ROC tables were used to define the cutoffs and subsequent over-scoring (false positives) for the three assays. Figure 3 presents the ROC curves, the AUC, the cutoffs (as determined by the statistical program), and the percentage of false positives using the AA specific cutoffs. Only the cutoff for FSII (52) differed from the literature and test manufacturer cutoff of 42. The APRI and FIB-4 cutoffs were similar to those of the literature. Even when using the AA specific cutoff, the FSII false positive rate was still 24% (data not shown).
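To make the idea of a population specific cutoff concrete, the sketch below selects the cutoff that maximizes Youden's J (sensitivity + specificity − 1) across candidate thresholds. This criterion is an assumption: the text does not state which rule the statistical program applied. The function name and the toy scores are hypothetical and are not study data.

```python
def youden_cutoff(scores, significant_fibrosis):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A score >= cutoff is read as a positive (F2-F4) call.
    """
    n_pos = sum(1 for y in significant_fibrosis if y)
    n_neg = len(significant_fibrosis) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one patient in each biopsy category")

    best = None
    for cutoff in sorted(set(scores)):
        pairs = list(zip(scores, significant_fibrosis))
        true_pos = sum(1 for s, y in pairs if y and s >= cutoff)
        true_neg = sum(1 for s, y in pairs if not y and s < cutoff)
        sensitivity = true_pos / n_pos
        specificity = true_neg / n_neg
        j = sensitivity + specificity - 1.0
        # Keep the first (lowest) cutoff that reaches the highest J.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical assay scores and biopsy calls, invented for illustration only.
toy_scores = [20, 35, 45, 50, 55, 60, 70, 80]
toy_biopsy_f2_f4 = [False, False, False, False, True, False, True, True]
cutoff, sens, spec = youden_cutoff(toy_scores, toy_biopsy_f2_f4)
print(cutoff, sens, spec)
```

Other criteria, such as fixing a minimum sensitivity and then maximizing specificity, would be equally defensible and could yield a different cutoff from the same ROC table.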
Based on the observation that the FSII assay was the most sensitive assay at defining patients with significant fibrosis (i.e., low underscore rate) but overestimated fibrosis in 24%-31% of AA patients, we evaluated the hypothesis that using a noninvasive assay that was based on a different set of markers than the ones used in the FSII assay would improve the use of noninvasive markers for predicting fibrosis in the AA population. Thus, the initial analysis focused on reducing the number of false positive patients using the FSII assay. Based on the number of patients who by biopsy were false positive in FSII (oval in Fig. 4) but defined as minimal fibrosis by both the APRI and FIB-4 assays, we found that combining the assays improved the accuracy (specificity) of the FSII with respect to correctly scoring patients with minimal fibrosis. Using APRI, the false positive number of patients in the FSII assay (FOver) was reduced from 42 to 10 (7% of the total patients over-scored). For the FIB-4 assay, the false positive number of patients in the FSII assay (FOver) was reduced from 42 to 15 (10% of the total patients over scored). For the high fibrosis accurate patients (FAccH) in the FSII assay, a significant number was underscored by the APRI and FIB-4, such that using these assays to modify the patients scored as high fibrosis by FSII also reduced the number of correctly scored high fibrosis patients. The rectangular box represents patients with biopsy proven significant fibrosis who were underscored by FSII but correctly scored as significant fibrosis by APRI or FIB-4. Since there were few patients in this category (i.e., under-scored by FSII), using the APRI or FIB-4 in these high FSII patients would have minimal effect.
A useful method for assessing overall assay performance when comparing single assays vs. combinations is to define specificity and sensitivity in a 2 × 2 contingency table. For the contingency tables in Table 2, biopsy Metavir scores were used to define F0-F1 vs. F2-F4 using AA patients with CHC for whom both biopsy and noninvasive assay results were available. The literature cutoffs for FSII, APRI, and FIB-4 were used to define the F0-F1 and F2-F4 patients in the respective assays (Table 2a–c). Values in Table 2d and e were calculated first using the FSII assay, due to its sensitivity for fibrosis, and the high fibrosis scores were then modified using the APRI or FIB-4 assay to reduce the false positives. The FSII assay was the most sensitive (84%) but the least specific. The APRI assay was the most specific (75%) but had poor sensitivity. The best specificity, as defined by the fewest false positives, was achieved by combining the FSII and APRI. Unfortunately, this combination reduced the sensitivity of the assay for significant fibrosis as compared to FSII alone. Thus, the data confirm that combining the two assays (FSII and APRI) results in predicting AA patients who have minimal fibrosis with exceptional accuracy but at the cost of missing a number of patients with significant fibrosis.
Table 1. Demographics and laboratory values for patients in study
| AA | | | | Cau | | | |
| HCV | (N)# | Other etiology | (N)# | HCV | (N)# | Other etiology | (N)# |
Age (years) | 59 | (250) | 54 | (25) | 54 | (29) | 54 | (15) |
Gender (male) | 56% | (205) | 56% | (25) | 69% | (29) | 40% | (15) |
BMI (kg/m²) | 29 | (124) | 25 | (25) | 26 | (8) | 29 | (4) |
ALT (IU/mL) | 62 | (239) | 41 | (25) | 76 | (26) | 45 | (13) |
AST (IU/mL) | 61 | (239) | 60 | (25) | 54 | (26) | 39 | (13) |
Platelets (×10⁶) | 213 | (233) | 204 | (22) | 216 | (26) | 212 | (12) |
Albumin (g/dL) | 3.97 | (250) | 204 | (23) | 3.97 | (25) | 4.08 | (12) |
HCV RNA (×10⁶ copies/mL) | 4.2 | (127) | | | 6.7 | (8) | | |
# Number of patients in each group.
Table 2. Contingency tables (2 × 2) for biopsy defined positive (F2-F4) vs. negative (F0-F1) among AA with CHC
Table 2a. FibroSpect II assay
| | Biopsy Metavir score | | | FibroSpect II | | |
| | F2-F4 | F0-F1 | Total | Over-scored (false positive) | 42/134 | 31% |
FSII | F2-F4 | 51 | 42 | 93 | Under-scored (false negative) | 10/134 | 7% |
| F0-F1 | 10 | 31 | 41 | Sensitivity for significant fibrosis | 51/61 | 84% |
Total | | 61 | 73 | 134 | Specificity for significant fibrosis | 31/73 | 43% |
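The performance figures reported alongside the FSII contingency table follow directly from its four cell counts. The sketch below recomputes them; the class and method names are hypothetical, and only the counts (51, 42, 10, 31) come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Assay call vs. biopsy call, with biopsy F2-F4 taken as 'positive'."""
    true_pos: int   # assay F2-F4, biopsy F2-F4
    false_pos: int  # assay F2-F4, biopsy F0-F1 (over-scored)
    false_neg: int  # assay F0-F1, biopsy F2-F4 (under-scored)
    true_neg: int   # assay F0-F1, biopsy F0-F1

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    def sensitivity(self) -> float:
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        return self.true_neg / (self.true_neg + self.false_pos)

    def over_scored(self) -> float:
        # Share of all patients who are false positives.
        return self.false_pos / self.total

    def under_scored(self) -> float:
        # Share of all patients who are false negatives.
        return self.false_neg / self.total


# Cell counts from the FSII contingency table (AA patients with CHC).
fsii = TwoByTwo(true_pos=51, false_pos=42, false_neg=10, true_neg=31)
for name in ("sensitivity", "specificity", "over_scored", "under_scored"):
    print(f"{name}: {getattr(fsii, name)():.3f}")
```

Note that the over-scored and under-scored rates use all 134 patients as the denominator, whereas sensitivity and specificity use only the 61 biopsy positive and 73 biopsy negative patients, respectively. The computed fractions agree with the tabulated percentages up to rounding.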
Discussion
The 2014 guidelines from the American Association for the Study of Liver Diseases (AASLD)/Infectious Diseases Society of America (IDSA) (Recommendations for Testing, Managing, and Treating Hepatitis C; http://www.hcvguidelines.org) recommend evaluation for advanced fibrosis in all persons with HCV infection in order to facilitate an appropriate decision regarding treatment strategy and to determine the need for initiating additional screening measures. Additional screening measures could include evaluation of patients with advanced fibrosis for cirrhosis, varices, and HCC. Fibrosis screening is also useful in following the progression/regression of fibrosis with the goal of determining when patients with advanced disease can be defined as no longer needing routine screening. Unfortunately, there is no clear consensus on surrogate markers for fibrosis that can replace liver biopsy. Even more significant is that very little information is available about their performance in significant subsets of individuals, such as the AA population, which comprises the predominant group in many urban HCV treatment settings. This study focused on the performance evaluation of three noninvasive assays in AA. The three assays (FSII, APRI, and FIB-4) were compared to biopsy results to evaluate their performance. The combination hypothesis, that using a fibrosis specific assay (FSII) with a serum liver function marker based assay (APRI or FIB-4) would improve accuracy relative to the individual tests in AA, was also evaluated.
The assay used in our clinic to represent fibrosis specific assays was the FSII test.24 This test measures fibrosis specific proteins, uses a patented algorithm, and is performed by Prometheus Laboratories in the United States. FSII uses the combination of α2-macroglobulin (α2M), tissue inhibitor of metalloproteinase 1 (TIMP1), and hyaluronic acid (HA) to differentiate minimal (F0-F1) from advanced (F2-F4) fibrosis. This panel of markers was initially evaluated in 696 CHC patients, where the predictive accuracy for significant fibrosis (F2-F4) based on area under the ROC curve (AUROC = 0.831) corresponded to a sensitivity of 77% and a specificity of 73%. A value ≥ 42 was proposed to be optimal for differentiating advanced fibrosis from minimal fibrosis. Many subsequent studies have validated this surrogate of biopsy in representative populations, including a prospective study of 108 CHC patients comparing serum FSII results with liver biopsy. In that study, the AUROC for detecting significant fibrosis was 0.826, with a sensitivity of 72% and a specificity of 74%. Unfortunately, as shown in the current study, the results from these population based studies do not translate to the AA population, where we found that the AUROC was 0.69, and the sensitivity of 84% was tempered by the lower specificity of 43%. As demonstrated in this study, this lack of specificity is due to the fact that almost 1/3 of the AA patients with HCV are over-scored in the FSII assay. When our population was used to evaluate a possible modification in the cutoff for AA as compared to the general population, the AA specific cutoff of 52 (vs. 42) did not significantly improve the over-scoring. Thus, use of this assay could lead to the performance of unnecessary biopsies in AA patients with minimal fibrosis.
Since a number of studies have suggested that combining surrogate assays may improve results in representative populations, we evaluated the use of both APRI and FIB-4 as assays for measuring fibrosis in our AA population. We also evaluated whether they could be combined with FSII to increase accuracy in the AA population. APRI is one of the most studied panels of indirect markers.23 This score is based on the AST level and platelet count and is easy to calculate.10,12,17,25,26 An APRI cutoff of 0.7 (most commonly used) had a sensitivity of 77% and a specificity of 72% for detecting significant fibrosis (Metavir F2-F4) based on a meta-analysis of 40 studies of primarily Asian and Cau populations. In our AA population, the sensitivity was lower (53%), while the specificity was similar (75%), compared to Asian and Cau populations. In contrast, the FIB-4 in AA had a good sensitivity (70%) but poor specificity (48%). Vallet-Pichard et al. found that a FIB-4 index < 1.45 had a specificity of 98% and a sensitivity of 74% for excluding significant fibrosis in a French population.12 Thus, in our study with AA patients, none of the three assays performed as well as reported for a variety of different populations that did not include significant numbers of AA individuals with CHC.
Since our study focused on AA populations, we used ROC analysis to confirm that the cutoffs in the literature were appropriate for our AA population. The differences were minor for APRI and FIB-4 (0.7 vs. 0.8 and 1.45 vs. 1.8, respectively). Regardless, when comparing AUROC, our AA population had values that were considerably lower than reported for the three assays in the literature, consistent with a poorer performance of the three assays in AA patients. We then evaluated the combination of FSII with the two assays. Combining either assay with FSII significantly improved the identification of patients who did not need a biopsy. Based on our study, the more easily calculated APRI, which has a lower rate of over-scoring than FIB-4, may be preferable to the FIB-4. However, there was no significant difference between the two when used in the combination study.
Based on these results, we propose that in the AA population, a positive FSII test (FSII ≥ 42) can be defined as significant fibrosis only if the APRI or FIB-4 scores are also elevated. Patients with high FSII but low APRI/FIB-4 can be confidently defined as having low fibrosis. Thus, they have no need for biopsy, since combining the FSII and APRI/FIB-4 assays reduces the high false positive rate of the FSII assay in AA. In contrast, although the FSII assay is highly sensitive for significant fibrosis, combining it with APRI or FIB-4 decreases that sensitivity because some patients with significant fibrosis are under-scored by the APRI/FIB-4 and are therefore excluded. Thus, our primary conclusion is that these surrogate assays are useful in AA populations as guides for patient counseling, but in a considerable number of patients they fail to accurately report fibrosis as standalone assays.
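The proposed rule can be written as a short decision function. This is a minimal sketch with hypothetical names: it pairs FSII with one indirect score at a time (APRI or FIB-4), uses the literature cutoffs quoted in Methods, and assumes that a score at or above its cutoff counts as elevated.

```python
FSII_CUTOFF = 42     # positive FSII test is FSII >= 42
APRI_CUTOFF = 0.7    # literature cutoff quoted in Methods
FIB4_CUTOFF = 1.45   # literature cutoff quoted in Methods


def combined_category(fsii_score: float, indirect_score: float,
                      indirect_cutoff: float) -> str:
    """Call significant fibrosis only when FSII and the indirect score agree."""
    fsii_positive = fsii_score >= FSII_CUTOFF
    # Assumption: "elevated" means at or above the literature cutoff.
    indirect_elevated = indirect_score >= indirect_cutoff
    if fsii_positive and indirect_elevated:
        return "F2-F4"
    # Covers FSII-negative patients and FSII-positive patients whose
    # indirect score is low (the group otherwise over-scored by FSII).
    return "F0-F1"


# Hypothetical scores for illustration; not patient data.
print(combined_category(60, 0.5, APRI_CUTOFF))   # high FSII, low APRI
print(combined_category(60, 1.0, APRI_CUTOFF))   # high FSII, elevated APRI
print(combined_category(30, 2.0, FIB4_CUTOFF))   # FSII below cutoff
```

In this sketch the indirect score only ever downgrades a positive FSII result. That choice mirrors the trade-off described above: over-scoring falls, but a patient with significant fibrosis and a low APRI or FIB-4 will be classified as F0-F1.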
With respect to future methodology for measuring liver fibrosis, liver stiffness measurement using vibration-controlled transient elastography (VCTE) with FibroScan units is being utilized more often as a method for immediate assessment of liver stiffness.27–29 While FibroScan appears to be better than combining APRI and FSII with respect to specificity and sensitivity of a noninvasive assay for identifying patients with minimal fibrosis, it has not yet been evaluated fully in the AA population. It is also currently only available in a few large centers where the cost of purchase, training, and utilization can be justified by the large volume of patients. Since many AA are identified, treated, and evaluated for surveillance follow-up in practices that do not have access to such technology, the results of this study provide these physicians with relative confidence that they can accurately distinguish those patients with minimal fibrosis and counsel them accordingly. This study also suggests that the three surrogate assays do not perform as well in AA patients as in the various other populations reported in the literature. AA patients with significant fibrosis or indeterminable fibrosis should be referred to facilities where FibroScan determination of fibrosis can be performed in the event that a liver biopsy is not an option.
Acknowledgements and Statements
The authors have no financial conflicts to report. The corresponding author has reviewed the STARD recommendations and confirms that the manuscript adheres to those suggestions.
Abbreviations
- α2M:
α2-macroglobulin
- AA:
African American
- AASLD:
American Association for the Study of Liver Diseases
- ALT:
alanine aminotransferase
- APRI:
AST platelet ratio index
- AST:
aspartate aminotransferase
- BMI:
body mass index
- Cau:
Caucasian
- CHC:
chronic hepatitis C
- DAA:
direct acting antivirals
- EMR:
electronic medical records
- FIB-4:
Fibrosis-4
- FSII:
FibroSpect II
- HA:
hyaluronic acid
- HCC:
hepatocellular carcinoma
- HCV:
hepatitis C virus
- IDSA:
Infectious Diseases Society of America
- ROC:
receiver operating characteristic
- TIMP1:
tissue inhibitor of metalloproteinase 1
- VCTE:
vibration-controlled transient elastography
- WSUPG:
Wayne State University Physician Group
Declarations
Conflict of interest
None
Authors’ contributions
Data collection (MT, PT, SP, JA, DG), data analysis (MT, PT, DG), data interpretation (MT, PT, SP, JA, FA, ME), conception of the study (PT), article draft (MT), article revision (PT, FA, MM, ME), assistance in design and conception of the study (ME).