Introduction
Non-alcoholic fatty liver disease (NAFLD) is a non-communicable disease that affects at least a quarter of the world’s population. The disease is characterized by fat accumulation in >5% of the hepatocytes (steatosis) without significant alcohol intake. The average NAFLD prevalence rate is higher in Asia (27%) than in the United States (24%) and Europe (23%).1–3 NAFLD can be seen in obese individuals as well as in people of average weight. “Lean/non-obese NAFLD” is the medical term for NAFLD in normal-weight individuals. Lean/non-obese NAFLD accounts for almost 20% of the NAFLD population and nearly 5% of the general population.4 Pathophysiological conditions such as central obesity, hyperglycemia, dyslipidemia, and hypertension are closely linked to NAFLD.5 Adipose tissue lipolysis is the primary source of hepatic lipid accumulation.6 During obesity, basal lipolysis is enhanced, resulting in an increased flow of free fatty acids to the liver (and skeletal muscle).7 This, in turn, promotes resistance to insulin.7,8 Alternate sources for lipid accumulation in the hepatocytes include de novo lipogenesis from excess dietary sugars and dietary fat.6 NAFLD patients may have stable disease, progress slowly or rapidly, or even regress,9 and some will develop hepatocellular carcinoma (HCC).10
From the stage of simple steatosis, NAFLD may progress to non-alcoholic steatohepatitis (NASH), fibrosis, cirrhosis, and end-stage liver disease, which ultimately necessitates liver transplantation.11 NASH is characterized by simple steatosis plus inflammation and hepatocyte ballooning. NAFLD progresses to NASH when the liver exceeds its capacity to metabolize free fatty acids and is characterized by the formation of harmful lipid species and activation of lipotoxic pathways, leading to hepatocellular injury, as observed in both humans and mouse models.12–15 Repeated hepatocellular injuries lead to irregular wound healing, ultimately leading to fibrosis or scarring of liver tissue. During fibrosis, inflammatory molecules activate hepatic stellate cells to become myofibroblasts. These myofibroblasts migrate into the parenchyma of the liver and secrete the collagen-rich extracellular matrix that characterizes fibrosis or scarring.16 Scarring is further increased by the inhibition of fibrolysis.16 Scarring leads to cirrhosis, characterized by thick, fibrous septae and architectural distortion. While steatosis and fibrosis can be reversed, the damage becomes irreversible once the cirrhosis stage is reached. Once cirrhosis has developed, it predisposes to portal hypertension and organ failure complications and can progress to HCC (Fig. 1).11
Consumption of high fructose and calorie-rich food, together with low physical activity, undoubtedly increases the risk of NAFLD.17 NAFLD is positively associated with insulin resistance (IR) and other metabolic syndromes and is highly prevalent in adults and children.18 In addition to these environmental and metabolic dysregulations, both epigenetic changes (histone modifications, non-coding RNAs, etc.) and genetic factors (inherited genetic variants) have been identified as responsible for, and/or leading risk factors for, NAFLD progression into NASH, cirrhosis, and/or HCC.19 The role of epigenetics and gut microbiota in NAFLD progression is beyond the scope of this review, and readers are referred to some of the recent reviews on this topic.20–23 Several genome-wide association studies (GWAS) carried out over the past decade have identified multiple genetic variants that predispose people to NAFLD. Aberrations in lipid processing play a central role in the pathogenesis of steatosis and progressive liver disease. It is therefore not surprising that variants affecting the function of genes involved in triglyceride (TG) synthesis, storage, and export are known to modulate an individual’s susceptibility to NAFLD (Fig. 1).24 Other genetic variants that have been associated with NAFLD lie in genes regulating fibrogenesis, oxidative stress, and insulin signaling (Table 1).25–71 Besides, aberrant expression of several adipokines and cytokines is also associated with NAFLD onset and progression. In this review, we discuss various classes of genetic variants that predispose individuals to NAFLD.
Table 1. Candidate genes discussed in this review that are involved in non-alcoholic fatty liver disease progression
S. No | Gene Name | Pathway Involved | References |
---|---|---|---|
1 | Adiponectin | Determine the Insulin Sensitivity | 25 |
2 | Angiotensin II | Involved in Fibrosis | 26 |
3 | Angiotensin II type 1 receptor | Involved in Fibrosis | 27 |
4 | Apolipoprotein C3 | Involved in Lipid Metabolism | 28 |
5 | Apolipoprotein E | Involved in Lipid Metabolism | 29 |
6 | CD14 | Regulates cytokine release | 30 |
7 | CYP2E1 | Involved in Oxidative Stress | 31 |
8 | ENPP1 | Involved in Lipid Metabolism | 32 |
9 | HFE | Regulates Iron Absorption | 33,34 |
10 | Glutathione-S-Transferase | Involved in Oxidative Stress | 35 |
11 | GCKR | Involved in Lipid Metabolism | 36–38 |
12 | HSD17B13 | Involved in Lipid Metabolism | 39–41 |
13 | IL-6 | Regulates Inflammation | 42,43 |
14 | IRS-1 | Involved in Insulin Signaling | 32 |
15 | KLF6 | Involved in Fibrosis | 44 |
16 | Leptin Receptor | Regulates Inflammation | 45 |
17 | MTP | Involved in Lipid Metabolism | 46,47 |
18 | MBOAT7 | Involved in Lipid Metabolism | 48–50 |
19 | NCAN | Remodeling of the extracellular matrix | 51 |
20 | PEMT | Involved in Lipid Metabolism | 52 |
21 | PIK3AP1 | Involved in PI3K Signaling | 53 |
22 | PNPLA3 | Controls Lipid Accumulation | 54–56 |
23 | PPARGC1A | Controls Lipid Accumulation | 57 |
24 | PPARα | Controls Lipid Accumulation | 58 |
25 | PPARγ | Controls Lipid Accumulation | 59 |
26 | PXR | Involved in Lipid Metabolism | 60 |
27 | SOD2 | Involved in Oxidative Stress | 61 |
28 | STAT3 | Regulates Inflammation | 62 |
29 | TCF7L2 | Determine the Insulin Sensitivity | 63 |
30 | TNF-α | Regulates Inflammation | 64 |
31 | TGF-β1 | Regulates Inflammation | 26 |
32 | TM6SF2 | Involved in Lipid Metabolism | 65–67 |
33 | TRAIL | Regulates Inflammation | 68 |
34 | TRIB1 | Involved in Lipid Metabolism | 69,70 |
35 | UGT1A1 | Involved in Oxidative Stress | 71 |
Major genetic risk alleles identified by GWAS studies modulate the progression of non-alcoholic fatty liver disease
The heritable nature of NAFLD has been demonstrated in many studies.72,73 GWAS, exome sequencing, and candidate gene studies have unearthed NAFLD susceptibility genes. Below, we describe in detail five major genes whose sequence variants substantially affect NAFLD development and progression.
PNPLA3 (Patatin-like phospholipase domain-containing protein 3; adiponutrin, or calcium-independent phospholipase A2-epsilon, or acylglycerol O-acyltransferase)
Patatin-like phospholipase domain-containing protein 3 is a single-pass type II membrane protein. This enzyme is multifunctional and has been shown to have at least three enzymatic activities: triacylglycerol lipase, acylglycerol O-acyltransferase, and retinyl esterase.74–76 PNPLA3 is present mainly on the endoplasmic reticulum membrane and on the surface of lipid droplets in hepatocytes, hepatic stellate cells, and adipocytes.77,78 Pingitore et al. demonstrated that overexpression of wild-type PNPLA3 protects against fibrosis by decreasing the secretion of tissue inhibitor of metalloproteinases 1 and 2 (TIMP1 and 2) and the secretion of matrix metallopeptidase 2.77
To date, the C>G transversion polymorphism (rs738409[G]), which encodes an isoleucine-to-methionine (I148M) substitution in PNPLA3, shows the strongest correlation with susceptibility to non-alcoholic54 as well as alcoholic79 liver disease (Fig. 2a). This transversion is observed at a high frequency in Hispanics and predisposes the liver to damage ranging from fatty liver to NASH, fibrosis, and HCC.80 Homozygosity for the minor allele (I148M; non-synonymous variant) in the PNPLA3 gene generally results in high hepatic fat content54,55 and high hepatic inflammation,81 as indicated by elevated alanine aminotransferase (ALT) and aspartate aminotransferase (AST) concentrations.56 The PNPLA3 I148M variant has limited hydrolase activity; it therefore halts retinyl ester release and leads to the accumulation of TGs and retinyl esters in the hepatocytes and stellate cells, resulting in fatty liver.76,82,83 Besides, the PNPLA3 I148M variant protein accumulates on the surface of lipid droplets and obstructs lipid modifications in hepatocytes. Furthermore, it inhibits the activity of other lipases (i.e., PNPLA2) and thereby reduces the turnover and release of TGs.84 This accumulation may be due to less availability of the I148M protein to ubiquitin ligases, leading to an impairment in 26S proteasomal degradation.83 It is important to note that environmental factors trigger this protein accumulation. A meta-analysis of 24 studies with 9915 patients from diverse backgrounds found that the PNPLA3 I148M variant was strongly associated with fibrosis (Fig. 2b).85 Another meta-analysis of 16 studies showed that the presence of the PNPLA3 I148M variant was associated with significantly increased steatosis and fibrosis.
Individuals with the homozygous guanine (GG) genotype showed 77% more fat content86 and a 5-fold increase in disease development compared with individuals with the homozygous cytosine (CC) genotype.87 In contrast, another allele (rs6006460[T]), encoding a serine-to-isoleucine (S453I) substitution in the same gene, is most prevalent in the African-American population and is associated with diminished fat content in the liver.54
The PNPLA3 I148M variant has been associated with the clinical presentation of HCC, particularly with more advanced disease and a poorer prognosis.88,89 Allelic frequency calculations showed that rs738409 C>G confers a greater risk of progression to HCC than of progression to cirrhosis.87,90 The unadjusted risk for HCC with the homozygous GG genotype was 12.19 (95% CI 6.89–21.58) relative to CC.87 Homozygous PNPLA3 148M individuals (with long-standing advanced liver disease and cirrhosis) have a 2- to 16-fold increased risk of developing HCC.91 Similarly, a 16-fold higher HCC risk was observed in obese subjects with rs738409 C>G homozygosity.92
Transmembrane 6 superfamily member 2
Transmembrane 6 superfamily member 2 (TM6SF2) is mainly expressed in the liver and small intestine and regulates cholesterol synthesis and secretion of lipoproteins.65,93,94 The protein encoded by TM6SF2 is concentrated at the endoplasmic reticulum and Golgi in the hepatocytes and aids in the secretion of TG-rich lipoproteins. Decreased expression of TM6SF2 increases hepatic TG content in cell culture models,93 suggesting that loss of TM6SF2 may prime HCC development and progression.
Single nucleotide polymorphism (SNP) rs58542926 (c.449 C>T; causing E167K missense mutation) in TM6SF2 is strongly associated with NAFLD65 in both adult65,95–98 and pediatric99,100 cohorts. This non-synonymous polymorphism ranks second to that of PNPLA3 and is seen in Europeans (∼7%), Hispanics (∼4%), African Americans (∼2–3%),65,101 as well as Indians.102 This SNP is associated with advanced hepatic fibrosis/cirrhosis.96 Because the variant is a loss-of-function mutation, the reduced TM6SF2 mRNA and protein expression leads to lipid retention by impairing very-low-density lipoprotein (VLDL)-mediated lipid secretion (Fig. 3). Therefore, carriage of the TM6SF2 variant increases the susceptibility to liver damage.65,95,96,103,104 NAFLD with the TM6SF2 E167K polymorphism forms a distinct category. Even though these patients show increased liver fat content, they preserve insulin sensitivity with regard to lipolysis and hepatic glucose.66 Moreover, persons with the TM6SF2 E167K polymorphism do not show hypertriglyceridemia.66 The function of TM6SF2 is not well established; however, it is thought to be involved in VLDL secretion.65 Subjects carrying the TM6SF2 rs58542926 C>T E167K variant had reduced circulating lipids, improved lipid profiles, and reduced cardiovascular disease risk. Substitution of glutamic acid with lysine at residue 167 of TM6SF2, which reduces protein production by at least 2-fold, decreased VLDL secretion by 50%.65 TM6SF2 loss of function in Caco-2 cells in vitro and in the zebrafish model in vivo resulted in diminished lipid clearance and increased ER stress and TG accumulation in enterocytes.67 TM6SF2 is required for apolipoprotein B (APOB) protein stability.105 TM6SF2 carrying the E167K mutation cannot stabilize the APOB protein, resulting in hepatic lipid accumulation and low serum lipid levels.105
Liver-specific deletion of TM6SF2 in mice leads to impairment in VLDL secretion, resulting in hepatic steatosis and fibrosis, ultimately causing accelerated HCC development.106 Further, in these mouse models, the tumor burden is inversely correlated with the TM6SF2 protein levels.106
Glucokinase regulatory protein
The glucokinase regulatory protein (GCKR) gene encodes glucokinase regulatory protein and is expressed mainly in the liver, pancreas, and colon. In the hepatocytes, GCKR shuttles between the cytoplasm and nucleus and regulates de novo lipogenesis by controlling the glucose influx. GCKR is a potent inhibitor of the glucokinase (GK) enzyme and thereby limits glucose storage and disposal by the liver. GCKR binds to the enzyme and forms an inactive complex. Fructose metabolites regulate the binding affinity of GCKR for GK.
The rs780094 (C>T) variant is present in an intronic region of GCKR. Still, it shows strong linkage disequilibrium with rs1260326, a functional non-synonymous C>T substitution encoding the GCKR protein with the P446L substitution (Fig. 4).107 This SNP in GCKR contributes to the risk of type-2 diabetes and dyslipidemia in multiple populations.36 The association between the rs1260326 C>T polymorphism and NAFLD progression has been demonstrated in various GWAS from East Asia,37,38,108 South Asia,109 and Europe.110,111 African American and Hispanic NAFLD patients also show concordant but nonsignificant trends for the rs1260326 C>T polymorphism.101 The P446L substitution is closely associated with liver fat deposition and is one of NAFLD’s inheritable risk factors.112,113 The GCKR rs1260326 P446L variant reduces the inhibitory activity of GCKR over glucokinase, which results in enhanced glucose flux and increased de novo lipogenesis.114,115 Specifically, the P446L variant is less able to inhibit GK activity in response to fructose-6-phosphate. As a result, it increases glucose uptake by the liver and decreases the circulating glucose and insulin levels. In addition, the GCKR rs1260326 P446L variant increases glycolysis and the production of malonyl-CoA, which is a substrate for lipogenesis and blocks fatty acid oxidation, thus favoring hepatic fat accumulation.115 A meta-analysis of five studies revealed that the rs780094 variant in GCKR is positively associated with hepatic steatosis.116 Thus, not surprisingly, the same variant is also associated with a modest risk of fatty liver, and homozygous carriers are at a 1.2-fold higher risk of developing NAFLD in their lifetime.37
Membrane-bound O-acyltransferase domain-containing protein 7
The available data suggest that the TMC4 gene is not abundantly expressed in the human liver (Fig. 5).48 However, the available eQTL (expression quantitative trait loci) analysis suggests that the rs641738 SNP, a C>T missense mutation in the first exon of the TMC4 gene, leads to reduced expression and activity of MBOAT7 and perhaps participates in the progression of liver disease.48,117–119
Membrane-bound O-acyltransferase domain-containing protein 7 (MBOAT7) is an integral membrane protein with LPIAT1 (lysophosphatidylinositol acyltransferase 1) enzymatic activity. MBOAT7 is mainly expressed in hepatocytes, hepatic sinusoidal cells, and stellate cells and is localized in the mitochondria-associated membrane. The mitochondria-associated membrane connects the endoplasmic reticulum and mitochondria, where biosynthesis of fats and formation of lipid droplets occur.120,121 LPIAT1 is a critical enzyme of the phospholipid acyl-chain remodeling pathway called the “Lands’ Cycle”.122 LPIAT1 transfers an acyl group to the second acyl-chain position of lysophospholipids, using arachidonoyl-CoA as the acyl donor. In this way, it controls the desaturation of phospholipids and the quantity of free arachidonic acid, a precursor of proinflammatory mediators.123
SNP rs641738 C>T results in diminished MBOAT7 protein expression and reduced desaturation of phosphatidylinositol in the hepatocytes. The rs641738 T allele favors hepatic fat accumulation and contributes to various liver diseases, including NAFLD, fibrosis, and HCC.48–50 SNP rs641738 was initially associated with alcoholic cirrhosis.124 Soon afterward, it was also linked with steatosis and progressive NAFLD,48,98,125 as well as with markers of liver fibrosis.126 The MBOAT7 rs641738 C>T polymorphism also correlates with high hepatic fat content and elevated serum ALT levels,119 and may also regulate IR and glucose metabolism in NAFLD patients.127
HSD17B13
The gene HSD17B13 encodes the 17β-HSD type 13 (17β-hydroxysteroid dehydrogenase type 13) enzyme, which has retinol dehydrogenase activity. It is involved in metabolizing retinoids, fatty acids, and sex steroids.128 The liver is the predominant site of HSD17B13 protein expression,129 and the protein is localized to hepatocyte lipid droplets.130
The rs72613567 variant in HSD17B13 is an insertion in which an adenine (A) nucleotide is incorporated next to the thymine (T) present at the donor splice site of exon 6 (T>TA). This insertion leads to a truncated and unstable protein with diminished enzymatic activity.39 The TA minor allele is commonly seen in Europeans and East Asians, while it is seen to a lesser extent in Africans and Hispanic/Latino individuals. HSD17B13 rs72613567:TA is generally associated with lower serum AST and ALT levels and with an overall reduced risk of chronic liver disease (Fig. 6).39,40 Further, rs72613567:TA is significantly associated with reduced occurrence of chronic liver disease, non-alcoholic cirrhosis, alcoholic liver disease, as well as alcoholic cirrhosis. Surprisingly, data from bariatric surgery patients showed that the HSD17B13 rs72613567:TA allele is associated with liver fibrosis but not with steatosis.41,131 The data from this study suggest that this SNP is probably not involved in steatosis development but regulates disease progression.
Although HSD17B13 expression is high in NAFLD subjects, the precise biological function of HSD17B13 and the pathway/mechanism through which the rs72613567:TA variant affects fat accumulation in the liver are still not known.132 Overexpression of HSD17B13 in mouse liver increases lipogenesis, leading to hepatic steatosis.133 These data are consistent with the observation that NAFLD patients show enhanced HSD17B13 expression in their livers.132,133 However, HSD17B13 knock-out mice also show increased hepatic steatosis and an inflammation phenotype, mainly due to an increase in de novo lipogenesis.134 This observation contrasts with what is observed in humans, in whom HSD17B13 loss-of-function protects from liver disease.39 A recent study in a Chinese population suggests that this variant in the HSD17B13 gene confers disease susceptibility rather than disease progression.135 In light of these discrepancies, further studies are warranted to define the role of wild-type HSD17B13 and its variants in liver metabolism, under both physiological and pathological conditions.
Recently, another protein-truncating variant in HSD17B13, rs143404524 (c.573delC), was identified. This deletion leads to a frameshift at codon 192, resulting in a prematurely truncated protein.136 This variant is common in African Americans and rare in Whites and Hispanic/Latino people.136 As with rs72613567:TA, people with rs143404524 show reduced liver disease. Similarly, rs62305723 G>A, which encodes the missense variant p.P260S with lower enzymatic activity, is also associated with decreased disease severity.132 All these data support the idea that reducing HSD17B13 expression may halt liver disease progression.
Polymorphisms/mutations in genes involved in insulin signaling/insulin resistance
121Gln mutation in ENPP1 (ectoenzyme nucleotide pyrophosphate phosphodiesterase 1/PC-1) and several polymorphisms of IRS-1 (insulin receptor substrate-1) correlate with liver damage and fibrosis.32 These genetic determinants are associated with IR, which more often predisposes people to NAFLD. Transcription factor 7 like 2 (TCF7L2) is a known regulator of glucose homeostasis. Polymorphisms in this gene that increase the type 2 diabetes risk are correlated with IR and NAFLD severity.63,137 Polymorphisms observed in apolipoprotein E (APOE),29 and nuclear receptor pregnane X receptor (PXR)60 also correlate well with either NAFLD appearance or severity.
Polymorphisms/mutations in genes involved in lipid metabolism
The -493 G>T SNP in the 5′ promoter region of the microsomal triglyceride transfer protein (MTP) gene is associated with increased steatosis and histological NASH grade compared with patients carrying either the homozygous high-activity allele or the heterozygous genotype.46,47 MTP encodes protein-disulfide isomerase and mediates neutral lipid transfer between the membranes.138 MTP is essential for VLDL synthesis and secretion in the liver and the intestine. The G>T SNP in MTP decreases the transcription rate, reducing protein to levels that are insufficient to export all the TGs from liver cells.
The presence of C-482T, T-455C, or both polymorphisms in the apolipoprotein C3 (APOC3) gene increases the NAFLD prevalence by 38% compared to wild-type homozygotes.139 The same authors showed that people with these polymorphisms also exhibit a 60% increase in plasma TG levels.139
The phosphatidylethanolamine methyltransferase (PEMT) gene product participates in phosphatidylcholine synthesis, which is required for VLDL synthesis. Compared to controls, NASH patients show a higher frequency of the PEMT loss-of-function (V175M) allele, and these patients mainly show low body mass index.52
Polymorphisms/mutations in cytokine/adipokines
Genetic studies have shown a clear correlation between the severity of liver disease features, such as steatosis, inflammation, IR, and fibrosis, and the SNPs of genes involved in fibrogenesis, inflammation, and oxidative stress-responsive pathways.42,140–143 TNF-α (tumor necrosis factor-alpha) is the first cytokine whose SNPs were linked to susceptibility for NAFLD, steatohepatitis, and IR.64 Mutations/polymorphisms at -238G, -863A, and -1031C in TNF-α have been identified in NASH patients.64,144,145 Several other SNPs in genes such as TNF-related apoptosis-inducing ligand (TRAIL),68 interleukin-6,42,43 signal transducer and activator of transcription 3 (STAT3),62 adiponectin,25 leptin receptor,45 peroxisome proliferator-activated receptor-alpha (PPARα),58 PPARγ,59 and peroxisome proliferator-activated receptor gamma-coactivator 1 alpha (PGC-1α)57 have also been correlated with NAFLD progression (Table 1).
Miscellaneous Polymorphisms
The role of phosphatidylinositol 3-kinase (PI3K) signaling as a significant pathway in human cancer development and progression, primarily HCC, is well known.146–148 Moreover, IR abrogation is invariably linked to NAFLD progression.149,150 A recent high-throughput whole-exome sequencing study identified a familial c.512C>T variation in the phosphoinositide-3-kinase adaptor protein 1 gene.53 However, this single-family study requires a more extensive population-based genomic analysis before the role of this variant in NAFLD progression can be understood. Neurocan (NCAN), also known as chondroitin sulfate proteoglycan 3, is a proteoglycan mainly involved in remodeling the extracellular matrix of the central nervous system151 but is also expressed in the liver.152 Gorden et al. in 2013 demonstrated that NCAN rs2228603 (T variant; P92S) is positively linked with steatosis, lobular inflammation, and fibrosis.51 Mammalian tribbles homolog 1 (TRIB1) regulates lipogenesis in the liver by modulating plasma TG and cholesterol levels. SNP rs6982502, present in the enhancer sequence of the TRIB1 genomic sequence, was shown to be significantly associated with NAFLD.69 The rs17321515 and rs2954029 polymorphisms found in TRIB1 also increase NAFLD risk.70 Furthermore, the TRIB1 rs17321515 polymorphism increased coronary heart disease risk in both the general population and NAFLD patients.153 A recent multi-omics study of NAFLD identified 18 independent sequence variants at 17 loci, of which four, i.e., rs140201358 in PNPLA2, rs7029757 in TOR1B, rs1801689 in apolipoprotein H (APOH), and rs6955582 in glucuronidase beta (GUSB), are novel variants.154 PNPLA2 homozygous mutations are associated with neutral lipid storage disease and fatty liver.155 The PNPLA2 p.Asn252Lys variant leads to an increase in the levels of high-density lipoprotein-cholesterol156,157 and is associated with a high waist:hip ratio.
The rs7029757[A] polymorphism in TOR1B has been associated with ALT levels and cirrhosis but not NAFLD.158 APOH is highly and almost exclusively expressed in the liver.159,160 The trait associations of p.Cys325Gly in APOH strongly suggest its role in lipid metabolism.157,161–163 GUSB encodes β-glucuronidase, a lysosomal enzyme involved in the breakdown of glycosaminoglycans.164 How p.Leu649Pro in GUSB is associated with NAFLD progression remains to be established.
Other than these, polymorphisms in genes that influence the severity of fibrosis, like transforming growth factor beta 1 (TGF-β1) and angiotensin II,26 angiotensin II type 1 receptor,27 and Kruppel-like factor 6;44 in genes influencing the release of and/or response to endotoxin or cytokines, like cluster of differentiation 14 (CD14);30 and in genes that regulate oxidative stress, like hemochromatosis (HFE),33,34 superoxide dismutase-2,61 cytochrome P450,31 UDP glucuronosyl-transferase 1 family, polypeptide A1,71 and glutathione-S-transferase genes,35 are also reported. However, most of these studies were conducted with a small cohort of patients and require further validation using a larger sample size.
GWAS analysis has largely failed to distinguish the polymorphisms associated with high hepatic fat accumulation from those associated with the onset of NAFLD and/or cirrhosis. Magnetic resonance imaging (MRI)-derived proton density fat fraction provides accurate liver fat quantification, but liver biopsy is essential for NASH diagnosis and staging. In most cases, the gene polymorphisms that account for high fat deposition in liver cells correlate well with the onset of cirrhosis. The exceptions are p.His48Arg in alcohol dehydrogenase 1B165 and p.Cys282Tyr in homeostatic iron regulator (HFE).166 These two variants are associated with cirrhosis through alcohol consumption and hemochromatosis, respectively, rather than solely through hepatic fat.154 A recent multi-omics approach using 4,809 all-cause cirrhosis and 861 HCC cases led to the identification of 4 gene variants, i.e., rs738409 in PNPLA3, rs28929474 in SERPINA1, rs58542926 in TM6SF2, and rs72613567 in HSD17B13, that are significantly associated with cirrhosis.154 It is worth noting that only two of these four variants (rs738409 in PNPLA3 and rs58542926 in TM6SF2) are associated with lipid accumulation in the liver cells, while the other two variants, rs72613567 in HSD17B13 and rs28929474 in SERPINA1, are associated with NASH39 and α1-antitrypsin deficiency,167 respectively.
Polygenic risk scores and prediction panels and their clinical relevance
Polygenic risk scores (PRS; also known as genetic risk scores) help assess an individual’s heritable lifetime risk of developing a particular disease based on the combined number of disease-associated genetic variants that the individual carries.168 GWAS have identified robust and reproducible associations between NAFLD progression and variants in genes such as PNPLA3, TM6SF2, HSD17B13, MBOAT7, and GCKR. PRS are now at the forefront of studies on NAFLD heritability. A PRS aggregates the genetic variants most strongly associated with NAFLD progression and can be combined with biochemical parameters to identify the patients at greater risk of developing severe NAFLD.169 These PRS and metabolic factors, together with gene regulatory networks, could be utilized to identify patients with severe liver disease. PRS may improve predictive value, diagnostic accuracy, and the precision of individualized therapy.170
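The aggregation step behind a PRS can be sketched as a weighted sum of risk-allele counts. The following is a minimal illustration under stated assumptions: the function name, the variant keys, and all weights are hypothetical placeholders introduced here, not published effect sizes or a validated NAFLD score.

```python
from typing import Dict

# Hypothetical per-allele weights (for example, log odds ratios).
# These numbers are placeholders for illustration only.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "PNPLA3_rs738409_G": 0.5,
    "TM6SF2_rs58542926_T": 0.4,
    "MBOAT7_rs641738_T": 0.1,
    "GCKR_rs1260326_T": 0.1,
    "HSD17B13_rs72613567_TA": -0.2,  # protective allele, so the weight is negative
}


def polygenic_risk_score(
    allele_counts: Dict[str, int],
    weights: Dict[str, float] = EXAMPLE_WEIGHTS,
) -> float:
    """Return the weighted sum of risk-allele counts (0, 1, or 2 per variant)."""
    score = 0.0
    for variant, weight in weights.items():
        count = allele_counts.get(variant, 0)  # missing variants count as 0 copies
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {variant} must be 0, 1, or 2")
        score += weight * count
    return score


# Hypothetical individual: homozygous PNPLA3 GG, heterozygous HSD17B13 TA.
person = {"PNPLA3_rs738409_G": 2, "HSD17B13_rs72613567_TA": 1}
print(round(polygenic_risk_score(person), 2))
```

In practice, the weights would be taken from association studies in a matched population, and the resulting score would be interpreted relative to a reference distribution rather than as an absolute risk.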
Conclusions and future perspectives
Genetic studies over the past decade, particularly family studies and studies of inter-ethnic susceptibility variations, have significantly advanced our understanding of NAFLD’s molecular basis. Genome-wide scans continue to provide compelling new insights into NAFLD biology and highlight desirable pharmaceutical targets. Identifying genetic factors predisposing an individual to NAFLD also helps to stratify the people at high risk and gives them a chance to follow preventive strategies early in their life. GWAS analysis has helped distinguish the genetic polymorphisms that modulate steatosis, fibrosis, or both in NAFLD patients. For example, the TM6SF2 E167K genotype is associated with steatosis but not fibrosis, whereas MBOAT7 rs641738 is solely associated with increased fibrosis.98 On the other hand, the PNPLA3 genotype is generally correlated with S2-S3 grades of steatosis and F2-F4 stages of fibrosis.98
NAFLD pathogenesis shares several genes involved in more general processes, such as inflammation and fibrosis. NAFLD generally occurs along with systemic metabolic dysfunction leading to a concomitant increase in cardiovascular and diabetes risk.171 Characterizing this shared genetic basis of NAFLD and other related disorders should enable the development of precision medicine approaches and the identification of preventive and therapeutic strategies.171 The data generated from studies involving either candidate-gene associations or GWAS is also exploited to discover drugs. Based on the candidate gene approach and GWAS studies, therapeutic targeting of PPARα, PPARγ, PGC-1α, STAT3, and TNF should benefit NAFLD/NASH patients. Elafibranor is a dual PPARα and PPARδ ligand that alleviates histological outcomes associated with NASH without worsening fibrosis.172
Despite numerous GWAS, the identified variants can explain only a fraction of the calculated NAFLD heritability.173–176 It is tempting to conclude that the previous-generation GWAS, which were mainly array-based, were not powerful enough to identify the minute fraction of the genetic variants responsible for NAFLD heritability. However, the more important cause could be that probably none of the GWAS considered gene-gene and/or gene-environment interactions when predicting the progression of NAFLD. Strengthening this idea, it was observed that the relative risk associated with NAFLD progression is modified by obesity, which leads to inter-individual risk variability.177 However, not all individuals with this risk develop the disease, signifying that these inter-individual differences are products of interactions between genes and the environment.178 It is also possible that various genetic factors work interactively. For example, APOC3 and PNPLA3 gene variants account for nearly 11% and 6.5% of the variance in NAFLD, respectively, while together they account for 13.1% of the variance, suggesting interactive effects of these two polymorphisms.139 A vital contribution of GWAS is the development of polygenic risk scores, which combine the effects of known genetic risk variants. These polygenic risk scores should help predict the patients at the highest risk of developing the disease or progressing to more severe stages.108,179 Although such risk scoring is highly statistically significant, it is limited by its classification protocol and its predictive accuracy.108
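The APOC3/PNPLA3 figures quoted above can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that two uncorrelated variants acting additively would explain a share of variance equal to the sum of their individual shares; the function name is hypothetical, and the comparison is not a formal test of interaction.

```python
def additive_gap(share_a: float, share_b: float, joint_share: float) -> float:
    """Difference (in percentage points) between the sum of two individual
    variance shares and the share explained when both are modeled together."""
    return round(share_a + share_b - joint_share, 1)


apoc3_alone = 11.0    # % of variance attributed to APOC3 variants (from the text)
pnpla3_alone = 6.5    # % of variance attributed to PNPLA3 variants (from the text)
both_together = 13.1  # % of variance explained by both (from the text)

print(apoc3_alone + pnpla3_alone)                              # 17.5
print(additive_gap(apoc3_alone, pnpla3_alone, both_together))  # 4.4
```

The joint figure (13.1%) is 4.4 percentage points below the simple sum (17.5%), which is the kind of non-additivity that the text reads as evidence that the two polymorphisms do not act independently.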
Future Perspectives
GWAS and omics platforms have identified gene variants that increase the risk of NAFLD and HCC. However, these variants are comparatively rare and cannot completely explain the increasing worldwide prevalence of NAFLD. It is important to note that diet, lifestyle, and metabolic co-morbidities are the most critical environmental risks.17,180 Lifestyle modifications, essentially involving dietary energy restriction and regular physical activity, are fundamental to the treatment of NAFLD.181,182 The American Association for the Study of Liver Diseases recommends at least 3–5% total weight reduction to ameliorate steatosis, while 7–10% weight loss improves most pathophysiological features.183 The European Association for the Study of the Liver, the European Association for the Study of Diabetes, and the European Association for the Study of Obesity also advise a target of 7–10% total weight loss.184
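The percentage targets above translate into absolute weight-loss ranges by simple arithmetic. The short Python sketch below shows that conversion for a given baseline body weight; the function name and the band labels are assumptions introduced here for illustration, and the output is an arithmetic restatement of the quoted percentages, not clinical advice.

```python
from typing import Dict, Tuple

# Percentage bands quoted in the text: 3-5% (steatosis) and 7-10% (most features).
TARGET_BANDS: Dict[str, Tuple[float, float]] = {
    "steatosis": (0.03, 0.05),
    "most_features": (0.07, 0.10),
}


def weight_loss_targets_kg(baseline_kg: float) -> Dict[str, Tuple[float, float]]:
    """Convert each percentage band into a (min, max) weight loss in kilograms."""
    if baseline_kg <= 0:
        raise ValueError("baseline weight must be positive")
    return {
        label: (baseline_kg * low, baseline_kg * high)
        for label, (low, high) in TARGET_BANDS.items()
    }


if __name__ == "__main__":
    # Hypothetical baseline weight of 90 kg.
    for label, (low, high) in weight_loss_targets_kg(90.0).items():
        print(f"{label}: {low:.1f} to {high:.1f} kg")
```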
The study of interactions between genes and nutrients is called nutrigenomics,185,186 which includes ‘nutrigenetics’ and ‘nutriepigenetics’. It gives insight into how nutrients influence the expression of a person’s genetic makeup and thereby shift the balance from health to disease. PNPLA3 is the most widely studied gene showing interaction with dietary patterns in NAFLD patients.187 An Italian cohort study showed that high carbohydrate and sugar intake led to high TG accumulation in individuals homozygous for the PNPLA3 mutation but not in the wild-type group.188 PNPLA3 is also influenced by dietary fatty acids, particularly n-6/n-3 polyunsaturated fatty acids.189,190 When challenged with a high-fat diet, carriers of the TM6SF2 mutated allele showed better fasting and postprandial lipid profiles, with a consequent reduction in circulating atherogenic lipoproteins.67,191 MBOAT7 expression was also shown to be downregulated by diet-induced hyperinsulinemia.50 Independent of other metabolic risk factors and dietary restrictions, whole-grain supplementation in carriers of the GCKR mutated allele reduced fasting glucose levels, in association with reduced insulin concentrations.192 In addition to these gene variants, hepatic transcription factors that regulate metabolism are known to be perturbed in metabolic syndromes. These include cAMP-responsive element-binding protein 3-like 3 (CREB3L3), PPARα, and forkhead box O1 (FoxO1). Such proteins can sense nutrient availability and can thus bridge NAFLD genetics with the environment; that is, these transcription factors appear to be amenable to dietary and/or nutrient-based interventions, making them potential targets of nutritional therapy.193 No drugs have been approved to treat NAFLD;194 thus, a better understanding of nutrigenomics, together with the development of genome-based dietary guidelines and personalized nutrition therapy for NAFLD management, will help develop effective treatments.
Abbreviations
- ALT: alanine aminotransferase
- APOB: apolipoprotein B
- APOH: apolipoprotein H
- AST: aspartate aminotransferase
- GCKR: glucokinase regulatory protein
- GK: glucokinase
- GWAS: genome-wide association studies
- GG: guanine guanine
- GUSB: glucuronidase beta
- HCC: hepatocellular carcinoma
- TIMP: tissue inhibitor of metalloproteinases
- HFE: hemochromatosis
- IR: insulin resistance
- MBOAT7: membrane-bound O-acyltransferase domain-containing protein 7
- MTP: microsomal triglyceride transfer protein
- NAFLD: non-alcoholic fatty liver disease
- NASH: non-alcoholic steatohepatitis
- NCAN: neurocan
- PEMT: phosphatidylethanolamine N-methyltransferase
- PGC-1α: peroxisome proliferator-activated receptor gamma coactivator 1-alpha
- PI3K: phosphatidylinositol 3-kinase
- PNPLA3: patatin-like phospholipase domain-containing protein 3
- PPAR: peroxisome proliferator-activated receptor
- PRS: polygenic risk scores
- SNP: single nucleotide polymorphism
- STAT3: signal transducer and activator of transcription 3
- TA: thymine and adenine
- TG: triglyceride
- TM6SF2: transmembrane 6 superfamily member 2
- TNF-α: tumor necrosis factor-alpha
- VLDL: very-low-density lipoprotein
- TCF7L2: transcription factor 7-like 2
Declarations
Acknowledgement
The authors would like to acknowledge the scientists whose excellent research studies are cited in this review and apologize that, due to space constraints, some studies had to be omitted. We also thank Dr. Ravikanth Vishnubhotla, Group Leader-Genetics, Asian Healthcare Foundation/AIG Hospitals, for the critical reading of the manuscript and valuable suggestions.
Funding
This work is supported by ICMR-NIN intramural grants (20-FT04, 21-FS01, and 21-FS03).
Conflict of interest
SKM has been an editorial board member of Gene Expression since October 2022. The authors declare that they have no other conflicts of interest.
Authors’ contributions
SKM searched the literature, collected the data, and wrote and finalized the manuscript. YKG reviewed the literature and improved the manuscript. PNR and BRS are clinical doctors who reviewed the literature and helped ensure that all critical data about this topic are represented in this review. All authors reviewed, finalized, and approved the manuscript for publication.