Introduction
The coronavirus disease (COVID-19) outbreak caused by the SARS-CoV-2 virus has become one of the largest pandemics in modern history. According to the World Health Organization, as of January 2025, COVID-19 has spread to 215 countries and caused more than seven million deaths worldwide.1 The spectrum of clinical forms of COVID-19 varies from asymptomatic infection to severe acute respiratory syndrome.2
It has been discovered that proteins with chaperone-like properties are actively involved in the pathogenesis of COVID-19 by regulating the immune response and viral replication,3 contributing to the development of a cytokine storm,4 and participating in the antigen presentation of viral proteins from infected cells.5 Furthermore, SARS-CoV-2 shares immunogenic epitopes with several human chaperones, which can lead to immune hyperactivation,6,7 multi-organ damage,8 and post-acute sequelae of COVID-19.9 In addition, as obligate intracellular parasites, viruses can seize control over the host cell metabolic machinery, including the chaperoning system, to maintain their life cycle and sustain productive infection.10
A recently discovered class of chaperones, known as heat-resistant obscure (Hero) proteins, is likely to perform functions similar to other molecular chaperones, such as maintaining proteostasis and protecting proteins from pathological aggregation.11,12 Hero proteins are hydrophilic, highly charged, heat-resistant proteins with low molecular weight and disordered structures. These properties allow them to protect other proteins from denaturation under extreme conditions, either through simple “molecular shielding” or by promoting liquid-liquid phase separation.13
Small EDRK-rich factor 2 (SERF2, also known as Hero 7), a member of the Hero protein family, is known for its distinctive role in protein aggregation, which varies depending on the protein “client”.13 Members of the SERF family perform dual functions, both promoting and preventing fibril formation of amyloidogenic proteins.14 Several studies have confirmed that SERF2 increases aggregation of huntingtin, β-amyloid, and α-synuclein, contributing to amyloid proteotoxicity.15–17 However, it also prevents aggregation of TAR DNA-binding protein 43 (TDP-43), which regulates viral RNA expression and has been actively studied in the research on COVID-19.18–20
The SERF protein family has an amino acid composition similar to that of DNA- and RNA-binding proteins, such as zinc-finger proteins.18,21,22 Since zinc-finger proteins are heavily implicated in COVID-19 pathogenesis,23,24 SERF2 may perform similar functions, including the regulation of viral gene expression.18
In our previous studies, we identified an association between the SERF2 SNP rs4644832 and the risk of cerebrovascular diseases.12,25 Because critical illness in COVID-19 and cerebrovascular diseases share several key pathogenic mechanisms, such as increased production of proinflammatory cytokines, hypercoagulation, endothelial dysfunction, inflammation, and oxidative stress,26–34 we hypothesized that SNP rs4644832 correlates with an increased risk of severe COVID-19. Therefore, we set out to investigate the association between SERF2 SNP rs4644832 and the risk of severe COVID-19.
Materials and methods
This case-control study is reported in accordance with the STROBE guidelines. A total of 1,373 unrelated Russians (178 patients with a severe course of COVID-19 and 1,195 controls with mild or asymptomatic COVID-19) from Central Russia were recruited for the study. The patients were enrolled consecutively during the COVID-19 pandemic from 2020 to 2022 at the intensive care units of Kursk Regional Hospital No. 6 and Kursk Regional Tuberculosis Dispensary. Inclusion criteria for the study were self-declared Russian descent and a birthplace within Central Russia. Exclusion criteria for the COVID-19 patients were hepatic or renal failure and endocrine, autoimmune, and/or oncological diseases, which could alter laboratory parameters. All patients in the case group had polymerase chain reaction (PCR)-confirmed COVID-19 and required intensive care unit admission. The control group consisted of otherwise healthy volunteers who were diagnosed with COVID-19 but did not require hospitalization (Fig. 1). Baseline and clinical characteristics of the study population are listed in Table S1.
Low fruit and vegetable intake was defined according to the World Health Organization guidelines as consuming less than 400 g per day.35 Low physical activity was defined as engaging in less than 180 minutes per week of various physical activities.36
Genomic DNA was extracted from blood samples, and the quality of extracted DNA was assessed using a Nanodrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). SNP rs4644832 in SERF2 was genotyped using allele-specific probe-based PCR in accordance with a previously published protocol. Details of primer design, reaction solution, and PCR protocol steps have been previously published.12
STATISTICA software (version 13.3, Informer Technologies, Inc., Santa Clara, CA, USA) was used for statistical analysis. The normality of distribution for quantitative data was assessed using the Shapiro-Wilk test. Most quantitative parameters deviated from a normal distribution; thus, they were presented as the median along with the first and third quartiles [Q1 and Q3]. The Kruskal–Wallis test was applied to compare quantitative variables among three independent groups. Pairwise comparisons were then performed using the Mann-Whitney test. These tests were employed to analyze associations between rs4644832 in SERF2 and clinical features of COVID-19 in the patient group.
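As an illustration of the descriptive and nonparametric steps described above, the following Python sketch computes the median with quartiles and the Mann-Whitney U statistic, using midranks for ties. It is not the STATISTICA workflow used in the study: the function names, the quartile interpolation method, and the example values are assumptions made for illustration, and the calculation of P-values is omitted.

```python
from statistics import median, quantiles


def describe(values):
    """Return (median, Q1, Q3); the 'inclusive' quartile method is an assumption."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return the U statistic of sample x relative to sample y."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


# Hypothetical marker values for two genotype groups (not study data).
group_aa = [12.0, 35.5, 48.0, 60.2]
group_ag = [8.1, 10.4, 22.0]
u_aa = mann_whitney_u(group_aa, group_ag)
u_ag = mann_whitney_u(group_ag, group_aa)
```

The two U statistics always sum to the product of the group sizes, which offers a quick sanity check on the ranking step.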
To evaluate the independent contribution of the rs4644832 variant in SERF2 to severe COVID-19 risk while adjusting for relevant covariates, we performed multivariate logistic regression in R (version 4.4.1). The analysis included one SNP (rs4644832 in SERF2) and five covariates: age, sex, smoking status, body mass index (BMI), and vegetable intake. Unfortunately, data on physical activity levels were unavailable for the control group and could not be included as a covariate in the regression model.
The dependent variable was disease status (0 = control, 1 = severe COVID-19), and all categorical predictors (SNP genotype, sex, smoking status, vegetable intake) were encoded as factors. Genotypes were modeled under a categorical (non-additive) framework. Continuous variables (age, BMI) were used as numeric predictors. Individuals with missing values for any variable were excluded from the analysis.
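The difference between the categorical genotype coding used in this regression and the log-additive coding used in the SNPStats association analysis can be sketched as follows. The function names are hypothetical, the choice of A/A as the reference level is an assumption based on how the regression results are reported, and the code only illustrates the encodings; it does not reproduce the R factor handling.

```python
GENOTYPES = ("A/A", "A/G", "G/G")


def dummy_code(genotype, reference="A/A"):
    """Categorical coding: one 0/1 indicator per non-reference genotype."""
    if genotype not in GENOTYPES:
        raise ValueError(f"unknown genotype: {genotype}")
    return {g: int(genotype == g) for g in GENOTYPES if g != reference}


def additive_code(genotype, effect_allele="G"):
    """Log-additive coding: number of copies of the effect allele (0, 1, or 2)."""
    if genotype not in GENOTYPES:
        raise ValueError(f"unknown genotype: {genotype}")
    return genotype.split("/").count(effect_allele)


heterozygote_dummies = dummy_code("A/G")   # {'A/G': 1, 'G/G': 0}
homozygote_copies = additive_code("G/G")   # 2
```

Under the categorical coding, each genotype receives its own odds ratio relative to the reference; under the log-additive coding, a single odds ratio per allele copy is estimated.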
The dataset was split into a training set (80%) and a testing set (20%) using the caret package to allow model evaluation. The logistic regression model was fitted using the glm function with the binomial family (logit link). Model performance was evaluated by the likelihood ratio test against a null (intercept-only) model, Nagelkerke’s pseudo-R² (calculated with the DescTools package), and classification accuracy on the training set.
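Both goodness-of-fit measures named above can be derived from the log-likelihoods of the null and fitted models. The Python sketch below uses the standard definitions of the likelihood ratio statistic and of Nagelkerke's pseudo-R² (the Cox-Snell R² divided by its maximum attainable value). It is written independently of the DescTools implementation, and the log-likelihood values and sample size in the example are hypothetical.

```python
import math


def likelihood_ratio_chi2(ll_null, ll_model):
    """Likelihood ratio statistic: 2 * (logLik(model) - logLik(null))."""
    return 2.0 * (ll_model - ll_null)


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R2 from log-likelihoods and the number of observations."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for a model fitted to 500 observations.
chi2 = likelihood_ratio_chi2(ll_null=-250.0, ll_model=-200.0)
r2 = nagelkerke_r2(ll_null=-250.0, ll_model=-200.0, n=500)
```

The measure equals 0 when the fitted model does not improve on the null model and reaches 1 only when the fitted log-likelihood is 0.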
To visualize significant predictors (P < 0.05), regression coefficients and their 95% confidence intervals (CIs) were extracted using the broom package, exponentiated to obtain odds ratios (ORs), and plotted on a log scale. Predictors with OR > 1 were classified as “risk” factors, while those with OR < 1 were classified as “protective”.
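The exponentiation step can be sketched in a few lines of Python. The sketch assumes a Wald-type interval (coefficient ± 1.96 standard errors on the log-odds scale), which may differ from the interval type returned by broom; the function names, the "neutral" label for an OR of exactly 1, and the example coefficient are illustrative assumptions.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Exponentiate a log-odds coefficient and its Wald 95% CI limits."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def classify(odds_ratio):
    """Label a predictor by the direction of its odds ratio."""
    if odds_ratio > 1:
        return "risk"
    if odds_ratio < 1:
        return "protective"
    return "neutral"  # not defined in the text; added for completeness


# Hypothetical coefficient and standard error (not taken from the study).
or_value, ci_low, ci_high = odds_ratio_with_ci(beta=-0.60, se=0.28)
label = classify(or_value)
```

Because the interval is symmetric on the log-odds scale, it appears symmetric around the OR only when plotted on a log axis, which is why a log scale suits such plots.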
Compliance of genotype distributions with the Hardy-Weinberg equilibrium was assessed using Fisher’s exact test. Genotype frequencies and their correlation with disease risk were analyzed using SNPStats software (https://www.snpstats.net/start.htm ). A log-additive model was used for genotype association analysis. Associations within the entire group of COVID-19 patients and controls were adjusted for age and sex.
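The idea behind the Hardy-Weinberg check can be illustrated with expected genotype counts derived from allele frequencies. Note that the sketch below swaps in the simpler chi-square goodness-of-fit statistic rather than the Fisher's exact test used in the study, and the genotype counts are hypothetical.

```python
def hwe_chi2(n_aa, n_ag, n_gg):
    """Return expected genotype counts under HWE and the chi-square statistic.

    Assumes a biallelic SNP with both alleles present (no expected count of 0).
    """
    n = n_aa + n_ag + n_gg
    p = (2 * n_aa + n_ag) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele G
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ag, n_gg)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2


# Hypothetical genotype counts (not study data).
expected_counts, statistic = hwe_chi2(n_aa=640, n_ag=320, n_gg=40)
# With 1 degree of freedom, a statistic below 3.84 corresponds to P > 0.05.
in_equilibrium = statistic < 3.84
```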
Due to the potential modifying influence of environmental risk factors on the association of genetic markers with disease, associations were analyzed based on the presence or absence of the risk factor. When information about an environmental risk factor was unavailable in the control group (for fruit/vegetable intake, physical activity levels, and BMI), the patient group was compared to the overall control group. In such cases, the Bonferroni correction was applied to account for multiple comparisons.
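The Bonferroni adjustment multiplies each P-value by the number of comparisons and caps the result at 1. The Python sketch below assumes two comparisons per risk factor (the two strata compared against the same control group); this number is inferred from the adjusted values reported in Table 1 and is not stated explicitly in the text.

```python
def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Unadjusted P-values for the two fruit and vegetable intake strata (Table 1).
adjusted = bonferroni([0.0002, 0.63])  # approximately [0.0004, 1.0]
```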
The following bioinformatics resources were used to analyze the functional effects of SNPs:
The bioinformatic tool GTExPortal (http://www.gtexportal.org/) was used to analyze the expression levels of the gene in whole blood, blood vessels, and lungs, as well as to analyze expression quantitative trait loci (eQTLs).37 The method of eQTL analysis is fully described in our previous article.38
The eQTLGen database (https://www.eqtlgen.org/) was used to examine eQTLs in peripheral blood.39 eQTLGen incorporates 37 datasets, with a total of over 31,000 individuals.
HaploReg v4.2 (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php) was used to assess histone modifications. Acetylation of histone H3 lysine residues at positions 27 and 9 (H3K27ac and H3K9ac, respectively), as well as mono- and tri-methylation of lysine 4 (H3K4me1 and H3K4me3), were studied.40–42 This resource compiles ChIP-seq data from the Roadmap Epigenomics projects.43
The atSNP affinity test (http://atsnp.biostat.wisc.edu/search) was used to assess the impact of SNPs on the gene’s affinity for transcription factors (TFs).44 The method of TF analysis is described in detail in our earlier study.45
The Gene Ontology online tool (http://geneontology.org/ ) provides a systematic classification of gene functions, which we used to analyze the joint involvement of TFs linked to the reference or SNP alleles in biological processes directly related to the pathogenesis of COVID-19.46
The Lung Disease Knowledge Portal (https://lung.hugeamp.org/) and Common Metabolic Diseases Knowledge Portal (https://hugeamp.org/) were used to analyze the correlation between SNPs and phenotypic risk factors of severe COVID-19.
The STRING database’s bioinformatic tools were used to study key functional partners of SERF2. Moreover, the STRING database was utilized to analyze protein-protein interactions between SERF2 and its functional partners.47,48
The Comparative Toxicogenomics Database (https://ctdbase.org/ ) was employed to evaluate the influence of various chemicals and hormones on SERF2 expression.49
Results
The distribution of genotype frequencies conformed to Hardy–Weinberg equilibrium (P > 0.05). The log-additive model analysis was performed in the total sample. Additionally, we analyzed groups stratified by sex, BMI, smoking status, fruit and vegetable intake, and physical activity to identify associations between genetic variants and severe COVID-19 depending on the presence or absence of environmental risk factors (Table 1).
Table 1. Statistically significant associations of rs4644832 in SERF2 with severe COVID-19 in subgroups stratified by sex, smoking status, fruit and vegetable intake, physical activity, and BMI
| Genetic variant | Effect allele | Other allele | N | OR [95% CI]¹ | P² (PBonf) | N | OR [95% CI]¹ | P² (PBonf) |
|---|---|---|---|---|---|---|---|---|
| | | | Males | | | Females | | |
| rs4644832 SERF2 | G | A | 591 | 0.62 [0.36–1.05] | 0.059 | 782 | 0.51 [0.31–0.87] | 0.006 |
| | | | Smokers | | | Non-smokers | | |
| rs4644832 SERF2 | G | A | 407 | 0.83 [0.44–1.56] | 0.55 | 933 | 0.46 [0.29–0.74] | 0.0004 |
| | | | Low fruit and vegetable intake | | | Normal fruit and vegetable intake | | |
| rs4644832 SERF2 | G | A | 367 | 0.38 [0.22–0.67] | 0.0002 (0.0004) | 525 | 0.87 [0.50–1.52] | 0.63 (1.0) |
| | | | Low physical activity | | | Normal physical activity | | |
| rs4644832 SERF2 | G | A | 443 | 0.41 [0.23–0.75] | 0.001 (0.002) | 453 | 0.71 [0.43–1.20] | 0.19 (0.4) |
| | | | BMI < 25 | | | BMI ≥ 25 | | |
| rs4644832 SERF2 | G | A | 201 | 0.94 [0.49–1.80] | 0.85 (1.0) | 813 | 0.42 [0.25–0.70] | 0.0002 (0.0004) |
The analysis revealed significant associations between the polymorphic variant rs4644832 in SERF2 and severe COVID-19. The G allele exhibited a protective effect in the total sample (OR = 0.56, 95% CI 0.39–0.81, P = 0.001) (Table 1), as well as in females (OR = 0.51, 95% CI 0.31–0.87, P = 0.006), non-smokers (OR = 0.46, 95% CI 0.29–0.74, P = 0.0004), individuals with BMI ≥ 25 (OR = 0.42, 95% CI 0.25–0.70, PBonf = 0.0004), individuals with low fruit and vegetable intake (OR = 0.38, 95% CI 0.22–0.67, PBonf = 0.0004), and individuals with low physical activity (OR = 0.41, 95% CI 0.23–0.75, PBonf = 0.002) (Table 1).
Multivariable logistic regression analysis
Multivariable logistic regression adjusted for age, sex, smoking status, BMI, and vegetable intake revealed significant associations between genetic and clinical factors and severe COVID-19 (likelihood ratio test χ² = 122.5, P < 0.001) (Table S2, Fig. 2). The model explained 25% of the variance (Nagelkerke R² = 0.25) and correctly classified 86% of cases.
Carriers of the SNP rs4644832 SERF2 A/G genotype had 45% lower odds of severe COVID-19 compared to A/A homozygotes (OR = 0.55, 95% CI 0.31–0.93, P = 0.03) (Fig. 2). The G/G genotype showed a non-significant protective trend (OR = 0.43, 95% CI 0.06–1.61, P = 0.3). Each additional year of age increased the odds by 7% (OR = 1.07, 95% CI 1.05–1.10, P < 0.001). Females had 60% lower odds than males (OR = 0.40, 95% CI 0.25–0.63, P < 0.001). Regular vegetable consumption was strongly protective (OR = 0.38, 95% CI 0.25–0.58, P < 0.001), whereas smoking showed no significant association (P = 0.6). Each unit increase in BMI increased the odds by 7% (OR = 1.07, 95% CI 1.03–1.11, P = 0.001) (Fig. 2).
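The percentage statements above follow directly from the odds ratios: an OR is converted to a percent change in odds as (OR − 1) × 100, and per-unit ORs compound multiplicatively over larger increments. The short Python sketch below shows this arithmetic; the function names are illustrative, and the ten-year increment is a hypothetical example that is not reported in the study.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to a percent change in odds."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_for_increment(or_per_unit, units):
    """Compound a per-unit odds ratio over several units of a predictor."""
    return or_per_unit ** units


ag_change = percent_change_in_odds(0.55)           # about -45 (45% lower odds)
age_change = percent_change_in_odds(1.07)          # about +7 per year of age
or_ten_years = odds_ratio_for_increment(1.07, 10)  # about 1.97 (hypothetical increment)
```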
Clinical features of COVID-19 and rs4644832 SERF2
We analyzed associations between rs4644832 in SERF2 and clinical features of COVID-19 among patients. In groups stratified by sex, age, smoking status, fruit and vegetable intake, and physical activity, several significant associations were observed. In females, A/A homozygotes were associated with higher C-reactive protein (CRP) levels than heterozygotes (P = 0.02).
Notably, rs4644832 influenced clinical features in both smokers and non-smokers. Among smokers, G/G homozygotes had a higher median BMI than A/A homozygotes and heterozygotes (P = 0.015). Among non-smokers, A/A homozygotes were associated with higher CRP levels (P = 0.005) and higher leukocyte counts (P = 0.025) compared to heterozygotes.
In the group aged over 68 years, A/A homozygotes were associated with higher thrombocyte counts than heterozygotes (P = 0.01).
In the group with high fruit and vegetable intake, A/A homozygotes had higher CRP levels than heterozygotes (P = 0.0075). Furthermore, in the low physical activity group, A/A homozygotes showed higher thrombocyte (P = 0.02) and leukocyte counts (P = 0.0495) compared to heterozygotes (Fig. 3, Table S3).
Molecular correlates of rs4644832 in SERF2
Analysis of cis-eQTL-mediated expression profiles of genes in various tissues revealed several relevant associations. According to GTExPortal data, the A allele of rs4644832 in SERF2 decreases expression of ADAL, CATSPER2, CATSPER2P1, MAP1A, STRC, and STRCP1 and increases expression of SERF2, ENSG00000249839, HYPK, PDIA3, and ZSCAN29 in lungs, blood vessels, and whole blood (Fig. 4, Table 2).
Table 2. Effect of the A allele of rs4644832 in SERF2 on gene expression (cis-eQTL) in various tissues (https://gtexportal.org)
| SNP | Effect allele | Gene expressed | P-value | Effect (NES) | Tissue |
|---|---|---|---|---|---|
| rs4644832, SERF2 | A | ADAL | 0.00002 | ↓ (−0.29) | Artery - Aorta |
| | | ADAL | 2.00×10⁻⁷ | ↓ (−0.26) | Artery - Tibial |
| | | CATSPER2 | 5.30×10⁻⁷ | ↓ (−0.2) | Artery - Tibial |
| | | CATSPER2 | 3.20×10⁻⁷ | ↓ (−0.27) | Lung |
| | | CATSPER2 | 0.00052 | ↓ (−0.12) | Whole Blood |
| | | CATSPER2P1 | 0.0001 | ↓ (−0.22) | Artery - Tibial |
| | | ENSG00000249839 | 1.00×10⁻¹⁶ | ↑ (0.68) | Artery - Aorta |
| | | ENSG00000249839 | 1.90×10⁻⁸ | ↑ (0.6) | Artery - Coronary |
| | | ENSG00000249839 | 2.80×10⁻¹⁰ | ↑ (0.41) | Artery - Tibial |
| | | ENSG00000249839 | 3.40×10⁻⁸ | ↑ (0.4) | Lung |
| | | HYPK | 8.1×10⁻⁶ | ↑ (0.1) | Lung |
| | | MAP1A | 7.8×10⁻⁶ | ↓ (−0.17) | Artery - Aorta |
| | | PDIA3 | 2.70×10⁻⁷ | ↑ (0.1) | Whole Blood |
| | | SERF2 | 1.1×10⁻⁶ | ↑ (0.14) | Artery - Aorta |
| | | SERF2 | 0.00015 | ↑ (0.16) | Artery - Coronary |
| | | SERF2 | 8.60×10⁻⁷ | ↑ (0.088) | Artery - Tibial |
| | | SERF2 | 6.40×10⁻⁷ | ↑ (0.11) | Lung |
| | | SERF2 | 1.60×10⁻¹⁰ | ↑ (0.1) | Whole Blood |
| | | STRC | 1.30×10⁻⁸ | ↓ (−0.32) | Artery - Aorta |
| | | STRC | 3.90×10⁻⁷ | ↓ (−0.27) | Artery - Tibial |
| | | STRC | 2.30×10⁻¹¹ | ↓ (−0.37) | Lung |
| | | STRCP1 | 2.50×10⁻⁷ | ↓ (−0.31) | Artery - Aorta |
| | | STRCP1 | 5.00×10⁻⁸ | ↓ (−0.28) | Artery - Tibial |
| | | STRCP1 | 0.00002 | ↓ (−0.24) | Lung |
| | | ZSCAN29 | 0.00014 | ↑ (0.13) | Lung |
| | | ZSCAN29 | 0.00004 | ↑ (0.1) | Whole Blood |
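Because an eQTL effect is always reported relative to a specific effect allele, effects given for the A allele (Table 2) and for the G allele (Table 3) must be aligned before their directions are compared. The Python sketch below shows this alignment for a biallelic SNP; the function names are hypothetical, and only the direction of effect is compared because NES and Z-scores are on different scales.

```python
def flip_effect_allele(effect, effect_allele, other_allele):
    """Re-express an effect relative to the other allele of a biallelic SNP."""
    return -effect, other_allele, effect_allele


def same_direction(effect_1, allele_1, effect_2, allele_2):
    """Check whether two effects agree in sign after aligning to allele_1."""
    if allele_1 != allele_2:
        effect_2 = -effect_2
    return (effect_1 > 0) == (effect_2 > 0)


# SERF2 in whole blood: NES for the A allele (Table 2) and Z-score for the
# G allele (Table 3) point in the same direction once aligned.
serf2_for_g = flip_effect_allele(0.1, "A", "G")        # (-0.1, 'G', 'A')
serf2_concordant = same_direction(0.1, "A", -26.1112, "G")
```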
According to eQTLGen data, in whole blood the G allele of rs4644832 in SERF2 is associated with significantly decreased expression of SERF2, ZSCAN29, TUBGCP4, and PDIA3 and with increased expression of STRCP1, LCMT2, CATSPER2, MAP1A, STRC, CATSPER2P1, ADAL, and TRIM69 (Fig. 4, Table 3).
Table 3. Effect of the G allele of rs4644832 in SERF2 on gene expression (cis-eQTL) in whole blood (https://www.eqtlgen.org/cis-eqtls.html)
| SNP | Allele | Gene expressed | Z-score | P-value | FDR |
|---|---|---|---|---|---|
| rs4644832, SERF2 | G | ZSCAN29 | ↓ (−46.5249) | 3.27×10⁻³¹⁰ | <0.05 |
| | | SERF2 | ↓ (−26.1112) | 2.72×10⁻¹⁵⁰ | <0.05 |
| | | TUBGCP4 | ↓ (−23.7631) | 8.03×10⁻¹²⁵ | <0.05 |
| | | STRCP1 | ↑ (20.6369) | 1.28×10⁻⁹⁴ | <0.05 |
| | | LCMT2 | ↑ (18.2518) | 2.01×10⁻⁷⁴ | <0.05 |
| | | CATSPER2 | ↑ (11.8697) | 1.70×10⁻³² | <0.05 |
| | | PDIA3 | ↓ (−10.7944) | 3.65×10⁻²⁷ | <0.05 |
| | | MAP1A | ↑ (9.1256) | 7.13×10⁻²⁰ | <0.05 |
| | | STRC | ↑ (9.0382) | 1.59×10⁻¹⁹ | <0.05 |
| | | CATSPER2P1 | ↑ (8.2227) | 1.99×10⁻¹⁶ | <0.05 |
| | | ADAL | ↑ (6.7174) | 1.85×10⁻¹¹ | <0.05 |
| | | TRIM69 | ↑ (6.2919) | 3.14×10⁻¹⁰ | <0.05 |
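The Z-scores and P-values in Table 3 are related in the way expected for a two-sided test against the standard normal distribution. The Python sketch below illustrates this relation; it is an assumption about how the reported P-values were derived, not a description of the eQTLGen pipeline, and it does not reproduce the most extreme entry (ZSCAN29), whose exact value underflows double precision and appears to be reported at a lower bound.

```python
import math


def two_sided_p(z):
    """Two-sided P-value for a Z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


p_trim69 = two_sided_p(6.2919)    # close to the 3.14e-10 reported for TRIM69
p_serf2 = two_sided_p(-26.1112)   # close to the 2.72e-150 reported for SERF2
```

Using the complementary error function rather than 1 minus the cumulative distribution function preserves precision for very small P-values.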
According to HaploReg v4.2, rs4644832 in SERF2 lies within regions carrying regulatory histone modifications in COVID-19-related tissues, including blood vessels, lung, and blood cells. The variant is located within a DNA region associated with histone H3 that is mono-methylated at lysine 4 (H3K4me1), marking enhancers in blood cells, and tri-methylated at lysine 4 (H3K4me3), marking promoters in lungs, aorta, and blood cells. The effect of these histone marks is enhanced by acetylation of lysine 27 of histone H3 (H3K27ac), marking enhancers in lungs, aorta, and blood cells, and by acetylation of lysine 9 of histone H3 (H3K9ac), marking promoters in blood cells. Furthermore, rs4644832 in SERF2 is located within DNase I hypersensitive regions in blood samples (Table 4).
Table 4. Tissue-specific effects of rs4644832 in SERF2 on histone modifications (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php)
| Tissue | H3K4me1 | H3K4me3 | H3K27ac | H3K9ac | DNase |
|---|---|---|---|---|---|
| Lung | – | Pro | Enh | – | – |
| Vessels – aorta | – | Pro | Enh | – | – |
| Blood | Enh | Pro | Enh | Pro | DNase |
Bioinformatic resources including the Lung Disease Knowledge Portal (https://lung.hugeamp.org/ ) and Common Metabolic Diseases Knowledge Portal (https://hugeamp.org/ ) were used to analyze correlations between rs4644832 in SERF2 and phenotypic risk factors for severe COVID-19. It was found that the risk allele A of SNP rs4644832 SERF2 significantly increased the likelihood of hospitalization in COVID-19 patients. It was also associated with increased CRP levels and monocyte percentage, and decreased eosinophil count, eosinophil percentage, and insulin-like growth factor 1 levels (Table 5).
Table 5. Summary of associations between rs4644832 in SERF2 and phenotypes linked to a severe course of COVID-19
| № | SNP | Phenotype | P-value | Beta (OR) | Sample Size |
|---|---|---|---|---|---|
| 1 | rs4644832, SERF2 (A) | Hospitalized vs non-hospitalized COVID-19¹ | 0.008 | OR▲1.0641 | 52 246 |
| 2 | | Eosinophil percentage² | 2.27×10⁻¹⁶ | Beta▼−0.0033 | 1 334 180 |
| 3 | | Eosinophil count² | 2.00×10⁻¹² | Beta▼−0.0027 | 1 794 490 |
| 5 | | Insulin-like growth factor 1 (IGF-1)² | 0.00001 | Beta▼−0.0045 | 445 573 |
| 7 | | Plasma C-reactive protein² | 0.0002 | Beta▲0.0052 | 1 215 140 |
| 9 | | Monocyte percentage² | 0.03 | Beta▲0.0053 | 853 181 |
The protective G allele of rs4644832 in SERF2 creates DNA binding sites for 46 TFs involved in the following biological processes: positive regulation of CD8-positive, alpha-beta T cell differentiation (GO:0043378; false discovery rate (FDR) = 0.00301), negative regulation of CD4-positive, alpha-beta T cell differentiation (GO:0043371; FDR = 0.0399), cellular response to glucocorticoid stimulus (GO:0071385; FDR = 0.00763), nuclear receptor-mediated steroid hormone signaling pathway (GO:0030518; FDR = 0.01), and lung development (GO:0030324; FDR = 0.0144) (Table S4).
Protein-protein interactions of SERF2
Analysis of primary functional partners of SERF2 using the STRING database (protein-protein interaction (PPI) enrichment P-value: 3.92×10⁻⁵) revealed ten proteins with the most prominent interactions: actin related protein 2/3 complex subunit 2 (ARPC2), boule RNA binding protein (BOLL), huntingtin interacting protein K (HYPK), myosin light chain 6 (MYL6), ribosomal protein L23a (RPL23A), ribosomal protein lateral stalk subunit P1 (RPLP1), ribosomal protein S19 (RPS19), ribosomal protein S27 (RPS27), signal recognition particle 14 (SRP14), and translation machinery associated 7 homolog (TMA7) (Fig. 5, Table S5).
The full list of biological processes and Reactome pathways involving SERF2 and its PPI-partners is presented in Table S6. Key pathways include cytoplasmic translation (GO:0002181; FDR = 0.0082), ribosome assembly (GO:0042255; FDR = 0.0395), viral mRNA translation (HSA-192823; FDR = 0.00016), infectious disease (HSA-5663205; FDR = 0.0062), and SARS-CoV-2 modulation of host translation machinery (HSA-9754678; FDR = 0.0254), among others.
Discussion
In this study, we provide genetic evidence that rs4644832 of SERF2 is associated with severe COVID-19. We found that the G allele of rs4644832 SERF2 has a protective effect against severe COVID-19, and this effect is modified by sex, smoking status, level of fresh fruit and vegetable intake, physical activity, and BMI. We observed the protective effect of allele G rs4644832 SERF2 exclusively in females and non-smokers. This can be explained by the positive regulation of SERF2 expression by tobacco components and androgens,50–53 as previously mentioned.12 Moreover, several studies confirm that estrogen decreases SERF2 expression.54,55
Additionally, the protective effect of rs4644832 SERF2 was observed in subgroups with low fruit and vegetable intake, low physical activity, and overweight (BMI ≥ 25). We hypothesize that the protective effect of the G allele becomes apparent only in subgroups with predisposing risk factors for severe COVID-19 because, in the absence of these risk factors, the protective effects of lower BMI, normal fruit and vegetable intake, and regular physical activity predominate and mask the genetic effect.
The role of SERF2 in human pathology remains to be fully elucidated. Our research group recently discovered a link between rs4644832 SERF2 and ischemic stroke.12 In our previous study, we suggested that SERF2 influences ischemic stroke development through its ability to maintain proteome quality,12 which is also essential in the context of COVID-19.56
SERF2 plays a dual role in proteostasis. On one hand, it accelerates amyloid formation.15–17 Persistent fibrin amyloid microclots obstruct small vessels and inhibit tissue oxygenation, potentially leading to thrombotic complications and post-acute sequelae of COVID-19.57 On the other hand, SERF2 prevents TDP-43 aggregation.13 TDP-43 acts as a regulator of viral RNA expression due to its ability to interact with ribonucleoprotein complexes.19,58,59 Viral infections trigger TDP-43 aggregation; for instance, the SARS-CoV-2 main protease induces neurotoxic TDP-43 aggregation,20 and its spike S1 protein receptor-binding domain (SARS-CoV-2 S1 RBD) binds to TDP-43,60 suggesting TDP-43 involvement in the neurological symptoms of COVID-19.19
In cis-eQTL analysis, we have found that the risk allele A of rs4644832 upregulates SERF2 expression in blood vessels, lung, and whole blood. Therefore, it may be associated with increased proteotoxicity and amyloid clot formation. Furthermore, increased TDP-43 stability caused by high SERF2 expression can promote the viral life cycle.59
Some genes influenced by rs4644832 SERF2 cis-eQTL effects are linked to ubiquitination, which is essential for protein synthesis, signaling, and innate and adaptive immune responses.61 Efficient ubiquitination followed by degradation of SARS-CoV-2 spike protein ensures an antiviral response.62 Recent studies have shown that high ubiquitination levels are associated with a favorable COVID-19 prognosis due to proper immune regulation and prevention of immune damage.63 The G allele of rs4644832 SERF2 upregulates TRIM69, which encodes a human E3 ubiquitin-protein ligase critical for antiviral immunity, mediating major histocompatibility complex class I (MHC I) antigen processing and presentation.64–67 It is worth mentioning that SERF2 itself, conversely, prevents degradation of ubiquitinated proteins.68
According to GTExPortal data, the risk allele A of rs4644832 SERF2 increases PDIA3 expression via cis-eQTL effects. PDIA3 encodes a molecular chaperone that is a core component of major histocompatibility complex class I and is heavily implicated in antigen presentation to cytotoxic T lymphocytes.69–72 Overexpression of PDIA3 enhances Wnt/β-catenin signaling,73,74 which is involved in triggering the cytokine storm.75 Furthermore, recent studies confirm that PDIA3 disulfide isomerase activity participates in the viral life cycle76; its deletion significantly decreases viral load and inflammatory cytokine levels.77,78 Considering that the G allele of rs4644832 SERF2 reduces PDIA3 levels, this may explain the SNP’s protective effect against a severe course of COVID-19.
Moreover, in cis-eQTL analysis, we discovered that rs4644832 of SERF2 modulates the expression of several genes involved in maintaining cytoskeleton function, including MAP1A,79 TUBGCP4,80 and STRC.81 The cytoskeleton ensures the structural integrity of the endothelial barrier; thus, endothelial dysfunction, one of the major pathological features of severe COVID-19,27,28 is believed to be associated with cytoskeleton disorganization.82 Viruses frequently depend on microtubules at multiple stages of their life cycle. SARS-CoV-2 employs the host cytoskeleton for virion transport, cell-to-cell spread, and disruption of the immune response.83
Notably, SERF2 and its functional partners in the PPI network are associated with viral mRNA translation and modulation of the host translation machinery by SARS-CoV-2. This finding suggests that SERF2 may contribute to COVID-19 pathogenesis through its involvement in the host translation machinery.
We accessed data from the Lung Disease Knowledge Portal and the Common Metabolic Diseases Knowledge Portal to identify associations between rs4644832 SERF2 and phenotypical risk factors for severe COVID-19. It was found that the risk A allele is associated with an increased likelihood of hospitalization in COVID-19 patients. Moreover, the A allele of rs4644832 SERF2 increases certain laboratory traits typical of severe COVID-19, such as monocyte percentage and CRP levels.
Elevation in monocyte proportion can be linked to the COVID-19-associated cytokine storm. The cytokine storm in COVID-19 differs considerably from the canonical cytokine storm seen in macrophage activation syndrome in other infectious diseases.84 In COVID-19, the atypical cytokine storm is orchestrated mainly by monocytes, whereas in macrophage activation syndrome, macrophages play the predominant role.85
Our analysis of COVID-19 clinical features corresponds with data from the Common Metabolic Diseases Knowledge Portal. It shows a correlation between the protective G allele of rs4644832 SERF2 and decreased CRP levels in females and non-smokers. CRP is one of the most accurate and sensitive markers of inflammation.86,87 It is synthesized in the liver in response to pro-inflammatory cytokines, such as interleukin-6, interleukin-1β, and tumor necrosis factor-α.88 In COVID-19, CRP levels have an indisputable prognostic and diagnostic value.89
To further explore the molecular mechanisms by which rs4644832 SERF2 influences COVID-19 pathogenesis, we analyzed TFs that bind to the DNA region containing the studied SNP and their associated biological processes. These TFs are involved in positive regulation of CD8-positive, alpha-beta T-cell differentiation (GO:0043378), negative regulation of CD4-positive, alpha-beta T-cell differentiation (GO:0043371), cellular response to glucocorticoid stimulus (GO:0071385), nuclear receptor-mediated steroid hormone signaling pathway (GO:0030518), and lung development (GO:0030324).
CD8+ T-cells, or cytotoxic T-cells, are involved in eliminating cells infected by intracellular pathogens.90 CD4+ T-cells, or T-helper cells, play an important role in the adaptive immune response against intracellular pathogens.91
The efficacy of glucocorticoids in managing hyperimmune responses in COVID-19 patients is well established. They are effective immunosuppressors and are widely used to treat hypoxic respiratory failure and ARDS.92 Glucocorticoids reduce inflammation by regulating cytokine gene transcription and proinflammatory pathways.93 Activation of the cellular response to glucocorticoid stimulus and the nuclear receptor-mediated steroid hormone signaling pathway, promoted by the protective G allele of rs4644832 SERF2, may consequently enhance the efficacy of glucocorticoid treatment in severe COVID-19 cases.
Our study has several limitations. Firstly, since a COVID-19 diagnosis can be based on clinical evidence, the control group includes both PCR-confirmed and clinically confirmed cases, which may lead to misclassification. Secondly, our study is limited to the Russian population; therefore, confirming our results requires studies in diverse populations. Furthermore, we did not measure gene expression levels in the blood of SERF2 genotype carriers. Instead, we used bioinformatic tools to assess cis-eQTL effects of SNPs on gene expression. Similarly, histone modifications and TF binding were analyzed exclusively using bioinformatic resources.
Conclusions
The present study reveals the protective effect of the G allele of rs4644832 SERF2 against severe COVID-19. These results provide novel insights into the involvement of Hero proteins in the pathogenesis of viral infections. Further studies may help uncover the detailed role of SERF2 in COVID-19 pathogenesis.
Supporting information
Supplementary material for this article is available at https://doi.org/10.14218/GE.2025.00057 .
Table S1
Baseline and clinical characteristics of the studied groups.
(DOCX)
Table S2
Odds ratios from multivariable logistic regression analyzing genetic and clinical predictors of severe COVID-19 (568 controls vs. 138 COVID-19 patients).
(DOCX)
Table S3
Clinical features associated with rs4644832 SERF2.
(DOCX)
Table S4
Analysis of the impact of rs4644832 SERF2 on the binding of transcription factors with DNA (data from Gene Ontology resources http://geneontology.org/ ).
(DOCX)
Table S5
Main functional characteristics of predicted functional partners of Small EDRK-rich factor 2 (SERF2).
(DOCX)
Table S6
Functional enrichments in the SERF2 network.
(DOCX)
Declarations
Ethical statement
The study was conducted according to the guidelines of the Declaration of Helsinki (as revised in 2024) and was approved by the Ethical Review Committee of Kursk State Medical University (protocol No. 1, dated January 11, 2022). Informed consent was obtained from all individual participants included in the study.
Data sharing statement
No additional data are available.
Funding
No funding was received to assist with the preparation of this manuscript.
Conflict of interest
All authors declare no conflicts of interest.
Authors’ contributions
Study design (OB), performance of experiments, analysis and interpretation of data, statistical analysis (AD, MI, KK), manuscript writing (AD), critical revision, administrative, technical, and material support, and study supervision (OB). All authors have made significant contributions to this study and have approved the final manuscript.