Introduction
Antimicrobial resistance (AMR) is a global epidemic that challenges modern medicine and leading cause of high fatality rates.1,2 The World Health Organization (WHO) considered AMR as one of the world’s top 10 global public health challenges. According to the WHO, carbapenem-resistant Pseudomonas aeruginosa (PA), carbapenem-resistant Klebsiella pneumoniae (KP), and carbapenem-resistant Acinetobacter baumannii are three critical-priority pathogens that are resistant to carbapenems and most other penicillins.3 Currently, there is a concerning trend of increasing resistance to colistin, an antibiotic often considered as a last resort option in treating nosocomial infections. It is a global crisis that has remained unchanged and requires immediate attention and action.4 The mortality for of AMR reached 750,000 per year and may extend to 10 million by 2050. This alarming statistic indicates that AMR poses a greater danger than cancer and diabetes combined.5,6 The global death toll from pneumonia is estimated to be 1.575 million.7 Global data show 2.8 million antibiotic-resistant infections occur each year in the USA, resulting in over 35,000 deaths. Similarly, AMR is estimated to cause 25,000 deaths per year in the European Union alone.7,8 Antibiotics that can kill antimicrobial-resistant superbugs are rapidly running out. AMR is influenced by major three mechanisms, (1) changes in antibiotic membrane permeability, (2) enzymatic breakdown of antibacterial medicines, and (3) changes in antimicrobial target bacterial proteins.9 AMR remains a pressing issue due to the increasing occurrence of drug-resistant pathogens that possess novel resistance mechanisms, leading to the emergence of AMR.10–12 The rapid global spread of multi-resistant and pan-resistant bacteria, which cause infections that are difficult to treat with currently available antimicrobial medicines like antibiotics, is particularly concerning.13,14
KP is becoming progressively difficult to treat with last-line antibiotics. It is a major opportunistic pathogen that causes a variety of infections such as pneumonia, bacteremia, urinary tract infection, and potentially deadly septic shock. It is especially problematic in hospitals, where it causes a variety of acute infections. Since KP was first identified as a pathogen of pneumonia in 1882, it has emerged as the world’s second most important clinical pathogen, after Staphylococcus aureus. It is now considered a strain of concern and a serious risk to public health because of the emergence of multidrug-resistant strains associated with hospital outbreaks and hypervirulent strains associated with severe community-acquired infections. As a result, it is becoming extensively important to investigate the pathogenicity mechanism and the virulence control network to prevent and control KP infection.15 Similarly, PA is a gram-negative, opportunistic pathogen with a broad host range that can be composed of various ecological niches. It is a common cause of many human infectious diseases, including sepsis, acute and chronic infections of the human airways, urinary tract infections, burn infections, and keratitis. It is also responsible for a large proportion of hospital-acquired infections and is linked to nosocomial infections.16 The most difficult problem with PA is its ability to rapidly develop resistance while treating an infection.17 The WHO has classified it as a priority-one pathogen and new drugs are seriously needed due to the emergence of multidrug-resistant strains.18
The microarray technique is a powerful tool used for detecting multiple target gene expressions of different organisms and can measure the transcript abundance of 10s of 1000s of genes in biological samples at the same time. It has a lot of advantages for expression profiling, and it may generate huge data, which requires a series of analyses to make it understandable.19–21 Microarray data must be correctly analyzed to obtain the optimum result. Computational analysis is essential for processing the data from large-scale expression profiling experiments and establishing the framework for biological understanding.22,23 It also has an important role in overcoming barriers to target identification and drug detection and development.13 The majority of microarray research strives to distinguish genes and analyze connections between individual genes in signaling pathways and networks to infer the phenotype of genes.24–27 Hence, the current study focused on biological differences between different bacterial strains of the group including wild types and mutants. We aimed to perform differential gene expression analysis on three microarray datasets (GSE24688, GSE4026, and GSE117438). Also, to understand the resistance mechanism and functional enrichment of the genes associated with KA and PA. Despite belonging to different orders the two bacteria work to obtain antibiotic resistance mechanisms. Thus, comparative analysis would open new vistas for the uniqueness of these genes with their actions.
Materials and methods
Data retrieval
The raw data included several microarray data sets obtained from the Gene Expression Omnibus (commonly known as GEO) public repository. Three microarray datasets (KP, PA1, and PA2) were retrieved, GSE24688, GSE4026, and GSE117438, and the analysis was performed. Details of each dataset are shown in Table 1. Table 1 shows the expression profile of the microarray dataset (GSE24688) of KP, the total number of isolated samples is 30, with 11 different strains, including 6 wild-type samples and 24 mutants. RNA type data were collected for differentially expressed gene (DEG) analysis. The second expression profile of the microarray dataset (GSE4026) of PA has shown the total number of isolated samples was 30, with four different strains, including 10 wild-type samples and 20 mutants. RNA data were collected for DEG analysis. In the third expression profile of the microarray dataset (GSE117438) of PA, the total number of isolated samples was 6, with 2 different strains, including 3 wild-type samples and 3 mutants. RNA data were collected for DEG analysis.
Table 1Details of the microarray datasets used in the gene expression analysis
GEO ID | Experiment type | Organism name | Total samples, n | Total strains, n | Total wild-type, n | Total mutants, n |
---|
GSE24688 | Expression profiling by array | Klebsiella pneumoniae | 30 | 11 | 6 | 24 |
GSE4026 | Expression profiling by array | Pseudomonas aeruginosa | 30 | 4 | 10 | 20 |
GSE117438 | Expression profiling by array | Pseudomonas aeruginosa | 6 | 2 | 3 | 3 |
Data processing and identification of DEGs
For each microarray dataset, raw data were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/gds ) followed by a grouping of wild-type and mutant genes (R version 4.0.0) (Fig. 1). It has shown implementation of the complete preprocessing workflow for microarray analysis. The limma R package was used for background correction and quantile normalization. A nonspecific intensity filtering procedure was then applied to eliminate low probes in each data set based on the probe intensity distribution. For statistical visualization and analysis, we examine the data to determine the extent to which the data are presented. The method we used here estimates that the data are on a log2 scale, usually in the range of 0 to 16. The “exprs” function can take an expression value as a data frame; with one column per sample and one row per gene. The filtering procedure is described in detail in the limma user guide (https://bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf ).
Data were normalized followed by clustering and correlation analysis to determine the linear relationship between samples. An unsupervised analysis is a good way to determine where the data variation comes from. It can also detect samples that are likely to be outliers. The “cor” function can calculate correlations (on a scale of 0–1) in pairs among all samples. This can then be visualized with a heat map, which is one of the most popular libraries for creating the heatmap in R. The correlation coefficient is determined using correlation analysis and is essential for evaluating how much one sample changes when the other does. We used hierarchical clustering on the microarray expression data for each strain to identify major similarities and differences in the transcriptional response to our panel of strains expressed as fragments per kilobase millions.
Principal component analysis (PCA) was performed. The major purpose of PCA is to visualize the sample data and classify the samples into two groups. As PCA is an unsupervised method, it does not consider known sample groups. However, we can add labels as we draw the result. The “ggplot2” package was very useful for this. The “ggrepel” package can be used to position text labels more smartly so that they are readable. The volcano plot function is a popular method to visualize the results of a DE analysis. The log-fold change is represented by the X-axis, and the Y-axis is a measure of statistical significance, which was. the logarithmic probability or B statistic. The distinctive shape of the volcano must be seen.
Functional enrichment analysis
Following DEG analysis, to understand the biological process, molecular function, and cellular component, we have used comparative gene ontology (GO) (https://www.comparativego.com/ ) here. The molecular pathways and functional analysis of each data set were performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Mapper (https://www.genome.jp/kegg/tool/map_pathway1.html ). To perform detailed gene list analysis and graphical visualization of enrichment pathways, we used ShinyGO (http://bioinformatics.sdstate.edu/go/ ) with the following configuration: search for KP and PA species, false discovery rate threshold p < 0.05, number of most significant terms to show 50.28,29
Results
We used different microarray datasets containing 66 samples of both organisms that were retrieved from the GEO database (GSE24688,30 GSE4926,30 and GSE1174386), which led to interesting findings regarding the DEGs in the two groups (Fig. 1).
Differential gene expression
We obtained three boxplots after normalizing all three datasets of KP and PA (number of samples: 30, 30, and 6). Previously, the microarray dataset expression values exceeded the log2 scale (>16). The expression value fell between the log2 scale (Fig. 2a), making it suitable for further comparison and statistical analysis after normalization.
Sample correlation and hierarchical clustering
The correlation heatmaps (Fig. 2b) show the three datasets of the KA and PA samples. The dendrograms in the heatmaps show closely related samples. The high correlation of the sample is shown by the red color and the low correlation is shown by the blue color. The diagonal red boxes are from identical sample IDs, i.e., highly correlated with values = 1. Despite diagonal boxes, some other boxes are representing the correlation between highly correlated samples. In these heatmaps, the clustering was done based on correlated samples (strains and groups). The mutant group is shown in pink and the wild-type group is shown in sky blue.
Volcano plot of DEGs
The sky blue color shows highly significant genes whereas the red color shows weakly significant genes. Significant p-values were <0.5 and the Fc cutoff was 1. Of 3,157 KP genes, we found 1,140 DEGs, of 1,556 PA1 genes we found 94 DEGs, and of 1,556 PA2 genes we found 641 DEGs (sky blue). The second heatmap shows the top 10 DEGs in each dataset (Fig. 2e).
Gene comparison
In this study, the retrieved KP and PA datasets comprised 2,366 and 1,066 total genes. The study observed 569 (19.9%) of the total overlapping genes after comparison of the three datasets of two organisms. As the organisms belong to two different orders, we checked whether they shared significant genes. We had 1,140 significant genes out of 3,157 KP genes, 94 significant genes out of 1,556 PA1 genes, and 641 significant genes out of 1,556 PA2 genes. After comparing all three groups of significant genes (Fig. 2g), we found 10 common significant genes (atpF, fur, ndk, nuoK, ppk, pyrH, rmlC, rnk, rph, and ubiE). Afterward, the study was designed to search the MicroBIGG-E database for drug-resistance genes in both KP and PA and found 746 antibiotic-resistance genes in KP and 684 drug-resistance genes in PA.30,31 We compared common genes and found 246 (20.8%) drug resistance genes in both KP and PA, and compared the common KP and PA genes to all of the genes (both significant and nonsignificant) from the three datasets (Fig. 2h), revealing five common antibiotic resistance genes (arsB, arsC, arsR, PhoP, and Pho). Our study identified three ars genes (arsB, arsC, and arsR). The ars operon is the most common resistance mechanism seen in Gram-positive and Gram-negative bacteria, and it can be chromosomally or plasmid-encoded. The PhoP/PhoQ (PhoPQ) found in our study follows the two-component system (TCS) for drug resistance. The PhoPQ TCS system is highly conserved among gram-negative bacteria, whether pathogenic or nonpathogenic (Fig. 3).
GO and pathway enrichment analysis
We used comparative GO as the primary tool for GO, using a controlled vocabulary to identify genes in terms of molecular functions, biological processes, and identified cellular components. To gain insight into the biological roles of the DEGs, we performed GO enrichment on the top 10 highly expressed significant genes in each GEO dataset (Fig. 2f, Tables 2–4), common 10 significant genes of three datasets (Figs. 2g, 4, and Table 5) as well as the top five drug resistance genes in each dataset (Figs. 2h, 5, and Table 6).
Table 2GO of the top 10 highly expressed significant genes of Klebsiella pneumonia
Gene | Molecular function | Biological process | Cellular component |
---|
fadH | 2,4-dienoyl-CoA reductase Nicotinamide Adenine Dinucleotide Phosphate Hydrogen (NADPH) activity, Flavin mononucleotide (FMN) binding | – | – |
chaB | – | – | – |
miaE | tRNA-(2-methylthio-N-6-(cis-hydroxy) isopentenyl adenosine)-hydroxylase activity | – | – |
argR | DNA binding, DNA-binding transcription factor activity, arginine binding | Arginine biosynthetic process, protein complex oligomerization | Cytoplasm |
moaB | – | Mo-molybdopterin cofactor biosynthetic process | |
moaD | Molybdopterin synthase activity | – | – |
ybiV | hydrolase activity | – | – |
traL | – | Conjugation, pilus assembly | Cell outer membrane, an integral component of the membrane |
yacC | – | – | – |
yaaA | – | – | – |
Table 3GO of the top 10 highly expressed significant genes of PA1
Gene | Molecular function | Biological process | Cellular component |
---|
pmrA | DNA binding | Phosphorelay signal transduction system, regulation of transcription, DNA-templated | – |
pilH | – | phosphorelay signal transduction system | – |
pilG | – | phosphorelay signal transduction system | – |
rmlC | dTDP-4-dehydrorhamnose 3,5-epimerase activity | dTDP-rhamnose biosynthetic process | – |
cupA1 | – | Cell adhesion | Pilus |
pmrB | Phosphorelay sensor kinase activity, ATP binding | – | An integral component of the membrane |
pilU | ATP binding, endodeoxyribonuclease activity, producing 5′-phosphomonoesters | Type IV pilus-dependent motility | – |
upp | Magnesium ion binding, uracil phosphoribosyltransferase activity, GTP binding | Uracil salvage, nucleoside metabolic process, UMP salvage | – |
pilZ | Cyclic-di-GMP binding | – | – |
orn | 3′-5′-exoribonuclease activity, nucleic acid binding | – | Cytoplasm |
Table 4GO of the top 10 highly expressed significant genes of PA1
Genes | Molecular function | Biological process | Cellular component |
---|
mmsB | 3-hydroxyisobutyrate dehydrogenase activity, oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor, NADP binding, NAD binding | Valine catabolic process | – |
catA | ferric iron binding, catechol 1,2-dioxygenase activity | Catechol-containing compound catabolic process | – |
putA | DNA binding, DNA-binding transcription factor activity, 1-pyrroline-5-carboxylate dehydrogenase activity, proline dehydrogenase activity, oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor | Proline metabolic process, proline biosynthetic process, proline catabolic process to glutamate | Cytoplasmic side of plasma membrane |
antA | Iron ion binding, dioxygenase activity, 2 iron, 2 sulfur cluster binding | Cellular metabolic process | – |
catC | Muconolactone delta-isomerase activity | Beta-ketoadipate pathway | – |
coxA | Cytochrome-c oxidase activity, heme binding, metal ion binding | Oxidative phosphorylation, aerobic respiration, electron transport coupled proton transport, respiratory electron transport chain | The plasma membrane, an integral component of the membrane |
gcvT2 | Aminomethyltransferase activity, transaminase activity | Glycine catabolic process | Cytosol |
pra | – | – | – |
pcaC | Peroxiredoxin activity | – | – |
coIII | Cytochrome-c oxidase activity, electron transfer activity, oxidoreduction-driven active transmembrane transporter activity | Aerobic respiration, aerobic electron transport chain | The plasma membrane, an integral component of the membrane |
Table 5GO enrichment on the common 10 genes of KP and PA
Gene | Molecular function | Biological process | Cellular component |
---|
pyrH | ATP binding, UMP kinase activity | De novo CTP biosynthetic process | Cytoplasm |
Fur | DNA binding, DNA-binding transcription factor activity, metal ion binding | – | Cytoplasm |
ubiE | 2-octaprenyl-6-methoxy-1,4-benzoquinone methylase activity, demethylmenaquinone methyltransferase activity, S-adenosylmethionine:2-demethylquinol-8 methyltransferase activity, S-adenosylmethionine:2-demethylmenaquinol methyltransferase activity, S-adenosylmethionine:2-demethylmenaquinol-7 methyltransferase activity | Ubiquinone biosynthetic process, aerobic respiration, menaquinone biosynthetic process, methylation | – |
Ndk | Nucleoside diphosphate kinase activity, ATP binding, metal ion binding | GTP biosynthetic process, UTP biosynthetic process, CTP biosynthetic process | Cytoplasm |
nuok | Quinone binding, NADH dehydrogenase (quinone) activity | ATP synthesis coupled electron transport | The plasma membrane, an integral component of the membrane |
Rnk | DNA binding, kinase activity, RNA polymerase binding | Regulation of DNA-templated transcription, elongation | – |
Rph | tRNA binding, 3′-5′-exoribonuclease activity, tRNA nucleotidyltransferase activity | rRNA processing, tRNA processing, rRNA catabolic process | – |
atpF | Hydrolase activity, proton-transporting ATP synthase activity, rotational mechanism | ATP synthesis coupled with proton transport | The plasma membrane, an integral component of the membrane, proton-transporting ATP synthase complex, coupling factor F(o) |
rmlC | dTDP-4-dehydrorhamnose 3,5-epimerase activity | dTDP-rhamnose biosynthetic process | – |
ppk | ATP binding, polyphosphate kinase activity, metal ion binding | polyphosphate biosynthetic process | Polyphosphate kinase complex |
Table 6GO enrichment of the five common genes of KP and PA
Gene | Molecular function | Biological process | Cellular component |
---|
arsB | Arsenite secondary active transmembrane transporter activity, antimonite secondary active transmembrane transporter activity | Arsenite transport, response to arsenic-containing substance | An integral component of the plasma membrane, an integral component of the membrane |
arsC | – | – | – |
arsR | Transcription regulatory region sequence-specific DNA binding, DNA-binding transcription factor activity | Regulation of transcription, DNA-templated | – |
phoP | Phosphorelay sensor kinase activity, phosphorelay response regulator activity, transcription regulatory region sequence-specific DNA binding, DNA-binding transcription activator activity, DNA-binding transcription factor activity | Positive regulation of phospholipid biosynthetic process | Cytosol, protein-DNA complex |
phoQ | ATP binding | Phosphorelay signal transduction system | The plasma membrane, an integral component of the membrane |
More specifically, we found molecular functions, biological processes, and cellular components associated with these key genes. Surprisingly, among the top 10 most abundantly expressed significant KP genes, the argR gene is involved in all three major aspects, including molecular functions, DNA binding, DNA-binding transcription factor activity, and arginine binding; biological processes, arginine biosynthetic process and protein complex oligomerization; and a cellular component, cytoplasm. In PA1, none of the genes were enriched in all of the three predefined functional characteristics. In PA2, putA, coxA, and coIII genes are involved in all three major attributes of GO. Likewise, in common 10 significant genes and common five drug resistance genes pyrH, nuoK, atpF, ppk, arsB, phoP, and phoQ are involved in the three major areas of GO. All the gene information was retrieved from PATRIC and MicrobesOnline databases (Tables 7 and 8).32
Table 7Details of 10 common genes
Gene name | Organism | Type | GC content | Position |
---|
pyrH | KP | Protein, 241 a.a | 55.37% | 217570.. 218295 (+) on NC_009648 genomic chromosome 5 |
pyrH | PA | Protein, 245 a.a. | 63.55% | 4092231.. 4092968 (−) on NC_002516 genomic chromosome 1 |
Fur | KP | Protein, 150 a.a | 52.32% | 792692.. 793144 (−) on NC_009648 genomic chromosome 5 |
Fur | PA | Protein, 134 a.a. | 60.25% | 5351675.. 5352079 (−) on NC_002516 genomic chromosome 1 |
ubiE | KP | Protein, 251 a.a. | 53.31% | 4738750.. 4739505 (+) on NC_009648 genomic chromosome 5 |
ubiE | PA | Protein, 256 a.a. | 64.46% | 5702669.. 5703439 (+) on NC_002516 genomic chromosome 1 |
Ndk | KP | Protein, 143 a.a. | 56.25% | 3126833.. 3127264 (−) on NC_009648 genomic chromosome 5 |
Ndk | PA | Protein, 143 a.a. | 61.11% | 4266470.. 4266901 (−) on NC_002516 genomic chromosome 1 |
nuoK | KP | Protein, 100 a.a. | 56.44% | 2939557.. 2939859 (−) on NC_009648 genomic chromosome 5 |
nuoK | PA | Protein, 102 a.a. | 64.40% | 2992548.. 2992856 (+) on NC_002516 genomic chromosome 1 |
Rnk | KP | Protein, 140 a.a. | 61.23% | 741607.. 742029 (−) on NC_009648 genomic chromosome 5 |
Rnk | PA | Protein, 134 a.a. | 68.15% | 5940020.. 5940424 (−) on NC_002516 genomic chromosome 1 |
Rph | KP | Protein, 238 a.a. | 59.55% | 4365064.. 4365780 (−) on NC_009648 genomic chromosome 5 |
Rph | PA | Protein, 239 a.a. | 67.50% | 6003377.. 6004096 (−) on NC_002516 genomic chromosome 1 |
atpF | KP | Protein, 154 a.a. | 51.40% | 4537292.. 4537756 (−) on NC_009648 genomic chromosome 5 |
atpF | PA | Protein, 156 a.a. | 60.08% | 6252707.. 6253177 (−) on NC_002516 genomic chromosome 1 |
rmlC | KP | Protein, 184 a.a. | 55.86% | 2715683.. 2716237 (−) on NC_009648 genomic chromosome 5 |
rmlC | PA | Protein, 181 a.a. | 64.29% | 5813122.. 5813667 (+) on NC_002516 genomic chromosome 1 |
Ppk | KP | Protein, 686 a.a. | 54.00% | 3097224.. 3099284 (+) on NC_009648 genomic chromosome 5 |
Ppk | PA | Protein, 736 a.a. | 63.86% | 5902386.. 5904596 (−) on NC_002516 genomic chromosome 1 |
Table 8Details of five common drug resistance genes
Gene | Organism | Type | GC content | Position | Resistance mechanism |
---|
arsB | KP | Protein, 430 a.a | 60.79% | 3625438.. 3626730 (−) on NC_009648 genomic chromosome 5 | Efflux pump |
arsB | PA | Protein, 427 a.a. | 69.70% | 2506978.. 2508261 (+) on NC_002516 genomic chromosome 1 | Efflux pump |
arsC | KP | Protein, 140 a.a. | 60.76% | 3625006.. 3625428 (−) on NC_009648 genomic chromosome 5 | Efflux pump |
arsC | PA | Protein, 156 a.a. | 67.94% | 2508293.. 2508763 (+) on NC_002516 genomic chromosome 1 | Efflux pump |
arsR | KP | Protein, 109 a.a. | 60.91% | 3626782.. 3627111 (−) on NC_009648 genomic chromosome 5 | Efflux pump |
arsR | PA | Protein, 116 a.a. | 69.23% | 2506614.. 2506964 (+) on NC_002516 genomic chromosome 1 | Efflux pump |
phoP | KP | Protein, 223 a.a. | 59.52% | 1292135.. 1292806 (−) on NC_009648 genomic chromosome 5 | Efflux pump, gene-altering cell wall |
phoP | PA | Protein, 225 a.a. | 66.22% | 1277688.. 1278365 (+) on NC_002516 genomic chromosome 1 | Efflux pump, cell wall alteration |
phoQ | KP | Protein, 488 a.a. | 58.62% | 1290669.. 1292135 (−) on NC_009648 genomic chromosome 5 | Drug target, two-component system, virulence, regulation of gene expression |
phoQ | PA | Protein, 448 a.a. | 68.37% | 1278362.. 1279708 (+) on NC_002516 genomic chromosome 1 | Efflux pump, gene-altering cell wall |
Molecular pathways and functional analysis were performed with KEGG Mapper. It is a set of KEGG mapping tools available on the KEGG website (https://www.kegg.jp/ or https://www.genome.jp/kegg/ ) along with the KOALA family of automatic mapping tools KO identifiers (KEGG orthology) available, which were used during mapping. The enrichment results were obtained by analyzing significant genes in each of the three data sets using KEGG Mapper (Supplemental sheets 1, 2, and 3).33 ShinyGO was used to create a pathway visualization of the common 10 significant genes.
Discussion
Our research focused on determining the expression levels of AMR genes KP and PA to reveal their function and the gene interactions with the potential to enable the bacteria to survive the effects of antibiotics designed to kill them.34 A previous report suggests that the Pseudomonas aeruginosa strain AG1 demonstrated remarkable effectiveness with a hybrid assembly, using both short and long reads. This approach provided valuable insights into the genomic architecture and molecular factors associated with multiresistance and virulence determinants.35 A multiomic approach using the genomic and transcriptomic elements related to antibiotic resistance genes was also studied before in the versatile environmental organism PA (AG1).36 We collected microarray data from the GEO database followed by we detected DEGs based on wild-type and mutant groups from KP and PA samples.37 In this study, the underlying possible cellular processes and metabolic pathways in both of the organisms were analyzed.
Data normalization
Normalization is an essential step for microarray data analysis to eliminate systematic variation that affects the measured gene expression levels.38 Hierarchical clustering was used to group the strain samples, and the dendrogram is displayed at the top of a heatmap that shows the Pearson correlation between each pair of samples.39 Sample identity was shown by two-way color indexing the group and strain. The results demonstrated that KP and PA samples from different groups and strains could be successfully distinguished. Similar findings were reported in the identification of core genes of PA against multiple perturbations by Molina-Mora et al.40 The study also discussed various methodological issues related to data analysis, normalization, and classification algorithms. PCA exploratory data analysis results are graphically represented in two-dimensional dot plots of scores of principal components. The total sample was divided into two groups, mutant (red) and wild-type (sky blue) based on different strains of bacterial like the work of Lee et al.41
DEG comparison
Volcano plots are ideal for examining the levels of expression of a large number of genes (logFC Vs log-odd). To compare all the genes in the data sets as well as the important AMR genes, Venny online software was used to draw Venn diagrams. MicrobesOnline was used for in-depth analysis of key genes. It provides a community resource for genome comparison and functional analysis like the previous report in PA strain AG1, where 518 DEGs over time, including 14 hub genes, multiple gene clusters, and 15 enriched pathways were identified.42 More than 1,000 complete genomes of bacteria, fungi, and archaea, as well as thousands of expression microarrays from various organisms ranging from model organisms, are available through this portal like the previous study.43,44 Consequently, with the widespread presence of arsenic in the environment, certain bacteria have evolved arsenic defense mechanisms to digest arsenic. Interestingly we found Ars genes (arsenate reduction and methylation), Aio genes (arsenate oxidase), and Arr genes (arsenate respiration) are the most common genes involved in arsenic metabolism pathways.45 Five (Ars RDABC) or three (ArsRBC) genes are found in the two most prevalent kinds of these operons produce three proteins, (1) an arsenate permease (arsB), (2) an arsenate reductase (arsC), and (3) a regulatory protein (arsR), which transforms arsenate to arsenite.46,47 The phosphate transporter (Pit system) transports arsenate (As V) into the cell, which is then reduced to arsenite (As III) by a cytoplasmic arsenate reductase enzyme encoded by the Ars C gene. The As (III)-specific transmembrane protein arsB, produced by the Ars B gene, then effluxes this (As III) from cells via the chemiosmotic gradient.48,49
The sensor kinase PhoQ and the response regulator PhoP make up the PhoPQ TCS.50 PhoQ can autophosphorylate in response to a variety of environmental signals, including low Mg2+ and Ca2+ levels, acidic pH, the presence of cationic antimicrobial peptides, and osmotic upshift. PhoP receives the phosphate from phosphorylated PhoQ, and activated PhoP regulates the expression of downstream genes (PhoP regulon). Despite the fact that PhoQs in different microorganisms can detect the same or similar environmental signals, the regulons activated by phosphorylated PhoP vary between bacteria species.51 PhoPQ systems of human pathogens are involved in polymyxin resistance, low-magnesium adaptation, virulence, and acid tolerance. Due to the extensive coverage of the PhoP regulon, the loss of PhoPQ function has a pleiotropic effect on a bacterium.52 The research highlights the difference between overlapping drug-resistant genes and overlapping significant genes. These dissimilarities are likely owing to the evolution of genes from a common ancestor with similar functions, which have undergone mutations and changes in their sequences and functions over time.53,54
GO and pathway enrichment analysis
Graphs and integrated them into graph reports are very important for visualizing and comparing GO groups in several samples, including those from bacterial pathogens, synchronously from different sources.55,56 The gene information was retrieved from the PATRIC and MicrobesOnline databases (Tables 7 and 8).32 It is a Shiny application built with many R/Bioconductor programs and a large annotation and pathway database extracted from a variety of sources.28,29 It is intended for in-depth gene list analysis, with graphical enrichment visualization and pathway analysis (p-value < 0.05). The hierarchical clustering tree summarizes the relationship between the significant pathways listed in the enrichment tab. Pathways with a high number of shared genes are grouped and p-values with larger dots are more significant. The interactive plot, like the tree tab, represents the relationship between enriched pathways. If two pathways (nodes) share 20% (default) or more genes, they are interconnected. In our study, the gene sets are significantly enriched in darker nodes. Larger nodes correspond to larger gene sets. More overlapping genes are represented by thicker edges (Fig. 6a, b).29
In clinical application, we found that arsenic resistance genes (arsB, arsC, and arsR) in multidrug-resistant KP and PA can be valuable for researchers in several ways to develop therapeutic use. Salam et al.57 in 2020 studied arsenic resistance genes provided insights into cross-resistance mechanisms between arsenic and antibiotics. The investigators can potentially discover novel drug targets or pathways, offering promising therapeutic avenues to combat multidrug resistance effectively.57,58 Linking to our findings the arsenic resistance genes can be used in target identification for drug development. Those structural similarities gene with efflux pump protein can be taken for new drug development. Inhibiting these genes or proteins could potentially restore sensitivity to antibiotics in multidrug-resistant strains.59 Recently, Venturini et al.60 focused on these genes and suggested clinicians can use them as diagnostic markers to identify multidrug-resistant strains of KP and PA, enabling informed decisions on selecting appropriate antibiotic treatments for better patient outcomes. In personalized treatment strategies, some of the MDR genes also contribute to allowing tailored treatment based on the genetic characteristics of the infecting strain, thereby optimizing patient outcomes. The two-component system genes, phoP, and phoQ, have a crucial role in the multidrug resistance of KP and PA, modulating cellular processes essential for antibiotic resistance. Their detection as diagnostic markers can help identify multidrug-resistant strains, guiding informed treatment decisions. Additionally, targeting these genes presents a promising approach for drug development, offering potential new therapies, while combination strategies may enhance the efficacy of existing antibiotics against these challenging pathogens, ultimately combating multidrug resistance and improving patient outcomes.61 This study explored the genetic basis of drug resistance in KP and PA, two dangerous superbugs responsible for life-threatening infections, including those acquired in hospitals and during the COVID-19 pandemic. The findings offer promising prospects for diagnostics, treatment, and combating antibiotic resistance, aiming to enhance intensive care unit (ICU) patient outcomes and global public health. This ground-breaking study delves into the intricate genetic mechanisms underlying drug resistance in KP and PA, formidable pathogens linked to severe hospital-acquired infections and COVID-19 vulnerability. By analyzing extensive RNA microarray data from multiple datasets, the research reveals 10 shared significant genes and five drug-resistance genes, holding potential for future diagnostics and treatment strategies. These valuable insights aid in combating AMR, fortifying ICU patient care, and bolstering global efforts against life-threatening infectious diseases caused by these two superbugs.
Conclusion
We observed interesting phenomena of nonsimilarity between common significant genes and common drug resistance genes that led to conclusions about genetic divergence due to which its physical properties have been changed but chemically involved in similar drug resistance mechanisms. Our study used a robust integrative bioinformatics analysis of microarray data to investigate the DEG of AMR superbugs. Through a comparative analysis supported by functional enrichment results, diverse pathways and distinct biological processes, molecular function, and cellular components were highlighted. Interestingly, we observed 10 common genes in all three datasets that are highly expressed and significant, five genes that are present in both KP and PA and are involved in drug resistance in both superbugs. Our findings provide a detailed account of the drug resistance gene’s divergent evolution, which could open new vistas for MDR prognosis and can be informative in antimicrobial therapeutics research. The candidate gene approach can be used to validate AMR to know the potential interesting association patterns with resistance mechanisms.
Abbreviations
- a.a.:
amino acid
- AMR:
antimicrobial resistance
- ATP:
adenosine triphosphate
- CH-OH:
Methanol
- CTP:
Cytidine triphosphate
- DEG:
differentially expressed gene
- DNA-T:
DNA- templated
- dTDP:
Deoxythymidine diphosphate
- FMN:
flavin mononucleotide
- GC:
guanine-cytosine
- GEO:
gene expression omnibus
- GMP:
guanosine monophosphate
- GO:
gene ontology
- GTP:
guanosine-5′-triphosphate
- ICM:
integral component of membrane
- ICU:
intensive care unit
- KEGG:
Kyoto Encyclopedia of Genes and Genomes
- KP:
Klebsiella pneumonia
- logFC:
log fold change
- NAD:
Nicotinamide adenine dinucleotide
- NADPH:
Nicotinamide Adenine Dinucleotide Phosphate Hydrogen
- PA:
Pseudomonas aeruginosa
- PAPB:
Positive regulation of phospholipid biosynthesis
- PATRIC:
Pathosystems Resource Integration Center
- PC:
Principal Component
- PCA:
principal component analysis
- PK:
polyphosphate kinase
- PM:
Plasma Membrane
- PTS:
phosphorelay transduction system
- RACS:
response to arsenic-containing substance
- RG:
Resistance Gene
- RG:
resistance gene
- RT:
regulation of transcription
- TCS:
two-component system
- UMP:
Uridine Monophosphate
- UTP:
Uridine-5′-triphosphate
- WHO:
World Health Organization
Declarations
Acknowledgement
We are thankful to the Department of Biotechnology and Department of Computer Application, Siksha O Anusandhan University and Genetic and Genomic Laboratory, SoAS, Centurion University of Technology and Management for providing all the facilities and laboratory assets.
Data sharing statement
Supplementary data to this article can be found online at https://docs.google.com/spreadsheets/d/16rGSB2n1EOa_gNFPjrhe3zI5gpbAER40oCAkc-c-AIg/edit#gid=0
Funding
There is nothing to declare.
Conflict of interest
The authors declare that the research was conducted without any potential conflict of interest.
Authors’ contributions
Study concept and design (SS, SPR, JNM, SS, and TS), database organization (SS, SPR, and JNM), Bioinformatics analysis of the data (SS and SPR), drafting of the manuscript (SS, SPR, JNM, SS, and TS). JNM with JD and all authors revised the manuscript critically and approved the version to be published.