Introduction
Most RNA-containing viruses have small genomes that encode only the key proteins for RNA replication, packaging, and some essential accessory functions.1,2 Expansion of the (+)RNA genome beyond a 12-kb limit is restricted by packaging constraints.3,4 The low fidelity of RNA copying by viral RNA polymerases and the action of cellular editing enzymes (e.g., deaminases) may be additional factors hampering the maintenance of large genomic RNAs.5–8 These evolutionary pressures dictate an economy of coding space, achieved by the use of overlapping open reading frames (ORFs) and non-canonical translation strategies in RNA viruses.6,9 Nevertheless, genome compaction is not the only evolutionary trend in the world of RNA viruses. Large nidoviruses (e.g., coronaviruses) have RNA genomes of 30 to 41 kb with minimal or no gene overlaps.10,11
Regardless of their genome size and propensity to use overlapping genes, many (+)RNA viruses employ translational recoding mechanisms, such as programmed ribosomal readthrough (PRT) and ribosomal frameshifting (PRF), to regulate gene expression, primarily the gene for RNA-dependent RNA polymerase (Pol).9 Both PRT and PRF direct a fraction of translating ribosomes to recode or bypass a stop codon in the zero reading frame. As a result, two proteins are produced by translation from one initiation codon: the canonical smaller protein (CSP) and a larger fusion protein (LFP). This review focuses on the occurrence and evolution of PRT and PRF signals in (+)RNA viruses, with some examples from double-stranded RNA (dsRNA) viruses and retroviruses. Because of the limited space, many aspects of these non-canonical translation mechanisms are not discussed here; the reader is encouraged to refer to excellent recent reviews on the topic.9,12–14
Programmed ribosomal readthrough of leaky stop codons
PRT determinants
An occasional read-through of mRNA stop codons is an extremely rare event in both prokaryotes and eukaryotes; however, it increases by ∼1,000 times at specific PRT signals, thus allowing the production of CSPs and LFPs at ratios of 20:1 to 10:1.15 The idea that a termination codon in a viral gene can be suppressed to yield two proteins sharing the N-terminal portion comes from the early works on bacteriophage Qbeta,16 retroviruses,17 and tobacco mosaic virus (TMV).18 In the pioneering work by Hugh Pelham, in vitro translation of TMV RNA was shown to produce the 126-kDa CSP and 180-kDa LFP, with the larger protein generated by suppression of a UAG stop codon (Fig. 1).18 In tobamoviruses, the read-through requires the type I PRT signal with UAG_CAR_YYA consensus (R, purine; Y, pyrimidine).19,20 The PRT is mediated by a specific tRNATyr having a pseudouridine in the anticodon (G-psi-A).21 A type I signal also directs the PRT in the replicase gene of the Providence tetravirus (Table 1a).9,19,20,22–27
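The type I consensus quoted above (UAG_CAR_YYA) is compact enough to be searched for directly. The following is a minimal sketch, assuming a plain RNA or cDNA string and a known ORF start; the function name `find_type_i_prt` and the toy sequence are hypothetical illustrations and are not taken from the cited studies. A sequence match alone does not prove that readthrough occurs.

```python
import re

# Type I PRT consensus from the text: UAG_CAR_YYA
# (R = purine, A or G; Y = pyrimidine, C or U).
# The lookahead keeps the reported position on the stop codon itself.
TYPE_I_PRT = re.compile(r"UAG(?=CA[AG][CU][CU]A)")


def find_type_i_prt(rna, orf_start=0):
    """Return 0-based positions of in-frame UAG codons followed by CAR_YYA.

    rna       : RNA (or cDNA) sequence; T is converted to U.
    orf_start : 0-based position of the first nucleotide of the zero frame.
    """
    rna = rna.upper().replace("T", "U")
    return [
        m.start()
        for m in TYPE_I_PRT.finditer(rna)
        if (m.start() - orf_start) % 3 == 0  # stop codon must be in frame
    ]


# Toy sequence (hypothetical): AUG CAA UAG CAA UUA CAG
print(find_type_i_prt("AUGCAAUAGCAAUUACAG"))  # [6]
```

Candidates found this way would still need experimental support, for example with a dual reporter assay.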
Table 1. Examples of known and suspected cases of translational recoding signals in RNA viruses
(a) Programmed ribosomal readthrough
Taxon | Genome | Product | Signal | Notes | Reference |
---|---|---|---|---|---|
Alphavirus | | | | | |
Sindbis virus | (+) | Replicase | PRT type II | stem-loop RSE | 20,22 |
Tobamovirus | (+) | Replicase | PRT type I | linear RNA signal | 19,20 |
Tobravirus | (+) | Replicase | PRT type II | stem-loop RSE | 20 |
Furovirus | (+) | RNA-1, replicase; RNA-2, CP-RTD | PRT type I; PRT type II | stem-loop RSE; stem-loop RSE | 9 |
Pomovirus | (+) | RNA-1, replicase; RNA-2, CP-RTD | PRT type II; PRT type I | stem-loop RSE; linear RNA signal | 27 |
Benyvirus | (+) | CP-RTD | PRT type I | linear RNA signal | 26 |
Alphatetravirus | | | | | |
Providence virus | (+) | replicase | PRT type I | linear RNA signal | 9 |
Luteovirus | (+) | CP-RTD | PRT type III | pseudoknot RSE formed by distal RNA interactions | 24 |
Tombusviridae | | | | | |
Tombusvirus | (+) | replicase | PRT type III | pseudoknot RSE formed by distal RNA interactions | 25 |
Reoviridae | | | | | |
Coltivirus | ds | VP9-RTD | PRT type II | stem-loop RSE | 9 |
Retroviridae | | | | | |
Gammaretrovirus | rt | replicase | PRT type III | pseudoknot RSE | 23 |
In the Sindbis virus, Venezuelan equine encephalitis virus, and related alphaviruses, a UGA stop codon precedes the replicase gene portion coding for RNA polymerase.9 The type II PRT signal in alphaviruses includes the UGA_C sequence20 and a downstream secondary structure element (RNA stimulatory element, RSE).20,22 Similar type II signals exist in the replicase and/or capsid protein (CP) genes of plant furoviruses (Fig. 1), tobraviruses, pecluviruses, pomoviruses, and coltivirus RNA-9 segment gene (Table 1a).9,20 Notably, the divided RNA genomes of furoviruses and pomoviruses contain two PRT signals, one in the RNA-1 (replicase gene) and the other in the RNA-2 (CP-RTD gene) (Table 1a and Fig. 1). In tobacco rattle tobravirus infection, tRNATrp with a methylated cytosine in anticodon (Cm-C-A) promotes the readthrough.28
Type III PRT signals involve a G-rich sequence adjacent to a stop codon (usually UAG) and a downstream RSE. In the murine leukemia gammaretrovirus (MuLV) gag-pol gene, the RSE represents a compact pseudoknot (Fig. 1),23 whereas in the luteovirus CP-RTD gene, the pseudoknot is formed by a stem-loop and a 3′-sequence located ∼750 nt downstream (Table 1a).24 In the tombusvirus replicase gene, the RSE pseudoknot requires even more distant interactions across ∼3,500 nt (Fig. 1).25 The UAG in MuLV gag-pol is suppressed by a glutamine tRNA.29
Biological sense of PRT in viral genes
In MuLV, a leaky UAG codon allows the synthesis of Gag and Gag-Pol polyproteins at a fixed ratio of 10:1, thus providing a means to produce more viral structural proteins than enzymes (Fig. 1). This ratio is vital for virus replication, as artificial modulation of the wildtype ratio of Gag/Gag-Pol is tolerated only to a limited extent, with downregulation of the Gag-Pol fusion being significantly more sensitive than its upregulation.30 In line with this, MuLV RT is able to bind to the translation release factor eRF1, creating a positive feedback loop that increases the synthesis of Gag-Pol.31 Hence, retrovirus PRT-driven Gag-Pol synthesis increases with time, being a unique example of viral PRT regulation.
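The 10:1 and 20:1 ratios quoted in the text can be converted into the fraction of ribosomes that must pass the leaky stop codon. The sketch below assumes the simplest possible model: every initiating ribosome yields exactly one product (CSP or LFP), and both products are equally stable. These assumptions, and the function names, are illustrative only.

```python
from fractions import Fraction


def recoding_efficiency(csp, lfp):
    """Fraction of ribosomes that must recode to give a CSP:LFP ratio of csp:lfp."""
    return Fraction(lfp, csp + lfp)


def csp_to_lfp_ratio(efficiency):
    """CSP molecules made per LFP molecule at a given recoding efficiency."""
    if not 0 < efficiency < 1:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return (1 - efficiency) / efficiency


print(recoding_efficiency(10, 1))  # 1/11, i.e. about 9% readthrough
print(recoding_efficiency(20, 1))  # 1/21, i.e. about 5% readthrough
```

Under this model, a 10:1 Gag/Gag-Pol ratio corresponds to roughly one ribosome in eleven reading through the stop codon.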
The replicase gene in the alpha-like (+)RNA virus supergroup (including alphaviruses, tobamoviruses, tobraviruses, and a number of other animal and plant viruses) encompasses the conserved domains of methyltransferase (Mtr), RNA helicase (Hel), and RNA polymerase (Pol).1 A subset of alpha-like virus replicase genes contains a leaky stop codon that drives the synthesis of Mtr-Hel and Mtr-Hel-Pol proteins at a ratio of 20:1 to 10:1 (Fig. 1). A tobamovirus UAG-to-UAC mutant producing only the 180-kDa Mtr-Hel-Pol was able to replicate in Nicotiana benthamiana and Arabidopsis thaliana but produced milder symptoms, was deficient in the anti-silencing activity attributed to the 126-kDa protein, and was prone to reversion to the wildtype.32 It was suggested that the 126-kDa protein is involved in opposing the cell defense silencing mechanism rather than in tobamovirus RNA replication, which would explain its predominance over the 180-kDa replicase.32 However, the PRT-driven expression of tobamovirus Mtr-Hel and Mtr-Hel-Pol may have an alternative explanation.
In the related alpha-like brome mosaic bromovirus (BMV), genomic RNA-1 and RNA-2 code for the replication-associated proteins 1a (Mtr-Hel) and 2a (Pol), respectively.33 BMV 1a is produced in excess of 2aPol owing to the unequal translation activities of RNA-1 and RNA-2.33 BMV 1a is a multifunctional protein that drives the remodeling of endoplasmic reticulum (ER) membranes with the help of cellular ESCRT proteins and reticulons, the creation of replication-associated spherules (membrane invaginations whose interior is lined by hundreds of 1a copies), and the delivery of 2aPol and viral RNA templates to the membranes.33–36 Spherules similar to those of bromoviruses are produced in infected cells by many (+)RNA viruses in the alpha-like supergroup (alphaviruses, tobamoviruses, and tobraviruses) and beyond (tombusviruses).33 Notably, the tobamovirus 126-kDa CSP is associated with membranes and forms a heterodimer with the 180-kDa LFP, resembling the 1a-2aPol complex in bromoviruses.32 The replicase of tombusviruses consists of the 33-kDa CSP and 92-kDa LFP, with the latter produced via PRT (Fig. 1).37 The N-termini of these proteins contain membrane-binding and dimerization domains.37 In a striking parallel with bromoviruses, the tombusvirus CSP (assisted by the ESCRT system) serves as the key organizer of spherule formation, and multiple copies of this protein cover the spherule interior.37 It is tempting to speculate that the replicative proteins produced by PRT-driven translation in alpha-like and tombus-like viruses play roles equivalent to those of the BMV 1a and 2aPol proteins. In other words, the PRT in disparate virus groups may serve to produce multiple copies of a CSP that remodels the membranes and lines the spherule interior, together with a few copies of the LFP with RNA polymerase activity.
The C-terminally extended CP versions (CP-RTD) of plant furoviruses (Fig. 1), luteoviruses, pomoviruses, benyviruses, poleroviruses, and enamoviruses are produced by PRT (Table 1a).9,20 One or a few copies of CP-RTD have been detected by immunospecific electron microscopy at one end of the rod-like particles of beet necrotic yellow vein benyvirus and potato mop-top pomovirus.26,27 A small amount of CP-RTD is associated with icosahedral luteovirus, enamovirus, and polerovirus particles.38,39 Regardless of particle morphology, the virion-incorporated CP-RTD molecules serve to enhance virus interactions with, and transmission by, the corresponding vectors: aphids (luteoviruses, enamoviruses, and poleroviruses) or the soil fungi Polymyxa betae (benyvirus) and Spongospora subterranea (pomovirus).26,27,38,39 Hence, the PRT in these virus systems produces the major CP and the minor CP-RTD at ratios providing for the assembly of viral particles competent for vector transmission.
Programmed ribosomal frameshifting
−1 PRF driven by RNA signals
Ribosomal frameshifting occurs when the translating ribosome shifts from the zero reading frame to the −1 or +1 reading frame and continues translation to produce a fusion protein. The chance of spontaneous frameshifting during mRNA translation is quite low (10⁻³ to 10⁻⁷ per codon);40 however, the PRF signals in viral mRNAs induce ribosomes to change the reading frame at frequencies of 5 to 30%. A classical −1 PRF signal that controls the ratio of Gag and Gag-Pol polyproteins was initially discovered and characterized in Rous sarcoma alpharetrovirus (Fig. 2).41,42 Similar signals have been found in several clinically important viruses, such as human immunodeficiency lentivirus-1 (HIV-1) and HIV-2, human T-cell lymphotropic deltaretrovirus types 1 and 2, SARS-CoV-1, and SARS-CoV-2, as well as in a number of other animal and plant viruses (Table 1b).9,12,41–62 The −1 PRF signals consist of a slippery site with X_XXY_YYZ consensus (where XXX denotes any three identical nucleotides, Y is A or U, and Z is A, C, or U; triplets are shown for the zero frame), a spacer of 5 to 9 nucleotides, and a downstream RSE (Fig. 2).42–48 It is postulated that, while the ribosome peptidyl and aminoacyl sites are occupied by the respective XXY and YYZ triplets, a difficult-to-unwind RSE causes a translational pause, thus promoting backward shifting of the two tRNAs in the P and A sites and decoding of the slippery sequence as XXX YYY.41–47 The RSE in different viral genes is represented by a stable stem-loop (HIV-1 and related lentiviruses, astroviruses, sobemoviruses, and dianthoviruses),49,63 a compact pseudoknot (alpharetroviruses, betaretroviruses, deltaretroviruses, and most nidoviruses),9,41,48,50 or an elaborate pseudoknot formed by RNA interactions across a ∼4 kb distance (luteoviruses) (Table 1b and Fig. 2).51
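The slippery-site consensus and the proposed tandem slippage can be illustrated with a short scan. This is a minimal sketch, assuming the X_XXY_YYZ consensus exactly as stated above and a known zero frame; it does not check the spacer or the downstream RSE, so a hit is only a candidate site. The function name `find_slippery_sites` and the toy sequence are hypothetical.

```python
import re

# X_XXY_YYZ: XXX = three identical nucleotides, YYY = AAA or UUU, Z = A, C or U.
# Wrapped in a lookahead so that overlapping candidates are all reported.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2(AAA|UUU)[ACU]))")


def find_slippery_sites(rna, orf_start=0):
    """Find heptamers whose XXY and YYZ triplets are codons of the zero frame."""
    rna = rna.upper().replace("T", "U")
    sites = []
    for m in SLIPPERY.finditer(rna):
        i = m.start()
        # The zero-frame codon XXY begins one nucleotide into the heptamer.
        if (i + 1 - orf_start) % 3 != 0:
            continue
        hept = m.group(1)
        sites.append({
            "pos": i,
            "heptamer": hept,
            "zero_frame": (hept[1:4], hept[4:7]),  # P site XXY, A site YYZ
            "minus_one": (hept[0:3], hept[3:6]),   # after slippage: XXX, YYY
        })
    return sites


# Toy ORF (hypothetical): AUG GCU UUA AAC GGG, with U_UUA_AAC in frame
for site in find_slippery_sites("AUGGCUUUAAACGGG"):
    print(site["pos"], site["zero_frame"], site["minus_one"])
# 5 ('UUA', 'AAC') ('UUU', 'AAA')
```

The `minus_one` pair shows why the consensus works: after a one-nucleotide backward shift, both tRNAs still face triplets that differ from their original codons only in the third position.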
In most known cases, −1 PRF serves to express viral reverse transcriptase or RNA polymerase LFP (Table 1b and Fig. 2),9 yet this signal is used in some other viral genes. Thus, −1 PRF in Acyrthosiphon pisum virus (a picorna-like insect virus) is employed to synthesize minor capsid proteins.64 In the Sindbis virus and the related alphaviruses, −1 PRF allows the synthesis of a minor transframe (TF) protein, which is included in virions to assist virus budding and spread in animal hosts,52 whereas in flaviviruses it serves to downregulate the expression of RNA polymerase and to produce a TF NS1’ protein influencing virus pathogenicity (Table 1b).53
−1 PRF driven by RNA-protein complexes
An idea that a protein bound to RNA may promote ribosomal frameshifting has received attention in early works and was supported by studies with a synthetic system, where the lentivirus RSE was replaced with an iron-response element from ferritin mRNA to produce a −1 frameshifting signal stimulated by an iron-regulatory protein.65,66 PRF signals stimulated by trans-acting proteins were recently found in cardioviruses and arteriviruses. In porcine reproductive and respiratory syndrome arterivirus (PRRSV), in addition to a pseudoknot-driven −1 PRF regulating the ratio of 1a and 1ab polyproteins, an extra frameshifting occurs on a slippery site GG_GUU_UUU located in the 1a gene portion coding for the nsp2 replicase subunit (Table 1b).54 The −1/−2 PRF in PRRSV requires the interaction of a C-rich sequence, located 10 nucleotides downstream of the shift site, with the complex of cellular poly(C)-binding proteins (PCBP) and the virus-encoded nonstructural protein nsp1β.55,67 Binding of the trans-activating PCBP-nsp1β complex to the C-rich tract results in ribosome stalling on the slippery sequence.55
In encephalomyocarditis virus and Theiler’s murine encephalomyelitis virus (cardioviruses), the −1 PRF occurs on a conserved G_GUU_UUU sequence located within the 2B-coding region of the polyprotein gene and requires binding of the cardiovirus 2A protein to a downstream stem-loop separated from the shift site by a 13-nt spacer (Table 1b and Fig. 2).56,57 The frameshift efficiency in cardiovirus-infected cells gradually increases from 0 to 70% as the infection proceeds, apparently because PRF strictly depends on the available amount of the trans-activating 2A protein.57 Apart from stabilizing the RSE, 2A binds to small ribosomal subunits and may interfere with host translation factors to further enhance virus frameshifting.68
+1 PRF in viral mRNAs
+1 PRF signals are mechanistically less conserved and relatively uncommon in viruses. In a few groups of (+)RNA viruses and dsRNA viruses, the RNA polymerase is likely to be expressed as an LFP by +1 (or −2) ribosomal frameshifting. Thus, in members of the Closteroviridae family of plant viruses, which belongs to the alpha-like supergroup, the replication-associated Mtr and Hel domains are encoded in a zero-frame ORF 1a, whereas Pol is encoded in a +1-frame ORF 1b (Table 1b and Fig. 2).69,58 GUU_UAG_C is a putative frameshift site in the ORF 1a of beet yellows closterovirus, and PRF may involve ribosome stalling at the stop codon in the A-site and a slippage from GUU to UUU in the P-site.9 In support of this, most of the 110 closterovirus sequences currently available in GenBank contain a consensus G/CUU_stop_C at the putative frameshift sites. However, as deduced from amino acid conservation profiles in ORFs 1a and 1b of citrus tristeza closterovirus (CTV), the predicted frameshift site in ORF 1a is located 25 triplets upstream of the stop codon and is represented by a GUU_CGG_C sequence.70 It was proposed that in CTV, ribosome pausing occurs on a rare arginine CGG codon,70 analogous to some yeast retroelements.71 These sequence comparisons imply that the determinants of the frameshifting signal are not strictly conserved in closteroviruses, although the principal mechanism of the CSP and LFP synthesis may be similar.
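The closterovirus consensus G/CUU_stop_C and the proposed P-site slippage can be sketched in the same way as for −1 PRF. The code below is an illustration under stated assumptions: it takes the consensus literally, requires the stop codon to terminate the zero frame, and models the +1 shift as a one-nucleotide forward move of the P-site tRNA. The function name `find_plus_one_sites` and the toy sequence are hypothetical, and the CTV-type site (GUU_CGG_C) would not be detected by this pattern.

```python
import re

# Consensus from the text: G/CUU_stop_C, triplets shown for the zero frame.
PLUS_ONE_SITE = re.compile(r"(?=([GC]UU)(UAA|UAG|UGA)(C))")


def find_plus_one_sites(rna, orf_start=0):
    """Find in-frame G/CUU_stop_C sites and the codons read after a +1 shift."""
    rna = rna.upper().replace("T", "U")
    hits = []
    for m in PLUS_ONE_SITE.finditer(rna):
        i = m.start()
        if (i - orf_start) % 3 != 0:  # G/CUU must be a zero-frame codon
            continue
        p_codon, stop, nxt = m.group(1), m.group(2), m.group(3)
        hits.append({
            "pos": i,
            "p_site_zero": p_codon,                   # e.g. GUU
            "stop": stop,                             # stalled A-site stop codon
            "p_site_plus_one": p_codon[1:] + stop[0], # e.g. UUU after slippage
            "a_site_plus_one": stop[1:] + nxt,        # first codon of the +1 frame
        })
    return hits


# Toy ORF (hypothetical): AUG GUU UAG C AAA
for hit in find_plus_one_sites("AUGGUUUAGCAAA"):
    print(hit["pos"], hit["p_site_plus_one"], hit["a_site_plus_one"])
# 3 UUU AGC
```

In this reading, the conserved C after the stop codon becomes the third position of the first sense codon decoded in the +1 frame.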
There are a number of dsRNA viruses and (−)RNA viruses containing +1 PRF signals in their replication-associated genes (Table 1b). In some members of the Totiviridae family of dsRNA viruses, the Gag-Pol fusion polyprotein is apparently synthesized by +1 (or −2) frameshifting. In Trichomonas vaginalis virus 1, the PRF takes place at the CC_CUU_UUU sequence adjacent to a stop codon, but no stimulating secondary structure was predicted,59 whereas in Leishmania virus 1, the PRF, in addition to a shifty site, may require a pseudoknot located downstream of a stop codon.60 The PA mRNA of the influenza A virus is translated into a conventional PA replicase component and the minor +1 frameshift product, PA-X, that modulates the host cell shut-off.61 The +1 PRF occurs on a conserved sequence UCC_UUU_CGU in the PA gene.62 Intriguingly, similar shifty sequences were found in RNA polymerase genes of disparate viruses of animals and plants, namely the chronic bee paralysis virus, fijiviruses, and amalgamaviruses, suggesting a novel +1 PRF mechanism that is conserved in diverse eukaryotes (Table 1b).62
Biological sense of PRF in viral genes
As is the case with programmed read-through, PRF allows the synthesis of protein isoforms at a fixed ratio. In most cases, this concerns reverse transcriptase and RNA polymerase precursors that are expressed as PRF fusions by the well-conserved ‘−1 simultaneous slippage’ and then undergo proteolytic maturation. Keeping the synthesis of structural and abundant nonstructural proteins relatively high compared to that of a polymerase fusion helps to maintain a balance between the proteins needed in stoichiometric and enzymatic amounts. It is possible that the same holds true for closteroviruses, which express more of the Mtr-Hel 1a polyprotein than of the Mtr-Hel-Pol 1ab fusion. It should be noted that the 1a proteins of beet yellows closterovirus and closely related closteroviruses contain a conserved hydrophobic Zemlya domain that interacts with ER membranes and may be involved in their remodeling.72 This implies a potential parallel with other alpha-like viruses that express large amounts of proteins with the Mtr, Hel, and membrane-binding domains and much smaller amounts of RNA polymerase.33–35,73
The ratio of shorter and longer polyproteins may change in time in PRF driven by RNA-protein signals such as those of arteriviruses and cardioviruses.54,56,57,67 Early in PRRSV infection, the 5′-terminal ORF1a is translated into a single 1a polyprotein whose processing results in equimolar amounts of mature proteins nsp1α to nsp8. At later stages, with the accumulation of the nsp1β, this protein complexes with PCBP and promotes −1 or −2 PRF, resulting in the predominant expression of nsp1α, nsp1β, and two nsp2 derivatives, nsp2N and nsp2TF.67 The nsp1α and nsp1β are involved in transcriptional control and innate immune evasion. The nsp2TF, representing the N-terminal two-thirds of nsp2 fused to a short trans-frame C-terminal sequence, has a membrane-binding domain and is involved in the formation of replication organelles and repression of interferon responses; mutations hampering the synthesis of the nsp2TF caused a 50- to 100-fold drop in the PRRSV replication in cell culture.67 Taken together, these data provide a good rationale for the elevated synthesis of nsp1α, nsp1β, nsp2N, and nsp2TF at late stages of arterivirus infection. Likewise, in cardioviruses, early translation produces structural and nonstructural proteins in equimolar amounts. At later stages, the accumulated 2A protein binds to a stem-loop within the 2B-coding region, which results in −1 PRF-driven overexpression of the structural proteins at the expense of RNA polymerase and other replication-associated proteins (Fig. 2).57 Conceivably, this mechanism will also lead to upregulation of the cardiovirus 2A and leader protein (L) expression needed for their success in the shut-off of cap-dependent translation of host mRNAs, inhibition of apoptosis, and interference with cell nucleocytoplasmic trafficking.74
Evolution of translational recoding mechanisms in viruses
The use of translational recoding signals by viruses has obvious advantages, as it expands the RNA coding capacity and provides a means to regulate protein expression. However, it has some inherent drawbacks that had to be addressed in the course of virus evolution. In cells, mutation or aberrant splicing creates mRNAs with premature stop codons and extended 3′-untranslated regions that are targets for nonsense-mediated decay (NMD).75 Many virus mRNAs are naturally polycistronic and contain stop codons far upstream of the 3′ end, including those associated with PRT and PRF. However, viral genomes are apparently not sensitive to NMD, at least in part because they have evolved specific RNA signals that inhibit NMD.76 Another example of a host factor that opposes translational recoding is the antiviral protein Shiftless, which specifically inhibits the −1 PRF in HIV-1 and a number of other virus and cell mRNAs.77 The question of whether viruses have mechanisms to evade the action of Shiftless remains open.
PRT and PRF signals are present in viruses and across all kingdoms of life, with −1 PRF being most common in viruses (Table 1) and PRT being predominant in eukaryotic mRNAs.9,12–14,78–81 Some eukaryotic recoding signals resemble those in RNA viruses, e.g., the −1 shifty site-spacer-pseudoknot arrays in the human paraneoplastic Ma3 gene and mouse Edr gene.82,83 It could be postulated that these recoding signals can be exchanged among viruses and cells by horizontal transfer. RNA-to-RNA transfer is quite possible, as RNA viruses are prone to recombination and may acquire non-coding sequences, gene fragments, and even entire genes from other viruses or cell mRNAs.1,58,84 RNA-to-DNA transfer of recoding signals resulting from the interplay of retroelements, retroviruses, and the cell genome also cannot be excluded. On the other hand, PRT and PRF signals might have been ‘invented’ independently on several occasions in virus gene evolution. In support of the recombination hypothesis, experimental swapping of recoding signals (i.e., replacement of a natural PRT signal with a heterologous PRT or PRF signal) may create viable retrovirus and (+)RNA virus phenotypes.85,86
The expression mechanisms in RNA viruses evolve faster than the conserved amino acid sequences of the replication-associated enzymes. This may result in quite dissimilar patterns of protein synthesis in related viruses. In the alpha-like supergroup, the RNA polymerase expression in tobamoviruses,18 tobraviruses,28 and some (but not all) alphaviruses employs PRT,9 whereas in closteroviruses it requires +1 PRF (Table 1 and Figs. 1 and 2). The same applies to totiviruses, of which some use −1 PRF and others +1 PRF to express the Gag-Pol fusion.9 Most members of the family Tombusviridae employ PRT to express RNA polymerase, with the exception of dianthoviruses, which use −1 PRF (Table 1).9 Within an evolutionarily compact virus lineage, some conserved genes may be devoid of the recoding signals that exist in their relatives. The largest known (+)RNA genome, that of planarian secretory cell nidovirus, does not have a −1 PRF signal, which is ubiquitous in other nidovirus replicase genes, nor does it have any other apparent recoding signals.11 Likewise, the alpha-like replicases of tymoviruses, potexviruses, and carlaviruses result from the orthodox translation of single uninterrupted genes. Even more strikingly, the presence of recoding signals may vary at the subspecies level, as is the case with the Semliki Forest alphavirus complex, where some strains have a leaky stop codon between the Mtr-Hel- and Pol-coding regions, whereas others have none.22,87 Hence, the programmed recoding signals may be lost in some members of evolutionarily compact RNA virus groups, and whether and how their absence is compensated remains unknown. The establishment of virus expression patterns occurred long after the appearance of extant virus lineages, and the recoding signals might have been acquired or lost on multiple occasions in evolution.
Conclusion
In the past decade, the use of bioinformatic methods and databases, dual reporter systems, ribosome profiling, and other experimental approaches has greatly expanded our knowledge of non-canonical translation mechanisms in viruses. Further studies are expected to provide new perspectives on antiviral strategies targeted at viral PRT and PRF and the use of recoding mechanisms in synthetic biology.
Abbreviations
- BMV: brome mosaic bromovirus
- CP: capsid protein
- CP-RTD: capsid protein readthrough domain
- CSP: canonical smaller protein
- CTV: citrus tristeza virus
- dsRNA: double-stranded RNA
- ER: endoplasmic reticulum
- Hel: RNA helicase domain
- HIV: human immunodeficiency lentivirus
- LFP: larger fusion protein
- Mtr: methyltransferase domain
- MuLV: murine leukemia gammaretrovirus
- NMD: nonsense-mediated decay
- ORF: open reading frame
- Pol: polymerase
- PCBP: poly(C)-binding proteins
- PRF: programmed ribosomal frameshifting
- PRRSV: porcine reproductive and respiratory syndrome arterivirus
- PRT: programmed ribosomal readthrough
- RSE: RNA stimulatory element
- RSV: Rous sarcoma virus
- RTD: readthrough domain
- TF: transframe
- TMV: tobacco mosaic virus
Declarations
Acknowledgement
The author is grateful to Professor Andrey Vartapetian for his critical reading of the manuscript.
Funding
There is nothing to declare.
Conflict of interest
There is nothing to declare.