Introduction
Visualization of the spatiotemporal organization of the genome in the nucleus is important to advance our understanding of the complex interplay between chromatin structure, movement, and function in normal as well as in diseased states.1 Conventionally, the in situ hybridization assay is performed to label the target DNA sequence with a reporter probe via complementary base-pairing, and this technique is generally regarded as the gold standard for visualizing and localizing specific genomic loci in fixed cells.2 However, the complicated and lengthy procedure of the in situ hybridization assay has hindered the adoption of this technique in routine clinical practices. Typical in-situ hybridization assays, such as fluorescent in situ hybridization (FISH) and chromogenic in situ hybridization, involve probe preparation, sample denaturation, and overnight hybridization before microscopy imaging is conducted. Furthermore, the incompatibility of in situ hybridization with living cells precludes its use for live cell imaging.3
In view of the limitations of in situ hybridization, the system of clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated proteins (Cas) has emerged as a potential solution for the development of a versatile live cell imaging tool. The CRISPR/Cas system is an adaptive defense mechanism in prokaryotes that allows them to counter viral invasion by recognizing and degrading specific foreign nucleic acids.4,5 Presently, the CRISPR/Cas9 system derived from Streptococcus pyogenes is the most widely used genome editing tool, and over the years, various CRISPR/Cas systems were also developed for diagnostic and therapeutic purposes.6–9 The CRISPR/Cas9 system essentially comprises the Cas9 endonuclease and a single-guide RNA (sgRNA). The CRISPR/Cas9 system is highly programmable because the specific binding and introduction of a double-strand break by the Cas9 endonuclease are dictated by the sgRNA. The sgRNA can be customized to target any genomic sequence with a protospacer adjacent motif (PAM) located approximately 20 nucleotides to the sgRNA-complementary target sequence.4
An endonuclease-deficient Cas9, also known as dead Cas9 (dCas9), has been engineered to retain its ability to bind to the target DNA sequence without cleaving it. Specifically, the RuvC and Histidine-Asparagine-Histidine (HNH) catalytic domains of Cas9 were mutated to generate dCas9 that is devoid of nuclease activity,4,10 and subsequent fusion of the dCas9 with a fluorescent protein (FP) allows specific genomic sequences to be visualized in living cells.3 Since its inception, various strategies have been developed to improve the applicability of dCas9 for dynamic visualization of genomic elements in cells. In this review, we present advances in CRISPR/Cas9 strategies for bioimaging, including the labeling of dCas9 and engineering of sgRNA, as well as other strategies aimed at improving sensitivity, specificity, and signal-to-noise ratio (SNR), and at enabling multi-loci visualization. An overview of CRISPR-Cas9-mediated imaging of genomic loci in cells is presented in Figure 1, and the characteristics of the various strategies developed are summarized in Table 1.3,11,12–41
Table 1. Characteristics of various strategies developed for CRISPR-Cas9-mediated imaging of genomic loci in cells
Type of Cas9 | Cas9 system | Target cell | Target sequences | Features | Live/ fixed cells | Ref. |
---|---|---|---|---|---|---|
Labeling of dCas9 protein |
dSpCas9 | dSpCas9-EGFP | Human cells | Repetitive elements in telomeres & MUC4 gene, nonrepetitive sequence in MUC4 gene | First demonstration of Cas9 as an imaging tool and able to target repetitive and nonrepetitive loci | Live | 3 |
dhCas9 | dhCas9-EGFP | Murine cells | Repetitive elements in telomeres, minor & major satellite regions | First demonstration of repetitive loci imaging in animal cells | Live & fixed | 12 |
dSpCas9 | dSpCas9-3×GFP | Plant cells | Repetitive elements in telomeres | Improves localization of dCas9-FP in plant cells by modifying sgRNA | Live | 11 |
dSpCas9, dNmCas9, dSt1Cas9 | dSpCas9-RFP, dNmCas9-GFP & dSt1-BFP | Human cells | Repetitive elements in telomeres | Demonstrates multi-loci imaging by using dCas9 orthologs fused to different FPs | Live | 13 |
dSpCas9, dSaCas9 | dSpCas9-EGFP, dSaCas9-mCherry | Human cells | Repetitive elements in telomeres & MUC4 gene | Demonstrates dual-color imaging by using dCas9 orthologs | Live | 15 |
dSpCas9, dSaCas9 | dSpCas9-mRuby2, dSaCas9-EGFP | Plant cells | Repetitive elements in telomeres | First demonstration of repetitive loci imaging in plant cells | Live | 16 |
dSpCas9 | dSpCas9-scFv-GCN4-sfGFP-GB1 | Human cells | Repetitive elements in telomeres | First demonstration of SunTag system for CRISPR imaging | Live | 18 |
dSpCas9 | dSpCas9-scFv-GCN4-sfGFP-GB1 | Human cells | Repetitive elements in telomeres, MUC1 & MUC4 genes | Improves labeling efficiency of SunTag by coupling with polycistronic vector | Live & fixed | 19 |
dSpCas9 | dSpCas9-scFv-GCN4-sfGFP, dSpCas9-scFv-GCN4-3XmNeonGreen, dSpCas9-scFv-GCN4-mNeonGreen | Human cells | Telomeres and low-repetitive chromosome loci | Imaging of low-repetitive chromosome loci | Live | 17 |
dSpCas9 | dSpCas9-Halo-Halo ligand-JF646 | Murine cells | Major satellite regions at pericentromeric regions | Allows imaging of pericentromeres without global denaturation | Fixed | 21 |
dSpCas9 | dSpCas9-HaloTag-JF549 | Murine cells | Short interspersed nuclear elements and heterochromatin regions | Demonstrates the interrogation of chromatin in live cells | Live | 20 |
dSpCas9 | Biotinylated dCas9-streptavidin conjugated QD | Pseudorabies virus tracking in cell lines (Vero & HeLa cells) | dsDNA genome | Demonstrates real-time virus tracking in cells | Live | 22 |
dSpCas9 | Biotinylated dCas9-streptavidin conjugated QD525, LAP-dCas9-Tz1-QD625 | HIV-1 integrated proviral DNA in cells | dsDNA genome | Demonstrates dual-color imaging by using QD labeled through bioorthogonal ligation reaction | Live | 23 |
dSpCas9 | LgBiT-dSpCas9, SmBiT-dSpCas9 | Human cells | MUC4 gene | Demonstrates the use of NanoLuc for CRISPR imaging | Live | 25 |
Engineering of sgRNA |
dSpCas9 | dSpCas9-MS2-MCP-EGFP, dSpCas9-PP7-PCP-mCherry | Murine cells | Minor & major satellite regions | First demonstration of engineered sgRNA for dual-color imaging | Live | 14 |
dSpCas9 | dSpCas9-2×MS2-tdMCP-3×EGFP, dSpCas9-3×PP7-tdPCP-3×EGFP, dSpCas9-2×MS2-tdMCP-3×mRuby | Plant cells | Repetitive elements in telomeres | Improved labeling efficiency of telomeres as compared to dCas-GFP | Live | 29 |
dSpCas9 | dSpCas9-MS2-BFP, dSpCas9-PP7-GFP, dSpCas9-boxB-RFP, dSpCas9-MS2-BFP-PP7-GFP, dSpCas9-PP7-GFP-boxB-RFP, dSpCas9-boxB-RFP-MS2-BFP | Human cells | Repetitive elements in telomeres | Demonstrates multiplexing imaging up to six different loci | Live | 26 |
dSpCas9 | dSpCas9-aptamer-tdMCP-mCherry, dSpCas9-aptamer-tdPCP-EGFP | Human cells | Repetitive elements in telomeres, centromeres and MUC4 | Demonstrates long-term imaging up to 26 hours | Live | 27 |
dSpCas9 | dSpCas9-6XMS2 aptamer-MCP-EYFP, dSpCas9-6XPP7 aptamer-PCP-RFP | Human cells | Repetitive elements in telomeres | Similar to Shao et al.27 but with an extension of sgRNA hairpins | Live | 28 |
dSpCas9 | dSpCas9-16XMS2 aptamer-MCP-YFP | Human cells | low and nonrepetitive chromosome loci | Demonstrates the ability of one extended sgRNA to detect eight repeats | Live | 30 |
dSpCas9 | dSpCas9-PUFBS-Clover-PUFc, dSpCas9-PUFBS-Ruby-PUFa | Human cells | Repetitive elements in telomeres and centromeres | First demonstration of Casilio for cell imaging | Live | 31 |
dSpCas9 | dSpCas9-PUFBS-Clover-PUF48107, dSpCas9-PUFBS-IRF670-PUFc, dSpCas9-PUFBS-mRuby2-PUF9R | Human cells | Nonrepetitive genomic loci | Demonstrates the use of three Casilio modules for the tricolor imaging of three nonrepetitive loci | Live | 33 |
dSpCas9 | dSpCas9-PUFBS-Clover-PUF | Murine cells | Repetitive elements in telomeres and major satellites | Improves Casilio system by combining all the elements in a plasmid | Live | 34 |
dSpCas9 | dSpCas9-MS2-mVenus, dSpCas9-PP7-mCherry, dSpCas9-Puf1-PUM-iRF670 | Human cells | long non-coding RNA loci | Demonstrates allele-specific labeling | Live | 35 |
dSpCas9 | dSpCas9-Atto 550-labelled gRNA, dSpCas9-Alexa fluor 488-labelled gRNA | Plant cells, human cells | Repetitive elements in telomeres and centromeres | First demonstration of repetitive loci imaging in formaldehyde-fixed, isolated nuclei and chromosomes | Fixed | 36 |
dSpCas9 | dSpCas9-Atto 550-labelled gRNA, dSpCas9-Alexa fluor 488-labelled gRNA | Plant cells, murine cells | Repetitive elements in telomeres and centromeres | Tris-HCl treatment to enable RGEN-ISL imaging of ethanol:acetic acid-fixed, isolated nuclei and chromosomes | Fixed | 37 |
dSpCas9 | dSpCas9-Halo-Atto 550-labelled gRNA | Plant cells | Repetitive elements in telomeres and centromeres | Modified RGEN-ISL for repetitive loci imaging in paraformaldehyde-fixed plant tissues | Fixed | 38 |
dSpCas9 | dSpCas9-Atto 550-labelled gRNA | Plant cells | Repetitive elements in telomeres and knobs | Simultaneous imaging of DNA repeats (RGEN-ISL), proteins (immunostaining) and DNA replication sites (EdU labeling) in formaldehyde-fixed, isolated nuclei and chromosomes | Fixed | 39 |
dSpCas9 | dSpCas9-Cy3-labeled gRNA | Human cells | Repetitive elements in genomic loci | First demonstration of direct labeling of gRNA for live cell imaging | Live | 40 |
dSpCas9 | dSpCas9-MTS-MB | Human cells | Repetitive elements in telomeres and centromeres | First demonstration of molecular beacon for cell imaging | Live | 41 |
dSpCas9 | dSpCas9-MTS-dual-FRET MB | Human cells | Nonrepetitive genomic loci | First demonstration of dual-FRET molecular beacon for cell imaging | Live | 32 |
dSpCas9 | dSpCas9-pepper-tDeg-tdTomato | Human cells | Telomere, low- and high-copy genomic loci | Demonstrates fluorogenic CRISPR for cell imaging | Live | 24 |
Labeling of dCas9 protein
In 2013, the feasibility of using the CRISPR/Cas9 system for real-time, sequence-specific bioimaging of genomic elements in living human cells without introducing artificially inserted sequences was demonstrated for the first time by Chen and colleagues.3 The dynamic imaging of repetitive sequences in telomeres and the MUC4 coding gene within human cells was achieved using dSpCas9 fused to an enhanced green fluorescent protein (EGFP). The design of the dSpCas9-EGFP fusion construct, which carried two copies of a nuclear localization signal (NLS), greatly influenced its nuclear targeting ability, as the investigated constructs resulted in nuclear localization that ranged from 20% (dSpCas9-NLS-NLS-EGFP) to almost 100% (NLS-dSpCas9-EGFP-NLS and NLS-dSpCas9-NLS-EGFP). During telomere imaging, the authors attributed the frequent nucleolar-like signals with fewer observable telomere spots to unbound dSpCas9-EGFP, as dCas9 has a natural affinity for nucleolar regions.3 This inherent drawback was countered to an extent by improving the design of the sgRNA. Specifically, a putative polymerase-III terminator was removed by an A-U base pair flip to avoid premature termination of transcription, and the dSpCas9-binding hairpin structure was extended to improve the assembly of sgRNA-dCas9. Whether used alone or in combination, both strategies resulted in increased puncta numbers and a concurrent reduction in background and nucleolar signals, which led to an improved SNR. These strategies were also effective in improving SNR when visualizing telomeres with dSpCas9-3×GFP in live plant cells.11 Considering that most sequences in the human genome are nonrepetitive and each dCas9-EGFP carries only one copy of FP, Chen et al.3 demonstrated that an array of 26 to 36 sgRNAs tiled along the target locus enabled the visualization of a nonrepetitive sequence with the dSpCas9-EGFP labeling method. 
In this context, the labeling efficiency depends on the number of functional sgRNAs successfully delivered into the cells.42 This landmark study by Chen et al.3 was followed closely by another study that demonstrated the applicability of using dead human codon optimized Cas9 (dhCas9)-EGFP to visualize highly repetitive sequences in living mouse embryonic stem cells.12 In addition to showing via 3D-FISH that the labeling of dCas9-EGFP/sgRNA was specific and remained associated with targeted DNA sequences during mitosis, Anton et al.12 also explored the combination of dCas9 labeling with 3D structured illumination microscopy to visualize the ultrastructure of diffraction-limited chromatin clusters and study their spatial relationship with known associated proteins.
A multi-colored version of the CRISPR system was subsequently developed by Ma et al.13 that centered on the use of three dCas9 orthologs fused to different FPs. In addition to dSpCas9, the dCas9 orthologs from Neisseria meningitidis (dNmCas9) and Streptococcus thermophilus (dSt1Cas9) were chosen to avoid cross-talk in cognate sgRNA binding between the three orthogonal Cas9 variants. Each dCas9 ortholog was then fused with three copies of green fluorescent protein (GFP), red fluorescent protein (RFP), or blue fluorescent protein (BFP) in tandem to enhance the labeling signal. Contrary to the monochromatic imaging achieved with dCas9-GFP, the simultaneous labeling of various pairs of chromosomal loci afforded by this multi-colored, orthogonal Cas9-based labeling system enables the relative position and movement of these loci to be determined during cellular processes of interest in living cells. While the longer PAMs of dNmCas9 (5′-NNNNGATT-3′) and dSt1Cas9 (5′-NNAGAAW-3′) may reduce off-target binding and confer greater specificity than that of dSpCas9 (5′-NGG-3′), they also restrict the range of sequences that can be targeted by those Cas9 orthologs.43,44 Furthermore, the large size of dCas9-FP proteins increases the difficulty of transfection or viral infection, as multiple large fusion proteins and their corresponding gRNAs have to be introduced into the same eukaryotic cell.14 Hence, Chen et al.15 proposed an alternative Cas9 ortholog to be used in combination with dSpCas9 to create a dual-color CRISPR system. They selected the dCas9 from Staphylococcus aureus (dSaCas9) because its smaller size is advantageous for delivery with viral expression vectors and it has a less restrictive PAM sequence (5′-NNGRRT-3′) than dNmCas9 and dSt1Cas9. 
The resulting green (dSpCas9-EGFP) and red (dSaCas9-mCherry) CRISPR system was shown to resolve two genomic loci spaced by less than 300 kb and to label multiple genomic elements in the same cell through color mixing.15 In a separate study, the simultaneous use of dSpCas9-mRuby2 and dSaCas9-EGFP guided by the same sgRNA was investigated for the visualization of telomeres in living plant cells.16 Although dSaCas9 detected a greater percentage of telomeres (85.5%) compared to dSpCas9 (78%) when used separately, no significant difference in labeling efficiency was observed when these Cas9 orthologs were used simultaneously, as they showed almost complete co-localization.16
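The trade-off between PAM length and targetable range described above can be illustrated with a short sketch. It expands the four PAM consensus sequences quoted in the text using the standard IUPAC nucleotide codes, scans one strand of a sequence for matches, and estimates how often each PAM would occur by chance. The function names, the test sequence, and the assumption of equal base frequencies are illustrative only and do not come from the cited studies.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# PAM consensus sequences (5'->3') as quoted in the text.
PAMS = {
    "dSpCas9": "NGG",
    "dNmCas9": "NNNNGATT",
    "dSt1Cas9": "NNAGAAW",
    "dSaCas9": "NNGRRT",
}


def find_pam_sites(seq, pam):
    """Return 0-based start positions of every PAM match on the given strand.

    A lookahead is used so that overlapping matches are all reported.
    """
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in pam) + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


def pam_probability(pam):
    """Chance that a random position matches the PAM.

    Assumes (hypothetically) that all four bases are equally frequent.
    """
    p = 1.0
    for base in pam:
        n_allowed = len(IUPAC[base].strip("[]"))
        p *= n_allowed / 4
    return p


if __name__ == "__main__":
    for name, pam in PAMS.items():
        print(name, pam, "about 1 site per", round(1 / pam_probability(pam)), "bp")
```

Under the equal-frequency assumption, NGG is expected roughly once every 16 positions per strand, NNGRRT once every 64, NNNNGATT once every 256, and NNAGAAW once every 512, which is consistent with the ordering of PAM restrictiveness given in the text.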
CRISPR imaging strategies relying on dCas9-FP to create a detectable signal on a given chromosomal site tend to have a low SNR due to the limited number of FPs that can be fused to the dCas9 protein and the limited brightness of each FP.13,17 Recognizing that the fusion of more than three copies of FP poses a challenge during plasmid construction due to the size of the FP (∼25 kDa) and increased susceptibility to bacterial recombination, Tanenbaum and colleagues fused a protein scaffold consisting of a repeating peptide (GCN4) array called SunTag to the dCas9 protein instead.18 The SunTag, in turn, binds to multiple single-chain variable fragment antibodies (scFv) fused to FPs, and the resulting controlled protein multimerization leads to an ultrabright fluorescent labeling of molecules. The SunTag system can recruit up to 24 copies of superfolder GFP (sfGFP) for imaging live cell dynamics and structures at the single-molecule level.18 In a separate study, the SunTag system with a 24×GCN4 tag improved labeling efficiency by 2.5-fold and the SNR by 5.4-fold as compared to dCas9-GFP in labeling high-repeat sequences in telomeres.19 By combining a polycistronic vector with the SunTag system, the resulting PoSTAC system allowed the expression of multiple sgRNAs using a single plasmid, with expression levels in single cells higher than co-transfection with plasmids containing individual sgRNAs.19 Nevertheless, the size of the tag can reach up to 1,400 kDa with 24 bound scFv-sfGFPs, and such large tags could potentially interfere with the activity or half-life when fused to a protein.18 A brighter alternative to sfGFP in the SunTag system was investigated by Ye et al.17 and a significant increase in fluorescent intensity was observed with three tandem repeats of mNeonGreen (3XmNeonGreen) compared to a single copy of sfGFP or mNeonGreen. 
Although low-repeat chromosome loci with as few as 15 copies of exact repeats were successfully visualized with sfGFP, mNeonGreen, and 3XmNeonGreen via the SunTag system, the best SNR was achieved when the antibodies were fused to a single copy of mNeonGreen.17
Other than the SunTag system, the dCas9 protein could also be fused with a HaloTag domain as demonstrated by Knight et al.20 and Deng et al.21 The HaloTag is a modular protein tag based on a modified Rhodococcus dehalogenase (DhaA) that readily forms a covalent bond with a HaloTag ligand comprising a chloroalkane linker attached to a useful molecule.45 Therefore, the dCas9-HaloTag system has the capability for multi-color cell imaging by coupling the HaloTag ligands with different fluorescent dyes.21 Although HaloTag lacks the signal amplification capability of SunTag, a single dSpCas9-HaloTag molecule could be visualized using the cell-permeable, fluorescent Janelia Fluor 646 (JF646) as the HaloTag ligand, and the single-particle tracking allowed the diffusion and chromatin binding properties of Cas9 to be determined in living cells.20 The HaloTag system was also used by Deng et al.21 for multicolor labeling of target loci, including repetitive and nonrepetitive genomic sequences in fixed cells while circumventing the harsh and laborious procedures associated with traditional DNA-FISH. The Cas9-mediated fluorescence in situ hybridization protocol not only removes the heat and formamide denaturation treatment entirely but can also shorten the assay time to 15 minutes under optimized conditions, with better preservation of cell morphology and DNA structure under the mild assay conditions (25°C and 37°C).
Whereas advancements in CRISPR-Cas9 imaging systems have mostly focused on visualizing genomic sequences of mammalian cells, the use of quantum dots (QD) has been proposed to address the distinct methodological challenges encountered in single virus tracking.22 Given that QDs have a wide range of colors with excellent photoluminescence and photostability, a CRISPR-Cas9 imaging system that labels the nucleic acids of Pseudorabies virus (PRV) with QDs was devised by Yang et al.22 Briefly, biotinylated dCas9:gRNA complexes and streptavidin-conjugated QDs were first transfected into a mammalian cell line, which was then infected with PRV so that viral nucleic acids would be labeled with QDs during PRV assembly without modifying the envelope and capsid. The gRNA specifically targeted a gene of PRV not involved in viral infection and replication, ensuring that the infection process from viral adsorption to nuclear entry was not affected. The resulting PRV-QDs were then used to track and observe the entire infection process of PRV in different cell lines, making this strategy a powerful tool for studying viral infection.22 In a separate study, a dual-color CRISPR system that utilizes QDs was developed to visualize human immunodeficiency virus 1 integrated proviral DNA at the single copy level in latently infected cells.23 Instead of utilizing Cas9 orthologs from different bacteria to attain dual color labeling,13,15 differentially labeled dCas9 proteins were generated using two biocompatible labeling methods.23 Specifically, lipoic acid ligase (LplA) ligated trans-cyclooctene to the LplA acceptor peptide-tagged dCas9 followed by conjugation with tetrazine-QD625 using the Diels-Alder cycloaddition to generate dCas9-QD625 that emits a red fluorescent signal. The biotin-streptavidin labeling method was then used to couple biotinylated dCas9 to streptavidin-QD525, generating dCas9-QD525 that emits a green fluorescent signal. 
The specificity of the dual-color CRISPR system was further demonstrated by the absence of crosstalk between the two bioorthogonal ligation reactions, and only QDs linked with the CRISPR system could be transported into the cell nucleus. The co-localization of dCas9-QD625 and dCas9-QD525 in the subnuclear position, as a result of specific binding to human immunodeficiency virus 1 provirus integrated within the chromosomes of living cells, was validated using DNA-FISH.
Fluorescence-based imaging is hampered by the naturally high cellular auto-fluorescent background, as well as phototoxicity and photobleaching issues.46 Imaging methods using “always-on” fluorescent reporters that incessantly emit signals also suffer from poor differentiation between bound and unbound reporters.24 Hence, CRISPR/dCas9-FP systems have generally relied on high local concentrations of reporters to distinguish the signal from the background, limiting the technology to the detection of highly repetitive elements targeted by one sgRNA or unique sequences targeted by 26 to 36 non-overlapping sgRNAs.3,13,15,47 To circumvent the autofluorescence, phototoxicity, and photobleaching issues associated with fluorescence imaging, a bioluminescence-based firefly luciferase (Fluc) structural complementation reporter system called Paired dCas9 (PC) was developed by Zhang et al.48 The N- and C-terminal halves of the firefly luciferase (NFluc and CFluc) were each fused to a dSpCas9 protein, and proximity-mediated reassembly of the two Fluc fragments into a full-length luciferase was realized by directing the dCas9-sgRNA-NFluc and dCas9-sgRNA-CFluc to bind to the upstream and downstream segments of a target DNA site, respectively. The reassembled Fluc then acts as a “turn-on” reporter, and a luminescence signal is generated via the catalytic conversion of luciferin to oxyluciferin. Application of the PC system was demonstrated through the detection of the Mycobacterium tuberculosis genome by targeting the 16S rRNA gene region. Compared to Fluc, an engineered luciferase called NanoLuc (Nluc) generates a 150-fold brighter luminescent signal with its substrate furimazine, and it is also significantly smaller in size and more robust.49 These advantages of Nluc were capitalized on by Heath et al.25 through the use of NanoLuc Binary Technology (NanoBiT) to develop a bioluminescence-based Nluc complementation reporter system. 
Similar to the strategy used by Zhang et al.,48 two dCas9-sgRNA-NanoBiT complexes were directed to two target DNA sites with a specific DNA orientation and spacing distance to reassemble the full-length Nluc and restore its catalytic activity to generate luminescence in the presence of furimazine. Live imaging at the single-cell resolution of repetitive and nonrepetitive endogenous genomic sequences, as well as detection of single-copy genome edits induced by CRISPR-Cas9, were demonstrated by Heath et al.,25 but the approach still required further optimization due to the variable SNR observed across different cell lines.
Engineering of sgRNA
The sgRNA is integral to the targeting specificity of dCas9 and has proven to be an equally popular modification site for the recruitment of signal generators in CRISPR imaging. Several studies have described the insertion of MS2, PP7, and/or boxB aptamers into regions of the sgRNA scaffold that protrude from the Cas9/sgRNA complex, allowing FP-tagged RNA binding proteins such as MS2 coat protein (MCP), PP7 coat protein (PCP), and λ N22 peptide to recognize and bind to the corresponding RNA aptamer motifs.14,27–29,50 Optimization of the aptamer insertion site in the sgRNAs remains crucial, as sgRNAs with the same number of aptamer hairpins but with different insertion locations resulted in significantly different SNRs.28 In general, sgRNAs with aptamer hairpins appended to both the tetraloop and stem-loop 2 outperformed those with aptamers appended to only the tetraloop, stem-loop 2, or 3′ end. It was postulated that the double extension of the tetraloop and stem-loop 2 with RNA aptamers helps to stabilize the dCas9/sgRNA assembly and improve the binding of the complex to the target site.3,27,28 In plant cells, labeling efficiency was found to be influenced by the copy number of aptamers in the sgRNA but less so by the types of promoters.29 The orthogonality of the MS2 and PP7 aptamers was largely capitalized to create a dual-color CRISPR imaging system, as demonstrated through the simultaneous imaging of two distinct repeat regions in living cells, such as major and minor satellite regions of murine genomes,14 as well as telomeres and centromeres in human cells.27,28
At the same time, Ma et al.26 sought to extend beyond the imaging limit of dual color systems by developing CRISPRainbow, which relies on MS2/MCP-BFP, PP7/PCP-GFP, and boxB/N22-RFP pairs to generate a wide spectrum of colors. Specifically, two copies of MS2, PP7, or boxB were inserted into the stem-loops of sgRNA to recruit the corresponding FP-tagged RNA binding proteins to generate blue, green, and red signals, respectively. The recruitment of two different FP-tagged RNA binding proteins to generate secondary colors was accomplished by appending two types of aptamers in one sgRNA: MS2 and PP7 (cyan), MS2 and boxB (magenta), and PP7 and boxB (yellow). Finally, the insertion of MS2, PP7, and boxB in the sgRNA leads to the recruitment of MCP-BFP, PCP-GFP, and N22-RFP, resulting in the generation of a tertiary color (white). Another notable modification made to the sgRNA by Ma et al.26 was the replacement of an A-U pair in the first stem-loop with G-C, as it led to a substantial increase in SNR even without stem-loop extension. The simultaneous labeling of up to six distinct repetitive targets in living cells was demonstrated using CRISPRainbow, and the polychromatic range of the system could be potentially used to track the movement of several endogenous chromosomal loci concurrently in the same cell.26 Furthermore, the sgRNA-based labeling method has shown more than a twofold increase in tolerance to photobleaching compared to Cas9-based labeling methods, making it a robust labeling approach for long-term tracking of chromosomal dynamics.27
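The combinatorial color scheme of CRISPRainbow described above can be written out as a small lookup. The sketch below maps the set of aptamers carried by one sgRNA to the color expected from additive mixing of the recruited FPs, and enumerates every non-empty aptamer combination. The function and variable names are hypothetical, and the code only restates the pairings given in the text; it does not model signal intensity or spectral overlap.

```python
from itertools import combinations

# Primary channel contributed by each aptamer's FP-tagged binding partner,
# following the pairs given in the text (MS2/MCP-BFP, PP7/PCP-GFP, boxB/N22-RFP).
APTAMER_CHANNEL = {"MS2": "blue", "PP7": "green", "boxB": "red"}

# Additive color name for each combination of channels, as listed in the text.
MIXED_COLOR = {
    frozenset({"blue"}): "blue",
    frozenset({"green"}): "green",
    frozenset({"red"}): "red",
    frozenset({"blue", "green"}): "cyan",
    frozenset({"blue", "red"}): "magenta",
    frozenset({"green", "red"}): "yellow",
    frozenset({"blue", "green", "red"}): "white",
}


def sgrna_color(aptamers):
    """Return the expected color for an sgRNA carrying the given aptamer types."""
    channels = frozenset(APTAMER_CHANNEL[a] for a in aptamers)
    return MIXED_COLOR[channels]


def all_colors():
    """Enumerate every non-empty aptamer combination and its expected color."""
    names = sorted(APTAMER_CHANNEL)
    result = {}
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            result[combo] = sgrna_color(combo)
    return result


if __name__ == "__main__":
    for combo, color in all_colors().items():
        print("+".join(combo), "->", color)
```

Three aptamer types give seven distinguishable combinations in principle (three primary, three secondary, and white); the text reports simultaneous labeling of up to six distinct targets.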
Whereas prior sgRNAs were engineered to contain two to eight MS2, PP7, and/or boxB hairpins per sgRNA,14,26–28,50,51 Qin et al.30 further augmented the fluorescent signal generated per dCas9-sgRNA complex by extending the 3′ end of the sgRNA with 16 tandem MS2 hairpins to recruit 32 FP-tagged MCP, allowing as few as eight repeats to be detected with one extended sgRNA. Compared to the 26 sgRNAs required in the study by Chen et al.,3 a nonrepetitive locus could be detected with eight unique sgRNAs using a confocal microscope or four unique sgRNAs using lattice light sheet microscopy. Differential labeling of two low-repeat loci was also shown using mCherry-tagged dCas9 by targeting one site with a conventional MS2-free sgRNA and another with an extended sgRNA containing multiple MS2 hairpins to recruit YFP-tagged MCP. However, drawbacks of introducing a high number of repetitive DNA sequences into sgRNA have been highlighted, including technical difficulties in plasmid construction, susceptibility of plasmids encoding multiple tandem repeats to recombination, reduced expression level of sgRNA when inserted with three or more copies of these structured RNA aptamers, as well as potential interference with the normal physiological activities of the labeled loci.31,32,52 Furthermore, an attempt by Hong et al.53 to label human telomeres in living cells using sgRNA containing 3′ 24×MS2 resulted only in non-specific foci that did not overlap with mCherry-TRF1 labeling of telomeres.
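The gain from the extended sgRNA can be made concrete with a back-of-the-envelope count of FPs per locus. The sketch below is only an upper-bound illustration: it assumes every sgRNA site is bound by a dCas9 complex and every MS2 hairpin is fully occupied, which the cited studies do not claim. The ratio of two MCP-FP per hairpin is taken from the 16-hairpin, 32-MCP figure in the text; the function names are hypothetical.

```python
# Illustrative upper-bound count of fluorescent proteins (FPs) at one locus.
# Assumes every sgRNA site is bound and every hairpin is fully occupied.

MCP_PER_MS2_HAIRPIN = 2  # from the text: 16 MS2 hairpins recruit 32 FP-tagged MCP


def fps_per_extended_sgrna(n_hairpins):
    """FPs recruited by one sgRNA carrying n_hairpins MS2 hairpins."""
    return n_hairpins * MCP_PER_MS2_HAIRPIN


def fps_at_locus(n_sgrnas, fps_per_sgrna):
    """Total FPs at a locus covered by n_sgrnas binding sites."""
    return n_sgrnas * fps_per_sgrna


# dCas9-EGFP fusion: one FP per complex, 26 tiled sgRNAs (Chen et al.).
direct_fusion = fps_at_locus(26, 1)

# 16xMS2 extended sgRNA: 8 sgRNAs (confocal) or 4 sgRNAs (lattice light sheet).
confocal = fps_at_locus(8, fps_per_extended_sgrna(16))
light_sheet = fps_at_locus(4, fps_per_extended_sgrna(16))

if __name__ == "__main__":
    print("direct fusion, 26 sgRNAs:", direct_fusion)
    print("16xMS2, 8 sgRNAs:", confocal)
    print("16xMS2, 4 sgRNAs:", light_sheet)
```

Under these idealized assumptions, fewer sgRNAs still place several times more FPs at the locus than the tiled dCas9-EGFP approach, which is the rationale for the reduced sgRNA requirement reported in the text.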
To overcome the limited number of well-characterized RNA aptamers that can be inserted into the sgRNA, Cheng et al.31 created the Casilio system by combining CRISPR-Cas9 and Pumilio RNA-binding protein. Pumilio and fem-3 binding factor (PUF) proteins share a conserved RNA-binding domain that is programmable to bind to a specific 8-mer RNA sequence (PUF-binding site, PBS). The simplicity of the 8-mer PBS motif allows multiple PBS to be inserted at the 3′ end of the sgRNA for recruitment of FPs fused with a PUF domain (PUF-FP) and does not hinder sgRNA transcription or dCas9/sgRNA target binding activity. The fact that the dCas9/sgRNA complex can tolerate the insertion of at least 47 copies of PBS attests to the extensive multimerization capability of the Casilio system.31 Because a nonrepetitive sequence can be detected with only one gRNA containing 15 copies of PBS, the “one-gRNA-per-target” requirement of the Casilio system could reduce the technical challenges of live cell imaging in hard-to-transfect cells and simplify the design of genome-wide gRNA libraries for imaging.33 To overcome the need to co-transfect multiple plasmids, Zhang and Song developed an all-in-one Casilio (Aio-Casilio) system by combining the dCas9 protein, PUF-FP, and sgRNA-PBS in one plasmid.34 The Aio-Casilio system was successfully used for the imaging of repetitive elements in telomeres and major satellites in various cell lines, but attempts to establish a stable labeled cell line with the Aio-Casilio system remained unsuccessful.34 Another advantage of the Casilio system is its amenability to multiplexing by programming the PUF domain to recognize any 8-mer RNA motif. 
This advantage was first demonstrated by using two Casilio modules for the dual-color imaging of centromeres and telomeres in the same living cells.31 A more recent technique called PISCES (Programmable Imaging of Structure with Casilio Emitted Sequence of Signal) uses three Casilio modules for the tricolor imaging of three nonrepetitive locations along a continuous genomic region.33 PISCES could be applied to study chromatin loop formation and structural folding dynamics by enabling the visualization and tracking of multiple reference points over time. However, a comparative study of dCas9/gRNA genome-labeling systems for chromosome imaging indicated that the binding of corresponding proteins on RNA motifs such as PBS and MS2 could lead to the stabilization and accumulation of gRNA transcripts surrounding their transcription cassettes and result in the formation of nonspecific labeling foci.33
In a separate study, allelic positioning across space and time was explored in living cells using a technique called SNP-CLING (Single Nucleotide Polymorphism CRISPR Live-cell imagING).35 Comprising dCas9, sgRNAs with three MS2, PP7, or Puf1 hairpins appended to three stem-loops, and FP-tagged RNA binding proteins (MS2-mVenus, PP7-mCherry, and PUM1-iRFP670), the technique leverages the PAM specificity of dSpCas9 to distinguish between SNPs for allele-specific labeling. Specifically, the substitution of the 2nd or 3rd nucleotide in the dSpCas9 PAM motif by a heterozygous SNP was selected for allele discrimination, as the resulting non-dCas9-specific motif (5′-NYG-3′ or 5′-NGH-3′) will limit the dCas9-binding to only one of the alleles. Spatial allelic distances were investigated using a dual-color approach, whereas a tricolor approach that included the non-allele specific labeling of a third locus allowed changes in the inter-allelic distance relative to intergenic distances to be measured. Other than inserting aptamers and/or PBS motifs in the sgRNA, the transactivating RNA (tracrRNA) or crRNA could be directly labeled with a fluorescent dye instead. This strategy was adopted in the RNA-guided endonuclease-in situ labeling (RGEN-ISL) and CRISPR LiveFISH techniques36–40: the tracrRNA was labeled with Atto 550 in RGEN-ISL, whereas the crRNA was labeled with Cy3 in CRISPR LiveFISH. The fluorescent dye-labeled gRNA and dCas9 were assembled into fluorescent ribonucleoproteins (fRNPs) before the fRNPs were used to label repetitive sequences. 
Given that the crRNA contains the protospacer element that dictates the target binding site, the labeling of tracrRNA may represent a cost-effective approach to studying multiple gRNA-specific sites, as the labeled tracrRNA can be paired with different crRNAs.36 RGEN-ISL was initially developed to visualize repetitive sequences in isolated nuclei and chromosomes of formaldehyde-fixed specimens,36 but a modified RGEN-ISL method that included a decrosslinking treatment with Tris-HCl extended the application of RGEN-ISL to the direct visualization of repetitive sequences in paraformaldehyde-fixed tissue samples.38 The Tris-HCl treatment also enabled RGEN-ISL to be applied to nuclei and chromosomes isolated from ethanol:acetic acid-fixed specimens.37 Visualization of repetitive sequences such as centromeric, telomeric, major, and minor satellite repeats in different species (plants, mice, and humans) with RGEN-ISL has been reported.54 Simultaneous detection of different genomic loci such as telomeres and centromeres could also be achieved with a multicolor RGEN-ISL that uses differentially labeled gRNAs.36 As RGEN-ISL is compatible with immunostaining and 5-ethynyl-deoxyuridine (EdU) labeling, these techniques were successfully used in combination to simultaneously visualize specific DNA repeats, proteins, and DNA replication sites in fixed plant nuclei and chromosomes.39 In contrast to RGEN-ISL, the fRNPs assembled in CRISPR LiveFISH were transfected into living cells to label endogenous repetitive sequences.40 Compared to dCas9-EGFP labeling, the use of fluorescent gRNA resulted in more than fourfold higher SNR, and nucleolar accumulation of gRNA was not observed.40 Demonstrated applications of the LiveFISH fRNPs include the detection of trisomy 13, the tracking of CRISPR-Cas9-mediated editing events, and simultaneous DNA and RNA labeling when used in combination with the dCas13d system.40
Another strategy employed to improve the SNR in fluorescence-based CRISPR imaging systems is the use of a molecular beacon (MB) as a “turn-on” reporter. An MB is a single-stranded oligonucleotide with a stem-loop structure that carries a fluorophore at one terminus and a quencher at the other. The fluorescence is quenched when the stem-loop structure is intact, whereas hybridization between the MB and its target sequence unfolds the stem-loop structure, leading to the restoration of the fluorescent signal.55 This unique property of the MB was capitalized on by Wu et al.41 to develop a CRISPR/MB system that is capable of quantifying genomic loci, monitoring chromatin dynamics, and dual-color imaging of repetitive sequences in live cells. Recruitment of the MB to the target site was achieved by inserting an MB target sequence (MTS) into the sgRNA, with results indicating that stem-loop 2 was more amenable to such modification than the tetraloop or 3′ end of the sgRNA. In addition to selecting previously validated MTS that are not found in the human genome, the specificity of the CRISPR/MB system was further enhanced by synthesizing the MB with an oligonucleotide backbone that is highly resistant to non-specific opening in living cells. To address potential background signals arising from imperfect quenching or non-specific protein binding of single MBs, an improved version of the system called CRISPR/dual-fluorescence resonance energy transfer (FRET) MB was later described by Mao et al.32 Specifically, the sgRNA was further modified by inserting two distinct MTS into stem-loop 2 to recruit corresponding MBs whose fluorophores form a FRET pair. In addition to being more sensitive than CRISPR/MB, the CRISPR/dual-FRET MB system enabled dynamic imaging of nonrepetitive genomic loci using as few as three unique sgRNAs.
In contrast to the previously described MS2-based system carrying 32 MCP-FPs (∼1.76 MDa),30 the CRISPR/dual-FRET MB imaging complex (∼242 kDa) with a sevenfold lower mass would provide a more accurate reflection of normal chromatin dynamics.32 A potential drawback of the MB-based CRISPR system is the need for exogenous delivery of MBs, as they are not genetically encoded.
Fluorogenic CRISPR (fCRISPR) represents a more recent development for genomic DNA imaging with low background fluorescence and high fluorescent activation.24 This fully genetically encoded system incorporates fluorogenic protein technology, whereby fluorogenic proteins are created by fusing a Tat peptide-derived degron domain (tDeg) to FPs. The tDeg causes fluorogenic proteins to be rapidly degraded by the proteasome unless the degron is concealed via binding of the fluorogenic protein to the Pepper RNA aptamer inserted at both the tetraloop and stem-loop 2 of the sgRNA. When the dCas9/sgRNA complex binds to the target genomic DNA, the stabilized fluorogenic protein exhibits strong fluorescence and, along with lower nucleoplasmic background and negligible nucleolar accumulation, results in markedly enhanced genomic loci imaging. The degradation of unbound fluorogenic protein is therefore the basis for the improved SNR in fCRISPR and enables the detection of low-copy genomic loci with as few as 14 copies without signal amplification. The imaging of nonrepetitive sequences using fCRISPR would require a larger number of unique sgRNAs, and further study could potentially overcome this by fusing multiple fluorogenic proteins to amplify the fCRISPR signal. Presently, fCRISPR is amenable to multi-color imaging, as FPs of various colors could be fused with tDeg to generate fluorogenic proteins of different hues. Orthogonal imaging of two genomic loci in living cells was also achievable by combining fCRISPR with another CRISPR-based genomic imaging system that fuses the sgRNA with a fluorogenic aptamer, Broccoli, which in turn binds and activates the small-molecule BI dyes. Other demonstrated applications of fCRISPR include the tracking of chromosome dynamics and length as well as the spatiotemporal study of Cas9-induced DNA double-strand break (DSB) and repair events in real time.24
Other strategies
In addition to the exclusive labeling or recruitment of the signal generator by the dCas9 protein or engineered sgRNA, Hong et al.53 explored several combinations of labeling strategies to develop an optimal bimolecular fluorescence complementation (BIFC)-based CRISPR/dCas9 system for chromosome imaging. As Venus is a constitutively expressed FP, BIFC was selected to reduce the background fluorescence generated by unbound FPs, and several dCas9/gRNA labeling strategies enabling the proximity-mediated assembly of non-fluorescent Venus N- (VN) and C-terminal (VC) fragments into a functional fluorescent Venus protein were tested. In the scFv-BIFC system, SunTag-dCas9 was used to assemble split Venus fragments (scFv-VN and scFv-VC), and the fluorescence background from spontaneously assembled Venus proteins was significantly reduced compared to scFv-Venus. Hong et al.53 also tested two strategies combining the specificity of both dCas9 and gRNA, where functional Venus molecules could only be assembled within the dCas9/gRNA complexes to increase labeling specificity. In the dCas9-MCP-BIFC system, the VC was fused to dCas9, and MCP-VN was recruited to the dCas9/gRNA complexes via sgRNA containing 3′ 2×MS2 to form one functional Venus molecule per dCas9/gRNA complex. The third system, called SunTag-dCas9-MCP-BIFC, used SunTag-dCas9 to recruit multiple scFv-VC and sgRNA containing 3′ 2×MS2 to recruit MCP-VN. All three BIFC-based systems showed no nonspecific foci and greatly increased the SNR compared to the SunTag-dCas9 labeling system using scFv-Venus, with the SunTag-dCas9-MCP-BIFC system showing the highest SNR. The BIFC-based systems were able to label genomic loci containing as few as 40 repeats and could potentially be extended to multi-color imaging by using spectrally distinct BIFC complexes or by being combined with the CRISPRainbow system.
The CRISPR/Cas9-mediated proximity ligation assay (CasPLA) represents another unique approach in which a signal amplification technology is coupled with the CRISPR system for SNP labeling.56 Whereas SNP-CLING uses heterozygous SNPs for allele-specific labeling,35 CasPLA was developed for the detection of homozygous SNPs and was further shown to be capable of identifying mitochondrial DNA (mtDNA) heteroplasmy through the simultaneous detection of wild-type and mutated mtDNA.56 CasPLA involves the binding of two dCas9/sgRNA complexes to the target site, with one sgRNA targeting the mutant or wild-type sequence and the other sgRNA targeting a nearby sequence to facilitate proximity ligation. As a proximity probe binding site has been added to the stem-loop of these sgRNAs, hybridization of DNA proximity probes followed by proximity ligation forms a circular DNA structure that serves as a template for rolling circle amplification (RCA). The long RCA product is then visualized via hybridization with fluorescent dye-labeled detection probes. Hence, hybridization of RCA products from wild-type and mutant mtDNA with Alexa488- and Cy5-labeled detection probes, respectively, enabled mtDNA heteroplasmy to be visualized. The ability of CasPLA to label single-copy genomic loci was also demonstrated, whereby imaging of the wild-type KRAS gene was achieved with high specificity and approximately 60% detection efficiency.
A novel DNA tag called CRISPR-Tag, comprising two to six repeats of CRISPR-targetable DNA sequences from the Caenorhabditis elegans genome, has been proposed by Chen et al.42 for live-cell labeling of nonrepetitive genes at the level of single cells and single loci. Each repeat in CRISPR-Tag can be recognized by up to four sgRNAs, and the fluorescence signal from each dCas9/sgRNA complex was further amplified using a tandem split GFP system in which each dCas9 protein fused to a 14-copy array of GFP11 could recruit as many as 14 copies of GFP. By inserting the CRISPR-Tag into an intron engineered into an FP, the resulting cassette could be integrated into the N- or C-terminus of a specific protein-coding gene to allow simultaneous imaging of the DNA and protein of the target gene. Endogenous genes in human cells can be labeled with a CRISPR-Tag as small as 251 bp (four CRISPR target sites), but longer tags such as the 635 bp CRISPR-Tag (18 CRISPR target sites) would be better suited for most imaging applications due to the higher SNR. More importantly, the small sizes of CRISPR-Tags render them less likely to affect chromatin structure, transcription, replication, and protein expression. The long-term tracking ability of the CRISPR-Tag system was demonstrated by tracking the dynamics of the LMNA locus and FP-tagged LMNA throughout the cell cycle. The CRISPR-Tag technology was subsequently integrated into the TriTag system for the simultaneous imaging of DNA, RNA, and protein.57 Specifically, a hybrid tag comprising CRISPR-Tag (DNA tag) and 12×MS2V5 hairpins (RNA tag) was first assembled before it was inserted into an intron engineered into an FP.
MS2V5, which is generated by mutating nonessential nucleotides in the consensus sequence of MS2 while retaining functionality, not only prevents deletion through recombination but also improves signal and reproducibility in single-RNA detection in live cells.52 Incorporation of the TriTag in an intron instead of the untranslated regions of a target gene enables transcription status to be visualized via the specific detection of nascent transcripts in the nucleus. To demonstrate the triple labeling capacity of the TriTag system, TriTag was inserted into the C-terminus of various protein-coding genes. The genomic loci were then visualized via dCas9-GFP14X, while the intensity of the stdMCP-tdTomato spots that resulted from binding to nascent mRNA reflected the degree of transcriptional activity of the corresponding gene. The resulting proteins were tagged with blue fluorescent protein, and the results showed that the insertion of TriTag did not significantly affect protein expression for any of the genes examined. The TriTag system was further demonstrated as a feasible tool to study the correlation of transcriptional bursting with protein expression, single-allele analysis of transcription kinetics across the cell cycle, and examination of chromatin dynamics in relation to transcriptional status. Both CRISPR-Tag and TriTag involve the modification of endogenous loci with a knock-in step, and applications in live-cell imaging studies are currently limited to protein-coding genes that can be tagged. Low knock-in efficiencies were reported in both studies, and only one of the alleles was successfully modified with a DNA tag in most of the knock-in-positive cells.
Finally, a “track first and identify later” approach that combines the dynamic tracking capability of CRISPR imaging with DNA sequential FISH (seqFISH) was described by Takei et al.58 and Guan et al.59 In both studies, single-color live-cell imaging was conducted first using dCas9-EGFP to obtain dynamic information on multiple genomic loci before the cells were fixed immediately at the end of the live recording. Identities of the labeled loci were then resolved via seqFISH, although different protocols were adopted by Takei et al.58 and Guan et al.59 Specifically, Takei et al.58 performed a highly multiplexed DNA seqFISH whereby each of the tracked regions would undergo sequential rounds of hybridization with a set of FISH probes labeled with a distinct fluorophore, followed by imaging and probe stripping. The color sequence obtained, which is analogous to a barcode, allowed the multiple loci that were tracked during the live recording to be differentiated. To further simplify the FISH protocol, Guan et al.59 developed a fast staining technique that can essentially be completed in 1 min under optimized conditions and introduced a stringent wash step in place of an enzymatic reaction or photobleaching to remove the interference of bound probes. The sequential introduction of probes that were specific to each locus during each round of DNA FISH allowed the identity of each locus to be resolved.
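The barcode-like decoding step of the “track first and identify later” approach can be sketched in a few lines of Python. This is only an illustrative sketch and not the analysis pipeline of either study: the codebook, color names, locus names, and function names are hypothetical, and no error correction is modeled.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical codebook: each locus is assigned a unique color sequence,
# with one color read out per hybridization round.
CODEBOOK: Dict[Tuple[str, ...], str] = {
    ("red", "green", "blue"): "locus_A",
    ("green", "green", "red"): "locus_B",
    ("blue", "red", "green"): "locus_C",
}


def max_barcodes(n_colors: int, n_rounds: int) -> int:
    """Upper bound on distinguishable loci without error correction."""
    return n_colors ** n_rounds


def decode_spot(colors_by_round: Sequence[str]) -> Optional[str]:
    """Map the color sequence observed for one tracked spot to a locus name."""
    return CODEBOOK.get(tuple(colors_by_round))


def decode_tracks(observations: Dict[int, Sequence[str]]) -> Dict[int, Optional[str]]:
    """Assign a locus identity (or None) to every live-cell track ID."""
    return {track: decode_spot(colors) for track, colors in observations.items()}


# Hypothetical readout for two tracks recorded during live imaging.
tracks = {1: ["green", "green", "red"], 2: ["red", "red", "red"]}
print(decode_tracks(tracks))  # {1: 'locus_B', 2: None}
print(max_barcodes(3, 3))     # 27
```

A track whose color sequence is absent from the codebook is returned as None rather than being forced onto the nearest barcode, which keeps ambiguous spots out of downstream analysis in this simplified model.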
Conclusions
Advances in CRISPR-Cas9 imaging hold great potential for dynamic real-time tracking of genomic loci in live cells and have been demonstrated for the imaging of both repetitive and nonrepetitive loci. The various strategies developed to date have led to improvements in sensitivity, specificity, SNR, and multiplexing capacity. However, CRISPR-Cas9 imaging is still in its infancy, as the ultimate goal of translating this technology into clinical practice will require further validation. We foresee the emergence of more strategies for CRISPR-Cas9 imaging and its applications in clinical settings in the near future.
Declarations
Acknowledgement
There is nothing to declare.
Funding
The work was funded by a grant from the Universiti Putra Malaysia under the Geran Putra-Inisiatif Putra Muda (GP-IPM/2022/9715700).
Conflict of interest
The authors declare no conflicts of interest.
Authors’ contributions
Review conception, investigation, resources, and drafting of the manuscript (CYYu, GYA); critical revision of the manuscript for intellectual content (CYYu, KGC, CYYean, GYA). All authors have made a significant contribution to this review and approved the final manuscript.