ABSTRACT
Here, we introduce ‘TICIT’, targeted integration by CRISPR-Cas9 and integrase technologies, which utilizes the site-specific DNA recombinase – phiC31 integrase – to insert large DNA fragments into CRISPR-Cas9 target loci. This technique, which relies on first knocking in a 39-basepair phiC31 landing site via CRISPR-Cas9, enables researchers to repeatedly perform site-specific transgenesis at the exact genomic location with high precision and efficiency. We applied this approach to devise a method for the instantaneous determination of a zebrafish's genotype simply by examining its color. When a zebrafish mutant line must be propagated as heterozygotes due to homozygous lethality, employing this method allows facile identification of a population of homozygous mutant embryos even before the mutant phenotypes manifest. Thus, it should facilitate various downstream applications, such as large-scale chemical screens. We demonstrated that TICIT could also create reporter fish driven by an endogenous promoter. Further, we identified a landing site in the tyrosinase gene that could support transgene expression in a broad spectrum of tissue and cell types. In sum, TICIT enables site-specific DNA integration without requiring complex donor DNA construction. It can yield consistent transgene expression, facilitate diverse applications in zebrafish, and may be applicable to cells in culture and other model organisms.
INTRODUCTION
The zebrafish is a genetically tractable vertebrate animal model that has recently gained popularity in biomedical research (Choi et al., 2021; Demarest and Brooks-Kayal, 2018; Fazio et al., 2020; Gut et al., 2017; Rissone and Burgess, 2018; Sakai et al., 2018; Torraca and Mostowy, 2018). As the scientific community seeks to assign functions to thousands of human genetic variants being identified, the scale and throughput enabled by the zebrafish are particularly useful. Moreover, zebrafish models permit chemical screens guided by successful reversal of disease-related phenotypes in a whole organism, which may substantially reduce discovery time and attrition rate during the development of therapeutics (Helenius and Yeh, 2012; MacRae and Peterson, 2015; Swinney and Anthony, 2011). Hence, developing genome engineering tools to create zebrafish models more effectively and efficiently is expected to propel biomedical research that can later be translated into medicine (Cully, 2019; Patton et al., 2021).
While the zebrafish is highly amenable to genetic manipulations including Tol2-mediated transgenesis and CRISPR-Cas9-mediated mutations, the current methods have vexing shortcomings. Although transgenesis using the Tol2 transposon system is quite efficient, its outcome is often unpredictable and inconsistent between individual animals due to the variability in the number and location of integration event(s) (Abe et al., 2011). Thus, when using this method to compare the functions of different genetic variants, researchers often need to analyze multiple transgenic lines and outcross the animals for several generations in order to obtain reliable, consistent results. In contrast, in mice and other mammalian models, Rosa26 has been extremely useful as a ‘safe harbor’ genomic locus enabling ubiquitous transgene expression or faithful control of transgene expression by its own promoter (Chen et al., 2011; Kong et al., 2014; Soriano, 1999). Single copy integration at the Rosa26 locus can be achieved by homologous recombination in embryonic stem cells or mouse zygotes (Meyer et al., 2010; Soriano, 1999). Nonetheless, to date, a defined safe harbor genomic locus has not been widely recognized in zebrafish. Notably, various methods for targeted integration of large DNA fragments in zebrafish have been developed via homologous recombination, microhomology-mediated end joining (MMEJ), or homology-independent mechanisms (Auer et al., 2013; DiNapoli et al., 2020; Hisano et al., 2015; Hoshijima et al., 2019; Shin et al., 2014; Wierson et al., 2020; Zhang et al., 2016). However, in addition to their low to moderate knock-in efficiencies, these techniques may also lead to high rates of mutations and imprecise integrations at the target sites (Auer et al., 2013; DiNapoli et al., 2020; Hisano et al., 2015; Hoshijima et al., 2019; Shin et al., 2014; Wierson et al., 2020; Zhang et al., 2016). Innovations and tools that allow zebrafish researchers to choose a genomic location of interest and exploit it for facile transgenesis are still lacking.
The DNA integrase of phiC31 bacteriophage is a site-specific recombinase that mediates DNA recombination between two heterotypic binding sequences named attB and attP, which are converted into two attB/attP hybrid sequences, termed attL and attR, after the recombination (Hillman and Calos, 2012). This reaction is irreversible, and thus phiC31 recombinase can mediate stable integration (Hillman and Calos, 2012). Moreover, phiC31-mediated integration does not require any cellular auxiliary factors (Thorpe and Smith, 1998). It has been successfully used to insert large DNA constructs (10-100 kb) into the genomes of mammalian cells, fruit flies, Xenopus, and zebrafish (Belteki et al., 2003; Bischof et al., 2007; Li et al., 2012; Mosimann et al., 2013; Roberts et al., 2014). Previously, Mosimann, et al. and Roberts, et al. generated phiC31 transgenesis recipient zebrafish lines via Tol2 and demonstrated that DNA vectors containing the attB sequence could be inserted into genomic attP sites, resulting in mean germline transmission efficiencies of 34% and 10% in the two studies (Mosimann et al., 2013; Roberts et al., 2014). Encouraged by their results, we explored additional applications using phiC31 integrase.
In this study, we combined CRISPR-Cas9 and phiC31 technologies and developed a workflow enabling facile single-copy, site-specific transgenesis at any user-specified genomic locus (Fig. 1). Using this method, called targeted integration by CRISPR-Cas9 and integrase technologies or TICIT, researchers can pre-select a suitable, well-characterized genomic location for inserting their transgenes. Given that multiple transgenic lines can be generated with insertions occurring at the same locus, expression differences due to different insertion sites can be avoided, and the expression levels of the transgenes among different transgenic lines are expected to be similar. Meanwhile, compared to other homology-directed approaches, this method eliminates the need to build homology arms into the transgene vector for each target locus. Instead, any vector that contains a 70-basepair attB sequence can be used for integration. Researchers can also use TICIT to manipulate their genes of interest. Here, we applied this method to generate allele-tracking markers that allow quick sorting of mutant embryos for any downstream applications (Fig. 1). We characterized a genomic locus that may be useful for future transgenesis studies. Finally, we showed that TICIT can mediate in-frame integration enabling transgene expression controlled by an endogenous promoter (Fig. 1).
RESULTS
Generation of genomic attP landing sites using CRISPR-Cas9
To insert attP into genomic target loci, we used Streptococcus pyogenes Cas9 (SpCas9) to create double-strand DNA breaks and used single-stranded oligodeoxynucleotides (ssODNs) as the donor DNA for DNA repair. This technique has been used to create designer mutations in zebrafish (Boel et al., 2018; Petri et al., 2022; Prykhozhij et al., 2018). Meanwhile, using this method, we have previously shown that small precise edits can be introduced at an allele frequency up to ∼10% in the injected embryos (Petri et al., 2022). We first targeted two SpCas9 cleavage sites in the tyrosinase (tyr) gene, named tyr_1 and tyr_2. Both guide RNAs (gRNAs) can efficiently mutate tyr and elicit the albino phenotype (Jao et al., 2013; Moreno-Mateos et al., 2017). We designed the ssODNs to contain an attP flanked by two short homology arms adjacent to the SpCas9 cleavage sites (Fig. 2A and Table S4). For attP, a 39-basepair (bp) minimal sequence that showed full recombination activity in human cells and had little or no effect on transgene expression was used (Calos, 2006; Kirchmaier et al., 2013; Mosimann et al., 2013). The attP sequence was inserted in different orientations in the ssODNs for these two loci so that both knock-in alleles possessed an in-frame stop codon in tyr. Further, the sequences for the homology arms were complementary to the non-target strand of SpCas9. They were 36 nucleotides (nts) long on the protospacer adjacent motif (PAM)-distal side and 91 nts on the PAM-proximal side. We and others have previously shown that this donor DNA configuration is effective in zebrafish and human cells (Petri et al., 2022; Prykhozhij et al., 2018; Richardson et al., 2016). The ssODNs were chemically synthesized and two phosphorothioate linkages were added to both termini to enhance stability and knock-in efficiency (Prykhozhij et al., 2018).
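The donor layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the design procedure used in the study: the function name `design_ssodn`, its parameters, and the toy sequences are hypothetical, and the sketch assumes that the insert is placed exactly at the SpCas9 cut site (3 bp 5′ of the PAM) with no other changes to the flanking sequence. The actual attP and target sequences are those listed in Table S4 and are not reproduced here.

```python
# Sketch of an asymmetric ssODN donor: an insert flanked by a 36-nt PAM-distal
# arm and a 91-nt PAM-proximal arm, written as the complement of the
# non-target strand. All names and sequences here are hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_ssodn(non_target_strand: str, pam_start: int, insert: str,
                 distal_len: int = 36, proximal_len: int = 91) -> str:
    """Build an ssODN complementary to the non-target strand.

    non_target_strand: 5'->3' genomic sequence that carries the protospacer
        followed by the PAM.
    pam_start: 0-based index of the first PAM base in that sequence.
    insert: sequence to knock in, written in non-target-strand orientation.
    """
    cut = pam_start - 3  # assumed SpCas9 cut position, 3 bp 5' of the PAM
    if cut - distal_len < 0 or cut + proximal_len > len(non_target_strand):
        raise ValueError("sequence too short for the requested homology arms")
    distal_arm = non_target_strand[cut - distal_len:cut]
    proximal_arm = non_target_strand[cut:cut + proximal_len]
    edited_strand = distal_arm + insert + proximal_arm
    # The donor is the complement of the edited non-target strand.
    return reverse_complement(edited_strand)


# Toy input only: homopolymer runs make the arm boundaries easy to see.
# These are NOT real tyr or attP sequences.
toy_locus = "A" * 40 + "C" * 100
toy_insert = "G" * 39  # stands in for the 39-bp minimal attP
donor = design_ssodn(toy_locus, pam_start=43, insert=toy_insert)
print(len(donor))
```

With a 39-nt insert, the donor built this way is 36 + 39 + 91 = 166 nt long, and the PAM-proximal arm ends up at the 5′ end of the oligonucleotide because the donor is written as the reverse complement of the non-target strand.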
We microinjected 1-cell stage zebrafish embryos with SpCas9 and gRNA ribonucleoprotein (RNP) complexes and the ssODN for each gRNA. Subsequently, successful knock-in events were identified in two ways. First, we conducted two-step nested PCR to detect the attP integration in the tyr gene. As illustrated in Fig. 2B for the tyr_1 locus, a pair of tyr-specific primers (denoted as F1 and R1) located outside the region of the donor DNA was used for the first round of nested PCR to amplify the targeted locus. We used another tyr-specific primer (R2) and an attP-specific primer (F2) for the second-round PCR to amplify the attP knock-in alleles. This yielded PCR products of both expected and incorrect lengths, suggesting that some knock-in alleles may contain additional insertions or deletions (Fig. 2C). Second, PCR-positive samples were chosen, and their amplification products generated using a set of tyr-specific primers were subjected to next-generation sequencing. The results showed that this method had successfully inserted the entire attP site into both target loci (Table 1).
We raised the injected embryos to adulthood, outcrossed them to wild-type fish, and screened for founders. The same nested PCR strategy was employed to identify the embryos carrying the attP knock-in alleles (Fig. 2D). Further, the attP knock-in sequences in the F1 embryos were verified via Sanger sequencing (Fig. 2E,F). For tyr_1, we identified one founder from 16 F0 fish screened. For tyr_2, we identified one founder from three F0 fish screened. Thus, founder frequencies were 6.3% and 33.3% for tyr_1 and tyr_2, respectively. Germline mosaicism of the identified founders was 5.5-11.9% (Table 2). When F1 fish reached adulthood, we identified heterozygous fish carrying the attP knock-in alleles (hereafter named the attPtyr_1 and attPtyr_2 alleles) by fin clipping and PCR. We incrossed heterozygous fish for both lines and found that approximately 25% of their offspring showed the albino phenotype (data not shown), indicating that both attP insertions disrupted the tyr gene as expected. Together, these results indicate that we have established tyr mutant lines with the phiC31 landing site in the pre-defined tyr loci.
To evaluate the robustness of the attP knock-in method, we sought to insert attP into two more genes – the glial fibrillary acidic protein (gfap) gene and the potassium voltage-gated channel, subfamily H, member 6a (kcnh6a) gene. We designed and tested two to three gRNAs for each gene and identified one with 57% mutation efficiency for gfap and another one with 38.6% mutation efficiency for kcnh6a based on the PCR-fluorescent fragment length analysis (Table S5). Next, we designed ssODNs to contain an attP encompassed by two short homology arms as described above, except that we avoided having a premature stop codon in the knock-in alleles of these genes (Table S4). We performed microinjection of SpCas9 protein, gRNA, and ssODN, and we detected successful attP knock-in in the microinjected embryos via next-generation sequencing (Table 1). Following similar genotyping strategies used for the tyr loci, we identified two attPgfap founders from 33 F0 fish screened and three attPkcnh6a founders from 27 F0 fish screened (Table 2). Thus, founder frequencies for attPgfap and attPkcnh6a were 6.1% and 11.1%, respectively. The knock-in sequences were verified (Fig. 2G,H), and the germline mosaicism was 10-33% for the founders (Table 2). In sum, we have successfully created phiC31 transgenesis recipient lines in multiple zebrafish genes.
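The founder frequencies quoted above follow directly from the screening counts. The short sketch below recomputes them; the dictionary layout and the function name `founder_frequency_pct` are illustrative choices, while the counts are those reported in the text. `Decimal` with half-up rounding is used because Python's built-in `round` rounds 6.25 to 6.2 rather than 6.3.

```python
from decimal import Decimal, ROUND_HALF_UP

# (founders identified, F0 fish screened), as reported in the text
screens = {
    "tyr_1": (1, 16),
    "tyr_2": (1, 3),
    "gfap": (2, 33),
    "kcnh6a": (3, 27),
}


def founder_frequency_pct(founders: int, screened: int) -> Decimal:
    """Founder frequency in percent, rounded half-up to one decimal place."""
    pct = Decimal(100 * founders) / Decimal(screened)
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


frequencies = {locus: founder_frequency_pct(*counts)
               for locus, counts in screens.items()}
for locus, pct in frequencies.items():
    print(f"{locus}: {pct}%")
```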
Generation of allele-tracking reporter lines using phiC31 integrase
Theoretically, an attP site located in an endogenous gene will enable the facile generation of a fluorescently tagged mutant allele via the phiC31 integrase technology, eliminating the need for time-consuming genotyping procedures. To test this, we mated heterozygous attPtyr_1 fish to wild-type fish. We microinjected their embryos with in vitro transcribed phiC31 mRNA and the plasmid pDestattB_ubi:EGFP originally developed by Mosimann et al. (Mosimann et al., 2013). This plasmid contains a 70-bp attB sequence and the EGFP gene driven by the ubiquitin (ubi) promoter, which can elicit ubiquitous green fluorescence from an early embryonic stage (Mosimann et al., 2011). We used two sets of PCR primers to detect the 5′ and 3′ ends of phiC31-mediated integration in the injected embryos (Fig. 3A). The data showed that 13 out of 24 analyzed embryos exhibited correct integration at both 5′ and 3′ ends (Fig. 3B). Since it was expected that only half of the embryos would carry the attP knock-in allele, these results suggest that phiC31-mediated DNA recombination was very efficient at the attPtyr_1 locus. We performed the same test using heterozygous attPtyr_2 fish. However, PCR analysis did not detect any integration events, suggesting that the attPtyr_2 locus may be defective or inaccessible for phiC31 integrase (data not shown). Consequently, we chose attPtyr_1 for the following study. Sanger sequencing results confirmed the attL and attR sequences flanking the integration of pDestattB_ubi:EGFP at the attPtyr_1 locus, indicating precise recombination between attB and attP (Fig. 3C). Taken together, these results demonstrate that phiC31 integrase can mediate precise and efficient DNA integration in zebrafish.
To evaluate germline transmission of the recombinant allele, we raised the injected embryos from heterozygous attPtyr_1 outcrosses to adulthood. We screened three fish and identified one founder that produced green-fluorescent offspring (Fig. 4A). The ratio between green and non-green F1 embryos was approximately 1:1. Using PCR, we could detect the correct integration of ubi:EGFP at the attPtyr_1 locus in 50% of the green fluorescent embryos, suggesting that this founder carried not only phiC31-mediated but also random integration (Fig. 4B). This result is unsurprising since random integration can occur at low frequency from plasmid DNA microinjection. It should be noted that random integration has been observed less frequently compared to phiC31-mediated integration into transgenic attP sites in zebrafish, and there are presently no known functional pseudo-attP sites in the zebrafish genome (Mosimann et al., 2013). We raised green-fluorescent F1 fish to adulthood and performed fin-clipping and PCR to identify the F1 fish that harbored the correct ubi:EGFP integration. Since these fish could still carry more than one EGFP integration (Fig. 4C), we outcrossed them and selected the ones that produced approximately 1:1 of green versus non-green embryos. Further, we verified that all fluorescent F2 progeny we analyzed carried the correct integration by PCR (Fig. 4D). These results indicate that we have successfully generated heterozygous fish carrying a single-copy, ubi:EGFP reporter in the tyr gene (denoted as the ubi:EGFPtyr allele).
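The reasoning behind selecting fish that give approximately 1:1 green to non-green offspring can be made explicit with a small calculation. The sketch below assumes simple Mendelian segregation of heterozygous, unlinked insertions in an outcross to wild type; the function name is hypothetical, and linked insertions or germline mosaicism would change the numbers.

```python
from fractions import Fraction


def expected_fluorescent_fraction(n_unlinked_insertions: int) -> Fraction:
    """Expected fraction of fluorescent offspring from an outcross to wild type.

    Assumes each insertion is heterozygous, segregates independently, and is
    sufficient on its own to make an embryo fluorescent.
    """
    if n_unlinked_insertions < 0:
        raise ValueError("number of insertions cannot be negative")
    # An embryo is non-fluorescent only if it inherits none of the insertions.
    return 1 - Fraction(1, 2) ** n_unlinked_insertions


for n in (1, 2, 3):
    print(n, expected_fluorescent_fraction(n))
```

Under these assumptions, a single insertion gives one half fluorescent offspring (1:1), whereas two unlinked insertions give three quarters (3:1), so a ratio close to 1:1 is consistent with a single integration.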
A similar workflow was carried out to generate a red fluorescent reporter line to track the tyr mutant allele. We generated the pDestattB_ubi:mCherry construct and injected it with the phiC31 mRNA into the embryos of heterozygous attPtyr_1 fish outcrosses. When the injected embryos reached adulthood, we screened six F0 fish that exhibited high mCherry mosaicism and identified one founder (Fig. 4E). This founder fish produced approximately 50% red fluorescent progeny when crossed with a wild-type fish. Moreover, all fluorescent embryos analyzed by PCR showed a correct integration (Fig. 4F). We raised red fluorescent F1 fish to adulthood, outcrossed three fish to the wild-type fish, and found that all three carried a single-copy integration of ubi:mCherry at the tyr gene. Overall, these results demonstrate that attP fish lines are useful transgenesis recipients. One attP fish line can be used to derive multiple integration lines with ease. Meanwhile, phiC31 integrase can mediate efficient and transmissible DNA integration at genomic attP landing sites.
Instantaneous visual genotyping using the tyr allele-tracking reporter lines
Having created the allele-tracking reporter lines with two different colors for the tyr gene, we sought to demonstrate the feasibility of a technique for instantaneous visual genotyping. In this method, by mating two heterozygous mutant fish that each has its own marker linked to the same mutation, one can determine whether a progeny is a wild type, heterozygote, or homozygote simply by examining its color. Hence, we mated a heterozygous ubi:EGFPtyr fish to a heterozygous ubi:mCherrytyr fish and found that all double fluorescent fish exhibited the albino phenotype and vice versa (Fig. 5). Thus, these results demonstrate the advantages of constructing mutant lines along with allele-tracking reporters via the combination of CRISPR-Cas9 and phiC31 technologies.
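The logic of the two-color cross can be laid out as a simple Punnett square. The sketch below is illustrative only: the allele symbols and the `classify` function are hypothetical, and it assumes that each reporter is fully linked to the tyr mutation and that the alleles segregate in Mendelian fashion.

```python
from collections import Counter
from itertools import product

# Each parent carries one wild-type tyr allele ("+") and one reporter-tagged
# mutant allele: "G" for ubi:EGFPtyr, "R" for ubi:mCherrytyr.
egfp_parent = ("+", "G")
mcherry_parent = ("+", "R")


def classify(alleles):
    """Map an offspring's two tyr alleles to (observed color, inferred genotype)."""
    tagged = [a for a in alleles if a != "+"]
    if not tagged:
        return ("no fluorescence", "wild type")
    if len(tagged) == 1:
        color = "green only" if tagged[0] == "G" else "red only"
        return (color, "heterozygous")
    return ("green and red", "homozygous tyr mutant (albino)")


# Enumerate the four equally likely allele combinations.
offspring = Counter(classify(pair)
                    for pair in product(egfp_parent, mcherry_parent))
for (color, genotype), count in offspring.items():
    print(f"{color}: {genotype} ({count}/4)")
```

Under these assumptions each of the four color classes is expected in one quarter of the offspring, and only the double-fluorescent class carries two disrupted tyr alleles.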
Broad transgene expression at the attPtyr landing site
In the ubi:EGFPtyr fish, we observed broad and strong green fluorescence from an early embryonic stage, suggesting that the attPtyr landing site could support transgene expression in multiple tissues. To investigate this further, we examined the fluorescence of heterozygous ubi:EGFPtyr fish from embryonic to adult stages. Using a fluorescent stereoscope, we could see faint EGFP expression starting 6 h post fertilization (hpf) (Fig. 6A). The intensity of green fluorescence soon became apparent, appeared ubiquitous, and continued throughout the embryonic stages (Fig. 6A). All heterozygous ubi:EGFPtyr adult fish showed consistently strong and ubiquitous green fluorescence from outside (Fig. S1A). We dissected the adult fish and found that EGFP was expressed in all organs and tissue types, such as muscle, gill, eye, brain, skin, spleen, heart, liver, kidney, intestine, pancreas, testis, and ovary (Fig. S1B). However, the fluorescence was noticeably absent in mature eggs, which was different from another ubi reporter line (ubi:loxP-EGFP-loxP-mCherry, also known as ubi:Switch) that showed maternally deposited EGFP expression (Fig. S1C) (Mosimann et al., 2011). Next, to examine some of the tissue and cell types more closely, we performed immunohistochemistry (IHC) using an anti-GFP antibody. The results showed that EGFP could be detected in all tissue sections, even though it was not at the same level among different cell types (Fig. 6B and Fig. S2). These data are expected as it was previously shown that the ubi promoter renders a ‘ubiquitous’, but not necessarily ‘homogenous’ expression in all cell types (Mosimann et al., 2011). On the other hand, we cannot preclude the possibility that the heterogenous levels of GFP expression might also be due to a potential interaction between the tyr_1 locus and the integrated ubi promoter.
Further, we isolated hematopoietic cells from the whole kidney marrow, the hemogenic tissue in zebrafish, and analyzed the fluorescence of various blood lineages using flow cytometry (Fig. 6C). We found that EGFP was expressed in >90% of hematopoietic progenitors, lymphocytes, and myelomonocytes and ∼40% of erythrocytes. Taken together, these findings demonstrate that the attPtyr landing site can support transgene expression in a wide range of tissue and cell types.
Generation of a promoter-tagging line via TICIT
Another potential application of single-copy transgenesis at a user-specified genomic location is to generate gene-tagging or promoter-tagging lines. As a proof of concept, we sought to insert a reporter gene into the attPgfap locus, because gfap is abundantly expressed in the glial cells of the eye and the central nervous system (CNS) during embryonic development. To do this, we first generated the plasmid pGEM-T_attB-P2A-EGFP by inserting the coding sequences of the self-cleaving peptide P2A and a promoter-less EGFP gene next to the attB site in pGEM-T (see the Materials and Methods). We expected that phiC31-mediated recombination between the plasmid DNA and the genomic attPgfap locus should result in an in-frame integration of EGFP after the start codon of gfap (Fig. 7A). Thus, we microinjected pGEM-T_attB-P2A-EGFP and the phiC31 mRNA into the embryos of heterozygous attPgfap fish outcrossed to the wild-type fish. In the injected embryos, we could readily see mosaic EGFP expression, specifically in the CNS (Fig. 7B). When the injected embryos reached adulthood, we screened one F0 fish with EGFP expression in the CNS and found that it produced 18% fluorescent progeny (32 out of 175 embryos). All F1 fluorescent embryos showed a consistent level and pattern of EGFP expression in the CNS (Fig. 7B), and all fluorescent embryos analyzed by PCR showed a correct integration (Fig. 7C). Further, confocal imaging analysis showed EGFP expression in the eye and the brain at 1 day post fertilization (dpf) (Fig. 7D). While the fluorescence in the head region subsided after 1 dpf, its intensity in the posterior CNS region persisted through later stages, correlating with the reported gfap expression pattern (https://zfin.org) (Fig. S3). These results demonstrate that TICIT can be applied to generating transgenic animals in which an endogenous promoter of interest controls transgene expression.
DISCUSSION
Developing research tools can serve as a potent catalyst for biomedical research. In recent years, CRISPR technologies have been widely embraced by the scientific community and have spawned tens of thousands of publications and applications (Anzalone et al., 2020). However, even with current CRISPR methods, the efficiencies for targeted integration of large DNA fragments in mammalian cells and various model organisms still need to improve, primarily due to their dependencies on cellular DNA repair mechanisms (Prill and Dawson, 2020; Yang et al., 2020). Hence, these methods are unsuitable for large-scale experimentation. In contrast, phiC31 integrase mediates DNA integration independently and with precision and high efficiency (Mosimann et al., 2013; Roberts et al., 2014). However, its recognition sites are not programmable. This study describes a two-step protocol for facile targeted integration of long DNA fragments into the zebrafish genome. A workflow of this method is provided in Fig. S4. As illustrated in Fig. 1, we show that this new technique, named TICIT, can open doors to diverse downstream applications for zebrafish research.
In the first step of this protocol, we successfully engineered a phiC31 landing pad at four loci in three genes (tyr_1, tyr_2, gfap, and kcnh6a) via SpCas9 and ssODNs, at founder frequencies ranging from 6% to 33%. This method is simple and effective and does not require DNA cloning. Moreover, using this method, researchers can place the attP site precisely to create or avoid an in-frame stop codon for different applications, as shown in this report.
In the second step of this protocol, phiC31 integrase is employed to insert an entire plasmid DNA at the target site. We only needed to screen a small number of fish to identify a founder. The founder frequency ranged from 16% to 100% at two sites, attPtyr_1 and attPgfap. We showed that zebrafish lines carrying a functional landing site can be used repeatedly to generate multiple integration lines efficiently. These results demonstrate the value of the TICIT protocol. Nonetheless, this method is not without limitations. In this study, we did not detect any phiC31-mediated recombination at the attPtyr_2 site. A defective or inaccessible attP landing site has been reported before (Mosimann et al., 2013). It is presently unknown why some attP landing sites are refractory to phiC31 integrase. Since there is no CpG dinucleotide that could be methylated in the attP sequence, other DNA modifications or DNA-binding proteins may render some loci inaccessible to the enzyme. Variability in the efficiency of Cre-mediated recombination at different genomic locations has also been documented (Lalonde et al., 2022). Similarly, the phiC31 landing pad inserted in different genomic locations may have different integration efficiencies (Mosimann et al., 2013; Roberts et al., 2014). This should not preclude phiC31 integrase from being a powerful tool.
With this method, we developed an approach for marking zebrafish with fluorescent reporters indicating whether a fish is wild-type, heterozygous, or homozygous for an allele of interest. This technique allows genotyping by visual inspection and can save time and money for routine zebrafish work. It could be equally useful for researchers working with mice, rats, or other organisms. The technique may also enable new experiments by allowing early identification of mutant animals in a mixed population. For example, it can be used to sort a homogeneous population of homozygous mutants for use in high-throughput screening for potential treatments of genetic disorders. We used the ubi promoter to enable early detection and easy sorting. Other promoters may also be suitable.
Furthermore, the results showed that phiC31 mediated unidirectional integration with high fidelity since we could identify correct attR and attL sequences flanking the inserted DNA. This enables efficient in-frame insertion of a fluorescent marker into a targeted gene, which may be helpful when studying previously uncharacterized genes. An important consideration in any promoter-tagging design is the strength of the promoter of interest. For weaker promoters, a signal amplification strategy such as the UAS-Gal4 system may be needed (Wierson et al., 2020). In addition to reporter genes, researchers can also use TICIT to express other transgenes from an endogenous promoter. For example, by expressing the cDNAs of various genetic variants, this technique could be used for gene replacement in comparative studies of genetic variations.
One of the most significant benefits of single-copy, site-specific transgenesis compared to transgenesis via Tol2 transposon is the ability to obtain faithful transgene expression controlled by its promoter without positional and copy number artifacts. This could be achieved using phiC31 integrase and a zebrafish line carrying a ‘safe harbor’ or well-defined phiC31 docking locus. Hence, developing this technique may widen the use of the zebrafish as a platform to rapidly assess the functions of genetic variants in the coding and non-coding regions identified in genome-wide association studies (GWAS) (Edwards et al., 2013; Gusev et al., 2018; Tucker et al., 2017). Indeed, the phiC31 transgenesis method via several Tol2-generated attP lines has been shown to be superior to the Tol2 transgenesis method when evaluating human cis-regulatory elements in zebrafish (Bhatia et al., 2021; Roberts et al., 2014). The landing sites in the integration recipient lines used in previous studies have been mapped to intergenic and intragenic regions (Bhatia et al., 2021; Mosimann et al., 2013; Roberts et al., 2014). Here, we demonstrate that the attPtyr_1 landing site exhibits high recombination efficiency. Fish harboring ubi:EGFP and ubi:mCherry in the attPtyr_1 locus showed strong, broad, and consistent expression levels over several generations. It should be noted that these data cannot rule out the possibility that the tyr locus interacts with the ubi promoter or other regulatory elements inserted into this locus. In addition to the ubi promoter, investigation of the expression patterns of additional promoters in this locus is warranted. Though transgene integration at the attPtyr_1 site disrupts one copy of the tyr gene, we have not observed any noticeable phenotypes in heterozygous tyr zebrafish, suggesting that the attPtyr_1 fish may be used as transgenesis recipients for other purposes. Moreover, it would be interesting to explore more integration sites for phiC31 using CRISPR-Cas9.
We show that CRISPR-Cas9 and phiC31 technologies can be efficiently combined to construct novel genome engineering tools and zebrafish models. Presumably, it may be possible to perform attP insertion and DNA integration via phiC31 in one single step, which has been successfully demonstrated in human cells using the prime editor system instead of SpCas9 (Anzalone et al., 2022; Yarnall et al., 2023). Thus, this will be a useful future direction for zebrafish. Moreover, identifying other microbial DNA integrases that exhibit high efficiencies in zebrafish will be helpful, which may enable more sophisticated experimental designs involving multiple DNA integrations or cassette exchange (Durrant et al., 2023; Low et al., 2022; Yarnall et al., 2023). It can be expected that a broader adoption and more creative uses of the CRISPR and integrase technologies in zebrafish and other model organisms will play an important role in accelerating more transformative biomedical research in the near future.
MATERIALS AND METHODS
Generation of gRNAs and Cas9 protein
All gRNAs used in this study were produced via in vitro transcription as previously described (Gagnon et al., 2014). Target site and oligonucleotide sequences are listed in Tables S1 and S2, respectively. Briefly, a gene-specific oligonucleotide was annealed to a constant oligonucleotide encoding the CRISPR-Cas9 scaffold and then a fill-in reaction was performed using T4 DNA polymerase (New England Biolabs). For the tyr_1 and tyr_2 sites, the 76-nt SpCas9 (C9) scaffold was used. For gRNAs targeting gfap and kcnh6a, the 86-nt, enhanced SpCas9 (C9E) constant oligonucleotide was used (Petri et al., 2022). The products were purified using Monarch® PCR & DNA Cleanup Kit (New England Biolabs) and then used as the template for in vitro transcription. In vitro transcription was performed using HiScribe™ T7 or SP6 High Yield RNA Synthesis Kit (New England Biolabs). RNA was subsequently purified using Monarch® RNA Cleanup Kit (New England Biolabs).
For the generation of Cas9 protein, the plasmid pET-28b-Cas9-His (Addgene, #47327) was transformed into Rosetta (DE3) competent cells (Novagen) following the manufacturer's instructions. The production and purification of Cas9 protein were carried out as previously described (Gagnon et al., 2014). Single-use aliquots were stored at −80°C.
Plasmid construction
To generate the plasmid for in vitro transcription of the mouse codon-optimized phiC31 integrase (phiC31o) mRNA, a T7 promoter was added to the pPhiC31o plasmid (Addgene, #13794). Briefly, two oligonucleotides, EcoRI-T7 and T7-EcoRI (Table S3), were hybridized with each other and ligated to EcoRI-linearized pPhiC31o. The construct (pT7_PhiC31o) was transformed into NEB® Turbo competent E. coli cells (New England Biolabs) and verified by Sanger sequencing after plasmid DNA extraction.
To generate pDestattB_ubi:mCherry, the mCherry coding sequence was PCR amplified from the plasmid pGFP-bait-MCS-NTR-mCherry using GoTaq® DNA Polymerase (Promega) with primers listed in Table S3. PCR products were purified using the Monarch® PCR & DNA Cleanup Kit (New England Biolabs) and then digested with BspHI and MfeI-HF (New England Biolabs). Meanwhile, the plasmid pDestattB_ubi:EGFP (Addgene, #68339) was digested with NcoI-HF and MfeI-HF (New England Biolabs) to generate the vector backbone. The digested mCherry and vector backbone fragments were purified using Zymoclean Gel DNA Recovery Kit (Zymo Research) and then ligated using T4 DNA ligase (New England Biolabs). The construct was transformed into NEB® Turbo competent E. coli cells (New England Biolabs) and verified by Sanger sequencing after plasmid DNA extraction.
To generate the pGEM-T_attB-P2A-EGFP plasmid, the EGFP coding sequence was PCR amplified from the plasmid pDestattB_ubi:EGFP using GoTaq® DNA Polymerase (Promega) with the attB-P2A-EGFP-F and pA-r primers (Table S3). The PCR product was purified and ligated to the pGEM®-T vector (Promega). The construct was transformed into NEB® Turbo competent E. coli cells (New England Biolabs), and Sanger sequencing of the resulting clone revealed a partial truncation in the attB-P2A-EGFP-F primer region. Hence, we designed another primer, attB-P2A-EGFP-F1 (Table S3). The correct attB-P2A-EGFP sequence was PCR amplified from the plasmid carrying the truncated sequence using the attB-P2A-EGFP-F1 and pA-r primers. The PCR product was purified and subcloned into pGEM®-T. The final construct, pGEM-T_attB-P2A-EGFP, was verified by Sanger sequencing.
Oligonucleotide and mRNA synthesis
All oligonucleotides, including the ssODNs for attP knock-in, were ordered from Integrated DNA Technologies. The phiC31 integrase mRNA was prepared by in vitro transcription with the mMESSAGE mMACHINE™ T7 Transcription Kit (Invitrogen), using EcoRI-linearized plasmid pCDNA3.1_phiC31 (Addgene, #68310) or HindIII-linearized plasmid pT7_PhiC31o as the template. The former vector contains a phiC31 coding sequence (phiC31) (Bischof et al., 2007; Mosimann et al., 2013), whereas the latter contains a mouse codon-optimized phiC31 coding sequence (phiC31o) (Raymond and Soriano, 2007). RNA was subsequently purified using the Monarch® RNA Cleanup Kit (New England Biolabs).
Zebrafish microinjection
All zebrafish husbandry and experiments were approved by the Massachusetts General Hospital Subcommittee on Research Animal Care and performed in accordance with the guidelines of the Institutional Animal Care and Use Committee at the Massachusetts General Hospital.
Microinjections were performed on TuAB zebrafish embryos at the 1-cell stage, using approximately 2 nl of injection solution per embryo. For attP knock-in experiments, the injection solution contained 480 ng/µl of Cas9 protein, 230 ng/µl of gRNA, and 0.5-1 µM of ssODN. To prepare the injection mix, Cas9 protein and gRNA were combined and incubated at room temperature for 5 min before the ssODN was added. We later used 288 ng/µl of Cas9 protein, 70-80 ng/µl of gRNA, and 0.5 µM of ssODN for the gfap and kcnh6a target sites to circumvent embryo death and deformity caused by high rates of gfap and kcnh6a mutations. For plasmid DNA integration experiments, the injection solution contained 12.5 ng/µl of phiC31 mRNA or 5 ng/µl of phiC31o mRNA, together with 12.5 ng/µl or 25 ng/µl of plasmid DNA (pDestattB_ubi:EGFP for the tyr_1 and tyr_2 targeted sites and pGEM-T_attB-P2A-EGFP for the gfap targeted site). We used the mRNA of both phiC31 and phiC31o in these experiments and observed no consistent differences in their performance. For testing gfap and kcnh6a gRNA efficiencies, the injection solution contained 288 ng/µl of Cas9 protein and 80 ng/µl of gRNA. Injected embryos were incubated at 28.5°C after injection.
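As a quick sanity check on the injection mixes, the per-embryo dose follows directly from the stated concentrations and the approximate 2 nl injection volume. The sketch below is illustrative arithmetic only; the function names are ours, and the 2 nl volume is treated as exact even though the text gives it as approximate.

```python
# Illustrative arithmetic only. Concentrations come from the text; the
# ~2 nl injection volume is approximate, so the doses are estimates.
INJECTION_VOLUME_NL = 2.0


def dose_pg(conc_ng_per_ul: float, volume_nl: float = INJECTION_VOLUME_NL) -> float:
    """Mass delivered per embryo in pg (1 ng/ul equals 1 pg/nl)."""
    return conc_ng_per_ul * volume_nl


def amount_fmol(conc_um: float, volume_nl: float = INJECTION_VOLUME_NL) -> float:
    """Molar amount delivered per embryo in fmol (1 uM equals 1 fmol/nl)."""
    return conc_um * volume_nl


if __name__ == "__main__":
    # Initial attP knock-in mix described in the text.
    print(f"Cas9 protein: {dose_pg(480.0):.0f} pg per embryo")
    print(f"gRNA:         {dose_pg(230.0):.0f} pg per embryo")
    print(f"ssODN:        {amount_fmol(0.5):.1f}-{amount_fmol(1.0):.1f} fmol per embryo")
```

Under these assumptions, the initial knock-in mix delivers roughly 960 pg of Cas9 protein, 460 pg of gRNA, and 1-2 fmol of ssODN per embryo.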
Zebrafish genomic DNA extraction
Genomic DNA was extracted from fin clips of adult fish or from embryos at 1 or 2 dpf. Zebrafish embryos that developed normally were lysed singly or as pools in lysis buffer (5 µl per embryo at 1 dpf, 8-10 µl per embryo at 2 dpf, and 30 µl per fin clip). The lysis buffer consisted of 10 mM Tris-HCl (pH 8.0), 2 mM EDTA (pH 8.0), 0.2% Triton X-100, and 0.5% Proteinase K. Lysates were incubated at 50°C overnight with occasional mixing until they turned clear and were then heated at 95°C for 10 min to inactivate Proteinase K. Genomic DNA was stored at 4°C.
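The per-sample lysis volumes above scale linearly when embryos are pooled. A minimal sketch of that bookkeeping follows; the dictionary keys and function name are hypothetical labels introduced here, and only the volumes are taken from the text.

```python
# Lysis buffer volumes per sample (ul), as (low, high) ranges from the text.
# The keys are hypothetical labels chosen for this sketch.
LYSIS_UL_PER_SAMPLE = {
    "embryo_1dpf": (5.0, 5.0),
    "embryo_2dpf": (8.0, 10.0),
    "fin_clip": (30.0, 30.0),
}


def lysis_volume_ul(sample_type: str, n_samples: int) -> tuple:
    """Return the (low, high) lysis buffer volume in ul for a pool of samples."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    low, high = LYSIS_UL_PER_SAMPLE[sample_type]
    return (low * n_samples, high * n_samples)


if __name__ == "__main__":
    # Example: a pool of five embryos at 2 dpf.
    low, high = lysis_volume_ul("embryo_2dpf", 5)
    print(f"{low:.0f}-{high:.0f} ul of lysis buffer")
```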
PCR-fluorescent fragment length (PCR-FFL) analysis and next-generation sequencing (NGS)
PCR-FFL analysis was employed to determine gRNA efficiencies or the sizes of the knock-in alleles (Foley et al., 2009). Samples were prepared by two-step PCR. In the first step, gene-specific forward and reverse primers were used to amplify the targeted loci; the forward primers carried an 18-nt M13 sequence (5′-TGTAAAACGACGGCCAGT) at the 5′ end. The PCR product was diluted 100-fold and used as the template for the second PCR, which used the 5′ 6-FAM-labelled M13 forward primer and a gene-specific reverse primer. PCR primer sequences are listed in Table S3. The final products were analyzed at the Massachusetts General Hospital DNA Core.
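The tailed-primer design used for PCR-FFL can be sketched in a few lines. The M13 sequence below is the one given in the text; the gene-specific primer is a made-up placeholder (the real primers are in Table S3), and the fragment-length helper assumes a precise insertion with no loss of flanking sequence, which individual alleles may not satisfy.

```python
M13_FORWARD = "TGTAAAACGACGGCCAGT"  # 18-nt M13 sequence given in the text


def add_m13_tail(gene_specific_forward: str) -> str:
    """Prepend the M13 sequence to the 5' end of a gene-specific forward primer."""
    seq = gene_specific_forward.upper().replace(" ", "")
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty A/C/G/T sequence")
    return M13_FORWARD + seq


def labelled_fragment_bp(genomic_amplicon_bp: int, insert_bp: int = 0) -> int:
    """Expected labelled fragment length: amplicon + M13 tail + any insertion.

    Assumes a precise insertion of insert_bp with no deletion at the junctions.
    """
    return genomic_amplicon_bp + len(M13_FORWARD) + insert_bp


if __name__ == "__main__":
    placeholder = "ACGTACGTACGTACGTACGT"  # hypothetical primer, NOT from Table S3
    print(add_m13_tail(placeholder))
    # Hypothetical 200-bp genomic amplicon, with and without a 39-bp insertion.
    print(labelled_fragment_bp(200), labelled_fragment_bp(200, insert_bp=39))
```

In this simplified picture, a wild-type allele and a precise 39-bp landing-site knock-in would differ by 39 bp in the fragment trace, which is the kind of size shift the analysis is meant to resolve.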
For NGS, PCR amplicons (generally less than 280 bp) encompassing the targeted loci were generated using 1 µl of the zebrafish lysate with Phusion® High-Fidelity Polymerase (New England Biolabs) and primers listed in Table S3. PCR products were purified using the Monarch® PCR & DNA Cleanup Kit (New England Biolabs) and submitted to the Massachusetts General Hospital DNA Core. Sequencing data were analyzed with CRISPResso2 using the HDR mode (http://crispresso2.pinellolab.org/submission).
Zebrafish genotyping, line generation, and founder screens
The sequences of all PCR primers for genotyping are listed in Table S3. To generate attP knock-in lines, embryos microinjected with SpCas9, gRNA, and attP knock-in ssODN were raised to maturity and screened for founders. Potential founders (F0) were outcrossed to the wild-type fish, and their progeny (F1) were genotyped in pools (five embryos per pool) via two-step nested PCR. The first PCR was to amplify the targeted loci, and the second PCR was to detect the attP insertion at the targeted loci (Table S3). Alternatively, when F1 progeny were lysed individually, only the second step of PCR was needed to detect the attP insertion. Once an attP insertion was detected by PCR, the knock-in alleles were further verified by Sanger sequencing or next-generation sequencing, all using gene-specific primers to amplify the targeted loci (Table S3). For sequence confirmation by Sanger sequencing, PCR products were subcloned using the pGEM-T vector, followed by colony PCR to identify the clones containing the desired allele. Subsequently, the plasmid DNA was extracted and submitted to the Massachusetts General Hospital DNA Core for sequencing. For genotyping of F1 and F2 adult zebrafish, fish were anesthetized briefly with tricaine, and a small fin biopsy was taken for DNA extraction as described above. Gene-specific primers were used to amplify the targeted loci (Table S3), and the samples containing the knock-in alleles should yield PCR products corresponding to both the wild-type allele and the knock-in allele. For further verification, Sanger sequencing was employed.
To generate allele-tracking reporter lines, embryos from heterozygous attPtyr_1 fish incrosses or outcrosses with wild-type fish were injected with the phiC31 mRNA together with pDestattB_ubi:EGFP or pDestattB_ubi:mCherry DNA. The injected embryos were raised to adulthood and screened for founders. The progeny of potential founders were first screened for fluorescence. Fluorescent F1 embryos were lysed singly, and PCR was performed to detect DNA integration at the targeted locus (Table S3). Sanger sequencing was used to confirm the sequences of attL and attR at the junctions of the integrated DNA. Subsequently, fluorescent progeny (F1) from the confirmed founders were raised to adulthood and genotyped by fin clipping and PCR as described above. We used two criteria to determine that a heterozygous ubi:EGFPtyr F1 fish did not carry a second integration outside of the targeted locus. First, when a heterozygous fish was outcrossed to a wild-type fish, approximately 50% of its progeny should be fluorescent. Second, all fluorescent F2 progeny should harbor the correct integration. F1 fish that met these criteria were used to propagate all future generations.
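The first criterion (about 50% fluorescent progeny from an outcross) can be checked against clutch counts with an exact binomial test. The sketch below is one way to do so and is not the procedure used in this study; the counts are hypothetical. For reference, a fish heterozygous for two unlinked insertions would be expected to give about 75% fluorescent progeny.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null proportion p.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties are not lost to floating-point error.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    # Hypothetical clutches of 100 scored embryos each.
    for fluorescent in (48, 75):
        p_value = binomial_two_sided_p(fluorescent, 100)
        print(f"{fluorescent}/100 fluorescent: p = {p_value:.3g}")
```

A clutch near 50% gives a large p-value (consistent with a single integration), whereas a clutch near 75% is strongly inconsistent with the 1:1 expectation. The test cannot exclude a second integration tightly linked to the first, which is why the second, PCR-based criterion is still needed.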
To generate a target gene-specific reporter line, embryos from heterozygous attPgfap fish outcrossed to wild-type fish were injected with the phiC31 mRNA together with pGEM-T_attB-P2A-EGFP DNA. The injected embryos were raised to adulthood and screened for founders. The founder screen was performed as described for creating allele-tracking reporter lines.
Confocal imaging of the gfap:gfp transgenic embryos
Heterozygous P2A-EGFPgfap embryos were obtained by mating heterozygous P2A-EGFPgfap fish with wild-type TuAB fish. EGFP-positive embryos were selected for imaging at 1 and 2 dpf. Embryos were treated with 0.03 g/l phenylthiourea (PTU) to inhibit pigmentation and manually dechorionated using tweezers. Immediately before imaging, embryos were anesthetized with 30 mg/l tricaine-S (Western Chemical) and mounted in 2% low-melting agarose (LONZA) for dorsal or lateral view in 35 mm glass-bottom Petri dishes. Imaging was performed on a ZEISS LSM900 confocal microscope using a 10X objective. Z-stacks were collected at 3 µm intervals. Images were stitched and rendered as sum projections in ImageJ.
HCR™ RNA-FISH staining
HCR™ reagents, including probe sets, amplifiers, and buffers, were purchased from Molecular Instruments (Los Angeles, CA, USA). Probe sets were designed for the zebrafish gfap (NM_131373.2) and EGFP mRNA sequences. Each probe set is composed of multiple probe pairs that hybridize to different regions along the target mRNA. The HCR amplifiers for the gfap and EGFP mRNA targets were fluorescently labeled with AlexaFluor 546 and AlexaFluor 488, respectively. The staining procedure followed a modified version of the ‘HCR RNA-FISH protocol for whole-mount zebrafish embryos and larvae’ provided by the manufacturer. Briefly, embryos were anesthetized by immersion in ice water and fixed with 4% paraformaldehyde (PFA) in phosphate-buffered saline for 1 h at 4°C. The samples were then permeabilized with proteinase K (final concentration 15 μg/ml) for 20 min, post-fixed in 4% PFA, and incubated overnight at 37°C in probe hybridization buffer containing 4 pmol of each probe (final concentration 16 nM). Unbound probes were washed off at 37°C with wash buffer before overnight incubation at room temperature in the dark in amplification buffer containing 30 pmol of each fluorescently labeled amplifier. Embryos were then washed three times with 5X SSCT (0.75 M NaCl, 75 mM sodium citrate, 0.1% Tween-20), after which they were ready for imaging.
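Two of the quantities above can be cross-checked with simple arithmetic: the hybridization volume implied by 4 pmol of probe at 16 nM, and the salt composition of 5X SSCT. The sketch below assumes the standard 20X SSC stock (3 M NaCl, 0.3 M sodium citrate), which the text does not state explicitly; the function names are ours.

```python
# Standard 20X SSC composition in mol/l (an assumption; not stated in the text).
SSC_20X_MOLAR = {"NaCl": 3.0, "sodium citrate": 0.3}


def implied_volume_ul(amount_pmol: float, conc_nm: float) -> float:
    """Volume in ul that gives conc_nm (nM) for amount_pmol (1 nM = 1 pmol/ml)."""
    return amount_pmol / conc_nm * 1000.0


def ssc_molarity(fold: float) -> dict:
    """Molar salt concentrations of an SSC buffer at the given fold strength."""
    return {salt: molar * fold / 20.0 for salt, molar in SSC_20X_MOLAR.items()}


if __name__ == "__main__":
    print(f"Hybridization volume: {implied_volume_ul(4.0, 16.0):.0f} ul")
    for salt, molar in ssc_molarity(5.0).items():
        print(f"5X SSC {salt}: {molar:.3f} M")
```

Under these assumptions, the probe amounts imply a 250 µl hybridization volume, and the 5X values (0.75 M NaCl, 75 mM sodium citrate) agree with the composition given in the text.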
For imaging, embryos were mounted in glass-bottom dishes in 1% low-melting agarose in 1X E3 and imaged with a ZEISS LSM700 confocal scanning microscope using either 5X or 10X magnification. Images were acquired both as single z-planes and as z-stacks. Imaging was performed with the following lasers: Diode 555 and Diode 488 for detecting the gfap and EGFP mRNA, respectively. All post-capture image processing was applied uniformly to wild-type and transgenic fish embryos.
Histology and cytology
Adult zebrafish were euthanized using approved protocols with tricaine and then fixed in 4% PFA. Paraffin embedding, sectioning, H&E staining, and immunohistochemistry (IHC) for EGFP were performed using standard protocols. EGFP was detected using the JL-8 mouse monoclonal antibody (Clontech). For IHC, the primary antibody was diluted at 1:200, and the secondary antibody, an HRP-conjugated, donkey anti-rabbit antibody (Biovision), was used at a dilution of 1:1000.
For flow cytometry analysis of hematopoietic cells, ubi:EGFPtyr adult zebrafish were euthanized prior to kidney collection. The kidney was dissected and placed into ice-cold 0.9X phosphate-buffered saline (PBS) containing 5% fetal bovine serum. A single-cell suspension of whole kidney marrow (WKM) cells was generated by gentle trituration followed by passage through a 40-μm filter. Flow cytometry was conducted on a CytoFLEX (Beckman Coulter, NJ, USA), and data were analyzed with FlowJo software (Tree Star, OR, USA). Various hematopoietic cell populations were identified as previously reported (Traver et al., 2003).
Acknowledgements
We thank the staff at the Massachusetts General Hospital Charlestown Navy Yard Zebrafish Facility for their technical support.
Footnotes
Author contributions
Conceptualization: J.-R.J.Y., J.M., R.T.P.; Funding acquisition: J.-R.J.Y., R.T.P.; Supervision: J.-R.J.Y., R.T.P.; Writing – original draft: J.-R.J.Y., J.M.; Writing – review & editing: J.-R.J.Y., J.M., W.Z., S.P., R.T.P.; Data curation: J.M., W.Z., S.R., N.U.G., Z.S.; Formal analysis: J.M., W.Z., S.R., N.U.G., Z.S.; Investigation: J.M.; Methodology: J.M.; Validation: J.M., W.Z., S.R.; Resources: S.P.
Funding
This work was supported by the Hassenfeld Scholar Award (to J.-R.J.Y.) and NIH grant no. R01 GM134069 (to R.T.P. and J.-R.J.Y.). J.M. received support from the China Scholarship Council (no. 201808210354). Open Access funding provided by Harvard Medical School; University of Utah. Deposited in PMC for immediate release.
Data availability
The data that support the findings of this study are available in the article and/or supplementary material of the article.
References
Competing interests
The authors declare no competing or financial interests.