ABSTRACT
To identify candidate tissue regeneration enhancer elements (TREEs) important for zebrafish fin regeneration, we performed ATAC-seq from bulk tissue or purified fibroblasts of uninjured and regenerating caudal fins. We identified tens of thousands of DNA regions from each sample type with dynamic accessibility during regeneration, and assigned these regions to proximal genes with corresponding expression changes by RNA-seq. To determine whether these profiles reveal bona fide TREEs, we tested the sufficiency and requirements of several sequences in stable transgenic lines and mutant lines with homozygous deletions. These experiments validated new non-coding regulatory sequences near induced and/or essential genes during fin regeneration, including fgf20a, mdka and cx43, identifying distinct domains of directed expression for each confirmed TREE. Whereas deletion of the previously identified LEN enhancer abolished detectable induction of the nearby leptin b gene during regeneration, deletions of enhancers linked to fgf20a, mdka and cx43 had no effect or partially reduced gene expression. Our study generates a new resource for dissecting the regulatory mechanisms of appendage generation and reveals a range of requirements for individual TREEs in control of regeneration programs.
INTRODUCTION
Adult teleosts like zebrafish and urodele salamanders can regenerate entire amputated fins and limbs. By contrast, regenerative healing of adult mammalian limbs is limited to the very tips of digits (Borgens, 1982; Morgan, 1901; Spallanzani, 1768). Illuminating concepts and targets of appendage regeneration will decipher how and why tissue regeneration occurs and inform strategies to improve human tissue repair.
Cell transplantation, genetic fate mapping and live-imaging strategies have been employed across many species and tissues over the past 15-20 years to answer cell-of-origin questions in models of regeneration from planarians to mice (Dor et al., 2004; Kikuchi et al., 2010; Kragl et al., 2009; Wagner et al., 2011). Relevant to appendage regeneration, the salamander limb, zebrafish fin or murine digit tip blastema is no longer considered a homogeneous mass of pluripotent cells, but rather as a composite of lineage-restricted cell compartments (Knopf et al., 2011; Kragl et al., 2009; Rinkevich et al., 2011; Tornini et al., 2016; Tu and Johnson, 2011). A second area of expansion has been the study of traditional developmental regulators such as Fgf, Wnt, Bmp, Notch, Igf and Hh proteins during regeneration (Payzin-Dogru and Whited, 2018; Sehring and Weidinger, 2020; Seifert and Muneoka, 2018). Because of their many roles during embryonic development, and as many labs possess expertise and tools that probe these pathways, these have been logical choices in which to establish early regulatory models of appendage regeneration.
Mammalian genomes likely encode all gene products required to regenerate an amputated limb, yet they lack the correct instructions for strategically modulating those gene products to accomplish limb regeneration. Certain non-mammalian vertebrates such as salamanders and zebrafish possess these instructions, which exist as cis-regulatory elements. Although the catalogue of defined cell dynamics and molecular factors in tissue regeneration is expanding, we know comparatively little about how genes involved in regenerative events are engaged upon injury. That is, what are the non-coding DNA sequences that recruit activators (or repressors) of gene programs during tissue regeneration, how are these sequences distributed throughout genomes of regeneration-competent animals and which transcription factors engage with these sequences? Regulatory sequences are key suspected targets for evolutionary change, and they are likely to provide insight into how regenerative capacity is lost or maintained during species evolution (Goldman and Poss, 2020; King and Wilson, 1975; Wittkopp and Kalay, 2011; Yang and Kang, 2019).
Distal-acting enhancer regulatory sequences can direct expression of their target genes and have been predominantly studied as a means for stage- and tissue-specific regulation during embryonic development. Recently, we identified an enhancer linked to the leptin b (lepb) gene, named LEN, that activates gene expression during fin and heart regeneration in zebrafish, revealing a new class of gene regulatory sequences we named tissue regeneration enhancer elements (TREEs) (Kang et al., 2016). TREEs are identifiable by chromatin profiling, which has revealed or suggested similar classes of regulatory elements in Drosophila imaginal wing discs (Harris et al., 2016; Harris et al., 2020; Vizcaya-Molina et al., 2018), acoel worms (Gehrke et al., 2019), mouse bone (Guenther et al., 2015) and zebrafish hearts (Goldman et al., 2017). In addition to informing regulatory mechanisms of regeneration, TREEs can be engineered in simple constructs to spatiotemporally target developmental factors to injured tissue and boost or inhibit regenerative events (Kang et al., 2016).
Here, we have profiled genome-wide changes in chromatin accessibility (ATAC-seq) and expression (RNA-seq) during regeneration of amputated zebrafish fins, assessing both whole tissue samples and purified fibroblasts from regenerating fins. These experiments revealed thousands of putative TREEs, several of which we functionally characterized in new transgenic and mutant zebrafish lines. This study provides a resource and framework for the discovery and dissection of dynamic chromatin structure and enhancer elements during appendage regeneration.
RESULTS AND DISCUSSION
ATAC-seq and RNA-seq profiling of whole-fin regenerates
Immediately after amputation of a zebrafish caudal fin, nearby epidermal cells begin to migrate to cover the lesion site and form a wound epidermis, alternatively called a regeneration epidermis. This specialized epidermis matures and releases signals to the underlying stump cells to form a blastema, a proliferative mass of mainly fibroblasts and osteoblasts that act as progenitors for new structures (Akimenko et al., 2003; Ando et al., 2017; Johnson and Weston, 1995; McMillan et al., 2018; Poss et al., 2000; Sehring and Weidinger, 2020). By 2 days post amputation (dpa), each fin ray has formed a blastema. By 4 dpa, regenerates possess a major compartment of fibroblasts, in addition to a minor compartment of newly aligned osteoblasts in the process of depositing new ray matrix. Complete regeneration of new structures occurs within 2-3 weeks of amputation (Fig. 1A,B).
To chart dynamic chromatin changes occurring during zebrafish fin regeneration during a stage of rapid cell division with concomitant skeletal patterning, we performed an assay for transposase-accessible chromatin followed by high-throughput sequencing (ATAC-seq) and RNA-seq on freshly amputated (0 dpa) and regenerating (4 dpa) fin tissue. We achieved an average of 55 million total reads per ATAC-seq library in this study, varying between 31 and 147 million reads for each of the libraries reported here, with an average of 24 million unique reads per library (Tables S1, S2, S3). First, to survey heterogeneous cell populations, we collected all cells from the regenerating regions of the fin plus one ray segment proximal to the amputation plane (Fig. 1C). We found 4265 transcripts differentially expressed and 50,873 chromatin regions differentially accessible at 4 dpa compared with 0 dpa (Fig. S1A; Tables S4, S5). We then plotted 13,892 of these differential ATAC-seq peaks to the closest differentially expressed transcript to assess genes potentially modulated by putative regulatory sequences during regeneration (Fig. 1D,E; Table S6). Most sequences displaying increased accessibility during regeneration were in intergenic regions (46.0%), 3 kb upstream of transcriptional start sites (25.7%) or within introns (18.0%). The remaining regions were located in UTRs, exons or immediate downstream regions of genes (10.3%) (Fig. S1B). ChromVar analysis indicated that the most variable transcription factor binding sites at 4 dpa compared to 0 dpa are predicted binding sites of components of the AP-1 complex, known to promote cell proliferation and survival (Fig. S1C) (Karin et al., 1997; Schep et al., 2017). To examine differential transcription factor binding accessibility, footprinting analysis was performed on a subset of transcription factor consensus sequences from our differential 4 dpa ATAC dataset, revealing minor changes in accessibility near the selected sequences FOS:JUN, EPAS1 and ZNF263 (Fig. S1D-F) (Ou et al., 2018a). To identify upstream regulators and canonical pathways involved in fin regeneration, we analyzed these RNA-seq data using ingenuity pathway analysis (IPA) (Krämer et al., 2014). The most highly activated upstream regulators were ERBB2, an EGF co-receptor implicated in zebrafish fin regeneration (Rojas-Muñoz et al., 2009), and MYC, a transcription factor central to cell growth and proliferation (Meyer and Penn, 2008) implicated in regeneration of liver, Müller glia and lateral line neuromasts in zebrafish (Jagtap et al., 2020; Lee et al., 2016; Mitra et al., 2019). The most downregulated upstream factor was TP53, a major cell cycle and apoptotic regulator that was described to be suppressed to facilitate salamander limb regeneration (Fig. S1G; Table S7) (Yun et al., 2013). Thyroid hormone metabolism, which has been reported to impact fin regeneration in medaka (Sekimizu et al., 2007), was predicted by IPA to be the most increased canonical pathway in our dataset, whereas melatonin degradation was predicted to be one of the most downregulated pathways (Fig. S1H; Table S8).
To assess gene regulation during an earlier stage – active formation of the initial regeneration blastema – we profiled chromatin accessibility and RNA levels at 1 dpa compared with 0 dpa, and found 5400 transcripts differentially expressed and 25,398 chromatin regions differentially accessible (Fig. S2A; Tables S9, S10). We were able to merge the ATAC-seq and RNA-seq data by pairing 8720 dynamic chromatin regions to nearby differentially expressed genes (Fig. 1F,G; Table S11). Regions more accessible at 1 dpa were again mainly intergenic (45.4%), within 3 kb upstream of the start site (27.2%), or intronic (17.5%), with the remaining 9.9% in UTRs, exons or immediate downstream regions of genes (Fig. S2B). Of the 4265 transcripts differentially expressed at 4 dpa compared with 0 dpa, 61.7% were also differentially expressed at 1 dpa (Fig. 1H). Similar to in the 4 dpa dataset, ChromVar analysis revealed that the most variable transcription factor binding sites at 1 dpa compared with 0 dpa implicate the AP-1 complex (Fig. S2C) (Karin et al., 1997; Schep et al., 2017). Footprinting analysis near the selected transcription factor consensus sequences FOS:JUN, EPAS1 and ZNF263 again revealed accessibility changes (Fig. S2D-F). Upstream factor analysis of the differential 1 dpa RNA-seq dataset by IPA showed similar results as the differential 4 dpa dataset, with upregulated upstream factors MYC and ERBB2, and the downregulated upstream factor TP53 (Fig. S2G; Table S12). IPA of canonical pathways upregulated at 1 dpa showed an increase in stress response, DNA damage and DNA replication pathways (Fig. S2H; Table S13). We interpret the overlap in differential transcript expression to include stable regenerative programs that are initiated at or around 1 dpa. Differences may reflect transient programs like injury response and blastema formation likely to be enriched at 1 dpa, as well as programs underlying patterning that would be initiated after this stage.
Comparative analysis of merged differential ATAC peaks paired with nearby differential transcripts revealed 1178 transcripts with nearby differential peaks shared between the 1 dpa and 4 dpa dataset, with 2981 unique to the 1 dpa dataset and 4637 unique to the 4 dpa dataset (Fig. S2I). GO Term analysis of the shared transcripts revealed involvement in a variety of regenerative and developmental processes, such as cartilage, connective tissue, skeletal system and pectoral fin development, as well as appendage, epithelial and embryonic organ morphogenesis (Fig. S2J). The 1 dpa dataset showed greater enrichment with dynamic accessibility near differential transcripts involved in neuronal, mesenchymal and cell differentiation, whereas the 4 dpa dataset had greater enrichment with dynamic accessibility near differential transcripts involved in cell cycle processes such as DNA replication, translation and mitotic cell cycle processes (Fig. S2K,L).
TREEs near fgf20a and mdka control distinct epidermal expression domains
To examine putative TREEs indicated by the dataset using functional tests, we employed two distinct strategies: one method to test sufficiency and one method to test requirement. To assess the sufficiency of noncoding sequences from our ATAC-seq datasets to direct gene expression, we used a transgenic enhancer reporter system with the candidate sequence inserted immediately upstream of a 95-bp minimal mouse c-fos promoter and EGFP gene cassette (Goldman et al., 2017) (Fig. 2A). To examine the importance of these same regions in their natural genomic context, we used CRISPR to excise DNA containing the putative cis-regulatory element and generate homozygous mutant deletion lines. We prioritized putative TREE regulatory regions that were near genes previously implicated in fin growth and regeneration, or involved in other regeneration contexts.
fgf20a encodes a ligand expressed by epithelial cells in the distal tip of the regenerating fins, and fgf20a null mutations block the regeneration of amputated fins (Shibata et al., 2016; Whitehead et al., 2005). We discovered a putative TREE 48 kb downstream of the transcriptional start site of fgf20a that displays increased chromatin accessibility at both 1 and 4 dpa. We subcloned this 953 bp fragment, referred to as fgf20aE48 (Figs 1I and 2A), and used it to generate eight independent lines of fgf20aE48:EGFP zebrafish. We compared the EGFP expression domains in these enhancer reporter lines with those of the published HGn21A:EGFP line, in which a construct containing a heat shock minimal promoter and EGFP expression cassette randomly inserted 1.6 kb upstream of the fgf20a transcriptional start site (Shibata et al., 2016). Both fgf20aE48 and HGn21A direct EGFP expression in larval brain (4 of 8 fgf20aE48 lines) and in larval skeletal muscle (3 of 8 fgf20aE48 lines, data not shown) (Fig. 2B-C′), with HGn21A:EGFP fluorescence being considerably stronger in each domain. During fin regeneration, fgf20aE48-directed fluorescence was induced in a similar distal region of fin epidermis at 4 dpa as HGn21A:EGFP (Fig. 2G-G′). In general, the EGFP fluorescence domains were similar but broader than endogenous fgf20a mRNA detected by in situ hybridization. To assess whether this was a result of persistent EGFP fluorescence, we examined EGFP RNA localization in HGn21A:EGFP and fgf20aE48:EGFP regenerates. Wound epidermal expression of EGFP mRNA at 4 dpa in both reporter lines was more restricted and similar to endogenous fgf20a, although much stronger in HGn21A:EGFP regenerates versus fgf20aE48:EGFP regenerates (Fig. S4AE,AF). Despite the similar domains at 4 dpa, the dynamics of fgf20a:EGFP and HGn21A:EGFP expression were distinct. In uninjured adults, HGn21A:EGFP expression was present at the very distal edge of the caudal fin, whereas fgf20aE48:EGFP expression was undetectable (Fig. 2H-K). fgf20aE48-directed EGFP was not detectable at 1 dpa, was present in the 4 dpa regeneration epidermis and was undetectable again at 7 dpa, whereas HGn21A:EGFP was strong throughout these stages (Fig. 2L-Q). CRISPR deletion of the fgf20aE48 region did not notably affect fgf20a transcript levels in uninjured fins (Fig. 2R) or 2 days post fertilization (dpf) embryos (Fig. S4AJ). We compared expression levels normalized to a standard gene (actb2), as well as relative increases in expression from the 0 dpa levels of the corresponding genotype. Following amputation, fgf20aE48Δ/Δ fins trended toward increased fgf20a expression at 1 dpa (Fig. 2R,S), although with variability in expression. At 4 dpa, fgf20aE48Δ/Δ fins had similar fgf20a levels as wild-type regenerates (Fig. 2R), with the extent of increase versus 0 dpa samples being lower than wild types (Fig. 2S). The expression domain of fgf20a was not altered in 4 dpa fgf20aE48Δ/Δ fins (Fig. 2D,E). The lengths of fgf20aE48Δ/Δ fin regenerates were slightly decreased, by ∼5-7%, at 4, 7 and 14 dpa (Fig. 2T), while fin patterns were grossly normal (Fig. S4F-J). Together, our results indicate that the fgf20aE48 regulatory sequence is likely to direct a component of fgf20a expression, particularly at 4 dpa, and there are other essential regulatory sequences that maintain sufficient levels of Fgf20a ligand to permit regeneration.
mdka encodes a heparin-binding growth factor that activates MAPK or AKT pathways and can promote repair, proliferation, survival, angiogenesis and migration (Aynacioğlu et al., 2019). In zebrafish, Mdka is implicated in heart and retina regeneration (Gramage et al., 2015; Lien et al., 2006; Nagashima et al., 2019). We found that mdka transcript levels increase sharply during fin regeneration. A putative TREE 20 kb upstream of the mdka start site displayed increased chromatin accessibility at 1 and 4 dpa (Fig. 1J). We subcloned a 1.3 kb mdkaE20 fragment corresponding to this region into our reporter construct (Fig. 3A). This sequence directed EGFP expression throughout transgenic zebrafish larvae, mainly in the head, eye, notochord and caudal fin fold (Fig. 3B,B′), but with minimal expression in uninjured adult fins (Fig. 3C). Eight out of 10 independent stable transgenic reporter lines generated with this fragment directed mdkaE20:EGFP expression in epidermal and/or mesenchymal cell expression during fin regeneration, similar to the ISH pattern of mdka (Fig. 3D,F,F′), with EGFP mRNA detection by ISH in mdkaE20:EGFP regenerates showing similar results (Fig. S4AG). mdkaE20-directed EGFP was present at 1 dpa both proximal and distal to the injury site, and this expression persisted until ∼7 dpa, fading by 21 dpa (Fig. 3G-J). Deletion of the entire mdka-coding sequence (mdkaΔ/Δ) did not impact animal viability, and mutant fin regenerates measured ∼4-8% shorter than wild-type clutchmates at 4, 7 and 14 dpa, respectively (Fig. 3K), with overall patterning being normal (Fig. S4L-P). Homozygous deletions of the mdkaE20 TREE via CRISPR resulted in similar mdka expression levels as in wild-type fins at 0, 1 and 4 dpa (Fig. 3L), and in similar increases at 1 dpa and 4 dpa over 0 dpa as wild types (Fig. 3M), although embryonic expression of mdka was reduced by 38% in 2 dpf mdkaE20Δ/Δ embryos (Fig. S4AJ). The expression pattern of mdka in mdkaE20Δ/Δ regenerating adult fins was similar to wild types (Fig. 3D,E), and mdkaE20Δ/Δ fins regenerated normally without any detectable patterning defects (Fig. 3N; Fig. S4R-V). Thus, our data indicate that mdkaE20 is used to direct embryonic mdka expression, directs expression mainly in epidermis and/or distal blastema during regeneration, and that it is not required to activate mdka expression after fin amputation.
Genome-wide profiling of uninjured fin and blastemal fibroblasts
Profiles of purified cell populations can provide lineage-specific information that is masked by bulk tissue analysis. Fibroblasts are the predominant cell type in zebrafish fins and in the regeneration blastema, and recent findings indicate that blastemal fibroblasts possess positional information for regeneration (Tornini et al., 2016). To genetically label uninjured and regenerating fibroblasts for purification, we took advantage of the fact that immediate upstream regulatory sequences of tph1b, a gene encoding the rate-limiting enzyme in serotonin synthesis, direct expression in blastemal fibroblast progenitors upon injury. Thus, if a high tamoxifen dose is administered to tph1b:CreER; ubi:switch zebrafish 2 days after a proximal fin amputation, most blastemal fibroblasts and their progeny permanently express mCherry (Fig. 4A). If this fin is allowed to fully regenerate (i.e. for 30 days or more), a high percentage of fibroblasts stably expressing mCherry occupy the fin (Tornini et al., 2016).
We performed ATAC-seq and RNA-seq on stably labeled fin fibroblasts sorted from 0 dpa and regenerating 4 dpa tph1b:CreER; ubi:switch tissue (Fig. 4A). This analysis revealed 4629 differentially regulated transcripts and 33,732 differentially accessible chromatin regions (Fig. S3A; Tables S14, S15). We were able to merge the ATAC-seq and RNA-seq data by pairing 10,197 differential chromatin regions with nearby differential RNA transcripts (Fig. 4B,C; Table S16). Comparative analysis of the fibroblast differential transcriptome to the corresponding whole-fin 4 dpa differential transcriptome showed overlap, with 51.9% of the differential fibroblast transcripts present in differential whole-fin 4 dpa data (Fig. 4D). Regions with increased accessibility in fibroblasts were mainly in intergenic regions (65.8%), introns (17.9%) or exons (10.0%). The remaining 6.4% of differential chromatin regions in 4 dpa fibroblasts were within 3 kb upstream or downstream of a gene, or in UTR regions (Fig. S3B). Similar to in the 4 dpa dataset, ChromVar analysis revealed that the most variable transcription factor binding sites in the fibroblast data implicate components of the AP-1 complex (Fig. S3C) (Karin et al., 1997; Schep et al., 2017), and footprinting analysis again identified changes in accessibility near the three selected transcription factor consensus sequences FOS:JUN, EPAS1 and ZNF263 (Fig. S3D-F). As in whole-fin tissue, IPA indicated the most highly activated upstream regulator as ERBB2, and the most downregulated upstream regulator as TP53 (Fig. S3G; Table S17). IPA of canonical pathways upregulated in fibroblasts revealed a notable increase in oxidative phosphorylation, with one of the most downregulated pathways again being melatonin degradation (Fig. S3H; Table S18).
Comparative analysis of merged differential ATAC peaks paired with nearby differential transcripts revealed 651 transcripts with nearby differential peaks shared between the fibroblast and 4 dpa dataset, with 4460 unique to the fibroblast dataset and 2419 unique to the 4 dpa dataset (Fig. S3I). GO Term analysis of the shared transcripts revealed overlapping involvement in wound healing and regeneration processes, as well as axonal development (Fig. S3J). The fibroblast dataset showed greater enrichment with dynamic accessibility near differential transcripts involved in sensing external stimuli, cell growth and neuronal guidance, whereas the 4 dpa whole-fin dataset showed greater enrichment with dynamic accessibility near differential transcripts involved with vasculature and epithelium (Fig. S3K,L).
We selected a putative TREE showing increased accessibility in 4 dpa fibroblast samples, more prominent than in 1 or 4 dpa whole-fin samples, and located 24 kb downstream of the cx43 gene (Fig. 5A). cx43 encodes a gap junction protein that facilitates the exchange of small molecules between neighboring cells (Hoptak-Solga et al., 2007). cx43 is expressed in caudal fin joints and is induced in blastemal fibroblasts and joint-forming osteoblasts during fin regeneration (Sims et al., 2009). short fin (sof) zebrafish possessing homozygous loss-of-function mutations at the cx43 locus exhibit shortened ray segments and an overall shortened fin (Iovine et al., 2005). In our fibroblast RNA-seq dataset, cx43 RNA levels were high at 0 dpa, a possible result of collection from regenerated fins as part of the fibroblast purification protocol, and further elevated at 4 dpa (Fig. 5A). We generated transgenic reporter lines using a 945 bp region containing the putative enhancer, named cx43E24 (Fig. 5B). Coupled with minimal promoter and EGFP sequences, cx43E24 directed expression in the trunk musculature, fin fold and pectoral fin buds of zebrafish larvae (Fig. 5C,C′). EGFP expression was low or undetectable in joint osteoblasts of uninjured fins, and the strongest EGFP-expressing lines exhibited EGFP expression in the three dorsal-most rays (Fig. 5D). Five out of five stable cx43E24:EGFP lines displayed induced EGFP fluorescence in blastemal mesenchyme during regeneration that is similar to endogenous blastemal cx43 expression, and it was also detectable in most blastemal osteoblasts (Fig. 5G,G′). Interestingly, we detected spatial restriction of EGFP mRNA more selectively in what appeared to be joint-forming osteoblasts (as well as fibroblasts), a difference that is likely explained by perdurance of cx43E24-directed EGFP fluorescence (Fig. S4AH). We saw strong cx43E24:EGFP fluorescence in regenerating fin rays at 4 dpa, which lingered until 7 dpa and faded once regeneration was completed (Fig. 5H-J). Deletion of cx43E24 by CRISPR revealed similar cx43 transcript levels at 0 and 1 dpa in cx43E24Δ/Δ and wild-type regenerates, and similar increases in expression from 0 to 1 dpa in both genotypes (Fig. 5K,L). By contrast, cx43 levels were 26.1% lower in cx43E24Δ/Δ 4 dpa regenerates than in wild types, and increases at 4 dpa compared with 0 dpa in cx43E24Δ/Δ fins were lower (3.9-fold) compared with wild types (5.9-fold) (Fig. 5L). Expression of cx43 was also reduced by 33% in cx43E24Δ/Δ embryos (Fig. S4AJ). No difference in the expression pattern of cx43 mRNA was observed in cx43E24Δ/Δ 4 dpa fins compared with wild type (Fig. 5E,F). Overall cx43E24Δ/Δ fin regeneration lengths at 4 dpa and 7 dpa were normal compared with wild types (Fig. 5M; Fig. S4X-AB). However, we found that ray segments distal to the amputation plane in regenerating cx43E24Δ/Δ 7 dpa fin tissue were 6.2% shorter than wild types, and those proximal to the amputation plane were 5.2% shorter (Fig. 5N). These changes are modest, but given the established phenotype of sof mutants (Iovine et al., 2005), and as we found fgf20aE48Δ/Δ and mdkaE20Δ/Δ ray segment lengths were no different from wild-type clutchmates (Fig. S4AI), they are possibly of biological significance. Thus, our results indicate that cx43E24 is sufficient to direct blastemal fibroblast expression following fin amputation and contributes to induction and/or maintenance of cx43 expression at 4 dpa, although additional elements regulate cx43 expression.
Deletion of the LEN element abolishes injury-induced expression of leptin b
LEN was originally identified as a sequence region ∼6 kb upstream of the lepb transcriptional start site that becomes enriched with the H3K27Ac mark during zebrafish heart regeneration. During fin regeneration, lepb is sharply induced from essentially undetectable levels, and LEN was shown to be inactive in uninjured fins but able to direct EGFP expression in regenerating fin fibroblasts when linked to a lepb 2 kb minimal promoter (Kang et al., 2016) (Fig. 6A,B). To date, the chromatin signature of LEN during fin regeneration has not been assessed. In our datasets, lepb levels were elevated in 0 dpa fibroblasts, similar to the cx43 expression data and possibly again due to the double amputation protocol. RNA levels increased from 0 to 4 dpa in fibroblasts, and sharply from 0 to 1 or 4 dpa in whole-fin samples. LEN was accessible in both whole-fin samples and in purified fibroblast samples, although the peak was not dynamic during regeneration and would not have been considered differential during regeneration by our bioinformatic analyses (Fig. 6C). To explain this, it is possible that LEN is not regulated by context-specific chromatin remodeling and is held accessible in a primed or poised state prior to injury-induced activation (Heinz et al., 2015). Alternatively, ATAC-seq is not informative for this locus versus other marks of active regulatory sequences.
To test the requirement for LEN in control of endogenous lepb expression during regeneration, we designed gRNAs to generate a ∼3.4 kb deletion encompassing LEN. Homozygous LEN deletion mutants survived and regenerated fins normally; this was expected, as lepb-coding sequence mutants we previously generated also survive and regenerate normally (Kang et al., 2016) (Fig. 6D). Importantly, qPCR and transcriptome sequencing from regenerating wild-type and LENΔ/Δ fin regenerates revealed no detectable elevation of lepb at 2 dpa in the absence of LEN (Fig. 6E,F,H). RNA-seq of LENΔ/Δ fins also indicated normal upregulation of the lepb-linked gene impdh1b during regeneration, suggesting specificity of LEN to the lepb target gene (Fig. 6G,H). Therefore, LEN is genetically required for lepb regulation during regeneration, and there is no apparent compensation by alternative enhancer elements at this locus.
Distinct chromatin accessibility patterns in regenerating osteoblasts and fibroblasts
While this manuscript was being prepared, Johnson and colleagues published a report assessing DNA methylation and chromatin accessibility during zebrafish fin regeneration in purified osteoblast (sp7+) and non-osteoblast (sp7−) cells at 0 and 4 dpa (Lee et al., 2020). We compared our purified cell datasets and observed that while the number of differentially expressed genes during regeneration was similar between fibroblasts and osteoblasts, only 41.9% of transcripts differentially expressed during regeneration in fibroblasts were also indicated as differentially expressed in osteoblasts (Fig. 7A). We used the methylation datasets from Lee et al. to determine the methylation states of the enhancers identified in our study, finding that methylation states of fgf20aE48, mdkaE20, cx43E24 and LEN were not dynamic between uninjured and regenerating fins, as a key conclusion from their work predicts. Interestingly, each of these enhancers contained a cluster of unmethylated regions in both sp7+ and sp7− datasets (Fig. S5G-J). We examined chromatin accessibility across datasets for sequences validated in transgenic lines from Lee et al. and from the current study (Fig. 7B-H; Fig. S5A-F). Interestingly, the osteoblast TREEs cdc42ep3+2k and chst3b+5k exhibited significantly decreased accessibility in fibroblasts during regeneration, opposing the directions observed in sp7+ osteoblast samples (Fig. 7C,D). The sp7− TREEs mntr1aa+31k and ptk2aa+9k exhibited an increase in accessibility in whole fin chromatin (Fig. 7E,F), consistent with findings from the sp7− data. The cx43E24 TREE identified from fibroblast samples also increased accessibility in sp7+ osteoblasts at 4 dpa, with cx43 RNA levels increasing in both cell types (Fig. 7G). This was predicted, as cx43 expression was localized in osteoblast populations in the regenerating fin as well as in fibroblasts (Iovine et al., 2005), and as revealed by cx43E24-driven EGFP expression (Fig. 5F). Interestingly, fgf20aE48 increases accessibility and fgf20a RNA levels increase in the sp7+ dataset as in whole-fin samples, but not in fibroblasts (Fig. 7H). fgf20a has been considered an epidermal marker, with some mesenchymal expression possibly present but only within hours of amputation (Shibata et al., 2016; Whitehead et al., 2005). Although there are multiple possible explanations for detecting fgf20a in purified osteoblasts, including adhesion of different cell types in the sorting process, these data from purified cell populations could also potentially reveal unexpected regulation, like sp7+ regulatory sequence activation in fgf20a-expressing cells, or vice versa.
Conclusions
We provide a resource of the chromatin accessibility landscape of regenerating zebrafish caudal fins during blastema formation and active regenerative outgrowth, revealing thousands of putative TREEs from whole-fin tissue. Profiling purified blastemal fibroblasts further allowed us to filter out noise from the heterogeneity of the whole-fin samples and detect fibroblast-relevant signals. The resources presented in this study, combined with other complementary datasets (Lee et al., 2020), identify many enhancers and putative enhancers that direct gene expression during appendage regeneration. This resource also describes enriched motifs and network analysis for predicted transcriptional regulators, as well as regeneration-related regulatory sequences to examine for conservation between species or for ectopically expressing gene cassettes during fin regeneration.
Differentially accessible sequences we attempted to validate were consistently able to direct reporter gene expression that approximates expression of a nearby endogenous gene. Only in the case of LEN did we see observe that enhancer deletion abrogated all detectable endogenous gene expression, whereas other TREE deletions had partial or no detectable effects on expression during regeneration. Studies of enhancers in other contexts have indicated that individual elements can be redundant, or their loss can lead to compensation or chromosome structural changes that impact expression (Andersson and Sandelin, 2020; Cannavò et al., 2016; El-Sherif and Levine, 2016; Heinz et al., 2015; Scholes et al., 2019). Thus, we suspect that additional regulatory elements, either detectable in ATAC-seq datasets or cloaked, contribute to expression levels of fgf20a, mdka, and cx43. Although we have shown that transposase accessibility assays, and, most likely, assays for other chromatin features like H3K27Ac, might not directly identify TREEs that are essential for regeneration-responsive expression, assessment of interactions between genomic regions near induced genes, using chromosome conformation capture (Grob and Cavalli, 2018) or tiled deletion approaches (Diao et al., 2017), may help identify additional regulatory sequences related to the genes studied here or other genes involved in regeneration.
MATERIALS AND METHODS
Zebrafish maintenance and procedures
Wild-type or transgenic zebrafish of the outbred Ekkwill (EK) or EK/AB strain ranging from 4 to 12 months were used for all experiments. Transgenic and mutant zebrafish lines are presented in Table S19. Water temperature was maintained at 27.5°C, and fish were kept on a 14/10 light/dark cycle. Fish were anesthetized in 0.75% v/v 2-phenoxyethanol (Sigma-Aldrich) in fish water. Fins were amputated at 50% of their original length using razor blades (VWR). All work with zebrafish was performed in accordance with Duke University guidelines.
Enhancer reporter zebrafish lines (fgf20aE48:EGFPpd320, mdkaE20:EGFPpd321 and cx43E24:EGFPpd322) were generated using I-SceI transgenesis. Enhancer fragments (954 bp fgf20aE48, 1310 bp mdkaE20 and 946 bp cx43E20) were subcloned via Gateway Cloning using Gateway LR Clonase II (Thermo Fisher Scientific) upstream of a mouse c-fos minimal promoter, EGFP and poly-A tail, flanked by I-SceI sites (Goldman et al., 2017). Primers are provided in Table S20. Reporter construct (500 pl of 25 ng/μl), 333 U/ml I-SceI (NEB), 1x I-SceI buffer and 0.05% Phenol Red (Sigma-Aldrich) solution were injected into one-cell zebrafish embryos. F0 founders were discovered by screening for fluorescence in outcrossed F1 embryos.
Deletion lines (LENΔ/Δpd281, fgf20aE48Δ/Δpd323, mdkaΔ/Δpd324, mdkaE20Δ/Δpd325 and cx43E24Δ/Δpd326) were created by CRISPR/Cas9 genome editing (3440 bp LEN deletion, 2517 bp fgf20aE48 deletion, 1263 bp mdkaE20 deletion and 1658 bp cx43E24 deletion). Double-stranded DNA (dsDNA) templates were generated from a Universal gRNA 3′ oligo combined with the CRISPR Primers (designed by CRISPRscan), listed in Table S20, using Phusion DNA Polymerase (New England Biolabs), then purified using a DNA Clean and Concentrator-5 kit (Zymo). The purified dsDNA templates were used for in vitro transcription to generate gRNA using a T7 or SP6 RNA Polymerase kit (New England Biolabs or Thermo Fisher Scientific), then purified using an RNA Clean and Concentrator-5 kit (Zymo). A gRNA pair (500 pL of 30 ng/μl), 300 ng/μl Cas9 protein (PNA Bio) and 0.05% Phenol Red (Sigma-Aldrich) solution was injected into one-cell zebrafish embryos. Two to 4 dpf embryos were genotyped by PCR and agarose gel electrophoresis using primers in Table S20 to determine gRNA efficiency. To screen F0 founders, genomic DNA was extracted from outcrossed F1 embryos and genotyped by PCR and agarose gel electrophoresis with KAPA2G Genotyping Mix (Kapa Biosystems) using the primers in Table S20. Deletions were confirmed via Sanger sequencing (Eton Bioscience).
RNA isolation and qPCR
Caudal fin tissue from three fish or from 15 2 dpf embryos, was pooled for each biological replicate and collected in Tri-Reagent (Sigma). RNA was isolated by ethanol precipitation followed by purification with an RNA Clean & Concentrator Kit-5 (Zymo). cDNA was synthesized from 1 μg of total RNA with a First Strand cDNA Synthesis Kit (Roche). Quantitative PCR was performed on a Roche LightCycler 480 using LightCycler 480 SYBR Green I Master Mix (Roche) with primers listed in Table S20. fgf20aE48Δ/Δ, mdkaΔ/Δ, mdkaE20Δ/Δ and cx43E24Δ/Δ qPCRs were completed in technical triplicates for each timepoint, and LENΔ/Δ qPCRs in technical duplicates. All experiments normalized transcript expression levels to actb2 as a control. PRISM software was used to perform statistical analysis with paired t-tests and to generate graphs (GraphPad PRISM 8.4.0).
Library preparation and sequencing
For whole-fin libraries, biological triplicate pools of caudal fin clips were collected at 0 dpa (freshly amputated), 1 dpa or 4 dpa time points via razor blade (VWR) from 15 fish per pool. Cells were dissociated using 1.18 U/ml Liberase DH (Roche) in Hank's Balanced Salt Solution (Gibco) stirring at 37°C for 2 h, collected in 15 min increments, then strained through a 70 μm nylon cell strainer (Corning). Samples were quantitated for 100 K live cells using the Countess Automated Cell Counter (Thermo Fisher Scientific). ATAC library preparation was performed as described previously (Buenrostro et al., 2013).
For whole-fin RNA-seq, triplicate Ekkwill strain caudal fin clip biological replicate pools were collected at 0 dpa, 1 dpa or 4 dpa time points via razor blade (VWR) from 15 fish per pool aged between 3 and 12 months. LENΔ/Δ and corresponding wild-type tissue were collected similarly but at 0 dpa and 2 dpa. Fin tissue was collected in Tri-Reagent (Sigma) and isolated by ethanol precipitation. RNA was further purified with the RNA Clean & Concentrator Kit-5 (Zymo). Libraries were prepared using a Stranded mRNA-seq Kit (KAPA).
For fibroblast libraries, triplicate tph1b:creERpd250; ubi:switchcz1701 caudal fins were clipped and treated at 2 dpa for 12 h in 4 μM tamoxifen (Sigma-Aldrich) (Tornini et al., 2016), and allowed to regenerate fully for 2 months. Seventy-five fish fin clips were combined for each biological replicate pool, collected at 0 dpa or 4 dpa time points. Cells were dissociated using 1.18 U/ml Liberase DH (Roche) in Hank's Balanced Salt Solution (Gibco) stirring at 37°C for 2 h, collected in 15 min increments, strained through a 70 μm nylon cell strainer (Corning), then sorted for 100 K mCherry+ fluorescent cells on a B-C Astrios cell sorter by the Duke Cancer Institute Flow Cytometry Shared Resource. ATAC-seq library preparation was performed as detailed by Buenrostro et al. (2013). Libraries were prepared using a Stranded mRNA-seq Kit (KAPA).
Sequencing for the Whole Fin 0/4, Fibroblast and LENΔ/Δ libraries was performed on a HiSeq 4000, 50 bp SE, by the Duke Center for Genomic and Computational Biology. Sequencing for the Whole Fin 0/1 libraries was performed on a DNBSEQ-MGI2000, 100 bp PE, by BGI Americas.
Histology and imaging
Histology was performed on cryosections of fin regenerates fixed in 4% paraformaldehyde as described previously by Tornini et al. (2016). RNA probes were amplified from 4 dpa fin cDNA using primers listed in Table S20, transcribed with T7 RNA polymerase (New England Biolabs), and labelled using a DIG RNA Labeling Mix (Roche). In situ hybridization was performed using an InSituPro VSi robot (Intavis). Immunohistochemistry to enhance visualization of EGFP in enhancer reporter fish was performed with anti-GFP antibodies (rabbit, Invitrogen, A-11122; 1:200) followed by anti-rabbit antibodies (Alexa Fluor 488 conjugated; goat, Invitrogen, A-11034; 1:500). Imaging was performed on a Leica DM6000B compound microscope.
Anesthetized zebrafish were imaged using a Zeiss AxioZoom microscope at the time points indicated. All raw images were processed using either ZEN (Zeiss) or FIJI (Schindelin et al., 2012) software. Lengths of fin regenerates were measured using ZEN Blue software (Zeiss), averaging the lengths of the second and third dorsal- and ventral-most rays for each animal. Fin segment lengths were measured by taking the distance between the third and fourth joint on the second and third dorsal-most rays in each fin. PRISM software was used to perform paired two-tailed t-tests on fin amputations and generate graphs (GraphPad PRISM 8.4.0). Adobe Illustrator was used to format graphs (Adobe Illustrator 23.0.6).
Bioinformatic analysis
RNA-seq reads were trimmed using Trim Galore (Krueger, 2015) (0.4.1, with -q 15 --paired) and then mapped with TopHat (Trapnell et al., 2009) (v 2.1.1, with parameters --b2-very-sensitive --no-coverage-search and supplying the UCSC danRer10 refSeq annotation). Gene-level read counts were obtained using the htseq-count (Anders et al., 2015) (v1.6.1) by the reads with MAPQ greater than 30. Bioconductor package DESeq2 (Love et al., 2014) (v 1.26.0) was employed for differential expression (DE) analysis. A fold change greater than two and a false discovery rate of less than 0.05 was used to filter the significant DE genes. The zebrafish gene IDs were mapped to mouse homologous gene IDs to perform pathway and upstream regulator analysis using QIAGEN Ingenuity Pathway Analysis software (Krämer et al., 2014).
ATAC-Seq reads were trimmed using Trim Galore (0.4.1, with -q 15) then mapped with bowtie2 (Langmead and Salzberg, 2012) (2.2.5, with parameters --very-sensitive) to zebrafish genome (UCSC danRer10 refSeq annotation). The mapped reads were filtered by a MAPQ greater than 30 using samtools (Li et al., 2009) (v 1.5) and duplicated reads were removed by picard (v 1.91). The peaks were called by MACS2 (Zhang et al., 2008) (v 2.1.0, with --nomodel --shift 37 --extsize 73 -q 0.05 -g 1.5e9). The bioconductor package DiffBind (Ross-Innes et al., 2012) (v 2.14.0) was employed for differential open region analysis. A fold change greater than two and a P-value of less than 0.05 was used to filter the significant differential peaks. The enriched motifs were analyzed using chromVAR (Schep et al., 2017) (v 1.8.0). The motifs were a collection of jasper2018, jolma2013 and cisbp_1.02 from package motifDB (v 1.28.0) and were merged by a distance smaller than 1e-9 calculated by MotIV::motifDistances function (v 1.42.0). The merged motifs were exported by motifStack (Ou et al., 2018b) (v 1.30.0). The factor footprints were plotted by ATACseqQC package (version 1.12.3) (Ou et al., 2018a).
Bioconductor package ChIPpeakAnno (Zhu et al., 2010) (v 3.21.4), trackViewer (Ou and Zhu, 2019) (v 1.23.2) and complexHeatmap (Gu et al., 2016) (v2.2.0) were used to annotate the peaks, plot tracks and generate heatmaps. The differential peaks were annotated by nearest gene start site to the center of peaks. ATAC-Seq peaks were paired to RNA-Seq differential expression data by annotated nearest gene symbols. The methylation/unmethylation data were downloaded from GSE126702 (Lee et al., 2020) and plotted using the trackViewer package (version 1.24.1) (Ou and Zhu, 2019).
Acknowledgements
We thank Jim Burris, Karina Olivieri, Larry Frauen, Colin Dolan, Eileen Gu, Daniel Stutts, Tim Curtis, Nicolette Mariano and Shane Miller of the Duke Zebrafish Core for zebrafish care; Alexias Safi for assistance with ATAC-seq; Atsushi Kawakami for the Hgn21A line; Valerie Tornini for guidance on transgenic lines; and Chen-Hui Chen for comments on the manuscript. This work is dedicated to our friend, mentor and colleague Stephen L. Johnson.
Footnotes
Author contributions
Conceptualization: J.D.T., G.E.C., K.D.P.; Methodology: J.D.T.; J.K.; Formal analysis: J.D.T., J.O., J.K. V.C., L.S.; Investigation: J.D.T., N.L., K.S., J.K.; Data curation: J.O.; Writing - original draft preparation: J.D.T.; Writing - review & editing: J.D.T., J.O., J.K., G.E.C., K.D.P.; Supervision: K.D.P., G.E.C., J.K.; Project administration: K.D.P.; Funding acquisition: G.E.C., J.K., K.D.P.
Funding
This work was supported by a Schweizerischer Nationalfonds zur Förderung der wissenschaftlichen Forschung Postdoctoral Fellowship to V.C., by University of Wisconsin-Madison start-up funds (AAC8355, AAC8979, AAC6429 and AAG9756) to J.K. and by grants from the National Institutes of Health (R01 AS076342 and R01 GM074057) to K.D.P. Deposited in PMC for release after 12 months.
Data availability
The ATAC-seq and RNA-seq data have been deposited in GEO under accession number GSE146960.
Peer review history
The peer review history is available online at https://dev.biologists.org/lookup/doi/10.1242/dev.191262.reviewer-comments.pdf
References
Competing interests
The authors declare no competing or financial interests.