Topologically associating domains (TADs) have been proposed to both guide and constrain enhancer activity. Shh is located within a TAD known to contain all its enhancers. To investigate the importance of chromatin conformation and TAD integrity on developmental gene regulation, we have manipulated the Shh TAD – creating internal deletions, deleting CTCF sites, and deleting and inverting sequences at TAD boundaries. Chromosome conformation capture and fluorescence in situ hybridisation assays were used to investigate the changes in chromatin conformation that result from these manipulations. Our data suggest that these substantial alterations in TAD structure have no readily detectable effect on Shh expression patterns or levels of Shh expression during development – except where enhancers are deleted – and result in no detectable phenotypes. Only in the case of a larger deletion at one TAD boundary could ectopic influence of the Shh limb enhancer be detected on a gene (Mnx1) in the neighbouring TAD. Our data suggest that, contrary to expectations, the developmental regulation of Shh expression is remarkably robust to TAD perturbations.
At the megabase-scale, the mammalian genome is partitioned into self-interacting topologically associating domains (TADs) (Dixon et al., 2012; Nora et al., 2012). Mammalian TAD boundaries are enriched in CTCF sites, with their relative orientation appearing crucial to function (Narendra et al., 2015; Rao et al., 2014; Sanborn et al., 2015). TADs are formed by dynamic cohesin-driven loop extrusion (Fudenberg et al., 2016; Nora et al., 2017; Rao et al., 2017; Schwarzer et al., 2017; Vian et al., 2018) and convergent CTCF sites act to impede loop extrusion by enabling WAPL-mediated release of cohesin from the chromosome (Haarhuis et al., 2017).
The regulatory landscapes of developmental genes are frequently found to be contained within the same TAD (Dixon et al., 2012; Rao et al., 2014). TADs have, therefore, been proposed to act as functional regulatory units within which contacts between enhancers and their target gene are favoured and aberrant interactions of enhancers across TAD boundaries are limited (Fudenberg et al., 2016; Sun et al., 2019). In support of this hypothesis, some studies have found that deletion or inversion of CTCF sites at TAD boundaries can promote TAD boundary crosstalk and rewire enhancer-promoter contacts (de Wit et al., 2015; Guo et al., 2015; Narendra et al., 2015; Rodríguez-Carballo et al., 2017). Moreover, a number of recent studies have suggested that changes to TAD structure can disrupt gene regulation through enhancer-rewiring in human disease (Flavahan et al., 2016; Franke et al., 2016; Lupiáñez et al., 2015). However, other studies report that, although depletion of CTCF erases the insulation between TADs, it has limited effects on gene expression (Nora et al., 2017; Soshnikova et al., 2010).
To further study the CTCF-mediated function of TADs in developmental gene regulation, we have exploited the sonic hedgehog (Shh) regulatory domain – a paradigm locus for long-range regulation. The SHH morphogen controls the growth and patterning of many tissues during embryonic development, including the brain, neural tube and limbs. Spatial and temporal Shh expression is regulated by tissue-specific enhancers located within the gene, and upstream in a large gene desert and within neighbouring genes (Jeong et al., 2006; Anderson and Hill, 2014). Shh and its cis-acting elements are all contained within a well-characterised ∼960 kb TAD (Anderson et al., 2014; Williamson et al., 2016). In the developing mouse limb bud, Shh expression is solely determined by the ZRS (also known as MFCS1) enhancer (Lettice et al., 2003; Sagai et al., 2005) located 850 kb upstream of Shh within an intron of the widely expressed Lmbr1 (Fig. 1A). Fluorescence in situ hybridisation (FISH) showed that Shh and the ZRS are consistently located in relatively close proximity to each other in all cell types and tissues examined, which we infer to be a consequence of the underlying invariant TAD structure. In contrast, we observed increased ZRS-Shh colocalisation in the Shh-expressing posterior portion of developing limb buds (Williamson et al., 2016), consistent with a specific gene-enhancer contact.
Here, we genetically manipulate the Shh TAD and its TAD boundaries to investigate the importance of chromatin architecture on TAD structure and on the regulation of gene expression. We use a chromosome conformation assay (5C) and FISH to investigate how these manipulations affect structures within the Shh TAD and its interactions with adjacent TADs. We determine how these alterations affect the expression pattern of Shh and nearby developmentally regulated genes in vivo. We also examine the phenotypic consequences of these manipulations. Our results question the importance of TADs for correct spatial and temporal gene regulation.
A large deletion within the Shh TAD does not disrupt local genome organisation or limb-specific activation of Shh
Prominent features of the Shh TAD include five CTCF binding sites preserved across multiple cell types (Fig. 1A), and two sub-TADs with overlapping boundaries located within the gene desert between the forebrain enhancers and Rnf32 (Figs 1F and 2A). This region of the gene desert includes less well-defined CTCF peaks that differ across cell types but, because of their location, may have some role in defining these sub-TADs (Fig. 1A) (Rosenbloom et al., 2013).
To determine the contribution of TAD internal sequence to 3D chromatin organisation and gene expression, we exploited our previous work that used the local hopping activity of the sleeping beauty (SB) transposon to probe the Shh regulatory domain (Anderson et al., 2014). Transposition of the SB leaves one loxP site at the initial integration site and inserts a second site where it re-integrates, enabling Cre recombinase to create deletions of the intervening DNA. The orientation of the re-integration means the lacZ gene [encoding β-galactosidase (β-gal)] carried by the SB is retained in the deleted chromosome allowing remaining enhancer activity to be monitored. Using this approach, we deleted ∼700 kb (∼70%) of the internal Shh TAD sequence, including the sub-TAD boundaries but leaving the five CTCF binding sites at the TAD extremities intact (Fig. 1A). The Δ700 deletion removes many of the known Shh enhancers and relocates the ZRS to within 96 kb of the Shh promoter (Fig. 1A). Removal of the Shh forebrain and epithelial enhancers in the Δ700 deletion is shown by changes in the β-gal staining of ShhΔ700/+ embryos. Staining is observed only within the floor plate and hind brain, presumably driven by the proximal enhancers SFPE1/2 and SBE1, and within the limbs driven by the ZRS (compare the wild type in Fig. 1B with the ShhΔ700/+ embryo in Fig. 1C). Homozygous ShhΔ700/Δ700 embryos show phenotypes very similar to those of Shh−/− embryos but with normal limb and digit patterning (Chiang et al., 1996). These data indicate that, despite its incorrect position now only 96 kb from the Shh promoter, ZRS is able to function normally to drive Shh expression in limb development (Fig. 1D,E).
5C on whole embryonic day (E) 11.5 ShhΔ700/Δ700 and wild-type embryos shows that the Shh TAD boundaries and the adjacent TADs are unaffected by the Δ700 deletion (Fig. 1F,G; Fig. S1). Therefore, neither sequence elements nor chromatin interactions within the deleted region are needed for maintaining the location of the TAD boundaries. In addition, the large genomic distance between Shh and its limb enhancer ZRS is not required for correct function.
Interactions within the Shh TAD are delineated by CTCF sites either side of Shh and within Lmbr1
Our previous 5C analyses on cells dissected from whole limbs, bodies and heads of E11.5 embryos showed enriched interactions between the genomic region containing Shh, located at one TAD boundary, and a genomic region within Lmbr1 close to ZRS, located ∼70 kb from the other TAD boundary (Williamson et al., 2016). That this enrichment can be identified throughout the E11.5 embryo, a stage when we have shown that high levels of Shh-ZRS colocalisation occur only in the posterior distal limb, excludes active Shh-ZRS colocalisation as the sole driver of this apparent chromatin loop (Williamson et al., 2016). To gain further insight into the nature of these interactions, we dissected E11.5 limb buds to compare cell populations with no ZRS activity (anterior two-thirds of bud) with those in which ZRS is active (posterior one-third) (Fig. 2A; Fig. S2A,C).
The 5C heatmaps of the Shh TAD are similar in both anterior and posterior limb bud cell populations, and comparable with dissected E11.5 bodies (compare Fig. 1F with Fig. 2A). At high (15 kb) resolution, the strongest enrichment involved interactions between both Shh and the genomic region immediately 3′ of Shh, and a locus ∼20 kb from ZRS in intron 5 of Lmbr1 (Fig. 2B; Fig. S2B,D, left- and right-hand heatmaps). ENCODE data (Rosenbloom et al., 2013) indicate that these three loci are all bound by CTCF across a range of cell and tissue types (Fig. 1A), with the underlying DNA containing CTCF-binding motifs in a convergent orientation consistent with a role in blocking loop extrusion (Fig. 2A).
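As a minimal illustrative sketch (not part of the authors' analysis), the notion of a convergent motif pair can be expressed as a simple orientation check: the upstream motif lies on the forward strand and the downstream motif on the reverse strand, so that the two motifs point towards each other. The class name, function name, coordinates and strands below are hypothetical and are not taken from the ENCODE data or from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CtcfMotif:
    """A CTCF motif described by its genomic position and strand."""
    name: str
    position_bp: int
    strand: str  # '+' (forward) or '-' (reverse)


def is_convergent(a: CtcfMotif, b: CtcfMotif) -> bool:
    """Return True if the two motifs point towards each other.

    The motif with the smaller coordinate must be on the '+' strand and
    the motif with the larger coordinate on the '-' strand.
    """
    left, right = sorted((a, b), key=lambda m: m.position_bp)
    return left.strand == "+" and right.strand == "-"


# Hypothetical anchors for illustration only; not real coordinates or strands.
left_anchor = CtcfMotif("left_anchor", 100_000, "+")
right_anchor = CtcfMotif("right_anchor", 900_000, "-")
print(is_convergent(left_anchor, right_anchor))
```

The check is symmetric in its arguments because the motifs are ordered by position before the strands are compared.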
CTCF site deletions reduce Shh intra-TAD interactions and disrupt Shh/ZRS proximity
To examine the role of CTCF sites on the architecture of the Shh TAD we used CRISPR-Cas9 to make small (∼1 kb) deletions of sequences containing the five major CTCF binding sites in mouse embryonic stem cells (ESCs) (Table S1). We first generated ESC lines homozygous for deletions of the CTCF binding regions 3′ and 5′ of Shh (Fig. 1A, sites 1 and 2, respectively) and assayed chromatin conformation by 5C and FISH.
The Shh TAD structure in wild-type ESCs is similar to that in E11.5 embryos (Fig. 3A; Fig. S3A,C). Deletion of CTCF site 1 (ΔCTCF1), which delineates the TAD boundary 3′ of Shh, results in Shh losing interactions (arrows) with the rest of its own TAD and gaining interactions (arrowheads) with regions just 5′ of En2 and Rbm33 (Fig. 3A; Fig. S3A,C). The TAD boundary re-locates by ∼60 kb to 5′ of Shh beyond CTCF2 (Fig. 3B,C; Fig. S3B,D). There is also loss of interactions with a locus upstream of the forebrain enhancers near the sub-TAD boundary within the larger Shh sub-TAD (Fig. 3B, strong blue diagonals located between SBE3 and Rnf32). These data are consistent with CTCF1 forming the Shh TAD boundary by blocking loop extrusion emanating from within the En2 TAD.
The left-hand Shh TAD boundary is not affected by deletion of CTCF site 2; however, the 5′ Shh region does gain contacts (arrowheads) with the En2 TAD in a similar manner to the loss of CTCF1 (Fig. 3A-C; Fig. S3A-D), suggesting that both CTCF1 and 2 are necessary to optimally block loop extrusion emanating from the En2 TAD. Although loss of CTCF2 results in decreased interactions between the Shh locus and the rest of its own TAD (arrows), this is compensated for by increased interactions with CTCF1, thereby maintaining the boundary position (Fig. 3A,B; Fig. S3, right-hand heatmaps). There are also enriched interactions within the Shh sub-TAD in ΔCTCF2 cells (Fig. 3B; Fig. S3B,D).
We also analysed possible alterations of chromosome conformation due to the CTCF site deletions with 3D-FISH using probes for Shh, ZRS and Shh brain enhancer 2 (SBE2), an enhancer that is located 460 kb upstream of the Shh coding sequence in the middle of the TAD (Jeong et al., 2006) (Figs 1A and 3D). Interprobe distances between all three probe pairs were significantly increased in CTCF deletion cells compared with wild-type ESCs (Fig. 3E; Table S2), consistent with the reduced interactions between Shh and the rest of its TAD identified by 5C. Conversely, distances between Shh and Cnpy1 (in the neighbouring En2 TAD) were significantly decreased in ΔCTCF1 cells compared with wild type (Fig. 3E; Table S2), consistent with relocation of the TAD boundary.
Deleting either CTCF1 or CTCF2 disrupts Shh-ZRS spatial proximity in ESCs and, more generally, results in reduced 5C interactions between Shh and the rest of the regulatory TAD that may be due to the relocation of the TAD boundary (ΔCTCF1) or greater sub-division of the TAD (ΔCTCF2). The TAD boundary adjacent to Shh is sharply defined by CTCF1, whereas the boundary location of the neighbouring En2 TAD cumulatively results from both CTCF1 and 2, possibly by blocking loop extrusion emanating from this TAD. However, neither of these deletions on its own is sufficient to cause merging of the two neighbouring TADs.
Shh-ZRS proximity is disrupted by the deletion of ZRS/Lmbr1 CTCF sites
Both CTCF1 and CTCF2 have enriched interactions with the CTCF site ∼20 kb from ZRS in intron 5 of Lmbr1 (CTCF3) (Fig. 1A). Therefore, we deleted both copies of CTCF3 (ΔCTCF3), described as i5 in Paliou et al. (2019).
Although whole TAD integrity was unaffected by ΔCTCF3 (Fig. 4A,C; Fig. S4A,C), intra-TAD reorganisation occurred in a manner similar to the loss of CTCF2, with enriched interactions within the sub-TADs (Fig. 4B; Fig. S4B,D). Loss of interactions between CTCF3 and Shh/CTCF2 in ΔCTCF3 cells appears to be somewhat compensated for by enriched contacts (arrowheads) between the ZRS locus and the Shh region of the TAD, particularly CTCF1 (Fig. 4B, inset heatmap adjacent to ΔCTCF3). Ectopic CTCF binding at ZRS has recently been identified following the loss of neighbouring CTCF sites including CTCF3 (Paliou et al., 2019). Despite this compensation identified by 5C, FISH showed significantly increased interprobe distances between Shh, SBE2 and ZRS in ΔCTCF3 cells compared with wild type (Fig. 4D,E; Table S2). These data suggest that loss of any one of the three CTCF binding sites (1, 2 or 3) can disrupt the spatial proximity of Shh, SBE2 and ZRS (Figs 3E and 4E).
Finally, we generated ESC lines with deletions of CTCF binding sites at the Lmbr1 promoter (ΔCTCF4) and 5′ Lmbr1 (ΔCTCF5), both of which are located at the boundary between the Shh TAD and the adjacent TAD containing Mnx1 (Fig. 1A). The CTCF motif within CTCF5 is oriented towards the Mnx1-containing TAD and deletion of CTCF5 caused a loss of interactions between this boundary region and the Mnx1 TAD (Fig. 4A,B; Fig. S4B,D) with the TAD boundary shifted toward Nom1 (Fig. 4C). 5C reveals increased interactions in the ZRS-Lmbr1 region in ΔCTCF5 (Fig. 4B) and FISH also shows increased spatial proximity between ZRS and Lmbr1 (Fig. 4F; Table S2).
FISH revealed significantly increased interprobe distances between Shh and ZRS in ΔCTCF5 cells and between SBE2 and ZRS in ΔCTCF4 cells, which are not detected by 5C (Fig. 4E). There are also decreased distances seen between ZRS and Mnx1 in the adjacent TAD in the absence of CTCF4, also not apparent in the 5C data (Fig. 4F).
We conclude that deletion of CTCF binding sites at either the Shh TAD or sub-TAD boundaries, especially CTCF1, 2 and 3, affects local chromatin organisation in ESCs and disrupts Shh/ZRS spatial proximity.
Reduced Shh-ZRS colocalisation in the limb upon the loss of CTCF1, 2 and 3
To test how disrupted TAD organisation impacts on chromosome conformation and Shh gene expression during embryonic development, we generated mouse lines carrying each of the homozygous CTCF deletions. We have previously reported enhanced Shh-ZRS colocalisation in the limb bud at the time and place of Shh expression that depends on a fully functional ZRS (Lettice et al., 2014; Williamson et al., 2016). Therefore, we assayed the spatial proximity of Shh, SBE2 and ZRS by FISH in E11.5 embryo sections that include posterior (ZPA) and anterior distal limb tissue from wild-type and homozygous ΔCTCF mutant embryos (Fig. 5A).
In both regions of the wild-type limb bud analysed (ZPA and anterior), Shh-ZRS distances were shorter than between either Shh-SBE2 or SBE2-ZRS, consistent with Shh and ZRS being maintained in spatial proximity across the limb bud (Fig. 5B,C). Similar to our observation in ESCs (Figs 3 and 4), distances between Shh and both SBE2 and ZRS were significantly increased in ΔCTCF1, 2 and 3, but not in ΔCTCF4 and 5 embryos (Fig. 5B,C; Tables S3 and S4). The frequency of Shh-ZRS colocalisation (<200 nm) in the ZPA of ΔCTCF1, 2, and 3 mutant embryos was reduced to levels seen in non-expressing parts of the wild-type limb bud (Fig. 5D; Table S5).
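For illustration only, the colocalisation frequency referred to above can be sketched as the fraction of measured interprobe distances that fall below the 200 nm threshold. This is a minimal sketch and not the authors' analysis pipeline; the function name and the example distances are hypothetical.

```python
from typing import Sequence

COLOCALISATION_THRESHOLD_NM = 200.0  # threshold quoted in the text (<200 nm)


def colocalisation_frequency(
    distances_nm: Sequence[float],
    threshold_nm: float = COLOCALISATION_THRESHOLD_NM,
) -> float:
    """Return the fraction of interprobe distances strictly below the threshold."""
    if not distances_nm:
        raise ValueError("at least one distance measurement is required")
    colocalised = sum(1 for d in distances_nm if d < threshold_nm)
    return colocalised / len(distances_nm)


# Hypothetical Shh-ZRS interprobe distances (nm); not data from the study.
example_distances = [150.0, 320.0, 180.0, 410.0, 260.0]
print(colocalisation_frequency(example_distances))  # 2 of 5 below 200 nm -> 0.4
```

A distance of exactly 200 nm is not counted as colocalised here, which follows the strict "<200 nm" wording; a different convention would only require changing the comparison.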
Shh expression patterns and development are unaffected in CTCF site-deletion mice
Our data indicate that deletion of individual CTCF sites can affect TAD boundaries, intra- and inter-TAD interactions and enhancer-promoter colocalisation frequencies. These alterations in 3D chromosome conformation might be predicted to affect gene expression. However, we found that mice homozygous for any of the ΔCTCF deletions are viable, fertile and have no overt deleterious phenotype. In situ hybridisation in homozygous mutant embryos showed a normal pattern of Shh expression in the brain (Fig. 6A), and body (Fig. 6B) at similar levels to wild type. At E11.5 expression is detected only within the developing midline of the brain, the zona limitans intrathalamica and the medial ganglionic eminence in the head, and staining is visible in the floor plate and notochord, the zone of polarising activity (ZPA) of the limb buds and umbilicus in the body. No ectopic expression is detected at the midbrain/hindbrain junction driven by neighbouring En2 or Cnpy1 enhancers (Fig. 6B). Conversely, in embryos homozygous for either ΔCTCF1 or ΔCTCF2 there is no evidence for ectopic En2 and Cnpy1 expression in any of the normal sites of Shh expression in the brain (Fig. 6C,D).
Similarly, no ectopic Shh expression is detected in motor neurons driven by Mnx1 enhancers in the TAD beyond ZRS/Lmbr1 (Lee et al., 2004) (Fig. 6A), and Mnx1 was not expressed ectopically in any of the normal sites of Shh expression in embryos carrying homozygous deletions of CTCF3, 4 or 5 (Fig. 6E). These findings indicate that despite the alterations to Shh TAD architecture and chromosome conformation, enhancer/promoter specificity is maintained in the ΔCTCF embryos and that, in the absence of these CTCF sites, there is no crosstalk across TAD boundaries to result in ectopic expression driven by Shh enhancers.
In situ hybridisation is good for determining spatial expression patterns but is at best a semi-quantitative technique. Therefore, we used RNA FISH to detect nascent Shh transcripts in regions of the developing brain in order to quantify the number of expressing alleles in individual cells (representative images are shown in Fig. S5A). In a region where Shh expression is driven by SBE2 (Z. Crane-Smith, personal communication) we detected a small but significant reduction in the percentage of Shh-expressing alleles in embryos carrying homozygous deletions of CTCF2, 3 and 5 (Fig. 6F). However, when we used qRT-PCR to examine RNA from entire heads, in which expression is controlled by multiple enhancers, no significant changes in Shh mRNA levels were observed (Fig. 6G).
Shh expression in the limb bud driven by ZRS lasts only 48 h from initiation to downregulation. qRT-PCR performed from limb buds at E10.5 can detect changes in expression in response to deletions within the Shh TAD (Paliou et al., 2019). However, we detected no differences in the percentage of expressing Shh alleles in E11.5 developing limb buds using RNA FISH (Fig. S5B), and qRT-PCR shows no significant differences in Shh mRNA expression levels, with the exception of ΔCTCF5 mutants (Fig. S5C).
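One way to picture the allele-level quantification described above is the following minimal sketch, which converts per-nucleus counts of nascent transcript foci into a percentage of expressing alleles. It assumes diploid nuclei with two Shh alleles each and at most one focus per allele; this scoring scheme, the function name and the example counts are assumptions made for illustration and are not taken from the study.

```python
from typing import Sequence

ALLELES_PER_NUCLEUS = 2  # assumption: diploid nuclei, two Shh alleles each


def percent_expressing_alleles(foci_per_nucleus: Sequence[int]) -> float:
    """Return the percentage of alleles showing a nascent transcript focus.

    Each entry is the number of nascent RNA FISH foci scored in one nucleus
    (0, 1 or 2 under the diploid assumption).
    """
    if not foci_per_nucleus:
        raise ValueError("at least one nucleus is required")
    if any(n < 0 or n > ALLELES_PER_NUCLEUS for n in foci_per_nucleus):
        raise ValueError("foci counts must lie between 0 and 2")
    expressing = sum(foci_per_nucleus)
    total_alleles = ALLELES_PER_NUCLEUS * len(foci_per_nucleus)
    return 100.0 * expressing / total_alleles


# Hypothetical foci counts for four nuclei; not data from the study.
print(percent_expressing_alleles([0, 1, 2, 1]))  # 4 of 8 alleles -> 50.0
```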
Mice heterozygous for a Shh null allele express only 50-60% wild-type levels of Shh in the limb bud but develop normally; in fact, in the limb Shh levels must fall to ∼20% of wild type before development is perturbed and digits are lost (Lettice et al., 2017). As the TAD boundary moves beyond the 5′ end of Shh in ΔCTCF1 cells, arguably separating the coding region from its enhancers, we also made compound heterozygotes carrying both the ΔCTCF1 and Shh null alleles to uncover subtle effects on Shh expression. These ShhΔCTCF1/− mice develop normally and are viable and fertile, further suggesting that deletion of CTCF1 results in no deleterious changes in Shh expression.
A 35 kb deletion that removes the Lmbr1 promoter and TAD boundary disrupts chromatin conformation with no deleterious phenotype
Deletion of CTCF1 3′ of Shh showed that this position was important for the TAD boundary location and for Shh physical proximity with its regulatory domain (Fig. 3; Fig. S3), but the loss of this site had no apparent phenotypic consequence (Fig. 6). In a similar manner, deleting CTCF5 at the Lmbr1 TAD boundary affected the Shh TAD boundary location but resulted in a minimal loss of proximity between ZRS and Shh, at least in E11.5 limb tissue (Fig. 4; Figs S4 and S5). Loss of TAD boundary regions can result in the merging of adjacent TADs and the ectopic activation of genes in one TAD by enhancers in the other merged TAD, with phenotypic consequences (Fabre et al., 2017; Lupiáñez et al., 2015). However, this involved the deletion of sizeable stretches of DNA across the boundaries in question, tens of kilobases rather than individual CTCF sites. In addition to CTCF binding sites, a number of features are found enriched at TAD boundaries, including those associated with active promoters (Dixon et al., 2012). To determine whether a more extensive deletion across the Lmbr1 boundary results in the merging of adjacent TADs, a homozygous 35 kb deletion (Δ35) was generated in mice; this deletion removed CTCF4 and CTCF5 and covered a region containing the first two exons of Lmbr1 and 13 kb upstream (Fig. 1A). RT-PCR in mouse embryos confirmed that this deletion eliminates transcription throughout the 5′ end of Lmbr1 in both isolated limb buds and the rest of the body (Fig. S6A).
5C from homozygous Δ35 ESCs showed that this deletion caused relocation of the TAD boundary a further ∼60 kb 5′ of the Lmbr1 promoter towards the promoter of Nom1 rather than a merging of the adjacent TADs (Fig. 7A; Fig. S6B,C). Interactions within the region extending from CTCF3 to Nom1 are enriched in Δ35 cells compared with wild type, and the CTCF3/ZRS genomic region gains interactions into the adjacent TAD up to Mnx1 (Fig. 7A, arrowheads). Nom1 loses interactions within its own TAD (Fig. 7A, arrows; Fig. S6B,C). Consistent with this, 3D-FISH (Fig. 7B) showed that distances between ZRS and Shh, SBE2 and Mnx1 were all significantly decreased in Δ35 (Fig. 7C; Table S6). The reduced spatial distance between ZRS and Mnx1 was not due to reduction of the linear genomic distance caused by the 35 kb deletion, as similar effects were seen in cells carrying an inversion of this 35 kb of DNA (Fig. 7C).
Deletion of the Shh TAD boundary at the Lmbr1 promoter relocates the boundary to the promoter of Nom1, and the ZRS has enhanced ability to contact sequences both within its own TAD and the Mnx1 TAD. Despite these differences, Δ35 homozygous mice were viable, fertile and had no apparent phenotype. The Shh expression pattern is also indistinguishable from wild type (Fig. 7D,E) – in particular midline expression is detected in the floor plate and notochord as one stripe down the body (Fig. 7E, arrowhead), with no evidence for expression as two more lateral stripes driven by Mnx1 motor neuron enhancers (Lee et al., 2004; Fig. 6E). Because of the changes in interactions across the TAD boundary observed in Δ35, we made compound heterozygotes with the Shh null chromosome to highlight subtle changes in expression. Even in this sensitised Shh background, compound ShhΔ35/− mice are phenotypically normal.
Interestingly, given the decreased distances measured by FISH between ZRS and Mnx1, in situ hybridisations indicate that the limb expression of Mnx1 is increased in ShhΔ35/Δ35 embryos in comparison with wild-type embryos (Fig. 7F-H). This is consistent with qRT-PCR, which detects a modest, but not significant, upregulation of Mnx1 expression in limb buds, whereas Shh expression is unchanged (Fig. 7I). These data suggest that the deletion of 35 kb encompassing the TAD boundary enhances the ability of the Mnx1 promoter to respond to the ZRS. However, no upregulation of Mnx1 expression is seen in the pharyngeal endoderm and developing lungs, which would be driven by the enhancers neighbouring ZRS, MACS1 and MFCS4 (Fig. 1A) (Sagai et al., 2009).
A systematic genetic approach to delete individual CTCF sites, and to delete or invert large regions, including those encompassing a TAD boundary, has enabled us to use chromosome conformation capture and imaging to assay the resulting perturbations to chromosome organisation within the Shh regulatory TAD, and between this and neighbouring TADs. Analysing CTCF deletions, we detected little or no disruption to gene regulation during embryonic development, and no detectable phenotype in animals that can be attributed to these altered chromosome conformations.
ZRS activity is not distance dependent
5C analysis confirmed that TAD boundaries were unaffected by removal of most of the internal region of the Shh TAD (Δ700) (Fig. 1; Fig. S1), with Shh and its remaining enhancers still located within the same, but smaller, TAD. This large deletion did cause extensive disruption to the developing embryo, mainly, it can be assumed, due to the loss of several known forebrain and epithelial enhancers within the deleted region. However, even in embryos homozygous for the 700 kb deletion, which relocates ZRS to less than 100 kb distant from Shh, ZRS function is maintained, with no detrimental effects on limb bud-specific Shh activation, and normal development of the limbs occurs. Therefore, the large genomic distance from Shh is not intrinsic to the function of the ZRS. This is in contrast to the loss of interactions following similar perturbations between a limb-specific enhancer and Hoxd13 that resulted in loss of Hoxd13 activity (Fabre et al., 2017).
Loss of CTCF sites at the Shh TAD boundaries disrupts chromatin architecture and impacts Shh/ZRS spatial proximity
We have previously shown that Shh and ZRS are in spatial proximity (∼300 nm) in the early embryo in both expressing limb tissue and the non-expressing adjacent flank (Williamson et al., 2016). Here, using 5C on cells dissected from E11.5 anterior and posterior limb buds, we show that this is driven by an interaction between the sites 3′ and 5′ of Shh (containing CTCF1 and CTCF2 sites) and a region within intron 5 of Lmbr1 ∼20 kb from ZRS (CTCF3) (Fig. 2; Fig. S2). This loop is also present in ESCs, and spatial proximity of Shh and ZRS is lost upon the deletion of any one of the three CTCF sites in both ESCs and E11.5 limb bud tissue (Figs 3-5; Figs S3 and S4). Deleting CTCF sites at the Lmbr1 promoter TAD boundary (ΔCTCF4 and ΔCTCF5) had a lesser effect on Shh/ZRS spatial proximity. Increased interprobe distances between either Shh or ZRS and the forebrain enhancer SBE2 located at the centre of the TAD suggest that the loss of spatial proximity may be due to a general decompaction throughout the TAD, rather than a loss of interactions which could be detected by 5C.
Shh responds to its developmental enhancers regardless of TAD disruption
5C analysis in ESCs suggests that the disruption caused by CTCF site deletions can remove Shh from its regulatory TAD (ΔCTCF1) or reinforce contacts within sub-TAD domains such that the Shh forebrain enhancers are sequestered in one, and ZRS and the long-range epithelial enhancers are sequestered in the other, with a loss of interactions between the two sub-TADs and between either sub-TAD and Shh (ΔCTCF2 and ΔCTCF3). Nevertheless, in all of these configurations, the expression pattern of Shh during embryonic development appears to be normal and mRNA levels are largely unchanged, with the resulting mice having no detectable phenotype. This indicates that communication between Shh and its extensive set of developmental enhancers is remarkably robust to TAD perturbation.
Ectopic expression across disrupted TAD boundaries is not common
Loss of CTCF1 not only moves the TAD boundary ∼60 kb to beyond the 5′ end of Shh but also enables greater interactions between Shh and the adjacent TAD, which contains other genes and their enhancers that are active during brain development, but in a pattern distinct from Shh. En2 is expressed at the mid-hindbrain boundary, a pattern at least partly dependent on an enhancer binding Pax2/5/8 (Li Song and Joyner, 2000). Similarly, Cnpy1 expression at the mid-hindbrain boundary is thought to be important for FGF signalling (Hirate and Okamoto, 2006). Despite increased chromatin interactions over the Shh TAD boundary in ΔCTCF1, there is no ectopic expression of Shh in the mid-hindbrain driven by the En2/Cnpy1 enhancers and, vice versa, there is no ectopic expression of En2/Cnpy1 at sites driven by Shh enhancers (Fig. 6).
The Lmbr1 TAD boundary has been suggested to be less precise than that at the Shh end of the TAD from both a structural and regulatory point of view (Anderson et al., 2014; Symmons et al., 2016). Deletion of CTCF5 weakened the boundary of the neighbouring Mnx1 TAD, and increased proximity between ZRS and Mnx1 was detected in ΔCTCF4. However, in neither case was there evidence for enhanced expression of Mnx1 – e.g. in limb buds driven by ZRS – beyond that detected in wild-type embryos. Interestingly, even in wild-type situations, Mnx1 has a weak expression domain concomitant with the limb bud ZPA, suggesting that this gene may be influenced by ZRS activity emanating from the adjacent TAD. Nor was there evidence of the Mnx1 motor-neuron enhancer (Zelenchuk and Brusés, 2011) driving expression of Shh in motor neurons of the developing neural tube in any of the mutant embryos.
A larger (35 kb) deletion of this boundary, removing CTCF4, CTCF5 and the promoter/first two exons of Lmbr1, enhanced ZRS contacts across both the Shh TAD and into the neighbouring Mnx1 TAD (Fig. 7; Fig. S6). Increased Mnx1 expression in the ZPA of embryos homozygous for the 35 kb deletion suggests that the potentially increased contacts between Mnx1 and ZRS identified in ESCs could be enabling greater activation of this gene by the Shh limb enhancer.
Perturbations of the Shh TAD boundaries can negatively impact on gene-enhancer colocalisation but are insufficient to cause a deleterious phenotype
It is commonly held that enhancer-driven gene activation requires ‘contact’ or very close apposition of the enhancer and promoter. Inversions encompassing the Shh TAD boundaries that disrupted TAD integrity and significantly increased the genomic distance between Shh and ZRS result in severe limb malformations, suggesting that these rearrangements prevent ZRS from contacting/regulating the Shh promoter (Symmons et al., 2016). These data and our 5C and FISH analyses, which show that the Shh TAD forms a compact discrete regulatory hub (Williamson et al., 2016), suggest that 3D organisation of the Shh TAD could allow distal enhancers to come into close proximity to selectively regulate Shh expression. However, in the functionally relevant cells of the limb bud ZPA, ZRS colocalisation (<200 nm) with Shh was reduced to the levels seen in non-expressing distal anterior tissue in ΔCTCF1, ΔCTCF2 and ΔCTCF3 homozygous embryos without adversely affecting Shh expression (Fig. 6) and with no subsequent phenotypic effects. This is consistent with evidence showing reduced Shh neural enhancer-promoter colocalisation in expressing cells and tissues (Benabdallah et al., 2019).
All embryos homozygous for one of the five CTCF binding domain deletions or the 35 kb deletion of the Lmbr1 boundary developed normally and were able to reproduce. Moreover, sufficient Shh expression was maintained for compound heterozygote embryos carrying either ΔCTCF1 or the 35 kb deletion opposite a Shh null allele to have no abnormal phenotype. A contemporaneous study on the same genomic territory has largely recapitulated these results – deletions of Lmbr1 CTCF sites and the gene promoter caused perturbations to local chromatin conformation but Shh expression, although reported to be reduced, was enough to drive normal limb development (Paliou et al., 2019). The Shh regulatory landscape is set up to ensure optimal activation of the gene and here, we have shown that this is robust to perturbations of TAD integrity and structure. Similarly, recent work on the Sox9-Kcnj2 locus suggests that even manipulations that result in the fusion of neighbouring TADs have no major effects on gene expression (Despang et al., 2019). However, large-scale disruptions incorporating boundaries which cause TADs to merge do result in developmental defects (Lupiáñez et al., 2015).
Our data suggest that CTCF binding has a role in TAD structure and loss of sites perturbs internal interactions and the position of boundaries. However, at the Shh locus these major disruptions have no effect on gene expression patterns and little effect on expression levels. We speculate that the largely unvarying organisation of TADs could have provided the necessary stable genomic environment for the accumulation of regulatory elements over evolutionary time rather than being essential for target gene activation.
MATERIALS AND METHODS
Cell culture and CRISPR-Cas9 mediated deletions
E14TG2A mouse ESCs (a kind gift from Austin Smith, Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, UK) were cultured under standard conditions (Anderson et al., 2014). CRISPR guides were made by cloning annealed oligos (Table S1) into pSpCas9(BB)-2A-GFP (PX458, Addgene plasmid #48138), a gift from Feng Zhang (Ran et al., 2013). We transfected 2 µg of vector DNA into 8×10⁵ ESCs using Lipofectamine 2000 (Thermo Fisher Scientific) following the manufacturer's instructions. After 48 h, GFP-positive cells were sorted using fluorescence-activated cell sorting and plated at low density. Ten days later, individual clones were picked and screened for correct deletion using PCR and Sanger sequencing (primers are listed in Table S1).
Mouse lines and embryo analysis
The ShhΔ700 deletion was created by crossing the line SBLac96 (Anderson et al., 2014) to a line carrying a pCAGGS-Cre recombinase gene (Araki et al., 2006). With the exception of the Δ35 kb and Inv35 kb mouse lines, which were made by injection of the ESCs into blastocysts, all of the other mouse lines were created as in Lettice et al. (2017) by direct microinjection into C57Bl6/CBA F2 zygotes of the same guides as were used in ESCs. Resultant G0 mice were screened by PCR using flanking primers (Table S1) and the deletions confirmed using Sanger sequencing. Lines were then established by crossing founder mice to C57Bl6 wild types. lacZ expression analysis, in situ hybridisations and RT-PCR reactions were conducted as in Anderson et al. (2014).
All mouse work was approved by the University of Edinburgh Animal Welfare and Ethics Review board and was conducted under the authority of Home Office Licences.
Heads and limb buds were dissected from individual E11.5 embryos, snap frozen in separate tubes and stored at −80°C. The rest of the embryo was used to make DNA, and the genotype established by PCR using the primers listed in Table S1. RNA was extracted from the tissues of mutant and wild-type embryos using TRIzol (Invitrogen) and first strand cDNA synthesised with a Transcriptor First strand cDNA synthesis kit (Roche) following the manufacturers' instructions. qRT-PCR was run on an LC480 lightcycler (Roche) and made use of the Universal ProbeLibrary (probe 32 for Shh, probe 60 for Mnx1 and the Universal ProbeLibrary Mouse GAPD Gene Assay). PCR primers are listed in Table S7. Each gene was assayed in separate wells and each sample run in triplicate. Gene expression data were analysed by the ΔΔCt method, with mutants from each line compared with their wild-type littermates (n=3-9 embryos). Student's unpaired t-test was used for statistical validations.
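The ΔΔCt comparison of mutant and wild-type littermates can be illustrated with a minimal sketch. It assumes that the target and reference amplicons double each cycle (the usual 2^−ΔΔCt simplification) and that triplicate wells are averaged before subtraction. The function name and the Ct values are hypothetical and are not taken from our data.

```python
from statistics import mean


def fold_change_ddct(target_mut, ref_mut, target_wt, ref_wt):
    """Relative expression of mutant versus wild type by the 2^-ddCt method.

    Each argument is a list of Ct values from replicate wells.
    Assumes an amplification efficiency of 2 per cycle (an assumption,
    not a measured efficiency).
    """
    d_ct_mut = mean(target_mut) - mean(ref_mut)  # dCt in the mutant
    d_ct_wt = mean(target_wt) - mean(ref_wt)     # dCt in the wild type
    dd_ct = d_ct_mut - d_ct_wt                   # ddCt
    return 2 ** (-dd_ct)


# Hypothetical triplicate Ct values: target gene versus reference gene
fold = fold_change_ddct(
    target_mut=[25.5, 24.5, 25.0], ref_mut=[18.0, 18.0, 18.0],
    target_wt=[24.5, 23.5, 24.0], ref_wt=[18.0, 18.0, 18.0],
)
```

In this toy example the mutant target Ct is one cycle later than wild type after correction for the reference gene, so `fold` is 0.5.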
E11.5 embryos were collected, fixed, embedded, sectioned, antibody stained for SHH expression and processed for FISH as previously described (Morey et al., 2007; Lettice et al., 2014), except that sections were cut at 8 µm. Regions expressing Shh were identified by antibody staining with an anti-Shh antibody (Ab86462, Batch GR182460-5, Abcam; 1:150), which has been shown to give the correct expression pattern in immunostaining (https://www.abcam.com/sonic-hedgehog-antibody-rm0128-4a37-ab86462.html). Fosmid clones (Fig. 1A; Table S8) were prepared and labelled as previously described (Morey et al., 2007). Between 160 and 240 ng of biotin- and digoxigenin-labelled fosmid probes were used per slide, with 16-24 µg of mouse Cot1 DNA (Invitrogen) and 10 µg salmon sperm DNA. For four-colour FISH, a similar quantity of the additional fosmid was labelled with either Green496-dUTP (Enzo Life Sciences) or red-dUTP (Alexa Fluor 594-5-dUTP, Invitrogen).
For 3D FISH on ESCs, 1×10⁶ cells were seeded on slides overnight. Cells were fixed in 4% paraformaldehyde for 10 min at room temperature and then permeabilised using 0.5% Triton X for 10 min (Eskeland et al., 2010).
Custom Stellaris® RNA FISH Probes were designed against Shh nascent mRNA (pool of 48 unique 22-mer probes) using the Stellaris® RNA FISH Probe Designer (Biosearch Technologies; www.biosearchtech.com/stellarisdesigner, version 4.2). The slides were hybridised with the Shh Stellaris FISH Probe set labelled with Quasar 570 (Biosearch Technologies), following the manufacturer's instructions (www.biosearchtech.com/stellarisprotocols). Briefly, FFPE tissue sections from E11.5 embryos were deparaffinised in xylene, hydrated in ethanol and permeabilised in 70% ethanol overnight at 4°C. Slides were incubated in 10 µg/ml proteinase K in 1× PBS for 20 min at 37°C followed by washes in 1× PBS and wash buffer (2× SSC, 10% deionised formamide). Shh RNA FISH probes were diluted in Stellaris RNA FISH hybridisation buffer (#SMF-HB1-10) to 125 nM and hybridised to slides overnight in a humidified chamber at 37°C. Slides were washed twice for 30 min in wash buffer (2× SSC, 10% deionised formamide) at 37°C, counterstained with 5 ng/ml DAPI, washed in 1× PBS and mounted in Vectashield (Vector Laboratories).
Slides were imaged using a Photometrics Coolsnap HQ2 CCD camera and a Zeiss AxioImager A1 fluorescence microscope with a Plan Apochromat 100×1.4NA objective, a Nikon Intensilight Mercury based light source and either Chroma #89014ET (three-colour) or #89000ET (four-colour) single excitation and emission filters (Chroma Technology Corp.) with the excitation and emission filters installed in Prior motorised filter wheels. A piezoelectrically driven objective mount (PIFOC model P-721, Physik Instrumente) was used to control movement in the z dimension. Step size for z stacks was set at 0.2 µm. Hardware control, image capture and analysis were performed using Nikon Nis-Elements software. Images were deconvolved using a calculated point spread function with the constrained iterative algorithm of Volocity (PerkinElmer). Image analysis was carried out using the Quantitation module of Volocity. For DNA FISH, only alleles with single probe signals were analysed, to eliminate the possibility of measuring sister chromatids.
3C library preparation
Limb buds and bodies (with the limbs and heads removed) from wild-type embryos, and entire ShhΔ700/Δ700 embryos, were dissected at E11.5 and the tissue dissociated by pipetting in just enough PBS to cover them. The cells were fixed with 1% formaldehyde for 10 min at room temperature. For ESCs, 5×10⁶-1×10⁷ cells were fixed. Crosslinking was stopped with 125 mM glycine for 5 min at room temperature followed by 15 min on ice. Cells were centrifuged at 400 g for 10 min at 4°C, supernatants removed, and cell pellets flash frozen on dry ice before storage at −80°C.
Cell pellets were treated as previously described (Dostie and Dekker, 2007; Ferraiuolo et al., 2010; Williamson et al., 2014). HindIII-HF (New England Biolabs) was the restriction enzyme used to digest the crosslinked DNA.
5C primer and library design
5C primers covering the Usp22 (mm9, chr11: 60,917,307-61,003,268) and Shh regions (mm9, chr5: 28,317,087-30,005,000) were designed using ‘my5C.primer’ (Lajoie et al., 2009) with the following parameters: optimal primer length of 30 nt, optimal primer melting temperature (Tm) of 65°C, default primer quality parameters (mer:800, U-blast:3, S-blast:50). Primers were not designed for large (>20 kb) and small (<100 bp) restriction fragments, for low complexity and repetitive sequences, or where there were sequence matches to >1 genomic target. The Usp22 region was used to assess the success of each 5C experiment but was not used for further data normalisation or quantification.
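The fragment-size criterion above can be expressed as a simple filter. The following sketch is illustrative only: it is not part of my5C.primer, the function name and coordinates are hypothetical, and it covers only the size thresholds (the repeat, low-complexity and uniqueness filters are not modelled).

```python
def size_eligible_fragments(fragments, min_len=100, max_len=20_000):
    """Keep restriction fragments whose length allows primer design.

    fragments: list of (start, end) coordinates in bp, end exclusive.
    Fragments shorter than min_len or longer than max_len are dropped,
    mirroring the <100 bp and >20 kb exclusions described in the text.
    """
    return [(s, e) for s, e in fragments if min_len <= e - s <= max_len]


# Hypothetical fragments of 50 bp, 5 kb, 25 kb and 100 bp
example = [(0, 50), (50, 5_050), (5_050, 30_050), (30_050, 30_150)]
kept = size_eligible_fragments(example)
```

Here only the 5 kb and 100 bp fragments remain in `kept`.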
The universal A-key [CCATCTCATCCCTGCGTGTCTCCGACTCAG-(5C-specific)] and the P1-key tails [(5C-specific)-ATCACCGACTGCCCATAGAGAGG] were added to the forward and reverse 5C primers, respectively. Reverse 5C primers were phosphorylated at their 5′ ends. An alternating design consisting of 365 primers in the Shh region (182 forward and 183 reverse primers) was used. Primer sequences are listed in Table S9.
5C library preparation
5C libraries were prepared and amplified with the A-key and P1-key primers as described in Fraser et al. (2012). Briefly, 3C libraries were first titrated by PCR for quality control (single band, absence of primer dimers, etc.), and to verify that contacts were amplified at frequencies similar to those usually obtained from comparable libraries (same DNA amount from the same species and karyotype) (Dostie and Dekker, 2007; Dostie et al., 2007; Fraser et al., 2010). We used 1-10 µg of 3C library per 5C ligation reaction.
5C primer stocks (20 µM) were diluted individually in water on ice and mixed to a final concentration of 2 nM. Mixed diluted primers (1.7 µl) were combined with 1 µl of annealing buffer (10× NEBuffer 4, New England Biolabs) on ice in reaction tubes. Then 1.5 µg salmon testis DNA was added to each tube, followed by the 3C libraries and water to a final volume of 10 µl. Samples were denatured at 95°C for 5 min and annealed at 55°C (48°C for ESCs) for 16 h. Ligation with Taq DNA ligase (10 U) was performed at 55°C (48°C for ESCs) for one hour. One tenth (3 μl) of each ligation was then PCR-amplified individually with primers against the A-key and P1-key primer tails. We used 26 cycles based on dilution series showing linear PCR amplification within that cycle range. The products from three to five PCR reactions were pooled before purifying the DNA on MinElute columns (Qiagen).
5C libraries were quantified by bioanalyser (Agilent) and diluted to 26 pmol (for Ion PGM Sequencing 200 Kit v2.0). We then used 1 μl of diluted 5C library for sequencing with an Ion PGM Sequencer. Samples were sequenced onto Ion 316 Chips following the Ion PGM Sequencing 200 Kit v2.0 protocols as recommended by the manufacturer (Life Technologies).
5C data analysis
Analysis of the 5C sequencing data was performed as described in Berlivet et al. (2013). The sequencing data were processed through a Torrent 5C data transformation pipeline on Galaxy (https://main.g2.bx.psu.edu/). Before normalising, interactions between adjacent fragments were removed owing to the high noise:signal ratio likely to occur here. Average read count values over 21 kb bins were calculated from the raw sequencing data and 5C data were further processed for visualisation. First, the matrices were normalised to sum up to 50,000 reads (excluding the first two diagonals of the matrix). Then adaptive coarsegraining of the matrices was performed to reduce noise using cooltools.numutils.adaptive_coarsegrain, with the three lowest coverage bins masked and a cutoff of ten reads. The level of coarsegraining for all ESC matrices was determined using the merged wild-type data to ensure identical bin sizes across conditions. For comparison of 5C matrices across conditions, we performed observed/expected normalisation by dividing each diagonal of the matrix by its mean. For the high-resolution, zoomed-in (15 kb) heatmaps in Fig. 2 and Fig. S2, raw data were used, with comparison zoomed-in heatmaps normalised to the total read count of the compared limb anterior and posterior tissue samples. All 5C heatmaps in the figures contain the summed read counts of at least two biological replicates, apart from E11.5 embryos in Fig. 1; each individual replicate is shown in the supplementary figures associated with the main figures. The number of total reads and of used reads is provided for each experiment in Table S10.
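The total-read normalisation and the observed/expected step described above can be sketched in a few lines. This is a simplified illustration, not the code used for the figures: it assumes a square, symmetric binned matrix held as nested lists with no masked bins, and it assumes that the whole matrix is rescaled by a factor computed from the entries outside the first two diagonals. The function names are hypothetical.

```python
def normalise_total(matrix, target=50_000, skip_diagonals=2):
    """Rescale so entries with |i - j| >= skip_diagonals sum to target."""
    n = len(matrix)
    total = sum(
        matrix[i][j]
        for i in range(n)
        for j in range(n)
        if abs(i - j) >= skip_diagonals
    )
    scale = target / total
    return [[value * scale for value in row] for row in matrix]


def observed_over_expected(matrix):
    """Divide each diagonal of a symmetric matrix by that diagonal's mean."""
    n = len(matrix)
    out = [[0.0] * n for _ in range(n)]
    for d in range(n):  # d is the offset of the diagonal from the main one
        values = [matrix[i][i + d] for i in range(n - d)]
        expected = sum(values) / len(values)
        for i in range(n - d):
            ratio = matrix[i][i + d] / expected if expected > 0 else 0.0
            out[i][i + d] = ratio
            out[i + d][i] = ratio  # mirror to keep the matrix symmetric
    return out


# Hypothetical 4-bin contact matrix
toy = [[0, 1, 2, 4], [1, 0, 3, 2], [2, 3, 0, 2], [4, 2, 2, 0]]
scaled = normalise_total(toy)
oe = observed_over_expected(toy)
```

Dividing by the per-diagonal mean removes the average decay of contact frequency with genomic separation, so that values above 1 mark contacts that are more frequent than expected at that distance.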
For insulation score analysis, we used raw 5C data without coarsening and applied cooltools.numutils._insul_diamond_dense to it with window=25 and without normalisation by median. The curves were further smoothed using the LOWESS implementation from statsmodels.nonparametric.smoothers_lowess.lowess with frac=0.2, and plotted after inversion, because in the raw insulation score valleys correspond to peaks of insulation, and peaks are easier to interpret visually. cooltools.lib.peaks.peakdet was used to determine the location of peaks in the inverted smoothed data with a prominence of at least 0.2, and these are shown below the plots.
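The idea behind a diamond insulation score can be illustrated with a simplified sketch. This is not the cooltools implementation: it assumes a square matrix held as nested lists, averages the contacts between the bins upstream and downstream of each position, and omits the LOWESS smoothing and the prominence filter. The function names and the toy matrix are hypothetical.

```python
import math


def diamond_insulation(matrix, window):
    """Mean contact frequency across each bin within a sliding window.

    For bin i, averages matrix[a][b] with a in [i - window, i) and
    b in (i, i + window]. Low values indicate strong insulation.
    Bins with no upstream or no downstream neighbour give NaN.
    """
    n = len(matrix)
    scores = []
    for i in range(n):
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        vals = [matrix[a][b] for a in range(lo, i) for b in range(i + 1, hi)]
        scores.append(sum(vals) / len(vals) if vals else math.nan)
    return scores


def strongest_boundary(scores):
    """Index of the lowest valid score, i.e. the peak after inversion."""
    valid = [(s, i) for i, s in enumerate(scores) if not math.isnan(s)]
    return min(valid)[1]


# Toy matrix: two 4-bin domains, 10 contacts within and 1 between domains
toy = [[10 if (i < 4) == (j < 4) else 1 for j in range(8)] for i in range(8)]
scores = diamond_insulation(toy, window=2)
inverted = [-s for s in scores]  # boundaries become peaks, as in the plots
boundary = strongest_boundary(scores)
```

In the toy matrix the score drops to its minimum at the junction between the two domains (bins 3 and 4), which becomes a peak once the curve is inverted.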
We thank the staff of the Institute of Genetics and Molecular Medicine advanced imaging resource and technical services for their assistance with imaging and sequencing. We also thank Lorraine Rose, Kyle Davies and the staff at the Biomedical Research Facility/Evans Building for expert technical assistance. We thank Maxim Imakaev for help with coarsegraining.
Conceptualization: I.W., R.E.H., W.A.B., L.A.L.; Methodology: I.W., L.K., P.S.D., E.A., F.K., R.E.H., L.A.L.; Validation: I.W., L.A.L.; Formal analysis: I.W., L.K., I.M.F., L.A.L.; Investigation: I.W., L.K., P.S.D., E.A., F.K., L.A.L.; Data curation: I.W., L.K., L.A.L.; Writing - original draft: I.W., L.K., L.A.L.; Writing - review & editing: I.W., L.K., I.M.F., R.E.H., W.A.B., L.A.L.; Visualization: I.W., L.K., L.A.L.; Supervision: I.W., R.E.H., W.A.B., L.A.L.; Funding acquisition: R.E.H., W.A.B.
L.K. and E.A. were funded by PhD studentships from the UK Medical Research Council (MRC). I.M.F. is funded by a PhD studentship from The Darwin Trust of Edinburgh. Work in the W.A.B. lab is funded by an MRC University Unit grant (MC_UU_00007/2) and in the R.E.H. lab by an MRC University Unit grant (MM_UU_00007/8).
5C datasets have been deposited in GEO under accession number GSE135840.
The authors declare no competing or financial interests.