The Evf2 long non-coding RNA directs Dlx5/6 ultraconserved enhancer(UCE)-intrachromosomal interactions, regulating genes across a 27 Mb region on chromosome 6 in mouse developing forebrain. Here, we show that Evf2 long-range gene repression occurs through multi-step mechanisms involving the transcription factor Sox2. Evf2 directly interacts with Sox2, antagonizing Sox2 activation of Dlx5/6UCE, and recruits Sox2 to the Dlx5/6eii shadow enhancer and key Dlx5/6UCE interaction sites. Sox2 directly interacts with Dlx1 and Smarca4, as part of the Evf2 ribonucleoprotein complex, forming spherical subnuclear domains (protein pools, PPs). Evf2 targets Sox2 PPs to one long-range repressed target gene (Rbm28), at the expense of another (Akr1b8). Evf2 and Sox2 shift Dlx5/6UCE interactions towards Rbm28, linking Evf2/Sox2 co-regulated topological control and gene repression. We propose a model that distinguishes Evf2 gene repression mechanisms at Rbm28 (Dlx5/6UCE position) and Akr1b8 (limited Sox2 availability). Genome-wide control of RNPs (Sox2, Dlx and Smarca4) shows that co-recruitment influences Sox2 DNA binding. Together, these data suggest that Evf2 organizes a Sox2 PP subnuclear domain and, through Sox2-RNP sequestration and recruitment, regulates chromosome 6 long-range UCE targeting and activity with genome-wide consequences.
Ultraconserved elements (UCEs) were identified as 200 bp (or greater) segments of 100% DNA conservation between humans, mice and rats, many associated with key developmental regulators (Bejerano et al., 2004; Sandelin et al., 2004; Woolfe et al., 2005). Removal of a select few UCEs in mice initially suggested that UCEs are dispensable (Ahituv et al., 2007). However, removal of UCE sequences near developmental regulators Arx, Gli and Shox2 causes neurological and growth defects (Dickel et al., 2018; Osterwalder et al., 2018), and limb defects (Nolte et al., 2014), revealing specific developmental roles. The discovery that UCE sequences are transcribed, and that UCE transcripts have enhancer-regulating activity (Calin et al., 2007; Feng et al., 2006), was followed by the identification, on a genome-wide scale, of enhancer transcripts with enhancer-like activities (Ørom et al., 2010; Ørom and Shiekhattar, 2011). Together, these data support mechanistic and functional diversity of RNA regulatory roles (Rinn and Chang, 2020).
Our studies on the Evf2 ultraconserved enhancer lncRNA (Dlx5/6UCE-lncRNA, overlapping with Dlx6OS1) support complex RNA regulatory roles for UCE sequences during embryonic forebrain development, specifically at sites of GABAergic interneuron birth in E13.5 mouse ganglionic eminences (E13.5 GEs) (Berghoff et al., 2013; Bond et al., 2009; Cajigas et al., 2018, 2015; Feng et al., 2006). Evf2 is a 3.7 kb, spliced and polyadenylated lncRNA, containing Dlx5/6UCE sequences responsible for enhancer-regulating activities (Feng et al., 2006). Evf2 controls a mouse embryonic brain interneuron gene regulatory network (GRN), adult hippocampal and cortical circuitry, and seizure susceptibility (Bond et al., 2009; Cajigas et al., 2018; Feng et al., 2006). Evf2 positively and negatively regulates gene expression through cis (same chromosome as Evf2 expression site) and trans (different chromosome) mechanisms (Berghoff et al., 2013; Cajigas et al., 2018). Mechanisms of Evf2 gene activation and repression are distinguished by different regional requirements of the RNA, with the 5′-UCE containing region of the lncRNA controlling gene repression and the 3′ end controlling gene activation (Cajigas et al., 2018).
Evf2 RNA cloud formation is similar to lncRNAs that regulate dosage compensation (Xist; Brockdorff et al., 1992; Brown et al., 1992) and imprinting (Kcnq1ot1; Pandey et al., 2008; Redrup et al., 2009). Evf2 assembles a ribonucleoprotein complex (RNP87) containing at least 87 functionally diverse proteins, including transcription factors (TFs; Dlx1 and Sox2), chromatin remodelers (Smarca4, Smarcc2 and Smarcb1), regulators of chromosome topology (Smc1a and Smc3) and lamin B1 (Cajigas et al., 2015) (Fig. 1A). RNP87 was previously identified by comparing proteomic profiles from anti-DLX, affinity-purified Evf2+/+ and Evf2TS/TS E13.5GE complexes, showing that the number of Dlx-associated proteins is 87 in the presence of Evf2, and 15 in the absence of Evf2. A description of the site of transcription stop (TS) insertion that generates mice lacking Evf2 (Evf2TS/TS) is shown in Fig. S1. Evf2-dependent gene regulation across a 27 Mb region of mouse chromosome 6 (chr6) is characterized by recruitment of individual RNPs and regulation of histone modifications at key DNA regulatory sites, including the Dlx5/6UCE (Cajigas et al., 2018).
The identification of the TF Sox2, as a component of Evf2-RNP87 (Cajigas et al., 2015), raised questions about its role in this lncRNA-mediated gene regulation. Sox2 is a well-characterized pioneer TF (Dodonova et al., 2020) that maintains pluripotency through lineage-specific gene repression (Avilion et al., 2003; Takahashi and Yamanaka, 2006). Sox2 associates with lncRNAs involved in pluripotency and neuronal differentiation (Guo et al., 2018; Ng et al., 2013, 2012), and binds both DNA and RNA through its high mobility group domain (HMG) (Holmes et al., 2020). Crystal structures of HMG-POU-DNA ternary complexes support Sox2 multivalency and concentration-dependent enhancer regulation (Remenyi et al., 2003, 2004; Williams et al., 2004). In this report, we show that Evf2 regulates Dlx5/6UCE targeting and activity through mechanisms involving the Evf2-RNP Sox2 complex, revealing multi-step contributions of Sox2 TF-RNA interactions. Sox2 colocalizes with Evf2 RNA clouds in subnuclear domains that we have termed protein pools (PPs), detectable both in the presence and absence of Evf2 RNA. Evf2 controls Sox2 PP targeting and sizes at repressed genes Rbm28 and Akr1b8, and recruits Sox2 to key DNA regulatory sites, including Dlx5/6 intergenic enhancers and enhancer-chromosome interaction sites. At the genome-wide level, Evf2 co-recruitment of Sox2 with the RNPs Smarca4 and Dlx affects Sox2-DNA recognition. We propose that the Evf2 lncRNA functions as a Sox2 subnuclear domain organizer, controlling Dlx5/6UCE targeting and activity by distributing Sox2 and the associated RNPs Smarca4 and Dlx to key DNA regulatory sites on chr6, with genome-wide effects.
Evf2 gene repression through Sox2 antagonism
Evf2 activates and represses genes across a 27 Mb region on mouse chromosome 6, raising questions regarding the mechanistic basis for Evf2-dependent differential gene regulation (Cajigas et al., 2018). The Evf2 RNA cloud is a scaffold for the assembly of the Evf2-RNP (Fig. 1A) (Cajigas et al., 2015). Evf2 directly binds chromatin remodelers Smarca4 and Smarcc2/1 through promiscuous RNA-protein interactions, and indirectly to the Dlx homeodomain TF. Previous work showed that Smarca4 bridges the Evf2 RNA with the protein Dlx1, and other RNA-binding proteins within the RNP (Cajigas et al., 2015). In order to investigate the role of individual Evf2-RNP87 proteins in gene regulation, we further studied the role of the pioneer TF Sox2 in Evf2-regulated gene expression. First, we studied Sox2 interactions with Evf2-RNP components. In the absence of Evf2, there is a ∼25% decrease in total Sox2 protein levels (Fig. S1A,B) and a ∼50% increase in Sox2 RNA (Fig. S1C), supporting the involvement of both transcriptional and post-transcriptional control mechanisms. Sox2 directly binds Dlx1 (Fig. 1B) and Smarca4 (Fig. 1C), supporting multiple protein partners within the Evf2-RNP. We next used ChIP-reChIP to show that Sox2 and Dlx simultaneously bind Dlx5/6UCE (Fig. 1D). RNA electrophoretic mobility shift assays (REMSAs) show that Evf2 RNA binding to Sox2 has low sequence specificity (Fig. 1E), requiring Sox2 amino acids 41-109 [containing the high mobility group (HMG) DNA-binding domain and N-terminal nuclear localization signal (NLS); Fig. 1F]. These data are consistent with a recent report showing the requirement for the HMG domain in high-affinity/low-specificity Sox2-ES2 lncRNA interactions (Holmes et al., 2020). Together, these data support that Sox2 is similar to Smarca4, forming multivalent interactions and potentially functioning as a protein bridge between non-RNA binding proteins in the Evf2-RNP and Evf2 RNA (Fig. 1A).
In a previous report, we showed that Evf2 represses adjacent genes Dlx6 and Dlx5, and long-range target genes Rbm28 and Akr1b8, and activates long-range target genes Umad1 and Lsm8 on mouse chr6 (Cajigas et al., 2018). In order to determine whether Sox2 contributes to Evf2-dependent gene activation or repression, we analyzed gene expression in E13.5 GEs (mouse embryonic GABAergic interneuron progenitors) from Sox2fl/fl;Dlx5/6cre+, a genetic model in which Dlx5/6cre (Monory et al., 2006) removes floxed Sox2 (Shaham et al., 2009) from Dlx5/6+ GABAergic progenitors. Dlx5/6cre-mediated removal of Sox2 in E13.5GE decreases the expression of Evf2-repressed target genes (Dlx6, Dlx5, Rbm28 and Akr1b8), but does not affect Evf2-activated target gene expression (Umad1 and Lsm8) (Fig. 2A). Loss of one copy of Evf2 from Sox2fl/fl;Dlx5/6cre+ E13.5 GEs (Evf2TS/+;Sox2fl/fl;Dlx5/6cre) rescues the effects of Sox2 loss on repressed target genes (Fig. 2A). Evf2 transcripts resulting from Evf2TS insertion have been previously reported (Bond et al., 2009) and are schematized in Fig. S1A. Sox2 expression persists in Sox2+/Dlx5/6− subpopulations of Sox2fl/fl;Dlx5/6cre+ GEs, at ∼40% of wild-type levels (Fig. 2A). Despite this heterogeneity, gene expression effects are detectable. Furthermore, Dlx5/6UCE-luciferase reporter assays show that Evf2 antagonizes Sox2 activation of Dlx5/6UCE activity in E13.5 GEs (Fig. 2B), supporting a mechanism of Evf2-Sox2 antagonism during gene repression.
Evf2-5′ end-mediated regulation of Sox2 binding to the Dlx5/6eii shadow enhancer
In order to determine whether Evf2/Sox2-mediated regulation of Dlx5/6UCE activity involves Sox2 recruitment to the Dlx5/6 intergenic enhancers, we used the native ChIPseq method CUT&RUN (Fig. 2C) (Meers et al., 2019a,c; Skene and Henikoff, 2017). In the CUT&RUN method, sequencing of <120 bp and >150 bp fragments distinguishes between proteins directly bound to DNA (less than 120 bp) and indirect binding through protein-protein interactions (more than 150 bp) (Meers et al., 2019a,b,c). Analysis of >150 bp CUT&RUN peaks has the potential to detect proteins associated with the large Evf2-RNP, which was not previously possible using crosslinked ChIPseq (X-ChIP) methods. CUT&RUN analysis shows that whereas Evf2 recruits Evf2-RNPs Dlx and Smarca4 to Dlx5/6UCE, Sox2 binding to Dlx5/6UCE is Evf2 independent (Fig. 2C). However, Evf2 recruits Sox2 to Dlx5/6eii, a shadow enhancer (Furlong and Levine, 2018; Zerucha et al., 2000) located adjacent to Dlx5/6UCE and regulated by both Dlx and Evf2 in trans assays, similarly to Dlx5/6UCE (Feng et al., 2006; Zerucha et al., 2000). The significance of Evf2-mediated Sox2-Dlx5/6eii recruitment with respect to gene repression is supported by rescue in Evf1TS/TS, a genetic model in which Evf2 repression is also rescued (Cajigas et al., 2018) (Fig. 2C). In Evf1TS/TS, the transcription stop sequence is inserted into exon 3, preventing expression of Evf1 (and also the Evf2-3′ region), but producing a truncated Evf2-5′ transcript (Fig. S1A). Loss of Dlx and Smarca4 binding to Dlx5/6UCE in Evf2TS/TS (expressing only Evf1-3′, overlapping with the Evf2-3′ region) is not rescued in Evf1TS/TS (expressing only Evf2-5′); transcripts resulting from Evf1TS and Evf2TS insertion are schematized in Fig. 2D and Fig. S1A. These data suggest that Evf2-5′ and -3′ regions are required for Dlx and Smarca4 recruitment to Dlx5/6UCE, linking these events to gene activation. 
Furthermore, a role for Evf2-5′-regulated Sox2 binding to the Dlx5/6eii shadow enhancer supports a functional role for Sox2 recruitment in gene repression, building on previous work showing that the Evf2-5′ region is sufficient for repression (Cajigas et al., 2018). Thus, Evf2-5′ and 3′ differentially contribute to recruitment in a site-specific and RNP-dependent manner, linking individual recruitment events to gene repression and activation.
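The fragment-size logic used to interpret CUT&RUN data above can be illustrated with a minimal Python sketch. It assumes only the two cutoffs stated in the text (<120 bp for direct DNA binding, >150 bp for indirect binding); the function names, the class labels and the toy fragment lengths are hypothetical and are not part of the published analysis pipeline.

```python
from collections import Counter

# Size cutoffs taken from the text; everything else here is illustrative.
DIRECT_MAX_BP = 120    # fragments shorter than this: direct DNA binding
INDIRECT_MIN_BP = 150  # fragments longer than this: indirect binding


def classify_fragment(length_bp: int) -> str:
    """Assign a CUT&RUN fragment to a binding class by its length."""
    if length_bp < DIRECT_MAX_BP:
        return "direct"
    if length_bp > INDIRECT_MIN_BP:
        return "indirect"
    # Lengths from 120 to 150 bp fall in neither class.
    return "unassigned"


def summarize(lengths):
    """Count how many fragments fall in each binding class."""
    return Counter(classify_fragment(n) for n in lengths)


# Hypothetical fragment lengths (bp), for illustration only.
toy_lengths = [45, 88, 119, 120, 135, 150, 151, 210, 320]
print(summarize(toy_lengths))
```

Fragments between the two cutoffs are deliberately left unassigned, mirroring the gap between the <120 bp and >150 bp classes.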
Linking Evf2 and Sox2-regulated Dlx5/6UCE-chr6 targeting, and RNP recruitment
Previous work showed that Evf2-regulated Dlx5/6UCE targeting near long-range gene targets involves cohesin binding, specifically Smc1a and Smc3 (Cajigas et al., 2018), raising the possibility of roles of additional Evf2-RNPs in topological control. We performed ChIPseq using crosslinked E13.5GE chromatin to analyze Evf2-regulated Sox2 and Smc3 binding across chr6 (Fig. S1E). Analysis of overlapping regulatory sites identifies antagonistic sites of Evf2 positively (+) regulated Sox2 binding and Evf2 negatively (−) regulated Smc3 binding (Fig. S1E). In order to explore the possibility that Sox2 contributes to Dlx5/6UCE targeting, we used chromosome conformation capture (4Cseq) to compare Dlx5/6UCE interaction (Dlx5/6UCEin) profiles across chr6 in the presence (Sox2fl/fl;Dlx5/6cre-) and absence (Sox2fl/fl;Dlx5/6cre+) of Sox2 in E13.5GE GABAergic progenitors. A subset of Evf2-regulated Dlx5/6UCEins across chr6 originally reported by Cajigas et al. (2018) overlaps with Sox2-regulated Dlx5/6UCEins (Fig. 3), including the Rbm28-5′ Dlx5/6UCEin (Fig. 3A). CUT&RUN analysis compares Evf2-5′ (rescued in Evf1TS/TS) and -3′ (lost in Evf2TS/TS, not rescued in Evf1TS/TS) -mediated recruitment of Sox2 and Smarca4 at Rbm28-5′ and Akr1b8-3′ Dlx5/6UCEins (Fig. 3A). The complete Sox2- and Evf2-regulated 4Cseq-Dlx5/6UCEin counts across chr6 are shown in Fig. S2. Complete CUT&RUN profiles for Rbm28-5′ are shown in Fig. S3A.
Sox2 binding at the Rbm28-5′-Dlx5/6UCEin is rescued in Evf1TS/TS, linking Sox2 recruitment to gene repression. In contrast, Sox2 does not regulate Dlx5/6UCE interaction 3′ of the long-range repressed target gene Akr1b8 [Fig. 3A, green arrow at Akr1b10 (Akr1b8-3′-Dlx5/6UCEin), complete profiles shown in Fig. S3B]. In contrast to the Rbm28-5′-Dlx5/6UCEin, Sox2 binding to the Akr1b8-3′-Dlx5/6UCEin is not detected in Evf2+/+ or Evf2TS/TS, and a small peak is detected only when Evf2 is truncated (in Evf1TS/TS). While Evf2 recruits Smarca4 to the Akr1b8-3′-Dlx5/6UCEin, Smarca4 recruitment is not rescued in Evf1TS/TS, decoupling Evf2-RNP recruitment at the Akr1b8-3′-Dlx5/6UCEin from gene repression. Together, these data support the theory that distinct Evf2-Sox2 mechanisms contribute to long-range repression of target genes Rbm28 and Akr1b8, with Evf2-Sox2 interactions at the Rbm28-5′-Dlx5/6UCEin, but not the Akr1b8-3′-Dlx5/6UCEin, contributing to long-range repression.
Across chr6, Evf2 and Sox2 co-regulate Dlx5/6UCEins near specific genes, as shown in graphs (Fig. 3B-D) and categorized into synergistic positive (green +/+, Fig. 3E), synergistic negative (red −/−) and antagonistic sites (red/green +/− and −/+) (Fig. 3F). Dlx5/6UCEins found in both Evf2+/+ and Evf2TS/TS are categorized as independent sites (I, Fig. 3E,F). Evf2-Sox2 co-regulated Dlx5/6UCEins frequently overlap with combinations of Evf2-regulated histone marks and RNP binding (Sox2, Dlx and Smarca4; Figs S3 and S4). For example, at the Evf2-Sox2 positively regulated Ing3-Dlx5/6UCEin, Evf2 regulates H3K4me3, H3K27Ac and Evf2-RNP recruitment (Smc1a, Sox2, Dlx and Smarca4) (Fig. S3C). Similar to profiles at the Rbm28-5′ Dlx5/6UCEin, Sox2 loss in Evf2TS/TS is rescued in Evf1TS/TS at additional Dlx5/6UCEins, including Ing3, Ezh2 and Rmnd5a (Figs S3C and S4A), supporting a role for the Evf2-5′ region in RNP recruitment at these sites. However, at the Evf2-Sox2 co-regulated Umad1 and Cc8b1 Dlx5/6UCEins, Sox2 recruitment is not rescued in Evf1TS/TS (Figs S3D and S4A), suggesting that the Evf2-5′ is not sufficient for recruitment at these sites.
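The site categories described above (synergistic positive, synergistic negative, antagonistic and independent) can be expressed as a small decision rule. The Python sketch below is illustrative only: encoding each regulatory effect as +1, −1 or 0, treating an Evf2 effect of 0 as an independent site, and the extra "Evf2-regulated only" label are assumptions made for this example, and the site names are hypothetical.

```python
def categorize_site(evf2_effect: int, sox2_effect: int) -> str:
    """Label a Dlx5/6UCEin by the direction of Evf2 and Sox2 regulation.

    Each effect is +1 (positively regulated), -1 (negatively regulated)
    or 0 (no detectable change). This integer encoding is an assumption
    made for illustration.
    """
    for value in (evf2_effect, sox2_effect):
        if value not in (-1, 0, 1):
            raise ValueError("effects must be -1, 0 or +1")
    if evf2_effect == 0:
        # Present in both Evf2+/+ and Evf2TS/TS: Evf2 independent.
        return "independent"
    if sox2_effect == 0:
        # Label not used in the text; added so every input has an answer.
        return "Evf2-regulated only"
    if evf2_effect == sox2_effect:
        return "synergistic positive" if evf2_effect > 0 else "synergistic negative"
    return "antagonistic"


# Hypothetical site names and effect signs, for illustration only.
toy_sites = {
    "site_a": (1, 1),
    "site_b": (-1, -1),
    "site_c": (1, -1),
    "site_d": (0, 1),
}
labels = {name: categorize_site(*effects) for name, effects in toy_sites.items()}
print(labels)
```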
In order to determine the relationships between Evf2-regulated RNP binding (Sox2, Dlx and Smarca4), and Evf2 (+) and (−) regulated Dlx5/6UCEins across chr6, we combined RNP CUT&RUN results and Dlx5/6UCE-4Cseq data from Evf2+/+ and Evf2TS/TS E13.5 GEs (Cajigas et al., 2018; Fig. 4A). Evf2 increases Sox2 binding [120 bp and 150 bp fragments at both Evf2 (+) and (−) regulated Dlx5/6UCEins (Fig. 4A, Fig. S4B,C)]. Dlx recruitment differs from Sox2 and Smarca4, as Evf2 decreases Dlx binding overall, with significant differences at Evf2 (−), but not at Evf2 (+) regulated Dlx5/6UCEins (Fig. 4A). Consistent with site-specific analysis, Evf1TS/TS rescues a subset of the overall effects, distinguishing roles of the Evf2-5′ and -3′ regions in RNP recruitment at Dlx5/6UCEins (Fig. S4C).
We next asked whether Evf2 co-regulated RNP recruitment occurs at Dlx5/6UCEins and/or at the genome-wide level. Evf2 co-regulated Sox2/Dlx/Smarca4 (RNPco) binding sites overlap Dlx5/6UCEins (150 bp CUT&RUN fragments, Venn diagram in Fig. 4B). Evf2 (+) regulates RNPco binding at 22 Dlx5/6UCEins, where 14/22 coincide with Evf2 (+) regulated Dlx5/6UCEins, and 8/22 coincide with Evf2 (−) regulated Dlx5/6UCEins. Evf2 (−) regulates RNPco binding at fewer Dlx5/6UCEins (5), where 3/5 coincide with Evf2 (+) regulated Dlx5/6UCEins and 2/5 coincide with Evf2 (−) regulated Dlx5/6UCEins. Although the significance of chromosome-wide effects remains to be determined, correlations between Evf2-RNPco sites and Evf2-regulated Dlx5/6UCEins raise the possibility of a wider role for Evf2-Sox2 and RNP co-recruitment in topological control.
We next analyzed enrichment of Sox2 DNA motifs at Evf2-regulated Sox2 peaks (Fig. 4B). At Evf2 (+)-regulated Sox2 peaks, 38% of 120 bp fragments contain Sox2 motifs, when compared with 8% of 150 bp fragments, while at Evf2 (−)-regulated Sox2 peaks, the numbers are 43% for 120 bp and 7% for 150 bp. These data are consistent with CUT&RUN analysis of Sox2-binding profiles where Sox2 120 bp fragments are motif enriched, while 150 bp fragments are motif depleted, and reflect nucleosomal binding (Meers et al., 2019b). Comparison of Sox2 motif enrichment at Evf2-regulated RNPco sites compared with Sox2 singly bound sites shows reductions at both Evf2 (+)-regulated and Evf2 (−)-regulated sites (∼3-fold and ∼4.4-fold, respectively) (Fig. 4B, 150 bp analysis). Even greater differences are identified in the 120 bp analysis, where Sox2 motifs are not detected at Evf2 RNPco sites, but are detected in 38% Evf2 (+)-regulated and 43% Evf2 (−)-regulated Sox2 singly bound sites. Together, these findings suggest that co-recruitment of Sox2 with Dlx and Smarca4 decreases genome-wide Sox2-DNA motif binding.
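The comparison of motif content between singly bound and co-bound (RNPco) sites reduces to a ratio of two fractions. The following Python sketch shows one way to compute it, including the case in which no motifs are detected at co-bound sites (as reported for the 120 bp analysis). The function names and the toy peak counts are hypothetical; the counts are not taken from Fig. 4B.

```python
import math


def motif_fraction(peaks_with_motif: int, total_peaks: int) -> float:
    """Fraction of peaks that contain at least one motif."""
    if total_peaks <= 0:
        raise ValueError("total_peaks must be positive")
    return peaks_with_motif / total_peaks


def fold_reduction(single_fraction: float, co_fraction: float) -> float:
    """Fold reduction in motif content at co-bound versus singly bound sites.

    Returns infinity when motifs are present at singly bound sites but
    absent from co-bound sites, and NaN when both fractions are zero.
    """
    if co_fraction == 0:
        return math.inf if single_fraction > 0 else math.nan
    return single_fraction / co_fraction


# Hypothetical peak counts, for illustration only.
single = motif_fraction(50, 100)  # singly bound Sox2 sites
co = motif_fraction(25, 100)      # RNPco sites
print(fold_reduction(single, co))
print(fold_reduction(single, 0.0))
```

Returning infinity rather than raising an error keeps the undetected-motif case distinguishable from a finite fold change in downstream summaries.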
Evf2 regulates the Sox2 RNP protein pool targeting to Dlx5/6 and repressed genes
We previously found that the Evf2-5′ expressed in Evf1TS/TS is sufficient for both RNA cloud formation and colocalization with Dlx5/6UCE (Cajigas et al., 2018). Here, we find that Evf1-3′ RNA (overlapping Evf2-3′) continues to be expressed in E13.5GE Evf2TS/TS nuclei, forming clouds that are properly targeted to repressed genes (Akr1b8 and Rbm28) (Fig. 5A-D). These data suggest that Evf2-5′ and Evf1-3′ RNA clouds localize to regulated target genes, but that RNA cloud targeting is not sufficient for gene activation or repression. As Evf1-3′ lacks the UCE containing Evf2-5′, but overlaps with the Evf2-3′, Evf2TS/TS and Evf1TS/TS provide ideal models to compare Evf2-5′ and -3′ end functions in vivo.
Confocal microscopy in wild-type E13.5 GE nuclei previously showed that Evf2-RNPs are enriched within Evf2 RNA clouds [Dlx (Feng et al., 2006), Smarca4 (Cajigas et al., 2015) and Smc1a (Cajigas et al., 2018)]. Therefore, we used RNA fluorescent in situ hybridization combined with immunofluorescence to investigate whether Evf2 affects Sox2 localization. Unlike diffusely distributed Smarca4 protein enriched in Evf2-RNA clouds (Cajigas et al., 2015), Sox2 forms heterogeneously sized nuclear condensates in E13.5GE nuclei (Fig. 5A,G). Here, we name Evf2-RNP spherical subnuclear domains as protein pools (PPs), to distinguish these from spherical Evf2 RNA subnuclear ‘clouds', a term originally used to describe nuclear domains formed by Kcnq1ot1 lncRNA (Redrup et al., 2009). Sox2 PPs colocalize with Evf2 RNA clouds and with Evf2-repressed target genes Akr1b8 and Rbm28 (Fig. 5A, additional examples in Fig. S5). Examples of monoallelic, biallelic and non-colocalized Sox2 PPs-Evf2 RNA clouds are detected (Fig. S5). Sox2 PP colocalization with Dlx5/6 and repressed target genes is shown in Fig. 5G, with additional examples in Fig. S6.
Colocalization and size analysis of Sox2 PPs and Evf2 RNA clouds using IMARIS software after 3D reconstruction indicates that a subset of Sox2 PPs colocalizes with Evf2-5′ and -3′ RNA clouds (Fig. 5E,F). While the percentages of Evf2-5′ and -3′ RNA clouds colocalized with Sox2 PPs do not significantly change in Evf1TS/TS and Evf2TS/TS nuclei, the sizes of Evf2-5′ and -3′ RNA clouds are significantly larger when colocalized with Sox2 PPs or with Evf2-repressed target genes (Fig. 5F). Furthermore, Evf2-5′ RNA clouds colocalizing with Sox2 PPs and repressed target genes are larger than non-colocalized clouds (Fig. 5F).
Visualization of Sox2 PPs in Evf1TS/TS and Evf2TS/TS nuclei indicates that Sox2 PPs continue to be targeted to Rbm28, Akr1b8 and Dlx5/6 (Fig. 5G). However, in Evf2TS/TS the percentage of Sox2-PPs colocalized with Akr1b8 increases at the expense of Rbm28, a shift that is rescued in Evf1TS/TS (Fig. 5H). Total and non-colocalized Sox2-PPs are larger in Evf2TS/TS, and smaller in Evf1TS/TS nuclei (Fig. 5I), where non-colocalization is defined as non-overlapping with Rbm28, Akr1b8 or Dlx5/6, using IMARIS software parameters normalized according to Evf2+/+ values. Increased variability of Evf2TS/TS colocalized Sox2-PPs (Fig. 5I) led to binning Sox2-PPs into three size groups (<0.1 µm³, 0.1-1 µm³ and >1 µm³) for comparisons of size distributions at specific targets in Evf2+/+, Evf2TS/TS and Evf1TS/TS nuclei (Fig. 5J). Colocalized Sox2PP sizes are increased in Evf2TS/TS (red arrows) and decreased in Evf1TS/TS (green arrows), with larger effects observed at Akr1b8 (alone or complexed with Rbm28 and Dlx5/6) (Fig. 5J). At Dlx5/6, the percentage of Evf2TS/TS Sox2-PPs with a volume greater than 1 µm³ increases at the expense of 0.1-1 µm³ volumes (Fig. 5J, red arrow). This shift in Sox2PP sizes is rescued in Evf1TS/TS, supporting the significance of Sox2PP size regulation at Dlx5/6UCE during gene repression. Together, these data support the observations that Evf2-5′ and -3′ differentially regulate Sox2-PP gene targeting and size distributions in a site-specific manner.
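The three-group size binning used for Sox2-PP volumes can be sketched as follows. This is a minimal Python illustration that assumes only the bin edges given in the text (<0.1 µm³, 0.1-1 µm³ and >1 µm³); the bin labels, function names and toy volumes are hypothetical, and volumes of exactly 0.1 or 1 µm³ are assigned to the middle group by assumption.

```python
def size_bin(volume_um3: float) -> str:
    """Assign a protein pool volume (in cubic micrometres) to a size group."""
    if volume_um3 < 0.1:
        return "small"   # <0.1 um^3
    if volume_um3 <= 1.0:
        return "medium"  # 0.1-1 um^3 (edges included by assumption)
    return "large"       # >1 um^3


def bin_percentages(volumes):
    """Percentage of protein pools in each size group."""
    counts = {"small": 0, "medium": 0, "large": 0}
    for volume in volumes:
        counts[size_bin(volume)] += 1
    total = len(volumes)
    if total == 0:
        raise ValueError("no volumes supplied")
    return {group: 100 * n / total for group, n in counts.items()}


# Hypothetical volumes (um^3), for illustration only.
toy_volumes = [0.05, 0.08, 0.1, 0.5, 1.0, 1.2, 2.5, 3.0]
print(bin_percentages(toy_volumes))
```

Comparing these percentage profiles between genotypes is what reveals a shift from one size group to another at a given target.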
We next used fluorescent in situ hybridization analysis of N-terminally tagged, mCherry-Sox2 (mch-Sox2) transfected into E13.5 GEs to determine whether ectopically expressed Sox2 associates with endogenous Evf2 RNA clouds and/or Dlx5/6. Transfected mch-Sox2 forms PPs that colocalize with endogenous Evf2 RNA clouds and/or Dlx5/6 (Fig. 6A), through a minimal region spanning the HMG nucleic acid binding domains and adjacent NLSs (orange) (Sox240-120, Fig. 6A-C). In Sox2 mutant 3 (Sox240-67, 98-317), crucial RNA/DNA-binding amino acids within the HMG domain are deleted: Δ66-97 deletes W79, K80, K87 and K95, defined as RNA and/or DNA binding (Holmes et al., 2020). However, Sox2 mut3 colocalization with endogenous Evf2 RNA clouds and/or Dlx5/6 increases, supporting the theory that nucleic acid binding is dispensable, while amino acids in the NLSs are crucial for PP formation and RNA/DNA localization. Future experiments that distinguish between Sox2 nuclear localization and RNA cloud/Dlx5/6 localization and define the role of individual Sox2-RNP interactions in localization will be important for understanding Evf2 RNP assembly and targeting in vivo.
Understanding lncRNA-dependent chromosome topological control requires mechanistic experiments that define individual contributions of lncRNA-RNPs. In addition to RNA cloud formation, Evf2 shares functional characteristics with Xist, one of the most well-studied lncRNAs (gene repression, chromatin remodeling effects and topological control) (Giorgetti et al., 2016; Jégu et al., 2019; Nora et al., 2012), and formation of a similar sized RNP (Xist-RNP85; Chu et al., 2015), where 16/85 proteins are shared with the Evf2-RNP87 (Cajigas et al., 2015). The Xist-RNP is among the most well-characterized lncRNA regulatory complexes, with validated functions on individual proteins (Chen et al., 2016; Chu et al., 2015; Dossin et al., 2020; Minajigi et al., 2015; Yi et al., 2020). Cohesin recruitment is a shared function between Evf2 (Cajigas et al., 2018), Xist (Minajigi et al., 2015) and ThymoD lncRNAs (Isoda et al., 2017), raising the possibility that topological control is a shared function of cloud-forming lncRNAs. Given that components of the Evf2-RNP interact with other lncRNAs, the study of gene regulation by the Evf2-RNP provides important insight into general mechanisms of gene regulation by lncRNAs, as well as potentially unique characteristics in developing interneurons.
Multi-step Evf2-Sox2 interactions support distinct gene repression mechanisms at long-range target genes
Although Sox2 transcriptional activities have been extensively characterized, new roles have emerged for Sox2 RNA-binding activities. The Sox2 HMG region contains both DNA- and RNA-binding domains (Holmes et al., 2020), which are also necessary for direct Evf2 binding (Fig. 1F). Together with genetic epistasis experiments and luciferase reporter assays (Fig. 2A,B), these experiments support the observation that Evf2 gene repression occurs through direct antagonism in which Evf2 lncRNA binds to Sox2, reduces the binding of Sox2 to DNA regulatory elements and decreases enhancer activity.
However, evidence in this work also demonstrates a role for Sox2-Evf2 lncRNA interactions in regulating enhancer targeting and Evf2-Sox2 protein recruitment, in events that likely precede direct effects on enhancer activity (models in Fig. 6D-F). We propose that key to this multi-step model of regulation is the formation of the Evf2-RNP, which is assembled on both lncRNA and protein scaffolds. lncRNA scaffolds can bridge individual RNA-binding proteins through low sequence-specific and/or promiscuous RNA-binding properties, as shown in this report for Sox2 (Evf2-Sox2; Fig. 1E) and previous work for chromatin remodelers [Evf2-Smarca4, Evf2-Smarcc2 and Evf2-Smarcc1 (Cajigas et al., 2015)] (Fig. 1A). Multivalent RNA-binding proteins Sox2 (Fig. 1B) and Smarca4 (Cajigas et al., 2015) can bridge non-RNA binding RNPs (Dlx, Smarcb1) to the lncRNA. Low sequence-specific RNA-binding explains the ability of the lncRNA to act as an organizer, with ‘glue' like properties, controlling the availability of specific RNPs in a site-specific manner. In such a model, DNA site specificity is determined by enhancer transcription and sequence as follows: (1) enhancer transcription produces an lncRNA that is retained and provides a scaffold where the RNP grows; and (2) enhancer sequences recruit and stabilize TF binding (Sox2 and Dlx).
Evf2 RNA clouds (yellow circles) and Evf2-RNP-Sox2 PP (green stars) sizes are larger when colocalized with Dlx5/6UCE and/or specific DNA target genes than non-localized (Fig. 6D), supporting the idea that the Evf2-RNP grows at key DNA regulatory sites. As revealed by analysis of Evf2 mutants, Evf2 shifts Sox2 PPs from Akr1b8 towards Rbm28, and also limits Sox2PP size at Akr1b8, linking PP targeting and size regulation to functionality (gene repression).
Incorporation of Evf2-Sox2 regulated events (chromosome topology, TF recruitment, PP size and targeting, and genetic and biochemical data) leads to a multi-step model of gene repression (Fig. 6E, steps 1-6). Linear organization of Evf2 gene repression on chr6 shows relationships between the following: the Evf2 transcription site across Dlx5/6UCE(*); the Evf2 RNP assembly on Dlx5/6UCE containing the Evf2 RNA cloud (yellow) and Sox2PP (green); repressed target genes (red boxes); and the Evf2/Sox2 negatively regulated Dlx5/6UCE interaction site 5′ of Rbm28 (double red arrow) associated with gene repression. At Akr1b8, the Evf2-5′ (dotted red arrow) reduces the size of Evf2-Sox2 RNPs (step 1) and shifts Evf2-Sox2 RNPs towards Rbm28 (step 2). In this model, Evf2-5′ repression of Akr1b8 occurs by limiting the availability of Sox2 (an activator of Dlx5/6UCE) through RNP targeting and size regulation, thereby sequestering Sox2 PPs within the Evf2-RNP. Furthermore, Evf2-5′ (which is sufficient for Akr1b8 repression) properly balances the numbers of Evf2-Sox2RNPs at Akr1b8 and Rbm28 in Evf1TS/TS, linking rescue of Evf2-Sox2RNP targeting to rescue of gene repression.
In step 3, Evf2-5′ recruits Sox2 to the Rbm28-5′-Dlx5/6UCEin (double red arrow), shifting Rbm28-5′-Dlx5/6UCEin towards Rbm28-5′ (step 4) (double green arrow). One caveat to Evf2 gene repression through limiting Sox2 PP activator is that in Evf2TS/TS, Sox2-PPs targeted to Rbm28 (also a repressed target gene) decrease, while Rbm28 gene expression increases. One possibility is that the extent of repression is affected by a combination of Sox2 targeting and size regulation at the two repressed target genes: loss of Evf2 causes a ∼15-fold increase in Akr1b8, but a ∼2-fold increase in Rbm28 (Cajigas et al., 2018). Another possibility is that mechanisms of Evf2-Sox2 antagonism differ between Akr1b8 and Rbm28, as reflected by target gene dependence. This is supported by 4Cseq analysis showing that Sox2 shifts Dlx5/6UCE towards Rbm28-5′ (overlapping with Evf2-Dlx5/6UCE regulation), but does not affect Dlx5/6UCEin near Akr1b8. Site-specific Evf2-Sox2 synergistic and antagonistic regulation of Dlx5/6UCEins across chr6 further supports variable combinations of events during topological control.
In step 5, Evf2 recruits RNPs (Dlx and Smarca4) to Dlx5/6UCE and Sox2 to Dlx5/6eii (Fig. 2C), consistent with previous reports of Evf2-regulated recruitment (Bond et al., 2009; Cajigas et al., 2015). Although Dlx5/6eii lacks ultraconserved sequences, the Dlx5/6 intergenic enhancers are functionally similar: both are regulated by Dlx and Evf2 (Feng et al., 2006; Zerucha et al., 2000). Deletion of Dlx5/6eii in mice alters gene expression in developing and adult interneurons (Fazel Darbandi et al., 2016), supporting overlapping and distinct functions of the Dlx5/6UCE/Dlx5/6eii enhancer pair in vivo. A key distinguishing feature of Dlx5/6UCE is transcription into a spliced polyadenylated lncRNA (Evf2), whereas stable transcripts from Dlx5/6eii are not detected. With respect to Evf2-RNP binding, Sox2 binds both enhancers, while Dlx and Smarca4 binding is limited to Dlx5/6UCE. Evf2-RNP recruitment also differs: Evf2 regulates Sox2 binding to Dlx5/6eii, but not to Dlx5/6UCE, whereas Evf2 regulates Dlx and Smarca4 binding to Dlx5/6UCE, but not to Dlx5/6eii. Such differential recruitment is consistent with a ‘separation of inputs' hypothesis that enables shadow enhancer pairs to buffer noise better than duplicate enhancers (Waymack et al., 2020). Thus, Evf2 recruitment of Sox2 to Dlx5/6eii may reflect the need to precisely regulate levels of Sox2, beyond that necessary for factors that are recruited to Dlx5/6UCE, where on/off decisions are made.
In step 6, Evf2 decreases Dlx5/6UCE activity by directly binding to the Sox2-HMG DNA-binding domain, antagonizing activation. This model is supported by REMSAs (Fig. 1E,F) and luciferase reporter assays (Fig. 2B), and by competition between RNA and DNA at the Sox2 HMG domain (Holmes et al., 2020). The ability of Evf2 lncRNA to directly inhibit chromatin remodeling by inhibiting ATPase activity (Cajigas et al., 2015), and extension of this mechanism to X-inactivation (Jégu et al., 2019), adds an extra dimension to lncRNA-mediated transcriptional regulation that is not schematized in the model.
Together, these data suggest that Evf2 long-range gene repression involves multiple levels of Evf2-RNP regulation, including Sox2 PP targeting and size regulation, and site-specific recruitment and sequestration that ultimately affect both Dlx5/6UCE targeting and activity (shown in the predicted arrangement in Fig. 6F).
Predicting functional significance
Although the majority of this work focuses on Evf2-Sox2 differential gene repression of long-range target genes Rbm28 and Akr1b8, chr6-wide and genome-wide effects are also detected. Consistent with our previous reports of Evf2 topological and histone modification control (Cajigas et al., 2018), Evf2-regulated RNP-binding sites and Evf2-Sox2-regulated Dlx5/6UCEins are also detected outside the Evf2 27 Mb GRN region, revealing chromosome-wide effects. A subset of overlapping Evf2-Sox2-regulated Dlx5/6UCEins map across chr6, and are located within 50 kb of Evf2-RNP-recruited sites and histone modifications. Notably, analysis at Ing3 and Rmnd5a (Figs S3C and S4A) reveals the combinatorial nature of Evf2-RNP regulation. Interestingly, Sox2-negative regulation of Dlx5/6UCEin at the Ezh2-5′ promoter overlaps with Evf2 (+)-regulated Sox2 and Smarca4 and Evf2 (−)-regulated Dlx, suggesting topological control of Ezh2, which has been identified as a crucial component of the Polycomb group complex responsible for histone methylation and gene silencing (Cao et al., 2002) (Fig. S4A). Recently, Ezh2 inhibitors have been approved to treat cancer (Richart and Margueron, 2020), but also potentially increase seizure susceptibility in mice (Wang et al., 2020). Thus, it will be important to address whether Evf2-mediated regulation of RNPs in E13.5 GE is predictive and, specifically, whether Evf2-Sox2-controlled Dlx5/6UCEins near Ezh2-5′ contribute to the seizure susceptibility phenotype observed in adult mice lacking Evf2 (Cajigas et al., 2018).
In addition to Evf2-regulated RNP recruitment at Dlx5/6UCEins, genome-wide analysis identifies thousands of Evf2-regulated Sox2-binding sites, a subset that are co-regulated with Dlx and Smarca4 (RNPco, Fig. 4B). Evf2 recruits Sox2 to 714,312 sites, but also inhibits Sox2 binding to 677,303 sites, supporting a role in balancing Sox2 recruitment and sequestration on a genome-wide scale. Surprisingly, we find that Evf2 (+) and (−)-regulated RNPco sites contain significantly fewer Sox2 DNA motifs compared with singly bound Sox2 sites (Fig. 4B). Given that CUT&RUN 120 bp fragments are associated with direct Sox2 DNA binding, and 150 bp fragments are associated with nucleosomal binding (Meers et al., 2019b), loss of Sox2 motifs at RNPco sites supports a role for Dlx and Smarca4 co-recruitment in Sox2 DNA interactions. Similar effects are observed at both Evf2 (+) and Evf2 (−)-regulated sites, suggesting that individual Evf2 RNPs influence Sox2-DNA interactions, even in the absence of the Evf2 RNA, in vivo. Although the significance of RNPco sites and Evf2 genome-wide effects is unknown, these results highlight the importance of designing in vivo experiments to determine whether lncRNA-RNPs influence TF-DNA motif recognition as a general mechanism.
How individual Evf2-RNPs regulate Evf2 RNA cloud assembly at specific DNA sites is not known. Identification of Nono in the Evf2 RNP (Cajigas et al., 2015) and the known role for Nono in NEAT lncRNA assembly into paraspeckles raise the possibility that Nono plays a similar role in the Evf2-RNP (Yamazaki et al., 2018). Experiments that tag lncRNA clouds and follow assembly through high-resolution microscopy will be important in validating static studies of Evf2 RNA clouds and Evf2-RNPs. As only one or two Evf2 RNA clouds are detected in the nucleus, either the Evf2-RNP moves with Dlx5/6UCE to repressed target genes, or DNA looping (Rao et al., 2017) brings Evf2-RNP-bound Dlx5/6UCE closer to target genes (Cajigas et al., 2018). It will be important to determine whether additional Evf2-RNPs form PPs and are regulated similarly to Sox2 PPs with respect to size and targeting.
Experiments in this report reveal a complex multi-step role for Evf2-Sox2 interactions in enhancer targeting and direct antagonism of Dlx5/6UCE activity during gene repression. How each of the individual Evf2 RNPs (of the 87 identified) contributes to Dlx5/6UCE targeting and activity regulation, and the relationship of these functions to RNP assembly and to common mechanisms of lncRNA transcriptional control remain unresolved. Our data suggest that Evf2 selectively represses genes across megabase distances by coupling recruitment and sequestration of Sox2, a crucial pioneer transcription factor, affecting key steps of enhancer targeting and activity, with genome-wide effects. We propose that Evf2 RNA clouds function as both chromosome and protein organizers, contributing to dynamic regulation at enhancers.
MATERIALS AND METHODS
Mice were housed and treated according to approved IACUC guidelines (Northwestern University IACUC). Embryonic brain E13.5 GEs contained mixtures of males and females. The following mouse strains were used: Evf2+/+, Evf2TS/TS (Bond et al., 2009), Evf1TS/TS (Cajigas et al., 2018), Sox2fl/+ (Jackson Laboratory 013093) and Dlx5/6cre (Jackson Laboratory 008199).
Recombinant protein pulldown
Co-immunoprecipitation experiments on 6×His-tagged proteins purified from E. coli or flag-tagged baculovirus proteins were performed as described (Cajigas et al., 2015). Proteins were incubated in 200 μl NETN buffer [100 mM NaCl, 0.5 mM EDTA, 20 mM Tris-HCl (pH 8.0), 0.5% NP-40] for 1 h at 4°C with rotation. Glutathione agarose beads (30 μl) were washed with NETN buffer and added to each sample. Samples were incubated for 1 h at 4°C with rotation. Beads were pelleted by centrifugation and washed three times with 1 ml NETN buffer and once with 1×PBS. Proteins were eluted by adding protein loading buffer [62.5 mM Tris (pH 6.8), 10% glycerol, 2% SDS, 5% β-mercaptoethanol, 0.002% Bromophenol Blue] and incubating at 95°C for 5 min. Samples were analyzed by western blot.
RNA electrophoretic mobility assay
The NIR probe was generated by in vitro transcription of pGEM-Evf2(UCR) (Cajigas et al., 2015), a plasmid containing 115 nt of Evf2, including the ultraconserved sequence. 1 μg of SalI linearized DNA template, 5 mM DTT, 0.6 μl RNasin (Promega), 0.5 mM ATP, 0.5 mM CTP, 0.5 mM GTP, 12.5 μM UTP, 20 mM Aminoallyl-UTP-Atto680, 1 μg BSA and 2 μl (100 U) T7 RNA polymerase were incubated in 20 μl 1×RNA polymerase buffer for 1 h at 37°C. 2 μl Turbo DNase (Life Technologies) and 2 μl 10×Turbo DNase Buffer were added to the reaction and incubated at 37°C for 15 min. The RNA was denatured and separated on a 6% urea-polyacrylamide gel, cast on a Hoeffer miniVE apparatus and pre-run 20 min before loading. Full-length probe was excised, eluted overnight at 4°C in 0.5 M ammonium acetate/1 mM EDTA and ethanol precipitated. The concentration of the NIR labeled RNA probe was measured by absorption at 260 nm using the NanoDrop 1000 (Thermo Scientific). The RNA competitors were generated by in vitro transcription. pGEM-T Easy was linearized with BsrBI to generate a 209 bp RNA competitor. pGEM-Evf2(UCR) was linearized with SalI to generate a 206 bp RNA competitor. The linearized templates were treated with proteinase K and ethanol precipitated. RNA was transcribed as follows: 1.25 μg DNA template, 10 mM DTT, 1.5 μl (80 U) RNasin (Promega), 2 mM A, C, G and UTP (Roche), and 2 μl (100 U) T7 RNA polymerase (NEB) in 50 μl 1×RNA polymerase buffer were incubated at 37°C for 1 h. Samples were incubated with 2 μl Turbo DNase (Life Technologies) in 1×Turbo DNase Buffer for 15 min at 37°C. Samples were treated with Proteinase K (Roche), ethanol precipitated and quantified using the Quantifluor RNA System (Promega).
To generate GST-tagged Sox2 proteins, full-length Sox2 and Sox2 truncations [Sox2(1-205), Sox2(1-109), Sox2(206-317) and Sox2(41-317)] were subcloned into pGEX4T1. GST fusion proteins were purified from bacteria using standard protocols. The recombinant proteins were incubated with 0.15 pmoles Evf2 NIR-labeled probe in 10 μl reactions for 30 min at room temperature. For all competition experiments, protein and competitor RNA were pre-incubated for 10 min at room temperature before adding probe. 5 μg tRNA and 0.5 μl RNasin (Promega) were included in all the RNA electrophoretic mobility assay (REMSA) reactions. Pre-electrophoresis of 4% native polyacrylamide gels was performed for 20 min, REMSA reactions were loaded and electrophoresed at 200 V for 40 min, and data were visualized in the Odyssey Infrared Imager (LI-COR Biosciences).
Primary embryonic brain MGE transfections
For all transfections, E13.5 MGE tissues were dissected from Swiss Webster mouse embryos, dissociated in L15 media by pipetting several times, and spun through a cell strainer for single cell preparations. Cells were seeded at a density of 2.5×10^5 cells per cm^2 (Flandin et al., 2011) in neurobasal medium [DMEM/F-12 supplemented with L-glutamate, B-27 (Gibco), N2 supplement (Gibco), bovine pituitary extract (35 µg/ml; Life Technologies), mito+ serum extender (BD Biosciences), penicillin (100 U/ml; Gibco), streptomycin (100 µg/ml; Gibco) and glutamax (0.8 mM; Gibco)]. One day before seeding cells, plate wells were coated with poly-L-lysine (Sigma) and laminin (Sigma).
For luciferase experiments, 78,300 cells per well were cultured in a 96-well microplate treated for tissue culture. Cells were allowed to attach for 24 h before changing the medium to neurobasal media without antibiotics. Transfections using Fugene 6 were performed as recommended. Cells were harvested 48 h after transfection with 1×passive lysis buffer (Promega) supplemented with 0.1% digitonin for cell lysis. To ensure thorough cell lysis, lysates were subjected to two freeze-thaw cycles prior to performing Dual Luciferase Reporter assays. All transfections were normalized to the internal control expressing Renilla luciferase, performed at least in triplicate and a minimum of two times. For transfection of mCherry-Sox2 fusions, 850,000 cells per well were cultured in a 12-well tissue culture plate. mCherry-Sox2 plasmids were generated by subcloning Sox2 or Sox2 truncations [Sox2(40-206) and Sox2(40-120)] into the mCherry2-C1 plasmid using SacI and KpnI. Quick Change Site Directed Mutagenesis (Agilent) was used to generate Sox2Δ68-97. Plasmid (1 μg) was transfected using Fugene 6, according to the manufacturer's instructions. Cells were harvested by scraping after 72 h of incubation and nuclei were isolated for combined RNA/DNA fluorescence in situ hybridization and immunofluorescence using an anti-mCherry antibody (see method below).
Combined DNA-RNA fluorescence in situ hybridization with immunofluorescence
DNA fluorescence in situ hybridization probes were generated by nick translation using the fluorescence in situ hybridization Tag DNA Kit following manufacturer's recommendations. The templates for the nick translation reactions were obtained from the BACPAC Resources Center (Children's Hospital Oakland Research Institute): Dlx5/6 region, WI1-1693G2; Akr1b8 region, RP23-120B14; Rbm28 region, RP23-276H18. The digoxigenin-labeled RNA probe was generated as described previously (Feng et al., 2006).
E13.5 whole ganglionic eminences were dissected in L15. Tissues were pooled for each genotype, triturated by pipetting and filtered through a cell-strainer capped 5 ml polystyrene round-bottomed tube (BD Falcon) to make single-cell suspensions. Cells were pelleted by centrifugation at 100 g for 5 min at 4°C. The supernatant was removed and cells were gently resuspended in 500 μl Nuclear Extraction Buffer [0.32 M sucrose, 5 mM CaCl2, 3 mM Mg(Ac)2, 0.1 mM EDTA, 20 mM Tris-HCl (pH 8.0), 0.1% TritonX-100] and incubated on ice for 10 min. Cells were centrifuged at 100 g for 2.5 min at 4°C and the supernatant was removed. Cells were washed gently with ice-cold 1×PBS with 2 mM EGTA. Cells were centrifuged at 100 g for 2.5 min at 4°C. The supernatant was removed and cells were gently resuspended in 500 μl of ice-cold fixative (3:1, methanol:glacial acetic acid). The cells were fixed for 10 min on ice. 5 μl of cells in fixative were transferred to Superfrost Plus microscope slides (Fisher Scientific) and allowed to air dry. The slides were transferred to a slide holder, vacuum sealed and stored at −80°C.
Slides were incubated with 50 μg/ml pepsin in 0.01 M HCl at 37°C for 7 min, and washed twice with 2×SSC. Cells were fixed in 4% paraformaldehyde for 5 min at room temperature and washed three times with 2×SSC for 5 min. The slides were incubated in 1×PBS with 1% hydrogen peroxide for 30 min at room temperature and rinsed twice with 2×SSC. The slides were dehydrated by incubation for 2 min in 70%, 80% and 100% ethanol. 200 μl denaturation solution (70% formamide in 2×SSC) were added and the slides were incubated at 85°C for 10 min. Slides were dehydrated in ice-cold 70%, 80% and 100% ethanol for 2 min and allowed to air dry. 150 μl pre-hybridization buffer (50% formamide, 0.1% SDS, 300 ng/ml Salmon Sperm DNA and 2×SSC) were added and the slides were incubated overnight at 37°C.
DNA probes and RNA probe in hybridization buffer (50% formamide, 10% dextran sulfate, 0.1% SDS, 300 ng/ml Salmon Sperm DNA and 2×SSC) were denatured in the presence of 2 μg mouse Hybloc DNA (Applied Genetics Laboratories) at 80°C for 7 min and re-annealed at 37°C for 1 h. Slides were incubated for 5 min in 2×SSC with 50% formamide, 2 min in 4×SSC with 0.1% Tween-20 and 2 min in 2×SSC at 45°C. The slides were dehydrated in ethanol and denatured as described above. 10 μl of fluorescence in situ hybridization probe solution was added, coverslips were sealed with rubber cement and the slides were incubated overnight at 37°C.
Slides were incubated in 2×SSC with 50% formamide for 10 min (three times), in 2×SSC for 10 min and in 2×SSC with 0.1% NP40 for 5 min at 45°C. The slides were rinsed with 1×PBS and incubated in 1% blocking solution (Tyramide Signal Amplification Kit) for 1 h. The appropriate antibody was diluted 1:500 in blocking reagent, along with a mouse monoclonal anti-digoxigenin (DIG, 1:500), added to the slides and incubated at 4°C overnight. Slides were washed three times in 1×PBS for 3 min at room temperature, incubated with 1:100 HRP-goat anti-mouse IgG in blocking solution for 1 h at room temperature and tyramide labeled according to manufacturer's instructions (TSA Kit). A second tyramide labeling step was performed for immunostaining. Slides were washed three times in 1×PBS for 3 min at room temperature after the first round of labeling. The slides were then incubated with 1:100 HRP-goat anti-rabbit IgG for 1 h at room temperature and tyramide labeled. The slides were washed three times with 1×PBS for 3 min and incubated with 5 mg/ml DAPI for 5 min, rinsed with 1×PBS and mounted using SlowFade Gold antifade reagent (Thermo Fisher Scientific).
Nuclei were visualized using a Zeiss Laser Scanning Microscope 880 using the Zen 2.1 software. A 100× immersion oil objective was used to generate z-stacks of 0.3 μm (2D colocalization for Sox2 PP transfections) and a 63× immersion oil objective was used to generate z-stacks of 0.1 μm intervals (3D colocalization for volume analysis). Colocalization of transfected mCherry-fused Sox2 PPs with Evf2 RNA clouds and Dlx5/6UCE was determined by Zen 2.1 software and manual inspection of z-stacks through each nucleus. Imaris software was used for 3D reconstruction, colocalization analysis and size measurements. Colocalization was defined as any regional overlap in one of four channels. The numbers of overlapping clouds and/or Sox2 PPs with DNA target genes were determined by Imaris software following 3D reconstruction, and in conjunction with size determination. Threshold settings were determined in pilot experiments performed in Evf2+/+, and remained constant throughout the analysis between genotypes.
Ten E13.5 whole ganglionic eminences were dissected for X-ChIP, native ChIP and ChIP-reChIP, as previously reported (Cajigas et al., 2018). For ChIP-reChIP, the pre-cleared chromatin from Swiss Webster E13.5GE was preadsorbed with rabbit IgG (5 μg), followed by 1st round incubations with anti-DLX (5 μg) and 2nd round incubations with anti-Sox2 or anti-LaminB1 (1 μg).
CUT&RUN was performed as previously described (Skene et al., 2018), with some modifications. The number of samples was determined according to the number of antibodies to be tested and the number of replicates (two replicates per antibody and two replicates for IgG). The appropriate volume of Concanavalin A magnetic beads (10 μl per sample) was mixed into 1.5 ml of ice-cold binding buffer [20 mM HEPES-KOH (pH 7.9), 10 mM KCl, 1 mM CaCl2, 1 mM MnCl2] and placed on a magnetic stand for 2 min. The supernatant was removed, and the beads were washed with 1.5 ml binding buffer. After magnetic separation of the beads, the supernatant was removed and a volume of binding buffer equal to the initial beads volume (10 μl per sample) was added to the beads. The beads were placed on ice. E13.5 ganglionic eminences were isolated from embryos in L15 medium (six embryos per genotype). Tissues were pooled for each genotype, triturated by pipetting and filtered through a cell-strainer capped 5 ml polystyrene round-bottomed tube (BD Falcon) to generate single-cell suspensions. Cells were counted using the Luna Automated Cell Counter (Logos Biosystems). The appropriate volume of cells to obtain 250,000 cells per sample was centrifuged at 600 g for 3 min at 4°C. The supernatant was removed and the cell pellet was gently resuspended in ice-cold wash buffer [20 mM HEPES (pH 7.5), 150 mM NaCl, 0.5 mM spermidine and EDTA-free protease inhibitor cocktail]. The cells were centrifuged at 600 g for 3 min at 4°C. The supernatant was removed and cells were gently resuspended in 1 ml of wash buffer. The Concanavalin A bead suspension was added to the cells, while gently vortexing (∼100 g), and the tube was incubated with rotation for 10 min at 4°C. The cells/beads suspension was split into aliquots, according to the number of samples previously determined. Tubes were placed on a magnetic stand and the supernatant removed. 
50 μl of antibody buffer [20 mM HEPES (pH 7.5), 150 mM NaCl, 0.5 mM spermidine, 2 mM EDTA, 0.02% digitonin, EDTA-free protease inhibitor cocktail] containing 0.5 μg (Sox2, Smarca4, 1:100 dilution) or 0.07 μg (Dlx, 1:75 dilution) of antibody was added to each tube, while gently vortexing. The samples were incubated with rotation for 2 h at 4°C. The samples were centrifuged for 5 s at 100 g, and placed on the magnetic stand. The supernatant was removed, and the pellet resuspended gently in 1 ml ice-cold digitonin buffer [20 mM HEPES (pH 7.5), 150 mM NaCl, 0.5 mM spermidine, 0.02% digitonin and EDTA-free protease inhibitor cocktail]. The digitonin buffer wash was repeated once. Tubes were placed on the magnetic stand, the supernatant removed and 50 μl of pA-MNase solution (final concentration 700 ng/ml in digitonin buffer) was added to each tube while gently vortexing. The samples were incubated with rotation for 1 h at 4°C. The samples were centrifuged for 5 s at 100 g and placed on the magnetic stand. The supernatant was removed and the pellets were washed in ice-cold digitonin buffer twice, as described above. After removal of supernatant from washes, 150 μl of digitonin buffer were added to each sample while gently vortexing. Tubes were placed on a metal block on ice (0°C) for 5 min. Tubes were removed from ice briefly to add 3 μl 100 mM CaCl2 while gently vortexing, then tubes were returned to 0°C for 30 min. 100 µl of stop buffer (340 mM NaCl, 20 mM EDTA, 4 mM EGTA, 0.02% digitonin, 0.05 mg/ml RNase A, 2 pg/ml heterologous spike-in DNA) were added to each sample and mixed with gentle vortexing. Samples were incubated for 10 min at 37°C, then centrifuged for 5 min at 4°C at 16,000 g. The tubes were placed on the magnetic stand and the supernatant was transferred to a clean 1.5 ml microcentrifuge tube. DNA extraction was performed using standard phenol chloroform and ethanol precipitation methods as described (Skene et al., 2018).
Samples were ethanol precipitated overnight at −20°C. DNA pellets were dissolved in 20 μl 0.1×TE buffer [1 mM Tris-HCl (pH 8), 0.1 mM EDTA]. The Qubit High-Sensitivity Assay was used for DNA quantification. CUT&RUN libraries were prepared using the KAPA Hyper Prep Kit protocol, with some modifications. The total volume of CUT&RUN DNA was used for library construction. For adapter ligation, 5 μl of 3 μM adapter stock from the KAPA Dual-Indexed Adapter Kit was used. The ligation was incubated at 20°C for 15 min. The library was amplified using the following cycling conditions: 14 cycles of 98°C for 45 s, 98°C for 15 s and 60°C for 10 s; 72°C for 1 min. After library amplification, the libraries were purified using 50 μl of KAPA Pure Beads and eluted in 20 μl of water. CUT&RUN sample quality was analyzed by TapeStation prior to sequencing on a NovaSeq 6000 (SP 100 cycles).
CUT&RUN data processing
Paired-end CUT&RUN reads of different transcription factors and their respective Ig-controls from Evf2+/+ and Evf2TS/TS Dlx5/6UCE samples were first mapped to the mm9 genome using Bowtie2 v2.1.0 (Langmead and Salzberg, 2012) with options ‘--local --very-sensitive-local --no-unal --no-mixed --no-discordant --phred33 -I 10 -X 700’. We used the Picard toolkit command ‘MarkDuplicates’ to mark PCR duplicates and remove them from the final mm9-mapped bam files. Next, we separated the sequence fragments into ≤120 and ≥150 bp classes, which provide mapping of the local vicinity of a DNA-binding protein. The base-pair sizes can vary depending on the steric access to the DNA by the tethered MNase (Skene et al., 2018). Fragments mapping to repeat elements were removed, and replicates were joined before peak calling. The peak calling was performed using MACS2 (Zhang et al., 2008) callpeak options ‘-t -c -f BED -g mm --keep-dup all --bdg --nomodel --slocal 500 --llocal 5000 --extsize 120/150’. An FDR cutoff of 0.05 was used to call the final set of peaks (Janssens et al., 2018). Differential CUT&RUN analysis of transcription factor-binding peaks between two Dlx5/6UCE conditions was performed using the MACS2 program by treating one of the samples as the ‘control’ for the other.
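The fragment size classes described above can be illustrated with a short Python sketch. It is not the pipeline used in this study: the function names are hypothetical, coordinates are assumed to be 0-based and half-open (BED style), and discarding fragments of 121-149 bp is an assumption, because the text does not state how they were handled.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Fragment = Tuple[str, int, int]  # (chromosome, start, end), half-open


def fragment_class(start: int, end: int) -> Optional[str]:
    """Assign a fragment to the <=120 bp or >=150 bp class by its length."""
    length = end - start
    if length <= 120:
        return "le120"
    if length >= 150:
        return "ge150"
    return None  # assumption: 121-149 bp fragments belong to neither class


def split_fragments(fragments: Iterable[Fragment]) -> Dict[str, List[Fragment]]:
    """Group fragments into the two size classes, skipping unclassified ones."""
    classes: Dict[str, List[Fragment]] = {"le120": [], "ge150": []}
    for chrom, start, end in fragments:
        label = fragment_class(start, end)
        if label is not None:
            classes[label].append((chrom, start, end))
    return classes
```

Each class would then be passed separately to peak calling, matching the ‘--extsize 120/150’ setting quoted above.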
Multi-overlap of CUT&RUN peaks
The multi-overlapping differential and continuous CUT&RUN peaks (FDR<0.05) of Sox2, Dlx and Smarca4 transcription factors belonging to ≤120 and ≥150 bp classes from Evf2 (+) and Evf2 (−) conditions were quantified using bedtools ‘multintersect’ function (Quinlan and Hall, 2010) separately. The Venn diagram of the respective overlapping set was plotted using ‘ggpubr’ R package (Kassambara).
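As a simplified illustration of how Venn-style overlap counts can be derived from peak lists, the following Python sketch counts, for each peak of a reference factor, which other factors have an overlapping peak. It is an approximation for small inputs, not a reimplementation of the bedtools function used here; the function names and the 1 bp overlap criterion are assumptions.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks on the same chromosome share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def venn_counts(reference: List[Peak],
                others: Dict[str, List[Peak]]) -> Counter:
    """Count reference peaks by the set of other factors overlapping them."""
    counts: Counter = Counter()
    for peak in reference:
        partners: FrozenSet[str] = frozenset(
            name for name, peaks in others.items()
            if any(overlaps(peak, q) for q in peaks)
        )
        counts[partners] += 1
    return counts
```

In this sketch, a reference peak whose partner set contains both other factors would correspond to a co-bound site, and an empty partner set to a singly bound site.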
CUT&RUN Sox2 peaks containing Sox2 DNA-binding motifs
Sox2 frequency matrices (MA0143.1 and MA0143.2) from the JASPAR database (Fornes et al., 2020) were used to scan the respective CUT&RUN peak sequences using the PWMEnrich Bioconductor package (Stojnic and Diez, 2020, PWMEnrich: PWM enrichment analysis, R package version 4.26.0.). An FDR cutoff of 0.01 was used to determine significant motif enrichment.
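The idea of scanning a peak sequence with a frequency matrix can be sketched in Python as follows. This is a generic log-odds scan on the forward strand, not the PWMEnrich enrichment statistic; the toy matrix in the usage note, the pseudocount and the uniform background are all assumptions, and the actual JASPAR matrices are not reproduced.

```python
import math
from typing import Dict, List, Optional, Tuple


def log_odds_matrix(counts: List[Dict[str, int]],
                    pseudocount: float = 0.25,
                    background: float = 0.25) -> List[Dict[str, float]]:
    """Convert per-position base counts into log2-odds scores."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2((column.get(b, 0) + pseudocount) / total / background)
            for b in "ACGT"
        })
    return matrix


def best_hit(sequence: str,
             matrix: List[Dict[str, float]]) -> Optional[Tuple[float, int]]:
    """Return (score, offset) of the best forward-strand window, or None.

    The sequence is assumed to contain only uppercase A, C, G and T.
    """
    width = len(matrix)
    best = None
    for i in range(len(sequence) - width + 1):
        score = sum(matrix[j][base] for j, base in enumerate(sequence[i:i + width]))
        if best is None or score > best[0]:
            best = (score, i)
    return best
```

For example, with a hypothetical four-position matrix built from `[{"A": 10}, {"C": 10}, {"G": 10}, {"T": 10}]`, the best window in `"TTACGTTT"` starts at offset 2.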
4C using Dlx5/6UCE as bait was performed as previously described (van de Werken et al., 2012), with some modifications. E13.5 GEs from Sox2fl/fl;Dlx5/6 cre+ and Sox2fl/fl;Dlx5/6 cre− single embryos were dissected in L15 and kept in separate tubes. A single-cell suspension was obtained through gentle pipetting of the tissue in 250 μl of L15. Cells (∼2×10^6 cells per embryo) were transferred to a tube containing 5 ml 2% paraformaldehyde/10% fetal bovine serum (FBS) and incubated with rotation for 10 min at room temperature. 710 μl of 1 M glycine was added to quench the formaldehyde and tubes were placed on ice. Cells were pelleted by centrifugation at 400 g for 8 min at 4°C. The supernatant was removed and cells were gently resuspended in 2.5 ml ice-cold lysis buffer [50 mM Tris (pH 7.5), 150 mM NaCl, 5 mM EDTA, 0.5% NP-40, 1% Triton X-100 and protease inhibitors] and incubated for 10 min on ice. Cells were pelleted by centrifugation at 750 g for 5 min at 4°C. The supernatant was removed, and the cells were washed with 1 ml ice-cold 1×PBS by gentle resuspension. Cells were transferred to 1.5 ml microcentrifuge tubes and centrifuged at 600 g for 2 min at 4°C. The supernatant was removed and the cell pellets were flash frozen in liquid nitrogen and stored at −80°C.
The cell pellets were resuspended in 440 μl of molecular grade water and 60 μl of CutSmart buffer (New England Biolabs) were added. Tubes were incubated at 37°C and 15 μl of 10% SDS were added. Samples were incubated for 1 h at 37°C while shaking at 900 rpm. 75 μl of 20% Triton X-100 was added to the samples, which were incubated for 1 h at 37°C while shaking at 900 rpm. 200 U of EcoRI-HF were added to the samples and incubated overnight at 37°C while shaking. The next day, 200 U of EcoRI-HF were added and incubated overnight at 37°C with shaking. Complete digestion was confirmed by agarose gel electrophoresis. If undigested DNA was still present, another 200 U of restriction enzyme was added and incubated overnight. The enzyme was inactivated at 65°C for 20 min. Samples were transferred to a 15 ml conical tube. 100 U of T4 DNA Ligase was added (2 ml reaction volume) and incubated overnight at 16°C. The next day, 500 μl of molecular grade water, 50 μl fresh T4 ligase buffer and 100 U of T4 DNA ligase were added to the samples and incubated overnight at 16°C. Complete ligation was confirmed by agarose gel electrophoresis. The DNA was extracted using standard ethanol precipitation procedures (van de Werken et al., 2012).
The DNA pellet was resuspended in 450 μl of molecular grade water, then 50 μl of DpnII buffer and 50 U of DpnII were added and incubated at 37°C overnight. Complete digestion was confirmed by agarose gel electrophoresis. The enzyme was inactivated at 65°C for 20 min. Samples were transferred to a 50 ml conical tube and ligation was performed using 100 U of T4 DNA ligase in a total volume of 3.5 ml at 16°C overnight. The DNA was precipitated using standard ethanol precipitation procedures. The samples were purified using the Qiaquick PCR purification kit (two columns per sample). The following steps were performed to generate the 4C library for sequencing. First, overhangs were added to the 4C template using PCR amplification with primers containing the bait sequence, as follows: 200 ng 4C template, 0.2 mM dNTPs, 35 pmol primer Dlx5/6UCE-Fwd, 35 pmol primer Dlx5/6UCE-Rev, 1.75 U Expand Long Template Enzyme Mix (Roche) and 1×Buffer I underwent 29 cycles of 94°C for 2 min, 94°C for 10 s, 55°C for 1 min and 68°C for 3 min; followed by 68°C for 5 min. The PCR product was purified using the High Pure PCR Product Purification Kit (Roche). The 4C DNA containing the overhangs was then used as a template for a second PCR that added index sequences and Illumina sequencing adapters to generate the 4C library for sequencing. PCR reaction (50 μl) was made up of 225 ng DNA template, 0.5 mM dNTPs, 5 μl Nextera XT Index1 primer (N7XX, Illumina), 5 μl Nextera Index 2 primer (S5XX, Illumina), 3.5 U Expand Long Template Enzyme Mix (Roche) and 1×Buffer I. The reaction was 8 cycles of 94°C for 5 min, 94°C for 10 s, 55°C for 30 s and 68°C for 1 min; followed by 68°C for 7 min. The PCR product was purified using the High Pure PCR Product Purification Kit (Roche).
Dlx5/6UCE bait 4C-seq differential data analysis
Analysis has previously been described for Evf2+/+ and Evf2TS/TS (Cajigas et al., 2018). 4C reads were first mapped at the EcoRI restriction enzyme cut sites on chromosome 6 of the mm9 reference genome using Bowtie2 v2.1.0 (Langmead and Salzberg, 2012). The mapped reads were further filtered based on their reproducibility between the pair of replicates. An EcoRI cut-site was deemed to reproducibly interact or not interact with the 4C bait if the two replicates in a given condition (Evf2+/+ and Evf2TS/TS) both have non-zero or zero counts, respectively. We identified 1108 and 1266 non-zero count EcoRI restriction cut sites that are reproducible in both replicates of Evf2+/+ and Evf2TS/TS, respectively. Across the two conditions (Evf2+/+ and Evf2TS/TS), we retained a total of 997 reproducible 4C sites that have reproducible interactions in the two replicates of either one condition or in both conditions. We then performed a DESeq2-based (Love et al., 2014) differential contact count analysis on these sites to identify Evf2-regulated sites [P-adjusted<0.05 and a log2 fold change≥2 for positively regulated (+) or ≤−2 for negatively regulated (−)] and Evf2-independent (I) (P-adjusted>0.05 and an absolute log2 fold change<2) 4C interaction sites. We also performed the 4Cseq analysis using the FourCSeq program (Klein et al., 2015). FourCSeq models the overall decrease in interaction frequency with genomic distance by fitting a smooth monotonically decreasing function to suitably transformed count data. With this transformed and normalized count data, FourCSeq performs differential analysis between conditions to obtain significant differential interactions. We applied FourCSeq to our Evf2+/+ and Evf2TS/TS 4Cseq samples and retrieved Evf2 (+) and (−)-regulated Dlx5/6UCE interactions (P-adjusted<0.05 and an absolute log2 fold change≥2) and Evf2-independent interactions (I) (P-adjusted>0.05 and an absolute log2 fold change<2).
To avoid method-specific biases, interaction sites that were assigned the same label (+, 50; −, 73; I, 167) by the two different approaches (DESeq2 and FourCSeq) were called 4Cseq-intersectional computational method sites.
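The labelling thresholds and the intersection of the two methods can be sketched in Python as follows. The function names are hypothetical, and the sign convention (a positive log2 fold change meaning more contacts in Evf2+/+) is an assumption; sites that meet none of the stated criteria are left unlabelled.

```python
from typing import Dict, Optional


def label_site(padj: float, log2fc: float,
               alpha: float = 0.05, min_lfc: float = 2.0) -> Optional[str]:
    """Label a 4C site as '+', '-' or 'I' using the thresholds in the text."""
    if padj < alpha and log2fc >= min_lfc:
        return "+"
    if padj < alpha and log2fc <= -min_lfc:
        return "-"
    if padj > alpha and abs(log2fc) < min_lfc:
        return "I"
    return None  # e.g. significant but small change: not assigned a label


def consensus(labels_a: Dict[str, Optional[str]],
              labels_b: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Keep only sites given the same non-empty label by both methods."""
    return {site: lab for site, lab in labels_a.items()
            if lab is not None and labels_b.get(site) == lab}
```

Here `labels_a` and `labels_b` would hold the labels derived from the DESeq2 and FourCSeq results, respectively, keyed by cut-site identifier.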
Sox2flfl; Dlx5/6+cre/-cre 4C-seq differential data analysis
Sox2-regulated Dlx5/6UCE interactions were determined as described above for Evf2+/+ and Evf2TS/TS, except that sites were identified based on non-zero counts in at least two out of four of the replicates in each genotype, followed by DESeq2 analysis (Love et al., 2014) [P-adjusted<0.05 and a log2 fold change≥2 for positively regulated (+) or ≤−2 for negatively regulated (−) 4C interaction sites]. We applied FourCSeq to Sox2flfl +Dlx5/6cre and Sox2flfl-cre 4Cseq samples, and retrieved Sox2-regulated 4C interactions (P-adjusted<0.05 and an absolute log2 fold change≥2). To avoid method-specific biases, we retained a common set of 244 Sox2 (+) and (−)-regulated Dlx5/6UCE interaction sites.
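The two reproducibility rules (both of two replicates for the Evf2 data; at least two of four replicates for the Sox2 data) can be written as a minimal Python sketch. The function names are hypothetical, and the treatment of mixed zero/non-zero replicates under the strict rule (no call) reflects our reading of the text.

```python
from typing import Optional, Sequence


def strict_call(counts: Sequence[int]) -> Optional[str]:
    """Two-replicate rule: all non-zero -> interacting, all zero -> non-interacting."""
    nonzero = sum(1 for c in counts if c > 0)
    if nonzero == len(counts):
        return "interacting"
    if nonzero == 0:
        return "non-interacting"
    return None  # replicates disagree, so the site is not reproducible


def relaxed_call(counts: Sequence[int], min_nonzero: int = 2) -> bool:
    """Four-replicate rule: at least `min_nonzero` replicates have reads."""
    return sum(1 for c in counts if c > 0) >= min_nonzero
```

Sites passing the relevant rule would then be carried forward to differential analysis.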
Dlx5/6UCE 4C counts for Sox2 (+)-regulated (enriched in Sox2fl/fl;Dlx5/6 cre−) and Sox2 (−)-regulated (enriched in Sox2fl/fl;Dlx5/6 cre+), and those previously reported for Evf2 (+)-regulated (enriched in Evf2+/+), Evf2 (−)-regulated (enriched in Evf2TS/TS) and Evf2 (I) independently regulated (detected in both Evf2+/+ and Evf2TS/TS) are included in Table S1.
Dlx5/6UCE and Sox2flfl +/− cre site overlap
Evf2+/+ and Evf2TS/TS Dlx5/6UCE and Sox2flfl +/− cre 4C-peak overlap was measured using the ‘bedtools window’ function (Quinlan and Hall, 2010) with a window span of 50 kb.
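A windowed overlap of this kind can be illustrated in Python. This sketch extends each site in the first set by a fixed span on both sides and reports the sites in the second set that fall within it; it is meant to convey the logic only, with hypothetical function names and half-open coordinates assumed.

```python
from typing import List, Tuple

Site = Tuple[str, int, int]  # (chromosome, start, end), half-open


def within_window(a: Site, b: Site, window: int = 50_000) -> bool:
    """True if b overlaps a after extending a by `window` bp on each side."""
    chrom_a, start_a, end_a = a
    chrom_b, start_b, end_b = b
    return (chrom_a == chrom_b
            and start_b < end_a + window
            and end_b > max(0, start_a - window))


def window_pairs(sites_a: List[Site], sites_b: List[Site],
                 window: int = 50_000) -> List[Tuple[Site, Site]]:
    """All (a, b) pairs in which b lies within `window` bp of a."""
    return [(a, b) for a in sites_a for b in sites_b
            if within_window(a, b, window)]
```

The all-against-all comparison is quadratic and is only suitable for small site lists such as those on a single chromosome.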
Dlx5/6UCE 4C-seq and CUT&RUN signal overlap
Evf2+/+ and Evf2TS/TS Dlx5/6UCE 4C-seq peaks were first mapped on CUT&RUN differential transcription factor peaks using bedtools ‘intersect’ function (Quinlan and Hall, 2010). The overlapping set of 4C-seq peaks from a Dlx5/6UCE condition were then mapped on the respective CUT&RUN Ig-normalized transcription factor signal data. The log2 fold-enrichment of Ig-normalized signal was generated using MACS2 ‘bdgcmp’ command (Zhang et al., 2008). The violin plots were made using ‘ggpubr’ R package (Kassambara).
Quantification and statistical analysis
Quantification and statistical analysis were performed using R. Significance levels are *P<0.05, **P<0.01 and ***P<0.001. In violin plots, the colored dotted line represents the mean of the respective class. An unpaired t-test was used to measure the significance. For ChIP-seq and CUT&RUN, a peak is defined as a region with q<0.05; for a 4C-seq experiment, a significant peak is defined with FDR<0.05 and an absolute log2 fold enrichment≥2. Additional statistical details can be found in the figure legends.
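The significance levels above map onto asterisk annotations as in the following minimal Python sketch. The function name is hypothetical, and returning an empty string for P≥0.05 is an assumption, as no symbol for non-significant comparisons is defined in the text.

```python
def significance_stars(p_value: float) -> str:
    """Translate a P value into the asterisk code used in the figures."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # assumption: no annotation when P >= 0.05
```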
Antibodies and reagents
The following antibodies and reagents were used: anti-DLX (Kohtz Lab, Northwestern University, Evanston, IL, USA; Bond et al., 2009; Cajigas et al., 2015; Feng et al., 2006), anti-Smarca4 (Wang Lab, NIH; Wang et al., 1996), anti-Sox2 (A301-740A; RRID:AB_1211355, Bethyl Laboratories), anti-Lamin B1 (ab16048; RRID:AB_1010782, Abcam), anti-SMC3 (ab9263; RRID:AB_307122, Abcam), anti-mCherry (ab205402; RRID:AB_2722769, Abcam), anti-Flag M2 (F1804; RRID:AB_262044, Sigma), UTP-Atto680 (NU-821-680, Jena Bioscience), pA-MNase (Henikoff Lab; Skene et al., 2018), Concanavalin A magnetic beads (BP531, Bangs Laboratories), Protein G Agarose (11719416001, Roche), Expand Long Template Enzyme Mix (11681834001, Roche), Fugene 6 (E2691, Promega), Dual Luciferase reporter assay (E1910, Promega), FISH TAG DNA kit (F32951, Thermo Fisher), Nextera XT Index Kit (FC-131-1001, Illumina), TruSeq Nano DNA Library Prep Kit (FC-121–4003, Illumina), KAPA Dual Indexed Adapter Kit (KK8722, KAPA Biosystems), KAPA hyper prep kit (KK8502, KAPA Biosystems), pcDNA3-EGFP (Addgene, 13031), pGL3-mDlx5/6 (Feng et al., 2006), pcDNA-Evf2 (Addgene, 99478), pcDNA3.3-Sox2 (Addgene, 26817), mCherry2-C1 (54563, Addgene) and pGEM-Evf2(UCR) (Kohtz Lab).
Oligonucleotides for TaqMan PCR were obtained from Life Technologies: Dlx6 Mm01166201_m1, Dlx5 Mm00438430_m1, Umad1 AJWR2X8, Lsm8 AJX004G, Rbm28 Mm01137037_m1, Akr1b8 Mm00484314_m1, ActB Mm00607939_s1, Sox2 Mm03053810_s and Sox2ot Mm01291217_m1. 4C bait sequences were as follows: Dlx5/6UCE-Fwd, 5′TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGATGCCAAACCACTGTGAGTGTA3′; Dlx5/6UCE-Rev, 5′GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGTCCCCAATGTCTGCTTCAA3′.
We thank S. Henikoff for pA-MNase (Fred Hutchinson Cancer Research Center), and R. Kingston and J. Cochrane (Harvard) for flag-tagged Smarca4 protein. We also thank C. DiDonato and J. Topczewski for critically reviewing the manuscript.
Conceptualization: I.C., J.D.K.; Methodology: I.C., J.D.K.; Software: A.C., F.A.; Validation: I.C., M.L., M.B., L.C.; Formal analysis: J.D.K., F.A., A.C.; Investigation: I.C., M.L., K.R.S., M.B., L.C., H.L.; Data curation: J.D.K., A.C.; Writing - original draft: J.D.K.; Writing - review & editing: J.D.K., I.C., F.A.; Visualization: J.D.K.; Supervision: J.D.K.; Project administration: J.D.K.; Funding acquisition: J.D.K., F.A.
This work was funded by the National Institutes of Health (NIMH R01MH111267 to J.D.K. and NIGMS R35GM128938 to F.A.). Deposited in PMC for release after 12 months.
The Sox2fl/fl +/-Dlx5/6cre 4C-seq and CUT&RUN datasets generated and analyzed in this study are available in GEO under accession number GSE164301. The published article (Cajigas et al., 2018) includes the crosslinked X-ChIPseq and native ChIPSeq datasets, and Evf2 regulated and independent Dlx5/6UCE-4C-seq datasets (GSE117184) used in this study. Confocal images and Imaris data are available on Mendeley Data, V1 (http://dx.doi.org/10.17632/pfkj9d5fk7.1).
Peer review history
The peer review history is available online at https://dev.biologists.org/lookup/doi/10.1242/dev.197202.reviewer-comments.pdf
The authors declare no competing or financial interests.