Establishment of a healthy ovarian reserve is contingent upon numerous regulatory pathways during embryogenesis. Previously, mice lacking TBP-associated factor 4b (Taf4b) were shown to exhibit a diminished ovarian reserve. However, potential oocyte-intrinsic functions of TAF4b have not been examined. Here, we use a combination of gene expression profiling and chromatin mapping to characterize TAF4b-dependent gene regulatory networks in mouse oocytes. We find that Taf4b-deficient oocytes display inappropriate expression of meiotic, chromatin modification/organization, and X-linked genes. Furthermore, dysregulated genes in Taf4b-deficient oocytes exhibit an unexpected amount of overlap with dysregulated genes in oocytes from XO female mice, a mouse model of Turner Syndrome. Using Cleavage Under Targets and Release Using Nuclease (CUT&RUN), we observed TAF4b enrichment at genes involved in chromatin remodeling and DNA repair, some of which are differentially expressed in Taf4b-deficient oocytes. Interestingly, TAF4b target genes were enriched for Sp/Klf family and NFY target motifs rather than TATA-box motifs, suggesting an alternative mode of promoter interaction. Together, our data connect several gene regulatory nodes that contribute to the precise development of the mammalian ovarian reserve.
The ability to produce healthy gametes is crucial for the continuation of all sexually reproducing organisms, including humans. The process of mammalian gametogenesis begins during early fetal life with the specification and migration of primordial germ cells (PGCs) to the genital ridge. At the genital ridge, PGCs begin the process of differentiation into eggs and sperm in close concert with sex-specific somatic support cells. Thus, to understand the healthy functioning of adult gametes, we must examine multiple stages of development, including those that arise in early fetal life. An added layer of complexity is that female XX and male XY germ cells traverse this differentiation process in a highly sex-specific manner (Feng et al., 2014). Whereas some adult male germ cells become self-renewing spermatogonial stem cells (SSCs) during their development, the adult female mammalian germline is a non-renewable and finite resource termed the ovarian reserve that is steadily depleted after birth. The postnatal ovarian reserve is composed of a stockpile of primordial follicles (PFs) that contain individual primary oocytes arrested in prophase I of meiosis I, surrounded by a single layer of flattened somatic granulosa cells (Gura and Freiman, 2018). Menopause results from the timely depletion of the ovarian reserve and the mean age for menopause is 50±4 years. At least 1% of the female population worldwide experiences a fertility deficit termed primary ovarian insufficiency (POI), where menopause-like symptoms occur prematurely before 40 years of age (Chandra et al., 2013). Thus, genetic and environmental factors that perturb the establishment of the ovarian reserve in utero will have negative consequences on adult reproductive and general health outcomes and need to be understood in greater detail.
We previously identified an essential function of TBP-associated factor 4b (TAF4b) in the establishment of the ovarian reserve in the embryonic mouse ovary (Grive et al., 2014, 2016). TAF4b is a germ cell-enriched subunit of the transcription factor TFIID complex, which is required for RNA polymerase II recruitment to promoters in gonadal tissues (Gura et al., 2020). TFIID is a multi-protein complex that contains TATA-box binding protein (TBP) and 13-14 TBP-associated factors (TAFs) and is traditionally considered part of the cell's basal transcription machinery (Antonova et al., 2019). Female mice that have a targeted mutation that disrupts the endogenous Taf4b gene and prevents TAF4b protein from integrating into the larger TFIID complex (called Taf4b deficiency) are infertile and also exhibit hallmarks of POI, including elevated follicle stimulating hormone (FSH) levels and a diminished ovarian reserve (DOR) (Falender et al., 2005; Gura et al., 2020; Lovasco et al., 2010, 2015). We recently demonstrated that Taf4b mRNA and protein expression are nearly exclusive to the germ cells of the mouse embryonic ovary from embryonic day (E) 9.5 to E18.5 and that Taf4b-deficient ovaries display delayed germ cell cyst breakdown, increased meiotic asynapsis and excessive perinatal germ cell attrition (Grive et al., 2014, 2016; Gura et al., 2020). Therefore, we hypothesize that TAF4b, as part of TFIID, regulates oogenesis and meiotic gene programs. The degree to which the transcriptomic pathways in Taf4b deficiency and POI overlap and contribute to their similarities has yet to be explored.
Both human and mouse genetic studies have begun to reveal the molecular mechanisms underlying POI and its related pathologies. The most striking example is Turner Syndrome (TS), in which karyotypically single X-chromosome female individuals undergo early and severe DOR and exhibit short stature, primary amenorrhea, estrogen insufficiency and cardiovascular malformations (Gravholt et al., 2019). Recent work in mouse models of TS indicates that loss of correct dosage of the single X chromosomes in XO versus XX oocytes leads to pronounced meiotic progression defects and excessive oocyte attrition during ovarian reserve establishment (Sangrithi et al., 2017). In contrast to the high penetrance of TS, 20% of women with a premutation CGG repeat allele in the FMR1 gene, also located on the X chromosome, experience a related fragile X-associated POI (FXPOI) (Fink et al., 2018). Similar to Taf4b, other targeted mouse mutations have resulted in POI-related phenotypes, including those in Nobox and Figla, two transcription factors that regulate oocyte development; however, the relevance of specific mutations in their human orthologs and POI in women remains to be explored (Rossetti et al., 2017). More importantly, a better understanding of how these genes promote healthy establishment of the ovarian reserve and the deregulated molecular events that lead to its premature demise is needed.
To understand better the normal function of TAF4b during establishment of the ovarian reserve, we integrated published bioinformatic data with experimental Taf4b genomic assays to uncover unexpected links of Taf4b with TS and Fmr1. We show that in homozygous mutant Taf4b E16.5 oocytes, almost 1000 genes are deregulated as measured by RNA sequencing (RNA-seq). Surprisingly, the X chromosome is enriched for these deregulated genes and Taf4b-deficient oocytes display reduced X:autosome (X:A) gene expression ratios. There is a striking overlap of genes deregulated in Taf4b-deficient oocytes and XO mouse oocytes, and XO oocytes express significantly reduced levels of Taf4b at E15.5 and E18.5, further illuminating a potential molecular link between these disparate genetic contributors to POI. Furthermore, we show that Taf4b deficiency and TS both result in deregulation of genes involved in chromatin organization, chromatin modification and DNA repair. Finally, CUT&RUN of TAF4b in E16.5 XX germ cells identifies direct TAF4b targets, enriched for Sp/Klf zinc-finger family and NFY binding sites, that for the first time confirm its promoter-proximal recognition properties, linking TAF4b binding to the transcriptional regulation required for proper establishment of the ovarian reserve.
Taf4b expression peaks at E16.5 in female embryonic germ cells
To observe the dynamics of Taf4b mRNA expression at single-cell resolution, we analyzed a single-cell RNA-seq (scRNA-seq) dataset of Oct4-EGFP+ oocytes from E12.5, E14.5 and E16.5 mouse ovaries (Zhao et al., 2020). We selected for Dazl-positive, high-quality (nFeature_RNA>1000, nFeature_RNA<5000, nCount_RNA>2500, nCount_RNA<30,000, mitochondrial genes<5%) oocytes and performed pseudotime analysis using Monocle3 (Fig. 1A,B). We found that Figla expression generally increased from E12.5 to E16.5 and over pseudotime, with some of the highest Figla-expressing cells appearing among E16.5 cells at the end of the pseudotime profile. We also found that expression of Stra8, which is a master regulator of meiotic initiation, declined over time and pseudotime, as expected. We then compared the expression profiles of Taf4a and Taf4b. Most cells had low Taf4a expression throughout the time course. Taf4b mRNA expression began to rise at E14.5 and appeared highest in the E16.5 oocytes that were earliest in pseudotime (Fig. 1C), consistent with previous observations (Gura et al., 2020).
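The cell-selection thresholds above can be sketched as a simple filter. This is a minimal illustration in Python rather than the original analysis code; the `CellQC` record, the toy cells, and the reading of "Dazl-positive" as a non-zero Dazl count are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell quality metrics (hypothetical record type for illustration)."""
    cell_id: str
    n_feature_rna: int   # number of genes detected in the cell
    n_count_rna: int     # total transcript counts in the cell
    pct_mito: float      # percentage of counts from mitochondrial genes
    dazl_count: int      # raw Dazl counts; >0 is ASSUMED to mean Dazl-positive


def passes_qc(cell: CellQC) -> bool:
    """Apply the thresholds quoted in the text to one cell."""
    return (
        1000 < cell.n_feature_rna < 5000
        and 2500 < cell.n_count_rna < 30000
        and cell.pct_mito < 5.0
        and cell.dazl_count > 0
    )


# Toy cells (made-up values), each failing a different criterion except cell_a.
cells = [
    CellQC("cell_a", 3200, 12000, 1.8, 5),  # passes every filter
    CellQC("cell_b", 800, 3000, 2.0, 3),    # too few detected genes
    CellQC("cell_c", 3500, 45000, 1.0, 7),  # too many total counts
    CellQC("cell_d", 2800, 9000, 7.5, 4),   # mitochondrial fraction too high
    CellQC("cell_e", 2900, 9500, 2.2, 0),   # Dazl-negative
]

kept = [c.cell_id for c in cells if passes_qc(c)]
print(kept)
```

Only cells that satisfy every criterion at once are retained, so a cell with good depth but a high mitochondrial fraction is still discarded.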
To identify which other genes were highly expressed in Taf4b-expressing oocytes, we separated cells into Taf4b-expressing (Taf4b log2 expression>0) and Taf4b-off (Taf4b log2 expression=0) populations and performed differential gene expression analysis (Table S1). We found 155 genes that were significantly (P<0.05) higher in Taf4b-expressing cells. We performed gene ontology (GO) analysis of these genes and found that the top categories included ‘meiotic cell cycle’ and ‘synaptonemal complex organization’ (Fig. 1D, Table S1). Taken together, these data suggest that Taf4b expression is highest in the E16.5 mouse oocyte and that Taf4b is co-expressed with important meiotic genes. A similar analysis from a second scRNA-seq dataset of whole ovaries from earlier (E11.5 to E14.5) time points supported these findings (Ge et al., 2021) (Fig. S1).
RNA-seq identifies TAF4b-affected genes in E16.5 XX germ cells
To understand the transcriptome-level changes in Taf4b-deficient embryonic oocytes, we performed RNA-seq at E16.5. We sorted Oct4-EGFP+ oocytes from five Taf4b-heterozygous (Taf4b+/−) and five Taf4b-deficient (Taf4b−/−) pairs of ovaries and subjected them to ultra-low-input RNA-seq (germ cell numbers for each RNA-seq sample can be found in Table S2). The resulting principal component analysis (PCA) plot shows each of the Taf4b-deficient samples mostly grouping together, with the Taf4b-heterozygous samples dispersed throughout (Fig. 2A, Table S3). This patterning of the data is largely due to the litter from which each sample originates, as we were unable to obtain sufficient numbers of our desired genotypes from a single mouse litter, but importantly the different genotypes separate when plotting litter dates individually (Fig. S2A). We identified 964 differentially expressed genes (DEGs) between Taf4b-heterozygous and Taf4b-deficient oocytes, which were defined as protein-coding, average transcripts per million (TPM) expression>1, and adjusted P<0.05 (Fig. 2B, Table S3). From this list of DEGs, 463 were increased in Taf4b-deficient oocytes and will be referred to as ‘upregulated DEGs’. Some interesting DEGs in this gene set were Fmr1 (the most common genetic cause of POI), JunD (a component of the AP-1 transcription factor complex) and Sp1 (a DNA-binding transcription factor) (Fink et al., 2018; Mechta-Grigoriou et al., 2001; Vizcaíno et al., 2015) (Fig. S2B). The remaining 501 DEGs were decreased in Taf4b-deficient oocytes and will be referred to as ‘downregulated DEGs’. As expected, Taf4b was a downregulated DEG, as was another well-known oogenesis gene, Nobox (Fig. S2C). Finding Nobox as a DEG corroborates previous research which showed that TAF4b binds directly to the promoter region of Nobox and promotes its protein expression in E18.5 oocytes (Grive et al., 2016). 
We also identified Fam83d, which has been implicated to play a role in ovarian cancer (Zhang et al., 2019), as a downregulated DEG. For validation of these genes, we performed quantitative real-time PCR (qRT-PCR) on E17.5 Oct4-EGFP+ oocytes, which corroborated our RNA-seq results (Fig. S2D). Analysis of known protein-protein interactions (PPIs) using STRING revealed a significant enrichment of PPIs, with major nodes including Ep300 (a histone acetyltransferase involved in chromatin remodeling) and Plk1 (a serine/threonine-protein kinase involved in cell cycle regulation) (Fig. S3). We also performed a similar RNA-seq experiment at E14.5, but found fewer DEGs, suggesting that more substantial transcriptomic effects of Taf4b-deficiency take place around E16.5 (Fig. S4A,B, Table S4).
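The DEG definition used above (protein-coding, average TPM>1, adjusted P<0.05, then split by direction of change) can be expressed as a small classifier. The sketch below is illustrative only: the `GeneResult` fields, the gene names and all numeric values are hypothetical, and the sign convention (positive log2 fold change means higher in Taf4b-deficient oocytes) is an assumption.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    biotype: str
    mean_tpm: float
    log2_fold_change: float  # ASSUMED: deficient relative to heterozygous
    padj: float


def classify_deg(g: GeneResult, tpm_min: float = 1.0, alpha: float = 0.05) -> str:
    """Return 'upregulated', 'downregulated' or 'not_DEG' for one gene."""
    is_deg = (
        g.biotype == "protein_coding"
        and g.mean_tpm > tpm_min
        and g.padj < alpha
    )
    if not is_deg:
        return "not_DEG"
    return "upregulated" if g.log2_fold_change > 0 else "downregulated"


# Made-up rows, not values from the study.
table = [
    GeneResult("GeneA", "protein_coding", 12.0, 0.8, 0.001),
    GeneResult("GeneB", "protein_coding", 30.0, -1.2, 0.010),
    GeneResult("GeneC", "protein_coding", 0.4, 2.0, 0.001),   # TPM too low
    GeneResult("GeneD", "lncRNA", 15.0, 1.5, 0.001),          # not protein-coding
    GeneResult("GeneE", "protein_coding", 9.0, 0.3, 0.200),   # not significant
]

labels = {g.gene: classify_deg(g) for g in table}
print(labels)
```

Applying the expression and biotype filters before counting DEGs keeps lowly expressed or non-coding genes from inflating either the upregulated or downregulated set.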
We performed GO analysis of all the E16.5 DEGs, as well as separating the upregulated and downregulated DEGs (Fig. 2C, Fig. S2E,F). We found multiple chromatin organization and modification GO categories associated with upregulated DEGs and reproduction- and microtubule-related categories associated with downregulated DEGs (Table S3). Overall, these data suggest that TAF4b impacts the expression of many genes in the developing oocyte transcriptome, particularly those associated with chromatin structure and modification and reproduction. Moreover, the effects of TAF4b on the transcriptome take place after E15.5, correlating with the peak in Taf4b expression at E16.5 shown by scRNA-seq and bulk RNA-seq (Fig. 1) (Gura et al., 2020).
X chromosome gene expression is significantly reduced in Taf4b-deficient oocytes
Our E16.5 RNA-seq analysis led us to examine how Taf4b deficiency affects expression of each mouse chromosome. Surprisingly, we observed that there were significantly more downregulated DEGs on the X chromosome than expected and significantly fewer upregulated DEGs on the X chromosome (Fig. 3A,B, Tables S5, S6). Furthermore, the X chromosome was the only chromosome to exhibit such a phenomenon for both sets of DEGs. We then determined whether this skew in DEGs translated into overall reduced X chromosome expression compared with autosomes. When comparing the log2 fold change between Taf4b-heterozygous and Taf4b-deficient oocytes, we found that there was significantly lower expression of X chromosome genes versus autosomal genes (Fig. 3C). Two similar but slightly different dosage compensation calculation methods, the X:A ratio and relative X expression (RXE), further support the idea that the expression of X chromosome genes is reduced in E16.5 Taf4b-deficient oocytes (outliers not plotted) (Fig. 3D,E). However, we did not see a significant difference in X chromosome expression in E14.5 oocytes (Fig. S2C-E).
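To make the two dosage metrics concrete, the sketch below uses one common formulation of each: the X:A ratio as the median expression of X-linked genes divided by the median expression of autosomal genes, and RXE as the difference between the log2-transformed mean expression of X-linked and autosomal genes. These formulations are assumptions for illustration and may differ in detail from the calculations used for Fig. 3D,E; the expression values are made up.

```python
import math
from statistics import mean, median


def x_to_a_ratio(x_tpm, autosome_tpm):
    """Median X-linked expression divided by median autosomal expression."""
    return median(x_tpm) / median(autosome_tpm)


def relative_x_expression(x_tpm, autosome_tpm):
    """log2(mean X-linked expression) minus log2(mean autosomal expression)."""
    return math.log2(mean(x_tpm)) - math.log2(mean(autosome_tpm))


# Toy TPM values for expressed genes (hypothetical, not from the study).
x_genes = [4.0, 8.0, 12.0]
autosomal_genes = [8.0, 16.0, 24.0]

ratio = x_to_a_ratio(x_genes, autosomal_genes)
rxe = relative_x_expression(x_genes, autosomal_genes)
print(ratio, rxe)
```

Under these definitions, an X:A ratio of 1 (or an RXE of 0) corresponds to X-linked expression on par with autosomal expression, and lower values indicate reduced relative X output.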
Ohno's hypothesis postulates that the expression of the X chromosome is uniquely regulated so that ‘housekeeping genes’ on the X largely remain on par with autosomal housekeeping gene expression (Ohno, 1966). Sangrithi et al. (2017) annotated the mouse genome for genes expressed (FPKM≥1) in all tissues they sampled. We used this set of ubiquitously expressed genes to see whether the effects of Taf4b deficiency on X chromosome expression were specific to ubiquitously expressed genes. We found that 39% of our DEGs were members of the ubiquitous genes list (Fig. S5A), which is higher than the 25% of all genes that are ubiquitously expressed. However, when we plotted the log2 fold change of ubiquitous genes on the X chromosome and autosomes, there was no significant difference between these populations (Fig. S5B, Table S7). Taken together, these data indicate that Taf4b deficiency affects the expression of the X chromosome, but it is unclear whether Taf4b plays a direct role in dosage compensation.
Overlap of mouse genes deregulated by Taf4b deficiency and TS
As this is the first link of Taf4b to the regulation of X-linked gene expression, we decided to compare it with a mouse model of TS. TS is a chromosomal disorder in which a female individual has one intact X chromosome and the second X chromosome is either missing or severely compromised (Gravholt et al., 2019). We re-processed the raw data of Sangrithi et al. (2017), which used Oct4-EGFP mice covering four developmental time points in female XX and XO germ cells, ranging from E9.5 to E18.5 (Fig. 4A). Taf4b expression was not significantly different between the karyotypes at E9.5 and E14.5, but it was significantly reduced in E15.5 and E18.5 XO oocytes, whereas Taf4a expression was not significantly different at any time point (Fig. 4B,C, Table S8). To examine the potential overlap of transcriptomic effects between TS and Taf4b deficiency, we compared their DEGs. We first compared E15.5 TS DEGs with our E16.5 Taf4b DEGs and found 243 genes shared between the two gene sets; this overlap was statistically significant (P<0.05, hypergeometric test) (Fig. 4D). When we used these 243 genes as input for GO analysis, we found DNA-related categories enriched such as ‘DNA repair’ and ‘covalent chromatin modification’ (Fig. 4E). We then compared E18.5 TS DEGs with our E16.5 Taf4b DEGs and found 439 genes shared between the two contexts, which was also a significant overlap (P<0.05, hypergeometric test) (Fig. 4F). When we used these 439 genes as input for GO analysis, we again found DNA-related categories enriched such as ‘DNA repair’ (Fig. 4G). When we compared the E15.5 and E18.5 TS X chromosome DEGs with our E16.5 Taf4b X chromosome DEGs, we found 14 and 31 shared DEGs, respectively (Fig. S6). These data indicate that there are shared transcriptomic effects of both TS and Taf4b deficiency in mouse embryonic oocytes, and that these shared effects are related to functions concerning DNA repair and chromatin modification.
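The hypergeometric test used for these overlaps asks how likely it is to observe at least the given number of shared genes when two gene sets are drawn from a common background. A minimal sketch of the upper-tail calculation follows; the function name and the toy numbers are hypothetical, and the real background size and TS DEG counts are not restated here, so they are not used.

```python
from math import comb


def hypergeom_overlap_pvalue(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """P(X >= overlap) for X ~ Hypergeometric.

    universe: number of genes in the shared background
    set_a:    size of the first DEG list (the 'successes' in the background)
    set_b:    size of the second DEG list (the number of draws)
    overlap:  observed number of genes common to both lists
    """
    max_k = min(set_a, set_b)
    total = comb(universe, set_b)
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Toy example: 10 background genes, lists of 4 and 3 genes, 2 genes shared.
p_value = hypergeom_overlap_pvalue(universe=10, set_a=4, set_b=3, overlap=2)
print(p_value)
```

The result depends strongly on the choice of background (for example, all expressed protein-coding genes versus all annotated genes), so the universe should match the set of genes that could have been called as DEGs in both experiments.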
We observed similar results in an independent TS dataset (Hamada et al., 2020). By comparing Taf4b expression in XX and XO cells that had been differentiated from mouse embryonic stem cells in vitro, we found that Taf4b expression was lower in the XO cells that best resembled late embryonic oocytes (Fig. S7A,B, Table S9). Interestingly, when the cells had been further differentiated to a state similar to early postnatal oocytes, the trend reversed with Taf4b expression being significantly higher in mature oocyte-like cells derived from XO cells. This corroborates the reduction in Taf4b expression in mature mouse oocytes and suggests that oocyte expression of Taf4b normally decreases postnatally. In contrast, significant differences in Taf4a expression occurred in d6PGCLCs (differentiated oocytes similar to germ cells that have migrated to the gonadal ridge but not yet entered meiosis) and in the latter stages of differentiation that best resembled postnatal oocytes (Fig. S7C).
CUT&RUN identifies putative direct targets of TAF4b in E16.5 germ cells
To understand which DEGs identified in our E16.5 RNA-seq experiment were likely to be direct targets of TAF4b, we performed Cleavage Under Targets and Release Using Nuclease (CUT&RUN), a technique to map binding sites of specific proteins or histone modifications in the genome. We isolated E16.5 female germ cells using fluorescence-activated cell sorting (FACS) and examined the genomic localization of TAF4b, H3K4me3 (positive control and marker of promoter regions) and IgG (negative control). We performed two replicates of this experiment: Replicate 1 used 42,416 germ cells per tube (obtained from 12 embryos) and Replicate 2 used 63,079 germ cells per tube (obtained from 33 embryos). CUT&RUN data analysis using Homer identified 8129 H3K4me3 peaks and 983 TAF4b peaks in Replicate 1 and 320 H3K4me3 peaks and 1111 TAF4b peaks in Replicate 2 (Table S10). We also found that 90% and 95% of TAF4b peaks were classified as localizing to promoters/transcription start sites (‘Promoter-TSS’) for Replicates 1 and 2, respectively (Fig. 5A). Of all the genes that contained TAF4b promoter/TSS peaks, 449 overlapped between the replicates (Fig. 5B). However, it is clear when looking at some gene tracks (Polr2a, for example) that even when a TAF4b peak is identified in only one of the replicates, there is enrichment of TAF4b in the same location in the other replicate, suggesting that some TAF4b binding sites are below the limit of detection by our peak calling criteria (Fig. S8). When plotting the enrichment profile of TAF4b and H3K4me3 relative to TSSs, we found the highest TAF4b enrichment upstream of the TSS in both replicates (Fig. 5C). To examine more closely the localization of TAF4b signal near TSSs, we plotted the distance of TAF4b ‘promoter-TSS’ peaks from both replicates to the TSS (Fig. 5D). 
There was strong enrichment of TAF4b peaks between −200 bp and +50 bp from the TSS, with the highest number of TAF4b peaks located at −60 to −40 bp away from the TSS.
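The position of the most populated distance interval can be found by binning signed peak-to-TSS distances. The sketch below assumes 20 bp bins and uses made-up distances (negative values are upstream of the TSS); it illustrates the binning logic only and is not the procedure used to generate Fig. 5D.

```python
from collections import Counter


def bin_start(distance_bp: int, bin_width: int = 20) -> int:
    """Left edge of the bin containing a signed distance to the TSS.

    Floor division keeps upstream (negative) distances in the correct
    bin, e.g. -45 falls in the interval [-60, -40).
    """
    return (distance_bp // bin_width) * bin_width


def modal_bin(distances, bin_width: int = 20):
    """Return (bin_start, bin_end, count) for the most populated bin."""
    counts = Counter(bin_start(d, bin_width) for d in distances)
    start, n = max(counts.items(), key=lambda kv: kv[1])
    return start, start + bin_width, n


# Hypothetical peak-centre distances to the nearest TSS, in bp.
distances = [-55, -48, -42, -60, -120, -10, 30, -45]

top_bin = modal_bin(distances)
print(top_bin)
```

With real data the choice of bin width matters: bins that are much wider than the spacing between promoter elements would blur the distinction between a peak centred near −50 bp and one centred on the canonical TATA-box position.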
We performed GO analysis of the shared TAF4b-bound gene promoter-TSSs between the two replicates and found categories related to mRNA processing, DNA repair and chromatin remodeling (Fig. 5E). To determine whether the transcriptomic effects of Taf4b deficiency on the X chromosome arise from greater X chromosome localization, we plotted the expected versus observed number of peaks using the TAF4b peaks that were categorized as ‘Promoter-TSS’ and, surprisingly, found that there were fewer X chromosome peaks than expected (Fig. 5F). Given that we found that there are more DEGs between Taf4b-heterozygous and -deficient oocytes on the X chromosome than expected, this suggests that there might be an indirect but disproportionately high effect of TAF4b on the X chromosome in E16.5 oocytes. However, we cannot claim this approach thoroughly annotates all TAF4b-bound sites in the developing female germ cell genome.
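An expected-versus-observed comparison of this kind can be sketched as follows. The example assumes that the expected number of peaks on a chromosome is proportional to that chromosome's share of annotated promoters and uses a one-sided binomial probability to ask whether the observed count is unusually low; both the null model and the toy numbers are assumptions for illustration, not the method or values behind Fig. 5F.

```python
from math import comb


def expected_count(total_peaks: int, chromosome_fraction: float) -> float:
    """Expected peaks on a chromosome under a proportional null model."""
    return total_peaks * chromosome_fraction


def binom_lower_tail(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p): chance of seeing k or fewer peaks."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Toy example: 10 peaks in total, a chromosome holding half of all
# promoters (hypothetical), and only 2 peaks observed on it.
expected = expected_count(total_peaks=10, chromosome_fraction=0.5)
p_depleted = binom_lower_tail(k=2, n=10, p=0.5)
print(expected, p_depleted)
```

The same functions can be applied to DEG counts per chromosome, which makes it straightforward to contrast depletion of binding sites with enrichment of expression changes on the same chromosome.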
This CUT&RUN experiment allowed us to begin to examine which DEGs identified in our RNA-seq experiment were putative direct targets of TAF4b. When comparing our DEGs to the ‘Promoter-TSS’ peaks of TAF4b, we found 129 DEGs that had at least one peak near their TSS (Fig. 6A). GO analysis of these genes found that the enriched categories pertained to chromatin modification and organization (Fig. 6B). A volcano plot of these TAF4b-bound DEGs revealed a skew in the number of upregulated versus downregulated DEGs, with 34 TAF4b-bound DEGs being downregulated and 95 TAF4b-bound DEGs being upregulated (Fig. 6C). This means there were nearly three times as many TAF4b-bound upregulated DEGs as downregulated DEGs, suggesting that TAF4b may primarily antagonize mRNA levels in developing oocytes. As examples of TAF4b-bound DEGs, we present gene tracks for JunD, Sp1, Fmr1 and Taf4b (Fig. 6D). JunD, Sp1 and Fmr1 were all upregulated in Taf4b-deficient oocytes. These data suggest that TAF4b negatively regulates the expression levels of these transcription factors and Fmr1. Therefore, we next determined whether the level of Fragile X mental retardation protein (FMRP), encoded by Fmr1, was also perturbed in Taf4b-deficient embryonic oocytes. To do this, we performed immunofluorescent staining of FMRP and the nuclear germ cell marker Tra98 in E16.5 Taf4b-deficient and wild-type ovary tissue sections (Fig. 6E). First, we focused on germ cell clusters within each tissue section and we then quantified the levels of fluorescence for both the FMRP and Tra98 channels within each individual cluster. We found that there was a modest but statistically significant (P<0.01) increase in FMRP signal intensity in Taf4b-deficient germ cell clusters compared with wild type. 
However, there was no significant difference in Tra98 signal intensity between wild-type and Taf4b-deficient germ cell clusters, indicating that the increase in FMRP signal was not due to an increase in oocyte numbers (Fig. 6F). The increased FMRP levels are consistent with upregulated Fmr1 mRNA levels in Taf4b-deficient E16.5 and E17.5 oocytes as shown by RNA-seq and qRT-PCR, respectively (Fig. S2B,D).
We then identified conserved motifs in TAF4b ‘promoter-TSS’ peaks and were surprised to find that TATA-box was not among the top five motifs from the 129 DEGs that had at least one TAF4b ‘promoter-TSS’ peak (Fig. 7A). Instead, GC-box motifs, which are bound by the Sp/KLF family of transcription factors, and the CCAAT-box, bound by NFY, dominated the list. TAF4b peak motif analysis for each female replicate, as well as all TAF4b motifs combined, yielded the same five motifs (Fig. S9A,B,F). Examining the peaks associated with upregulated and downregulated DEGs did not reveal any conclusive differences in TAF4b-bound motifs (Fig. S9C-D). As we had previously noted, the highest number of TAF4b ‘promoter-TSS’ peaks were located −60 to −40 bp away from the TSS (Fig. 5D), suggesting that TAF4b may be binding a few nucleotides upstream of the canonical −25 to −30 bp TATA-box location in mouse embryonic germ cells. To explore this further, we created box plots of the distance to the TSS (no outliers included) of the following: all TAF4b ‘promoter-TSS’ peaks; all TAF4b ‘promoter-TSS’ peaks for genes that were also DEGs; TAF4b ‘promoter-TSS’ peaks for genes that were only downregulated DEGs; and TAF4b ‘promoter-TSS’ peaks for genes that were only upregulated DEGs (Fig. 7B). Their median locations from the TSS were −65 bp, −87.5 bp, −72.5 bp and −104 bp, respectively, with there being a significant difference in the location between all TAF4b ‘promoter-TSS’ peaks and all TAF4b ‘promoter-TSS’ peaks for genes that were also DEGs. When performing motif enrichment on ‘promoter-TSS’ peaks based on distance to TSS, we found that the same motifs (NFY and Sp1) were in the top three frequently rather than strongly varying based on location (Fig. S9E). 
This integration of RNA-seq and CUT&RUN data suggests that TAF4b directly regulates chromatin remodeling and modification genes in oocytes, perhaps through an unconventional protein-protein interaction that prioritizes other motifs just upstream of the TATA-box (discussed below). However, more canonical functions of TAF4b cannot be ruled out, as ‘TATA-Box (TBP)/Promoter’ did appear as a significantly enriched motif in both replicates; it was ranked 140 in Replicate 1 and 137 in Replicate 2.
We then evaluated whether the genes that were commonly associated with TAF4b-bound motifs in E16.5 oocytes were dynamic in their expression over germ cell development. We re-examined our re-processed scRNA-seq dataset from E12.5-E16.5 mouse oocytes for the gene expression profiles of Nfya, Nfyb, Nfyc, Sp1, Sp2, Klf3 and Sp5 (Fig. 7C). All genes, with the exception of Nfyc, were relatively unchanged over the time and pseudotime courses in mouse oocytes. Nfyc showed its highest expression in the E14.5 cells that were closest to E16.5 in pseudotime. These data indicate that if TAF4b is directly interacting with one or more of these proteins, it might be TAF4b that provides the dynamic expression in germ cells rather than its potential binding partner.
Proper establishment of the ovarian reserve is essential for the reproductive capacity of female mammals, including both humans and mice. This healthy establishment of female gametes is orchestrated through complex oocyte transcription networks that must also properly distinguish germ cell and somatic cell lineages. In addition to the more well-known enhancer-bound transcription activators and repressors, tissue-selective components of the basal transcription machinery can help impart such exquisite regulatory control (Freiman, 2009; Goodrich and Tjian, 2010). We have previously shown that the TAF4b subunit of the TFIID complex is required for proper establishment of the ovarian reserve in the mouse (Grive et al., 2014, 2016; Lovasco et al., 2010). However, the network of genes regulated by TAF4b to accomplish this crucial task has been elusive, until now. Here, we show that TAF4b directly and indirectly regulates genes essential for proper meiotic progression during early oocyte differentiation. Integration of RNA-seq and CUT&RUN data in E16.5 mouse oocytes reveals germ cell-intrinsic regulation by TAF4b in the promoter-proximal regions of chromatin modification and organization genes. Furthermore, we discovered an unexpected link between Taf4b deficiency in the mouse and the proper expression of the mouse X chromosome, and similarities to the transcriptome of TS, a well-known cause of POI in women (Sangrithi et al., 2017). Surprisingly, TATA-box motifs were not among the top binding motifs in either female oocyte replicate shown by CUT&RUN nor was the peak enrichment of TAF4b at the expected location (Fig. 8A). Together, these molecular insights suggest that TAF4b directly regulates genes instrumental in establishing the finite ovarian reserve and that TAF4b may have a non-canonical function in mouse oocytes outside of TFIID or in an unconventional version of TFIID (Fig. 8B).
TFIID was first discovered as a large multi-protein complex required for activator-dependent RNA polymerase II transcription (Dynlacht et al., 1991; Reinberg et al., 1987). Characterization of the composition of TFIID revealed a key DNA-binding subunit, TBP, that binds directly to the TATA-box found at the −25 nucleotide position in relation to the TSS of many genes (Hoey et al., 1990). Surprisingly, our embryonic germ cell CUT&RUN data place the peak of TAF4b binding at GC- and CCAAT-box sequences −40 to −60 bp upstream of the TSS (with TAF4b-bound DEGs containing peaks even further upstream), but still proximal to the TSS. These sequences are well-known binding sites for specificity protein 1 (Sp1) and nuclear factor Y (NFY) transcription factors, which are known to play extensive roles in promoter-proximal transcription and are ubiquitously expressed. Although we do not yet know the significance of these binding sites and the occupancy of TAF4b, there are interesting clues in the published literature. Hibino et al. (2016) showed that there is a direct interaction between human SP1 and TAF4a through their intrinsically disordered domains. TAF4b lacks most of the large intrinsically disordered regions that TAF4a contains, but TAF4b was not tested in that study. Therefore, we do not know whether TAF4b might have some capacity to bind to Sp1. Because Sp1 is highly and ubiquitously expressed in cell types throughout the body, a tidy hypothesis would be that Sp1 provides the DNA-binding capacity and TAF4b provides the germ cell expression specificity (Fig. 8B). We also know that NFY is a protein complex with three components: NF-YA, NF-YB and NF-YC. Both NF-YB and NF-YC contain histone fold domains and heterodimerize, leading to the NFY protein complex acting in a sequence-specific, histone-like mode of DNA binding (Nardini et al., 2013). 
TAFs, including TAF4b, also contain histone-fold domains and perhaps this shared feature could enable their cooperation through di-/trimerization in oocytes. Interestingly, Nfyb was a downregulated DEG in our E16.5 RNA-seq experiment and Nfya and Nfyc were non-significantly decreased (Table S3). Furthermore, Replicate 2 of our CUT&RUN experiment contained a TAF4b peak in the ‘promoter-TSS’ region for Nfya (Table S10). Therefore, we have some limited evidence that the factors whose motifs comprise TAF4b-bound promoter regions have a connection to TAF4b in our own data. Further molecular investigations are required to unravel the germ cell specificity of this regulatory logic and the exact protein binding partners of TAF4b.
Similar diversification of selective TFIID subunits has occurred within germline development of highly distant organisms, including insects, vertebrates and plants. In Drosophila, several testis-specific TAFs (tTAFs) play a crucial role in regulating transcription and the timing of spermatogenic differentiation, and a germ cell-expressed TBP paralog, TBP-related factor 2 (TRF2) is required for oogenesis (Gazdag et al., 2009; Hiller et al., 2004). The mouse ortholog of TRF2, called TBPL1, is required for spermiogenesis, as is TAF7l, which is coordinately expressed with TAF4b in early meiotic oocytes (Zhou et al., 2013a). Interestingly, TBPL1, TAF7l and TAF9b have also been shown to be essential for muscle and adipocyte differentiation, indicating that this diversification of TFIID subunits has evolved to regulate both somatic and germ cell differentiation, sometimes via the identical subunit (Herrera et al., 2014; Zhou et al., 2013b). The most striking parallel of the early meiotic transcription and chromatin functions of TAF4b shown here lie with a natural variant of TAF4b found in Arabidopsis (AtTAF4b) (Lawrence et al., 2019). This recent study has identified a similar timing of the meiocyte transcriptome regulated by AtTAF4b as we show here for mouse TAF4b. Although arising independently in the plant and animal kingdoms, there appears to be some common transcription and/or chromatin state that is regulated by TAF4b to ensure the fidelity of meiotic recombination and early oogenesis. We have previously found other TFIID subunits, such as Taf7l and Taf9b, to be preferentially and dynamically expressed in embryonic mouse oocytes (Gura et al., 2020). One hypothesis to explain these observations is that a germ cell-specific version of TFIID may exhibit characteristics and targets that differ from those of canonical TFIID (Fig. 8B).
One major limitation of this study is the heterogeneity inherent to experiments on E16.5 oocytes, even ones that have been sorted for GFP fluorescence. Because oocytes progress through meiosis I asynchronously, some oocytes will have advanced as far as pachynema, whereas others may still be in leptonema. Furthermore, we know meiotic progression is slowed in Taf4b-deficient oocytes (Grive et al., 2016). Therefore, it is not possible to conclude from these aggregated data that TAF4b has a specific transcriptomic effect on any one of these substages, such as pachynema. One way to circumvent this issue would be to perform scRNA-seq analysis of Taf4b-deficient and wild-type oocytes, which would improve the resolution of the transcriptomic effects of Taf4b deficiency by allowing us to examine developmental markers and pseudotime trajectories. We could also use the ‘3S’ method to isolate precise prophase I substages of germ cells, albeit in male mice rather than female (Romer et al., 2018). Despite the challenges of this oocyte heterogeneity, we revealed some consistent trends in our data. One avenue of future research concerns chromatin modification and organization, because these categories were repeatedly found in our data. This study further corroborates earlier hints of the relevance of chromatin to TAF4b, where higher γH2AX, less recombination and more asynapsis were found in E16.5 Taf4b-deficient oocytes (Grive et al., 2016). Further exploration of the chromatin state in Taf4b-deficient oocytes will hopefully help bring the worlds of transcription and chromatin biology closer together in these newly born oocytes.
In addition to illuminating the molecular underpinnings of TAF4b function, we discovered unexpected overlaps between Taf4b deficiency and other known causes of POI. Many individuals with TS experience POI, which includes absent or delayed menarche and primary amenorrhea. Recent research has suggested that excessive prenatal oocyte loss may underlie the ovarian insufficiency in TS. Excessive oocyte attrition at the perinatal DNA-damage checkpoint, where oocytes that have not resolved DNA damage or still contain asynapsed chromosomes are eliminated, results in a depleted ovarian reserve and its downstream sequelae. Our observation of deregulation of the X chromosome in Taf4b-deficient E16.5 oocytes was surprising and prompted us to compare Taf4b deficiency and a mouse model of TS in which females contain a single X chromosome (XO) (Grive et al., 2016). The extensive overlap of the deregulated gene expression in XO and Taf4b-deficient early meiotic oocytes was striking. Importantly, Taf4b expression itself was compromised in the XO oocytes, indicative of potential mutual regulation between these two genetic changes. A recent report uncovered a unique mechanism of XX dosage compensation in human primordial oocytes and it is possible that TAF4b plays an integral role in this sexually dimorphic mechanism of X-chromosome regulation (Chitiashvili et al., 2020). Another interesting parallel is the association of the X chromosome-encoded Fmr1 gene with TS and Taf4b deficiency. Taf4b expression is significantly correlated with Fmr1 in embryonic human ovaries, mutation of Fmr1 is one of the most common underlying genetic causes of POI, and TS individuals are missing one copy of Fmr1. Here, we show that TAF4b directly associates with the proximal promoter region of Fmr1 and the loss of TAF4b increases its mRNA abundance.
Interestingly, the peak of TAF4b at Fmr1 was not localized to the CGG repeats of the gene, which are the underlying cause of the contribution of Fmr1 to POI incidence (Fortuño and Labarta, 2014). Given the clear links between Taf4b deficiency, TS, FXPOI and POI presented here, we suspect that a core group of the genes identified in this study are required for the proper development of the ovarian reserve in humans and that, if that quorum is not reached, a similar cascade of dysregulated gene expression occurs. Understanding these and other causes of POI will clarify the best ways to manage these related infertility syndromes and improve assisted reproduction therapies.
MATERIALS AND METHODS
This study was approved by Brown University IACUC protocol #21-02-0005. The primary method of euthanasia was CO2 inhalation and the secondary method was cervical dislocation, both as per American Veterinary Medical Association (AVMA) guidelines on euthanasia.
Mice that were homozygous for an Oct4-EGFP transgene (The Jackson Laboratory: B6;129S4-Pou5f1tm2Jae/J) were backcrossed to the C57BL/6 line and mated for CUT&RUN collections. Mice that were homozygous for an Oct4-EGFP transgene (The Jackson Laboratory: B6;129S4-Pou5f1tm2Jae/J) and C57BL/6 mice heterozygous for the Taf4b-deficiency mutation (in exon 12 of the 15 total exons of the Taf4b gene, which disrupts the endogenous Taf4b gene) were mated for mRNA collections. Timed matings were estimated to begin at day E0.5 by evidence of a copulatory plug. The sex of the embryos was identified by confirming the presence or absence of testicular cords. Genomic DNA from tails was isolated using Qiagen DNeasy Blood & Tissue Kits (69506) for PCR genotyping assays.
All animal protocols were reviewed and approved by Brown University Institutional Animal Care and Use Committee and were performed in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. Ovaries were dissected out of embryos into cold PBS.
Embryonic ovary dissociation and FACS
To dissociate ovary tissue into a single-cell suspension, embryonic ovaries were harvested and placed in 0.25% Trypsin/EDTA and incubated at 37°C for 15 and 25 min for E14.5 and E16.5 ovaries, respectively, as previously described (Gura et al., 2020). Eppendorf tubes were flicked to dissociate tissue halfway through and again at the end of the incubation. Trypsin was neutralized with fetal bovine serum. Cells were pelleted at 1500 RPM (221 g) for 5 min, the supernatant was removed, and cells were resuspended in 100 μl PBS. The cell suspension was strained through a 35 μm mesh cap into a FACS tube (Gibco, 352235). Propidium iodide (1:500) was added to the cell suspension to distinguish between live and dead cells. FACS was performed using a Becton Dickinson FACSAria III in the Flow Cytometry and Cell Sorting Core Facility at Brown University. A negative control non-transgenic mouse ovary was used for each experiment to establish an appropriate GFP signal baseline. Dead cells were discarded and the remaining cells were sorted into GFP+ and GFP− samples in PBS at 4°C for each embryo.
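The centrifugation speeds in this section are reported both as rotor speed and as relative centrifugal force (RCF). The following Python sketch shows the standard conversion between the two; it is illustrative only. The rotor radius is not reported in the text, so the 8.8 cm value below is an assumption, chosen because it is close to the radius implied by the reported pair of 1500 RPM and 221 g.

```python
# Standard relation: RCF (in multiples of g) = 1.118e-5 * radius_cm * rpm**2.
# The rotor radius is NOT stated in the Methods; 8.8 cm is an assumed value.

RCF_CONSTANT = 1.118e-5

def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force for a given speed and rotor radius (cm)."""
    return RCF_CONSTANT * radius_cm * rpm ** 2

def radius_from_rcf(rcf, rpm):
    """Rotor radius (cm) implied by a reported RCF and RPM pair."""
    return rcf / (RCF_CONSTANT * rpm ** 2)

# Radius implied by the reported 1500 RPM (221 g).
implied_radius_cm = radius_from_rcf(221, 1500)
print(round(implied_radius_cm, 2))

# Forward conversion using the assumed 8.8 cm radius.
print(round(rcf_from_rpm(1500, 8.8), 1))
```

Because RCF depends on the rotor, reproducing these spins on a different centrifuge is more reliable when working from the g value than from the RPM value.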
For RNA-seq analysis, GFP+ cells from each individual embryo were kept in separate tubes and were then spun down at 1500 RPM (221 g) for 5 min, PBS was removed, and cells were then resuspended in Trizol (Thermo Fisher, 1556026). If samples had less than roughly 50 µl of PBS in the tube, Trizol was added immediately. The number of cells for each sample sequenced can be found in Table S2. We used five embryos per genotype and littermates at E16.5 because of the less clear results we received at E14.5. Five embryos per genotype, taken from the same three litters, were sufficient to overcome the embryo and oocyte heterogeneity inherent in this tissue type. Samples were stored at −80°C.
For CUT&RUN germ cells, all the collected ovaries were pooled prior to FACS. Sorted cells were then spun down at 1500 RPM (221 g) for 5 min and were resuspended in 300 µl of PBS, then split into three Eppendorf tubes. These three tubes of germ cells were then used for CUT&RUN. The numbers of cells for each sample were as follows: Replicate 1 female germ cell samples had 42,416 cells per tube (obtained from 12 embryos) and Replicate 2 female germ cell samples had 63,079 cells per tube (obtained from 33 embryos). We used many embryos per replicate because of the meiotic heterogeneity in oocytes, and we used two replicates in order to identify the genes that were consistently bound by TAF4b.
scRNA-seq data analysis
SRP193506 and SRP188873 were downloaded from the NCBI Sequence Read Archive (SRA) onto Brown University's high-performance computing cluster at the Center for Computation and Visualization. The fastq files were aligned using Cell Ranger (v 5.0.0) count and then aggregated using Cell Ranger aggr. The resulting output from aggr was used as input for Seurat (v 3.9.9) in RStudio (R v 4.0.2) (Stuart et al., 2019). Seurat was used to select for Dazl-positive (Dazl>0), high-quality (nFeature_RNA>1000, nFeature_RNA<5000, nCount_RNA>2500, nCount_RNA<30,000, mitochondrial genes<5%) oocytes. These data were then passed to Monocle3 (v 0.2.3) for pseudotime analysis and generating uniform manifold approximation and projection (UMAP) and gene expression data (Cao et al., 2019; Qiu et al., 2017; Trapnell et al., 2014). The cloupe file created from Cell Ranger aggr was used as input for Loupe Cell Browser (v 5.0), where the same filtering steps were used (Dazl>0, Feature Threshold>1000, Feature Threshold<5000, UMI Threshold>2500, UMI Threshold<30,000, Mitochondrial UMIs<5%). These filtered cells were then split into Taf4b-expressing (Taf4b>0) and Taf4b-off (Taf4b=0) and then ‘Locally Distinguishing’ was run for Significant Feature Comparison. The list of genes significantly associated with Taf4b-expressing cells (Table S1) was used as input for ClusterProfiler (v 3.16.1) to create a dotplot of significantly enriched GO categories (Yu et al., 2012).
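The cell-level filters described above amount to a single predicate applied to each cell. The following Python sketch is illustrative only and is not the Seurat code used in this study. The thresholds are taken from the text; `dazl_count` and `percent_mt` are hypothetical field names, and the four example cells are invented.

```python
# Illustrative sketch of the oocyte quality filter; not the study's Seurat code.
# Thresholds come from the Methods text. The field names dazl_count and
# percent_mt are hypothetical, and the example cells below are invented.

def passes_qc(cell):
    """Return True if a cell meets every threshold quoted in the Methods."""
    return (
        cell["dazl_count"] > 0                   # Dazl-positive
        and 1000 < cell["nFeature_RNA"] < 5000   # number of detected genes
        and 2500 < cell["nCount_RNA"] < 30000    # total UMI count
        and cell["percent_mt"] < 5               # mitochondrial reads (%)
    )

cells = [
    {"id": "cell_a", "dazl_count": 3, "nFeature_RNA": 2800, "nCount_RNA": 9000, "percent_mt": 2.1},
    {"id": "cell_b", "dazl_count": 0, "nFeature_RNA": 3100, "nCount_RNA": 12000, "percent_mt": 1.5},
    {"id": "cell_c", "dazl_count": 5, "nFeature_RNA": 800, "nCount_RNA": 2600, "percent_mt": 3.0},
    {"id": "cell_d", "dazl_count": 2, "nFeature_RNA": 4200, "nCount_RNA": 21000, "percent_mt": 7.4},
]

kept = [c["id"] for c in cells if passes_qc(c)]
print(kept)
```

In this invented example only `cell_a` is retained: `cell_b` lacks Dazl expression, `cell_c` has too few detected genes, and `cell_d` exceeds the mitochondrial threshold.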
Embryonic germ cells resuspended in Trizol were shipped to GENEWIZ (NJ, USA) on dry ice. Sample RNA extraction, sample QC, library preparation, sequencing, and initial bioinformatics were carried out at GENEWIZ. RNA was extracted following the Trizol Reagent User Guide (Thermo Fisher Scientific). Glycogen was added (1 µl, 10 mg/ml) to the supernatant to increase RNA recovery. RNA was quantified using Qubit 2.0 Fluorometer (Life Technologies) and RNA integrity was checked with TapeStation (Agilent Technologies) to determine whether the concentration met the requirements.
SMART-Seq v4 Ultra Low Input Kit for Sequencing was used for full-length cDNA synthesis and amplification (Clontech), and Illumina Nextera XT library was used for sequencing library preparation. The sequencing libraries were multiplexed and clustered on a lane of a flow cell. After clustering, the flow cell was loaded onto an Illumina HiSeq 4000 according to the manufacturer's instructions. The samples were sequenced using a 2×150 Paired End (PE) configuration. Image analysis and base calling were conducted by the HiSeq Control Software (HCS) on the HiSeq instrument. Raw sequence data (.bcl files) generated from Illumina HiSeq were converted into fastq files and de-multiplexed using bcl2fastq (v. 2.17). One mismatch was allowed for index sequence identification.
RNA-seq data analysis
Datasets SRP059601 and SRP059599 were taken from NCBI SRA. All raw fastq files were initially processed on Brown University's high-performance computing cluster. Reads were quality-trimmed and had adapters removed using Trim Galore! (v 0.5.0) with the parameters --nextera -q 10. Samples before and after trimming were analyzed using FastQC (v 0.11.5) for quality and then aligned to the Ensembl GRCm38 genome using HiSat2 (v 2.1.0) (Andrews, 2010; Pertea et al., 2016). Resulting sam files were converted to bam files using Samtools (v 1.9) (Li et al., 2009). E14.5 heterozygous bam files were downsampled because these samples had been sequenced more deeply than their wild-type and deficient counterparts.
To obtain TPMs for each sample, StringTie (v 1.3.3b) was used with the optional parameters -A and -e. A gtf file for each sample was downloaded and, using RStudio (R v 4.0.2), TPMs of all samples were aggregated into one comma separated (csv) file using a custom R script. To create interactive Microsoft Excel files for exploring the TPMs of each dataset the csv of aggregated TPMs was saved as an Excel spreadsheet, colored tabs were added to set up different comparisons, and a flexible Excel function was created to adjust to gene name inputs. See Tables S1, S3, S4, S7-10 for an Excel file of each dataset analyzed. To explore the Excel files, under the ‘Quick_Calc’ tab type the gene name of interest into the highlighted yellow boxes (Tables S3, S4, S8).
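The aggregation step described above, in which per-sample TPMs are combined into one comma-separated table, can be sketched as follows. This is a minimal Python illustration and not the custom R script used in this study. The sample names and TPM values are invented, and reporting a gene that is absent from a sample as 0.0 TPM is an assumption.

```python
import csv
import io

def aggregate_tpms(per_sample):
    """Merge {sample: {gene: TPM}} into a header and one row per gene."""
    samples = sorted(per_sample)
    genes = sorted({g for tpms in per_sample.values() for g in tpms})
    header = ["gene"] + samples
    # Assumption: a gene missing from a sample is reported as 0.0 TPM.
    rows = [[g] + [per_sample[s].get(g, 0.0) for s in samples] for g in genes]
    return header, rows

# Invented per-sample TPMs standing in for values parsed from StringTie output.
per_sample = {
    "WT_1": {"Taf4b": 12.5, "Fmr1": 30.0},
    "KO_1": {"Fmr1": 41.2, "Dazl": 88.0},
}

header, rows = aggregate_tpms(per_sample)

# Write the table to an in-memory buffer; a file handle would work the same way.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping one row per gene and one column per sample matches the layout needed for the spreadsheet lookups described above, where a gene name is entered and its TPMs are returned across samples.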
To obtain count tables, featureCounts (Subread v 1.6.2) was used (Liao et al., 2014). Metadata files for each dataset were created manually in Excel and saved as a csv. These count tables were used to create PCA plots by variance-stabilizing transformation (vst) of the data in DESeq2 (v 1.22.2) and plotting by ggplot2 (v 3.1.0) (Love et al., 2014; Wickham, 2016). DESeq2 was also used for differential gene expression analysis, with count tables and metadata files used as input. We accounted for the litter effect in our mouse oocytes by setting it as a batch parameter in DESeq2. For the volcano plot, the output of DESeq2 was used and plotted using ggplot2. DEG lists were used for ClusterProfiler (v 3.16.1) input to create dotplots of significantly enriched GO categories for all DEGs, downregulated DEGs and upregulated DEGs. Physical, highest-confidence protein-protein interactions were identified using STRING, with unconnected proteins not shown in the image (Szklarczyk et al., 2019).
For X chromosome analysis, expected numbers of downregulated and upregulated DEGs per chromosome were calculated by dividing the average number of observations per chromosome by the average number of total genes per chromosome. Chi-square values and P-values were calculated using the GraphPad QuickCalcs chi-square function, with observed and expected frequencies used as input (https://www.graphpad.com/quickcalcs/chisquared1/, accessed Jan 2021). Box plots of log2 fold change between the autosomes and X chromosomes used the output of DESeq2 as input, based on other publications comparing autosomal and X chromosome expression (Hirota et al., 2018). The X:A ratio was calculated using pairwiseCI (v. 0.1.27), a bootstrapping R package, after filtering genes for an average TPM>1 (Duan et al., 2019a; Sangrithi et al., 2017). The RXE was calculated using a custom R script after filtering genes for an average TPM>1 and adding pseudocounts for log transformation [log2(x+1)], based on other RXE publications (Duan et al., 2019b; Jue et al., 2013). The ‘ubiquitous genes’ from the data of Sangrithi et al. (2017) were converted from gene names to Ensembl IDs, first by using ShinyGO to convert IDs through the ‘Genes’ tab (Ge et al., 2020). Genes that were not mapped were then used as input for DAVID gene ID conversion, and any remaining unconverted gene names were manually entered into the Ensembl database to find matches (Howe et al., 2021; Huang et al., 2007). Venn diagrams were created using BioVenn (Hulsen et al., 2008). All plots produced in RStudio were saved as an EPS file type and then opened in Adobe Illustrator in order to export a high-quality JPEG image.
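The observed-versus-expected comparison described above can be sketched in a few lines. The Python example below is illustrative only and does not reproduce the GraphPad QuickCalcs workflow. It assumes one reading of the expected-count calculation: a genome-wide rate (total DEGs divided by total genes) multiplied by the number of genes on each chromosome. The chromosome names, DEG counts and gene counts are invented, and the P-value lookup is omitted.

```python
# Illustrative sketch; counts are invented and the interpretation of the
# expected-count calculation (genome-wide DEG rate x genes per chromosome)
# is an assumption, not a statement of the study's exact procedure.

def expected_counts(observed, genes_per_chrom):
    """Expected DEGs per chromosome under a uniform genome-wide DEG rate."""
    rate = sum(observed.values()) / sum(genes_per_chrom.values())
    return {chrom: rate * genes_per_chrom[chrom] for chrom in observed}

def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over chromosomes."""
    return sum(
        (observed[chrom] - expected[chrom]) ** 2 / expected[chrom]
        for chrom in observed
    )

observed = {"chr1": 10, "chr2": 20, "chrX": 30}            # invented DEG counts
genes_per_chrom = {"chr1": 1000, "chr2": 1000, "chrX": 1000}  # invented totals

expected = expected_counts(observed, genes_per_chrom)
statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

print(expected)
print(statistic, degrees_of_freedom)
```

The statistic and degrees of freedom would then be compared against the chi-square distribution to obtain a P-value, which is the step performed by the online calculator cited above.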
The CUT&RUN performed on E16.5 germ cells followed the protocol described by Hainer and Fazzio (2019). CUT&RUN antibodies were as follows: polyclonal rabbit TAF4b (as previously described; Grive et al., 2016), monoclonal rabbit H3K4me3 (EMD Millipore, 05-745R), rabbit IgG (Thermo Fisher, 02-6102), pA-MNase [generous gift from Dr Thomas Fazzio; the expression and purification of the pA-MNase were performed according to Schmid et al. (2004)]. All antibodies were used at a working dilution of 1:100.
For library preparation, the KAPA HyperPrep kit (Roche, 07962363001) was used with New England Biolabs NEBNext Multiplex Oligos for Illumina (NEB, E7335). After library amplification by PCR, libraries were size-selected by gel extraction (∼150-650 bp) and cleaned up using the Qiagen QIAquick Gel Extraction Kit (28704). CUT&RUN libraries in EB buffer were shipped to GENEWIZ (NJ, USA) on dry ice. Sample QC, sequencing and initial bioinformatics were performed at GENEWIZ.
The sequencing libraries were validated on the Agilent TapeStation and quantified using Qubit 2.0 Fluorometer (Invitrogen) as well as by quantitative PCR (KAPA Biosystems). The sequencing libraries were clustered on flow cells. After clustering, the flow cells were loaded on to the Illumina HiSeq instrument (4000 or equivalent) according to the manufacturer's instructions. The samples were sequenced using a 2×150 bp PE configuration. Raw sequence data (.bcl files) generated from Illumina HiSeq were converted into fastq files and de-multiplexed using bcl2fastq (v. 2.20). One mismatch was allowed for index sequence identification.
CUT&RUN data analysis
Computational scripts regarding CUT&RUN data analysis were based on other CUT&RUN publications (Hainer and Fazzio, 2019). All raw fastq files were initially processed on Brown University's high-performance computing cluster. Reads were quality-trimmed and had adapters removed using Trim Galore! (v 0.5.0) with the parameter -q 10 (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Samples before and after trimming were analyzed using FastQC (v 0.11.5) for quality and then aligned to the Ensembl GRCm39 using Bowtie2 (v 2.3.0). Fastq screen (v 0.13.0) was used to determine the percentage of reads uniquely mapped to the mouse genome in comparison to other species. Resulting sam files were converted to bam files, then unmapped, duplicated and low-quality mapped reads were removed using Samtools (v1.9). Resulting bam files were split into size classes using a Unix script. For calling peaks, annotating peaks and identifying coverage around TSSs, Homer (v 4.10) was used (Heinz et al., 2010). For gene track visualization, the final bam file before splitting into size classes was used as input to Integrative Genomics Viewer (IGV) (Robinson et al., 2011). A custom genome was created using a genome fasta and gtf file for Ensembl GRCm39.
Pie charts were created using data from Homer output and Venn diagrams were created using BioVenn. For X chromosome analysis, expected numbers of promoter peaks per chromosome were calculated by dividing the average number of observations per chromosome by the average number of total protein-coding genes per chromosome. Chi-square values and P-values were calculated using the GraphPad QuickCalcs chi-square function, with observed and expected frequencies used as input (https://www.graphpad.com/quickcalcs/chisquared1/, accessed Apr 2021). Dotplots of Promoter-TSS peaks were made using ClusterProfiler. To determine which TAF4b peaks were shared between female replicates and RNA-seq DEGs, the Ensembl ID associated with the annotation was used. TSS plots were created using the ‘tss’ function of Homer and plotted using Microsoft Excel. All plots produced in RStudio were saved as an EPS file type and then opened in Adobe Illustrator in order to export a high-quality JPEG image.
Total RNA was extracted from E17.5 female Oct4-EGFP+ germ cells using a Direct-zol RNA miniprep kit (Zymo Research, R2027, C1004). Total RNA from all experiments was quantified and checked for purity, and 50 ng was used to prepare 20 μl of cDNA with an iScript cDNA Synthesis Kit (Bio-Rad, 170-8891). Real-time PCR was performed in technical triplicate using 1 μl of DNA template, 12.5 μl of ABI SYBR green PCR master-mix (Applied Biosystems, A25742), and 0.5 μM custom oligos (Invitrogen) for Fmr1, Sp1, Fam83d, JunD, Taf4b or 18S rRNA in a 20 μl reaction in a ViiA 7 Real Time PCR machine (Life Technologies). Data were analyzed by the ΔΔCt method, and relative expression levels were normalized to 18S rRNA. Primer sequences corresponding to genes of interest can be found in Table S11.
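For readers unfamiliar with the ΔΔCt method, the calculation can be sketched as follows. This Python example is illustrative only: the Ct values are invented, technical triplicates are averaged before subtraction, and the formula assumes roughly 100% amplification efficiency (a doubling per cycle), which the text does not state.

```python
from statistics import mean

def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(target_cts) - mean(reference_cts)

def relative_expression(sample_target, sample_ref, control_target, control_ref):
    """Fold change of sample versus control by the delta-delta-Ct method.

    Assumes ~100% amplification efficiency, so each cycle is a 2-fold change.
    """
    ddct = delta_ct(sample_target, sample_ref) - delta_ct(control_target, control_ref)
    return 2 ** (-ddct)

# Invented technical triplicates; the reference gene here stands in for 18S rRNA.
control_target = [25.0, 25.0, 25.0]
control_ref = [10.0, 10.0, 10.0]
sample_target = [24.0, 24.0, 24.0]
sample_ref = [10.0, 10.0, 10.0]

fold_change = relative_expression(sample_target, sample_ref, control_target, control_ref)
print(fold_change)
```

In this invented case the target gene crosses threshold one cycle earlier in the sample than in the control at equal reference Ct, which corresponds to a two-fold higher relative expression.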
Prenatal ovaries were harvested at E16.5, cleaned of excess fat, and fixed in 4% formaldehyde solution for 2 h before embedding in Optimal Cutting Temperature (OCT) compound. Ovaries were serially sectioned at 8 μm on a Leica Cryostat onto glass slides and washed twice for 5 min in PBS-T [1× PBS containing 0.1% Tween-20 (Fisher Scientific)] at room temperature. Tissue sections were then incubated in blocking buffer [5% goat serum (Sigma-Aldrich) and 0.1% Tween-20 (Fisher Scientific) in 1× PBS] for 1 h at room temperature. Slides were stained with rat anti-Tra98 (Abcam, ab82527) and rabbit anti-FMRP (Abcam, ab17722) primary antibodies at 1:100 in blocking buffer and incubated overnight at 4°C. Slides were then washed three times for 10 min each in PBS-T at room temperature. Slides were stained with goat anti-rat Alexa 594 (Invitrogen, A11007) and goat anti-mouse Alexa 488 (Abcam, ab150077) secondary antibodies at 1:500 in blocking buffer and counterstained with 4′,6-diamidino-2-phenylindole (DAPI; Vector Laboratories) and incubated for 1 h at room temperature. Slides were washed three times for 10 min each in PBS-T at room temperature before mounting coverslips with Vectashield Antifade Mounting Medium (Vector Laboratories). A ‘secondary antibody-only’ control was included to compare background staining. Images were taken at 40× magnification on a Zeiss Axio Imager M1 microscope (Carl Zeiss).
Quantification of fluorescence intensity
Ovary tissue sections from two E16.5 Taf4b-deficient and wild-type animals were stained with FMRP and Tra98 and used for fluorescence intensity quantification. Five spatially distributed sections from each ovary were imaged at 40× magnification on a Zeiss Axio Imager M1 microscope (Carl Zeiss). Fiji ImageJ software was used to visualize images and perform quantification of fluorescence intensity (Schindelin et al., 2012). First, germ cell clusters were identified within each tissue section and cropped to generate a 60 µm square image containing the red (Tra98), green (FMRP) and blue (DAPI) channels. The freehand tool was then used to outline the germ cell cluster within the cropped image. Next, either the FMRP or Tra98 channel was selected, and the ‘Measure’ functionality under the ‘Analyze’ menu was used to quantify the mean pixel intensity within the outlined area. The mean FMRP and Tra98 pixel intensity was measured from a total of 53 wild-type and 70 Taf4b-deficient germ cell clusters. The mean and standard deviation of the values collected for each genotype were calculated and statistical significance was calculated using an unpaired, two-tailed Student's t-test. Jitter plots were generated using the ggplot2 (v 3.1.0) package in R. All plots produced in RStudio were saved as an EPS file type and then opened in Adobe Illustrator in order to export a high-quality JPEG image.
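The unpaired Student's t-test used for the intensity comparison can be sketched as follows. This Python example is illustrative only: the intensity values are invented, the function computes the equal-variance (pooled) t statistic and its degrees of freedom, and the conversion of that statistic to a two-tailed P-value via the t distribution is omitted.

```python
from math import sqrt
from statistics import mean, variance

def students_t(group_a, group_b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / standard_error
    return t_stat, df

# Invented mean pixel intensities for two small groups of germ cell clusters.
wild_type = [1.0, 2.0, 3.0]
deficient = [2.0, 3.0, 4.0]

t_stat, df = students_t(wild_type, deficient)
print(t_stat, df)
```

With the 53 wild-type and 70 Taf4b-deficient clusters reported above, the same calculation would have 121 degrees of freedom.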
We thank Drs Ashley Webb, Erica Larschan, Mark Johnson, and Kathryn Grive for their helpful input throughout these studies. We thank the Center for Computation and Visualization at Brown University for computational resources for scRNA-seq, RNA-seq and CUT&RUN data analysis. We thank Kevin Carlson and the Brown University Flow Cytometry and Sorting Facility for expertise completing the flow sorting. The Brown University Flow Cytometry and Sorting Facility has received generous support in part from the National Institutes of Health (NCRR Grant No. 1S10RR021051) and the Division of Biology and Medicine, Brown University. As many of our insights were gained by reprocessing publicly available datasets, we greatly appreciate both the researchers who generated and shared the data initially and the respective repositories for making them available. We dedicate this manuscript to the fond memory of our dear colleague, developmental biologist extraordinaire, friend and teacher Dr John Coleman.
Conceptualization: M.A.G., R.N.F.; Methodology: M.A.G., T.W., H.K., J.M.A.T., T.G.F., R.N.F.; Formal analysis: M.A.G., S.R., K.M.A., K.A.S., T.W.; Investigation: M.A.G., S.R., K.M.A., K.A.S.; Resources: T.G.F.; Data curation: M.A.G.; Writing - original draft: M.A.G.; Writing - review & editing: M.A.G., S.R., K.M.A., K.A.S., T.W., H.K., J.M.A.T., T.G.F., R.N.F.; Visualization: M.A.G.; Supervision: K.A.S., R.N.F.; Project administration: R.N.F.; Funding acquisition: M.A.G., K.M.A., T.G.F., R.N.F.
This work was funded by the National Institute of Child Health and Human Development (1F31HD097933, 1F31HD105340, 1R01HD072122 and 1R01HD091848 to M.A.G., K.M.A., T.G.F. and R.N.F., respectively). We also thank the United States-Israel Binational Science Foundation (BSF) for their generous support. Deposited in PMC for release after 12 months.
All computational scripts used in this publication are available to the public at https://github.com/mg859337/Gura_et_al._TAF4b_transcription_networks_regulating_early_oocyte_differentiation. The female mouse E14.5 and E16.5 RNA sequencing data are available in NCBI Gene Expression Omnibus (GEO) under accession number GSE174366 and NCBI Sequence Read Archive (SRA) under accession numbers SRP319538 and SRP319541. The female mouse E16.5 CUT&RUN sequencing data are available from NCBI GEO under accession number GSE186991 and NCBI SRA under accession number SRP344210. The sequencing datasets accessed in this research are from the following accession numbers: the scRNA-seq mouse data from E12.5-E16.5 female sorted Oct4-EGFP ovaries used for Fig. 1 were obtained through NCBI GEO: GSE130212; the scRNA-seq mouse data from E11.5-E14.5 female ovaries used for Fig. S1 were obtained through NCBI GEO: GSE128553; the RNA-seq mouse data from E9.5-E18.5 XX and XO sorted Oct4-EGFP ovaries used for Fig. 4 and Fig. S6 were obtained through ArrayExpress: E-MTAB-4616; the RNA-seq data from in vitro differentiated XX and XO mouse ESCs used for Fig. S7 were obtained through NCBI GEO: GSE121299.
Peer review history
The peer review history is available online at https://journals.biologists.com/dev/article-lookup/doi/10.1242/dev.200074.
The authors declare no competing or financial interests.