ABSTRACT
The mammalian male germline is sustained by a pool of spermatogonial stem cells (SSCs) that can transmit both genetic and epigenetic information to offspring. However, the mechanisms underlying epigenetic transmission remain unclear. The histone methyltransferase Kmt2b is highly expressed in SSCs and is required for the SSC-to-progenitor transition. At the stem-cell stage, Kmt2b catalyzes H3K4me3 at bivalent H3K27me3-marked promoters as well as at promoters of a new class of genes lacking H3K27me3, which we call monovalent. Monovalent genes are mainly activated in late spermatogenesis, whereas most bivalent genes are not expressed until embryonic development. These data suggest that SSCs are epigenetically primed by Kmt2b in two distinguishable ways for the upregulation of gene expression both during the spermatogenic program and through the male germline into the embryo. Because Kmt2b is also the major H3K4 methyltransferase for bivalent promoters in embryonic stem cells, we also propose that Kmt2b has the capacity to prime stem cells epigenetically.
INTRODUCTION
One major feature of stem cells is the priming potential for future lineage development (Harikumar and Meshorer, 2015). In embryonic stem cells (ESCs), transcriptional priming is epigenetically mediated by ‘bivalent chromatin’ at promoters carrying both active (H3K4me3) and silent (H3K27me3) marks (Azuara et al., 2006; Bernstein et al., 2006). Upon differentiation, the bivalent promoters are resolved for later activation or silencing depending on the lineage specified. Bivalent chromatin is also prevalent in various adult cell types, including hematopoietic stem cells and intestinal epithelial cells, and appears to support tissue-specific epigenetic programs (Cui et al., 2009; Jadhav et al., 2016).
In the male germline, bivalent chromatin is associated with silencing of non-meiotic genes (Hasegawa et al., 2015; Sin et al., 2015) and the priming of embryonic developmental programs for the next generation, which is conserved across mammalian species (Lesch et al., 2016; Teperek et al., 2016). Indeed, disruption of histone methylation during spermatogenesis by overexpression of the H3K4 demethylase Lsd1 (Kdm1a) affects the health of offspring (Siklenka et al., 2015). Germline epigenetic marks may be transmitted to embryos through the 1-10% of histones that are not subject to the extensive histone-to-protamine exchange that occurs during the final stage of spermatogenesis (Arpanahi et al., 2009; Brykczynska et al., 2010; Erkek et al., 2013; Hammoud et al., 2009). Whereas the H3K27me3 mark of germline bivalency involves the germline-specific Polycomb-group (Pc-G) protein Scml2 (Maezawa et al., 2018), the molecular mechanisms for H3K4me3 deposition at bivalent promoters remain unclear.
In mouse ESCs, Kmt2b (also known as Mll2 or Wbp7), one of the six mammalian Set1/Trithorax-type H3K4 methyltransferases, catalyzes H3K4me3 at bivalent promoters but is not required for ESC survival (Denissov et al., 2014). However, Kmt2b is required for both male and female germ cell development (Andreu-Vieyra et al., 2010; Glaser et al., 2009). These lines of evidence prompted the hypothesis that Kmt2b is the pioneer H3K4 methyltransferase responsible for priming the epigenome (Glaser et al., 2009) and led us to investigate whether Kmt2b plays a priming role in the male germline at the stem cell stage. In the mouse, spermatogenesis is supported by self-renewal and differentiation of spermatogonial stem cells (SSCs) (de Rooij and Russell, 2000). The SSC population is suggested to correspond well to the Plzf (Zbtb16)+/Kit− undifferentiated type-A spermatogonia because repressive heterochromatin, a feature typical of lineage-committed cells (Hawkins et al., 2010), is not evident until differentiation to Kit+ spermatogonia (Shirakawa et al., 2013) (Fig. S1A). To address the outstanding questions regarding the acquisition of H3K4me3 priming in the germline and possible Kmt2b action in SSCs, we used histological and molecular approaches to investigate the effect of Kmt2b knockout in male germ cells in mice. We found that Kmt2b is highly expressed in Plzf+/Kit− SSCs and its absence leads to an SSC-to-progenitor transition failure. At the stem-cell stage, Kmt2b conveyed H3K4me3 at bivalent genes associated with embryonic development. Notably, Kmt2b also targets a new class of genes lacking H3K27me3, or ‘monovalent genes’, which are mainly activated during late spermatogenesis. Thus, we suggest that Kmt2b is an SSC priming factor, marking two sets of promoters – one activated within the germline and the other in the next generation.
RESULTS
Kmt2b is dispensable for SSC survival but is required for SSC-to-progenitor transition
We previously reported that postnatal Kmt2b knockout leads to male infertility owing to an aberration likely occurring between spermatogonia and spermatocytes (Glaser et al., 2009). To investigate further the precise time window when Kmt2b is required and to assess the Kmt2b activity in SSCs, we used postnatal mice carrying floxed alleles of Kmt2b (termed Kmt2bF/F) and Rosa26-CreERT2 to conditionally delete Kmt2b by tamoxifen (TMX) administration (to generate the FC/FC genotype) (Glaser et al., 2006) (Fig. 1A; Fig. S1B,C). Kmt2bFC/FC testes were 2.8-fold smaller on average compared with Kmt2bF/F (41.7 mg and 118.8 mg, respectively; P=6.8e−08) despite no difference in body weight (36.9 g and 37.0 g, respectively; P=1) (Fig. 1B,C). Kmt2bFC/FC seminiferous tubules visualized by laminin staining were deformed and lacked proper germinal epithelial layers and lost a substantial number of germ cells stained with Tra98 (Tanaka et al., 1997) (Fig. S1D), thus confirming and extending the testicular phenotypes observed previously (Glaser et al., 2009).
Kmt2b is required for SSC-to-progenitor transition. (A) TMX administration schedule for Kmt2b knockout (see Materials and Methods for details). (B) Testes from Kmt2bF/F and FC/FC males at 4 months of age. (C) Dot plot showing weight of each testis (left) and body weight (right) of Kmt2bF/F and FC/FC mice (testes, n=20 of each genotype; mice, n=11 of each genotype). Horizontal bars, mean. (D) IHC images of seminiferous tubules at 4 months stained for Kmt2b (magenta), Plzf (green) and Kit (orange). Dotted lines mark seminiferous tubules. Scale bars: 50 µm (left) and 5 µm (right). (E) Average cell count of Plzf+/Kit− and Plzf−/Kit+ spermatogonia using IHC images. Twenty seminiferous tubules per mouse (n=3) were assessed for each genotype. Values were normalized as the number of cells per tubule per 100 mg of testis. Error bars represent s.e.m. P-values were calculated by Wilcoxon rank sum test. (F) Testis sections stained for H3K4me3 (magenta), Plzf (green) and Kit (orange). Scale bars: 50 µm (left) and 5 µm (right). In D and F, blue boxes indicate the areas enlarged on the right. Plzf+/Kit− SSCs are indicated by yellow arrowheads.
Immunohistochemical (IHC) analysis using antibodies for Kmt2b, the SSC marker Plzf (Buaas et al., 2004; Costoya et al., 2004) and the progenitor marker Kit (Ohbo et al., 2003; Shinohara et al., 2000) was used to examine the SSC-to-progenitor transition. In the control, we found that Kmt2b protein was most highly expressed in SSCs (Plzf+/Kit−). Weak expression was also detected in Plzf−/Kit+ progenitor spermatogonia and Sertoli cells (Gata4+) (Fig. 1D, Fig. S1E). In the knockout, we confirmed that Kmt2b protein was reduced in spermatogonia (Fig. 1D, Fig. S1F), but persisted in Sertoli cells (Fig. S1E,F), which is consistent with the quiescence of adult Sertoli cells (Vergouwen et al., 1991) resulting in a decrease in Kmt2b protein turnover. Thus, the absence of Kmt2b in spermatogonia is likely to be the cause of the testicular phenotype. We next used IHC to characterize the surviving germ cells in the knockout. We found that Plzf+/Kit− SSCs were retained in Kmt2bFC/FC testis whereas Plzf−/Kit+ progenitors were lost (Fig. 1D). The number of remaining Plzf+/Kit− SSCs did not change (P=0.066), but that of Plzf−/Kit+ progenitors significantly decreased (P<2.2e−16) (Fig. 1E). These observations suggest that the loss of Kmt2b activity in SSCs impairs the SSC-to-progenitor transition. However, when we evaluated the H3K4me3 level by IHC, there was no evident decrease of H3K4me3 in the knockout spermatogonia (Fig. 1F). This observation suggests that Kmt2b conveys only a subset of spermatogonial H3K4me3 with the bulk deposited by a different enzyme(s).
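The normalization used for the cell counts in Fig. 1E (cells per tubule per 100 mg of testis) and the reported 2.8-fold difference in testis weight can be illustrated with a minimal sketch. The function name and the example counts are hypothetical, and the formula is only one plausible reading of the normalization described in the figure legend; the mean testis weights are those reported in the text.

```python
from statistics import mean


def normalized_cells_per_tubule(cell_counts, testis_weight_mg):
    """Mean cells per tubule, scaled to 100 mg of testis.

    Assumed reading of the Fig. 1E normalization: the per-tubule mean is
    divided by the testis weight expressed in units of 100 mg.
    """
    if testis_weight_mg <= 0:
        raise ValueError("testis weight must be positive")
    return mean(cell_counts) / (testis_weight_mg / 100.0)


# Mean testis weights reported in the text (mg).
control_weight_mg = 118.8
knockout_weight_mg = 41.7
weight_fold_difference = control_weight_mg / knockout_weight_mg

# Hypothetical per-tubule counts, for illustration only (not study data).
example_counts = [4, 6, 5, 5]
example_value = normalized_cells_per_tubule(example_counts, 50.0)
```

Scaling by testis weight matters here because the knockout testes are much smaller; without it, per-tubule counts from the two genotypes would not be directly comparable.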
Kmt2b conveys H3K4me3 at bivalent and monovalent promoters in the male germline
Because the SSC-to-progenitor transition is blocked in the knockout, we expected the SSC chromatin state to be influenced by Kmt2b. Thus, we sought to identify Kmt2b target genes (i.e. genes showing reduced promoter H3K4me3 in the absence of Kmt2b) by chromatin immunoprecipitation (ChIP) using SSC-derived germline stem cells (GSCs). These cells can be expanded in culture and exhibit some features of SSCs, including gene expression characteristics and the ability to reconstitute complete spermatogenesis upon transplantation (Kanatsu-Shinohara et al., 2003; Sin et al., 2015). We generated Kmt2bF/F; Rosa26-CreERT2 GSCs (Fig. 2A) and confirmed efficient knockout of Kmt2b at the DNA, RNA and protein levels 6 days after treatment with 4-hydroxy-tamoxifen (4-OH-TMX), which is the active metabolite of tamoxifen (Fig. S2A-C). Like SSCs, GSCs do not depend on Kmt2b because Kmt2bFC/FC GSCs continued to proliferate in culture, albeit with a slower growth rate compared with Kmt2bF/F GSCs (Fig. S2D).
Kmt2b targets both bivalent and monovalent chromatin in GSCs. (A) Generation and knockout of Kmt2bFC/FC GSCs (n=2) for molecular analysis (see Materials and Methods for details). (B) Average enrichment of H3K4me3 ChIP-seq reads over all TSSs±3 kb. (C) Heat map showing H3K4me3 and H3K27me3 enrichment over TSS±3 kb. All TSSs carrying an H3K4me3 peak are shown (n=15,842). Genes (rows) for all plots were ordered in the same way: by the level of H3K4me3 decline in Kmt2bFC/FC GSCs. Significantly affected genes (n=1085) are indicated (vertical line at the top right corner). (D) Heat map showing H3K4me3 and H3K27me3 enrichment in GSCs for all H3K4me3-affected genes over TSSs±2 kb (n=1085). Kmt2b-affected TSSs were subdivided into bivalent (top, n=793) and monovalent (bottom, n=292) genes (see Materials and Methods) and H3K4me3 and H3K27me3 enrichments are shown. (E) GO terms (biological processes) for bivalent and monovalent targets showing reductions in H3K4me3 levels. (F) Examples of ChIP-seq peaks at TSSs at bivalent (Dll3 and Grin2b) and monovalent (Tmem140 and Pla2g10) promoters. Affected H3K4me3 peaks at TSSs are shaded.
H3K4me3 ChIP-seq in Kmt2bF/F and Kmt2bFC/FC GSCs was employed to identify potential Kmt2b targets. The sequencing statistics are summarized in Table S1 and data quality evaluation in Fig. S2E,F. We found that the average enrichment of H3K4me3 at all transcription start sites (TSSs) was reduced in Kmt2bFC/FC (Wilcoxon P<2.2e−16) (Fig. 2B). When individual regions were analyzed, 6.8% (1085 regions) of all H3K4me3 peaks at TSSs (15,842) showed a significant decrease in H3K4me3 in the knockout (see Materials and Methods for details of peak analysis) (Fig. 2C). We next performed H3K27me3 ChIP-seq to investigate the possible association of Kmt2b with bivalent chromatin. Strikingly, the affected TSSs overlapped with the H3K27me3 enrichment, suggesting bivalent promoters are major Kmt2b targets in GSCs (Fig. 2C, Fig. S2G). The affected genes were enriched for developmental and differentiation genes (Fig. S2H), a finding similar to the situation in ESCs.
In addition to the bivalent targets, we found that 292 (27%) of the target promoters lacked H3K27me3, which we now refer to as ‘monovalent genes’ (Fig. 2D). Whereas the bivalent genes were highly enriched for differentiation and commitment, the monovalent genes were enriched for biosynthesis and signaling (Fig. 2E). Thus, these two classes of genes are associated with clearly separate sets of biological features. Examples of bivalent and monovalent genes are shown in Fig. 2F.
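As an illustration of how the target promoters partition, the following sketch splits a set of Kmt2b-affected TSSs by H3K27me3 enrichment and recomputes the proportions quoted above from the reported counts. The threshold, the enrichment values and the function names are hypothetical; the actual classification criteria are those given in Materials and Methods.

```python
def classify_targets(h3k27me3_by_gene, threshold):
    """Split target genes into (bivalent, monovalent) by H3K27me3 signal.

    `threshold` is a hypothetical cut-off on promoter H3K27me3 enrichment,
    not the criterion used in the study.
    """
    bivalent = sorted(g for g, v in h3k27me3_by_gene.items() if v >= threshold)
    monovalent = sorted(g for g, v in h3k27me3_by_gene.items() if v < threshold)
    return bivalent, monovalent


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Counts reported in the text and in Fig. 2C,D.
affected_pct = percent(1085, 15842)   # affected share of H3K4me3-marked TSSs
monovalent_pct = percent(292, 1085)   # monovalent share of Kmt2b targets
bivalent_pct = percent(793, 1085)     # bivalent share of Kmt2b targets

# Gene names are from Fig. 2F; the enrichment values are made up.
example = {"Dll3": 3.2, "Grin2b": 2.7, "Tmem140": 0.1, "Pla2g10": 0.3}
example_bivalent, example_monovalent = classify_targets(example, threshold=1.0)
```

The two classes are complementary by construction, so the bivalent and monovalent counts (793 and 292) sum to the 1085 affected TSSs.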
Bivalent and monovalent promoters are suppressed by distinct mechanisms
In ESCs and oocytes, bivalent genes are usually repressed and hence the absence of Kmt2b causes only a minimal effect on the transcriptome (Denissov et al., 2014; Hanna et al., 2018). To examine whether this is the case in GSCs, we performed RNA-seq analysis and compared the Kmt2bF/F and Kmt2bFC/FC GSC transcriptomes. The sequencing statistics are provided in Table S1 and data quality evaluation in Fig. S3A. We found that the Kmt2b target genes were silent in GSCs compared with the active genes associated with high- and medium-H3K4me3 peaks (Fig. 3A). Although monovalent gene expression was slightly higher than that of bivalent genes with high or medium H3K4me3, Kmt2b target genes were generally silent irrespective of the H3K4me3 level compared with non-target genes (Fig. S3B). Consistently, the effect of the knockout on mRNA was subtle across all gene categories with only 25 genes showing significant changes between Kmt2bF/F and FC/FC (see Materials and Methods for details of differential expression analysis) (Fig. 3B, Table S2).
Kmt2b targets are marked by Pc-G and non-Pc-G mechanisms. (A) mRNA expression of H3K4me3-high (>5 log2 RPM), -medium (−1 to 5 log2 RPM) and -low genes (<−1 log2 RPM) (250 randomly selected genes for each group) as well as bivalent and monovalent genes (>2-fold) in Kmt2bF/F and FC/FC GSCs. (B) Scatter plot of RNA-seq read count over all genes comparing Kmt2bF/F and Kmt2bFC/FC GSCs. Duplicate samples for each genotype were analyzed. Differentially expressed genes are indicated. (C) Average CpG density and ChIP-seq enrichment for Scml2, Rnf2, Bmi1 and H2AK119ub at bivalent and monovalent promoters. TSSs±5 kb are shown.
Primed promoters with bivalent chromatin are generally associated with high CpG density and recruit Pc-G proteins mediating H3K27me3 (Lynch et al., 2012). In GSCs, Scml2, a component of Polycomb repressive complex 1 (PRC1), facilitates H3K27me3 of bivalent chromatin (Maezawa et al., 2018), and promotes RNF2-dependent H2A ubiquitylation (Hasegawa et al., 2015). To explore whether these factors are associated with the Kmt2b-dependent promoters, we evaluated CpG density and enrichment for PRC1 factors using ChIP-seq data (Hasegawa et al., 2015; Sin et al., 2015). The CpG density was higher at bivalent TSSs compared with monovalent TSSs, consistent with the higher accumulation of H3K27me3 at bivalent genes (Fig. 3C). All the PRC1 components examined (Scml2, Rnf2 and Bmi1) and H2AK119ub were enriched at the bivalent, but not at the monovalent, promoters. In ESCs, RNA polymerase II (RNAPII) pausing has been suggested to be involved in priming of bivalent promoters (Marks et al., 2012; Stock et al., 2007), but recent reports have shown contrasting evidence (Liu et al., 2017; Williams et al., 2015). We investigated this possibility by analyzing the RNAPII enrichment using RNAPII ChIP-seq data in GSCs (Sin et al., 2015) and found that RNAPII was not highly enriched at bivalent and monovalent TSSs compared with all H3K4me3-marked TSSs (Fig. S3C). Moreover, RNAPII pausing index (Adelman and Lis, 2012) did not show enriched pausing at Kmt2b-target promoters (Fig. S3C). The moderate RNAPII enrichment found at the Kmt2b-target TSSs, in particular at the monovalent genes (Fig. S3C), may reflect expression of a small fraction of the genes in GSCs (outliers in Fig. 3A, Fig. S3B). Thus, in line with bivalent genes in other cell types (Liu et al., 2017; Williams et al., 2015), RNAPII pausing may not be a central mechanism in the later activation of Kmt2b-target genes in the germline.
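For readers unfamiliar with the pausing index, it is commonly computed as the ratio of RNAPII read density in a promoter-proximal window to the density over the gene body. The sketch below shows that ratio under stated assumptions: the window lengths, read counts and function name are hypothetical and do not reproduce the parameters used for Fig. S3C.

```python
def pausing_index(promoter_reads, promoter_len_bp, body_reads, body_len_bp):
    """Ratio of promoter-proximal to gene-body RNAPII read density.

    Window definitions are left to the caller; the values used below are
    illustrative assumptions, not those of the study.
    """
    if promoter_len_bp <= 0 or body_len_bp <= 0:
        raise ValueError("window lengths must be positive")
    promoter_density = promoter_reads / promoter_len_bp
    body_density = body_reads / body_len_bp
    if body_density == 0:
        # No gene-body signal: the ratio is undefined or unbounded.
        return float("inf") if promoter_reads > 0 else float("nan")
    return promoter_density / body_density


# Hypothetical gene: 150 reads in a 300 bp promoter window,
# 500 reads over a 2000 bp gene-body window.
example_index = pausing_index(150, 300, 500, 2000)
```

A higher index indicates relatively more polymerase near the TSS than in the gene body; comparing the distribution of this index between Kmt2b-target and all H3K4me3-marked TSSs is the kind of comparison described above.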
Taken together, these results suggest that Kmt2b primes genes in SSCs containing both monovalent and bivalent chromatin. Both gene sets were suppressed, but PRC1-mediated silencing occurred only at the bivalent promoters. In general, monovalent gene silencing does not appear to depend on H3K9me2, another representative facultative heterochromatic mark, or H3K9me3, an alternative modification reported to mark primed genes in mesenchymal stem cells and preadipocytes (Matsumura et al., 2015) (Fig. S3D). However, it is noteworthy that a small subset of bivalent and monovalent genes are marked by H3K9me3 (Fig. S3D), suggesting that H3K9me3-dependent priming occurs at a small number of genes in SSCs. Interestingly, in ESCs, monovalent targeting by Kmt2b is less pronounced than in GSCs: 90% of the Kmt2b target promoters were bivalent with high H3K27me3 enrichment (Fig. S3E). This suggests that monovalent priming may be a feature more frequently found in lineage-specified stem cells. Furthermore, Kmt2b appears to prefer different genes in GSCs and ESCs: 6% of the bivalent genes and none of the monovalent genes in ESCs overlapped with those in GSCs (Fig. S3F).
Bivalent and monovalent genes are fated for distinct developmental stages
To explore the fate of the bivalent and monovalent genes, we grouped the genes according to their expression patterns during spermatogenesis (see Materials and Methods). The groups enriched for decreased expression towards late spermatogenesis were named Bi-1 (bivalent) and Mono-1 (monovalent), whereas the groups enriched for elevated expression were named Bi-2 and Mono-2 (Fig. 4A). The rest of the genes (i.e. unchanged genes) are shown in Fig. S4A. Bi-1 and Bi-2 groups were both associated with development, fate commitment and differentiation (Bi-1: e.g. Ascl1, Atoh1, Lbx1, Nkx2-2, Gsx2 and Tbx4; Bi-2: e.g. Wnt1, Gata3, Nkx2-1 and Sox6). Many of these genes are known to play roles during early development (Gross et al., 2002; Guillemot et al., 1993; Kaestner, 2010; Naiche and Papaioannou, 2003; Pandolfi et al., 1995; Smits et al., 2001; Sussel et al., 1998; Szucsik et al., 1997). Neither Bi-1 nor Bi-2 was enriched for known germline genes, but their expression implies potential roles in late spermatogenesis. The monovalent Mono-1 group included genes associated with signaling, but with no significant enrichment (P>0.2), and appears to contain genes already transcribed in spermatogonia (Fig. 4A), which may not be associated with a particular future stage. In contrast, Mono-2 genes showed upregulation during spermatogenesis and were associated with known spermatid development and reproduction functions (e.g. Adam3, Wbp2nl, Acrbp, Trp63, Adad1 and Piwil1) (Aarabi et al., 2010; Connolly et al., 2005; Kanemori et al., 2016; Nishimura et al., 2001; Petre-Lazar et al., 2007; Reuter et al., 2011; Shamsadin et al., 1999; Yu et al., 2000) (Fig. 4A,B, Table S2). Consistently, a portion of this group (53 out of 137) overlapped with a previously reported group of H3K4me2/3-marked genes in GSCs activated in spermatocytes or spermatids (Sin et al., 2015) (Table S2). Hence, Mono-2 genes appear to be relevant to the germline and are likely primed for later activation.
These results suggest that Kmt2b primarily targets genes in SSCs that are mainly fated for activation in two distinct stages – spermatogenesis or embryonic development, depending on the absence or presence of H3K27me3.
Fate of Kmt2b target genes. (A) Clustering of bivalent and monovalent target genes by the relative mRNA expression patterns across spermatogenic stages: primitive spermatogonia (Pri-SG), type-A spermatogonia (A-SG), type-B spermatogonia (B-SG), leptotene spermatocytes (Lep-SC), pachytene spermatocytes (Pac-SC), round spermatids (R-ST) and elongated spermatids (E-ST). Genes were divided into subgroups (Bi-1, n=423; Bi-2, n=329; Mono-1, n=140; Mono-2, n=137; see Materials and Methods for details). Values are absolute RNA-seq expression levels (log2 RPKM). Associated GO categories with example genes are indicated on the right. (B) Example screenshots of Kmt2b-target Bi-1 (left) and Mono-2 (right) genes showing affected H3K4me3. (C) Heat map showing the fate of H3K4me3 at the bivalent (Bi-1, Bi-2) and monovalent (Mono-1, Mono-2) genes during spermatogenesis and post-fertilization embryonic development (TSSs±5 kb). Color scales for each dataset were automatically determined by ngs.plot.
Fate of Kmt2b-dependent H3K4me3 during development
Mammalian histones are largely replaced by protamines during spermiogenesis, but 1-10% of histones are retained in sperm. Such nucleosome-retained regions could give rise to intergenerational epigenetic inheritance to offspring (Arpanahi et al., 2009; Brykczynska et al., 2010; Hammoud et al., 2009). Potentially, Kmt2b-dependent H3K4me3 in SSCs may be a source of the epigenetic inheritance. We evaluated whether Kmt2b-marked genes retain nucleosomes in sperm. By analyzing ChIP-seq data (Erkek et al., 2013), we found that Kmt2b target TSSs retain H3.1/3.2, canonical forms of histone H3 variants, in preference to H3.3, which is the most abundant variant in sperm nucleosomes with H3.1/3.2 (Wilcoxon P<2.2e−16 and P=3.0e−15, respectively, compared with all TSSs) (Fig. S4B). H3.1/3.2 enrichment was greater at bivalent targets than monovalent targets (Fig. S4B; Wilcoxon P<2.2e−16). Furthermore, analysis of germline and embryonic ChIP-seq data (Erkek et al., 2013; Lesch et al., 2013; Zhang et al., 2016) showed that many Kmt2b-marked promoters retain H3K4me3 up to either sperm or early embryonic stages (Fig. 4C). The Mono-2 monovalent genes were marked by H3K4me3 in pachytene spermatocytes, round spermatids and sperm, in agreement with the mRNA expression patterns. Later, H3K4me3 was lost at monovalent genes in early embryos. In contrast, bivalent promoters showed moderate enrichment for H3K4me3 through early embryonic development and showed elevated levels in embryonic day 6.5 epiblast and adult cortex, consistent with their functions in development and differentiation. Interestingly, bivalent H3K27me3 was lost after fertilization, in agreement with a previous report (Zheng et al., 2016) (Fig. S4C). These results were confirmed by analyzing an independent set of ChIP-seq data (Lesch et al., 2013; Liu et al., 2016; Zhang et al., 2016) (Fig. S4D, Table S3).
Thus, our data suggest that Kmt2b-dependent monovalent H3K4me3 is retained for gene activation at late spermatogenesis, whereas the bivalent H3K4me3 information may facilitate activation of embryonic genes, thereby providing a mechanism of intergenerational inheritance.
DISCUSSION
The mammalian female germline carries epigenetic information relevant to future developmental programs. The possibility that the male germline also carries epigenetic information was raised by the realization that not all nucleosomes are scrubbed in the histone-to-protamine displacement (Arpanahi et al., 2009; Brykczynska et al., 2010; Erkek et al., 2013; Hammoud et al., 2009). Here, we investigated the functional importance and potential priming role of Kmt2b in the male germline using conditional mutagenesis in mice and GSCs. We found that Kmt2b is highly expressed in SSCs and is essential for their differentiation towards Kit+ progenitors but is not required for SSC or GSC survival. At this stem-cell stage, Kmt2b conveyed H3K4me3 at bivalent promoters and also a new class of monovalent promoters lacking H3K27me3. Many of these monovalent genes were upregulated during late spermatogenesis whereas most bivalent genes were not activated until post-fertilization embryonic development. Consistent with the role of Kmt2b in depositing H3K4me3 onto bivalent promoters in ESCs, these results suggest that Kmt2b primes SSC chromatin by conveying H3K4me3 onto promoters before they are upregulated later in gene expression programs and that germline priming for intra- and inter-generational programs is already being established at the SSC stage (Fig. 5). In yeast, which has only one H3K4 methyltransferase, Set1, H3K4me3 on promoter nucleosomes reflects transcriptional activity (Soares et al., 2017), presumably because Set1 associates with elongating RNAPII (Dehé et al., 2006). Indeed, promoter H3K4me3 levels closely correlate with transcriptional activity in all eukaryotes with few exceptions, such as the action of Kmt2b/Mll2 on bivalent promoters in ESCs (Denissov et al., 2014). Our observations in SSCs/GSCs accord with this exception and indicate that Kmt2b has the special ability to deposit an epigenetic mark that flags a gene for later activation.
This concept of ‘epigenetic priming’ requires further investigation to establish its validity. In particular, our data pertain only to the maintenance of H3K4me3 by Kmt2b on these promoters. Although it is likely that Kmt2b also establishes H3K4me3, this remains to be demonstrated by other experiments. A further caveat regarding our experiments relates to the differences between GSCs and SSCs, which may have an impact on our observations. Further molecular work will be needed to understand the role of Kmt2b in vivo.
Priming model of Kmt2b for future development. In SSCs, Kmt2b conveys H3K4me3 (red circles) to create monovalent (no H3K27me3, blue circle) and bivalent (with H3K27me3) chromatin for many genes required for germline and embryonic development. The genes required during late spermatogenesis are mainly primed by the monovalent mechanism whereas most embryonic genes are primed by the Pc-G-dependent bivalent mechanism.
We found PRC1 components enriched at bivalent but not monovalent promoters. This suggests that Kmt2b action in SSCs involves both Pc-G silencing and a non-Pc-G monovalent mechanism. The presence of unexpressed monovalent genes lacking H3K27me3 was unexpected. Monovalent silencing does not appear to involve the classical repressive marks H3K9me2/me3 and may be simply due to the absence of necessary transcription factors until late spermatogenic stages.
Our observation that monovalent and bivalent H3K4me3 marks were found towards late spermatogenic and embryonic stages when genes become activated suggests that the primed chromatin may be inherited by the 1-10% residual histones in mouse sperm. A previous report showed that most CpG island (CGI) promoters in sperm contain the H3.3 variant as a result of nucleosome turnover during spermiogenesis whereas canonical H3.1/H3.2 variants are retained at CGIs with Pc-G-mediated H3K27me3 (Erkek et al., 2013). Our data further support this notion by showing that non-Pc-G monovalent genes generally lose H3K4me3 as a result of nucleosome turnover before fertilization but that the Pc-G-dependent bivalent chromatin on canonical H3.1/H3.2 can be retained.
Bivalent promoters in stem cells have been associated with genes primed for expression in lineage-specific patterns. Our study extends this association to suggest that priming can occur through multiple mechanisms in the same lineage. Furthermore, the influence of Kmt2b priming could reach developmental programs in the next generation, which agrees with recent reports showing the importance of correct histone modification in the male germline for offspring development (Siklenka et al., 2015; Teperek et al., 2016). The blockade of SSC differentiation by Kmt2b disruption suggests a requirement for Kmt2b-mediated H3K4me3 in differentiation, but whether H3K4me3 priming is a prerequisite for SSC differentiation and whether different functions of Kmt2b are required remain open questions. Also, the phenotype raises the possibility that correct priming may influence the health of offspring and may be linked to pathogenesis of male infertility. Kmt2b appears to convey a priming role in SSCs as it does in ESCs (Denissov et al., 2014), suggesting that Kmt2-mediated epigenetic priming may be a fundamental property of stem cells and will be found in other stem cell systems, for example in hematopoietic stem cells, which rely on Kmt2a/Mll1 (Gan et al., 2010).
MATERIALS AND METHODS
Mouse usage
Mouse husbandry and experiments were carried out under the guidelines of Yokohama City University, and all animal experiments were approved by the Committee for Animal Care and Use at Yokohama City University. Mice carrying Kmt2bF/F; Rosa26-CreERT2 were reported previously (Glaser et al., 2009). Primer sequences for genotyping PCR are listed in Table S4. For the Kmt2b knockout, 100 µl of 10 mg/ml TMX (Sigma-Aldrich) dissolved in corn oil (Sigma-Aldrich) was injected intraperitoneally to adult males at weeks 8 and 12 for five consecutive days. The mice were checked for knockout efficiency by PCR and used for experiments at week 16. Eleven independent mice of each genotype were used. The Wilcoxon rank sum test was used to calculate statistical significance of weight and cell number differences between Kmt2bF/F and Kmt2bFC/FC.
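As a minimal sketch of the rank sum comparison mentioned above (not the authors' code), the test can be written with the Python standard library. The function name `rank_sum_test`, the normal approximation without a tie correction of the variance, and any input values are assumptions made for illustration.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum (Mann-Whitney U) test.

    Uses the normal approximation without tie correction of the variance
    (an assumption of this sketch). Returns (U statistic for x, p-value).
    """
    # Pool both groups, remembering which group each value came from.
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        # Tied values share the average of the ranks they span (1-based).
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p_value
```

With eleven animals per genotype, as used here, the normal approximation is a common choice; an exact test or a tie-corrected variance would be preferable for very small or heavily tied samples.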
Generation and culture of GSCs
GSCs were generated and cultured according to previous reports (Kanatsu-Shinohara et al., 2003; Kanatsu-Shinohara et al., 2014). Briefly, testes from Kmt2bF/F; Rosa26-CreERT2 pups at postnatal day 7-10 were digested using 0.25% trypsin/EDTA at 37°C for 15 min after removal of the tunica albuginea. The dissociated cells were washed with DMEM/10% fetal calf serum (FCS) and cultured on a gelatinized dish in GS medium (Kanatsu-Shinohara et al., 2003; Kanatsu-Shinohara et al., 2014) at 37°C in a 5% CO2 incubator. After a few passages, the cells were transferred onto mitomycin C-inactivated mouse embryonic fibroblasts (MEFs) and maintained in GS medium or serum-free medium. The CSII-EF-IRES2-Venus vector was introduced into established GSCs by lentiviral transduction as previously described (Kanatsu-Shinohara et al., 2011). Venus-positive GSCs were collected by FACS-sorting using FACS AriaII (BD Biosciences) to deplete MEFs and used for RNA-seq and ChIP-seq experiments. For the H3K9me3 ChIP-seq experiment, two Kmt2bF/F GSC clones were used after MEF removal using the re-plating method (Sin et al., 2015). Kmt2b knockout in GSCs was carried out by adding 4-OH-TMX (Sigma-Aldrich) at a final concentration of 1 µM (1 mM 4-OH-TMX stock solution was prepared in 100% ethanol) to the media. After 6 h, GSCs were washed twice in PBS to remove 4-OH-TMX, and further cultured in 4-OH-TMX-free media. For control, GSCs were mock-treated with 100% ethanol in parallel with the knockout. Growth of GSCs (n=2) was assessed as follows: every 2 days, GSCs were trypsinized and re-plated on a gelatinized dish and incubated at 37°C for 30 min. The floating cells were collected to remove MEFs and used for counting with a hemocytometer. The cell lines were checked for mycoplasma contamination by DAPI (Sigma-Aldrich) staining.
Immunohistochemistry
Testes were fixed and immunostained as described previously (Shirakawa et al., 2013). Briefly, animals were fixed by perfusion with 2% paraformaldehyde (PFA) and testes were dissected out, weighed, and incubated in 2% PFA for ∼1 h/25 mg of testis. After mounting in O.C.T. compound (Tissue-Tek), sections were sliced and stained as follows. Sections were incubated with PBS supplemented with 1% bovine serum albumin (BSA/PBS), and incubated for either 2 h at room temperature or overnight at 4°C, with an appropriate primary antibody followed by incubation with a secondary antibody for 1 h at room temperature. For staining the Kit protein, the TSA Biotin System (Perkin Elmer) was used to amplify signals. DAPI (Sigma-Aldrich) was used for nuclear staining. The sections were mounted with ProLong Diamond (Thermo Fisher), and observed using confocal laser microscopy (Olympus FV-1000). All antibodies used in this study are listed in Table S5.
Western blotting
Western blotting was performed as described (Shirakawa et al., 2013). All antibody information is included in Table S5.
ChIP-seq
ChIP experiments were carried out based on a previous report (Dahl and Collas, 2008) with modifications. About 2-3×10⁵ GSCs were suspended in 250 µl of crosslinking solution containing 1% formaldehyde and 10% FCS in PBS and incubated for 5 min at room temperature. The reaction was quenched by the addition of 28 µl of 1.25 M glycine and incubated for 5 min. The crosslinked cells were rinsed with PBS/10% FCS and suspended in lysis buffer containing 50 mM Tris, pH 8.0, 1% sodium dodecyl sulfate (SDS), 1 mM phenylmethylsulfonyl fluoride (PMSF), Halt protease inhibitor cocktail (Thermo Fisher Scientific), and incubated on ice for 3 min with intermittent vortexing. The suspension was centrifuged at 900 g for 10 min at 4°C to isolate nuclei. The supernatant was discarded and the pellet containing chromatin was re-suspended in 110 µl of lysis buffer and sonicated using Covaris S2 (8 cycles; duty cycle, 5%; Intensity, 3; cycles/burst, 200; Time, 60 s). The tubes containing chromatin were centrifuged at 12,000 g for 10 min at 4°C after addition of 400 µl RIPA ChIP buffer (10 mM Tris, pH 7.4, 140 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, 1 mM PMSF, Halt protease inhibitor cocktail) and the supernatant was transferred to a new tube. The pellet was re-suspended with another 100 µl of RIPA ChIP buffer and centrifuged again at 12,000 g for 10 min. The two supernatants were pooled and 100 µl aliquots were used for ChIP reactions or input samples. The chromatin was incubated with 10 µl of Dynabeads Protein G (Life Technologies), which had been pre-incubated with an appropriate antibody, at 4°C overnight while rotating. The antibodies used for ChIP are listed in Table S5. The chromatin was collected by using a magnetic rack, and DNA extraction and crosslink reversal were performed in complete elution buffer (20 mM Tris, pH 7.4, 0.5 mM EDTA, 50 mM NaCl, 1% SDS, 50 µg/ml Proteinase K, 50 µg/ml RNase A) at 68°C for 3 h with vigorous shaking.
Eluted DNA was purified by a standard phenol-chloroform extraction followed by ethanol precipitation. ChIP-seq libraries for Illumina sequencing were prepared using either NEBNext Ultra DNA Library Prep Kit for Illumina (NEB) or ThruPLEX DNA-seq Kit (Takara). After quality control using Bioanalyzer (Agilent) with High Sensitivity DNA Kit (Agilent), libraries were sequenced using Illumina HiSeq3000, HiSeq2500 and GAIIx.
RNA-seq
Total RNA from GSCs was purified using Isogen (Nippon Gene) according to the manufacturer's instructions. Quality control was ensured by Bioanalyzer (Agilent) with RNA 6000 Nano Kit (Agilent). Genomic DNA was digested using RQ1 DNase (Promega) at 37°C for 30 min, and the resulting RNA was used to generate a library with NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB). The libraries were sequenced using Illumina HiSeq2000.
Data analysis
Illumina reads were quality- and adapter-trimmed with Trim Galore! (version 0.4.0) (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). ChIP-seq reads were mapped onto mouse genome mm10 using Bowtie (version 1.1.1) (Langmead et al., 2009) with the parameters -n 2 -l 30 --best --strata -m 1. For the analysis of H3K9me3 ChIP-seq data, which were expected to contain a large proportion of reads mapping to repetitive sequences, non-uniquely mapped reads were also reported and analyzed. RNA-seq reads were mapped onto mm10 using TopHat (version 2.0.14) (Trapnell et al., 2009) with default parameters except that a gtf reference file was used. Total number of sequencing reads and mapping efficiency are summarized in Table S1. Read counts and quantification were performed using SeqMonk (version 0.30.2) (http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/), correcting for total read count per million reads (log2 RPM) for ChIP-seq. For RNA-seq, RPM was further normalized for gene length (log2 RPKM) using SeqMonk. For downstream analysis, all genes and TSSs associated with existing mRNA were extracted. Pearson correlation coefficients between all possible pairs of samples and hierarchical clustering were performed in R using either the DiffBind package (http://bioconductor.org/packages/release/bioc/html/DiffBind.html) or stats package. Heat map and enrichment profile drawing for ChIP-seq and CpG density were performed using ngs.plot (version 2.61) (Shen et al., 2014). Gene ontology (GO) analysis was performed using DAVID (Huang et al., 2009). All Venn diagrams were drawn using Venn Diagram Plotter (https://omics.pnl.gov/software/venn-diagram-plotter).
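The read-depth and gene-length normalization described above can be illustrated with a short sketch. It is not the SeqMonk implementation: the function names and the optional `pseudocount` parameter (which SeqMonk may handle differently) are assumptions introduced here.

```python
import math


def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads."""
    return read_count / (total_mapped_reads / 1e6)


def rpkm(read_count, total_mapped_reads, gene_length_bp):
    """Reads per kilobase of gene length per million mapped reads."""
    return rpm(read_count, total_mapped_reads) / (gene_length_bp / 1e3)


def log2_value(value, pseudocount=0.0):
    """log2 transform; a positive pseudocount (hypothetical) avoids log2(0)."""
    return math.log2(value + pseudocount)
```

For example, 160 reads over a 2 kb gene in a library of 20 million mapped reads correspond to 8 RPM and 4 RPKM, that is, log2 RPM = 3 and log2 RPKM = 2.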
ChIP-seq peak detection was carried out using MACS2 (Zhang et al., 2008) with default settings and specifying input data as controls. TSSs±0.5 kb carrying both H3K4me3 and H3K27me3 peaks were defined as bivalent. Those carrying H3K4me3 but not H3K27me3 peaks were defined as monovalent. Identification of differentially enriched regions for Kmt2bF/F and FC/FC GSCs was achieved using the MACS2 bdgdiff module in combination with a >3-fold difference (log2 RPM) cutoff. The RNAPII pausing index was calculated as the ratio of the RNAPII ChIP-seq signal density over the promoter (TSS±300 bp) to that over the gene body (excluding the first 300 bp).
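The promoter classification and pausing index can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: peaks are given as (start, end) pairs on a single chromosome, any overlap with the TSS window counts as "carrying" a peak, the pausing index is read as a promoter-to-gene-body ratio, and all function names are hypothetical.

```python
def overlaps(window, peaks):
    """True if any (start, end) peak overlaps the (start, end) window."""
    w_start, w_end = window
    return any(p_start < w_end and p_end > w_start for p_start, p_end in peaks)


def classify_tss(tss, k4me3_peaks, k27me3_peaks, flank=500):
    """Classify a TSS by the peaks found within TSS +/- flank bp."""
    window = (tss - flank, tss + flank)
    has_k4 = overlaps(window, k4me3_peaks)
    has_k27 = overlaps(window, k27me3_peaks)
    if has_k4 and has_k27:
        return "bivalent"
    if has_k4:
        return "monovalent"
    return "other"  # no H3K4me3 peak at this TSS


def pausing_index(promoter_density, gene_body_density):
    """Ratio of RNAPII signal density at the promoter to that in the gene body.

    Returns None when the gene body has no signal (a choice made for this
    sketch; the handling of such genes is not stated in the methods).
    """
    if gene_body_density == 0:
        return None
    return promoter_density / gene_body_density
```

In practice the peak coordinates would come from the MACS2 output and the densities from read counts normalized by window length; both inputs are simplified here.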
For the analysis of differentially expressed genes using GSC RNA-seq, genes showing P<0.05 significance using DESeq2 with Benjamini and Hochberg correction were identified. Next, genes showing <2-fold difference (log2 RPKM) between Kmt2bF/F and FC/FC, or falling within the bottom 25th percentile of all genes by expression level in the more highly expressed group, were removed.
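A minimal sketch of these filtering steps is given below. It is not DESeq2 and does not reproduce its statistics: the raw P-values are taken as input, the Benjamini-Hochberg adjustment is re-implemented in plain Python, and the rank-based definition of the bottom quarter, the dictionary keys and the function names are all assumptions made for illustration.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def filter_de_genes(genes, alpha=0.05, min_log2_fc=1.0, low_fraction=0.25):
    """Return names of genes passing significance, fold-change and expression filters.

    genes: list of dicts with keys 'name', 'pvalue', 'log2_rpkm_ff', 'log2_rpkm_fc'.
    """
    padj = benjamini_hochberg([g["pvalue"] for g in genes])
    # Expression in whichever genotype expresses the gene more highly.
    top_expr = [max(g["log2_rpkm_ff"], g["log2_rpkm_fc"]) for g in genes]
    # Genes ranked in the lowest quarter by that value are dropped.
    n_low = int(len(genes) * low_fraction)
    by_expr = sorted(range(len(genes)), key=lambda i: top_expr[i])
    low = set(by_expr[:n_low])
    kept = []
    for i, g in enumerate(genes):
        log2_fc = abs(g["log2_rpkm_ff"] - g["log2_rpkm_fc"])
        if padj[i] < alpha and log2_fc >= min_log2_fc and i not in low:
            kept.append(g["name"])
    return kept
```

A 2-fold difference corresponds to an absolute difference of 1 on the log2 RPKM scale, which is why `min_log2_fc` defaults to 1.0.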
Identification of gene clusters (Fig. 4A, Fig. S4A) was carried out as follows. First, genes showing <2-fold difference in mRNA levels (RPKM) during spermatogenesis (Gan et al., 2013) were defined as ‘unchanged genes’. Then, K-means clustering with n=2 was performed using the remaining genes to separate them into two clusters depending on the expression trends during spermatogenesis.
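The two-step procedure above (set aside unchanged genes, then split the rest into two clusters) can be sketched in plain Python. This is an illustration, not the analysis code used for Fig. 4A: profiles are assumed to be log2 RPKM values across stages, so a <2-fold change is taken as a range below 1; the deterministic seeding of the two centres and all function names are also assumptions.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_two(profiles, max_iter=100):
    """K-means with two clusters; returns a 0/1 label per profile."""
    # Deterministic seeding (an assumption): the first profile and the
    # profile farthest from it.
    first = profiles[0]
    farthest = max(profiles, key=lambda p: squared_distance(p, first))
    centers = [list(first), list(farthest)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            0 if squared_distance(p, centers[0]) <= squared_distance(p, centers[1]) else 1
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_genes(profiles_by_gene, min_log2_range=1.0):
    """Split genes into 'unchanged' and two k-means clusters of changing genes."""
    unchanged, changing = [], []
    for gene, profile in profiles_by_gene.items():
        if max(profile) - min(profile) < min_log2_range:
            unchanged.append(gene)
        else:
            changing.append(gene)
    labels = kmeans_two([profiles_by_gene[g] for g in changing]) if changing else []
    return unchanged, dict(zip(changing, labels))
```

Cluster labels 0 and 1 are arbitrary; which one corresponds to up- or downregulation during spermatogenesis has to be read off the cluster centres.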
Methodology
The numbers of replicate samples for each experiment are indicated in the respective figure legends. At least three biological replicate samples for IHC and two biological samples for ChIP-seq and RNA-seq were analyzed. No statistical method was used to predetermine sample size, and no randomization or blinding methods were used.
Acknowledgements
We are grateful to Drs Michio Ono and Keiichiro Yoshida for suggestions and discussions, Mr Hidetoshi Sone, Mr Kentaro Noguchi and Ms Tamaki Nagasaka for experimental support, Drs Naomichi Matumoto, Masataka Nakamura, Hideaki Matsuki, Mr Hisho Kawamura and Ms Maki Iwai for technical support and Dr Gavin Kelsey for critical reading of the manuscript. This work was carried out with the support of National Institute for Basic Biology (NIBB) Cooperative Research Program (Dr Shuji Shigenobu).
Footnotes
Author contributions
Conceptualization: S.T., K.A., A.F.S., K.O.; Methodology: S.T., T.S., K.O.; Formal analysis: S.T., Y.K., K.W., I.H., K.N., J.N., S.S., M.S., Y.S., H.R.; Investigation: S.T., Y.K., S.S., H.R.; Resources: S.T., T.S., K.M.; Data curation: S.T., K.W., J.N., S.S., A.D., D.A., M.S., Y.S., H.R.; Writing - original draft: S.T.; Writing - review & editing: H.R., A.H.F.M.P., K.A., A.F.S., K.O.; Supervision: S.T., Y.S., A.H.F.M.P., K.A., A.F.S., K.O.; Project administration: K.O.; Funding acquisition: S.T., A.F.S., K.O.
Funding
This work was partly supported by Grant-in-Aid for Scientific Research on Innovative Areas funding from the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) KAKENHI (25114004 to K.O., 221S0002, and 16H06279 to M.S., Y.S. and K.O.); Grant-in-Aid for Challenging Exploratory Research funding from the Japan Society for the Promotion of Science (JSPS) KAKENHI (25670097 to K.O.); Grant-in-Aid for Scientific Research (C) funding from JSPS KAKENHI (15K08156 to K.O.); Grant-in-Aid for Young Scientists (B) funding from JSPS KAKENHI (26860137 and 17K15549 to S.T.); the 'Creation and Innovation Centers for Advanced Interdisciplinary Research Areas' program from MEXT (to S.T.); the EU Seventh Framework Programme Integrated Project SyBoSS (to A.F.S.); the Novartis Research Foundation funding (to H.R. and A.H.F.M.P); and the Swiss National Science Foundation funding (31003A to A.H.F.M.P).
Competing interests
The authors declare no competing or financial interests.