Methylation of histone 3 lysine 4 (H3K4) is a major epigenetic system associated with gene expression. In mammals there are six H3K4 methyltransferases related to yeast Set1 and fly Trithorax, including two orthologs of fly Trithorax-related: MLL3 and MLL4. Exome sequencing has documented high frequencies of MLL3 and MLL4 mutations in many types of human cancer. Despite this emerging importance, the requirements of these paralogs in mammalian development have only been incompletely reported. Here, we examined the null phenotypes to establish that MLL3 is first required for lung maturation, whereas MLL4 is first required for migration of the anterior visceral endoderm that initiates gastrulation in the mouse. This collective cell migration is preceded by a columnar-to-squamous transition in visceral endoderm cells that depends on MLL4. Furthermore, Mll4 mutants display incompletely penetrant, sex-distorted, embryonic haploinsufficiency and adult heterozygous mutants show aspects of Kabuki syndrome, indicating that MLL4 action, unlike that of MLL3, is dosage dependent. The highly specific and discordant functions of these paralogs in mouse development argue against their action as general enhancer factors.
The lysine methylation status of the histone 3 tail is central to epigenetic regulation, pivoting on methylation of lysines at positions 4, 9, 27 and 36. All active RNA polymerase II (Pol II) promoters are characterized by trimethylation of histone 3 lysine 4 (H3K4me3) on the first nucleosome in the transcribed region. Dimethylation (H3K4me2) is a general characteristic of transcribed regions, whereas monomethylation (H3K4me1) is a general characteristic of active chromatin with peaks on enhancers (Bannister and Kouzarides, 2011).
Mammals have six orthologous Set1/Trithorax type H3K4 methyltransferases in three paralogous pairs: SETD1A and B (KMT2F and KMT2G), which are homologs of yeast Set1; MLL1 and 2 (KMT2A and KMT2B), which are homologs of Drosophila Trithorax; and MLL3 and 4 (KMT2C and KMT2D), which are homologs of Drosophila Lost PHD fingers of Trr (Lpt) fused to Trithorax-related (Trr). All six are found in individual complexes; however, all six complexes share the same highly conserved scaffold, first reported for yeast Set1C (Miller et al., 2001; Roguev et al., 2001) composed of four subunits, WDR5, RBBP5, ASH2L and DPY30 (Cho et al., 2007; Lee et al., 2006; Ruthenburg et al., 2007; Ernst and Vakoc, 2012) or less precisely ‘COMPASS’, which surrounds the SET domain and is required for enzymatic activity (Kim et al., 2013; Hsu et al., 2018; Qu et al., 2018). SETD1A apparently conveys most H3K4me3 in mammalian cells (Bledau et al., 2014). Similarly, Set1 conveys most H3K4me3 in most Drosophila cell types (Ardehali et al., 2011; Mohan et al., 2011; Hallson et al., 2012). Consequently, the Set1 homologs are primarily implicated in trimethylation and general promoter function. In contrast, evidence indicating that MLL3 and 4 are monomethylases (Weirich et al., 2015; Zhang et al., 2015; Li et al., 2016) has triggered their linkage to enhancer function (Lee et al., 2013; Rao and Dou, 2015; Piunti and Shilatifard, 2016). Whether they proceed to catalyze H3K4 di- and trimethylation remains uncertain (Dhar et al., 2012) and the emergent model relating Set1 activities to promoters and MLL3/4 to enhancers requires further substantiation.
Histone 3 lysine methyltransferases are prominent members of both Trithorax- (Trx-G) and Polycomb-groups (Pc-G) (Steffen and Ringrose, 2014; Schuettengruber et al., 2017) with the genetic opposition between Trx-G and Pc-G being exerted, in part, by a competition for the methylation status of the histone 3 tail on key nucleosomes (Schmitges et al., 2011; Voigt et al., 2012). This opposition is central to epigenetic regulation in development, differentiation, homeostasis and, more recently, oncogenesis (Chi et al., 2010; Rao and Dou, 2015; Soshnev et al., 2016) with several Trx-G and Pc-G factors, including the H3K27 methyltransferase EZH2, implicated as oncogenes or tumor suppressors in a variety of malignancies. MLL1 was discovered as the major leukemia gene at the 11q23.1 translocation involved in early onset childhood leukemia (Li and Ernst, 2014). The N-terminal half of MLL1 fused to many (now more than 70) different C-terminal partners, including AF4 (AFF1) and AF9 (MLLT3) (Slany, 2009; Meyer et al., 2018) is leukemiogenic without the need for secondary mutations (Dobson et al., 2000). These MLL1 fusion proteins promote both acute lymphocytic (ALL) and acute myeloid (AML) leukemias, collectively termed mixed lineage leukemias.
Massively parallel sequencing of cancer exomes by the international cancer genome projects revealed somatic mutations in MLL3 and MLL4 in almost all cancers analyzed (Rao and Dou, 2015). Inactivating heterozygous mutations have been identified in patients with medulloblastoma, B cell lymphoma, bladder carcinoma, renal carcinoma and colorectal cancer, among many other cancers (Morin et al., 2011; Parsons et al., 2011; Pasqualucci et al., 2011). An explanation of these findings is lacking; however, recent evidence suggests that mutation of Mll4 promotes defective transcription-coupled DNA repair (Kantidakis et al., 2016).
Exome sequencing also revealed mutations in MLL4 as the cause of Kabuki syndrome type 1 (Ng et al., 2010; Li et al., 2011). All MLL4 Kabuki mutations are apparently de novo somatic heterozygous nonsense or frameshift mutations that appear throughout the gene, but most commonly in exon 48. Most of these MLL4 mutations truncate the protein and all are haploinsufficient (Banka et al., 2012; Bogershausen et al., 2015; Faundes et al., 2019). The less common Kabuki syndrome type 2 is caused by mutations of UTX (KDM6A). UTX, which is an H3K27 demethylase, is a subunit of the MLL4 complex (Lee et al., 2006; Lederer et al., 2012, 2014; Banka et al., 2015).
As for MLL1 and MLL2 (Denissov et al., 2014), MLL3 and MLL4 may have overlapping and redundant functions in mammalian cells (Lee et al., 2013). Notably, the H3K4 methyltransferase activities of MLL3 and MLL4 are dispensable for gene expression in mouse embryonic stem cells (ESCs) (Dorighi et al., 2017). Similarly, the catalytic activity of the SET domain of Trr (the fly homolog of MLL3/MLL4) is dispensable for development and viability in Drosophila (Rickels et al., 2017).
Unlike the other four H3K4 methyltransferases (Yagi et al., 1998; Glaser et al., 2006, 2009; Jude et al., 2007; Andreu-Vieyra et al., 2010; Bledau et al., 2014) and despite their emerging importance for cancer, the roles of MLL3 and MLL4 in mammalian development have only been partly described (Lee et al., 2013; Ang et al., 2016; Jang et al., 2017). Here, we compare the null and heterozygous phenotypes of these two genes in mouse development.
Mll3 and Mll4 are conserved paralogs
MLL3 and MLL4 are the largest known nuclear proteins (4903 and 5588 amino acids, respectively) and their genes clearly arose by duplication. Both genes and proteins have the same architecture based on the positions of splice sites, PHD fingers, HMG box, FYRN/FYRC and SET domains (Fig. 1A; Table S4). Except for PHD3 of MLL3, which has been lost from MLL4 (because PHD3 can be found in Drosophila Lpt) (Mohan et al., 2011; Chauhan et al., 2012), the other five PHD and ePHD fingers share high identity. Both proteins are notable for their extensive regions of low sequence complexity. In particular, MLL4 contains several lengthy stretches of glutamine repeats, including a patch of 450 amino acids C-terminal to the HMG box with more than 50% glutamines, and a 600 amino acid patch with one-third prolines after its second PHD finger (Table S4). Both genes encompass more than 50 exons (Fig. 1B,C) spliced to very long mRNAs (14 kb for Mll3 and 19 kb for Mll4) that are widely expressed in the mouse embryo (Figs S1C, S2C).
Mll3 and Mll4 knockouts die at different developmental stages
To explore the function of these conserved proteins, we established multipurpose alleles for both Mll3 and Mll4 by gene targeting. In our multipurpose allele strategy (Testa et al., 2004), a frameshifting exon(s) is flanked by loxP sites (exon 49 for Mll3; Fig. 1B, Fig. S1A, S1B and exons 2-4 for Mll4; Fig. 1C, Fig. S2A, S2B) accompanied by the insertion of a genetrap stop cassette in the intron upstream of these exons. The stop cassette, which is flanked by FRT sites, contains a lacZ reporter and stops target gene transcription because it includes a 5′ splice site, which captures the target gene transcript and a polyadenylation site, which terminates it, thereby – ideally – producing a null allele, termed the ‘A’ allele (Testa et al., 2004; Skarnes et al., 2011). After FLP recombination to remove the stop cassette, which establishes the ‘F’ allele and restores wild-type expression, subsequent Cre recombination establishes a frame-shifted mRNA in the ‘FC’ allele that should provoke nonsense-mediated mRNA decay (NMD) (Dyle et al., 2019).
The multipurpose allele strategy aims to establish a loxP allele for conditional mutagenesis and also to mutate the target gene in two different ways, either by truncation of the mRNA (A allele) or by NMD (FC allele). So, if A/A and FC/FC present the same phenotype, the conclusion that both are null can be established because the A and FC alleles mutate the gene in different ways. Although unlikely, this conclusion is not secure if the two different mutations produce the same hypomorphic or dominant negative phenotypes.
For Mll3, in addition to the multipurpose allele strategy we included a rox-flanked blasticidin-selectable cassette to provide for selection of the 3′ loxP site. Deletion of the rox-flanked cassette by Dre recombinase established the ‘D’ allele, which is equivalent to the ‘A’ allele described above. Subsequent FLP and Cre recombination establish ‘FD’ and ‘FDC’ alleles that are equivalent to ‘F’ and ‘FC’ alleles, respectively.
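The allele series described above can be summarized as a small state machine. The following is a minimal sketch, not part of the original methods: the dictionary and function names are hypothetical, and the label "targeted" for the initial Mll3 allele before Dre recombination is an assumption introduced here because the text does not name it.

```python
# Illustrative model of the multipurpose allele series (names are hypothetical).
# Keys are (current allele, recombinase); values are the resulting allele.
RECOMBINASE_STEPS = {
    ("A", "FLP"): "F",          # FLP removes the FRT-flanked stop cassette
    ("F", "Cre"): "FC",         # Cre deletes the loxP-flanked frameshifting exon(s)
    ("targeted", "Dre"): "D",   # Mll3 only: Dre removes the rox-flanked selection cassette
    ("D", "FLP"): "FD",         # equivalent to the F allele
    ("FD", "Cre"): "FDC",       # equivalent to the FC allele
}

# Intended molecular outcome of each allele, as stated in the text.
EXPECTED_EFFECT = {
    "A": "transcript truncated by the genetrap stop cassette (intended null)",
    "D": "transcript truncated by the genetrap stop cassette (intended null)",
    "F": "wild-type expression restored; conditional loxP allele",
    "FD": "wild-type expression restored; conditional loxP allele",
    "FC": "frame-shifted mRNA expected to undergo NMD (intended null)",
    "FDC": "frame-shifted mRNA expected to undergo NMD (intended null)",
}


def recombine(allele, recombinase):
    """Apply one recombinase to an allele and return the resulting allele."""
    try:
        return RECOMBINASE_STEPS[(allele, recombinase)]
    except KeyError:
        raise ValueError(f"{recombinase} is not modeled for allele {allele!r}")


def derive(allele, recombinases):
    """Apply a sequence of recombinases in order."""
    for recombinase in recombinases:
        allele = recombine(allele, recombinase)
    return allele


if __name__ == "__main__":
    for start, steps in (("A", ["FLP", "Cre"]), ("targeted", ["Dre", "FLP", "Cre"])):
        end = derive(start, steps)
        print(f"{start} --{'+'.join(steps)}--> {end}: {EXPECTED_EFFECT[end]}")
```

The sketch only encodes the recombination steps named in the text. It makes the logic of the comparison explicit: the A (or D) and FC (or FDC) alleles reach an intended null by two independent routes, so concordant homozygous phenotypes support the conclusion that both are null.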
The multipurpose allele logic worked for Mll3. The D/D and FDC/FDC phenotypes were the same (as described below), supporting the conclusion that the knockout phenotype reported here is the null. In further support of this conclusion, targeting to insert the genetrap cassette into Mll3 intron 33 (Fig. S1D) presented the same phenotype (Table 1) as those described below for exon 49 targeting. Furthermore, a BayGenomics genetrap in Mll3 intron 9 also provoked neonatal death (Lee et al., 2013). Although no further analysis other than neonatal death was reported, this outcome concords with the other three Mll3 mutagenic alleles, supporting the conclusion that MLL3 is not required until late gestation.
For Mll4, the Mll4A/A and Mll4FC/FC phenotypes were different, potentially revealing different aspects of MLL4 function. As described below, Mll4A/A embryos are defective before gastrulation whereas Mll4FC/FC embryos died at birth. As expected, MLL4 expression was abolished by the intron 1 genetrap cassette insertion in Mll4A/A ESCs (Fig. 1D), indicating that the stronger phenotype is the null. However, mRNA (Mll4 mRNA expression at 68% of wild type in Mll4FC/FC ESCs) and truncated protein expression persisted in Mll4FC/FC ESCs, indicating that the milder phenotype was hypomorphic. Analysis of this hypomorphic allele is not presented in this paper. Our conclusions about the null Mll4 phenotype are supported by a briefly described homozygous BayGenomics Mll4 genetrap in intron 19 (Lee et al., 2013), which also resulted in embryonic lethality before embryonic day (E) 10.5. Although not investigated, this phenotype was severe and could be the same as Mll4A/A. Notably, another Mll4 allele may also present the same phenotype. Aiming to mutate the methyltransferase activity of MLL4, Jang et al. (2017) mutated three conserved tyrosines to alanines in the SET domain. However they observed MLL4 protein instability and may have inadvertently created a null. Again the embryos were not investigated except for observing severe early embryonic lethality that could be the same as Mll4A/A.
The Mll3 and Mll4 null phenotypes are strikingly different. Embryos lacking MLL3 appeared to develop normally until birth whereupon they died because they failed to breathe, although they gasped (Fig. 1E, Table 1). Embryos lacking MLL4 showed abnormalities during gastrulation and died shortly afterwards (Fig. 1F, Table 2). The earliest observable phenotype in Mll4A/A embryos was growth retardation starting at E6.5 followed by the appearance of a marked constriction at the embryonic/extra-embryonic boundary 1 day later (Fig. 1F). Later in development the Mll4A/A embryos displayed abnormal headfolds, absence of somites and heartbeat, and did not turn (Fig. 1F, Table 2). Despite this severe phenotype, Mll4A/A embryos displayed no observable change in global mono-, di- or trimethylation of H3K4 (Fig. S3). Furthermore, heterozygous Mll4 embryos exhibited abnormalities (see below) whereas heterozygous Mll3 embryos were normal.
Loss of MLL3 results in respiratory failure at birth
Both Mll3D/D and Mll3FDC/FDC mice died at birth and were indistinguishable (Table 1). Hence we now term both Mll3KO. To determine the cause of lethality of Mll3KO neonates, we followed natural delivery, taking care not to disturb maternal care. Mll3KO neonates quickly became cyanotic and died immediately after birth or were found dead (Fig. 1E, Table 1). These neonates had normal weight and morphology. The hearts of E15.5 and E18.5 Mll3KO fetuses showed no morphological abnormalities and were beating, indicating normal fetal circulation (Fig. S4A). Neonatal death by asphyxiation can be caused by a defect of the respiratory rhythm generator in the brain stem, which comprises the retrotrapezoid nucleus/parafacial respiratory group (RTN/pFRG) and pattern generator neurons of the ventrolateral medulla named the pre-Bötzinger complex (preBötC). Jmjd3 (Kdm6b) knockout mice, which lack one of the three known H3K27 demethylases (the other two being the UTX and UTY subunits of the MLL3/4 complexes) (Van der Meulen et al., 2014), die at birth because the preBötC is absent (Burgold et al., 2012). Therefore we looked for, and found, the preBötC in Mll3KO perinatal brain stem (Fig. S4B), excluding this explanation for the failure to breathe. Furthermore, the rib cage and palate were intact and the intercostal muscles as well as the diaphragm were normal in Mll3KO and littermate fetuses (Fig. S4C-E).
Considering that the structures of the palate, diaphragm, intercostal muscles, brain stem and heart were normal, and the fact that MLL3 is highly expressed in the lung (Fig. S5A), we focused on defects in the lung as the cause of respiratory failure. Lungs from E15.5 and E18.5 fetuses were examined for morphological differences. At the pseudoglandular stage, E15.5, Mll3KO lungs appeared normal in lobulation pattern and small bifurcations in the distal epithelium (Fig. S5B). Examination of phospho-histone H3 (PH3) during this highly proliferative stage of lung development revealed no significant difference between control and Mll3KO lungs (Fig. S5B).
At the saccular stage several Mll3KO fetuses, but not all, had smaller lungs than wild-type littermates (Fig. 2A). Hematoxylin and eosin (H&E) staining of E18.5 lung sections showed that Mll3KO lungs had thickened septae and smaller alveolar spaces (Fig. 2A). Staining with the epithelial marker E-cadherin (CDH1) showed normally developed epithelial lining, and the integrity of the proximal epithelium was confirmed by the tight junctional marker zonula occludens-1 (ZO-1; TJP1) (Fig. S5C). The differentiated proximal epithelium consists of neuroendocrine, ciliated and secretory club cells. Secretoglobin family 1A member 1 (SCGB1A1; CC10), which is a marker for club cells, was reduced in Mll3KO lungs (Fig. 2B). The specification of the various lung epithelial cell types requires a balance between differentiation and proliferation (Bellusci et al., 1997). The developing lung at the pseudoglandular stage is characterized by high proliferative activity, which drastically decreases during sacculation. Lungs of both genotypes underwent this inhibition of proliferation, although to a lesser extent in the Mll3KO (Fig. S6A). Concomitantly, extracellular matrix (ECM) proteins laminin α1 (LAMA1) and fibronectin were apparently more prevalent in the ECM of the basement membrane in Mll3KO lungs (Fig. S6B).
To unravel the molecular basis of the lung maturation defect, we performed a global transcriptome analysis on lung samples of E18.5. As expected for an H3K4 methyltransferase, we observed five times more mRNAs down- than upregulated. Analysis of differentially expressed genes (DEGs) at a false discovery rate (FDR) of 5% revealed 70 up- and 354 downregulated genes (Fig. 2C, Table S5). The most highly enriched pathways in the group of downregulated DEGs related to cell motility, vasculature development, regulation of cell differentiation and morphogenesis, and lung development (Fig. 2D). GO term analysis suggests a defect in lung vasculature. Preliminary analysis with antibody staining against CD31 (PECAM1) and α-smooth muscle actin (αSMA) demonstrated the establishment of lung vasculature (Fig. S6C). The functional intactness still needs to be determined.
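To illustrate how an FDR threshold partitions genes into up- and downregulated sets, the following minimal sketch applies the Benjamini-Hochberg adjustment to a list of p-values. The text does not state which multiple-testing procedure or software was used, so Benjamini-Hochberg is an assumption here, and the function names and toy inputs are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def split_degs(log2_fold_changes, pvalues, fdr=0.05):
    """Return (up, down) index lists of genes passing the FDR threshold."""
    qvalues = benjamini_hochberg(pvalues)
    up, down = [], []
    for i, (lfc, q) in enumerate(zip(log2_fold_changes, qvalues)):
        if q <= fdr:
            (up if lfc > 0 else down).append(i)
    return up, down


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the study.
    lfc = [1.2, -0.8, -2.1, 0.3]
    pvals = [0.01, 0.04, 0.03, 0.005]
    print(split_degs(lfc, pvals, fdr=0.05))
    # Ratio of the DEG counts reported in the text (354 down, 70 up).
    print(round(354 / 70, 2))
```

With the counts reported in the text, 354 downregulated and 70 upregulated genes, the down-to-up ratio is about 5.1, consistent with the statement that roughly five times more mRNAs were down- than upregulated.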
Using published datasets specific for distal lung epithelium [ciliated cells, club cells, alveolar type I and type II epithelial cell (AEC-I and -II)] (Treutlein et al., 2014), gene set enrichment analysis (GSEA) revealed a significant negative correlation exclusively for AEC-I marker genes (Fig. 2E, Table S5). Notably, 66% of the AEC-I marker genes were represented in our dataset (96 out of 145). Quantitative RT-PCR validated the RNA-seq results on all four selected AEC-I marker genes (Aqp5, Clic5, Pdpn and Hopx; Fig. 2F). The top downregulated mRNA was Aqp5, encoding a water channel protein. AQP5 staining visualizes the continuous squamous epithelium that lines the air spaces in control lungs at E18.5. However in Mll3KO lungs this characteristic marker was significantly reduced (Fig. 2B). Pdpn, which is required for effective maturation of AEC-I, was another downregulated AEC-I marker. We conclude that the AEC-I is the most affected cell type in MLL3 zygotic null development.
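The idea behind a negative enrichment score can be sketched with a simplified running-sum statistic. This is an unweighted, Kolmogorov-Smirnov-like version and not the weighted, permutation-tested procedure of standard GSEA software; the function name and the toy gene identifiers are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes: genes ordered from most upregulated to most downregulated.
    gene_set: marker genes of interest (for example, AEC-I markers).
    Returns the running-sum value with the largest absolute deviation from zero;
    a negative value means the set is concentrated among downregulated genes.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranking without covering it")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up at each set member, step down at each non-member.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    # Toy ranking for illustration only; g5 and g6 are the most downregulated.
    ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
    print(enrichment_score(ranking, {"g5", "g6"}))
    # Coverage of the marker set reported in the text: 96 of 145 genes.
    print(round(100 * 96 / 145))
```

In this toy ranking the marker set sits at the downregulated end, so the score is at its negative extreme. The coverage figure in the text follows directly from the counts: 96 of 145 marker genes is 66%.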
Incompletely penetrant neural tube Mll4 haploinsufficiency with sex distortion
The null alleles of Mll3 and Mll4 presented very different phenotypes in another way. No impact of the heterozygous Mll3 knockout was observed (Table 1), whereas many heterozygous Mll4 knockouts presented exencephaly. Neural fold defects were observed in Mll4A/+ embryos from E9.5 and further throughout gestation. At E9.5, some Mll4A/+ embryos had failed to close their neural tube. Later in embryogenesis, from E10.5 until E16.5, exencephaly was observed, with open cranial neural folds displaying an everted and enlarged appearance (Fig. 3A,B). This disorder was inherited with a parent-of-origin distortion. When only the father was the carrier of the Mll4A allele, on average 11% of the heterozygous embryos of all dissected developmental stages developed exencephaly, whereas 50% of the heterozygous embryos were affected if only the mother passed on the allele (Table 3). As expected, when both parents were heterozygous the number of exencephalic heterozygous embryos was in between these two frequencies at 20% (Table 2). Moreover, two-thirds of all exencephalic embryos were female (data not shown). Later in gestation, exposed neural folds degenerated, producing anencephaly (Fig. S8A). These embryos were found dead at P1, hence the low recovery rate of Mll4A/+ pups (Table 3). Because the Wnt1 gene is only 40 kb 3′ of the Mll4 gene on mouse chromosome 15, and Wnt1 is typically expressed in the dorsal midline of the developing hindbrain and spinal cord (Parr et al., 1993), we analyzed its expression using whole-mount in situ hybridization. As seen in Fig. 3C, Wnt1 is expressed normally in the exencephalic embryo, so disturbed Wnt1 expression is not the cause of the phenotype. Furthermore Otx2 was also normally expressed (Fig. 3C).
Adult heterozygous Mll4 mice present aspects of Kabuki syndrome
In humans, all MLL4 mutations associated with Kabuki syndrome are heterozygous. Having observed an embryonic heterozygous phenotype, we therefore looked for signs of Kabuki syndrome in viable Mll4A/+ mice after birth. Surviving Mll4A/+ pups were analyzed for facial, cranial and skeletal abnormalities but none was observed. However, we did find indications of the metabolic problems that are associated with Kabuki syndrome. Significant differences in body weight between wild-type and Mll4A/+ mice of both sexes from the age of 4 weeks were observed. Mll4A/+ mice of both sexes remained 30% smaller throughout adulthood (Fig. 4A) and showed a decreased amount of white adipose tissue (Fig. S8B). Although Mll4A/+ mice had the same fasting blood glucose levels as wild-type littermates, male Mll4A/+ mice displayed altered glucose tolerance, as their blood glucose levels declined faster and reached their initial values earlier (Fig. 4B). Also, insulin tolerance tests reported higher insulin sensitivity in male Mll4A/+ mice (Fig. 4C). In contrast female Mll4A/+ mice displayed a slight impairment in glucose tolerance and no change in insulin tolerance tests (Fig. 4B,C).
MLL4 is required for specification of the embryonic anterior-posterior axis
In contrast to the incompletely penetrant heterozygous Mll4 knockout, all Mll4A/A embryos displayed a phenotype similar to that observed in knockouts of Foxa2 (Hnf3b), Otx2 and Lhx1 (Lim1), in which specification of the anterior-posterior (A-P) embryonic axis is disrupted (Kinder et al., 2001). To examine A-P patterning in Mll4A/A embryos, we performed whole-mount in situ hybridizations for anterior and posterior molecular markers (Figs 5, 6) between developmental stages E6.5 and E7.75.
Lefty1, an early anterior marker expressed at the anterior visceral endoderm (AVE) of E6.5 embryos, was absent in Mll4A/A embryos. Expression of another AVE marker, Hex (Hhex), was restricted to the distal region of Mll4A/A embryos. Otx2 was prominent at the anterior ectoderm and visceral endoderm of control embryos but was detected at the distal epiblast and the overlying visceral endoderm of Mll4A/A embryos (Fig. 5A). Cerberus 1 (Cer1), an antagonist of Bmp4, Nodal and Wnt, is normally expressed at the AVE (Piccolo et al., 1999) but was only weakly expressed at the distal region in Mll4A/A embryos (Fig. 5B). The Wnt antagonist Dkk1 is typically expressed closest to the embryonic/extra-embryonic boundary; however, in Mll4A/A embryos expression was observed at the distal region (Fig. 5C).
As the anterior markers were misexpressed at the distal region of Mll4A/A embryos (Fig. 5), we expected defective posterior patterning. Wnt3 and brachyury (T) are the earliest markers of the primitive streak confined exclusively to the posterior side of control embryos. However, in Mll4A/A embryos the expression was seen ectopically at the embryonic/extra-embryonic boundary (Fig. 6A,B). Nodal is essential for the induction of the primitive streak (Conlon et al., 1994) and is normally expressed in the posterior embryonic ectoderm, expanding along the length of the primitive streak. This failed in Mll4A/A embryos. Bmp4 and Eomes mark the posterior and anterior region of the primitive streak, respectively, and in addition mark the extra-embryonic chorion and amnion. In Mll4A/A embryos both markers were detected at the embryonic/extra-embryonic boundary (Fig. 6C). The patterning of posterior markers in Mll4A/A embryos indicates that the primitive streak did not extend distally. Consequently paraxial mesoderm was absent (Fig. S7).
At the anterior region of the primitive streak is a distinct region called the node, from which the axial mesoderm such as head process and notochord develop (Yamanaka et al., 2007; Benazeraf and Pourquié, 2013). Because the primitive streak did not extend in Mll4A/A embryos, we examined whether the node and the node derivatives were present. Goosecoid (Gsc) was distally induced in control embryos but expression was more confined and proximal in Mll4A/A embryos. Foxa2 and Lhx1 marked the node and its derivatives in control embryos. In Mll4A/A embryos expression for both was seen at the proximal epiblast. Shh was expressed exclusively in the head process arising from the node in control embryos but was completely absent in Mll4A/A embryos (Fig. 6D). Despite the fact that Mll4A/A embryos expressed markers of the node they failed to establish the node derivatives.
At E8.5 Mll4A/A mutants were characterized by the absence of Mox1 (Meox1) transcripts confirming the lack of somites. Very weak and discontinuous localization of T verified that the node derivatives were not specified (Fig. S7). We noticed weak expression of Hoxb1, suggesting the presence of precursor cells for rhombomere 4 (Fig. S7).
MLL4 is essential for AVE migration
Shortly after implantation, the first asymmetry in normal embryos is evident as a proximal-distal (P-D) axis (Beddington and Robertson, 1999). At the distal tip, AVE cells undergo a transition from columnar to squamous, protrude filopodia and migrate unidirectionally as a collective towards the future anterior of the embryo until they reach the embryonic/extra-embryonic boundary (Srinivas et al., 2004). At E6.5, squamous AVE cells form a single layer reaching the embryonic/extra-embryonic boundary, as indicated by expression of HEX (Fig. 7A). In Mll4-deficient embryos, AVE cells failed to reach the embryonic/extra-embryonic boundary. Notably, HEX-expressing cells retained a cuboidal shape and displayed strong apical actin (Fig. 7A), indicating that the first defect in Mll4A/A embryos is a failure to undergo the columnar-to-squamous transition that precedes migration.
Despite the similarities between MLL3 and MLL4, including protein architecture, residence in apparently identical protein complexes, ubiquitous expression and high frequency of mutations in almost all human cancers, their null phenotypes in mouse development are dramatically different. MLL4 is indispensable for the establishment of the A-P axis and progression of gastrulation. MLL3 is not required until the final steps of lung development to ensure breathing at birth. It therefore appears that these two conserved paralogs are required for very different, highly specific functions in mouse development. In this regard, they are similar to MLL1, which is first required for definitive hematopoiesis at E12.5 with apparently little other contribution to the developing embryo (Ernst et al., 2004). The observation that these highly conserved proteins are only required for a few, very specific, developmental functions does not concur with the prevailing model that MLL3 and MLL4 are the transcription co-factors that deposit the universal epigenetic characteristic of enhancers, H3K4me1 (Lee et al., 2013; Rao and Dou, 2015; Piunti and Shilatifard, 2016). This conundrum could find a resolution if the MLLs are embedded in a system of extensive functional redundancy and backup. If so, their individual knockout phenotypes mainly reveal flaws in the backup system rather than their ongoing activities. Some evidence for the functional backup proposition has been acquired (Lee et al., 2013; Denissov et al., 2014; Chen et al., 2017, 2018).
MLL3 and defective respiration
Mll3D/D and Mll3FDC/FDC neonates died owing to failures in the final steps of lung maturation. MLL3 is not required for patterning of the lung but is required for two later processes: efficient differentiation of the distal lung epithelium, when squamous alveolar type I epithelial cells arise from columnar alveolar type II epithelial cells, and thinning of the mesenchyme, which may fail because of sustained proliferative activity during the canalicular/saccular phase. The concomitant occurrence of defects in two nearby cell types suggests an underlying failure of cell signaling during the final maturation of the lung.
MLL4 and defective AVE migration
Collective cell migration of the AVE in the mouse embryo precedes gastrulation (Rossant and Tam, 2009) (Fig. 7B). Before migration, extra-embryonic visceral endoderm cells change their shape from columnar to squamous, which involves cytoskeletal rearrangements and the projection of filopodia towards the direction of migration (Srinivas et al., 2004). At the migratory front of the cell the Rho GTPase RAC1 is positioned to regulate actin polymerization. Notably, the Rac1 knockout phenotype is comparable with Mll4A/A, characterized by failed AVE migration and lethality before E9.5 (Sugihara et al., 1998; Migeotte et al., 2010). NAP1 (NCKAP1), a component of the WAVE complex, acts downstream of RAC1 to control actin branching, and Nap1 mutants do not establish the A-P axis owing to failed AVE polarization and migration (Rakeman and Anderson, 2006). Considering these similarities, we suggest that MLL4 regulates RAC1 and/or the WAVE complex. Another possibility is that MLL4 acts through interaction with UTX, which was found in an unbiased screen to regulate cell migration (Thieme et al., 2013).
In the absence of MLL4, gene expression associated with the AVE including Hex, Dkk1 and Cer1 was established but mislocalized at the distal region of the embryo owing to the absence of migration. This migration is essential for the correct patterning of both the anterior and posterior of the embryo (Stower and Srinivas, 2014). Consequently, the failure of the AVE to reach the anterior embryonic/extra-embryonic boundary resulted in failed elongation of the primitive streak towards the distal tip, indicated by misexpression of T, Wnt3, Bmp4 and Eomes. As a result, node markers Gsc, Foxa2 and Lhx1 were restricted to the posterior-proximal region (Fig. 7B). These failures preceded the absence of mesodermal derivatives emerging from the primitive streak.
MLL3 and MLL4 also differ regarding their heterozygous phenotypes. In our survey of the six H3K4 methyltransferases in mouse development (Glaser et al., 2006, 2009; Andreu-Vieyra et al., 2010; Bledau et al., 2014; Denissov et al., 2014; Brici et al., 2017; Chen et al., 2017; Hanna et al., 2018), only the knockout of Mll4 has presented embryonic haploinsufficiency. Notably, Drosophila Trr also displays haploinsufficiency (Chauhan et al., 2012). Among the various explanations for haploinsufficiency, transcriptional synergy may be relevant for MLL4. Transcriptional synergy is based on co-operative recruitment of transcription factors to cis regulatory elements to achieve a transcriptional output, which involves a threshold and sigmoidal response to protein concentration (Veitia et al., 2018). Furthermore, a contribution by MLL4 to fixing the stability of decisions made by stochastic choices could explain the incomplete penetrance (Cook et al., 1998). Incomplete penetrance in a developmental choice indicates that MLL4 does not make the choice but rather reduces the error rate by either stabilizing the choice or counteracting mistakes. A primary role for epigenetic regulation in choice stabilization and error reduction concords with our recent findings in yeast where Set1C and the H3K4me3 demethylase Jhd2 act together as a quality control mechanism to ensure symmetrically trimethylated nucleosomes (Choudhury et al., 2019).
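The threshold argument can be made concrete with a Hill function, a common way to model a sigmoidal response to protein concentration. This is a minimal sketch under stated assumptions: the Hill form, the half-maximal concentration `k_half` and the cooperativity exponent `n` are illustrative choices, not measurements or a model taken from this study.

```python
def hill_response(concentration, k_half=1.0, n=4.0):
    """Fractional transcriptional output for a given factor concentration.

    k_half is the concentration giving half-maximal output and n is the Hill
    coefficient (n = 1 is non-cooperative; larger n gives a sharper threshold).
    Both parameters are hypothetical.
    """
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    c_n = concentration ** n
    return c_n / (k_half ** n + c_n)


def relative_output_at_half_dose(full_dose=1.0, k_half=1.0, n=4.0):
    """Output of a heterozygote (half dose) relative to wild type (full dose)."""
    half = hill_response(full_dose / 2, k_half, n)
    full = hill_response(full_dose, k_half, n)
    return half / full


if __name__ == "__main__":
    for n in (1.0, 4.0):
        ratio = relative_output_at_half_dose(n=n)
        print(f"Hill coefficient {n}: half dose gives {ratio:.2f} of wild-type output")
```

Under these illustrative parameters, halving the dose leaves about two-thirds of the output when binding is non-cooperative, but only about 12% when the response is strongly cooperative. A factor operating near such a threshold would therefore be expected to show dosage sensitivity, which is the sense in which transcriptional synergy may be relevant to Mll4 haploinsufficiency.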
The frequency of neural tube defects in Mll4A/+ embryos was affected by the sex of the mutant parent. The defects were more likely when the null allele was transmitted from the mother. Thus it seems likely that MLL4 contributes to oogenesis as well as neurulation. Alternatively, the sex distortion may relate to the sex-specific difference between MLL4 complexes; in females this includes only UTX, whereas in males both UTX and UTY are involved.
More than 300 genes have been implicated in neural tube closure defects, which are very common in humans, estimated at 1 per 1000 fetuses (Juriloff and Harris, 2018), and epigenetic mechanisms involving DNA and histone methylation have emerged as particularly important (Harris and Juriloff, 2010). Notably, folic acid supplementation during pregnancy, which elevates S-adenosylmethionine (SAM) levels, diminishes the probability of neural tube closure defects in certain cases (Greene and Copp, 2005). The accompanying proposition that females are more susceptible to neural tube defects due to the increased requirement for SAM in X-chromosome inactivation (Juriloff and Harris, 2000), may be relevant to our observations of sex distorted neural tube closure defects.
Concordant with Mll4 haploinsufficiency in the mouse, de novo heterozygous mutations of MLL4 are the primary cause of the rare congenital Kabuki syndrome, which involves mental retardation and a distinctive facial appearance. Further features of this phenotypically variable disorder include postnatal dwarfism, heart and kidney dysfunction, skeletal abnormalities, loss of hearing, gastrointestinal disorders and metabolic imbalances including hypoglycemia (Banka et al., 2012, 2015; Bogershausen et al., 2015; Yap et al., 2019). A very mild version of Kabuki syndrome has been observed in a mouse model based on a heterozygous, hypomorphic Mll4 allele (Benjamin et al., 2017). Similarly, the Mll4A/+ phenotype presents only limited aspects of Kabuki syndrome: reduced body weight, stunted growth and hypoglycemia together with reduced body fat. Nevertheless, these haploinsufficient Mll4 observations indicate that the amount of expressed MLL4 is crucial to its function. Mll4 haploinsufficiency reveals a requirement for MLL4 function in development after its role in the AVE columnar-to-squamous transition. Mll4 conditional mutagenesis also revealed later requirements in heart development, myogenesis and adipogenesis (Lee et al., 2013; Ang et al., 2016). Notably, Ang et al. (2016) presented evidence of aortic haploinsufficiency.
Because MLL4 is required for one of the earliest collective cell migrations in mouse development, and because it is required for the cellular migration processes involved in closure of the neural tube, we suggest that MLL4 is a master regulator of cell migration gene expression programs. Although diverse and as yet only partially documented, the evidence for the various MLL4 functions in mouse development is nevertheless still limited to a small group of specific indications. For MLL3 the indications are even more limited and, together, these indications do not offer an explanation for the extraordinary prevalence of MLL3 and MLL4 mutations in human cancers. The proposition that the MLL system is deeply embedded in functional redundancy and backup may resolve this conundrum. Testing this proposition requires concerted conditional mutagenesis, which is underway.
MATERIALS AND METHODS
The targeting constructs for Mll3 and Mll4 were generated using Red/ET recombineering (Fu et al., 2010). For Mll3 an FRT-SA-GT0-T2A-lacZneo-CoTC-FRT-loxP cassette was inserted into intron 48. In addition, a loxP-rox-PGK-Blasticidin-pA-rox cassette was introduced into intron 49. For Mll4, a loxP site was introduced into intron 4 using a loxP-zeo-loxP cassette with subsequent removal in E. coli by Cre recombination using pSC101-BAD-Cre-tet (Anastassiadis et al., 2009). Then a loxP-FRT-SA-IRES-lacZneo-pA-FRT cassette was inserted in intron 1 of the gene. The homology arms were 5′ 4.6 kb/3′ 5 kb and 5′ 4.7 kb/3′ 4.9 kb for the Mll3 and Mll4 targeting constructs, respectively.
Gene targeting and generation of conditional knockout mice
Gene targeting in R1 ESCs was performed as described (Bledau et al., 2014). The correctly targeted Mll3 clones were electroporated with CAGGS-Dre-IRES-puro expression vector (Anastassiadis et al., 2009) and clones screened by PCR for complete recombination and sensitivity to blasticidin. After germline transmission, Mll3D/+ mice were crossed to CAGGs-Flpo (Kranz et al., 2010) to generate Mll3FD/+ mice and then crossed to PGK-Cre (Lallemand et al., 1998) to produce Mll3FDC/+ mice. Mll3D/+ and Mll3FDC/+ mice were backcrossed to C57BL/6JOlaHsd mice (>15 generations). Mll4A/+ mice were backcrossed to CD1 mice (>15 generations). Primers for genotyping are provided in Table S1. All animal experiments were performed in accordance with German animal welfare legislation, and were approved by the relevant authority: the Landesdirektion Dresden.
Western blot, whole-mount X-gal staining and immunostaining
ESCs were homogenized in buffer E [20 mM HEPES (pH 8.0), 350 mM NaCl, 10% glycerol, 0.1% Tween 20, 1 mM PMSF, 1× complete protease inhibitor cocktail] and protein extracts were obtained after three cycles of freezing and thawing. Whole cell extracts were subsequently separated by NuPAGE 3-8% Tris-acetate gel (Invitrogen), transferred to PVDF membranes and probed with an MLL4-specific polyclonal antibody. The antibody was generated by immunizing rabbits with a mixture of three KLH-conjugated synthetic peptides from the central part of the MLL4 protein (QRPRFYPVSEELHRLAP, NGDEFDLLAYT, KQQLSAQTQRLAPS) (Table S4).
Embryos were dissected, fixed with 0.2% glutaraldehyde and X-gal stained as previously described (Kranz et al., 2010). Embryos and organs were dissected and fixed with 4% paraformaldehyde (PFA) overnight. Dehydration and paraffin infiltration utilized the Paraffin-Infiltration-Processor (STP 420, Zeiss). Dehydrated tissues were embedded in paraffin (Paraffin Embedding Center EG116, Leica) and sectioned. Antigen retrieval was performed by microwaving slides in 10 mM citrate buffer (pH 6.0) for 12 min (Microwave RHS 30, Diapath). Immunohistochemical staining was performed as previously described (Bledau et al., 2014). Images were collected using an Olympus WF upright microscope. For immunofluorescence, sections were permeabilized in 0.5% Triton X-100 in PBS for 10 min, blocked for 1 h at room temperature (RT), incubated with primary antibody overnight at 4°C, followed by secondary antibody for 2 h at RT. Sections were mounted with Mowiol and imaged with Zeiss Laser Scanning Confocal Microscope LSM/780. Antibody information and dilutions are in Table S3.
Whole-mount in situ hybridization
Standard procedures were employed (Riddle et al., 1993; Piette et al., 2008). Digoxigenin-labelled riboprobes were: T (Herrmann, 1991), Mox1, Hoxb1 and Wnt1 (Glaser et al., 2006), Otx2 (Ang et al., 1996), Nodal, Eomes, Bmp4 and Hex (Norris et al., 2002), Cer1 (Belo et al., 1997), Foxa2 (Norris et al., 2002) and Gsc, Lhx1 and Wnt3 (Liu et al., 1999), Dkk1 and Lefty1 (Stuckey et al., 2011), Shh and Tbx6 (Alten et al., 2012). Anti-dig-AP antibody and NBT/BCIP colorimetric signal detection were used for whole-mount in situ hybridizations. Embryos were imaged using a Nikon SMZ 1500 stereomicroscope.
Whole-mount immunofluorescence for HEX
PFA-fixed embryos were permeabilized in 0.5% Triton X-100 in PBS for 1 h at RT, incubated with anti-HEX antibody (Hoshino et al., 2015) overnight at 4°C followed by goat anti-rabbit IgG-CFL 488 (Santa Cruz Biotechnology) secondary antibody. The embryos were imaged with Zeiss Laser Scanning Confocal Microscope LSM/780.
Glucose and insulin tolerance test
For the glucose tolerance test, mice were fasted for 16 h before 1.5 mg glucose per g body weight was administered by gavage. For the insulin tolerance test, mice were injected intraperitoneally with 0.75 mU insulin per g body weight after 6 h of fasting.
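Both doses scale linearly with body weight, so the amount given to each animal follows from a single multiplication. The short sketch below shows the arithmetic; the function names and the 24 g body weight are hypothetical and serve only as an example.

```python
def glucose_dose_mg(body_weight_g, mg_per_g=1.5):
    """Glucose given by gavage, in mg, at 1.5 mg per g body weight."""
    return body_weight_g * mg_per_g


def insulin_dose_mU(body_weight_g, mU_per_g=0.75):
    """Insulin injected intraperitoneally, in mU, at 0.75 mU per g body weight."""
    return body_weight_g * mU_per_g


# Hypothetical 24 g mouse (illustrative value, not taken from the study).
example_weight_g = 24.0
print(glucose_dose_mg(example_weight_g))  # 36.0 mg glucose
print(insulin_dose_mU(example_weight_g))  # 18.0 mU insulin
```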
Reverse transcription and real-time quantitative PCR (qRT-PCR) analysis
Total RNA was isolated using Trizol (Sigma-Aldrich) and reverse transcribed using the AffinityScript Multiple Temperature cDNA Synthesis kit (Agilent Technologies). qRT-PCR was performed with GoTaq qPCR Master Mix (Promega) using an Mx3000P QPCR System (Agilent Technologies). Ct values were normalized against Rpl19. Primer sequences and length of the amplified products are given in Table S2. Fold differences in expression levels were calculated according to the 2^(−ΔCt) method (Livak and Schmittgen, 2001).
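The normalization and fold-difference calculation can be sketched as follows. This is an illustrative outline, not the analysis script used for the study: the function names and all Ct values are hypothetical, and the calculation assumes that each amplification cycle doubles the product (amplification efficiency close to 100%).

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target relative to the reference gene (here Rpl19).

    Computes 2 ** (-delta_ct), where delta_ct = Ct(target) - Ct(reference).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def fold_difference(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Fold difference of sample A over sample B after reference normalization."""
    return (relative_expression(ct_target_a, ct_ref_a)
            / relative_expression(ct_target_b, ct_ref_b))


# Hypothetical Ct values, for illustration only.
control = relative_expression(ct_target=24.0, ct_reference=18.0)  # delta Ct = 6
mutant = relative_expression(ct_target=26.0, ct_reference=18.0)   # delta Ct = 8
print(control, mutant)
print(fold_difference(26.0, 18.0, 24.0, 18.0))  # mutant relative to control
```

In this example the mutant sample crosses the threshold two cycles later than the control at equal Rpl19 Ct, which corresponds to a fourfold lower relative expression.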
mRNA expression profiling
Total RNA from E18.5 lung samples of control (n=3) and Mll3KO (n=3) mice was purified using Trizol (Sigma-Aldrich) and its quality was assessed using a Bioanalyzer (Agilent) with the RNA 6000 Nano Kit (Agilent). mRNA was isolated from 1 μg total RNA by poly-dT enrichment using the NEBNext Poly(A) mRNA Magnetic Isolation Module (New England Biolabs) according to the manufacturer's instructions. Samples were then directly subjected to the workflow for strand-specific RNA-seq library preparation (Ultra Directional RNA Library Prep II, New England Biolabs). The resulting libraries were pooled in equimolar quantities for 75 bp single-read sequencing on an Illumina NextSeq500. FastQC (Babraham Bioinformatics) and RNA-SeQC (v188.8.131.52) (DeLuca et al., 2012) were used to perform basic quality control on the resulting reads. Fragments were then aligned to the mouse genome (GRCm38/mm10) using GSNAP (2019-06-10) (Wu and Watanabe, 2005; Wu and Nacu, 2010) and a table of read counts per gene was created based on the overlap of the uniquely mapped reads with the Ensembl Gene annotation v. 98 for mm10, using featureCounts (v1.6.3) (Liao et al., 2014). Normalization of the raw read counts based on library size and testing for differential gene expression between the conditions were performed using the DESeq2 R package (v. 1.24.0) (Anders and Huber, 2010). Genes with an adjusted P-value (padj)≤0.05 were considered significantly differentially expressed, thereby accepting a maximum of 5% false discoveries. To identify enrichment for particular biological processes associated with the differentially expressed genes (DEGs), the DAVID GO/BP/FAT database (Huang da et al., 2009) was used. Gene set enrichment analysis was performed using GSEA software from the Broad Institute (Subramanian et al., 2005).
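The adjusted P-value threshold can be illustrated with a plain implementation of the Benjamini-Hochberg procedure, which is the default multiple-testing adjustment in DESeq2. The sketch assumes that this default was used and omits DESeq2's independent filtering step; the gene labels and raw P-values are hypothetical and are not results from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical genes and raw P-values, for illustration only.
genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
raw_p = [0.01, 0.04, 0.03, 0.20]

padj = benjamini_hochberg(raw_p)
significant = [g for g, q in zip(genes, padj) if q <= 0.05]
print(list(zip(genes, padj)))
print(significant)
```

Selecting genes with padj≤0.05 controls the expected proportion of false positives among the selected genes at 5%, which is the sense in which a maximum of 5% false discoveries is accepted.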
We thank Mandy Obst, Doris Müller, Isabell Kolbe, Madeleine Walker and Stefanie Weidlich for excellent technical assistance. We also thank the Biomedical Services of the Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany, for the excellent service and technical assistance. We thank Dr Siddharth Banka (University of Manchester, UK) for discussions and Dr Tristan A. Rodriguez (Imperial College, London, UK), Dr Karin Schuster-Gossler (MH Hannover, Germany), Dr Janet Rossant (University of Toronto, Canada), Dr Elizabeth Robertson (University of Oxford, UK), Dr Dominic Norris (MRC Harwell, UK), Dr Rachel D. Mullen (MD Anderson, TX, USA), Dr Hans Schöler (Max Planck Institute for Molecular Biomedicine, Münster, Germany) and Dr Andrew P. McMahon (University of Southern California, CA, USA) for providing probes for whole-mount in situ hybridization and Dr Go Shioi (RIKEN Kobe, Japan) for providing the anti-HEX antibody and advice. The Advanced Imaging Facility, a core facility of the CMCB Technology Platform at Technische Universität Dresden, http://biotp.tu-dresden.de/facilities/advanced-imaging/ assisted this research.
Conceptualization: D. Ashokkumar, Q.Z., C.M., A.S.B., J.F., K.A., A.F.S., A.K.; Methodology: D. Ashokkumar, Q.Z., C.M., A.S.B., R.N., A.D., N.G., J.F., K.A., A.K.; Validation: D. Ashokkumar, Q.Z., A.K.; Formal analysis: D. Ashokkumar, Q.Z., D. Alexopoulou, N.G., A.F.S., A.K.; Investigation: D. Ashokkumar, Q.Z., C.M., A.S.B., J.F., K.A., A.K.; Resources: R.N., A.D., J.F., K.A.; Data curation: D. Alexopoulou; Writing - original draft: D. Ashokkumar, Q.Z., A.F.S., A.K.; Writing - review & editing: D. Ashokkumar, A.F.S., A.K.; Visualization: D. Ashokkumar, Q.Z., C.M.; Supervision: K.A., A.F.S., A.K.; Funding acquisition: D. Ashokkumar, A.F.S., A.K.
This work was supported by funding from the Else Kröner-Fresenius-Stiftung (2012_A300 to A.K. and A.F.S.), the Deutsche Forschungsgemeinschaft (KR 2154/6-1 to A.K. and STE 903/12-1 to A.F.S.), the Deutsche Krebshilfe (110560 to A.K. and A.F.S.) and the Scholarship Program for the Promotion of Early-Career Female Scientists of Technische Universität Dresden (to D. Ashokkumar).
RNA-seq data have been deposited in GEO under accession number GSE146915.
The authors declare no competing or financial interests.