The mammary gland is a unique tissue and the defining feature of the class Mammalia. It is a late-evolving epidermal appendage that has the primary function of providing nutrition for the young, although recent studies have highlighted additional benefits of milk including the provision of passive immunity and a microbiome and, in humans, the psychosocial benefits of breastfeeding. In this Review, we outline the various stages of mammary gland development in the mouse, with a particular focus on lineage specification and the new insights that have been gained by the application of recent technological advances in imaging in both real-time and three-dimensions, and in single cell RNA sequencing. These studies have revealed the complexity of subpopulations of cells that contribute to the mammary stem and progenitor cell hierarchy and we suggest a new terminology to distinguish these cells.
The mammary gland is a dynamic organ that develops primarily postnatally, undergoing extensive expansion during puberty, followed by cycles of growth and regression with each estrous cycle and every pregnancy/lactation/involution cycle. The mammary gland epithelium is a bi-layer composed of luminal cells that line the ductal or alveolar lumen, and myoepithelial cells (also called basal cells) that surround the luminal layer and contact the stroma, which is composed of extracellular matrix, adipocytes and various immune cells. There are three distinct stages of morphogenesis as depicted in Fig. 1. The first commences in utero, around embryonic day (E) 10.5 in the mouse and 7 weeks in the human fetus (McNally and Stein, 2017; Propper et al., 2013), and establishes the rudimentary ductal tree that then undergoes isometric growth postnatally until puberty. A surge in estrogen production at puberty triggers the second morphogenetic event, which is the formation of club-shaped structures called terminal end buds (TEBs) at the tips of the ducts. TEBs comprise cap and body cells and become sites of massive proliferation and bifurcation, resulting in a highly branched network of ducts and side branches that fill the adipose-rich fat pad in which they are embedded. Subsequently, cyclical expansion and regression of tertiary branches occurs with each estrous cycle. The third, and most dramatic, morphogenetic event occurs in response to pregnancy. Lobuloalveolar structures form at the tips of tertiary branches, primarily in response to progesterone, expand as pregnancy progresses and terminally differentiate during lactation to produce milk. Concomitant with this expansion of the alveolar epithelium is the dedifferentiation of the adipocytes in the fat pad, most likely to make space for the massive expansion of the epithelium and provide lipid for milk fat production (Zwick et al., 2018). 
Notably, when lactation ceases, these lobuloalveolar structures regress by a combination of programmed cell death and tissue re-modelling, and the gland returns to a branched ductal network similar to the pre-pregnant state.
There has been considerable interest in identifying and defining mammary stem cells (MaSCs) and determining their capacity to generate a branched ductal network and lobuloalveolar structures throughout life. Indeed, since we wrote our first review a decade ago (Watson and Khaled, 2008), new experimental approaches and technical advances have dramatically added to our knowledge of mammary gland development. Our view of MaSCs, the utility of the once gold-standard MaSC transplantation assay (Daniel et al., 1968), and the use of cell-surface markers to isolate and quantify cell subpopulations have changed (Shackleton et al., 2006; Stingl et al., 2006). It is now clear that there are subsets of MaSCs that may be unipotent or bi/multipotent, that may be quiescent or proliferative, and that are more or less susceptible to reprogramming by their microenvironment. The distinction between stem and progenitor cells is more than semantics and is especially challenging given the plasticity of mammary epithelial cells. We suggest that new terminology may be useful in describing these different subsets of stem and progenitor cells (Fig. 2). In brief, we suggest that MaSCs should not be defined by their expression of cell-surface markers nor their ability to repopulate a cleared fat pad, but should be defined by their true potential in vivo and in situ in the normal mammary gland. In this context, a MaSC must be at least bipotent with unlimited self-renewal and replicative potential, whereas a progenitor cell is lineage restricted and must be long-lived with high self-renewal and replicative potential.
In this Review, we provide an overview of the three distinct stages of mammary gland development, and discuss recent insights into the cellular, molecular and genetic events associated with morphogenetic and functional changes that occur at these stages. The spectacular expansion of lineage-tracing studies in the mammary gland has been prompted by the generation of lineage-specific promoter-driven reporter genes combined with temporal induction in defined cohorts of cells, using tamoxifen- or doxycycline-inducible constructs (reviewed by Zhou et al., 2019). We highlight recent advances in tissue clearing and deep imaging, live imaging, single cell RNA sequencing (scRNA-seq) and epigenetic analyses. We discuss the impact of these new approaches on our understanding of mammary gland development and the unexpected complexity of the mammary epithelial hierarchy. We draw on a wide range of studies to provide a comprehensive overview of mammary gland development, focusing on the mouse, as this is the most experimentally tractable species.
Embryonic mammary gland development
The early stages of mammary gland development are independent of hormones, unlike subsequent phases. Mammary gland formation is first visualised around E10.5 in the mouse by the expression of Wnt10b in bilateral streaks that run from the fore- to hindlimb buds. These mammary, or milk, lines give rise to five pairs of placodes that are visible at E11.5, and arise from the surface ectoderm of the embryonic skin. These placodes subsequently invaginate into the underlying tissue to give rise to buds, which then become embedded within a condensed mammary mesenchyme. These buds gradually increase in size, partly through cell hypertrophy and recruitment of ectodermal cells (Lee et al., 2011), until E15-E16, when they start to form a primary sprout that invades the secondary mammary mesenchyme. At this point, branching morphogenesis is initiated (Cowin and Wysolmerski, 2010) and a nipple sheath is formed (Propper et al., 2013) (Fig. 1A).
Partly because of their small size and difficulty of detection, investigating the development of embryonic mammary glands is challenging. Nonetheless, a number of approaches have been used including reporter gene expression, immunohistochemistry and gene ablation. Early genetic studies in mice have demonstrated that the five pairs of glands do not all require the same genetic components and that, although placode pairs develop symmetrically, they do not develop synchronously: placode 3 appears first, followed by placode 4, then placodes 1 and 5, and finally placode 2 (Cowin and Wysolmerski, 2010). A striking example of different genetic regulation is provided by the deletion of Tbx3, which results in loss of the 3rd pair of placodes (Davenport et al., 2003). In humans, mutations in TBX3 can result in failure to develop breasts, a feature of ulnar-mammary syndrome (Jerome-Majewska et al., 2005). The reciprocal signalling between mammary epithelium and its surrounding stroma is an essential component of mammary gland morphogenesis. In the embryo, the fibroblast growth factor (Fgf), Wnt, Ectodysplasin-A1 (Eda) (mediated by NF-κB) and parathyroid hormone related protein (PTHrP; also known as Pthlh) signalling pathways are predominant. For example, deficiency in Fgf10 or its receptor Fgfr2b blocks induction of all mammary placodes except the fourth. In contrast, loss of the Wnt signal mediator Lef1 leads to absence of only placodes 2 and 3, whereas epithelial overexpression of the soluble Wnt inhibitor Dkk1 completely prevents mammary placode formation (Cowin and Wysolmerski, 2010). Although Eda is dispensable for placode formation, overexpression of Eda in the ectoderm results in the formation of supernumerary mammary placodes, particularly between placode pairs 3 and 4 (Lindfors et al., 2013). The Eda pathway regulates expression of Fgf20, which in turn regulates mammary bud growth, and also TEB formation, ductal outgrowth and branching during puberty (Elo et al., 2017). 
Other genes important in determining placode number and formation include Hoxc8, which is transiently expressed in surface ectoderm at E10.5. Misexpression of Hoxc8 results in the formation of several ectopic mammary placodes (Carroll and Capecchi, 2015). Formation of placode pairs 3 and 5 requires the repressor function of the Hedgehog signalling pathway regulator Gli3 (Chandramouli et al., 2013a). The initiation of bud outgrowth is triggered by expression of PTHrP in the epithelium, and mice null for PTHrP or its mesenchymal receptor (Pth1r) display little to no bud sprouting. The role of the mammary mesenchyme has been further illustrated using promoter-driven reporter mice for latent TGFβ-binding protein 1 (LTBP1), which is expressed in early mammary mesenchyme at around E12-E12.5 in a halo surrounding each mammary bud, subsequently becoming restricted to areolar muscle cells (Chandramouli et al., 2013b). LTBP1 is also expressed during the differentiation of the nipple epithelium coincident with suppression of hair follicle formation in the areola (Chandramouli et al., 2013a). In late embryogenesis, when the sprout is formed (∼E17.5), LTBP1 is expressed in the luminal cells facing a microlumen, but not in the ductal tips that are multi-layered (Chandramouli et al., 2013b). This pattern is later reflected in TEBs during puberty, where only cells lining the lumen express LTBP1 while most body and cap cells do not.
Fetal mammary stem cells
It is self-evident that MaSCs are required to generate a functional mammary gland and to regenerate the gland after periods of regression such as post-lactational involution. Whether the pools of stem cells are similar or distinct depending on the stage of development has been a long-standing question. Although fetal MaSCs (fMaSCs) must be the source of all other MaSCs, they may not be required postnatally after the rudimentary branched structure in the fetus has been formed. Should this be the case, the question arises as to the nature of adult MaSCs; how they are produced from fMaSCs, and how many subtypes of adult MaSCs exist? Technological developments in the past decade and the use of lineage-tracing studies to complement reporter gene expression and gene ablation studies have provided exciting new insights into embryonic mammary gland development and the origin and nature of fMaSCs and their descendants (Fig. 2).
One of the first markers of fMaSCs to be identified was Lgr5, a Wnt-regulated target gene. However, although Lgr5 is a marker for fMaSCs, it is not essential for stem cell activity (Trejo et al., 2017) and its role is unclear. One study suggests that a single Lgr5-expressing cell can reconstitute an entire mammary gland, and that Lgr5 is essential for postnatal development (Plaks et al., 2013), whereas another suggests that the progeny of Lgr5-expressing cells switch from a luminal to a myoepithelial fate within the first 12 days of postnatal development (de Visser et al., 2012). Disruption of another canonical Wnt signalling pathway component, Lrp6, results in stunted embryonic branching morphogenesis and an underdeveloped fat pad (Lindvall et al., 2009).
fMaSCs arise from keratin 14 (K14; Krt14)-expressing cells, which first become detectable at E12.5 (although they must arise before this stage) and reach peak levels at E18 (Wuidart et al., 2018). fMaSCs give rise to both luminal and basal lineages (Boras-Granic et al., 2014; Fu et al., 2017; Rodilla et al., 2015; Spike et al., 2012; Trejo et al., 2017). Although fMaSCs have multipotent activity upon transplantation into a cleared fat pad, their true developmental potential in situ requires lineage tracing at clonal density. A recent elegant study using intra-amniotic injection of lentivirus to barcode embryonic epidermal cells at E9.5 (before placode formation) has shown that mammary glands are derived from bipotent fMaSCs that arise early. Furthermore, a small number of such cells (∼120) are sufficient to generate an entire mammary gland (Ying and Beronja, 2020). The bipotency of fMaSCs has been demonstrated further using the multicolour Confetti mouse in combination with tetracycline-inducible Cre-mediated recombination for clonal analysis (Wuidart et al., 2018). This study has shown that fMaSCs have a ‘hybrid’ basal and luminal gene expression signature, and that expression of ΔNp63 promotes the switch towards a basal cell fate (Wuidart et al., 2018). Another recent study has revealed the complexity of the fMaSC hierarchy (Giraddi et al., 2018). RNA-seq of more than 1000 single Epcam+ cells isolated from E18 mammary glands has shown that individual fMaSCs co-express genes associated with distinct adult mammary lineages (e.g. luminal and basal) and that fMaSCs can be distinguished from their precursors and progeny. Interestingly, fMaSCs have heterogeneous transcriptional states and do not form a distinct subcluster, suggesting that a unique fMaSC population does not exist, at least at this stage of development (Giraddi et al., 2018) (Fig. 2). 
It is notable that a single embryonic MaSC has a remarkable capacity to contribute to postnatal development, as shown using a single cell labelling approach (Lloyd-Lewis et al., 2018).
Perhaps the most surprising observation, however, is the early switch from multipotency to unipotency during embryonic development. This can occur as early as E12.5, as shown using the Notch1 promoter to drive Cre-mediated recombination of the multicolour Confetti reporter line followed by imaging whole-mount mammary glands in 2-week-old mice (Lilja et al., 2018). Notch1 is expressed in the majority of the cells in the mammary bud (Muzumdar et al., 2007) and cells targeted by the Notch1 promoter in embryogenesis appear to show no lineage bias. Using a combination of imaging and mathematical modelling, bipotency was found to be undetectable after E15.5, although these cells remain undifferentiated (Lilja et al., 2018). A role for Notch1 as a master regulator of luminal cell fate has been demonstrated using Notch1 gain-of-function mice. Ectopic Notch1 activation at the onset of puberty is even sufficient to switch basal cells to estrogen receptor (ERα; Esr1)-negative luminal cells (Lilja et al., 2018). This complements the cell-fate switch observed by inducing expression of ΔNp63 in embryonic or committed luminal cells (Wuidart et al., 2018), and is consistent with the negative regulation of ΔNp63 by Notch signalling (Yalcin-Ozuysal et al., 2010). Finally, expression of the zinc-finger transcriptional repressor Blimp1 (PRDM1) can be detected in the E17.5 embryonic mammary gland (Elias et al., 2017). The Blimp1-expressing cells are progenitors that are long-lived, survive multiple pregnancy/involution cycles and give rise to progeny that do not express Blimp1, ERα or progesterone receptor (PR; Pgr), but are of the Elf5+ lineage (Elias et al., 2017).
The epigenetic landscape of fetal cells has been recently revealed by single cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) coupled with machine learning approaches (Shema et al., 2019). Interestingly, the epigenetic landscape of E18 fetal cells is partially specified into states that resemble one fetal and three major adult cell types (basal, luminal progenitor and mature luminal). Thus, these cells are already poised to differentiate into their corresponding lineages after birth (Chung et al., 2019). Of note, p63 (Trp63) interacts with KMT2D (MLL4), a major mammalian H3K4 mono- and di-methyltransferase that is essential for enhancer activation during cell differentiation (Lin-Shiao et al., 2018). Therefore, the Notch1-p63 axis could include an epigenetic switch during mammary lineage determination.
These experimental approaches could result in rare cell types being missed. Notwithstanding, the foregoing studies suggest that fMaSCs are a mixed population of true multipotent cells that arise early, shortly after E9.5, and lineage-primed cells that are not fully committed to either the basal or luminal lineage before birth.
Perinatal mammary gland development
Before birth, the mammary gland comprises a small ductal tree with one primary duct and 10-15 branches embedded within a nascent fat pad (Boras-Granic et al., 2014). In the perinatal period, MaSCs become lineage restricted and contribute to either basal or luminal progeny. We suggest that these cells are called basal enduring progenitors (BEPs) and luminal enduring progenitors (LEPs), as they are long-lived, unipotent, and have a high proliferative and self-renewal capacity. A number of studies have shed light on the timing of this switch. Labelling of Notch1-expressing cells from postnatal day (P) 3 marks only luminal cells, whereas labelling at birth marks only basal cells (Lilja et al., 2018). Interestingly, Notch1 expression is further restricted, being found primarily in luminal cells that do not express ERα by about 3 weeks of age. This conclusion is supported by independent work (Rodilla et al., 2015; Van Keymeulen et al., 2017; Wang et al., 2017).
A few multipotent, embryonically derived MaSCs do, however, remain. These are mostly quiescent, retain label for extended periods of time and are long-lived (Boras-Granic et al., 2014; dos Santos et al., 2013). When such cells re-enter the cell cycle, they can contribute to both luminal and basal lineages (Boras-Granic et al., 2014). The transcriptional regulator Foxp1 re-activates these cells, at least in part, by repressing the expression of the transmembrane cell adhesion-associated protein tetraspanin 8 (Tspan8) (Fu et al., 2018). The mechanism by which these bipotent cells exit the cell cycle and remain quiescent is unknown; however, Bcl11b, a zinc-finger transcription factor, regulates a subset of quiescent stem cells by inducing them to enter the G0 phase of the cell cycle (Cai et al., 2017). The functionally related protein Bcl11a is expressed in mammary placodes from E12.5 and is predominantly expressed in luminal progenitors in adult mice (Khaled et al., 2015). The rare Bcl11b-expressing cells appear distinct from a small population of proliferative cells marked by expression of the Wnt3A target gene Procr (protein C receptor) (Wang et al., 2015). Fluorescence-activated cell sorting (FACS) analysis shows that these populations of cells do not overlap (Cai et al., 2017). Interestingly, Procr-expressing cells are not found in TEBs, and lineage tracing suggests that only basal and stromal cells express Procr, with progeny contributing subsequently to the luminal lineage. It would be interesting to determine whether these apparently distinct MaSCs serve a different function in the mammary gland, such as homeostasis versus repair, or whether they correspond to different stem cell niches. Thus, with the exception of long-lived, primarily quiescent MaSCs, luminal and basal progenitor identities are imparted at birth and are self-sustaining (Lilja et al., 2018). 
The molecular signal for this switch in fate is unlikely to be hormonal given the timing of the switch, unless it is related to the precipitous drop in progesterone at birth, but it may be epigenetic.
Puberty is marked by a rise in the levels of estrogen, inducing the primary ductal network to rapidly elongate and form an internal lumen (Mailleux et al., 2007). TEBs, club-shaped structures, form at the tips of the elongating ducts and comprise an outer single cell layer of cap cells and inner multi-layered body cells (Paine and Lewis, 2017) (Fig. 1B). The cap cells of the TEB express s-SHIP (stem-SH2-containing 5′-inositol phosphatase; Inpp5d), a marker of various tissue stem cell populations (Tu et al., 2001). Expression of an s-SHIP-GFP reporter becomes undetectable in the subtending duct, subsequently becoming reactivated in basal cells in the alveolar bud during pregnancy (Bai and Rohrschneider, 2010). More recently, these cap cells have been shown to express Par3-like polarity protein (Huo and Macara, 2014) and have Wnt/β-catenin activity that diminishes in concert with s-SHIP in the neck of the TEB. Meanwhile, expression of the alternative Wnt receptor Ror2, which can inhibit β-catenin-dependent signalling, is upregulated (Roarty et al., 2015). This indicates that Wnt/β-catenin activity, Par3-like protein and s-SHIP activity are restricted to basal progenitor cell populations. Most TEB cells are highly proliferative and contribute to ductal elongation, with TEBs bifurcating (or terminating) stochastically to form branches. TEBs contain lineage-restricted pools of progenitor cells that exhibit heterogeneous expression profiles (Scheele et al., 2017). Lateral side branching occurs during the estrous cycle downstream of progesterone signalling that is mediated partly by Wnt4 and RANKL (receptor activator of nuclear factor κB ligand; Tnfsf11) (Joshi et al., 2010) and the transcription factor Id2 (Seong et al., 2018). Wnts promote survival of the cap cells in TEBs by preventing the nuclear accumulation of FoxO transcription factors (Chakrabarti et al., 2018). 
Wnts are produced by fetal macrophages in response to the secretion of the Notch ligand Delta-like canonical Notch ligand 1 (Dll1) by cap cells. Indeed, it is well established that macrophages and other leukocytes (Coussens and Pollard, 2011; Gouon-Evans et al., 2000; Ingman et al., 2006) are required for mammary gland development. Furthermore, analysis of mammary glands in which the transcription factor signal transducer and activator of transcription 5 (Stat5) is specifically deleted in macrophages shows a delay in ductal elongation, together with enhanced branching and elevated epithelial proliferation (Brady et al., 2017). Once the mammary fat pad is filled, these TEBs regress and presumably the cap cells disappear or become quiescent.
A long-standing question is whether ER+ and ER− lineages derive from the same progenitor (Fig. 2). Using doxycycline-inducible ER-rtTA mice, genetic lineage tracing of ER+ luminal cells has been performed (Van Keymeulen et al., 2017). These studies conclude that ER+ cells are derived from ER+ lineage-restricted progenitors, which maintain this lineage during adult life (Van Keymeulen et al., 2017). Other work tracing cells that express prominin-1 (PROM1; CD133) reveals that these cells contribute to only the ER+ lineage, whereas SOX9-expressing cells maintain the ER− lineage (Wang et al., 2017). However, ablation of Prom1 has shown that it is not essential for mammary gland function, although in its absence there is reduced ductal branching (Anderson et al., 2011). Thus, prominin-1 is unlikely to be a driver of the ER+ lineage.
Although studies using lineage-specific promoters have provided valuable insights into the mammary epithelial cell hierarchy (Asselin-Labat et al., 2010; Rios et al., 2014; Van Keymeulen et al., 2011), there are caveats associated with this biased approach. First, these studies have used different mouse strains, some with, for example, human or bovine promoters and others with the Cre recombinase or tetracycline-responsive elements knocked in to the 3′UTR of the endogenous gene. This could result in different patterns of expression, and the treatment of animals with tamoxifen or tetracycline could itself perturb the growth of mammary epithelial cells differentially. Second, a minimal level of promoter activity may be required to produce sufficient Cre recombinase to mediate recombination at the reporter locus, and so a subset of cells may be missed. In addition, recent reports of Esr1 gene transcription without translation could impact the interpretation of lineage-tracing data (Cagnet et al., 2018). Furthermore, two subpopulations of luminal ductal cells, expressing either high or low levels of keratin 8 (K8; Krt8), have been detected using tissue clearing and deep three-dimensional (3D) imaging; surprisingly, only a single K8+ cell is present in each alveolus at lactation (Davis et al., 2016). Thus, the use of the K8 promoter to drive a reporter gene could potentially generate misleading data. A completely agnostic approach would overcome the confounding issues of using specific promoters and potentially mis-interpreting cellular proximity (Box 1).
One such approach allows the progeny of a single cell to be traced using a slippage cassette, encompassing a [CA]30 microsatellite repeat directly upstream of an out-of-frame reporter gene inserted into the Rosa26 locus (Kozar et al., 2013). A DNA replication error can put the reporter into frame. Because such replication errors are extremely rare and occur stochastically in any cell type, there is no bias in which cell is labelled, and the progeny of a single labelled cell can be traced at clonal density. This model has been combined with tissue clearing and deep 3D imaging procedures in the mammary gland, where only 1-2 labelling events were detected on average per gland, effectively eliminating the possibility of clone convergence. This approach demonstrated that the progeny of a highly proliferative labelled progenitor are unipotent in the adult, with only luminal or basal clones being detected (Davis et al., 2016). Labelled progeny are distributed sporadically to branching ducts, and quantification of their contribution to entire ductal trees suggested that pools of around 20 LEPs and 15 BEPs are sufficient to generate a major duct during puberty (Davis et al., 2016). These numbers are not dissimilar from the estimates achieved by lentiviral barcoding at around E9.5 discussed in the main text (Ying and Beronja, 2020). The integration of unbiased lineage tracing, using other technologies such as CRISPR-Cas9-induced genetic scars, coupled with single cell genomics such as scGESTALT (Raj et al., 2018; Spanjaard et al., 2018), is an exciting new area that will resolve many of the questions around cell fate decisions in the mammary gland.
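The frame arithmetic behind such a slippage reporter can be illustrated with a toy calculation. The sketch below is not taken from Kozar et al. (2013): it assumes, purely for illustration, that slippage adds or removes whole CA units (2 bp each) and that the reporter starts a known number of base pairs out of frame. The function names and the offset values are hypothetical.

```python
REPEAT_UNIT_BP = 2  # one CA dinucleotide (assumed unit of slippage)


def reporter_in_frame(frame_offset_bp: int, units_changed: int) -> bool:
    """Return True if the reporter is in frame after a slippage event.

    frame_offset_bp: bases by which the reporter is out of frame
        before slippage (hypothetical value, not from the source).
    units_changed: net number of CA units gained (+) or lost (-).
    """
    return (frame_offset_bp + units_changed * REPEAT_UNIT_BP) % 3 == 0


def restoring_changes(frame_offset_bp: int, max_units: int = 3) -> list:
    """List the non-zero unit changes (within +/- max_units) that restore the frame."""
    return [
        u
        for u in range(-max_units, max_units + 1)
        if u != 0 and reporter_in_frame(frame_offset_bp, u)
    ]


if __name__ == "__main__":
    for offset in (1, 2):
        print(offset, restoring_changes(offset))
```

Under these assumptions, only a subset of slippage events restores the reading frame, which fits the description of labelling as a rare, stochastic event that is independent of cell type.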
The pregnancy/lactation/involution cycle
During pregnancy, tertiary branching and the formation of lobuloalveolar structures at the duct tips occur in response to rising levels of progesterone and prolactin. This process, called alveologenesis, requires massive proliferation that is mediated by a number of signals including Stat5 (Cui et al., 2004), Elf5 (Oakes et al., 2008), Stat6 (Khaled et al., 2007) and RANKL (Tnfsf11; Fata et al., 2000). The two types of luminal cells that constitute growing lobuloalveoli during pregnancy are either hormone-sensing ER+/PR+ cells (which usually co-express Gata3), or hormone-responsive (ER–/PR–) pStat5/Elf5+ cells (Oliver et al., 2012) (Fig. 2). Conditional ablation of ERα in the mammary epithelium demonstrated its essential role for ductal and alveolar morphogenesis (Feng et al., 2007), whereas PR is required during pregnancy for mammary ductal side-branching and alveologenesis (Brisken et al., 1998; Lydon et al., 1995). Recently, it has been shown that rare progenitor cells expressing Blimp1 give rise to proliferative pStat5/Elf5+ cells during pregnancy and are a long-lived population that survives involution (Ahmed et al., 2016; Elias et al., 2017). These data further support the presence of two different lineages of alveolar cells during pregnancy.
Early studies have shown that the prolactin receptor activates Stat5a signalling and is essential for alveologenesis (Cui et al., 2004; Liu et al., 1997; Oakes et al., 2008), whereas Stat5b mediates growth hormone signalling. However, more recent work has demonstrated that both Stat5a and Stat5b are important for regulating the expression of mammary-specific genes during pregnancy and lactation (Yamaji et al., 2013). Alveolar differentiation progresses throughout pregnancy, with a number of genes activated early and others expressed just before birth. Stat5 regulates the expression of a large set of ∼750 of these genes that display chronological activation (Yamaji et al., 2013). Milk protein genes, such as whey acidic protein (Wap) and β-casein (Csn2), are induced more than 1000-fold during pregnancy. This reflects the presence of mammary specific super-enhancers (transcription factor sites highly enriched for active chromatin marks). Indeed, analysis of the Wap super-enhancer suggests a temporal and functional enhancer hierarchy (Shin et al., 2016). Gene promoters bound by Stat5 have H3K4me3 (trimethylated histone H3 at K4) marks at the onset of lactation.
There is an extensive literature on the other factors that control mammary gland development during pregnancy (Oakes et al., 2006). Recent seminal studies have revealed a more complex picture of cell subsets using lineage tracing of alveolar progenitors and scRNA-seq. Furthermore, it has been shown that there is an ‘epigenetic memory’ of pregnancy, which may shed light on the protective effect of an early pregnancy (Dos Santos et al., 2015; Meier-Abt and Bentires-Alj, 2014) (Box 2).
Interestingly, ductal branching is initiated faster in parous than in non-parous mice (Dos Santos et al., 2015). Analysis of the epigenome shows that this is associated with considerable and long-lasting reductions in DNA methylation, primarily at sites occupied by Stat5a during pregnancy. Thus, epigenetic memory of pregnancy may facilitate subsequent differentiation cycles, which might partly explain why women produce milk faster in second and subsequent lactations. These epigenetic changes may also partly explain the protective effect of pregnancies lasting 34 weeks or longer against breast cancer development (Husby et al., 2018), although the biological mechanism behind this remains a key question. One candidate is the IGF1 receptor pathway, because the Igf1r gene is hypermethylated in the parous mouse mammary gland (Katz et al., 2015). RNA-seq data demonstrate that cells in the post-parous luminal compartment have different transcriptional profiles from those in comparable nulliparous compartments (Bach et al., 2017). Post-involution cells upregulate genes involved in immune response (Mfge8 and Tgfb3) and lactation (Csn2, Lalba, Lipa and Cidea). The parity effect appears to be confined to the luminal progenitor population (Bach et al., 2017), which may be already primed towards the alveolar fate owing to continued transcription from these loci. In mammary epithelial cells, X chromosome inactivation does not appear to be random, and expression of the X-linked gene Rnf12 (Rlim) from the paternal allele is required for alveolar cell differentiation and milk production. Cells without Rnf12 undergo cell death upon differentiation, with the effect being more pronounced in basal than luminal cells (Jiao et al., 2012). This suggests either that silencing of the maternal allele of Rnf12 is non-random or that progenitors/MaSCs with an active paternal X chromosome have a growth advantage over those with an active maternal X chromosome.
Lineage tracing of alveolar progenitors
As we have discussed, a number of studies have employed lineage tracing to characterise the progeny of MaSCs and progenitor cells, and their contribution to mammary gland development at embryonic, perinatal and pubertal stages (reviewed by Fu et al., 2020; Lloyd-Lewis et al., 2017; Sale and Pavelic, 2015). Moreover, labelling at saturation density using the doxycycline-inducible systems has allowed the fate of all luminal or basal cells to be followed throughout pregnancy and lactation (Wuidart et al., 2016), concluding that luminal and basal lineage progenitors are unipotent even following pregnancy, lactation and remodelling of the gland.
The ability of a single stem/progenitor cell to contribute to lobuloalveolar structures has also been analysed in lactating mice using the R26[CA]30EYFP ‘slippage’ mouse model (Davis et al., 2016; Box 1). A surprising variety of clonal patterns was observed, with some alveoli being almost completely labelled, whereas others had intermediate levels and some had no labelled progeny at all (Davis et al., 2016). The labelled clones were either basal or luminal, but never both, and clonal regions spanned up to 100 alveoli. These observations could be interpreted to suggest that an alveolar progenitor cell niche is located at the tip of a secondary branch. Whether this niche contains both LEPs and BEPs is difficult to determine. It is clear that there is a huge expansion of luminal alveolar cells during pregnancy, but whether this applies to basal cells is less clear. Given the sparse covering of alveolar clusters with basal cells, compared with the tightly packed parallel lines of myoepithelial cells in ducts (Hitchcock et al., 2020), it is possible that proliferation of the basal cells is not required and they simply stretch out over expanding luminal alveolar cells. In this context, an interesting observation for one particular clone of cells labelled in the R26[CA]30EYFP mouse model was the presence of a single EYFP-labelled cell in each alveolus within a lobuloalveolar cluster, indicating a sub-lineage that comprises a minor component of each alveolus. The nature and role of this cell type are open to speculation, but it is worth noting that only approximately one cell per alveolus shows immunostaining for K8 compared with a high proportion of the luminal ductal cells. It is not apparent why these data do not support a previous study indicating that alveoli can be derived from the progeny of a single cell (van Amerongen et al., 2012), but the discrepancy may be a consequence of analysing progeny of clones that coincide owing to labelling at non-clonal density.
These results highlight the power, and the importance, of in vivo lineage tracing at clonal density.
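The concern about non-clonal labelling density raised above can be made concrete with a rough calculation. The following Python sketch is illustrative only: it assumes that each progenitor in a structure is labelled independently, so that the number of labelling events per structure is approximately Poisson distributed. The function name, the number of progenitors per structure and the labelling probabilities are hypothetical values chosen for illustration, not figures taken from the studies cited.

```python
import math


def prob_multiple_labelled(n_progenitors: int, p_label: float) -> float:
    """Probability that a labelled structure contains two or more
    independently labelled progenitors, given that it contains at least one.

    Assumption (not from the source): labelling events are independent and
    rare, so their number per structure is Poisson with mean n * p.
    """
    lam = n_progenitors * p_label      # expected labelling events per structure
    if lam <= 0:
        return 0.0
    p_zero = math.exp(-lam)            # no progenitor labelled
    p_one = lam * p_zero               # exactly one progenitor labelled
    p_at_least_one = -math.expm1(-lam) # 1 - exp(-lam), computed stably
    return (1.0 - p_zero - p_one) / p_at_least_one


if __name__ == "__main__":
    # Hypothetical scenario: 10 progenitors per structure, varying label rates.
    for p in (0.001, 0.01, 0.1, 0.5):
        frac = prob_multiple_labelled(10, p)
        print(f"labelling probability {p}: fraction non-clonal = {frac:.4f}")
```

Under these assumptions, fewer than 1% of labelled structures contain more than one independent labelling event when the expected number of events per structure is 0.01, whereas more than 90% do when it is 5. At high labelling density, apparently single clones are therefore likely to be merged clones.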
Transcriptome analyses to identify progenitor populations
Several studies have analysed the transcriptome of individual mammary epithelial cells at different stages of development, and have reached different conclusions. Bach and colleagues used scRNA-seq to examine four developmental stages (nulliparous, mid gestation, lactation and post involution) and identified 15 clusters based on gene expression. However, none of the clusters could be defined by the expression of a single gene, which is an important consideration when interpreting and performing lineage-tracing studies (Bach et al., 2017). In contrast to the recent lineage-tracing studies discussed above, pseudotime trajectory analysis suggests that there is a common luminal progenitor that can give rise to intermediate restricted alveolar and hormone-sensing progenitors that subsequently undergo changes in their transcriptome in response to the pregnancy cycle (Bach et al., 2017). A similar finding has been reported by Giraddi and colleagues, also using scRNA-seq (Giraddi et al., 2018), and is further supported by their follow-up study of chromatin using scATAC-seq (Chung et al., 2019). It is possible that these bipotent luminal cells are mostly active during early embryonic and prepubertal stages. However, the analysis and interpretation of single cell genomics data are still in their infancy and this may explain why another study (Pal et al., 2017) reached different conclusions. Pal and colleagues suggest that gene expression in the pre-pubertal epithelium shifts from a basal-like, fairly homogeneous, programme to clear-cut lineage-restricted programmes in puberty (Pal et al., 2017). Furthermore, their data uncovered an apparent mixed-lineage cluster within the basal clusters that is not observed in the other two studies (Bach et al., 2017; Giraddi et al., 2018). Whether the cells in this cluster are quiescent fMaSCs is an open question.
Importantly, scRNA-seq studies have revealed many cellular intermediates, particularly in the luminal compartment, that could be transit amplifying lineage-committed cells. These results also point towards the potential plasticity of this compartment, something worth noting when considering the growing evidence that most breast cancers originate from luminal cells.
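The observation that none of the scRNA-seq clusters could be defined by a single gene has a practical consequence for promoter-driven lineage tracing, and a toy example may help to show why. The Python sketch below uses entirely hypothetical data: the cell names, the cluster labels (A, B, C) and the gene names (g1, g2, g3) are invented for illustration and do not correspond to any real dataset. It searches for gene sets that are co-expressed in every cell of a cluster and in no cell outside it.

```python
from itertools import combinations

# Hypothetical toy data: cell -> (cluster label, set of genes detected).
CELLS = {
    "cell1": ("A", {"g1", "g2"}),
    "cell2": ("A", {"g1", "g2", "g3"}),
    "cell3": ("B", {"g1", "g3"}),
    "cell4": ("B", {"g1"}),
    "cell5": ("C", {"g2", "g3"}),
    "cell6": ("C", {"g2"}),
}


def defining_gene_sets(cells, cluster, size):
    """Return gene combinations of the given size that are co-expressed in
    all cells of `cluster` and in no cell of any other cluster."""
    inside = [genes for label, genes in cells.values() if label == cluster]
    outside = [genes for label, genes in cells.values() if label != cluster]
    all_genes = sorted(set().union(*(genes for _, genes in cells.values())))
    hits = []
    for combo in combinations(all_genes, size):
        wanted = set(combo)
        in_all_inside = all(wanted <= genes for genes in inside)
        in_no_outside = not any(wanted <= genes for genes in outside)
        if in_all_inside and in_no_outside:
            hits.append(combo)
    return hits


if __name__ == "__main__":
    print("single-gene markers for A:", defining_gene_sets(CELLS, "A", 1))
    print("two-gene markers for A:", defining_gene_sets(CELLS, "A", 2))
```

In this toy dataset no single gene marks cluster A, because g1 is shared with cluster B and g2 with cluster C, whereas the combination of g1 and g2 does. A reporter driven by either promoter alone would therefore label cells from more than one cluster, which is the kind of ambiguity that needs to be considered when single-promoter lineage tracing is interpreted.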
During late gestation, milk secretion is inhibited by high levels of progesterone so, although milk and fat droplets are produced, these are usually not secreted and can be observed within the luminal epithelial cells. However, upon the precipitous drop in progesterone that occurs at birth, a final round of proliferation takes place and the fat pad becomes filled with ductal and alveolar epithelium (Fig. 3). This is associated with a loss of differentiated adipocytes. There is some evidence of reciprocal transdifferentiation of adipocytes and epithelium (Morroni et al., 2004; Prokesch et al., 2014), but this is controversial; more recent evidence does little to resolve this issue, with one study suggesting that this does not occur (Zwick et al., 2018) and another showing that only adipocyte progenitors can become epithelial (Joshi et al., 2019). These conflicting views may be the result of different experimental approaches. In this context, it is interesting that stromal cells express Procr, so could it be that these stromal cells contribute to the luminal lineage rather than bipotent Procr-expressing MaSCs? In the light of a study showing that breast cancer cells that are undergoing epithelial-to-mesenchymal transition (EMT) can be induced by drug treatment to differentiate into adipocytes, thereby reducing metastasis (Ishay-Ronen et al., 2019), this transdifferentiation potential becomes an important issue to resolve. In the next section, we discuss the key features of lactation, such as the requirement for polyploid cells, the mechanism of milk secretion and important interactions with stromal components.
Lactation is a time of high demand for energy and protein production. Indeed, the synthesis of milk proteins dramatically increases the total amount of protein produced by secretory alveolar cells. One mechanism for achieving this elevated protein production would be to enhance the translational machinery of the cell. Smith and Vonderhaar suggested almost four decades ago that the full differentiation of secretory alveolar cells ‘requires DNA synthesis inconsequent of mitosis’ and that this could involve polyploidy (Smith and Vonderhaar, 1981). Recent studies highlighting the presence of binucleate cells in the lactating mammary glands of mice, cows, marsupials and humans have confirmed this theory (Ho et al., 2016; Rios et al., 2016). Genetic confirmation of the requirement for binucleate cells for functional differentiation of the mouse mammary gland is provided by Aurora kinase A (Aurka) conditional knockout mice (Rios et al., 2016). Aurora kinase A is a mitotic spindle-associated kinase, required for spindle formation and the progression of mitosis. Interestingly, although basal cells deficient in Aurora A are unaffected, the number of binucleate luminal cells is drastically reduced and this has a striking effect on the growth of pups suckled by Aurora A-deficient dams, which fail to thrive after 4 days of suckling. The number of binucleate cells can also be perturbed by ablation of the KRAB-domain zinc-finger protein Zfp157 (also known as Roma), a transcriptional target of Stat6 (Oliver et al., 2012). In the absence of Zfp157, luminal cells continue to undergo DNA replication but fail to divide, resulting in an elevation of the number of binucleate cells that is accompanied by evidence of considerable DNA damage (Ho et al., 2016). It is intriguing that cells harbouring such damaged DNA do not die but continue to produce milk. 
In this context, note that knockout of Zfp157 compensates for the loss of Gata3 and rescues lactation failure that arises when Gata3 is deleted (Oliver et al., 2012). Thus, Gata3 is not essential for development during pregnancy so long as Zfp157 is absent, although cells appeared dysplastic during lactation.
Lactation requires the contraction of the myoepithelial network in response to oxytocin. The development of new live-imaging techniques has shed light on this dynamic process. Using live 3D imaging, it has been shown that Orai1, which is a store-operated calcium (Ca2+) channel subunit, is required for the frequency and coordination of the oxytocin-mediated pulsatile contractions that expel milk into ducts (Davis et al., 2015). More recently, the ultrasensitive genetically encoded Ca2+ sensor GCaMP6f has been used in transgenic mice to show that Ca2+ increases in single basal cells in response to oxytocin, immediately before their contraction, suggesting that contraction of basal cells deforms luminal cells to eject milk. A different approach has focussed on observing lipid droplet fusion and secretion, using intravital imaging of the lactating mammary gland in live transgenic mice expressing green fluorescent protein (GFP). Lipid droplets were stained with the hydrophobic fluorescent dye BODIPY (boron-dipyrromethene) and their formation and secretion were monitored by time-lapse subcellular microscopy over 1-2 h. Notably, oxytocin-induced contraction of the myoepithelial cells is required to release the droplets into the alveolar lumen (Masedunskas et al., 2017; Mather et al., 2019).
Epithelial–basement membrane and cell adhesion interactions
Interaction of mammary epithelial cells with the basement membrane is essential for proper differentiation of alveolar cells and the maintenance of lactation. The laminin-binding integrin chains α3 and α6 are required for lobuloalveolar development at the appropriate time and are essential to maintain lactation, because conditional deletion of these integrins causes precocious involution (Romagnoli et al., 2020). Although these studies have relied on transplantation and mammosphere assays, it is possible that integrins are involved in maintaining the MaSC or progenitor cell niche (Romagnoli et al., 2019). An interesting adjunct to this study is the demonstration that α6 integrin is inherited asymmetrically, by orientating at one spindle pole when cells divide (Morris et al., 2020). Also, the tight junction protein occludin has been shown to have an unanticipated role as a modulator of endoplasmic reticulum stress, which is a feature of late lactation, by facilitating protein secretion in a SNARE-dependent manner (Zhou et al., 2020).
Myoepithelial-immune cell interactions
Recent studies using tissue clearing and deep imaging of lactating mammary glands have revealed an unexpected interaction between leukocytes and myoepithelial cells, which adopt the same stellate shape and are intimately associated with each other (Dawson et al., 2020; Hitchcock et al., 2020). This phenomenon has also been observed using fluorescent tracking of macrophages (Stewart et al., 2019), suggesting that leukocytes may carry out surveillance of the luminal space in order to detect pathogens. This notion is supported by the serendipitous discovery of a gene that regulates lactation: an ENU-mutagenesis screen identified a mutation in the viral sensor Oas2 as a cause of lactation failure in the immediate post-partum period (Oakes et al., 2017). It is not unexpected that a viral recognition pathway would have a role in the regulation of lactation, because it is well established that viruses can be transmitted in milk. The robust activation of an interferon response in Oas2 mutant mammary glands suggests that viral detection could trigger cessation of milk production. Thus, the mammary gland may balance maximum milk production with an assessment of milk quality in terms of pathogenic organisms. It is now clear that organisms in the milk make an essential contribution to the microbiome of the infant gut.
After lactation ceases, the alveolar compartment regresses and the mammary gland is remodelled to an arbourised ductal tree that resembles a virgin gland, albeit with more side-branches. The process of involution is characterised by extensive programmed cell death and has been utilised as one of the most dramatic models of physiological cell death. Under natural conditions, involution is a gradual process but, in an experimental setting, involution can be synchronised either by removal of the suckling pups at the peak of lactation (day 10 in the mouse) or by the sealing of teats with veterinary glue (Watson, 2006b). The latter approach allows for the investigation of locally acting factors, with the contralateral glands serving as controls (Li et al., 1997). Involution is triggered by the accumulation of milk and occurs in two distinct phases: a first, reversible, phase that is marked by rapid and extensive cell death with limited alveolar collapse, and a second, irreversible, phase that is accompanied by tissue remodelling and the reappearance of adipocytes (Lund et al., 1996; Watson, 2006b). The first phase lasts for up to 48 h in the mouse and lactation can recommence if the pups are returned to the dam. In other species, the length of the first phase is more difficult to determine, but it is suggested to be 11 days in the cow and up to 3 weeks in the fur seal (Sharp et al., 2007). A caveat is that forced involution does not occur naturally unless the pups die, and there is some evidence that the gland differs after forced involution, with higher levels of inflammation and ductal hyperplasia (Basree et al., 2019). Nevertheless, forced involution does occur in women who choose not to breastfeed or who force wean their child either for health reasons or upon returning to work.
Programmed cell death during involution
It has long been thought that cells die during involution by the process of apoptosis (Watson, 2006a), a mechanism of programmed cell death that utilises caspase proteases to dismantle the dying cell (Inoue et al., 2009). However, more recent studies have shown that this is not the case and that the first phase of involution is independent of executioner caspase activity. Instead, the luminal alveolar cells die via a lysosomal-mediated pathway of cell death (Kreuzaler et al., 2011; Sargeant et al., 2014). Historical electron microscopy studies had suggested that there are changes in the lysosomal compartment during involution and that the lysosomal hydrolase cathepsin D is released from lysosomes (Helminen and Ericsson, 1970, 1971; Helminen et al., 1968) and, more recently, that cathepsin D is processed and active during involution (Margaryan et al., 2010; Zaragozá et al., 2009).
Confirmation of a role for the lysosomal compartment in mediating cell death during involution followed on from the discovery that the transcription factor Stat3 is required for the initiation of involution (Chapman et al., 2000). Microarray analysis of Stat3 target genes revealed a dramatic 60-fold upregulation of Serpina3G (also known as Spi2a) mRNA in mammary glands of Stat3 conditional knockout mice at 24 h of involution. Spi2a is an irreversible inhibitor of cathepsin proteases, and this finding prompted the discovery that the expression of cathepsins B and L is dramatically upregulated in early involution, in a Stat3-dependent manner, concomitantly with the Stat3-mediated inhibition of Spi2a expression (Kreuzaler et al., 2011). Subsequent studies have shown that the uptake of luminal milk fat globules (MFGs) is enhanced by Stat3 (Sargeant et al., 2014), which also modulates trafficking of proteins, such as annexins and flotillins, from the plasma membrane to the lysosome (Lloyd-Lewis et al., 2018). Thus, involution is a dynamically regulated process that is orchestrated by Stat3, which enhances the lysosomal compartment and switches the fate of luminal cells from their secretory function to a phagocytic function, resulting in the delivery of MFGs to lysosomes. Degradation of the lipid in the MFGs to constituent fatty acids results in elevated levels of oleic acid, which is proposed to perturb the lysosomal membrane, allowing the release of cathepsins to kill the cell (Fig. 4). The fate of the alveolar myoepithelial cells is less clear; recent deep-imaging studies suggest that they may not die, but contract and shrink back to the ductal tree (Hitchcock et al., 2020). If this supposition is correct, then the network of myoepithelial cells that surrounds each alveolus may arise from ductal myoepithelial cells that adopt a different shape when in contact with alveolar epithelium.
The zinc transporter ZnT2 (Slc30a2) has also been suggested to play a role in lysosome biogenesis and lysosome-mediated cell death during involution (Rivera et al., 2018), a role confirmed using ZnT2 knockout mice, which exhibit impaired alveolar regression together with reduced levels of pStat3. Furthermore, vacuolar ATPase assembly on lysosomes is inhibited in these mice, resulting in fewer and smaller lysosomes. Thus, ZnT2 is an important regulator, along with Stat3, of elevated lysosome biogenesis at the onset of involution (Rivera et al., 2018).
As milk production halts, calcium transport into milk ceases. The plasma membrane calcium ATPase 2 (PMCA2; Atp2b2), which is located at the apical plasma membrane, transports calcium into milk and interacts with the PDZ domain-containing scaffolding molecule sodium-hydrogen exchanger regulatory factor 1 (NHERF1; Slc9a3r1), which is upregulated during lactation (Jeong et al., 2019). Importantly, genetic ablation of NHERF1 revealed that it is required to suppress Stat3 activation during lactation and thereby to prevent premature involution. At the onset of involution, expression of both PMCA2 and NHERF1 is downregulated, leading to lysosome-mediated cell death via Stat3 activation. The increase in cytosolic Ca2+ levels in luminal epithelial cells during involution is likely to activate calpains, which are Ca2+-sensitive non-lysosomal cysteine proteases (Arnandis et al., 2012).
During involution, the white adipocyte compartment extensively regenerates, filling the space occupied by the regressing luminal epithelium. How adipocyte regeneration is regulated is largely unknown, but recent studies using genetic tracing of mature adipocytes have revealed that it occurs primarily through hypertrophy (Zwick et al., 2018), with no evidence for transdifferentiation of luminal epithelial cells to adipocytes during involution. Interestingly, these adipocytes fill with lipids derived from milk and, in the absence of lipid trafficking to adipocytes, the mammary epithelium fails to remodel properly.
Remodelling and immune cells
The ability of the mammary gland to generate full lactational competence with every pregnancy indicates that the remodelling process during involution occurs without inflammation or tissue damage. This may be due in part to the influx of immune cells during involution that clear milk and debris (efferocytosis) (Martinson et al., 2015; O'Brien et al., 2010, 2012). Mammary epithelial cells themselves act as non-professional phagocytes and take up milk fat droplets and dead cells (Sargeant et al., 2014). The receptor tyrosine kinase MerTK is elevated early in involution and loss of MerTK results in accumulation of dead/dying cells in post-lactational mammary glands, but this is not observed when the other family member receptor tyrosine kinases Axl and Tyro3 are deficient (Sandahl et al., 2010). Thus, efferocytosis mediated by MerTK is essential for controlled involution and may help to prevent accumulated dead cells from rupturing and releasing intracellular components that could damage the surrounding ducts. It has been suggested that milk engorgement can cause tissue damage, and that the molecules released are recognised by Toll-like receptor 4 (TLR4), which triggers an inflammatory response and milk stasis (Glynn et al., 2014; Ingman et al., 2014).
As discussed, studies using tissue clearing and deep imaging of lactating mammary glands have revealed an unexpected interaction between leukocytes and myoepithelial cells (Hitchcock et al., 2020). In early involution, the shape of the leukocytes and their association with myoepithelial cells are maintained but, as involution progresses, the number of leukocytes increases substantially and they begin to adopt a different shape at 72 h of involution (Hitchcock et al., 2020). Immunofluorescence studies show that the majority of these leukocytes express activated MHC class II antigens and are, therefore, likely to be macrophages or dendritic cells. These images confirm previous studies showing an influx of macrophages during involution, which are required for the involution process (O'Brien et al., 2010, 2012). The study of involution has provided insights into possible mechanisms of breast tumourigenesis. Furthermore, involution can be viewed as tumourigenesis in reverse, as epithelial cells are induced to die concomitantly with remodelling and reduction of associated vasculature and lymphatics (Elder et al., 2018).
Technical challenges and outstanding questions
The past decade has seen unprecedented progress in understanding mouse mammary gland development. Much of this is a consequence of technological developments, such as high-throughput sequencing and advanced microscopy. It might also be said that much of the work in the past decade has used approaches, such as FACS and fat pad transplantation, which resulted in conclusions about MaSC markers and MaSC characteristics that do not necessarily reflect their capacity in situ and in vivo. The continued emergence of new technologies to analyse the mammary gland not only provides impetus to the field, but also results in a continuous state of revelation and subsequent refinement of our ideas and understanding. As a community, however, we need to be critical of the shortfalls and caveats associated with these advances. For example, single cell genomics is quickly becoming a staple for the analysis of tissue heterogeneity and cell fate decisions. However, unlike lineage tracing, single cell genomics reports on the current epigenetic or transcriptomic state of a cell and not on its history or potential. Also, lineage trajectories based solely on single cell genomics are computationally derived and rely on methods from the nascent, and constantly evolving, field of computational biology. Promoter-driven lineage tracing has undoubtedly contributed to major paradigm shifts in our understanding of mammary gland biology over the last 10 years. However, there are caveats to this technology, such as the use of reporter knock-ins that disrupt one allele of the gene of interest (e.g. Gata3 is haploinsufficient) (Lawrence et al., 2016), untargeted transgenes that use minimal promoter sequences from different species as surrogates for in vivo gene expression patterns, and the use of BAC transgene reporters. These different transgenic approaches could lead to different conclusions being drawn.
Interpretation is further confounded by evidence that transcription does not necessarily equate with protein production and/or stability. Finally, due account needs to be taken of the influence of the different genetic backgrounds used in generating these models. As a community, we need to assess these many variables that impact upon the interpretation and conclusions of new findings. A concerted effort to provide multi-lab benchmarking studies to determine the impact of technical differences, tools and mouse strains on the interpretation of new results would be a valuable resource.
One outstanding question is the nature and location of the mammary stem and progenitor cells and their niches, and how they reciprocally interact with, and shape, non-epithelial cells within their microenvironment. Cells are constantly exposed to varying levels of hormones and cycles of growth and regression. Maintaining stem/progenitor cells within a niche is essential to ensure their long-term survival and potency. It is also important to understand how, during early stages of tumour initiation, tumour-initiating cells dictate tumour-promoting changes in the microenvironment and how that is related to the normal biological functions of the cell of origin. The emergence of spatial transcriptomics will play a major role in addressing these questions, and one could foresee this being coupled with unbiased lineage tracing for high-resolution mapping of stem/progenitor niches.
The nature of mammary stem cells and their niche was first suggested by the late Gil Smith over 30 years ago in his seminal electron microscopy study (Smith and Medina, 1988). Gil's hypotheses were so often subsequently shown to be correct, including the notion that a single stem cell could generate an entire mammary gland (Kordon and Smith, 1998). We have made enormous progress in the 50 years since mammary gland stem cells were first shown to exist through fat pad transplantation assays (Daniel et al., 1968; Hoshino and Gardner, 1967), but the niche still eludes us.
Work in the laboratory of C.J.W. is supported by the Medical Research Council (MR/J001023/1 and MR/N022963/1) and work in W.T.K.'s laboratory is supported by Cancer Research UK (Career Establishment Award C47525/A17348).
The authors declare no competing or financial interests.