The T-box family of transcription factors exhibits widespread involvement throughout development in all metazoans. T-box proteins are characterized by a DNA-binding motif known as the T-domain that binds DNA in a sequence-specific manner. In humans, mutations in many of the genes within the T-box family result in developmental syndromes, and there is increasing evidence to support a role for these factors in certain cancers. In addition, although early studies focused on the role of T-box factors in early embryogenesis, recent studies in mice have uncovered additional roles in unsuspected places, for example in adult stem cell populations. Here, I provide an overview of the key features of T-box transcription factors and highlight their roles and mechanisms of action during various stages of development and in stem/progenitor cell populations.
Transcription factors of the T-box family play varied yet crucial roles throughout development. The founding member of the family, brachyury (from the Greek for short tail), was discovered following studies of a short-tailed mouse that harbored a mutation affecting tail length and embryonic development (Dobrovolskaïa-Zavadskaïa, 1927). The brachyury gene (which is also known as T) soon became a classic developmental gene studied in great detail for its pivotal role in the development of the notochord and posterior mesoderm; mutations in T cause embryonic lethality in homozygotes and short tails in heterozygotes (Papaioannou, 2001). The subsequent cloning and sequencing of T led to its classification as a DNA-binding protein and transcription factor (Herrmann et al., 1990; Kispert and Herrmann, 1993; Kispert et al., 1995) and, soon after, a family of related genes encoding proteins with transcriptional regulatory activity and linked by a similar DNA-binding motif was rapidly uncovered (Bollag et al., 1994).
Given the developmental roles of T, the T-box family genes were initially studied from a developmental biology perspective. Embryonic phenotypes were predicted on the basis of embryonic expression patterns; functions were investigated on the basis of the severe, usually embryonic lethal, phenotypes caused by ablation of the genes. In addition, most T-box gene mutations, like those in T, show dose sensitivity, thereby giving rise to dose-dependent effects in heterozygotes. These early studies led to the elucidation of the genetic basis of a number of human syndromes and defined some of the crucial embryonic roles for T-box factors (Table 1). These studies suggest that T-box genes, which are expressed in a highly specific manner but can be widespread spatially and temporally, are involved in all of the major developmental signaling pathways, and accumulating evidence indicates that they act at multiple stages of development through multiple downstream target genes that can differ in different tissue contexts.
Although by no means a comprehensive review, this Primer covers the major features of the T-box family with an emphasis on recent discoveries of their roles during the development of vertebrates, primarily mouse and human, while recognizing some untapped areas for further study. Of the many aspects of development that involve T-box genes, several are highlighted to emphasize how multiple T-box factors can feed into a single system and also to illustrate how a single T-box factor can feed into multiple systems at different times and locations. A comprehensive review of the older literature and work on other species can be found in earlier reviews (e.g. Morley et al., 2009; Naiche et al., 2005; Papaioannou, 2001; Papaioannou and Silver, 1998; Showell et al., 2004; Smith, 1999; Takashima and Suzuki, 2013; Wardle and Papaioannou, 2008).
Defining and emerging features of the T-box family
Common to all T-box proteins is the DNA-binding motif, or T-box domain, which spans 180-200 amino acid residues and binds DNA in a sequence-specific manner. Phylogenetic relationships between T-box genes are reflected by homology within the region encoding this domain. On this basis, five subfamilies can be identified (Fig. 1): the T, Tbx1, Tbx2, Tbx6 and Tbr1 subfamilies. T-box proteins function as transcriptional repressors or activators and some, such as T and Tbx4, appear to have both activation and repression domains that may function in different cellular or promoter contexts (Kispert et al., 1995; Ouimette et al., 2010) (Fig. 2A). Unique among the T-box genes, Mga contains a single-exon T-box domain and a basic helix-loop-helix (bHLH) zipper domain (Hurlin et al., 1999), possibly representing a reverse transcription and reintegration event of the T-box domain into a bHLH zipper domain gene.
The T-box domain target sequence was first identified for the T protein as a consensus, near-palindromic sequence made up of two half sites to which the protein binds as a monomer (Kispert and Herrmann, 1993). It was subsequently found that other T-box proteins can bind the consensus half site (AGGTGTGAAA), which is called the T-box binding element (TBE), although the optimal target sequences vary, as do the preferences of different proteins for the number and spacing of TBEs. Crystallographic analysis of Xenopus Xbra bound to a palindromic sequence derived from the consensus sequence revealed that it binds as a dimer with a small protein-protein interface area and with the T-box domains contacting DNA in the major and minor grooves at the half sites (Müller and Herrmann, 1997) (Fig. 2B). However, the significance of dimerization was called into question by subsequent studies of TBX3 and TBX1 bound to the palindromic consensus sequence, as the dimerization interface in this context is arguably too small to constitute a biologically relevant protein interface, and the proteins may be kept in register only by the DNA (Coll et al., 2002; El Omari et al., 2012). Furthermore, only half sites have been recognized in the promoters of target genes and, notably, it was shown that TBX5 binds as a monomer to a single half site in the promoter of its target gene ANF (NPPA) (Stirnimann et al., 2010) (Fig. 2B). For the growing number of target genes, further studies are needed to elucidate the DNA binding mechanism in the context of endogenous target gene promoters.
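The distinction between a single half site and a palindromic pair of half sites can be made concrete with a short scan of a DNA string. The following is a minimal sketch, not a method from any of the cited studies: it assumes only the consensus half site quoted above (AGGTGTGAAA). The function names, the `max_mismatches` parameter and the example sequence are hypothetical and chosen purely for illustration. A half site found on the minus strand appears in the input as the reverse complement, TTTCACACCT.

```python
# Consensus TBE half site as quoted in the text.
TBE_HALF_SITE = "AGGTGTGAAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_half_sites(seq, max_mismatches=0):
    """Scan both strands of seq for the TBE half site.

    Returns (start, strand, matched_window, mismatches) tuples, where start
    is the 0-based position on the input strand. max_mismatches is an
    illustrative tolerance, not a biologically calibrated threshold.
    """
    seq = seq.upper()
    k = len(TBE_HALF_SITE)
    motifs = {"+": TBE_HALF_SITE, "-": reverse_complement(TBE_HALF_SITE)}
    hits = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        for strand, motif in motifs.items():
            mismatches = hamming(window, motif)
            if mismatches <= max_mismatches:
                hits.append((start, strand, window, mismatches))
    return hits


# Hypothetical test sequence: one half site followed directly by its
# reverse complement, giving a perfect palindromic pair.
example = "CCGG" + TBE_HALF_SITE + reverse_complement(TBE_HALF_SITE) + "GGCC"
for hit in find_half_sites(example):
    print(hit)
```

In a sketch of this kind, a lone hit would correspond to the single half-site arrangement described for the ANF (NPPA) promoter, whereas paired hits on opposite strands would correspond to the palindromic arrangement used in the crystallographic studies. Real target-site prediction would require position weight matrices and experimental binding data rather than exact string matching.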
Target gene recognition and selectivity, as well as target gene regulation, can also be influenced by interactions with other proteins or co-factors. Several T-box proteins bind homeobox factors. For example, Tbx5 and Nkx2-5 interact with each other through their respective T-box domain and homeodomain and together they bind the Anf promoter in tandem and act synergistically to activate Anf during cardiomyocyte differentiation (Hiroi et al., 2001). Domains other than the T-box are also important for mediating interactions with co-factors. For example, in melanoma cells Tbx2 interacts with the tumor suppressor protein Rb1 through a domain that is immediately C-terminal to its T-box. This association increases the ability of Tbx2 to interact with its DNA recognition sequence in the promoter of p21Cip1 (Cdkn1a), thereby enhancing the transcriptional repression function of Tbx2 (Vance et al., 2010).
Evidence is also mounting that T-box proteins, in addition to acting as direct transcriptional regulators, control gene expression through epigenetic modifications. Tbx3, for example, was shown to regulate histone H3 lysine 27 (H3K27) methylation at its target gene Gata6 during embryonic stem cell differentiation (Lu et al., 2011). In another study, two conserved domains within the T-box domain of Tbet (Tbx21) were shown to functionally recruit methyltransferase activity to target promoters to establish a permissive chromatin state, pointing to epigenetic regulation as an important component of T-box protein-mediated regulation of target genes (Miller et al., 2008).
Recently, a proteomic screen to detect Tbx3-interacting proteins identified a number of RNA-binding proteins and splicing factors. Tbx3 was found to associate physically with mRNA and to bind directly to RNAs containing TBEs. Furthermore, the ability of RNA-binding proteins to influence splicing is Tbx3 dependent in some cases, implicating Tbx3 in the regulation of alternative splicing and adding further complexity to the regulation of gene expression by Tbx3 (Kumar et al., 2014). Clearly, there is much more to be learned about T-box proteins and their co-factors to fully understand the target selectivity and transcriptional and post-transcriptional regulation that leads to their regional and temporal specificity in the regulation of target genes.
Ancient origins and evolution of the T-box gene family
It has long been known that the T-box gene family is ancient in origin and present in all metazoans (Bollag et al., 1994), but the increasing availability of sequenced genomes from diverse animal taxa has pushed back the origins of the family to unicellular organisms and fungi, in which one or two T-box genes, including T, have been identified (Degnan et al., 2009; Sebe-Pedros et al., 2013) (Fig. 3). Analysis of the genomes of bilaterians and representatives of the four basal metazoan phyla indicates that T is the most ancient member of the family and that the family expanded throughout metazoan evolution. Genes were added progressively by gene or genome duplication (Box 1) and were sometimes lost or gained in specific lineages. In present day vertebrates, the radiation of the T-box family resulted in genes that can be grouped into five subfamilies (Fig. 1), four of which were already present in the common ancestor of vertebrates and sponges (Sebe-Pedros et al., 2013) (Fig. 3).
The bilaterian cephalochordate amphioxus is a close invertebrate relative of vertebrates and has therefore been used as a model organism to deduce information about the last common ancestor of vertebrates and invertebrates (Bertrand and Escriva, 2011). Phylogenetic analyses indicate that two whole-genome duplications occurred in the vertebrate lineage after the divergence of the cephalochordates, such that each amphioxus gene generally corresponds to two, or sometimes three, vertebrate genes (Dehal and Boore, 2005; Ruvinsky et al., 2000b). For example, a single amphioxus gene, AmphiTbx1/10, corresponds to two vertebrate genes, Tbx1 and Tbx10, which presumably arose during one of the genome duplications. AmphiTbx1/10 is expressed during gastrulation in ventral somites and the branchial arches (Mahadevan et al., 2004), corresponding to the expression of mouse Tbx1 in the ventromedial somites and pharyngeal arches (Chapman et al., 1996). Mouse Tbx10, however, is expressed only in the developing hindbrain (Bush et al., 2003). Thus, the primordial function of Tbx1/10 in chordates might have been branchial arch patterning and ventral somite specification, functions retained by the Tbx1 gene while Tbx10 lost its role in pharyngeal arch patterning and gained a novel role in hindbrain development. Another example is AmphiTbx4/5, which is the ortholog of vertebrate Tbx4 and Tbx5, which both have roles in limb development and, in addition, Tbx5 has a role in heart development (Duboc and Logan, 2011; Greulich et al., 2011). AmphiTbx4/5 is transiently expressed in restricted, bilateral areas of ventral mesoderm that superficially resemble the restricted, limb field-specific expression of Tbx4 and Tbx5 in vertebrates. 
However, amphioxus has no limbs and the area of expression also coincides with presumptive heart precursors, suggesting that the original function of the gene was in heart specification and that it might have been co-opted for a novel role in limb outgrowth when paired limbs evolved in vertebrates (Horton et al., 2008). In this scenario, the two duplicated genes could have further evolved to take on limb-specific expression and functions. In an amazing feat of trans-subphylum transgenesis, an AmphiTbx4/5 transgene was used to rescue limb outgrowth when heterologously expressed in the forelimb field of a mouse lacking Tbx5. This result indicates that the protein function has been conserved even across millions of years of evolution, and suggests that the acquisition of novel functions came about through changes in gene regulation (Minguillon et al., 2009).
Among vertebrates, there is usually a one-to-one correspondence of orthologs. For example, mice and humans each have 17 orthologous T-box genes (Table 1). However, exceptions do occur, as exemplified by the existence of a paralogous gene pair in Xenopus, Xbra and Xbra3, which resulted from a Xenopus-specific gene duplication of the ancestral T gene (Hayata et al., 1999), and the Xenopus Tbx6 subfamily gene XlTbx6r, which is not present in other vertebrates (Callery et al., 2010). Furthermore, in zebrafish there are two additional Tbx6 family genes, which apparently have no orthologs in mammals (Ruvinsky et al., 2000a; Windner et al., 2012) (Fig. 1), possibly indicating loss of these genes in the mammalian lineage. As the genomes of more organisms are sequenced, the positive identification of orthologs in more distantly related taxa will improve.
T-box genes are scattered throughout the vertebrate genome with the exception of two paralogous pairs, Tbx2 and Tbx3, which are linked to Tbx4 and Tbx5, respectively. This arrangement appears to have originated from a tandem duplication of a single ancestral gene followed by a duplication of the linked pair prior to the separation of bony fish from tetrapods (Agulnik et al., 1996).
Examples of T-box gene function during development
In the following sections, the involvement of T-box genes during cell fate specification and differentiation in the early embryo, during somite development and during heart and limb formation is highlighted to illustrate the complex and sometimes interactive or combinatorial roles that T-box genes play in a variety of developmental processes. In addition to these selected areas, the development of many other organs and tissues is known to be regulated by T-box genes but is not covered in detail here. T-box genes make major contributions to craniofacial development (Tbx1, Tbx10, Tbx15, Tbx22) and to development of the brain (Tbr1, Eomes), mammary gland (Tbx2, Tbx3), pituitary gland (Tbx3, Tbx19), thymus (Tbx1), liver (Tbx3), lung (Tbx2, Tbx4, Tbx5), pigmentation (Tbx15) and the immune system (Tbx21), among others (see Table 1 for references), and control processes that are also relevant to the development of cancers (Box 2).
During development, T-box genes have roles in differentiation, proliferation, tissue integrity and the EMT, which are processes that are also relevant to the development of cancer and metastasis. Upregulation or downregulation of T-box genes has been associated with a variety of different types of cancer and may be causal in promoting neoplasia:
Increased levels of TBX2 and TBX3 are associated with several types of cancer, including melanoma, pancreatic and mammary carcinoma, and the genes are expressed during normal development of the corresponding tissues (Abrahams et al., 2010; Begum and Papaioannou, 2011; Carreira et al., 1998; Douglas and Papaioannou, 2013; Rowley et al., 2004). Both genes repress the cyclin-dependent kinase inhibitors p16INK4a (Cdkn2a) and p21Cip1 (Cdkn1a), which are involved in proliferation control, and in fibroblast cell lines TBX2 and TBX3 expression leads to bypass of senescence (Carlson et al., 2001; Jacobs et al., 2000). In mammary tumor cell lines, TBX2 is a powerful growth promoter and TBX3 drives migratory behavior (Peres et al., 2010) and promotes cancer stem-like cell phenotypes (Fillmore et al., 2010).
Mutations or increased copy number of T are commonly found in the bone tumor chordoma and T expression is necessary for proliferation of chordoma cell lines. As the name implies, chordomas are thought to be of notochordal origin and one of the major roles of T is in the development of the notochord (Nibu et al., 2013). T has also been implicated in controlling EMT in some cancers such as adenoid cystic carcinoma and colorectal cancer (Sarkar et al., 2012; Shimoda et al., 2012).
Tbet (Tbx21) and Eomes have indirect effects on metastasis as they are both involved in the normal development of natural killer (NK) cells (Gordon et al., 2012) and, in their absence, innate antitumor immune responses are compromised by lower numbers of NK cells such that susceptibility to metastatic cancers is greatly increased (Lazarevic et al., 2013).
Early development, gastrulation and somite patterning
A number of T-box genes are expressed in pre-implantation and early post-implantation embryos and play varied roles in embryo patterning and survival (Fig. 1 and Fig. 4). Tbx3 is expressed in the inner cell mass (ICM), which is the pluripotent lineage of the pre-implantation blastocyst, and is later expressed in the extraembryonic endoderm of the developing yolk sac where it has a role in yolk sac development (Davenport et al., 2003). Mga is also expressed in the ICM and its derivative epiblast and is necessary for its survival after implantation (A. J. Washkowitz, C. Schall, K. Zhang, W. Wurst, T. Floss, J. Mager and V.E.P., unpublished) (Fig. 4). Trophectoderm (TE), which is the first tissue to differentiate, is required for the early interactions between the embryo and the uterus that establish implantation. The differentiation and proliferation of the TE beyond the blastocyst stage is critically dependent on Eomes (also known as Tbr2), which is expressed in the TE at the blastocyst stage (Fig. 4). Eomes loss results in lethality through failure of trophoblast stem cell maintenance after implantation (Ciruna and Rossant, 1999; Hancock et al., 1999; Russ et al., 2000; Strumpf et al., 2005). This is one of the earliest known roles for a T-box gene in mammals, but Eomes also has additional roles in the early patterning of the embryo, as revealed by tissue-specific deletion of the gene at later stages of development: loss of Eomes from its expression domain in the proximal posterior epiblast at the pre-streak stage blocks the epithelial-to-mesenchymal transition (EMT) and the migration of nascent mesoderm away from the primitive streak (Arnold et al., 2008a). Progenitors of the cardiac mesoderm, which are normally in the earliest cohort of cells to exit the streak, are not specified in the absence of Eomes and fail to express Mesp1, the master regulator of cardiovascular cell fate and a direct transcriptional target of Eomes (Costello et al., 2011). 
A few hours later, the specification and migration of definitive endoderm (DE) is also blocked (Arnold et al., 2008a). Specific loss of Eomes from the visceral endoderm (VE) results in the primitive streak not being positioned properly and in failure of induction and migration of the anterior visceral endoderm (AVE), which is an essential component of axis patterning. This effect is mediated through direct transcriptional regulation of Lhx1, which is part of the gene regulatory network controlling AVE formation and function (Nowotschin et al., 2013). Thus, as the pattern of expression shifts with the progressive tissue rearrangements that characterize gastrulation (Fig. 4), Eomes adopts different, context-dependent roles in cell lineage specification and embryo patterning.
Like Eomes, T also affects the specification and migration of nascent mesoderm cells during gastrulation, in particular that of precursors of the node, notochord and posterior mesoderm. There are many mutant alleles of T in mice, all of which exhibit the hallmark loss of the notochord and posterior mesoderm but display differences in the extent of posterior development, with more severe alleles progressively affecting mesoderm at more anterior axial levels in heterozygotes as well as homozygotes (Kispert, 1995; Papaioannou, 2001; Showell et al., 2004). The notochord acts as an important signaling center during neural tube and somite patterning and its loss thus results in patterning defects. The morphology of the node is also compromised in T mutants, contributing, along with disruption of the midline, to abnormalities in right/left axis determination (Concepcion and Papaioannou, 2014; Conlon et al., 1995; King et al., 1998). However, despite T being one of the best studied developmental genes, there is still a lot to learn about how it exerts its effects on mesodermal structures and the process of gastrulation. A screen to identify targets of zebrafish ntl (also known as ta), which is one of two zebrafish orthologs of T (Martin and Kimelman, 2008), identified a transcription factor gene, flh, the homolog of the mouse Noto gene, which is necessary for normal node and cilia formation. In addition, a large number of other genes involved in the formation of primary germ layers, mesoderm formation and mesoderm cell migration were identified by genome-wide binding site mapping as potential targets of T orthologs in zebrafish (Morley et al., 2009), Xenopus (Gentsch et al., 2013) and mouse (Lolas et al., 2014), indicating the existence of gene regulatory networks directed by brachyury and providing a treasure trove of candidate genes to elucidate the additional roles of this T-box gene.
T is expressed in the primitive streak and continues to be expressed in the core of the developing allantois, which is a midline, mesodermal structure derived from the posterior primitive streak that subsequently forms the umbilical cord (Fig. 4). In line with this, homozygous T mutants die due to failure of allantois development; in mutants, the core fails to survive and the vascular network fails to form (Inman and Downs, 2006). A different T-box gene, Tbx4, is similarly required for allantois growth, vascularization and fusion with the chorion, although in Tbx4 mutants endothelial cells are formed but fail to undergo vascular remodeling (Arora et al., 2012a; Naiche and Papaioannou, 2003). A number of Tbx4 candidate target genes that are involved in chorio-allantoic fusion and vascular remodeling have been identified, indicating that Tbx4 affects a network of genes in the allantois (Arora et al., 2012a; Naiche et al., 2011; Naiche and Papaioannou, 2003).
In contrast to T, Tbx6 is expressed in the presomitic mesoderm of the primitive streak and has several distinct effects on the specification and differentiation of the presomitic mesoderm, as well as on ciliogenesis in the node. Tbx6 mutations result in disruption of the anterior-posterior polarity of somites and, in homozygous mutants for a loss-of-function allele, in the formation of ectopic neural tubes in place of the posterior somites, as well as disruption of left/right patterning (Chapman et al., 2003; Chapman and Papaioannou, 1998; Hadjantonakis et al., 2008; Watabe-Rudolph et al., 2002). As with other T-box genes, Tbx6 is likely to be acting in different tissue contexts through different target genes, some of which have been identified. For example, the regulation of Dll1, which encodes a Notch1 ligand, may contribute to the left/right asymmetry defect of Tbx6 mutants through perinodal Notch signaling, but, as yet, it is unknown how node ciliogenesis is regulated by Tbx6. In the presomitic mesoderm, Tbx6 normally represses the neural determinant gene Sox2 through an indirect interaction with its N1 enhancer, thus influencing the neural versus mesodermal cell fate choice in axial stem cells (Takemoto et al., 2011). In its role in somite patterning, Tbx6 in conjunction with Wnt and Notch signaling lies at the center of a regulatory network that controls somite polarization and border formation through the direct targets Dll1 (Hofmann et al., 2004; Watabe-Rudolph et al., 2002; White and Chapman, 2005), Msgn1 (Nowotschin et al., 2012; Wittler et al., 2007), which encodes a transcription factor involved in segmentation, Mesp2 (Yasuhiko et al., 2006), which encodes a transcription factor essential for somite border formation and rostrocaudal patterning, and Ripply2 (Dunty et al., 2008), which encodes a component of segment boundary formation. At least one other T-box gene, Tbx18, has a role in somite patterning. 
It is transiently expressed in the anterior half of newly formed somites and its expression is maintained in the anterior lateral sclerotome (Kraus et al., 2001). Tbx18 interacts with another transcription factor, Pax3, in regulating the gene expression program necessary for maintenance of anterior-posterior somite polarity (Bussen et al., 2004; Farin et al., 2008).
T-box genes are also involved in all stages of heart development: the initial specification of the cardiac mesoderm, the regionalization of the primitive heart tube into chamber and non-chamber myocardium, the formation of the valves and septa that separate the chambers, the recruitment of second heart field (SHF) cells to the outflow tract (OFT), and formation of the cardiac conduction system (CCS). In addition, some T-box genes (T and Tbx6) indirectly affect the direction of heart looping through their effects on left/right axis formation, as discussed above. Importantly, several human developmental syndromes that include cardiac abnormalities are associated with mutations in T-box genes: TBX1 (DiGeorge syndrome), TBX3 (ulnar-mammary syndrome), TBX5 (Holt-Oram syndrome) and TBX20 (atrial septal defect 4 and other abnormalities) (Table 1).
Similar to the situation in the early mesoderm, a number of T-box genes – Eomes, Tbx1, Tbx2, Tbx3, Tbx5, Tbx18 and Tbx20 – are expressed in the developing heart with both overlapping and unique areas of expression but with each gene playing at least some unique and essential role in cardiogenesis, as revealed by mutational analysis (reviewed by Greulich et al., 2011; Hariri et al., 2012) (Fig. 1 and Fig. 5). As noted above, Eomes is involved in the initial specification of cardiac mesoderm and, once the primitive heart tube has formed from the cardiac crescent, Tbx5 and Tbx20 both play early, non-redundant roles in the elaboration of the chambers of the linear heart tube. Homozygous mutant embryos for either gene die at mid-gestation with poorly developed chambers in which chamber myocardial genes are not activated (Bruneau et al., 1999; Stennard et al., 2005). The cardiac expression of Tbx5 is primarily limited to the first heart field (FHF) derivatives in the posterior region of the heart. Through its interaction with co-factors such as Nkx2-5 and Gata4, Tbx5 is responsible for the activation of a set of chamber myocardial genes including Nppa and Cx40 (Gja5) (Bruneau et al., 2001). Heterozygous mutations reveal later septation defects and defects in the CCS that are also characteristic of patients with Holt-Oram syndrome (Bruneau et al., 2001; Greulich et al., 2011; Takeuchi et al., 2003). Similarly, Tbx20 activates a chamber myocardial program in derivatives of the FHF through direct targets such as Nppa, Nkx2-5 and Mef2c. Loss-of-function mutations of Tbx20 result in poorly developed chambers and ectopic upregulation of Tbx2, a gene normally restricted to the atrioventricular canal (AVC) and OFT, throughout the heart tube (Greulich et al., 2011). Tbx20, acting via activation of Bmp2, also plays a role in the development and patterning of the non-chamber myocardium of the AVC and in the EMT that is required to form cushion mesenchyme (Cai et al., 2011). 
In the endocardium, Tbx20 is required upstream of Wnt signaling for endocardial cushion formation and valve elongation (Cai et al., 2013). Studies with conditional alleles have also identified roles for Tbx20 in the adult heart, where it regulates the cardiac ion flux that is crucial for excitation-contraction functions in myocytes, a role that helps explain the involvement of this gene in adult cardiomyopathy (Shen et al., 2011).
In contrast to Tbx5 and Tbx20, Tbx2 and Tbx3 are co-expressed in the non-chamber myocardium of the AVC and serve to suppress proliferation and chamber-specific gene expression, setting up a dichotomy between chamber and non-chamber myocardium (Christoffels et al., 2004). Both genes are involved in establishing the identity and function of the CCS and in inducing atrioventricular cushion development through Bmp2. Although mutations in each gene produce a distinct cardiac phenotype, indicating unique functions, they also have redundant roles in the AVC (Frank et al., 2012; Greulich et al., 2011; Singh et al., 2012). Similarly, during the development of the arterial pole of the heart, Tbx2 and Tbx3 play roles in SHF deployment and OFT alignment. Together with Tbx1, which is required for Tbx2 and Tbx3 expression in the pharyngeal mesenchyme and neural crest, they form a gene regulatory network for arterial pole development. Accordingly, loss of function of Tbx1 and either Tbx2 or Tbx3 causes severe defects in OFT development (Mesbah et al., 2012). Tbx1 is expressed throughout the pharyngeal region in endoderm, ectoderm and neural crest-derived mesenchyme and it affects many aspects of pharyngeal development, including the development of SHF, in which it drives elevated proliferation and delayed differentiation as SHF cells are added to the elongating arterial pole (Parisot et al., 2011). Loss-of-function mutations of Tbx1 in mice result in severe OFT defects including truncus arteriosus and aortic arch artery defects, among other craniofacial and glandular abnormalities, similar to the defects seen in the DiGeorge syndrome in humans (Papangeli and Scambler, 2013; Scambler, 2010).
From the venous pole of the heart, the proepicardium spreads over the surface to form the epicardium, which then undergoes EMT, invades the underlying myocardium, and contributes to the smooth muscle cells of the coronary vasculature. In addition to its role in somite patterning, Tbx18 is expressed in the proepicardium, the precursors of the myocardium of the sinus venosus, and the sinoatrial node (SAN). Mutants for Tbx18 die at birth with defects in the venous return of the heart including delayed myocardial differentiation and a severely reduced SAN. In addition, the epicardium and coronary vessels have structural and functional defects that lead to defective vascular plexus remodeling in the coronary vasculature (Greulich et al., 2011; Wu et al., 2013).
Many T-box genes, including T, Eomes, all the members of the Tbx2 subfamily and three members of the Tbx1 subfamily, are expressed in the developing limbs, and functional roles for most of these have been identified (for reviews see Duboc and Logan, 2011; King et al., 2006; Tanaka, 2013; Washkowitz et al., 2012) (Fig. 1 and Fig. 6). T, which is expressed in the subridge mesoderm beneath the apical ectodermal ridge (AER), plays a role in the regulation and maintenance of the AER (Liu et al., 2003). Tbx1 is expressed in the muscle masses of the developing limbs and may play a role in limb myoblast differentiation (Dastjerdi et al., 2007), although no limb abnormalities have yet been identified in mice lacking Tbx1. Mutation of Tbx15 results in mild skeletal abnormalities, notably in the scapular blade, a defect that is exacerbated in Tbx15/Pax3 compound mutants (Farin et al., 2008). In humans, mutation of TBX15 results in Cousin syndrome, which includes abnormalities of the limb girdles similar to the defects observed in the corresponding mouse mutant (Table 1). Mutation of the closely related Tbx18, which is expressed in the anterior lateral sclerotome and co-expressed with Tbx15 in the core of the limb bud, results in no apparent limb defects, possibly indicating redundant functions, although defects in the scapular blade are apparent in Tbx18/Pax3 compound mutants (Farin et al., 2008; King et al., 2006). Tbx2 and Tbx3 are both involved in digit development. They are expressed in overlapping regions at the margins of both fore- and hindlimbs and Tbx3 is also expressed in the AER (Gibson-Brown et al., 1996). Tbx2 controls digit formation by repressing the Bmp antagonist Grem1 in the posterior limb margin, thereby terminating Fgf4/Shh signaling in the posterior limb (Farin et al., 2013). In line with this, mutation of Tbx2 results in mild polydactyly (Harrelson et al., 2004). 
Mutation of Tbx3, by contrast, results in more severe limb defects, with lack of development of the footplate and failure to form the posterior elements of the forelimb (Davenport et al., 2003). The situation is somewhat different in humans with mutation in TBX3, as only the forelimbs are affected in ulnar-mammary syndrome (Table 1) (King et al., 2006).
The other two members of the Tbx2 subfamily, Tbx5 and Tbx4, show limb expression that is primarily limited to the forelimb or hindlimb, respectively, and are involved in limb initiation and differentiation. In the earliest stages, Tbx5 is at least partly responsible for the localized EMT of the somatopleure that initiates forelimb bud formation (Gros and Tabin, 2014), although it is not known whether Tbx4 plays a similar role in the hindlimb region. Both genes regulate Fgf10 in the mesenchyme to establish the Fgf10/Fgf8 positive-feedback loop between mesenchyme and AER that drives limb outgrowth. However, the precise roles for these two genes differ slightly, as Tbx4 is not exclusively required for the initiation of Fgf10 expression in the hindlimb, whereas Tbx5 alone is necessary and sufficient to initiate Fgf10 expression in the forelimb. Conditional deletion mutations have shown that each gene is required during a brief period of limb initiation but is not required thereafter to maintain outgrowth, as the Fgf10/Fgf8 signaling loop is self-sustaining. Instead, both genes later play roles in muscle and tendon morphogenesis (Duboc and Logan, 2011; King et al., 2006; Naiche et al., 2005). As mentioned earlier, mutations in TBX5 result in Holt-Oram syndrome, which is also known as heart-hand syndrome because it combines heart defects with abnormalities in development of the arm and hand. Similarly, mutation in TBX4 results in small patella syndrome, which includes a variety of defects in the hindlimbs (Table 1).
Because of their almost exclusive expression in either the fore- or hindlimb, Tbx5 and Tbx4 are also candidate determinants of limb type-specific patterning (Logan, 2003). Although ectopic misexpression studies in the chick lend support to this idea, gene deletion and replacement experiments in the mouse indicate that Tbx4 can direct limb development in a forelimb lacking Tbx5 and that the resulting limb has the molecular and morphological characteristics of a forelimb, indicating that Tbx4 does not direct hindlimb-specific morphologies and that Tbx5 is not required for forelimb-specific morphology (Minguillon et al., 2005). Complementary experiments have not been performed in the hindlimb and, although Tbx4 can partially rescue hindlimb characteristics in a Pitx1-deficient hindlimb to a greater extent than Tbx5, there is no strong evidence to suggest that the two genes direct limb-specific morphologies (Duboc and Logan, 2011).
T-box genes and stem cell biology
Cells with the stem cell properties of self-renewal and differentiation potential exist either transiently during embryonic development or permanently in adult tissues, thereby providing a means of maintaining homeostasis by renewing or regenerating tissues. These cells may be pluripotent, giving rise to many cell types, or may have a more limited range of differentiation possibilities as progenitors of one or two differentiated cell types. Stem cell lines have been derived from early embryos and include ICM-derived pluripotent embryonic stem cells (ESCs) and epiblast-derived stem cells (EpiSCs), as well as trophoblast-restricted trophoblast stem cells (TSCs). It is also possible to induce stem cell properties by genetic or other manipulation of fibroblast cells in vitro to obtain induced pluripotent stem cells (iPSCs). Given their pivotal role in many developmental processes, it is perhaps not surprising that several T-box genes have been implicated either as stem cell factors that promote self-renewal or as differentiation factors that drive the differentiation of stem or progenitor cells (Takashima and Suzuki, 2013) (Fig. 1 and Fig. 7). As outlined below, complementary studies of T-box factors in stem cell lines in vitro and in stem/progenitor cell populations in vivo have been valuable for understanding the mechanisms of early development and adult tissue homeostasis, as well as furthering the goal of harnessing stem cells for use in regenerative medicine.
The pivotal role of Tbx3 in stem cell renewal and differentiation
Tbx3, which is expressed in the ICM of the pre-implantation embryo (Fig. 4) and in undifferentiated ESCs, has been identified as one of the core transcription factors in the regulatory circuitry of pluripotency that is required to maintain ESCs in a pluripotent state in vitro, promoting self-renewal and suppressing differentiation (Ivanova et al., 2006; Lu et al., 2011; Niwa et al., 2009). In the absence of leukemia inhibitory factor (LIF), Tbx3 is sufficient to support the self-renewal of mouse ESCs (Niwa et al., 2009) and it has also been shown to improve the quality of iPSCs, including their germline competence (Han et al., 2010) (Fig. 7A). The maintenance of pluripotency in ESCs involves an interconnected transcriptional network and parallel circuitry that integrates signaling pathways with the core transcription factors, and there is evidence to suggest that the loss of one regulator can be compensated by adjusting the expression of other components (Ivanova et al., 2006). This might explain why the loss of Tbx3, which plays such an important role in ESC maintenance in vitro, does not result in a lethal phenotype during peri-implantation development in vivo (Davenport et al., 2003).
During the lineage specification and differentiation that occurs in vitro following the removal of pluripotency-promoting factors from ESCs, Tbx3 takes on different, context-dependent roles (Fig. 7B). It is necessary for mesendoderm and extraembryonic endoderm (ExEn) differentiation, as well as for the suppression of TE and ectoderm differentiation (Lu et al., 2011; Weidgang et al., 2013). Mechanistically, Tbx3 functions through direct binding and epigenetic modification of histones on the promoter of Gata6, which encodes an essential regulator of ExEn (Lu et al., 2011). Overexpression of either of the two isoforms of Tbx3 in ESCs results in differentiation and the downregulation of the pluripotency-related transcription factor Nanog, although only the Tbx3+2a isoform directly binds the Nanog promoter (Zhao et al., 2014).
Human ESCs (hESCs) present a somewhat different picture, as TBX3 promotes neuroepithelial but not endoderm differentiation during hESC differentiation (Esmailpour and Huang, 2012), possibly reflecting the different states of human and mouse ESCs. However, TBX3 in conjunction with EOMES has been shown to play a non-transcriptional role in the differentiation of DE from hESCs and mouse EpiSCs; during the early steps of differentiation, Tbx3 recruits the histone demethylase Jmjd3 (Kdm6b) to the Eomes enhancer to alter chromatin structure, thereby allowing enhancer-promoter interaction and Jmjd3-mediated transcriptional activation of Eomes, a mechanism that is conserved in mouse and human ESCs (Teo et al., 2011; Kartikasari et al., 2013). Additional studies are clearly needed to clarify the complex and highly context-dependent functions of Tbx3 in ESCs and differentiation.
In addition to its role in pluripotent stem cells in vitro, Tbx3 has been implicated in several stem/progenitor cell populations in vivo, including in the liver (Fig. 7B). During liver development, hepatoblasts are the bipotential progenitors of hepatocytes and cholangiocytes of the bile duct. Tbx3 is expressed in the developing liver bud and its loss leads to a small liver with both fewer hepatocytes and increased differentiation of cholangiocytes. It is somewhat controversial whether Tbx3 acts through repression of p19ARF (Cdkn2a) to promote hepatoblast proliferation (Suzuki et al., 2008) or whether it maintains hepatocyte differentiation and suppresses cholangiocyte differentiation through control of regulatory genes for these two cell types (Ludtke et al., 2009). In another example, overexpression of Tbx3 in the mammary gland of mice causes hyperplasia and accelerated mammary gland development, and is associated with an increased number of cells identified as mammary stem-like cells. In this context, Tbx3 directly represses Nfκbib, an inhibitor of the NF-κB pathway that plays a role in cell proliferation (Liu et al., 2011).
Additional T-box genes in stem/progenitor cell populations
As discussed earlier, Eomes plays an essential role in TE survival in the pre-implantation embryo but it is also necessary for TSC derivation and self-renewal and for the induction of TSCs from human fibroblasts (Chen et al., 2013; Kidder and Palmer, 2010; Russ et al., 2000). Furthermore, during the differentiation of ESCs, Eomes can promote either endodermal fate or cardiovascular fate depending on high or low levels of Activin/Nodal signaling, respectively (Fig. 7B). The induction of cardiac mesodermal fate in the absence of added morphogens is at least partly controlled by direct regulation of the transcription factor Mesp1 (van den Ameele et al., 2012). Mesp1 is also directly regulated by T, which appears to be necessary for robust cardiovascular precursor cell differentiation from ESCs (David et al., 2011).
Differentiating ESCs have been used to investigate the role of T-box genes during heart development; for example, within the FHF and SHF lineages that arise from a common cardiovascular progenitor cell population. Loss of function of Tbx1, which is expressed in the multipotent progenitor cell population, results in premature differentiation, whereas gain of function results in reduced differentiation, indicating that Tbx1 regulates the balance between proliferation and differentiation (Chen et al., 2009). Tbx5, which is expressed in the FHF and its derivatives, favors differentiation to the FHF lineage over the SHF lineage in differentiating ESCs, although it apparently cannot drive differentiation as it has no effect on ESC self-renewal; the effect of overexpression only becomes apparent in differentiating ESCs (Herrmann et al., 2011). Furthermore, Tbx5, in combination with several other factors, can reprogram postnatal fibroblasts directly into cardiomyocyte-like cells both in vitro and in vivo without passing through a stem/progenitor cell state (Fu et al., 2013; Ieda et al., 2010; Inagawa et al., 2012; Qian et al., 2012; Song et al., 2012).
Stem and progenitor cell populations within the intact embryo or adult are more difficult to investigate, although T-box genes are increasingly being identified as regulators of self-renewal or drivers of lineage-specific differentiation in various tissues in vivo. For example, in the developing tail of vertebrates, a self-renewing population of mesoderm progenitor cells resides in the tail bud and continuously differentiates throughout the process of somitogenesis (Dubrulle and Pourquie, 2004). In zebrafish, the T ortholog ntl has been implicated in maintaining this progenitor cell niche through the direct regulation of Wnt ligands and the retinoic acid degradation enzyme gene cyp26a1 (Martin and Kimelman, 2010). In a different situation, T has been found to be required for the self-renewal of spermatogonial stem cells (SSCs) derived from mouse testes. In this context, T is directly regulated by Etv5, one of the transcription factors involved in regulating SSCs (Wu et al., 2011).
In another example, Tbx1 was identified in an RNA interference screen for regulators of self-renewal capacity as a factor affecting the self-renewal of hair follicle stem cells (HF-SCs), possibly acting through repression of Bmp signaling. Loss of Tbx1 diminishes HF-SC renewal capacity, resulting in eventual thinning of the hair (Chen et al., 2012).
Eomes, as well as Tbr1, also plays a role in the differentiation of neural stem cells (NSCs) in developing and adult brains. Tbr1 regulates neuronal output from NSCs in the olfactory bulb by promoting the production of neurons and oligodendrocytes and inhibiting the formation of astrocytes (Mendez-Gomez et al., 2011). Similarly, although not directly affecting the self-renewal capacity of radial glial cells (RGCs) within the developing cerebral cortex, Eomes is a key determinant of the differentiation of intermediate progenitor cells (IPCs), which comprise the RGC daughter transit amplifying progenitor population (Arnold et al., 2008b; Sessa et al., 2008). The transition from NSC to IPC is regulated by miR-92a through post-transcriptional regulation of Eomes (Bian et al., 2013). Eomes is also a crucial regulator of granule cell formation from NSCs in the dentate gyrus, both during development and in the adult. In this case, Eomes is also thought to play a feedback role in regulating NSCs, as in its absence the NSC pool increases, possibly through regulation of the NSC factor Sox2 (Hodge et al., 2012).
Finally, it has been shown that Tbx18 transduction into the adult guinea pig heart using an adenoviral vector results in the direct conversion of ventricular myocytes to pacemaker cells without the need for passage through a pluripotent state (Kapoor et al., 2013) – a case of direct reprogramming that could have implications for the use of T-box genes in directed differentiation.
There are many other organs and tissues in which T-box genes have major effects that have not been covered here (e.g. craniofacial development, mammary gland development, immune T cells, lung development; see Table 1). Nevertheless, from this sampling of some of the developmental processes controlled by T-box genes, it should be clear that this ancient family of transcription factor genes has a multitude of diverse functions throughout development. Over the course of metazoan evolution, some ancient functions have been conserved while novel ones have arisen with increasing complexity. It is also evident that T-box genes are involved in all of the major developmental signaling pathways, and we have seen how multiple T-box genes with both unique and overlapping functions can affect the development of a single organ and how a single T-box gene can play roles in different tissues and/or at different times in development. This tissue and/or temporal specificity is accomplished in part by tight regulation of the expression of T-box genes, by the availability of co-factors, by the presence of other T-box proteins, and by the location and binding specificity of TBEs in different promoters. With the availability of conditional mutant alleles in mice, this complexity is gradually being elucidated and functions are being assigned to each area of expression, although there is still much work to be done. The regulation of T-box gene expression is an area ripe for exploration, as is the important issue of determining the battery of downstream targets affected in different developmental contexts. Finally, through the study of non-mammalian organisms, this highly conserved family offers a rich source of material that can further our understanding of the evolution of developmental mechanisms.
I thank Ripla Arora and Jeremy J. Gibson-Brown for helpful discussions and critical reading of the manuscript and Andreas Kispert for providing Fig. 5.
This work was supported in part by a grant from the National Institute of Child Health and Human Development of the National Institutes of Health. Deposited in PMC for release after 12 months.
The author declares no competing financial interests.