Changes in developmental regulatory programs drive both disease and phenotypic differences among species. Linking human-specific traits to alterations in development is challenging, because we have lacked the tools to assay and manipulate regulatory networks in human and primate embryonic cells. This field was transformed by the sequencing of hundreds of genomes – human and non-human – that can be compared to discover the regulatory machinery of genes involved in human development. This approach has identified thousands of human-specific genome alterations in developmental genes and their regulatory regions. With recent advances in stem cell techniques, genome engineering, and genomics, we can now test these sequences for effects on developmental gene regulation and downstream phenotypes in human cells and tissues.
Introduction
Humans differ from chimpanzees, our closest living relatives, and other mammals in a variety of traits, including disease susceptibilities. Many of these differences have their origins in development. The fossil record shows that some human traits, such as the pelvic morphology associated with upright walking (Harcourt-Smith et al., 2004), evolved around the time of divergence from our common ancestor with chimpanzees about six million years ago. Other traits, such as loss of a prominent brow ridge (Lieberman, 2000), emerged only after modern humans split from Neanderthals and other extinct hominins. Some phenotypes are very recently evolved, as evidenced by variation between modern human populations. These include pigmentation (Hancock et al., 2011), keratinization (Gautam et al., 2015), hair texture (Jablonski and Chaplin, 2014; Kamberov et al., 2013) and high altitude adaptation (Huerta-Sanchez et al., 2014; Simonson et al., 2010; Yi et al., 2010). For many distinctive traits, such as social behaviors, symbolic thought and spoken language (Sterelny, 2011), we have no evidence of the time period of evolution or only indirect evidence from changes in the material culture left behind in the archaeological record.
Collectively, human-specific traits have allowed our species to dominate all climates and modify the landscape in a dramatic way never seen in the history of life on earth. At the same time, our species has acquired a unique profile of susceptibility to different diseases compared with our close relatives. Some examples are AIDS (Varki and Altheide, 2005), cardiovascular disease (Varki et al., 2009), neurodegenerative disease (Finch, 2010) and psychiatric disorders (Crow, 2000, 2007). Our high burden of neurological disease may be an ‘Achilles heel’ associated with cognitive adaptations (Crow, 2000, 2007). To understand the evolutionary forces that shaped human-specific traits and the molecular mechanisms through which changes occurred, we must first track down the genetic alterations underlying phenotypic differences between humans and our close relatives. As in other species, human-specific traits must have emerged through genetic modification of developmental, physiological or behavioral programs.
Development is a highly constrained and tightly regulated process, orchestrated through complex gene regulatory networks (Davidson and Erwin, 2006). There is a remarkable conservation of development between organisms as evolutionarily distant as cnidarians, insects and mammals. The same molecular pathways control this conserved developmental program across animal lineages. For example, humans, flatworms and cnidarians use the same basic elements of paracrine signaling cascades, such as the Wnt and TGFβ pathways (Finnerty et al., 2004; Carroll, 2005). Transcription factors that regulate development are also highly conserved, and they tend to regulate the same general processes in diverse species, including regulation of body axis patterning by Hox genes (Lemons and McGinnis, 2006), light-sensing organs by Pax6 (Gehring, 2011), head formation by Otx homologs (Yasuoka et al., 2014) and heart morphology by Tinman/Nkx2-5 (Erwin, 1999). This high level of conservation has allowed us to study human development, despite limited access to human tissues for research, using model organisms as proxies.
Conserved developmental programs, and hence morphology and other traits, are modified through two primary mechanisms: direct alterations to genes and thereby the functions of the encoded products (proteins, RNAs), and changes in gene regulation (transcription, splicing, translation, post-translational modifications). As most developmental genes are pleiotropic and participate in multiple independent developmental processes, their evolution is highly constrained. By contrast, gene regulatory elements, including distal enhancers (Pennacchio et al., 2006), tend to function in a more limited number of cell types and stages, combining additively to control the complex expression patterns of developmental genes (Noonan and McCallion, 2010). This modularity makes gene regulation an ideal template for the evolution of morphology (Carroll, 2008). Supporting this idea, there is mounting evidence that human traits evolved largely through genetic changes in regulatory regions (Haygood et al., 2010; Horvath et al., 2014; McLean et al., 2011; Pollard et al., 2006a,b), as initially proposed by King and Wilson (1975), although the relative contributions of coding versus non-coding sequences continue to be debated.
In this Review, we discuss how genomics is transforming the study of human developmental biology to enable direct analysis of genetic variants that arose during human evolution in their native context. The transformation started with genome sequencing and the development of comparative genomic techniques for pinpointing sequences, including both genes and regulatory elements, that are unique to humans compared with chimpanzees and other mammals. Although a few developmental genes have been associated with the evolution of particular human-specific traits (discussed further below), it has been more challenging to characterize the role of regulatory sequences in human evolution and disease. Functional genomics has helped to address this challenge by generating rich information about the cell types and developmental stages in which uniquely human sequences, including non-coding sequences, function. By introducing human variants of proteins and gene regulatory sequences into model organisms and assaying their effects during embryogenesis, a handful of human-specific sequences have been partially characterized (reviewed by Devoy et al., 2012; Enard, 2014). However, significant further work is needed to test the hypothesis that these sequences altered traits during human evolution. Fortunately, we are entering an era in which it is possible to identify human-specific regulatory elements and to characterize them functionally in human and non-human primate cells using emerging techniques from stem cell biology and genome engineering. Directly studying human development will accelerate understanding of what makes our species unique.
The genome sequencing era
The availability of genome sequences for humans (International Human Genome Sequencing Consortium, 2004), primates (The Chimpanzee Sequencing and Analysis Consortium, 2005; Rhesus Macaque Genome Sequencing and Analysis Consortium et al., 2007), extinct hominins (Meyer et al., 2012; Prufer et al., 2014) and many other vertebrates has fueled the discovery of human-specific DNA sequences. Comparative genomic studies have cataloged three general classes of genomic differences between humans and other primates: large chromosomal alterations, smaller insertions and deletions (indels), and single nucleotide substitutions (Fig. 1). Developmental loci have accumulated polymorphic and fixed differences of all three types during human evolution. Although the catalog of human-specific differences is continually growing, functional proof of how genetic changes translate into phenotypic changes is scarce. In Table 1, we summarize select examples of functional studies that have been undertaken in order to assess the impact of human evolutionary changes. In a few cases, these studies have illuminated the role that human-specific changes could have had in the evolution of human traits, although a definitive causal role in altering human phenotypes has yet to be established for any of these human genomic regions.
Chromosomal alterations
Large genomic duplications, deletions and rearrangements (Fig. 1) are relatively rare, but they encompass many developmental loci owing to their size, which is usually thousands of base pairs (Coe et al., 2014; Girirajan et al., 2011, 2013; Ma et al., 2006). For example, segmental duplications are typically defined as regions greater than one kilobase (kb) with >97% sequence identity (Marques-Bonet et al., 2009). Human-specific structural variation can be challenging to identify, because these loci are difficult to assemble and align. They are therefore poorly represented in genome assemblies, and their discovery often requires targeted sequencing (Chaisson et al., 2015) and cytogenetic techniques. The first structural differences between the human and chimpanzee genomes were discovered using chromatin-stained banding techniques and include the fusion of two ancestral ape chromosomes to form human chromosome 2, human-specific constitutive heterochromatin C bands on chromosomes 1, 9, 16 and Y, and human-specific pericentric inversions on chromosomes 1 and 18 (Yunis and Prakash, 1982). Fluorescent in situ hybridization (FISH) and comparative genomic hybridization (CGH) arrays identified >60 human-specific segmental duplications (Goidts et al., 2006; Jauch et al., 1992; Wilson et al., 2006) and 152 genes displaying copy number variation (Armengol et al., 2010; Fortna et al., 2004).
Many of these structural variants have altered gene expression or downstream phenotypes in humans. The pericentric inversion of chromosome 1, for example, is associated with human developmental and neurogenetic diseases and contains copy number increases of the developmental genes SLIT-ROBO Rho GTPase activating protein (SRGAP2) (Box 1) (Dennis et al., 2012), HYDIN (Doggett et al., 2006), and several DUF1220 domain-containing gene families [e.g. the neuroblastoma breakpoint family (NBPF)] (Fortna et al., 2004). This region demonstrates the complex functional consequences of structural variants. The human-specific duplication of the locus created two duplicate genes (SRGAP2B and SRGAP2C). SRGAP2C dimerizes with the ancestral SRGAP2A and phenocopies SRGAP2 inhibition, which appears to have led to changes in radial neuron migration and cellular phenotypes (discussed further in Box 1) (Charrier et al., 2012; Dennis et al., 2012). Another interesting human-specific structural variant occurred at chromosome 15q13-q14. This region contains duplications of several genes, including ARHGAP11B, which is a partial copy of the gene ARHGAP11A, truncated by the boundary of the duplication. ARHGAP11B is expressed in the developing brain of humans and appears to regulate brain development (Florio et al., 2015). A polymorphic human-specific duplication also created the salivary amylase gene AMY1, which probably enabled humans to eat a high-starch diet (Perry et al., 2007) and is associated with obesity (Falchi et al., 2014; but see Usher et al., 2015). Supporting the adaptive role of duplications in human evolution (Iskow et al., 2012), both coding (Hahn et al., 2007) and non-coding (Kostka et al., 2010) elements in duplicated loci show signatures of positive selection.
The gene SRGAP2 duplicated in a series of genomic events in the lineage leading to Homo, between 3.4 and 1 million years ago (Dennis et al., 2012). In humans, three additional, although partial, copies of this gene are present: SRGAP2B, C and D (Charrier et al., 2012; Dennis et al., 2012). The ancestral copy, present in all mammals, is named SRGAP2A in humans and has been recently demonstrated to be important in brain cortical development in mice (Guerrier et al., 2009). The SRGAP2 duplicates encode a truncated F-BAR domain that binds to SRGAP2A and antagonizes its function during neuronal migration and morphogenesis. The introduction of human-specific SRGAP2C in utero in mouse pyramidal neurons induces a reduction in dendritic spine heads, longer spine necks and higher spine density compared with control neurons. In addition, it was found that mutant mice lacking SRGAP2 display reduced width of dendritic spine heads, longer spine necks and a higher density of dendritic spines (Charrier et al., 2012). Thus, expression of SRGAP2C mimics SRGAP2 deficiency during neuronal migration, leading to a deficit in branching in the leading process of migrating neurons and allowing neurons to reach their final position in the cortical plate faster than control neurons. As dendritic spines are known to enhance synaptic connectivity, enable linear integration of synaptic inputs, and implement synapse-specific plasticity (Yuste, 2011), Charrier and colleagues (Charrier et al., 2012) speculate that expression of SRGAP2C might allow human cortical pyramidal neurons to receive and integrate a significantly higher number of synaptic inputs without saturation, which could have important implications for cognition, learning and memory. However, spine density in various human and chimpanzee cortical regions is similar (Bianchi et al., 2013). Charrier et al. (2012) also reported that Srgap2 knockout mice display reduced viability, suggesting that loss of SRGAP2 activity might also have detrimental consequences. Supporting this idea, a large genomic alteration affecting the human ancestral SRGAP2A gene may be responsible for the early-infantile encephalopathy and associated epilepsy displayed in a patient (Saitsu et al., 2012). At this time, it is not clear what human traits, if any, were directly affected by the duplication of SRGAP2.
Indels
Human-specific duplications and deletions of DNA shorter than one kilobase are numerous and comprise ∼3.5% of the human genome, most of which is non-coding (Britten, 2002; The Chimpanzee Sequencing and Analysis Consortium, 2005; Varki and Altheide, 2005). They contribute more base pairs to human-chimpanzee differences than do individual DNA substitutions (see below), albeit fewer than larger chromosomal alterations. Indels can have large functional effects. Non-coding indels can alter human phenotypes by modifying or completely deleting conserved developmental enhancers (Table 1). A genome-wide survey revealed 510 highly conserved sequences that were lost in humans, most of which were non-coding, that included a forebrain subventricular zone enhancer near the tumor suppressor gene GADD45G and a sensory vibrissae and penile spine enhancer for the androgen receptor gene (McLean et al., 2011) (Box 2). Additionally, retroelements resembling transcription factor binding sites have expanded regulatory networks in a variety of species (reviewed by de Souza et al., 2013). Indels also affect genes, including changing the splicing, reading frame, start or end of genes (Varki and Altheide, 2005). These changes have generated many human-specific pseudogenes (Karro et al., 2007) and transcripts targeted to nonsense-mediated decay (Lareau et al., 2007), including the loss of hundreds of olfactory receptors during human evolution (Gilad et al., 2005; Malnic et al., 2004). Quantifying rates and patterns of human-specific chromosomal alterations and shorter indels is difficult, despite the fact that their evolution can be modeled, because estimation and inference with indel models are computationally challenging (Chindelevitch et al., 2006; Diallo et al., 2007). Additionally, indels are difficult to detect in whole-genome and exome data (Fang et al., 2014).
Thus, developing bioinformatics methods for detecting indels and statistical tests for selection on indels are important areas for future research.
McLean and colleagues (McLean et al., 2011) discovered several non-coding functional regions that may have played a role in human evolution. They conducted a genome-wide search for conserved regions that were lost in the human genome after divergence from chimpanzees. These deleted regions are enriched near genes involved in steroid hormone receptor signaling and neural function (McLean et al., 2011). They examined in detail a 60-kb human deletion downstream of the androgen receptor (AR) locus. Within this human-specific deletion lies an approximately 5-kb region that contains non-coding sequences that are highly conserved in other mammals. The authors cloned the corresponding chimpanzee and mouse regions and tested their capacity to drive expression of an hsp68-lacZ reporter gene during mouse development. Chimpanzee and mouse constructs both drove consistent lacZ expression in the facial vibrissae and genital tubercle of five or more independent transgenic embryos and the mouse sequence also drove expression in hair follicles. The authors found that lacZ expression was located in the mesoderm surrounding vibrissae follicles, and in the superficial mesoderm within the presumptive glans of the developing genital tubercle. Sixty-day-old transgenic mice showed expression in the superficial tissue underlying epidermal spines of the penis (McLean et al., 2011). Although previous studies have shown that AR is expressed in mesenchyme surrounding developing epithelial structures (Crocoll et al., 1998), colocalization studies of lacZ expression and AR would confirm whether the enhancer drives lacZ expression in a subset of AR expression domains. It has been shown that humans lack micro- and macro-sensory vibrissae, whereas chimpanzees have micro-sensory vibrissae and mice have both micro- and macro-vibrissae (Muchlinski, 2010). However, the presence or absence of penile spines in humans and chimpanzees is controversial (Reno et al., 2013).
In addition, it has been shown that AR is required for normal development of vibrissae, as castration shortens vibrissae in mice, and excess testosterone increases growth (Ibrahim and Wright, 1983). It has not been demonstrated that lack of vibrissae or penile spines is caused by lack of AR expression in the vibrissae follicles or developing penile spines. These results indicate that the human deletion removes a conserved enhancer sequence that directs expression in a subset of the AR expression pattern. However, functional experiments are still required to test whether deletion of this single enhancer can disrupt the formation of penile spines and vibrissae, a molecular event that may help explain the phenotypic loss of these structures in the human lineage.
Single nucleotide substitutions
The human and chimpanzee genomes differ by >30 million single nucleotide substitutions (1.2% of the human genome), and slightly less than half of these occurred on the human lineage, mostly in non-coding DNA (The Chimpanzee Sequencing and Analysis Consortium, 2005). Evolutionary theory posits that most substitutions are nearly neutral and are therefore unlikely to have produced uniquely human traits. To identify functional differences, research performed before whole-genome sequencing focused on non-synonymous changes to individual protein-coding sequences (Dorus et al., 2004). The first genome-wide comparative genomic analyses of humans and chimpanzees also focused on protein-coding differences and revealed that genes involved in immunity, sensory perception, and reproduction are enriched for positive selection in humans (Brown et al., 2013; Clark et al., 2003; Nielsen et al., 2005; Voight et al., 2006). Similar approaches, including ones that incorporate population genetic data (Racimo et al., 2014), have been used to identify genes that underwent selection after modern humans diverged from Neanderthals and Denisovans (Meyer et al., 2012; Prufer et al., 2014). Several developmental genes that acquired human-specific coding changes have been hypothesized to be responsible for traits that changed during human evolution (reviewed by Sikela, 2006; O'Bleness et al., 2012) (Table 1). These include the forkhead transcription factor FOXP2, which is associated with speech (Enard et al., 2002) and may have undergone positive selection, although it is not clear if selection targeted amino acid changes in the protein or nearby non-coding substitutions (Ptak et al., 2009) (Box 3). Another trait that changed significantly during human evolution is pigmentation. Several regulators of pigmentation, including the ligand for the c-KIT receptor (KITLG) (Sturm and Duffy, 2012), contain human-specific protein-coding changes that may have evolved through positive selection. 
Such examples are compelling, but additional work is required to show that these genes are indeed responsible for modification of the associated traits in humans.
The gene FOXP2 is one of the most extensively studied examples of a uniquely human genome sequence. First, a family with severe speech disabilities and an arginine-to-histidine substitution at position 553 (R553H) was identified (Hurst et al., 1990; Lai et al., 2001). To understand the function of FOXP2 in humans, mouse models that express the mutant form of FOXP2 present in affected members of the family were generated (Groszer et al., 2008). The homozygotes were severely developmentally delayed and died 3-4 weeks after birth. The cerebellum was abnormally small, with decreased foliation, and, behaviorally, the pups emitted fewer ultrasonic distress calls than did heterozygotes or wild-type mice. Further analyses of the gene identified two human-specific amino acid substitutions in comparison with chimpanzee, gorilla and macaque [a threonine-to-asparagine substitution at position 303 (T303N) and an asparagine-to-serine substitution at position 325 (N325S)] with evidence of positive selection (Enard et al., 2002). To investigate the phenotypic change resulting from the two amino acid differences, mice with humanized FOXP2 were generated and intensively studied (Enard et al., 2009). These ‘humanized mice’ were fully viable and fertile, in contrast to the R553H mouse model. They had no gross behavioral or anatomical abnormalities but showed increased neuronal dendritic length, increased synaptic plasticity and changes in ultrasonic vocalization in comparison with wild-type mice (Enard et al., 2009). Despite the considerable effort invested in generating and studying the consequences of the evolutionary changes in human FOXP2, there is still no clear or direct connection between the human-specific amino acid substitutions in FOXP2 and speech or language.
It is clear that mutations of FOXP2 in humans result in speech impairments, indicating that FOXP2 plays a role in speech development, but the nature of this role and whether the gene participated in the evolution of language remain unknown.
As more vertebrates were sequenced, it became possible to use models of DNA evolution to scan the whole human genome for sequences that changed significantly more than expected by chance since divergence from chimpanzees (Bird et al., 2007; Bush and Lahn, 2008; Pollard et al., 2006b; Prabhakar et al., 2006a). To focus on those changes outside coding portions of genes that have a high probability of being functional, these studies analyzed regions that are highly conserved in non-human species (mammals or vertebrates) but significantly changed in humans. In the absence of functionally annotated non-coding sequences, using this signature of negative selection in other species helps to enrich for regulatory elements with constrained function (Ovcharenko et al., 2004; Prabhakar et al., 2006b; Schwartz et al., 2000; Siepel et al., 2005). These studies collectively identified >2500 non-coding regions defined as ‘human accelerated regions’ (HARs) (Hubisz and Pollard, 2014), most of which show signatures of positive selection but some of which were probably shaped by non-selective mechanisms, such as GC-biased gene conversion or loss of constraint (Katzman et al., 2010; Kostka et al., 2012; Pollard et al., 2006a; Ratnakumar et al., 2010; Sumiyama and Saitou, 2011). Similar techniques have also been used to analyze regions of the human genome that changed significantly since divergence from extinct hominins (Green et al., 2010). Interestingly, HARs are enriched for substitutions that pre-date the divergence from Neanderthals and Denisovans, suggesting that our genome did not evolve particularly rapidly during the emergence of modern humans (Burbano et al., 2012; Hubisz and Pollard, 2014).
From a developmental perspective, HARs have a particularly interesting genomic distribution: they cluster near transcription factors and other regulatory genes expressed in embryos (Capra et al., 2013; Kamm et al., 2013b), suggesting that HAR mutations could be responsible for the evolution of human traits through modification of developmental gene regulatory networks.
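The logic of scanning for accelerated regions, that is, asking whether a conserved element carries more human-lineage substitutions than expected by chance, can be illustrated with a deliberately simplified sketch. The published scans relied on phylogenetic models of DNA evolution; the Poisson tail test below is a stand-in chosen for brevity, not the method of any cited study. The function names, the per-site rate and the example counts are all hypothetical.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    if expected <= 0:
        return 0.0
    # First term of the tail, computed in log space to avoid overflow.
    log_term = -expected + observed * math.log(expected) - math.lgamma(observed + 1)
    term = math.exp(log_term)
    total = 0.0
    k = observed
    # Sum successive Poisson terms until they become negligible.
    # The iteration cap is adequate for the small expectations used here.
    while term > 0.0 and k < observed + 10_000:
        total += term
        k += 1
        term *= expected / k
        if term < total * 1e-17:
            break
    return min(1.0, total)


def acceleration_pvalue(length_bp: int, human_subs: int,
                        conserved_rate_per_bp: float) -> float:
    """P-value for seeing at least `human_subs` human-lineage substitutions
    in an element of `length_bp` bases, given an assumed (hypothetical)
    per-site substitution rate for the conserved element."""
    expected = length_bp * conserved_rate_per_bp
    return poisson_upper_tail(human_subs, expected)


# Hypothetical element: 100 bp, 10 human-lineage substitutions,
# assumed rate of 0.002 substitutions per site on the human branch.
p = acceleration_pvalue(100, 10, 0.002)
print(f"p = {p:.3g}")
```

A small p-value in such a test only indicates an excess of substitutions; as noted above, it does not by itself distinguish positive selection from GC-biased gene conversion or loss of constraint.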
The next-generation sequencing era
Comparative genomics identifies uniquely human genome sequences, but it does not tell us the cell types and developmental stages in which a human-specific sequence functions, making it challenging to link these genetic differences to human traits. Additional data are needed to develop and test hypotheses about the molecular and organismal phenotypes affected by human-specific mutations. This gap is being rapidly filled by functional genomics experiments that assay gene expression, epigenetic marks, protein-DNA binding events, and three-dimensional interactions of regulatory elements with gene promoters in many cell types and species (Box 4). For genes with human-specific changes, functional genomics has helped to shed light on tissue specificity, developmental timing, and, in the case of regulatory genes, downstream targets. For example, the gene network regulated by FOXP2 (Box 3) is becoming clearer through identification of its binding locations and DNA-binding motif (Nelson et al., 2013), as well as studies of how FOXP2 mutations alter neuronal gene expression (Konopka et al., 2009). These datasets lay the groundwork for a deeper understanding of how FOXP2 might be involved in language acquisition and, more broadly, human evolution.
Functional genomics, broadly defined as sequencing experiments that probe genome activity, helps to annotate and interpret uniquely human DNA sequences in light of human development. To date, functional genomics has mostly been applied to embryonic stem cells and other homogeneous cell cultures, although some embryonic tissues (Nord et al., 2013; Visel et al., 2009) and developmental cell lines (Roadmap Epigenomics Consortium et al., 2015; Romanoski et al., 2015) are being studied. Several of the primary techniques are briefly defined here.
ChIP-seq. Chromatin immunoprecipitation coupled to sequencing (Furey, 2012; Zentner and Henikoff, 2014) is used to generate genome-wide maps of protein binding for transcription factors, structural proteins, polymerases, and modified histones (The ENCODE Project Consortium, 2012; Roadmap Epigenomics Consortium et al., 2015; Vierstra et al., 2014a). These binding events are integrated by computational methods to predict regulatory elements and their activity. For example, enhancers are associated with the acetyltransferase and transcriptional co-activator p300 (Visel et al., 2009; Blow et al., 2010; Ghisletti et al., 2010), histone H3 lysine 4 monomethylation (H3K4Me1) in the absence of significant trimethylation (H3K4Me3) (Heintzman et al., 2007; Xi et al., 2007; Koch et al., 2007), and combinations of transcription factor binding (Zinzen et al., 2009). Additional chromatin modifications help to distinguish active, inactive and poised enhancers (Rada-Iglesias et al., 2011). Regulatory elements are frequently distal to the promoters they target (Bulger and Groudine, 2011; Ong and Corces, 2011), making it hard to predict the genes, pathways and phenotypes affected when they are mutated.
Chromatin capture. Three-dimensional regulatory interactions can be revealed by chromatin conformation capture (3C) and extensions thereof (4C; Cullen et al., 1993; Dekker et al., 2002; Miele and Dekker, 2009) (5C, Hi-C; Dostie et al., 2006; Lieberman-Aiden et al., 2009; Rao et al., 2014). The National Institutes of Health 4D Nucleome program promises to produce a reliable three-dimensional map of many cell types from humans and model organisms (Pennisi, 2015), and it should also be possible to make maps for chimpanzees or other primates.
DNA methylation. Chemical modification of regulatory DNA that can be assayed by a variety of techniques.
DNase-seq, FAIRE (formaldehyde-assisted isolation of regulatory elements). Measurements of open chromatin associated with regulatory elements (Buenrostro et al., 2013; He et al., 2014; Vierstra et al., 2014b).
RNA sequencing. Quantifies gene expression; can also be used to predict enhancers (Arner et al., 2015).
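As a rough illustration of how the ChIP-seq signatures described above might be combined computationally, the sketch below labels a genomic region from three marks named in the text: p300, H3K4Me1 and H3K4Me3. It is a toy rule set under stated assumptions, not the integration method used by any cited consortium. The `RegionSignal` type, the enrichment threshold of 2.0 and the example values are hypothetical, and treating strong H3K4Me3 as promoter-like is an assumption added here for contrast.

```python
from dataclasses import dataclass


@dataclass
class RegionSignal:
    """Hypothetical fold-enrichment values for one genomic region."""
    name: str
    h3k4me1: float
    h3k4me3: float
    p300: float


def classify_region(region: RegionSignal, threshold: float = 2.0) -> str:
    """Label a region using the marks described in the text.

    The threshold is an arbitrary illustrative cutoff, not a published value.
    """
    has_me1 = region.h3k4me1 >= threshold
    has_me3 = region.h3k4me3 >= threshold
    has_p300 = region.p300 >= threshold
    if has_me3:
        # Significant trimethylation argues against an enhancer call.
        return "promoter-like"
    if has_me1 or has_p300:
        # H3K4Me1 without H3K4Me3, or p300 binding, suggests an enhancer.
        return "candidate enhancer"
    return "unmarked"


# Hypothetical example regions.
regions = [
    RegionSignal("region_A", h3k4me1=5.0, h3k4me3=0.5, p300=3.0),
    RegionSignal("region_B", h3k4me1=4.0, h3k4me3=6.0, p300=1.0),
    RegionSignal("region_C", h3k4me1=0.8, h3k4me3=0.3, p300=0.9),
]
for r in regions:
    print(r.name, classify_region(r))
```

In practice, additional chromatin modifications would be needed to separate active, inactive and poised enhancers, and a distal call of this kind says nothing about which promoter the element targets.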
Functional genomics has been crucially important for the study of non-coding changes in the human genome (Sholtis and Noonan, 2010). First, it provides an alternative to simply relying on evolutionary conservation to identify regulatory regions and pinpoint changes that might affect them, which is important given the dynamic turnover of regulatory DNA (Kunarso et al., 2010; Rands et al., 2014). Functional genomics profiles can be computationally integrated to identify different types of regulatory sequences (promoters, enhancers, silencers and insulators; Riethoven, 2010) that function in particular cell types and developmental stages (Kellis et al., 2014). This approach provided strong support for the hypothesis that non-coding HARs function as gene regulatory elements (Hubisz and Pollard, 2014). A second use of functional genomics data is to generate information about the tissue- and stage-specificity of human-specific non-coding sequences. Comparing chromatin states, transcription factor binding profiles and gene expression across multiple cell types predicts that one-third or more of HARs are gene regulatory enhancers active in various embryonic tissues (Capra et al., 2013). Such predictions can help identify the phenotypes in which a human-specific non-coding mutation is involved (Trynka et al., 2013).
Functional genomics also provides a means to probe human and non-human cells directly for differences in regulatory regions and gene expression (Nowick et al., 2009; Somel et al., 2011), which can then be traced back to genetic determinants. For example, combined analysis of human and chimpanzee gene expression data and sequences suggested that pain perception and nociception may have changed in humans through differential regulation of opioid signaling (Cruz-Gordillo et al., 2010). Comparative epigenetic profiling of human, rhesus macaque and mouse corticogenesis revealed promoters and enhancers that have gained activity during human evolution (Reilly et al., 2015), although the sequence differences driving these changes are yet to be identified. Similar investigations have compared patterns of open chromatin (Vierstra et al., 2014a), associated with gene regulation, across tissues in various humans and other mammals. However, most of these studies do not include chimpanzee tissues, making it impossible to pinpoint which of the identified differences occurred after the human-chimpanzee divergence. Moreover, although these studies provide a valuable catalog of differences between humans and other mammals, more work is still required to understand how these differences actually impact on human-specific phenotypes.
The power and limitations of transgenic model organisms
Efforts to characterize human-specific genes and non-coding sequences functionally have primarily utilized mouse or zebrafish models and low-throughput transgenic approaches. For example, humanization of the Foxp2 gene in mice showed that the human genotype leads to changes in learning (Schreiweis et al., 2014), behavior, vocalizations and brain dopamine concentrations, suggesting alterations to cortico-basal ganglia circuits (Enard et al., 2009) (Box 3). Mouse models also play a large role in studies of the functional effects of human-specific gene duplications, including stimulation of mitosis in neocortical progenitors by ARHGAP11B (Florio et al., 2015) (Table 1) and antagonism of SRGAP2A function in neuronal spine maturation by SRGAP2C (Charrier et al., 2012) (Box 1).
Transgenic animals can also be used to test non-coding sequences hypothesized to play a role in human evolution. Reporter gene assays allow candidate enhancers to be validated in mouse and zebrafish embryos with transient transfections or stable lines (for mouse examples, see Fig. 2). Enhancer activity has been demonstrated for two conserved non-coding sequences deleted in humans – an androgen receptor enhancer (see Box 2) and a forebrain enhancer upstream of GADD45G (McLean et al., 2011). Multiple HARs have also been tested in this manner: HAR2/HANCS1 (which drives expression in the limb, pharyngeal arches, ear and eye; Prabhakar et al., 2008); a cluster of 14 HARs near the gene encoding the transcription factor NPAS3 (11 of which drive expression in the nervous system; Kamm et al., 2013a); 23 HARs tested by the VISTA Enhancer Browser project (17 in the nervous system, three in limb, two in heart, eight in other tissues; Visel et al., 2007); 29 HARs with epigenetic signatures of active developmental enhancers (20 in the nervous system, eight in limb, four in heart, eight in other tissues; Capra et al., 2013); and HARE5/ANC516 (Bird et al., 2007), a non-coding region located upstream of the Wnt receptor frizzled 8 gene (FZD8) that is apparently neocortex specific (Boyd et al., 2015). Supporting the hypothesis that human-specific mutations in HARs may have altered their developmental regulatory functions, several show expression differences between reporter constructs carrying the chimpanzee and human HAR sequences (Boyd et al., 2015; Capra et al., 2013; Kamm et al., 2013a,b; Prabhakar et al., 2008). These include human gains of enhancer activity for NPAS3-associated 2xHAR.142 in forebrain at embryonic day (E) 12.5 (Kamm et al., 2013a) and for HAR2/HANCS1 at the base of the limb bud at E11.5 (Prabhakar et al., 2008), which may have resulted from the destruction of a repressor-binding site (Sumiyama and Saitou, 2011). 
By testing human-specific non-coding sequences for function and then comparing human and chimpanzee sequences in model organism enhancer assays, these studies are adding valuable information about the possible functional impact of human-specific DNA changes.
However, further demonstrating the role of HARs in human evolution will require additional studies of molecular and organismal phenotypes. Boyd et al. (2015) took one step in this direction in their functional study of HARE5/ANC516 (Table 1). They showed that the human enhancer is active earlier in forebrain development than its chimpanzee ortholog, and then generated transgenic mice expressing the myc-tagged mouse Fzd8 coding region under the control of either the human or the chimpanzee HARE5 enhancer and analyzed them comparatively. They found that human-HARE5-Fzd8 mice display faster progenitor cell cycles in the developing brain and increased neocortical size compared with mice in which Fzd8 is controlled by chimpanzee HARE5 or with wild-type mice (Boyd et al., 2015). However, this study also highlights several of the challenges of model organism transgenic approaches. First, as the authors used random-insertion transgenic mice and not locus-directed insertions of transgenes (Tasic et al., 2011) or knock-in strategies, we do not know whether the observed phenotype results from differences between the human and chimpanzee HARE5 sequences or from differences in the number of transgene insertions controlling Fzd8. In addition, overexpression phenotypes are always challenging to interpret. Finally, studying human and chimpanzee regulatory sequences or genes in mice is not guaranteed to recapitulate their function in primates. There are important differences between embryonic development in humans and in model organisms, including mice (Rossant, 2015), which ultimately limit these studies to investigations of small pieces of the human genome out of context.
The era of human development in a dish
Until recently, it was impossible to do genetics or apply functional genomics techniques in developmentally relevant contexts in humans, chimpanzees and other primates. Comparative studies at the molecular and cellular level, therefore, have been mainly based on analysis of preserved tissues from different adult organs (frequently postmortem) or immortalized cell lines (Romero et al., 2012; Schmidt et al., 2010; Zhou et al., 2014). Human embryonic tissues and cells have also occasionally been utilized, for example in the characterization of HAR1 (Pollard et al., 2006,b) and SRGAP2 (Charrier et al., 2012; Dennis et al., 2012), and for comparative epigenetic profiling (Reilly et al., 2015). However, the chimpanzee samples necessary for determining whether observed differences are human specific are not available.
Induced pluripotent stem cells (iPSCs) (Takahashi et al., 2007) and techniques allowing their in vitro differentiation into various cell lines and tissues (reviewed by Karagiannis and Yamanaka, 2014; Yu et al., 2014) make it feasible to study and manipulate developmental pathways in human and non-human primate cells (Marchetto et al., 2013; Wunderlich et al., 2014). This is particularly important for chimpanzees and other apes, for which we do not have the same embryonic stem cell resources as we have for humans. Various human and non-human primate cell types derived from pluripotent lines have the potential to provide an unprecedented view on how primate development unfolds at the molecular, cellular and tissue level. Initial efforts in this direction compared gene expression (Marchetto et al., 2013) and DNA methylation (Romero et al., 2015) in iPSCs from multiple individuals of different species, revealing rather limited differences between humans and non-human primates in pluripotent cells. This approach will become more relevant to human evolution and development as iPSCs are differentiated into many cell types and additional functional genomics assays are applied across developmental time courses in multiple species (Fig. 3).
In recent years, there have been dramatic advances in our ability to recapitulate organ growth in vitro, with culture techniques moving away from monolayers of cells towards 3D cultures and organoids, including those of neural tissue, where many human-specific genetic variants appear to be most relevant (Lancaster and Knoblich, 2014; Lancaster et al., 2013; Huch and Koo, 2015). Human cell and organoid cultures are rapidly becoming central tools for understanding human development and disease, and we predict that they will soon also transform human evolutionary studies by enabling direct comparisons of the regulatory networks that control tissue-specific developmental programs, either across species or between genotypes engineered onto an isogenic human or chimpanzee background (see below). These systems will probably allow researchers to compare human and chimpanzee tissues directly as they develop.
Future prospects
Although it is still early days for ‘human development in a dish’, the emerging techniques discussed above, both in the cell culture and the genomics fields, have enormous potential. As comparative developmental studies using pluripotent cell line-derived cells and tissues become routine, it will be exciting to see these approaches applied to human and non-human primates in an effort to reconstruct the genetic history of our species and connect uniquely human genotypes to phenotypes. Other promising approaches include proteomics, metabolomics and lipidomics, which have already been leveraged to compare adult tissues and cell lines across primates (Blekhman et al., 2014; Bozek et al., 2014, 2015; Khan et al., 2013). Another new method that is particularly powerful for decoding regulatory networks is the massively parallel reporter assay (MPRA), a high-throughput version of the transgenic enhancer assay described above. MPRAs are enabled by high-fidelity DNA synthesis, which is used to make libraries of thousands of reporter constructs, and by low-cost RNA sequencing, which is used to assay enhancer activity through unique transcribed sequences associated with each candidate enhancer (Melnikov et al., 2012; Patwardhan et al., 2012). MPRA libraries can currently only be introduced into cell lines or, by tail-vein injection, into an adult mouse, and hence this technique has not yet been applied to developing embryos. Nonetheless, MPRAs promise to make screening human-specific non-coding sequences for regulatory function (e.g. in organoids or developmental cell types) much less laborious.
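As a rough illustration of how MPRA read counts could be turned into enhancer activity estimates, the Python sketch below scores each unique transcribed sequence (barcode) by the log2 ratio of its share of RNA reads to its share of reads from the input DNA library, and then averages the barcodes belonging to each candidate enhancer. This is a minimal sketch under stated assumptions, not the analysis used in the cited studies: the function name, the pseudocount, the normalization and the toy counts are all hypothetical.

```python
import math
from collections import defaultdict


def mpra_activity(rna_counts, dna_counts, barcode_to_enhancer, pseudocount=1.0):
    """Return the mean log2(RNA/DNA) score for each candidate enhancer.

    rna_counts, dna_counts: dicts mapping barcode -> read count.
    barcode_to_enhancer: dict mapping barcode -> candidate enhancer name.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    n = len(barcode_to_enhancer)
    rna_total = sum(rna_counts.get(bc, 0) for bc in barcode_to_enhancer)
    dna_total = sum(dna_counts.get(bc, 0) for bc in barcode_to_enhancer)

    scores = defaultdict(list)
    for bc, enhancer in barcode_to_enhancer.items():
        # Convert counts to library fractions so sequencing depth cancels out.
        rna_frac = (rna_counts.get(bc, 0) + pseudocount) / (rna_total + pseudocount * n)
        dna_frac = (dna_counts.get(bc, 0) + pseudocount) / (dna_total + pseudocount * n)
        scores[enhancer].append(math.log2(rna_frac / dna_frac))

    # Average over the barcodes that tag the same candidate enhancer.
    return {enh: sum(vals) / len(vals) for enh, vals in scores.items()}


if __name__ == "__main__":
    # Toy data: two barcodes per allele of one hypothetical element.
    mapping = {"b1": "human_allele", "b2": "human_allele",
               "b3": "chimp_allele", "b4": "chimp_allele"}
    rna = {"b1": 80, "b2": 120, "b3": 20, "b4": 30}
    dna = {"b1": 50, "b2": 50, "b3": 50, "b4": 50}
    for name, score in mpra_activity(rna, dna, mapping).items():
        print(name, round(score, 2))
```

Tagging each candidate with several barcodes, as in the toy data, allows barcode-specific noise to be averaged out. A real analysis would also need replicates and a statistical test before calling a difference between human and chimpanzee alleles.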
Potentially the most transformative breakthrough in human evolutionary genomics will be genome editing [i.e. transcription activator-like effector nucleases (TALENs), clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9] (Gilbert et al., 2014; Zheng et al., 2014). Manipulating the genome to introduce individual human mutations into non-human cells or to knock out single human-specific elements in their native context, and documenting resulting changes in phenotype, is the key to demonstrating causality (Fig. 3). Moreover, such analyses can in principle be carried out in the context of different specified cell types using pluripotent stem cell differentiation protocols, at different stages of development, and in a comparative manner across multiple species. Coupled with other techniques, genome editing can, in principle, help researchers to understand the importance of every nucleotide in the human genome by generating and testing different genotypes on an isogenic human or non-human primate background. Editing is also accelerating the generation of model organisms (e.g. mouse, zebrafish) carrying humanized genome sequences to study in vivo the effects of genetic changes on development and behavior.
Conclusions
To understand fully the differences between humans and chimpanzees, as well as other mammals, we need to understand how genetic differences impact on the molecular and cellular mechanisms of development, leading to the morphological differences that separate us. Comparative and population genomics have enabled the identification of thousands of genes and non-coding sequences that are uniquely human. The availability of data about the cell types in which these human-specific genotypes might be relevant allows us to develop testable hypotheses about their roles in human evolution. However, testing these hypotheses has proven quite challenging, primarily because we cannot perform genetic manipulations to demonstrate causality or assay phenotypes in humans or great apes, and because appropriate human and primate tissues for other studies are scarce. A handful of genes and regulatory enhancers that were lost, duplicated or changed in humans have been assayed for expression differences or other molecular and organismal phenotypes, primarily in transgenic model organisms. The majority of these are active in the developing brain, although some function in limb, eye, heart and other tissues. Emerging techniques for expanding such studies to human and non-human primate organoids and developmental cell lines, coupled with genomic techniques to increase their throughput, may be the key to bridging the gap between uniquely human DNA and the developmental events that produce the traits that are unique to our species. We conclude that the field is now poised to decipher the developmental pathways that translate human genetic changes into human traits.
Funding
L.F.F. was supported by grants from the Agencia Nacional de Promoción Científica y Tecnológica. L.F.F. received an External Fellowship for Young Investigators from the Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET-Argentina). K.S.P. was supported by the Gladstone Institutes and by a gift from the San Simeon Fund.
References
Competing interests
The authors declare no competing or financial interests.