Over the past few years, interest in chromatin and its evolution has grown. To further advance these interests, we organized a workshop with the support of The Company of Biologists to debate the current state of knowledge regarding the origin and evolution of chromatin. This workshop led to prospective views on the development of a new field of research that we term ‘EvoChromo’. In this short Spotlight article, we define the breadth and expected impact of this new area of scientific inquiry on our understanding of both chromatin and evolution.
Chromatin is a complex of DNA, RNA and protein that is found ubiquitously in both eukaryotes and prokaryotes (Alva and Lupas, 2018; Talbert et al., 2019). The evolution of histones (the primary components of chromatin) and their association into nucleosomes generates eukaryotic chromatin with a more diverse composition and a more complex organization than that found in prokaryotes. In addition to its primary function in packaging the genome in eukaryotes, chromatin has acquired regulatory roles, including the control of gene expression, cellular differentiation, cellular metabolism and responses to environmental stimuli. Chromatin also plays key roles in mediating DNA repair and recombination, defining regional genomic identity, repressing selfish genetic elements and mediating chromosome segregation. Recent explorations into the biology of non-model organisms have revealed an unexpectedly diverse array of chromatin components and mechanisms of chromatin-based regulation. With the rapid emergence of newly sequenced genomes, investigation of the evolutionary dynamics of chromatin has emerged as a powerful and promising interdisciplinary field.
Fueled by insights from a recent workshop supported by The Company of Biologists, ‘Evo-chromo: towards an integrative approach to chromatin dynamics across eukaryotes’, held in November 2018, we discuss here the evolutionary dynamics of chromatin composition and its impact on genome function and evolution. We begin by proposing a model for the ancestral function of chromatin in an early eukaryotic ancestor. We then summarize and discuss recent reports that provide clues as to how chromatin has shaped evolution, and vice versa. We conclude with an overview of pioneering experimental and bioinformatic technologies that will likely facilitate the study of chromatin in an evolutionary context, and with a summary of the current challenges and gaps in the field.
Ancestral functions of chromatin
The fundamental repeating unit of eukaryotic chromatin is the nucleosome, which comprises DNA wrapped around a histone core complex (Luger et al., 1997). In all organisms, histones share a conserved histone fold domain, composed of three alpha helices, that mediates both interactions with DNA and interactions between histone dimers in the context of a nucleosome. These histone fold dimers are the ancient building blocks from which tetramers, hexamers and octamers can be built.
Histones have deep evolutionary roots and are found in all eukaryotes studied to date. Furthermore, histone fold-containing proteins have been found in all three major branches of archaea (Bhattacharyya et al., 2018; Henneman et al., 2018). This deep conservation indicates that histones are likely ancestral to archaea and eukaryotes. Although histones are not present in bacteria, several bacterial proteins contain one or two histone folds, arguing that an association between DNA and histone fold-containing proteins likely evolved in the last universal common ancestor (Alva and Lupas, 2018). Histones have also been identified in a group of extant giant viruses suspected to represent ancestral viruses active in a histone-containing proto-eukaryotic ancestor (Erives, 2017).
In extant archaeal species, histones mainly fulfil structural functions in maintaining genome integrity (Bhattacharyya et al., 2018; Henneman et al., 2018). This is consistent with the fact that the majority of archaeal histones do not have histone tails (Bhattacharyya et al., 2018; Henneman et al., 2018). In eukaryotes, histone tails are subject to post-translational modifications and are associated with regulatory roles. Still, given the sparse data on the role of archaeal histones and their diversity, a regulatory function for archaeal chromatin analogous to that of extant eukaryotic chromatin cannot be ruled out, and has indeed been demonstrated in at least one instance (Mattiroli et al., 2017). Clarifying these relationships must be a priority for future studies on the phenotypic effects of chromatin disruption in archaea.
While the role of histones in eukaryotes has mostly been investigated in the context of chromatin, recent studies of histone proteins outside of the nucleosome context have suggested a new hypothesis for additional, and perhaps ancestral, functions for histones. Early eukaryotic ancestors faced drastic changes to their physical environment, which may be important considerations in understanding the ancestral role of histones. Although oxygen levels were low when eukaryotes evolved (Lyons et al., 2014), once oxygen levels rose, more and more metals became oxidized at various time points, and organisms needed to reduce these into bio-usable forms. A recent report proposed that a cysteine-histidine configuration within the histone 3:histone 4 (H3:H4) tetramer, the ancestral form of nucleosomal histones, is capable of reducing copper to its bio-usable form for use by mitochondria or enzymes that use copper as a co-factor (Attar et al., 2018 preprint). Based on this, it was proposed that an association between DNA and histones might have evolved in order to protect DNA from reactive oxygen through the reducing activity of histones. The resulting nucleoprotein complex may have then acquired a novel role in packaging the genome and protecting its integrity from biotic and abiotic injuries.
It is also likely that the genomes of early eukaryotes and archaea were not as complex as those of extant eukaryotes, precluding a requirement for complex transcriptional regulation. Therefore, the regulatory function of chromatin may have evolved alongside increases in genome size and complexity. Hence, chromatin components may have diversified to fulfil more specialized roles in gene regulation, maintenance of genome integrity, DNA repair and inheritance, all of which we can observe in extant organisms.
Mechanisms of chromatin evolution
How did chromatin diversify? Gene duplications and gene losses are a recurrent theme in evolution and may have played a role in changing the repertoire of chromatin components. The repertoire of histone isoforms has diversified over evolutionary time, leading to functional specialization of histone variant classes (Talbert et al., 2012). Strikingly, in several cases this has resulted in the convergent evolution of similar amino acid motifs that are associated with specific regulatory roles. For example, the ‘SQ motif’ found in the tails of H2A.X, a histone variant involved in DNA repair, has convergently evolved in other H2A variants (Malik and Henikoff, 2003; Talbert and Henikoff, 2010). In Drosophila, the SQ motif is found in the tail of H2A.Z, whereas in Arabidopsis it is found in the tail of some members of the H2A.W family. In both organisms, the motif becomes phosphorylated upon DNA damage, like H2A.X, implying a conserved function for this motif (Friesner et al., 2005; Lorković et al., 2017; Madigan et al., 2002). As a further example of convergent evolution, the heterochromatin-associated histone variants H2A.W and macroH2A recruited the KSPK motif in their C-terminal tail in plants and animals, respectively (Kawashima et al., 2015). However, the series of molecular steps leading to the convergent evolution of identical functional motifs in different H2A histone fold domains remains open to speculation.
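As a rough illustration of how a tail motif of this kind could be screened for across candidate H2A variants, the following Python sketch flags sequences that carry an SQ dipeptide near their C terminus. The function name, the 10-residue window and the two toy sequences are hypothetical placeholders chosen for illustration; they are not data or parameters from the studies cited above.

```python
def has_c_terminal_sq(sequence: str, window: int = 10) -> bool:
    """Return True if an 'SQ' dipeptide occurs within the last `window` residues.

    The window size is an illustrative assumption, not a published threshold.
    """
    tail = sequence.upper()[-window:]
    return "SQ" in tail


# Hypothetical toy sequences (not real histone sequences).
toy_variants = {
    "variant_with_motif": "MSGRGKAKTAVLLPKKTSATVGPKAPSGGKKATQASQEY",
    "variant_without_motif": "MSGRGKAKTAVLLPKKTESHHKAKGK",
}

# Map each toy variant name to whether its C-terminal tail contains SQ.
flagged = {name: has_c_terminal_sq(seq) for name, seq in toy_variants.items()}
```

A presence/absence screen of this sort only identifies candidates; whether a given motif is actually phosphorylated upon DNA damage has to be established experimentally.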
Rapid protein sequence evolution, which is frequently associated with chromatin components participating in genetic conflicts (discussed below), can also act as a driving force to diversify chromatin. For example, whereas most core histones (H2A, H2B, H3 and H4) are under purifying selection and are conserved across large evolutionary distances, the centromeric H3 variant (CenH3) has undergone rapid evolution, leading to incompatibilities between CenH3 variants even between closely related species (Maheshwari et al., 2015; Malik and Henikoff, 2001).
Novel chromatin components with specialized functions can also evolve from existing components. An example of rapid gene turnover is the evolution of heterochromatin protein 1 (HP1) proteins in flies, where prolific gene gains and losses might be driven in part by karyotypic changes in these organisms (Levine et al., 2012). Orthologs of HP1 can be identified by the presence of a chromodomain and a chromoshadow domain that bind to methylated H3K9 or H3K27 (Berke and Snel, 2015). Importantly, orthologs of HP1 acquired different domains and distinct targets in plants and animals (Wang et al., 2018).
Another intriguing example of convergent evolution is the multiple independent origins of protamine-like proteins, which are found in animal and plant sperm and which evolved from linker histone H1 (Eirín-López and Ausió, 2009; Kasinsky et al., 2011). In the bryophyte Marchantia, cleavage products of a histone H1 precursor resulted in three protamine-like proteins (Higo et al., 2016; Kasinsky et al., 2014). In some tunicates, by contrast, protamine-like proteins are derived from frameshift mutations within H1 that generate arginine-rich proteins from their lysine-rich precursors (Eirín-López and Ausió, 2009). As arginine is better able to complex water, these arginine-rich proteins allow sperm chromatin to be compacted more tightly.
Taken together, these findings highlight that numerous mechanisms likely led to chromatin diversification, raising the question of what evolutionary forces drive diversification and what their effects are on the fitness/phenotype of an organism.
What drives chromatin diversification?
The increasing size and complexity of genomes might act as major drivers of chromatin diversification, which in turn helps to define different functional regions. Genomes have expanded primarily due to changes in ploidy and the massive expansion of repetitive elements, leading to an increase in gene and transposable element (TE) families. Given these changes, it is conceivable that chromatin also diversified to regulate different parts of the genome according to their function (e.g. endogenous gene regulation versus the regulation of centromeres and telomeres), as well as to mitigate potential threats (e.g. via the suppression of TEs and ectopic recombination between repetitive elements). An intriguing example of co-evolution between genome structure and chromatin composition comes from comparative studies of histone H2A. Genome size correlates with the number of positively charged amino acids in the tail of H2A, a property that enables more efficient chromatin compaction (Macadangdang et al., 2014). More broadly, genome size also correlates with the diversification of H2A variants (Macadangdang et al., 2014). Together, this suggests that histone H2A variants may have evolved to facilitate genomic compaction and stability, in order to address the challenges of an expanding genome. Furthermore, it has been proposed that histone variants form a functional topographic code (i.e. with domains enriched in specific variants corresponding to transposons, promoters, enhancers, gene transcriptional activity, etc.) (Hake and Allis, 2006; Millar, 2013). This idea receives support from studies in plants, which deposit nucleosomes containing only a single type of H2A variant in different genomic contexts, with each variant having distinct biophysical properties (Osakabe et al., 2018; Yelagandula et al., 2014). However, it remains to be shown whether this code is sufficient to define the functions or properties of the domains marked by each type of variant.
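The tail property discussed above, the number of positively charged amino acids, is simple to quantify from sequence. The Python sketch below shows one way this could be done, under the assumption that only lysine (K) and arginine (R) are counted as positively charged. The function name and the toy tail sequences are hypothetical and are not taken from Macadangdang et al. (2014).

```python
# Assumption: only lysine (K) and arginine (R) are treated as positively charged.
POSITIVE_RESIDUES = {"K", "R"}


def positive_charge_count(tail: str) -> int:
    """Count lysine and arginine residues in a histone tail sequence."""
    return sum(1 for residue in tail.upper() if residue in POSITIVE_RESIDUES)


# Hypothetical toy tails (not real H2A sequences).
toy_tails = {
    "basic_tail": "SGRGKQGGKARAKAKTR",
    "neutral_tail": "SGAGEQGGTA",
    "short_basic_tail": "AKSPKK",
}

# Map each toy tail name to its count of K and R residues.
counts = {name: positive_charge_count(seq) for name, seq in toy_tails.items()}
```

Counts obtained in this way for real H2A tails could then be compared with genome size across species, ideally with a method that accounts for phylogenetic relatedness.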
Concomitant with genome complexity and increased transposon load, genomic conflict is another major driver of chromatin evolution. Conflicts arising between centromeres elegantly illustrate this point (Henikoff et al., 2001; Kursel and Malik, 2018). The rapid evolution of centromere identity proteins and centromeric DNA was proposed to arise from female meiotic drive. In this scenario, which applies to many groups of eukaryotes, only one of the four female meiotic products develops into an egg, whereas all four male meiotic products are used to produce sperm. In females, this imbalance creates competition between centromeres, which can distort chromosome segregation in a way that biases their own inclusion in the final gamete. In contrast, an imbalance in centromere strength in male meiosis could lead to increased non-disjunction and meiotic stalling, resulting in either reduced fertility or sterility. According to the centromere drive model, it is the rapid evolution of centromere-associated proteins that restores the meiotic parity disrupted by selfish centromeric DNA (Henikoff et al., 2001; Kursel and Malik, 2018). In addition to rapid protein sequence evolution, duplication and specialization of centromeric proteins could act as evolutionary strategies that allow one paralog to act as a drive suppressor in the male germline, while the other functions as the canonical centromeric component (Kursel and Malik, 2017).
Conflicts have also been hypothesized or demonstrated in the case of telomeres, TEs and sperm. In each case, genetic conflict has generated an arms race between intra-genomic elements. For example, TE-driven conflict has been shown in Arabidopsis (Hosaka et al., 2017), Drosophila (Andersen et al., 2017) and mammals (Molaro and Malik, 2016). In Arabidopsis, a TE-encoded anti-silencing protein evolved to hypomethylate DNA on TEs from distinct groups, allowing the propagation of TEs while causing minimal host damage (Hosaka et al., 2017). In Drosophila gonads, Moonshiner, a paralog of a basal transcription factor IIA (TFIIA) subunit, causes transcription of PIWI-interacting RNA (piRNA) clusters from within TE-rich loci (Andersen et al., 2017). In mammals, transcription factor-binding sites and RNA-binding proteins are selected to control specific classes of TEs (Attig et al., 2016; Bulut-Karslioglu et al., 2014; Ecco et al., 2017). Selfish elements or parts of these can, in turn, also shape the chromatin landscape by adopting host functions. In mammals, for example, endogenous retroviruses or long terminal repeats can be co-opted as promoters and enhancers associated with their specific histone modifications (Chuong et al., 2013; Friedli and Trono, 2015). These examples highlight the multiple and diverse selfish elements that can shape the evolution of chromatin components and their regulation.
Extrinsic abiotic factors, such as temperature, metabolic fluxes and exposure to DNA-damaging agents, can also select for changes in chromatin. Indeed, chromatin-based packaging is known to be sensitive to exposure to abiotic stress (Probst and Mittelsten Scheid, 2015). Therefore, the evolution of particular biophysical features in chromatin components might help organisms to cope with different extrinsic conditions. Extrinsic biotic factors such as sperm competition may also shape chromatin. In humans, protamines have been shown to play a role in sperm head morphology, which in turn is an important determinant of male fertility (Belokopytova et al., 1993; Cree et al., 2011). Sperm competition may therefore drive the sequence evolution and whole-gene turnover observed in protamine and protamine-like proteins (Lüke et al., 2016; Martin-Coello et al., 2009).
Together, these intrinsic and extrinsic factors contribute to driving the diversification of chromatin composition and its regulation. Identifying and better characterizing such mechanisms of chromatin diversification will provide insight into the origin of novel and complex features of chromatin across eukaryotic evolution. Although genetic work under laboratory conditions is invaluable, it will be necessary to directly assess the impact of the natural environment on chromatin evolution using population genetics. For example, in Arabidopsis, genome-wide association studies have identified loci controlling DNA methylation in the context of adaptation (Dubin et al., 2015), validating the importance of this type of strategy.
How does chromatin affect the evolution of genomes?
Chromatin components are not only the target of selective constraints but can also impact the evolutionary dynamics of the genome. For example, as the whole genome is under the regulation of multiple levels of chromatin factors (i.e. histones are incorporated genome-wide), mutations affecting chromatin regulators have the potential to cause a plethora of phenotypic changes. A change in protein sequence, the gain of a new component or the loss of an existing chromatin component could therefore lead to large structural changes in the genome. One example of this is the recurrent loss of centromeric histone H3 variants, which has been shown to influence centromere organization in insects (Drinnenberg et al., 2014). A second relevant observation is the convergent evolution of the SQ motif in different histones across multiple organisms (Malik and Henikoff, 2003; Talbert and Henikoff, 2010), which may influence local DNA repair efficiencies across the genome, and hence influence mutation rate.
In addition, epigenetic changes occur at a faster rate than genetic changes (i.e. before their consolidation as a genetic DNA sequence change) (Prakash and Fournier, 2018). Furthermore, many of the genomic regions that are frequently involved in rapid chromatin evolution (e.g. centromeres and telomeres) function as a chromatin state per se, epigenetically defining a stretch of DNA that does not contain genes, which might free these regions of additional selective constraints. This may lead to accumulation of selfish elements that engage with host factors and shape their rapid evolution.
Potential genome-wide consequences associated with changes in chromatin components would likely cause significant and abrupt changes in phenotype. This could be particularly important in cases of rapid environmental change, as the rate of organismal evolution would correspondingly accelerate. This is illustrated by repeated losses of the same type of chromatin regulation across independent lineages, which likely caused large-scale, rapid phenotype evolution. These observations raise the issue of whether common selective pressures act across the tree of life to shape particular chromatin features. An intriguing example is the loss of gene body methylation, or of DNA methylation altogether, in multiple eukaryotic lineages (Bewick and Schmitz, 2017; Muyle and Gaut, 2018; Takuno et al., 2016). What are the connections between species that have lost this type of epigenetic mark? One potential explanation comes from a recent study showing that DNA methyltransferase activity can be harmful as it can introduce toxic lesions into DNA, and that this toxicity may therefore have promoted the loss of DNA methylation in multiple lineages (Rošić et al., 2018). Similarly, the independent occurrence of holocentric architectures in multiple eukaryotic lineages (Melters et al., 2012) might have been driven by a similar selective advantage associated with this type of centromere organization. These types of questions are becoming increasingly tractable as phylogenomic data accumulate alongside a growing number of model species.
From a more locus-specific perspective, multiple lines of evidence, from biochemistry to comparative genomics, indicate that chromatin influences the local mutation rate (Chen et al., 2012; Prendergast and Semple, 2011; Tolstorukov et al., 2011; Warnecke et al., 2008). The formation of nucleosomes, the presence of specific histone marks and the binding of transcription factors have been found to affect the incidence of DNA lesions, as well as DNA repair efficacy, leading to chromatin-dependent variability in mutation rates across the genome, which can also be observed as local differences in variation at the population level (Makova and Hardison, 2015). Along similar lines, chromatin affects the insertion probabilities of TEs, thus influencing how selfish genetic elements spread across the genome and also perhaps how genes spread horizontally across genomes (Huisinga et al., 2016; Lesage and Todeschini, 2005; Quadrana et al., 2018 preprint). In a likely related manner, the nature of chromatin strongly affects recombination landscapes (Székvölgyi et al., 2015). Therefore, chromatin affects the mutation rate, the raw material of evolution, over a variety of scales.
Given its impact on the composition and organization of the genome, chromatin is expected to affect evolvability. As nucleosomes mask access to transcription and other regulatory factors, mutations that arise within chromatin-silenced regions might frequently have cryptic effects, exposed only when the local chromatin landscape is disturbed (Lehner et al., 2006; Tirosh et al., 2010). However, chromatin-mediated evolvability is seemingly highly dependent on genetic backgrounds (Richardson et al., 2013).
Finally, chromatin components can also affect evolution independently of their DNA packaging function. For example, it is conceivable that the abundance of methyl, acetyl and ubiquitin groups, common forms of histone modification, impacts metabolic and protein homeostatic pathways, and acts as a metabolic capacitor affecting the evolvability of metabolic fluxes. However, although attractive, there is currently little concrete support for these hypotheses, as answering such questions requires experimental set-ups or datasets that are not available to date.
Opportunities and technical limitations for the study of chromatin evolution
The expansion of single cell technologies and new strategies to study chromatin accessibility, structure and composition opens up major opportunities for studying the evolution of chromatin (Fraser et al., 2015). In particular, techniques like ATACseq (Buenrostro et al., 2013) and CUT&RUN (Skene and Henikoff, 2017) are broadly applicable to non-model species with otherwise limited genetic resources. New quantitative technologies, such as Mint-ChIP (van Galen et al., 2016), promise to facilitate comparative studies across species. The application of these technologies in more species will provide a broader phylogenetic context to studies of chromatin evolution. In particular, the current growth of single cell techniques to analyze chromatin [e.g. single cell ATACseq (Buenrostro et al., 2015) and chromatin profiling by chromatin integration labelling (ChIL; Harada et al., 2018)] opens up new avenues for investigating variation in chromatin landscapes, not just across species but also between cells in a single species and even within tissue/cell types. Beyond the linear organization of the genome, the 3D structure of chromatin impacts transcription and can now be routinely assessed using HiC (Lieberman-Aiden et al., 2009). This technique, broadly validated by the orthogonal approach of genome architecture mapping through sequencing thin slices of nuclei (Beagrie et al., 2017), enables contacts between distant regions of chromatin to be mapped. These include enhancer-promoter contacts that are key transcriptional controls during development. The further standardization of 3D mapping (Marti-Renom et al., 2018) will render these techniques applicable to a broad array of species.
Given the expansion and potentially broad applicability of tools in this space, it seems that the main limits on our progress in understanding the evolution of chromatin arise primarily from challenges associated with resources relating to individual species. For example, a full chromosomal assembly of the genome is required in order to identify the presence or absence of components of the chromatin machinery, and therefore the lack of high-quality genome assemblies in key phylogenetic positions remains an outstanding issue. In fact, there is no doubt that there are highly informative groups of species that are completely unstudied owing to a lack of available genomics resources. However, these phylogenetic ‘missing links’ will likely be resolved over time, given the ever-increasing ease with which genomes can be sequenced and assembled.
Likewise, establishing new experimentally tractable model organisms represents a major bottleneck. The advent of CRISPR-based methods goes some way to alleviating this problem, by providing a tool for directly manipulating chromatin factors and genomic regions of choice. Yet addressing evolutionary questions in the organismal context still remains challenging. Some species are difficult to maintain under laboratory conditions, whereas others are not amenable to gene targeting methods. Some species-specific problems can be overcome by studying chromatin in vitro. However, there are currently limitations to reconstructing and studying chromatin dynamics and regulation in vitro, owing to the complexity of the molecular components involved and the dynamic nature of chromatin-related processes. Another aspect that must also be considered is the challenge faced by a single research group in studying biological processes at the molecular, organismal and evolutionary scales. Therefore, it will not only be advantageous but essential to establish more connections between individual laboratories with unique expertise and backgrounds in order to foster interdisciplinary collaborations and the exchange of knowledge and expertise.
Conclusions and perspectives
Evolutionary perspectives have fundamentally enriched our conceptual and mechanistic understanding of developmental biology (Brakefield, 2011; Moczek et al., 2015). Likewise, an evolutionary approach to studying cell biology has also been advocated (Lynch et al., 2014). From an evolutionary perspective, chromatin components occupy an idiosyncratic status. They are the products of evolution but they also have a broad impact on evolutionary mechanisms, which can take place at different scales. Chromatin is the substrate that packages genes and, as such, chromatin regulates the mode of action of evolutionary mechanisms and impacts evolvability. Across generations, chromatin-based mechanisms such as imprinting and silencing affect the long-term modulation of genome expression and extend genetic rules beyond Mendelian laws, thus having the potential to modulate selective pressures. As we have highlighted here, this emerging field of ‘EvoChromo’ not only extends chromatin biology from an evolutionary perspective but is also expected to unravel new evolutionary mechanisms that are directly influenced by chromatin. New technological advances and increasingly available data from a broader range of organisms will greatly expand the impact of this field in coming decades. We hope that EvoChromo studies will not only provide fresh perspectives on the evolution of living organisms, but will also enable us to use this new knowledge to better understand and respond to rapid changes in our planet's environment.
We thank The Company of Biologists for providing us with the opportunity to organize the first EvoChromo workshop.
K.R.’s research is supported by a project grant from the Schweizerischer Nationalfonds zur Förderung der wissenschaftlichen Forschung (SNF 310013A_149974 to C. Baroux) and post-doctoral fellowships from SCIEX – Scientific Exchange Programme NMS.CH, the COFUND Program PLANT FELLOWS (2010-267243) of the European Union's 7th Framework and from the ‘Forschungskredit’ of the University of Zürich. The travel grant for the meeting was provided by the Julius Klaus Foundation. I.A.D. receives salary support from the Centre national de la recherche scientifique. I.A.D. is supported by Labex DEEP (ANR-11-LABX-0044) part of the IDEX Université de recherche Paris Sciences et Lettres (ANR-10-IDEX-0001-02), the Institut Curie and the European Research Council (CENEVO-758757) and a Gatsby Charitable Foundation studentship (GAT3401) to A.R.B. J.Y.K.’s research was supported by the National Institute of General Medical Sciences of the National Institutes of Health (5 F32 GM116321-02). J.M.G. is funded by a grant to Fabian Rentzsch from Norges forskningsråd and the University of Bergen (251185). P.S. and T.W. are funded by the National Science Foundation (IOS-1542703 to B.S.G.) and by the Medical Research Council. Z.H.H. is funded by a National Institutes of Health training grant (T32GM113854). F.B. is supported by a Fonds zur Förderung der wissenschaftlichen Forschung grant (P28320-B21). P.R.A. is supported by a Novo Nordisk grant (NNF14OC0009189). S.K.K. is supported by a W. M. Keck Foundation award and National Institutes of Health grant (CA178415). H.S.M. is supported by the National Institute of General Medical Sciences of the National Institutes of Health (GM074108) and is an investigator of the Howard Hughes Medical Institute.