Transcriptional enhancers are central to the function and evolution of genes and gene regulation. At the organismal level, enhancers play a crucial role in coordinating tissue- and context-dependent gene expression. At the population level, changes in enhancers are thought to be a major driving force that facilitates evolution of diverse traits. An amazing array of diverse traits seen in insect morphology, physiology and behavior has been the subject of research for centuries. Although enhancer studies in insects outside of Drosophila have been limited, recent advances in functional genomic approaches have begun to make such studies possible in an increasing selection of insect species. Here, instead of comprehensively reviewing currently available technologies for enhancer studies in established model organisms such as Drosophila, we focus on a subset of computational and experimental approaches that are likely applicable to non-Drosophila insects, and discuss the pros and cons of each approach. We discuss the importance of validating enhancer function and evaluate several possible validation methods, such as reporter assays and genome editing. Key points and potential pitfalls when establishing a reporter assay system in non-traditional insect models are also discussed. We close with a discussion of how to advance enhancer studies in insects, both by improving computational approaches and by expanding the genetic toolbox in various insects. Through these discussions, this Review provides a conceptual framework for studying the function and evolution of enhancers in non-traditional insect models.
Understanding how genes are regulated is fundamental to various disciplines of biology. In the field of insect science, molecular mechanisms underlying gene regulation are best studied in the fruit fly Drosophila melanogaster. With a suite of sophisticated genetic tools available in this insect (Hales et al., 2015), scientists have been able to decipher complex interactions among genes and their protein products, revealing comprehensive networks of gene interactions and regulations (i.e. gene regulatory networks, GRNs) for various tissues and contexts.
Two innovations in the last two decades have drastically changed the way we study insects beyond Drosophila. The first is the advancement of next-generation sequencing technology, which allows researchers to gather genomic and transcriptomic information from the insect they study relatively easily and even sometimes prior to detailed ‘wet’ investigation. The second is the application of RNA interference (RNAi), and more recently, CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR associated protein 9) genome editing, in various insects (reviewed in Bellés, 2010; Gilles and Averof, 2014). These gene knockdown/knockout techniques now allow loss-of-function (LOF) analyses in many (albeit not all) insects without the need for creating mutants through traditional means, and often with the capability of controlling the timing of gene disruption. With these critical advances, we can now study various tissues and contexts of non-traditional model insects or even non-model insects at the detailed molecular level (Reardon, 2019).
Although researchers are gradually stepping out from Drosophila to explore molecular mechanisms underlying various intriguing processes found in other insects, the knowledge obtained from Drosophila studies continues to play a critical role in insect science. Researchers often use Drosophila GRNs as a starting point (i.e. Drosophila paradigm) and investigate the function of the genes that are homologous/orthologous to the genes in the Drosophila GRNs through RNAi-based LOF studies, often combined with expression analyses, in their insects. This approach has been very fruitful in gaining new insights into gene function and regulation, as well as into the evolution of GRNs among insects, that are difficult to obtain through studying Drosophila alone (Bellés, 2010).
When discussing GRNs, there are two types of components: trans and cis (Fig. 1A). trans components are transcription factors (TFs) and their upstream regulators that provide instructive cues to cells for patterning, differentiation and various other biological processes. In contrast, cis components are non-coding DNA elements that integrate the upstream trans information and determine the expression of the genes downstream in the GRNs. Enhancers (often also called cis-regulatory elements or cis-regulatory modules) are a class of cis components that play a central role in determining spatial and temporal gene expression (Blackwood and Kadonaga, 1998; Buffry et al., 2016; Cho, 2012; Long et al., 2016; Pennacchio et al., 2013; Rickels and Shilatifard, 2018). As mentioned, most current studies in insects outside of Drosophila utilize RNAi-based LOF analyses (or knocking out coding genes via CRISPR/Cas9) as a central approach. This allows for an investigation of GRNs from the trans point of view (Fig. 1A), by inhibiting the function of trans components and assessing their influence on GRNs. However, although it is at least as important to study GRNs from the cis perspective to gain a comprehensive view of gene regulatory mechanisms, the lack of a reliable method to identify enhancers in non-Drosophila insects has made it difficult to study the function and evolution of cis components beyond Drosophila species.
There are several reasons as to why studying enhancers is so challenging in non-Drosophila insects. First, compared with trans components, cis components, especially enhancers, are extremely labile (Li et al., 2007), which makes identification of enhancers based on sequence conservation challenging even among closely related species (Papatsenko et al., 2006) and nearly impossible among species with divergence time beyond ∼60 million years (Kazemian et al., 2014; Li et al., 2007). This is especially problematic for insects, which underwent an early radiation (winged insects had diversified into at least 10 orders by the early Permian, 300 million years ago; Kukalova-Peck, 1991) and have short generation times, leading to limited non-coding homology beyond the genus or family level. Second, functional validation of enhancers often requires the use of modern genetic and genomic tools, which are currently largely absent from most non-Drosophila insects. Because of these hurdles, investigations into the function and evolution of enhancers have been quite limited in insects outside of Drosophila, despite the clear awareness among researchers that changes in enhancers and other cis-regulatory elements play a crucial role in facilitating evolution and diversification of various traits among insects and other organisms (reviewed in Carroll, 2008).
Typically, an enhancer study consists of two steps: (1) the identification of possible enhancer regions (either focusing on a single gene of interest or genome-wide), and (2) the validation and further downstream functional evaluation of enhancer activity in vivo. Some approaches allow functional validation of enhancer activity in the first step, while others require separate in vivo validation experiments. In this Review, we will first summarize currently available approaches to identify possible enhancer regions in insect genomes. Although many of these approaches are technology and resource intensive (i.e. model system-centered), recent advances in genomics and computational biology have started making some of these approaches more accessible to researchers that use insects other than Drosophila as their model. We will discuss the pros and cons of each of these approaches when applied to non-traditional model insects. We will then turn our attention to in vivo validation of enhancer activity in non-traditional insect models and discuss several possible approaches for enhancer validation, such as reporter assays and CRISPR/Cas9-based genome editing. We will use our recent attempt to establish a cross-species compatible reporter construct as a case study, and discuss some of the key points in establishing a reporter assay system in insects outside of Drosophila. Lastly, we will touch on our current effort to advance enhancer studies in insects, both by improving the computational approach to identify possible enhancer regions in insect genomes and by expanding the genetic toolbox for enhancer studies in various insects.
Experimental approaches to identifying possible enhancer regions in insect genomes
Classic reporter assay
By definition, enhancers are short DNA sequences that act in cis and increase the transcription of a nearby gene regardless of their orientation and the distance from the gene they regulate (Blackwood and Kadonaga, 1998; Pennacchio et al., 2013). The reporter assay takes advantage of this feature and places a candidate enhancer in front of a marker gene that can be easily visualized (i.e. a reporter gene), such as the lacZ gene of Escherichia coli or fluorescent protein genes, along with a core promoter (Fig. 1B). The enhancer activity of this ‘reporter construct’ can then be assayed in vivo by visualizing the expression of the reporter gene in various tissues and contexts. This approach was first used to investigate the regulation of several segmentation genes in Drosophila, such as fushi tarazu (ftz) (Hiromi et al., 1985) and even skipped (eve) (Goto et al., 1989; Harding et al., 1989), and now has become the ‘gold standard’ approach when evaluating the activity of enhancers in Drosophila and other traditional model organisms (Suryamohan and Halfon, 2015). However, this approach is often inefficient and incomplete as a method to identify enhancers, as it requires the generation of many transgenic lines to be able to survey a sufficient length of the genome, many of which will inevitably only provide negative results.
Genome-wide reporter assay
Despite the arduous and time-consuming nature of the reporter assay, this approach has been used in a genome-wide fashion in Drosophila. FlyLight and Fly Enhancers are the two major projects attempting genome-wide enhancer identification through reporter assays, with a focus on brain development and embryogenesis, respectively (Jenett et al., 2012; Kvon et al., 2014; Pfeiffer et al., 2008). The FlyLight collection was also used to describe genome-wide enhancer activities in several different developmental contexts (Jenett et al., 2012; Jory et al., 2012; Tokusumi et al., 2017). These projects have identified thousands of functionally validated enhancers and provided us with an overall outlook of the cis-regulatory landscape in the Drosophila genome. Furthermore, over 10,000 lines generated through these projects use the yeast Gal4 transcription factor as the reporter, which allows application of the Gal4-UAS bipartite expression system (Brand and Perrimon, 1993) to enable researchers to misexpress genes and trace the lineage of cells and tissues with unprecedented precision in Drosophila.
The reporter assay system has also been used in a high-throughput setting. An example of a high-throughput approach used in Drosophila is STARR-seq (self-transcribing active regulatory region sequencing) (Arnold et al., 2013; Muerdter et al., 2015). STARR-seq utilizes a library of reporter constructs that covers the entirety of the Drosophila genome more than 10-fold. The reporter constructs are designed with the candidate enhancer sequences placed downstream of the core promoter. As a result, active enhancers are directly transcribed, thus serving double-duty as their own reporter genes when transfected cultured cells are subjected to RNA sequencing (RNA-seq) analysis (Fig. 2C). Furthermore, STARR-seq enables identification of enhancers in a quantitative manner, because the number of RNA-seq reads corresponding to each candidate enhancer sequence is directly proportional to the strength of the enhancer activity of the genome fragment.
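The quantitative readout of STARR-seq can be illustrated with a small calculation. The sketch below is not the published STARR-seq analysis pipeline; it assumes that per-fragment read counts are already available for both the RNA-seq sample and the input plasmid library, and it normalizes RNA reads against input reads (a common but here assumed step) so that unevenly represented fragments are not mistaken for strong enhancers. The function name, the pseudocount and the fragment identifiers are hypothetical.

```python
import math


def starr_enrichment(rna_counts, input_counts, pseudocount=1.0):
    """Return log2(RNA / input) per fragment after scaling each library to counts per million.

    rna_counts and input_counts map a fragment identifier to its read count.
    The pseudocount (an assumption of this sketch) keeps zero counts from
    producing log(0) or division by zero; it must be > 0 if any count is 0.
    """
    rna_total = sum(rna_counts.values())
    input_total = sum(input_counts.values())
    if rna_total == 0 or input_total == 0:
        raise ValueError("both libraries must contain at least one read")

    scores = {}
    for fragment, n_input in input_counts.items():
        rna_cpm = (rna_counts.get(fragment, 0) + pseudocount) / rna_total * 1e6
        input_cpm = (n_input + pseudocount) / input_total * 1e6
        scores[fragment] = math.log2(rna_cpm / input_cpm)
    return scores


if __name__ == "__main__":
    # Toy counts: frag_a is over-represented in RNA relative to the input library.
    rna = {"frag_a": 90, "frag_b": 10}
    library = {"frag_a": 50, "frag_b": 50}
    for fragment, score in sorted(starr_enrichment(rna, library).items()):
        print(fragment, round(score, 2))
```

Fragments with positive scores are transcribed more than their abundance in the library would predict, which is the sense in which read counts report enhancer strength.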
Phylogenetic footprinting
Phylogenetic footprinting is based on the concept that functionally important sequences within a genome, even outside of the coding regions (such as TF binding sites), should be evolutionarily conserved. The fast-evolving nature of insect genomes appears to make the alignment of genomic sequences among multiple insect species challenging. Nonetheless, this approach can be powerful at identifying enhancers when genome sequences from a set of closely related species are available. For example, multiple sequenced genomes within the genus Drosophila, along with the available genome alignments for these species, have made it possible to quickly identify blocks of conserved sequence outside of the coding regions (Frazer et al., 2004; Mayor et al., 2000; Papatsenko et al., 2006; Sosinsky et al., 2007; Stark et al., 2007). However, several studies functionally validating the conserved non-coding sequences point toward a consensus that conservation alone might not be sufficient to efficiently identify enhancers (i.e. not all demonstrated enhancers are well conserved and not all conserved non-coding sequences appear to function as enhancers) (Bergman et al., 2002; Kharchenko et al., 2011; Li et al., 2007; Richards et al., 2005; Roy et al., 2010).
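The idea of scanning an alignment for conserved non-coding blocks can be sketched as follows. This is a deliberately simple illustration, assuming a pre-computed multiple alignment of a non-coding region supplied as equal-length strings with '-' for gaps; it is not one of the cited footprinting tools, and the window size and identity threshold are arbitrary assumptions.

```python
def conserved_windows(aligned_seqs, window=20, min_identity=0.9):
    """Return merged (start, end) alignment intervals of high conservation.

    A column counts as conserved when every species carries the same base
    and none has a gap. A window passes when the fraction of conserved
    columns is at least min_identity. Overlapping or adjacent passing
    windows are merged into one block. Coordinates are 0-based, half-open.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")

    conserved = [
        1 if len(set(column)) == 1 and column[0] != "-" else 0
        for column in zip(*(seq.upper() for seq in aligned_seqs))
    ]

    blocks = []
    running = sum(conserved[:window])
    for start in range(0, length - window + 1):
        if start > 0:
            # Slide the window by one column: add the new column, drop the old one.
            running += conserved[start + window - 1] - conserved[start - 1]
        if running / window >= min_identity:
            end = start + window
            if blocks and start <= blocks[-1][1]:
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks


if __name__ == "__main__":
    species_1 = "ACGTACGTAC" + "TTTTTTTTTT"
    species_2 = "ACGTACGTAC" + "GGGGGGGGGG"
    print(conserved_windows([species_1, species_2], window=5, min_identity=1.0))
```

As the text cautions, blocks reported this way are only candidates: conserved blocks are not necessarily enhancers, and enhancers are not necessarily conserved.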
Chromatin profiling
Changes in chromatin status through epigenetic modifications are critical to facilitate precise gene regulation (reviewed in Klemm et al., 2019). Various cis-regulatory elements, including enhancers, are ‘open’ (i.e. nucleosome free) when they are active, so TFs have access to these regions. Several methods exploit this feature of the genome and identify possible enhancer regions through chromatin profiling (reviewed in Klemm et al., 2019; Meyer and Liu, 2014; Suryamohan and Halfon, 2015). DNase-seq (DNase I hypersensitive sites sequencing) uses high sensitivity to DNase as the indicator of open chromatin regions (Boyle et al., 2008). FAIRE-seq (formaldehyde-assisted isolation of regulatory elements, combined with sequencing) and ATAC-seq (assay for transposase-accessible chromatin using sequencing) also identify open chromatin regions. FAIRE-seq uses organic phase separation chemistry to isolate nucleosome-free chromatin away from nucleosome-containing DNA (Giresi et al., 2007; McKay, 2019; McKay and Lieb, 2013), while ATAC-seq uses transposase accessibility as the indicator of open chromatin (Buenrostro et al., 2013).
Another type of chromatin-related method that might be useful to identify enhancers is chromosome conformation capture (3C) (reviewed in de Wit and de Laat, 2012). Hi-C (3C combined with high-throughput sequencing) allows genome-wide investigation of the spatial chromatin organization, including long-distance interactions among multiple loci (Lieberman-Aiden et al., 2009). This technique can be useful in identifying enhancers by analyzing promoter–enhancer interactions (Ron et al., 2017).
Antibody-based enhancer identification
A number of antibody-based methods have proven useful when identifying possible enhancer regions (reviewed in Suryamohan and Halfon, 2015). ChIP-seq (chromatin immunoprecipitation followed by sequencing) is a widely used technique to either (1) identify the binding sites of a specific TF or (2) gain a genome-wide chromatin profile (Ghavi-Helm and Furlong, 2012; Ghavi-Helm et al., 2016; Park, 2009). For the former application, antibodies that specifically recognize the TF of interest are used to identify the regions that are occupied by the TF throughout the genome. Those binding sites are often indicative of the enhancers that are regulated by the investigated TF. For the latter application of ChIP-seq, antibodies against global chromatin modification markers are used. For example, antibodies against histone H3 with its lysine at position 27 acetylated (H3 K27Ac) can be used to identify active chromatin regions, while antibodies against histone H3 with its K27 trimethylated (H3 K27me3) are often useful to identify inactive regions in the genome (Bannister and Kouzarides, 2011). Antibodies against histone acetyltransferase p300 are also often used to identify active chromatin regions (Kharchenko et al., 2011; Nègre et al., 2011; Visel et al., 2009).
More recently, a new antibody-based method, CUT&RUN (cleavage under targets and release using nuclease, combined with sequencing), has been developed (Meers et al., 2019; Skene and Henikoff, 2017; Skene et al., 2018). Briefly, in this method, unfixed permeabilized tissues/cells are incubated with antibodies that target a protein of interest (such as TFs or histones with a specific modification). Then, protein A-conjugated micrococcal nuclease (MNase) is added, which binds to the antibody and cuts the DNA in its vicinity. The released DNA fragments are isolated through size selection and used for sequencing. CUT&RUN allows researchers to obtain data equivalent to ChIP-seq, but with fewer steps and much less input tissue.
CRISPR/Cas9-based enhancer identification
Unlike RNAi, CRISPR/Cas9-based genome disruption can interrogate not only the function of transcriptionally active regions of the genome but also the non-coding portions, such as enhancers. Several high-throughput strategies have been established to identify possible enhancer regions through CRISPR/Cas9-based genome disruption, many of which use a tiling approach in a cultured cell setting and comprehensively survey a locus of interest with a collection of single guide RNAs (sgRNAs) designed to cover the entirety of the locus (reviewed in Catarino and Stark, 2018; Klein et al., 2018; Lopes et al., 2016). More recently, the next generation of CRISPR/Cas9 technologies, such as CRISPR-based transcriptional activation (CRISPRa) or interference (CRISPRi), have allowed researchers to manipulate the transcription of endogenous loci by taking advantage of the sequence specificity of the CRISPR/Cas9 system (reviewed in Adli, 2018; Pickar-Oliver and Gersbach, 2019). In brief, these techniques utilize a nuclease-inactive version of the Cas9 protein (dCas9) fused with either a transcription activation domain, such as VP64 or p300, or a repressive chromatin modifier domain, such as Krüppel-associated box (KRAB). These dCas9–effector fusion proteins can facilitate transcriptional regulation at any desired genomic site guided by an sgRNA (Gilbert et al., 2013). The initial studies utilizing the dCas9–effector fusion proteins focused on the transcribed regions of the genome [coding regions as well as long non-coding RNA (lncRNA) loci] (e.g. Ewen-Campen et al., 2017; Jia et al., 2018; Lin et al., 2015); however, this technique was later successfully used to modulate enhancer functions (e.g. Thakore et al., 2015; reviewed in Klein et al., 2018; Lopes et al., 2016).
Considering that dCas9–effector techniques have successfully been used in a genome-wide fashion (albeit currently limited to a cultured cell setting), these techniques should be adaptable for identifying endogenous enhancers in insects, especially if the context of interest can be studied in cell culture.
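The tiling strategy, in which guides are designed to cover an entire locus, can be sketched as a simple search for candidate target sites. The example below assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3′ of a 20-nucleotide protospacer; the text does not specify a particular nuclease. It only enumerates candidate sites on both strands and makes no attempt at off-target or efficiency scoring. All names are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tile_guides(locus, guide_len=20):
    """List every candidate guide as (start, strand, protospacer).

    start is the leftmost 0-based position of the protospacer on the
    forward strand of the locus, whichever strand the guide targets.
    A lookahead pattern is used so that overlapping sites are all reported.
    """
    locus = locus.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len)
    guides = []
    for strand, seq in (("+", locus), ("-", reverse_complement(locus))):
        for match in pattern.finditer(seq):
            start = match.start()
            if strand == "-":
                # Convert reverse-strand coordinates back to the forward strand.
                start = len(locus) - (match.start() + guide_len)
            guides.append((start, strand, match.group(1)))
    return sorted(guides)


if __name__ == "__main__":
    toy_locus = "A" * 20 + "TGG"
    for guide in tile_guides(toy_locus):
        print(guide)
```

In a tiling screen, the density of such sites along the locus determines how finely non-coding sequence can be surveyed, because regions lacking a suitable PAM cannot be targeted directly.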
Computational enhancer prediction through integration of multiple enhancer features
Evolution of computational approaches
Early approaches to computational enhancer prediction often relied on a limited degree of knowledge of enhancer features, such as evolutionary conservation (i.e. phylogenetic footprints) and/or the tendency of TF binding motifs to cluster within an enhancer (Berman et al., 2002; Halfon et al., 2002; Markstein et al., 2002; also reviewed in Halfon and Michelson, 2002; Markstein and Levine, 2002). These studies resulted in successful identification of enhancers in Drosophila, especially during embryogenesis, but success rates were low and false-positive prediction rates high. In recent years, the field has started to coalesce around supervised machine learning approaches that are trained using one or more features from a known set of enhancers. These features can include the DNA sequence itself, epigenetic information such as histone methylation and acetylation status, DNA methylation status and nucleosome positioning, transcription factor and co-factor binding, and evidence of transcription (e.g. of ‘enhancer RNAs’), among others. Support vector machines (SVMs) and random forest classifiers remain common approaches (e.g. Arbel et al., 2019; Chen et al., 2018; He et al., 2017; Le et al., 2019; Liu et al., 2018), although ‘deep learning’ approaches using artificial neural networks (ANNs) have been increasing in popularity as these methods become more mature and more feasible with current advances in computing power (e.g. Chen et al., 2018; Li et al., 2018; Liu et al., 2016; Min et al., 2017; Yang et al., 2017). In-depth reviews of computational enhancer discovery approaches have been provided elsewhere (Kleftogiannis et al., 2016; Lim et al., 2018; Suryamohan and Halfon, 2015), and the interested reader is directed to these for detailed treatment.
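The motif-clustering idea behind the early approaches can be illustrated with a sliding-window count of binding-site matches. This is a minimal sketch rather than any of the cited methods: motifs are given as plain strings (or simple regular expressions over A, C, G and T), and the window size, step and site threshold are arbitrary assumptions chosen for the toy example.

```python
import re


def motif_cluster_windows(seq, motifs, window=500, step=50, min_sites=3):
    """Return (start, end, n_sites) for windows with at least min_sites motif matches.

    Matches are located with a lookahead so overlapping sites are counted,
    and a site is assigned to a window by its start position.
    """
    seq = seq.upper()
    hits = sorted(
        match.start()
        for motif in motifs
        for match in re.finditer("(?=%s)" % motif, seq)
    )
    clusters = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        n_sites = sum(1 for hit in hits if start <= hit < start + window)
        if n_sites >= min_sites:
            clusters.append((start, start + window, n_sites))
    return clusters


if __name__ == "__main__":
    # Three copies of an example motif string embedded in a motif-free background.
    toy_sequence = "A" * 100 + "TAATCC" * 3 + "A" * 100
    print(motif_cluster_windows(toy_sequence, ["TAATCC"], window=50, step=10))
```

A count threshold of this kind is easy to satisfy by chance in a large genome, which is consistent with the high false-positive rates noted for these early methods.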
Generic versus specific enhancer prediction
In general, the current computational approaches can be classified into two types: ‘generic’ and ‘specific.’ Generic approaches rely on characteristics likely to be common among all enhancers regardless of particular spatio-temporal specificity, such as histone modifications and chromatin accessibility. This broad applicability means that a method trained on a single set of known enhancers in a particular cell line, or functioning under a given set of biological conditions, is still likely to be effective for enhancer discovery in a different tissue, cell line or physiological milieu. Indeed, these characteristics may be able to carry over across vast evolutionary distances, allowing models trained on insect enhancers to be used for mammalian enhancer discovery (Sethi et al., 2018 preprint), and presumably vice versa. However, generic approaches primarily provide a large list of sequences with predicted enhancer function, but no information as to what spatial, temporal or physiological characteristics these putative enhancers may have. Specific approaches, in contrast, attempt to discover discrete subsets of enhancers with common activity. While these may include general features such as chromatin accessibility or histone modification status obtained through the use of chromatin profiling techniques (such as FAIRE-seq or ATAC-seq) performed on a specific tissue of interest, they also include specific features such as presence of particular bound TFs or their binding sites, or the DNA sequence itself. Although specific approaches in general find fewer enhancers overall than generic approaches, true-positive prediction rates tend to be similar for both types of methods.
Enhancer prediction independent of experimentally derived features
When considering application to non-traditional insect models, methods that require training based on multiple experimentally derived features are of considerably less utility, as these data sets are often not available, and rarely, if ever, exist for multiple cell types or conditions. Therefore, approaches that rely solely on genome sequence are likely to be the most appealing to researchers that use non-traditional insect models. A number of these are available, most of which fall into the ‘specific’ enhancer discovery class (Chen et al., 2018; Kazemian and Halfon, 2019; Le et al., 2019; Liu et al., 2018). In general, these approaches deconstruct the training sequences into a set of small (e.g. 4–8 nucleotides) subsequences, or ‘k-mers’, which are then evaluated against a similarly deconstructed set of non-enhancer background sequences. With the notable exception of SCRMshaw (Kantorovitz et al., 2009; Kazemian and Halfon, 2019; Kazemian et al., 2011, 2014), most such approaches have not been tested with respect to insect genomes, including that of Drosophila (a somewhat ironic situation given the unmatched availability of empirically confirmed Drosophila enhancers for use as training data; Rivera et al., 2019). Although methods demonstrated to work using vertebrate genomes are expected to function equally well in insects, comparing efficacies is difficult given the different training and validation regimens applied. An evaluation platform for assessing methods using a uniform set of Drosophila training and validation data has recently been described (Asma and Halfon, 2019), and a critical comparison of various approaches would be a valuable addition to the field.
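The general k-mer strategy described above can be made concrete with a toy log-odds scorer. This sketch is a generic illustration and is not the scoring scheme of SCRMshaw or of any other cited method: it counts k-mers in a training set and a background set, converts the two frequency tables into per-k-mer log-odds weights, and scores a candidate sequence by its mean weight. The function names, the value of k and the pseudocount are assumptions, and the training sequences are invented.

```python
import math
from collections import Counter


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def train_log_odds(enhancers, background, k=5, pseudocount=1.0):
    """Return a function mapping a k-mer to log(P(kmer | enhancer) / P(kmer | background))."""
    pos = kmer_counts(enhancers, k)
    neg = kmer_counts(background, k)
    vocabulary = 4 ** k  # number of possible k-mers over A, C, G, T
    pos_total = sum(pos.values()) + pseudocount * vocabulary
    neg_total = sum(neg.values()) + pseudocount * vocabulary

    def weight(kmer):
        p = (pos[kmer] + pseudocount) / pos_total
        q = (neg[kmer] + pseudocount) / neg_total
        return math.log(p / q)

    return weight


def score_sequence(seq, weight, k=5):
    """Mean log-odds weight over all k-mers in seq (0.0 if seq is shorter than k)."""
    seq = seq.upper()
    n = len(seq) - k + 1
    if n <= 0:
        return 0.0
    return sum(weight(seq[i:i + k]) for i in range(n)) / n


if __name__ == "__main__":
    weight = train_log_odds(["GATAAGATAAGATAA"], ["CCCCCCCCCCCCCCC"], k=4)
    print(score_sequence("GATAAG", weight, k=4) > score_sequence("CCCCCC", weight, k=4))
```

Because only genome sequence is needed at prediction time, a model of this kind can in principle be trained in one species and applied to the genome of another, which is what makes sequence-only methods attractive for non-traditional insect models.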
Considerations when choosing enhancer identification methods for non-traditional insect models
As showcased above, there are a variety of approaches that allow identification of possible enhancer regions from the genomes of insects (and many more that we could not cover here; see Suryamohan and Halfon, 2015 for a more comprehensive review on currently available techniques). However, the options are limited when using non-traditional insect models owing to the early stage of genetic and genomic resource development in these insects. Below, we focus on several options that are more likely applicable to non-traditional insect models, and discuss key points to consider when choosing a method depending on the insect used or the context studied (Figs 2 and 3).
Choice of experimental approaches
Perhaps the first factor that influences the decision as to which approach to take would be whether one is interested in analyzing multiple loci or focusing on just one locus. A brute-force classic reporter assay-based survey for enhancers is a feasible option when focusing on analyzing the regulation of just one gene (Fig. 2A), assuming that either a valid reporter assay system is available in the study insect or the assay can be performed in Drosophila (see ‘Validating and investigating enhancer function’ below for more detailed discussion). However, the position of enhancers in relation to the gene of interest is quite unpredictable, from tens of thousands of base pairs upstream or downstream of the gene they regulate to inside of an intron or even sometimes within an exon (for example, see Arnold et al., 2013; Kvon et al., 2014), making the reporter assay-based enhancer search time-consuming, tedious and quite risky. Therefore, considering that genome sequences of many insects are now available and long-read sequencing technologies continue to advance (reviewed in Levy and Myers, 2016), it is beneficial to utilize some additional experimental approaches to identify possible enhancer regions even when focusing on only a single locus.
When genome sequences of a set of closely related species (such as within the same genus) are available or can be obtained, phylogenetic footprinting can mitigate the risk of a reporter assay-based enhancer search by providing candidate regions based on evolutionary conservation (Fig. 2B). However, as mentioned, conservation is not always reliable for predicting enhancers. Also, phylogenetic footprinting does not provide any context-dependent information (such as which enhancers are active in which tissues, at which time points or under which physiological conditions). Nonetheless, evolutionary conservation may be informative in a certain genus/family of insects, making phylogenetic footprinting a possible option to consider, particularly as a means of refining the boundaries of a putative enhancer sequence predicted by other methods.
Unfortunately, a genome-wide reporter assay is currently not an option for most insects, as it requires a large workforce, a well-annotated genome, and a highly established and efficient transgenic technique (as exemplified by the FlyLight project in Drosophila). However, some unique circumstances make a high-throughput reporter assay a feasible option for genome-wide enhancer identification. For instance, STARR-seq is an option when the context of interest can be studied using a cultured cell line or perhaps even with a tissue that is culturable and transfectable in vitro (Fig. 2C). Therefore, although contexts are limited, when applicable, a high-throughput reporter assay can be a very powerful approach that will allow a comprehensive identification of functionally validated enhancers.
When performing a reporter assay-based enhancer search, whether a brute-force survey, aided by phylogenetic footprinting, or a high-throughput approach, there is a significant caveat in regard to the backbone structure of the reporter construct used in the assay, such as the choice of core promoter. There is no guarantee that reporter constructs previously established in Drosophila are transferable to a different insect of interest. We discuss this point, along with other potential challenges of establishing a reporter assay system in non-traditional insect models, in a later section (see ‘Validating and investigating enhancer function’).
When a well-annotated genome sequence is available, chromatin profiling can provide rich information about the cis-regulatory landscape of the genome of the study insect. Currently, FAIRE-seq and ATAC-seq appear to be the primary approaches when investigating non-traditional insect models (Fig. 2D) [e.g. Aedes (Behura et al., 2016), Anopheles (Pérez-Zamorano et al., 2017), Tribolium (Lai et al., 2018), Heliconius (Lewis and Reed, 2019), Junonia (van der Burg et al., 2019), Bombyx (Zhang et al., 2017c, 2019)] as they do not require any special reagents and can be performed with a relatively small amount of input tissue. For example, we previously performed FAIRE-seq with tissues of the red flour beetle (Tribolium castaneum) and successfully obtained genome-wide chromatin profiles from various tissues and stages of this insect (Fig. 4) (Lai et al., 2018). More than 40,000 open chromatin regions were detected in the Tribolium genome, and comparison of the profiles across the samples revealed a distinct set of open chromatin regions in each tissue and at each stage. Many of these context-dependent openings likely correspond to tissue- and timing-specific enhancers, thus demonstrating the usefulness of chromatin profiling when studying non-traditional model insects.
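Comparing profiles across samples to find context-dependent open regions amounts to an interval comparison. The sketch below is a minimal illustration and not the analysis used in Lai et al. (2018): it assumes that peaks have already been called for each sample and are given as (chromosome, start, end) tuples with half-open coordinates, and it reports peaks from one sample that overlap nothing in another. The sample names and coordinates are invented.

```python
def specific_peaks(peaks_a, peaks_b):
    """Return peaks from sample A that do not overlap any peak in sample B.

    Peaks are (chrom, start, end) tuples with 0-based, half-open coordinates,
    so two peaks that merely touch (end == start) are not considered overlapping.
    This pairwise scan is quadratic per chromosome and meant only for illustration.
    """
    b_by_chrom = {}
    for chrom, start, end in peaks_b:
        b_by_chrom.setdefault(chrom, []).append((start, end))

    unique = []
    for chrom, start, end in peaks_a:
        overlaps = any(
            b_start < end and start < b_end
            for b_start, b_end in b_by_chrom.get(chrom, [])
        )
        if not overlaps:
            unique.append((chrom, start, end))
    return unique


if __name__ == "__main__":
    # Hypothetical peak calls from two tissues.
    wing_peaks = [("LG2", 100, 200), ("LG2", 500, 600), ("LG3", 50, 80)]
    leg_peaks = [("LG2", 150, 250), ("LG3", 80, 90)]
    print(specific_peaks(wing_peaks, leg_peaks))
```

For genome-scale peak sets, dedicated interval tools are the usual choice; for example, `bedtools intersect -v` reports the entries of one file that have no overlap in another.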
With the use of antibodies against global chromatin modification markers, antibody-based methods, such as ChIP-seq, are also powerful at obtaining a genome-wide chromatin landscape (Fig. 2D). These techniques can reveal context-dependent and/or tissue-specific chromatin profiles, which is very useful when identifying possible enhancer regions that are active uniquely in a certain context. The requirement of a large amount of input tissues has been a significant limiting factor when using ChIP-seq (for instance, thousands of discs are likely required for one biological replicate if ChIP-seq is performed with Drosophila imaginal discs); however, CUT&RUN might now allow researchers to perform an equivalent analysis with a much smaller amount of input tissue. Through a combination of these chromatin profiling techniques, the Reed lab has revealed the genome-wide chromatin landscape of Heliconius butterfly wings, a beautiful example of the use of chromatin profiling in a non-traditional insect model (Lewis and Reed, 2019; Lewis et al., 2016).
ChIP-seq can also be used for identifying the binding sites of a particular TF throughout the genome (Fig. 2D). TF binding sites detected by ChIP-seq are often instructive when identifying context-dependent enhancers that are under the regulation of the investigated TF; thus this approach is quite advantageous when the TF to study, or the TF that possibly regulates the gene of interest, is already known. The requirement for a high-quality ‘ChIP-compatible’ antibody against the TF of interest and the need for a large amount of input tissue have been significant drawbacks of this technique; however, the latter can now be bypassed by using CUT&RUN. One important point that requires attention when using TF binding sites to identify enhancers relates to the affinity of TFs to DNA. Recent studies have revealed that low-affinity binding of TFs to DNA can also be critical for gene regulation (Crocker et al., 2015, 2016). These low-affinity TF binding sites might not be readily detected by ChIP-seq and other antibody-based methods (as these techniques rely on strong binding of TFs to DNA), presenting a risk of missing biologically relevant TF binding sites. Moreover, low-affinity TF binding sites are often not evolutionarily conserved (Crocker et al., 2015), further compounding the difficulty of finding enhancers.
It is worth emphasizing that the ‘enhancers’ identified through chromatin profiling described in this section, as well as the computationally identified enhancers (Fig. 2D, next section), are all still predictions. Therefore, it is imperative to functionally validate these candidate enhancer regions.
Computational approaches for non-traditional insect models
Computational approaches are an attractive option for use with non-traditional insect models in that they are quick, inexpensive, and in many cases do not rely on extensive empirically derived genomic data. As mentioned, approaches that rely solely on genome sequence (see Box 1 for discussion about how the status of genome assemblies and gene annotations influence enhancer prediction), such as SCRMshaw, are the most appealing. However, there is a significant caveat when applying supervised sequence-based approaches to non-traditional insect models: the acute dearth of training data, as few insect enhancers are known outside of Drosophila. Interestingly, SCRMshaw trained with known Drosophila enhancers was demonstrated to effectively discover enhancers throughout the 345 Mya range of holometabolous insects (Kazemian et al., 2014), indicating that Drosophila enhancers can be useful as training data at least for the genomes of the Holometabola. When SCRMshaw-predicted enhancers from other insects, including bees, wasps, beetles and mosquitoes, are tested in reporter gene assays in transgenic Drosophila, they validate at rates similar to those seen from within-species prediction of Drosophila enhancers (Kazemian et al., 2014; Suryamohan et al., 2016). Direct testing of Tribolium enhancers in transgenic Tribolium confirms that SCRMshaw can find bona fide enhancers cross-species (Lai et al., 2018). Although not tested in insects, Chen et al. (2018) similarly demonstrate that a k-mer-based prediction method trained using data from a single species can be used for enhancer discovery across a range of mammalian genomes. When trained on a tissue-specific enhancer set, their method performed better at discovering enhancers in the same tissue in other species than in different tissues of the same species. 
Together, these studies indicate that specific enhancer characteristics could be learned and applied in a cross-species setting, and therefore k-mer-based enhancer predictions will be useful when studying non-traditional insect models.
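To make the idea of k-mer-based prediction concrete, the following is a minimal sketch of a log-odds k-mer scorer. It is a toy illustration only and is not the SCRMshaw algorithm or the method of Chen et al. (2018); the function names, the choice of k, the pseudocount and the example sequences are all assumptions introduced here.

```python
from collections import Counter
from math import log


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def train_log_odds(enhancers, background, k=3, pseudocount=1.0):
    """Return a log-odds weight per k-mer (training enhancers vs background).

    Hypothetical helper: positive weights mark k-mers enriched in the
    training enhancers, negative weights mark k-mers enriched in background.
    """
    pos = kmer_counts(enhancers, k)
    neg = kmer_counts(background, k)
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + pseudocount * len(vocab)
    neg_total = sum(neg.values()) + pseudocount * len(vocab)
    return {
        kmer: log((pos[kmer] + pseudocount) / pos_total)
        - log((neg[kmer] + pseudocount) / neg_total)
        for kmer in vocab
    }


def score_window(seq, weights, k=3):
    """Sum the weights of all k-mers in a window; unseen k-mers score 0."""
    seq = seq.upper()
    return sum(weights.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1))


# Made-up training sequences, for illustration only.
weights = train_log_odds(
    enhancers=["GATAGATAGATA", "AGATAAGATA"],
    background=["CCGGCCGGCCGG", "GGCCGGCC"],
)
enhancer_like = score_window("TTGATAGATATT", weights)
background_like = score_window("TTCCGGCCGGTT", weights)
```

In this sketch, weights learned from one species could in principle be applied to windows from another genome, which is the cross-species setting discussed above; real tools use larger k, far more training data and more careful background models.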
An important point to consider when applying computational enhancer prediction to non-traditional insect models is the status of their genome assemblies and gene annotations. Although the count of sequenced insect species is currently ∼470 (i5k: Sequencing Five Thousand Arthropod Genomes; http://i5k.github.io/arthropod_genomes_at_ncbi), assemblies are of varying quality, ranging from the extremely well-assembled Drosophila melanogaster (contig N50=21 Mb) to the poorly assembled meadow spittlebug Philaenus spumarius (contig N50=319 bp), and fewer than 40% have accompanying gene annotation (Li et al., 2019). How effective is enhancer discovery when genome assemblies are highly incomplete? Testing SCRMshaw with simulated dis-assembly of the Drosophila genome has revealed that contig N50s of at least 23,000 bp (which encompasses the upper 50% of current insect assemblies) are sufficient for effective SCRMshaw prediction, with minor loss of sensitivity and negligible increase in false-positive rates (Asma and Halfon, 2019). Therefore, highly complete genome assembly does not appear to be a prerequisite for successful enhancer prediction by SCRMshaw. Requirements for gene annotation are more difficult to assess. Annotation is not strictly necessary for enhancer prediction, but can certainly facilitate it. For example, SCRMshaw disregards coding sequences to focus on the regions that more likely contain enhancers, i.e. non-coding regions. The effect of gene annotation quality on computational enhancer prediction has not been explored.
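As a worked illustration of the contig N50 statistic used above, the sketch below computes N50 from a list of contig lengths and compares it with the 23,000 bp value reported by Asma and Halfon (2019). The function names and the example lengths are hypothetical; only the threshold comes from the text.

```python
def contig_n50(lengths):
    """Return the contig N50.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def meets_n50_threshold(lengths, min_n50=23_000):
    """Check an assembly against a minimum contig N50 (in bp)."""
    return contig_n50(lengths) >= min_n50


# Made-up contig lengths (bp), for illustration only.
toy_assembly = [50_000, 30_000, 25_000, 10_000, 5_000]
toy_n50 = contig_n50(toy_assembly)
toy_ok = meets_n50_threshold(toy_assembly)
```

For the toy assembly, the two longest contigs already cover more than half of the 120,000 bp total, so the N50 is the length of the second contig.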
Integrating experimental and computational approaches
As discussed above, each approach has its strengths and weaknesses. Therefore, the use of multiple strategies (ideally both experimental and computational), and comparison across the outcomes of several different approaches, is likely to be the most fruitful in narrowing down candidate regions to be functionally validated. In Tribolium, we compared the FAIRE profiles with SCRMshaw predictions and found surprisingly high overlaps between these two datasets (Fig. 4) (Lai et al., 2018). However, in the case of Tribolium (although we think this can be generalized), chromatin profiling provided too many candidate regions (>40,000 peaks across samples), while k-mer-based computational prediction was too stringent and identified a relatively small number of candidate enhancers (∼1200 regions). Nonetheless, having two independent enhancer prediction approaches greatly helped us narrow down the enhancers for functional validation. Adding more tissues and/or using more homogeneous tissues/cell types for chromatin profiling, along with enhancing k-mer-based computational prediction through the use of improved training data and other refinements, will help increase resolution when identifying candidate regions for context-specific enhancers.
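The intersection of two candidate sets described above can be sketched as a simple interval-overlap filter. This is a minimal illustration, not the analysis pipeline used in Lai et al. (2018); the coordinates, the sequence names and the function names are hypothetical, and intervals are assumed to be half-open (start included, end excluded).

```python
from collections import defaultdict


def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def supported_predictions(predictions, peaks):
    """Keep computational predictions that overlap any open-chromatin peak.

    Both inputs are lists of (sequence_name, start, end) tuples.
    """
    peaks_by_seq = defaultdict(list)
    for name, start, end in peaks:
        peaks_by_seq[name].append((start, end))

    kept = []
    for name, start, end in predictions:
        if any(
            intervals_overlap(start, end, p_start, p_end)
            for p_start, p_end in peaks_by_seq[name]
        ):
            kept.append((name, start, end))
    return kept


# Made-up coordinates, for illustration only.
predicted = [("LG2", 100, 600), ("LG2", 2000, 2500), ("LG3", 50, 550)]
open_chromatin = [("LG2", 500, 700), ("LG2", 2500, 2600), ("LG3", 0, 100)]
candidates = supported_predictions(predicted, open_chromatin)
```

With tens of thousands of peaks, a sorted or tree-based interval index would be more efficient than this pairwise scan, and dedicated interval tools are the usual choice for genome-scale data.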
Validating and investigating enhancer function
Unless a reporter assay system is used to screen for enhancers (Fig. 2A–C), the enhancer regions identified through the methods described above (Fig. 2D), either chromatin profiling or computational approaches, are still predictions that require functional validation. In this section, we discuss several possible validation approaches when studying enhancers of non-traditional insect models, such as the reporter assay and CRISPR/Cas9-based genome editing. We also highlight some key issues and potential pitfalls when establishing a reporter assay system in insects outside of Drosophila.
Testing activity of non-Drosophila enhancers in Drosophila
As mentioned, confirmation of enhancer activity in vivo via a reporter assay is widely considered to be the gold standard when validating enhancer function (Fig. 1B). Since the first application of a reporter assay in Drosophila in the 1980s (Goto et al., 1989; Harding et al., 1989; Hiromi et al., 1985), reporter assays have been used to investigate the regulation and evolution of numerous genes in Drosophila, identifying over 20,000 enhancers (Rivera et al., 2019) and generating a large variety of useful reporter constructs. Although now feasible in a growing number of species (Fraser, 2012), making transgenic lines is a laborious task when using non-traditional insect models. Therefore, considering the ease of making transgenic lines in Drosophila (in part thanks to low-cost commercial injection services) and the availability of established reporter assay systems, the logical first step is to test the activity of possible enhancer regions identified from non-D. melanogaster insects (including various species in the genus Drosophila) in D. melanogaster. This approach has been quite successful when studying enhancer evolution among multiple Drosophila species (e.g. Frankel et al., 2011; Gompel et al., 2005; also see Rebeiz and Williams, 2017; Stern and Frankel, 2013 for review). Some studies have even demonstrated that enhancers from insect orders outside of Diptera work in Drosophila. For example, enhancers of some developmental genes in beetles, honeybees and even spiders were demonstrated to be active in their expected contexts in Drosophila (e.g. Ayyar et al., 2010; Cande et al., 2009a,b; Kazemian et al., 2014; Lai et al., 2018; Prasad et al., 2016; Wolff et al., 1998; Zinzen et al., 2006). These studies show the power of the cross-species reporter assay using Drosophila as an in vivo test tube, but with one unavoidable concern: are these non-Drosophila enhancers really showing biologically relevant activities in Drosophila? 
Owing to this obvious caveat of using a cross-species reporter assay, it is ideal if the enhancer activity is also tested in the native species.
Establishing a reporter assay in non-traditional insect models
We often assume that the reporter constructs and other genetic tools established in Drosophila are readily transferable to other insects. However, considering the deep divergence and the vast diversity among insect orders, there is no guarantee that these reporter constructs will function properly in other insects, especially those outside of the order Diptera. In fact, several groups including ourselves have encountered various interesting issues when transferring Drosophila constructs to Tribolium (Lai et al., 2018; Schinko et al., 2010). One of the major issues was related to the choice of core promoter (also known as ‘minimal’ or ‘basal’ promoter; reviewed in Kadonaga, 2012; Vo Ngoc et al., 2019). The most widely used core promoter in insects, the core promoter of the Drosophila Heat Shock Protein 70 (Hsp70) gene, did not work reliably in Tribolium, forcing researchers to look for an alternative core promoter. Schinko et al. (2010) found that a Tribolium-native promoter, the core promoter of Tc-hsp68 (Tc-bhsp68), works well for their Gal4/UAS system, while Lai et al. (2018) determined that a variation of the DSCP (Drosophila Synthetic Core Promoter) properly drives gene expression when placed in a reporter construct (see Box 2 for details). Other components of transgenic constructs, such as the choice of untranslated regions (UTRs) or the inclusion of exogenous genes that are often used in modern genetics, could also present problems. For instance, the yeast Gal4 gene has been used routinely in the gene misexpression system (Gal4/UAS system) in Drosophila (Brand and Perrimon, 1993) and has been successfully transferred to mosquitoes (Kokoza and Raikhel, 2011; Lynd and Lycett, 2012; O'Brochta et al., 2012) and silk moths (Imamura et al., 2003) without any major modification. However, when Schinko et al. (2010) worked on transferring the Gal4/UAS system to Tribolium, they noticed that the full-length Gal4 gene does not work in Tribolium. 
We also confirmed this, even though the Gal4 gene is transcribed in Tribolium (K. D. Deem and Y. Tomoyasu, unpublished data). Schinko et al. (2010) also tried two shorter and more active versions of Gal4, Gal4-VP16 and Gal4Δ (Ma and Ptashne, 1987; Viktorinová and Wimmer, 2007), both of which worked in Tribolium. These outcomes regarding the core promoter and other components highlight the potential difficulty of establishing a reporter assay in non-traditional insect models; however, we hope that the various issues we and others have encountered (such as the deep rabbit hole of core promoters; Box 2) will serve as guidance when working on other insects.
Since the early days of reporter assays, the core promoter of the Heat Shock Protein 70 (Hsp70) gene has been widely used in Drosophila when an exogenous core promoter is required (Goto et al., 1989; Harding et al., 1989; Hiromi and Gehring, 1987). This core promoter functioned properly in other insects when 3xP3, a synthetic enhancer composed of three repeats of the P3 Pax6 homeodomain binding site (Mishra et al., 2010; Sheng et al., 1997), was used to drive expression in eye- and nervous-related tissues as a marker of transgenesis [e.g. Drosophila virilis (Horn and Wimmer, 2000), Tribolium (Berghammer et al., 1999; Lorenzen et al., 2003), Bombyx mori (Thomas et al., 2002)]. The Dm-hsp70 core promoter was also used for a reporter assay in Tribolium to identify embryonic enhancers that regulate the Tribolium hairy gene (Eckert et al., 2004). However, when establishing the Gal4/UAS system in Tribolium, Schinko et al. (2010) found that the Dm-hsp70 core promoter does not reliably work in their constructs. SCP1 (Super Core Promoter 1, a composite of Drosophila and viral core promoter motifs that was designed to drive a high level of transcription in HeLa cells; Juven-Gershon et al., 2006) also did not work well when used in a UAS construct either in Tribolium or in Drosophila (Schinko et al., 2010). These outcomes led Schinko et al. (2010) to try Tribolium-native promoters, the core promoters of Tc-hsp68 (Tc-bhsp68) and Tc-hairy. Although the Tc-hairy core promoter failed to work properly (it drove an unexpected nervous-system-specific expression), the Tc-bhsp68 core promoter worked well with their UAS constructs in Tribolium. Based on the extensive characterization of core promoter compatibility in Tribolium by Schinko et al. (2010), we thought that the Tc-bhsp68 core promoter would be a safe choice when we attempted to establish a reporter assay system in Tribolium. 
However, surprisingly, the Tc-bhsp68 core promoter failed to work reliably in Tribolium in a reporter construct with a wing enhancer of the nubbin gene (Tc-nub), even though the same construct worked well in Drosophila (Lai et al., 2018). We decided to try DSCP (Drosophila Synthetic Core Promoter), the core promoter that was established for a genome-wide reporter assay in Drosophila (the FlyLight project) (Pfeiffer et al., 2008). This core promoter is a chimera of the SCP and the Drosophila eve gene core promoter, which was shown to work more efficiently, and with a more diverse array of developmental enhancers, as compared with gene-specific core promoters in Drosophila (Pfeiffer et al., 2008). The DSCP also was found to work preferentially with developmental gene enhancers (versus housekeeping gene enhancers) when tested with STARR-seq (Zabidi et al., 2015). The DSCP we used was a variation of this promoter, containing a fragment of Dm-hsp70 promoter downstream of the transcription initiation site in addition to the motifs taken from SCP and the eve core promoter (cloned from the construct used in McKay and Lieb, 2013; see fig. 6 of Lai et al., 2018 for the sequence and annotation of this promoter). This DSCP worked well in our reporter construct both in Tribolium and Drosophila and in two very different developmental contexts (wing development and embryogenesis) in Tribolium, thus allowing us to establish a cross-species compatible reporter assay system (Lai et al., 2018).
It is currently unknown why the Dm-hsp70 and Tc-bhsp68 core promoters did not work properly when used in some transgenic constructs in Tribolium. Interestingly (and confusingly), these core promoters drove various patterns of enhancer–trap expression when inserted in the Tribolium genome (Lai et al., 2018; Lorenzen et al., 2003; Trauner et al., 2009). This indicates that these core promoters are capable of working with a diverse array of enhancers in Tribolium, even though they did not work properly in the tested artificially configured transgenic constructs. Also worth mentioning is the use of gene-specific promoters (not to be confused with ‘core’ promoters, see Fig. 1A). Promoters of several housekeeping genes have been successfully used to drive gene expression in Tribolium (Gilles et al., 2019; Lai et al., 2018; Lorenzen et al., 2002; Rylee et al., 2018; Sarrazin et al., 2012; Schinko et al., 2012; Siebert et al., 2008; Strobl et al., 2018). However, when we tested the Tc-nub promoter with the Tc-nub wing enhancer in a reporter construct, this construct failed to drive any expression either in Tribolium or in Drosophila (Lai et al., 2018). These confusing outcomes regarding promoters might be related to enhancer–core promoter compatibility and/or an optimal distance between the enhancer and the core promoter, which will require detailed investigation in the future.
One aspect that often makes this type of technology transfer so challenging is the absence of reliable positive controls in non-traditional insect models. For instance, we did not have any known Tribolium enhancers that work in the context we study, forcing us to assume that the potential enhancer we identified through a cross-species assay was in fact a true functional Tribolium enhancer and blindly use it as a positive control when we were troubleshooting our reporter constructs (Lai et al., 2018). The enhancers we validated through our cross-species reporter assay (such as Tc-Nub1L) are functional in both Drosophila (Diptera) and Tribolium (Coleoptera); thus these enhancers might serve as positive controls in a wide range of insects (at least in Holometabola). Hopefully the number of positive control enhancers will quickly increase as more enhancer studies are performed in new insect models.
Functional analysis of enhancers through genome editing
Although validation by a reporter assay continues to be the gold standard when studying enhancers, reporter assays do come with several caveats, even beyond the species-specific issues discussed above. For instance, the distance between the enhancer and the core promoter within a reporter construct has been observed in some cases to affect proper enhancer activity (e.g. Small et al., 1993; Swanson et al., 2010). Compatibility between enhancers and core promoters is another potential issue, which can drastically influence the outcome of the assay (e.g. Pfeiffer et al., 2008; also see Zabidi et al., 2015 for an extreme case of compatibility issues between two core promoters, reviewed in Atkinson and Halfon, 2014). Considering these caveats, in addition to the potential hassles described above when establishing a reporter assay system in non-Drosophila insects, an LOF approach through CRISPR/Cas9-based disruption of enhancers is an attractive alternative when validating enhancer function (also discussed in Duester, 2019).
Several CRISPR/Cas9-based methods have been used to investigate the function of enhancers in Drosophila, some of which are described in the previous section. For instance, Xu et al. (2017) have used a split-drive configuration of a CRISPR/Cas9-based gene drive system (dubbed CopyCat) to replace the wing vein enhancer of the knirps (kni) gene in Drosophila with that of a mutant allele or the homologous enhancers of other dipteran species. Also, some next-generation CRISPR/Cas technologies, such as CRISPRa, have been successfully used in vivo in Drosophila (Ewen-Campen et al., 2017; Jia et al., 2018; Lin et al., 2015). Although not CRISPR/Cas9-based, Crocker and Stern (2013) used a method (transcription activator-like effectors, TALEs) that is conceptually similar to dCas9-effector technologies to interrogate enhancer function in Drosophila, demonstrating that the dCas9-effector technologies can be quite useful when studying insect enhancers.
Although these and other elaborate CRISPR/Cas technologies (reviewed in Bier et al., 2018) are attractive, a simple CRISPR/Cas9-based knockout via NHEJ (non-homologous end joining) may be most appealing to researchers who use non-traditional insect models owing to the challenging nature of implementing these new technologies in insects outside of Drosophila. Since the first breakthrough application of the CRISPR/Cas9 system in genome editing in 2012 and 2013 (Cong et al., 2013; Jinek et al., 2012), CRISPR/Cas9-based knockout techniques have already been applied to various orders of non-traditional model insects and other arthropod species [e.g. Coleoptera (Gilles et al., 2015), Lepidoptera (Connahs et al., 2019; Mazo-Vargas et al., 2017; Prakash and Monteiro, 2018; Wei et al., 2014; Zhang and Reed, 2016), Hymenoptera (Trible et al., 2017; Yan et al., 2017), Orthoptera (Watanabe et al., 2017), Zygentoma (Ohde et al., 2018) and crustacean species (Martin et al., 2016; Nakanishi et al., 2014), reviewed in Gantz and Akbari, 2018; Gilles and Averof, 2014], showing the relative ease of adopting this technique in insects.
The idea of using CRISPR/Cas9 knockout to validate enhancer function is straightforward: remove/disrupt the candidate enhancer from the genome and evaluate the resulting phenotype. There are several points that require attention when designing an enhancer knockout experiment using non-traditional insect models. From a technical point of view, researchers need to decide whether to analyze the mutant phenotype in the somatic cells of G0 insects or in established mutant lines. Analyzing in G0 is beneficial for insects with difficulties in husbandry and/or a longer generation time, as it bypasses the multiple generations of crosses needed to establish a mutant strain and allows the outcome to be analyzed immediately in the individuals that received the CRISPR/Cas9 injection. This approach has been very successful in butterflies (e.g. Connahs et al., 2019; Mazo-Vargas et al., 2017; Prakash and Monteiro, 2018; Zhang and Reed, 2016; Zhang et al., 2017a,b) as well as in other species such as the crustacean Parhyale (Bruce and Patel, 2018 preprint; Clark-Hachtel and Tomoyasu, 2017 preprint; Martin et al., 2016). However, to be able to detect mutant phenotypes in G0, (1) genome editing events need to happen in a large enough number of somatic cells and (2) mutant phenotypes need to be clearly visible (e.g. pigmentation and patterning defect, loss of tissues) in the context of study. In other words, the outcome is likely interpretable only when genome editing works properly and results in a clear mutant phenotype, thus presenting a potential caveat of analyzing mutant phenotypes in G0. Establishing a mutant line is a safer option if the genetics of the insect allows it, and if a valid screening scheme is available to identify and track mutations. A stable line allows for more quantitative measures to be brought to bear, such as qRT-PCR to measure changes in gene expression. 
Replacing and/or disrupting the targeted enhancer with a marker construct (such as a fluorescent gene driven by 3xP3) might allow an easier tracking of the mutation compared with the mutation created through a simple NHEJ knockout, which likely requires genomic PCR-based methods to identify the mutation (unless the mutation results in a haplo-insufficient, visible and viable phenotype). Although the HDR (homology directed repair) knock-in approach appears to suffer from low efficiency when used in non-traditional insect models (C. M. Clark-Hachtel, K. D. Deem and Y. Tomoyasu, unpublished data), various strategies have been developed to improve success rates (e.g. Aird et al., 2018; Savic et al., 2018, also see Bier et al., 2018; Liu et al., 2019 for review of additional methods to increase the HDR efficiency), some of which might be worth pursuing when performing HDR knock-in in insects. In addition, NHEJ knock-in (Auer et al., 2014; Watanabe et al., 2017), including the CRISPaint (CRISPR-assisted insertion tagging) system (Bosch et al., 2020; Schmid-Burgk et al., 2016), or MMEJ (microhomology mediated end joining) knock-in (e.g. Nakade et al., 2014) might facilitate a more efficient disruption of enhancers while also tagging the genome-editing event with a visible marker.
A further aspect that requires attention when knocking out enhancers is related to the nature of enhancers. It has been shown that multiple enhancers sometimes act redundantly (dubbed ‘shadow enhancers’), which fosters robust and stable gene expression (Perry et al., 2010) and might also facilitate evolution of novel traits (Hong et al., 2008). Enhancer redundancy appears to be pervasive among developmental genes (Cannavò et al., 2016). This may cause an issue when performing an enhancer knockout, as targeting one enhancer might not be sufficient to cause any visible abnormalities, which is especially problematic when analyzing in G0. Whether these features of enhancers found in Drosophila are conserved in other insects is not known, which is one of the very reasons why it is essential to study enhancers in various organisms. Also, considering that knocking out enhancers also presents some caveats, it would be ideal to validate enhancer function via multiple methods, such as a reporter assay and knockout, through which both necessity and sufficiency of an enhancer can be evaluated (discussed in detail in Catarino and Stark, 2018; Halfon, 2019).
Pushing the frontiers of enhancer studies in non-traditional insect models
With the recent confluence of effective enhancer-discovery approaches established in Drosophila, and the sequencing of numerous insect genomes (Thomas et al., 2018 preprint), the time is ripe to start broadening the investigation of enhancers and other cis-regulatory mechanisms to a wider range of insects. There are several critical areas that should receive high priority for active development to make enhancer studies readily possible in non-traditional insect models.
On the computational side, these areas include (1) the use of improved training data by incorporating the latest available data (such as the data from the REDfly database; Rivera et al., 2019), and (2) better integration of computational and experimental enhancer-discovery methods (such as ATAC-seq and FAIRE-seq). Another interesting area to pursue would be to leverage comparative genomics in improving the accuracy of computational predictions. As mentioned, enhancer sequences are frequently alignable between closely related species, but conservation diminishes with increasing evolutionary distance. Nevertheless, enhancer locations are often maintained in equivalent locations despite the inability to directly align the enhancer sequences themselves (Cande et al., 2009a; Kazemian et al., 2014). By prioritizing those enhancer predictions among moderately related species that fall into the same approximate genomic location, we should be able to not only reduce false-positive predictions, but also build up sets of evolutionarily related but non-alignable enhancers. The latter will provide an unprecedented resource for probing the evolution of regulatory sequences. With its average 75% success rate (Kazemian et al., 2014), SCRMshaw has emerged as a promising first-line tool for identifying enhancer sequences in non-traditional insect models, whose prediction capacity will be further improved through these areas of focus.
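The prioritization by conserved genomic location proposed above could be sketched as follows. This is a deliberately simplified illustration: it assumes that "same approximate genomic location" can be approximated by each prediction's nearest gene belonging to the same ortholog group, which is an assumption made here and not a method described in the cited studies. The species labels, prediction identifiers, ortholog group names and function name are all hypothetical.

```python
from collections import defaultdict


def conserved_location_sets(predictions, min_species=2):
    """Group enhancer predictions by the ortholog group of their nearest gene.

    predictions: iterable of (species, prediction_id, ortholog_group).
    Returns only those ortholog groups with predictions in at least
    `min_species` different species.
    """
    by_locus = defaultdict(dict)
    for species, pred_id, group in predictions:
        by_locus[group].setdefault(species, []).append(pred_id)
    return {
        group: per_species
        for group, per_species in by_locus.items()
        if len(per_species) >= min_species
    }


# Made-up predictions, for illustration only.
toy_predictions = [
    ("Tcas", "t1", "OG1"),
    ("Dmel", "d1", "OG1"),
    ("Tcas", "t2", "OG2"),
    ("Amel", "a1", "OG1"),
    ("Dmel", "d2", "OG3"),
]
shared = conserved_location_sets(toy_predictions)
```

Each retained group is both a higher-confidence prediction and a candidate set of evolutionarily related but non-alignable enhancers; a fuller implementation would also need to consider relative position and orientation with respect to the flanking genes.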
Although advances in computational and genomics approaches have begun to enable enhancer studies in insects outside of Drosophila, the underdeveloped nature of functional genetics and genomics tools in non-traditional insect models still hinders researchers from functionally dissecting diverse mechanisms underlying cis-regulation and investigating how changes in cis-regulation have contributed (and are contributing) to the evolution of various traits at the detailed molecular level. The cross-species compatible reporter assay construct we previously established (Lai et al., 2018) is a step forward towards performing enhancer studies in various insects, and there are several areas that we can explore to continue our progress for a better implementation of a reporter assay system and other functional genetics tools in non-traditional insect models. The first area is related to core promoters. As we discussed extensively in Box 2, it is crucial to choose the right core promoter that fits the gene, context and species of the study. The modified DSCP we used in our study worked well in two species (a coleopteran and a dipteran, representing a large span of the Holometabola) and in two developmental contexts (appendage development and embryogenesis), suggesting that this core promoter can be used across a wide taxonomic range of insects. Nonetheless, the DSCP is mainly constructed with Drosophila core promoter motifs (Juven-Gershon et al., 2006; Lai et al., 2018; Pfeiffer et al., 2008); it might therefore be possible to tailor a more efficient core promoter for each species by a similar strategy. It may also be possible to design a universal core promoter that works across multiple orders of insects. The second area is to expand the genetic toolkit for non-Drosophila insects. Tissue- and context-specific enhancers identified through enhancer studies will be essential when developing additional genetic tools and resources. 
With the use of these enhancers combined with some modifications to reporter constructs, we will be able to build various modern genetic tools useful for lineage tracing, gene misexpression, tissue-specific RNAi and CRISPR, and beyond. Developing these functional genetic tools will further accelerate investigation of enhancers as well as of many other aspects of biology in non-traditional insect models.
Armed with ample genetic, genomic and computational tools and resources, studies in Drosophila have revealed a wealth of intriguing aspects of enhancer function and evolution. In this Review, we mainly focused on how to ‘find’ enhancers in non-traditional insect models. This is just the first step in investigating the amazing array of diverse traits found in insects from the cis point of view, a widely unexplored area of biology.
Y.T. thanks Leslie Vosshall, Michael Dickinson and Julian Dow for the kind invitation to the JEB Symposium on Genome Editing for Comparative Physiology, Michaela Handel for her help on facilitating the symposium, and the editors of JEB for organizing the special issue derived from the symposium. We also thank Kevin Deem and other members of the Tomoyasu lab for helpful comments.
This work was supported by the National Science Foundation (NSF) (grant IOS1557936 to Y.T.) and the U.S. Department of Agriculture (USDA) (grant 2018-08230 to M.S.H. and Y.T.).
The authors declare no competing or financial interests.