Many of the major biological discoveries of the 20th century were made using just six species: Escherichia coli bacteria, Saccharomyces cerevisiae and Schizosaccharomyces pombe yeast, Caenorhabditis elegans nematodes, Drosophila melanogaster flies and Mus musculus mice. Our molecular understanding of the cell division cycle, embryonic development, biological clocks and metabolism was obtained through genetic analysis using these species. Yet the ‘big 6’ did not start out as genetic model organisms (hereafter ‘model organisms’), so how did they mature into such powerful systems? First, these model organisms are abundant human commensals: they are the bacteria in our gut, the yeast in our beer and bread, the nematodes in our compost pile, the flies in our kitchen and the mice in our walls. Because of this, they are cheaply, easily and rapidly bred in the laboratory. Second, they proved amenable to genetic analysis. How and why should we add species to this roster? We argue that specialist species will reveal new secrets in important areas of biology and that with modern technological innovations like next-generation sequencing and CRISPR-Cas9 genome editing, the time is ripe to move beyond the big 6. In this review, we chart a 10-step path to this goal, using our own experience with the Aedes aegypti mosquito, which we built into a model organism for neurobiology in one decade. Insights into the biology of this deadly disease vector require that we work with the mosquito itself rather than modeling its biology in another species.
Progress in the experimental biological sciences is driven by work in simple systems, from carefully controlled in vitro approaches that utilize purified biological material outside its natural context to in vivo studies in model organisms from microbes to primates. These organisms provide opportunities to ‘model’ complex biological processes relevant to human health or provide a powerful window into fundamental biological principles shared across the tree of life. For instance, the components of the cell division cycle were identified using the humble yeast Schizosaccharomyces pombe and Saccharomyces cerevisiae (Nurse, 2017), while S. cerevisiae was the first eukaryotic organism to have its genome sequenced (Goffeau et al., 1996). Many of the genetic rules governing embryonic development were discovered through the pioneering work of Christiane Nüsslein-Volhard and Eric Wieschaus working in Drosophila melanogaster flies (Nüsslein-Volhard and Wieschaus, 1980). The genetic basis of biological rhythms, a fundamental principle of plant and animal life organized around the circadian rhythm of the sun, was also worked out in flies (Bargiello et al., 1984; Hardin et al., 1990; Konopka and Benzer, 1971). Major insights into the endocrine signaling that links hunger, metabolism and body weight came from analysis of leptin and leptin receptors in ob and db mutant mice (Friedman, 1998). These are but a few of the many examples from the last century of the power of model organisms in producing important basic and clinical insights.
Model organisms are relatively easy and inexpensive to culture in the laboratory, have fast generation times that facilitate genetic analysis, and can be readily manipulated using increasingly comprehensive and powerful experimental genetic tools to visualize and manipulate specified cells across developmental time and space. For neuroscientists, access to behaviors that can be genetically dissected has driven interest in flies and worms (Brenner, 2009; Vosshall, 2007), and there are now thousands of laboratories around the world that generate, maintain and openly share thousands of distinct strains to understand the workings of these invertebrate nervous systems. The success of these organisms is self-perpetuating: the community of researchers working with them continues to grow, new methodologies and resources are developed and shared, and the body of specific knowledge and access to powerful tools to manipulate, observe and experiment upon these model organisms further lowers the bar to entry so that the cycle can repeat. Model organisms are irreplaceable for studying fundamental aspects of biology but fall short in their ability to address biological specializations that arise in specific branches of the evolutionary tree. Migration of monarch butterflies, vocal learning in songbirds and camouflage in cuttlefish are examples of fascinating biological problems that can only be fully understood by moving these non-traditional species toward model organism status.
In 2009, we began a journey to build Aedes aegypti into a model organism for neurobiology. After decades of working in ‘the fly’, D. melanogaster, we were ready for the challenge of moving into a new species to ask questions that could not be addressed in the traditional model organisms. Some species of mosquito are the deadliest animals on the planet: these anthropophilic species bite humans to take their blood and simultaneously act as vectors for disease-causing pathogens such as the Plasmodium malaria parasites and arboviruses including chikungunya, dengue, eastern equine encephalitis, yellow fever, Zika and others. The elements of mosquito biology most relevant to disease transmission are specialized and thus elude comprehensive study in model organisms. Mosquitoes possess unique anatomical appendages that facilitate piercing skin and sucking blood, and have exquisitely sensitive sensory systems that are tuned to help them locate vertebrate hosts in their environment. Drosophila, by contrast, feed on yeast and lay eggs in rotting fruit, and are uninterested in many of the cues that lure mosquitoes. While many of the gene families involved in these processes are relatively conserved, an estimated 260 million years of evolution separates mosquitoes from Drosophila (Arensburger et al., 2010) and many critically important genes do not share one-to-one orthology. Thus, understanding how mosquitoes operate in their environment requires studying mosquitoes themselves.
This review will focus on the practical challenges and recent opportunities available to those seeking to establish a new model organism, drawing on our experiences with the mosquito Ae. aegypti. The review is organized into 10 steps that we followed in our own work (see Box 1). While we focus on the tools and approaches required to perform rigorous and reproducible science on the genes and neural circuits that generate behavior, many of the lessons and approaches are generalizable to other areas of biology as well. Not all steps are required – or even feasible – in a given species, but the more steps that can be accomplished, the more progress that can be made in working with the species in a mechanistic framework.
The figure shows an outline of 10 broadly defined steps towards building a genetic model organism, illustrated by the dates of published progress for select steps: (3) whole-genome assembly of D. melanogaster (Adams et al., 2000) and Ae. aegypti (Nene et al., 2007); (5) the generation of stable transgenic animals in D. melanogaster (Rubin and Spradling, 1982) and Ae. aegypti (Coates et al., 1998; Jasinskiene et al., 1998); (7) targeted mutagenesis in D. melanogaster (Rong and Golic, 2000; Bassett et al., 2013) and Ae. aegypti (DeGennaro et al., 2013; Basu et al., 2015; Dong et al., 2015; Kistler et al., 2015); (8) implementation of binary effector systems in D. melanogaster (Brand and Perrimon, 1993; Potter et al., 2010) and Ae. aegypti (Kokoza and Raikhel, 2011; Matthews et al., 2019). Photo credit: André Karwath (fly) and Alex Wild (mosquito).
Step 1: choose an interesting and/or important organism
One must first choose a species that is worth the considerable effort of building a new model organism. This choice is personal and can be driven by a curiosity for fundamental understanding of our planet's remarkable biodiversity or by pressing global health or agricultural needs. Examples of the former include studying the evolution of courtship behavior in other Drosophila species (Seeholzer et al., 2018) and the genetic basis of monarch butterfly migration (Markert et al., 2016). Examples of the latter include improving quality of life for livestock (Young et al., 2019) and fighting human parasitic nematodes (Gang et al., 2017). The choice is also ethical and practical: species whose populations are threatened in the environment or cannot be reasonably maintained in a laboratory are poor choices.
We and others have been interested in studying both the basic biology and public health consequences of mosquitoes. There are over 3500 species of mosquitoes spread across two subfamilies and 113 genera, and mosquitoes are found on every continent except Antarctica (Mosquito Taxonomic Inventory: http://mosquito-taxonomic-inventory.info/simpletaxonomy/term/6045). Females of some mosquito species blood-feed on vertebrate hosts. The deadliest mosquitoes are those that prefer humans over non-human animals, including the anopheline vectors of human malaria pathogens and the aedine vectors of arboviruses. When we set out to select a species for our investigation of the neurobiology of mosquito attraction to humans, we faced the fundamental choice between the malaria mosquito Anopheles gambiae and Ae. aegypti. From a public health perspective, An. gambiae was the bigger killer, but it is also more difficult to rear in the laboratory and, at the time, was less amenable to genetic manipulation. Aedes aegypti was also comparatively understudied in recent years and therefore became our choice.
Step 2: learn how to rear and work with the organism in the lab
Most of the following steps toward building a model organism require that your species of interest can be bred in the laboratory. There will be important practical considerations, including environmental conditions such as temperature and humidity, space needs, what to feed your species and how to ensure that mating happens normally. Some species have exotic dietary or environmental needs or complicated mating rituals that make laboratory breeding impossible. Other considerations include seasonal mating or diapause requirements, the ability to harvest embryos and the overall generation time of your prospective model. It is impossible to list all of the potential hurdles here, and we recommend consulting broadly with scientists in the field before you attempt to colonize a non-model species. If your species cannot breed in the laboratory, it may be possible to generate somatic mutations or disrupt gene expression in a wild-caught individual, but in these cases, the possibilities for systematic mechanistic insight will be limited.
Step 3: assemble its genome and profile its gene expression (RNA-seq)
Modern genetics is difficult without a genome. A well-assembled and well-annotated genome provides a complete list of genes and their position on chromosomes, and enables all of the genetic manipulation strategies we detail below. The good news is that technological advances in next-generation sequencing can now deliver high-quality genomes in months, rather than years, and for thousands of dollars rather than hundreds of millions of dollars. In particular, recent improvements in the length and accuracy of sequencing reads and reductions in the DNA input requirements for long-read technologies, including Pacific Biosciences (PacBio) and Oxford Nanopore, coupled with improvements in bioinformatics tools to handle these data, have put high-quality genome assemblies within reach for many organisms (Kingan et al., 2019; Miller et al., 2018; Sedlazeck et al., 2018).
Because a genome is a pre-requisite for most genetics, we began our work in Ae. aegypti shortly after its genome became available. A draft assembly of the Ae. aegypti genome was published in 2007 (Nene et al., 2007), revealing a large (>1.2 Gb) and repetitive (>65% repetitive sequence) genome. This genome is almost 10 times larger than that of D. melanogaster and 5 times larger than that of An. gambiae (Fig. 1A). Because of limitations in sequencing and assembly technology at the time, this genome was highly fragmented, represented by over 32,000 small ‘contigs’, of which fewer than half were mapped to chromosomes.
The poor quality of the 2007 draft genome made it difficult to make progress in this species. We therefore formed the Aedes Genome Working Group in 2015 as an international consortium across industry and academia to re-sequence, re-assemble and re-annotate the Ae. aegypti genome using a variety of new sequencing technologies (Harmon, 2016) (Fig. 1B). The final assembly, AaegL5, was generated from 80 male pupae from a single-pair cross from a laboratory strain. Long-read PacBio sequencing was used to generate a primary assembly, and chromosome conformation capture (Hi-C) scaffolding techniques were used to identify alternative haplotypes and to order and orient >92% of the genome assembly onto the three chromosomes (Matthews et al., 2018). The resulting assembly was substantially more complete and contiguous than previous assemblies. Next-generation RNA sequencing (RNA-seq) data in publicly available datasets, drawn from different tissues and developmental stages, was used as input for automated gene-set annotation. A significant amount of additional manual annotation and curation dramatically expanded our understanding of the genomic repertoire of chemosensory receptors and other important multi-gene families.
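Contiguity of an assembly is commonly summarized with the N50 statistic: the contig (or scaffold) length at which contigs of that length or longer account for at least half of the total assembled bases. The text above does not report this metric, so the following is only an illustrative sketch of how one might compare a fragmented and a contiguous assembly; the function name and the contig lengths are hypothetical and are not taken from the Ae. aegypti assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given a list of contig lengths (bp).

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assemblies of the same 32-bp "genome" (illustrative only).
fragmented = [2, 2, 2, 3, 3, 4, 8, 8]
contiguous = [20, 12]
print("fragmented N50:", n50(fragmented))
print("contiguous N50:", n50(contiguous))
```

A higher N50 for the same total length indicates that more of the genome sits in long, uninterrupted pieces, which is what makes gene models and regulatory regions easier to annotate.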
Genomes are the gold standard for gene discovery but if it is not possible for budgetary or other reasons to generate a reference genome, one can use RNA-seq, including the generation of ‘de novo’ transcriptomes (Hölzer and Marz, 2019), to obtain rich information about transcript sequence and expression in a given tissue. In Drosophila, large datasets and publicly minable resources such as FlyAtlas (Chintapalli et al., 2007; Leader et al., 2018) have allowed researchers to quickly identify relevant patterns of gene expression on a genome-wide scale. This technology is more accessible and affordable than ever, and in the mosquito, transcriptomic data have been used to profile the dynamics of gene expression within specific tissues between male and female mosquitoes, across developmental stages, and in different physiological states such as blood-feeding or arboviral infection status (Akbari et al., 2013; Etebari et al., 2017; Koh et al., 2018; Matthews et al., 2018, 2016; Tallon et al., 2019).
A comprehensive transcriptomic characterization of male and female antenna, proboscis, maxillary palps, legs and brain identified a ‘parts-list’ of specific receptors and other genes expressed in sensory and neural tissues (Matthews et al., 2016), focusing the search for candidate receptors involved in mosquito host seeking (Raji et al., 2019) and egg laying (Matthews et al., 2019) behaviors (Fig. 1C,D). Further, this study identified sexually dimorphic transcripts in the mosquito brain and sensory tissues, and those that were regulated by blood-feeding state (Matthews et al., 2016). An important future goal will be to refine our understanding of gene expression within single cells to better understand the logic of how molecularly defined cell types form the neural circuits that generate mosquito behavior.
Step 4: develop a method to introduce genetic material into the organism (injection, electroporation, ballistic, viruses, ReMOT)
Genetic modification requires the introduction of reagents that alter genes or gene expression into the species of interest. If the goal is to have heritable changes, the reagents must be introduced into germline cells. There are a number of routes to this end. Insect transgenesis is traditionally performed by microinjection of DNA, RNA and/or recombinant protein into developing syncytial embryos. Germline transformation protocols for Ae. aegypti and other mosquitoes have existed since the 1980s (McGrane et al., 1988; Miller et al., 1987; Morris et al., 1989) and video protocols are available for practical learning [Jasinskiene et al., 2007; see Vosshall lab mosquito (Aedes aegypti) embryo injection protocol, https://www.youtube.com/watch?v=nDo4_2lZcbM]. In some cases, eggs are too fragile or too tough for physical injection and other routes must be tried. Reagents can be ballistically introduced (Yuen et al., 2008) or electroporated (Ando and Fujiwara, 2013) into tissue or delivered via paratransgenesis (Wilke and Marrelli, 2015). A recent innovation (ReMOT control) involves injecting proteins of interest fused to yolk proteins into the female ovary, where they are taken up into developing eggs (Chaverra-Rodriguez et al., 2018).
If eggs are available and amenable to injection but rearing subsequent generations is difficult, there are reports of homozygous mutations being obtained such that somatic phenotypes can be studied in adults injected as embryos (Watanabe et al., 2012). Notably, this somatic mutagenesis strategy has been successful in dissecting the signaling pathways underlying the complex patterning of butterfly wings (Zhang et al., 2017).
Step 5: make transgenic animals (transposons)
Before the advent of precision genome editing, genetic modification was limited to integrating and remobilizing foreign DNA within host genomes for the purposes of mutating genes and driving the expression of various effectors and markers. This technology has the benefit that it is highly efficient and affordable and has been a tremendous driver of progress in D. melanogaster for nearly 40 years (Rubin and Spradling, 1982). Transposon-mediated transgenesis adapts natural genetic mobile elements called transposons that use special repeat sequences and an enzyme called transposase to integrate at near-random sites in genomes. By cloning genes of interest between the long terminal repeats of a transposable element and providing exogenous transposase, artificial constructs can be integrated into chromosomes for the purpose of genome engineering. A number of different transposon/transposase systems exist, and they are each more or less efficient in different species (Fraser, 2012). The specific source of transposase should be carefully considered and empirically tested. Delivering mRNA or recombinant protein may prove more efficient than plasmids with heat-shock or ubiquitous promoters, for example. Furthermore, hyperactive transposase variants with improved activity have been described and may be of use in some systems (Eckermann et al., 2018; Otte et al., 2018).
In general, transposon-mediated transgenesis can be highly efficient, limiting the number of embryos that need to be injected to produce a desired line. However, integration is essentially random, although some transposon/transposase systems do preferentially integrate at specific ‘hot-spots’ across the genome. This can lead to significant position-dependent differences in gene expression pattern and timing and makes it very difficult to compare across different constructs and different integration events. One potential solution to this problem is the adaptation of site-specific integrase systems, such as phiC31 (Labbe et al., 2010; Nimmo et al., 2006), though this requires a separate step; namely, the generation and characterization of donor sites to optimize expression for the desired tissues and time points (Labbe et al., 2010). Furthermore, a transgene will require specific promoter/enhancer sequences to be faithfully expressed in a tissue of interest. In a new model organism and particularly one with a large and repetitive genome, it is often difficult to identify such regulatory elements de novo and often promoter/enhancers from distant species will not express reliably or at all. Identifying and harnessing effective and robust regulatory sequences is an important challenge for the field to solve.
Step 6: knock-down and knock-out genes (RNAi, ZFN, TALEN, CRISPR-Cas9)
Before site-specific genome editing revolutionized modern biology, RNA interference (RNAi) was widely used to disrupt gene expression. When deployed via a transgenic construct expressed in vivo, this technique has the strong benefit that specific segments of RNA can be targeted for degradation, for instance if a gene is alternatively spliced. Importantly, RNAi knock-down can be restricted to a tissue or developmental time of interest. In the mid-2000s, Barry Dickson and colleagues in Vienna developed a genome-wide RNAi library of live D. melanogaster strains that could be used to knock down expression of any gene in any tissue at will (Dietzl et al., 2007). Over one million strains from this collection have been distributed as of this year and the technique revolutionized the field. In Caenorhabditis elegans, knocking down genes is as easy as feeding these nematodes Escherichia coli expressing an RNAi hairpin. A genome-wide library of such bacterial strains has been constructed (Kamath and Ahringer, 2003) and is broadly used by investigators worldwide. In non-model organisms, there has not been and likely will never be such a genome-wide initiative. In most cases, the user community is too small to support the cost of such an initiative and the difficulties in maintaining thousands of strains are insurmountable.
In non-model organisms, the typical route of introducing RNAi for gene knock-down has been by injection at earlier developmental stages and the analysis of gene knock-down in adults. While cheap and flexible, the major problem with this approach is that there is an uncertain and limited window of efficacy. For example, after injection into larvae, RNAi may carry over into adults and achieve knock-down of a gene of interest for only a few days. This problem can be potentially mitigated by intrathoracic injection into adults (Drake et al., 2012; Kang et al., 2018), but RNAi in general is limited by variability in the degree and duration of knock-down. Because the knock-down is variable as a result of differences in injection site and efficacy, each animal that is injected must be analyzed separately and the technique does not scale readily. Furthermore, these manipulations are not heritable, precluding large-scale analysis. To overcome some of these limitations, it has been demonstrated that delivering RNAi reagents as transgenes, as was done in Drosophila, is possible in mosquitoes, and there have been some key successes such as developing a transgenic RNAi strategy against DENV-2 (Franz et al., 2006, 2014). The potential advantage of transgenic RNAi is that you can limit it to tissues of interest, but all the caveats about timing and degree of knock-down are still there, just in a different form.
Unlike transposase-mediated transformation, in which exogenous DNA is inserted randomly throughout the genome, site-specific genome-editing reagents enable precise and flexible manipulation of specific genomic loci. While different in execution, each class of genome-editing reagent has the same goal: to generate a double-strand break at a specific location(s) in the genome that is then resolved by the host cell, either through mechanisms of homology-directed repair (see step 7, below) or, more commonly, through non-homologous end-joining, which can create small insertions or deletions (indels) or other types of non-specific mutations that may disrupt gene function. When injected into the developing mosquito embryo, these reagents will generate mutations in each cell independently and the surviving injected embryos will give rise to adults with a mosaic genotype. If cells destined for the mature germ line are successfully edited, some proportion of the edited individual's offspring may carry a mutant allele in a heritable fashion.
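One reason a small indel from non-homologous end-joining may or may not disrupt gene function is the reading frame: within a protein-coding sequence, a length change that is not a multiple of three shifts the frame of everything downstream, whereas a multiple of three adds or removes whole codons. The sketch below classifies an edited allele on that basis alone. It assumes the edit lies entirely within coding sequence and ignores splice sites, premature stop codons and other effects; the function name and the example sequences are hypothetical.

```python
def classify_indel(reference: str, edited: str) -> str:
    """Classify an edited coding-sequence allele by its net length change.

    Assumption (not from the text): the edit lies wholly inside a
    protein-coding exon, so only the reading frame is considered.
    """
    delta = len(edited) - len(reference)
    if delta == 0:
        return "no net length change"
    kind = "insertion" if delta > 0 else "deletion"
    # Python's % returns a non-negative result for a positive modulus,
    # so this test works for deletions (negative delta) as well.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{abs(delta)}-bp {kind} ({frame})"


# Hypothetical alleles recovered from offspring of an injected animal.
reference = "ATGGCTAGCTTCGGA"
print(classify_indel(reference, "ATGGCTTCGGA"))         # 4 bp removed
print(classify_indel(reference, "ATGGCTAGCAAATTCGGA"))  # 3 bp added
```

In practice, alleles would be called from sequencing reads aligned to the reference; comparing lengths as done here is only meant to show the frame logic.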
A number of site-specific genome-editing reagents have been applied to mosquitoes. These include zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 RNA-guided nucleases. ZFNs were first described in 1994 as chimeric ‘designer nucleases’ composed of a DNA-binding domain [an array of zinc-fingers, each recognizing a three base pair (bp) motif] and an otherwise non-sequence-specific DNA nuclease domain from the naturally occurring FokI restriction endonuclease (Kim and Chandrasegaran, 1994). When paired, two ZFNs targeting each strand of DNA can generate site-specific double-strand breaks through heterodimerization of tethered FokI domains. TALENs are conceptually similar chimeric proteins consisting of a DNA-binding domain originally derived from a Xanthomonas plant pathogen and a non-specific FokI nuclease domain (Christian et al., 2010; Li et al., 2011). ZFNs were first applied in Ae. aegypti to manipulate several candidate genes involved in mosquito sensory perception (Corfas and Vosshall, 2015; DeGennaro et al., 2013; McMeniman et al., 2014) and behavioral regulation (Liesch et al., 2013). Similarly, TALENs were successfully deployed in Ae. aegypti, demonstrating that they too could be used to generate targeted double-strand breaks (Aryan et al., 2013, 2014). However, ZFNs and TALENs are difficult to scale and can be costly and time intensive to engineer as the rules governing protein–DNA interactions are incompletely understood and can vary depending on the context of the surrounding DNA sequence.
Since 2013, genome editing in Ae. aegypti, as in many other organisms, has primarily relied on a class of RNA-guided nuclease reagents derived from CRISPR and CRISPR-associated (Cas) genes. The bacterial type II-A CRISPR-Cas9 system from Streptococcus pyogenes was the first to be adapted as a flexible genome-editing tool and remains the most commonly used system to date (Doudna and Charpentier, 2014). In its reduced form, CRISPR-Cas9 uses short RNAs in the form of a tracrRNA:crRNA complex or a single guide RNA (sgRNA) in complex with a non-specific DNA nuclease, Cas9. The requirement for CRISPR-Cas9 binding to genomic DNA is the presence of a protospacer-adjacent motif (PAM) and 17–20 bp of RNA–DNA complementarity immediately 5′ to the PAM. For S. pyogenes Cas9, the PAM is NGG, where N is any DNA base. This motif occurs approximately once every 17 bp in the Ae. aegypti genome, meaning that CRISPR-Cas9 can be used to target nearly every gene.
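The targeting rule described above (a protospacer immediately 5′ to an NGG PAM) can be turned into a simple scan for candidate S. pyogenes Cas9 sites. The following is a minimal sketch, not a published design tool: the function names are hypothetical, the protospacer length defaults to 20 bp (the text gives a range of 17–20 bp), and no scoring of predicted activity or off-target matches is attempted.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_spcas9_targets(seq: str, protospacer_len: int = 20):
    """List candidate SpCas9 sites as (strand, start, protospacer, pam).

    A site is a protospacer of `protospacer_len` bases followed directly
    by an NGG PAM. Both strands are scanned; `start` is the 0-based offset
    of the protospacer within the scanned strand (for '-', within the
    reverse complement of the input).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    targets = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(s):
            targets.append((strand, match.start(), match.group(1), match.group(2)))
    return targets


# Hypothetical 23-bp sequence containing exactly one site on the + strand.
for site in find_spcas9_targets("A" * 20 + "TGG"):
    print(site)
```

Because sgRNA activity varies considerably between nearby sites, a scan like this would typically be used only to enumerate candidates, several of which are then tested empirically.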
Several protocols have been described for CRISPR-Cas9 editing in Ae. aegypti (Basu et al., 2015; Dong et al., 2015; Kistler et al., 2015). These initial protocols share a common mode of delivery for Cas9, sgRNA and donor DNA, through micro-injection of embryos. More recently, transgenic strains have been developed that express Cas9 in the Ae. aegypti germline (Li et al., 2017) and novel methods are being actively developed to directly deliver Cas9/sgRNAs to developing embryos within the adult ovary through intrathoracic injections (ReMOT) (Chaverra-Rodriguez et al., 2018). However, to date, donor DNA cannot be used with this strategy to support homology-directed repair. Despite the abundance of CRISPR-Cas9 recognition sites in the genome, there is considerable variability in the activity of sgRNAs, even when targeting nearby sites in the genome (Basu et al., 2015; Dong et al., 2015; Kistler et al., 2015). This could be due to a number of factors including chromatin configuration, mismatches between the reference genome and the injected strain, and variability in activity due to base composition or other properties of the target sequence. Thus, it is critical to design and test several sgRNAs targeting a desired locus to ensure editing at sufficient levels.
Step 7: develop precise mutagenesis for tagged mutants, and gene replacements
Homology-directed repair allows for flexible and precise genome editing to achieve a variety of outcomes. One straightforward use of homology-directed repair is to introduce a visible marker to disrupt the coding sequence of a gene, creating a loss-of-function mutation that can be positively screened for under a microscope instead of by molecular genotyping (Liesch et al., 2013; McMeniman et al., 2014). Outside of the lab, this allowed for a mixed release and capture experiment in a controlled semi-field environment to determine whether marked animals lacking the ability to sense CO2 would show deficits in their ability to detect human beings (McMeniman et al., 2014). Additionally, small oligonucleotides have been used as donor DNA to introduce smaller sequences, such as cassettes containing stop codons and restriction enzyme sites for ease of genotyping (Kistler et al., 2015) or attP ‘landing sites’ so that a locus can be subsequently targeted for further modification using a site-directed integrase (Raji et al., 2019). Homology-directed repair can be used to introduce exogenous sequence in-frame, either in the middle of a coding sequence (Fig. 2B) or as a stop codon replacement (Fig. 2C), to produce protein tags or multiple expressed mRNAs from a single endogenous locus (Matthews et al., 2019). Further cases of homology-directed repair following genome editing involve introducing defined point mutations, swapping alleles of protein-coding genes from one strain or species to another (Lamb et al., 2017), the generation of homing ‘gene drives’ (Esvelt et al., 2014) and many other applications. Given the broad utility of these types of manipulations, efforts to increase the frequency of successful and precise homology-directed repair are of great interest to the mosquito genetics community and remain an area of active research.
Step 8: develop binary effector systems (Gal4, LexA, Q)
Drosophila melanogaster has been a prolific model for neuroscience, primarily because of the genetic tools and reagents that have been developed over more than a century (Hales et al., 2015). The success of employing genetics to probe the circuits that generate and regulate behaviors in D. melanogaster provides a roadmap for developing similar tools to study biological phenomena in mosquitoes and other non-traditional model insects. We propose that the availability of a high-quality reference genome, comprehensive transcriptomic resources and easy-to-use genome-editing protocols in Ae. aegypti provides the necessary foundation for implementing and adapting genetic tools in common use in D. melanogaster, specifically binary expression systems such as Gal4/UAS (Brand and Perrimon, 1993), LexA/LexAop (Lai and Lee, 2006) and QF/QUAS (Potter et al., 2010). These binary systems, though derived from different organisms, share a common theme – an exogenous transcriptional activator (Gal4, LexA, QF) with no known binding sites in the host genome is expressed in a specific spatial and temporal pattern (Fig. 2B,C). When crossed to a reporter strain (Fig. 2D), the transcriptional activator will induce the expression of an arbitrary transgene downstream of the recognition sequence for this artificial transcriptional activator (UAS, LexAop, QUAS). The development of these binary systems will allow the expression of a library of arbitrary transgenic constructs in specific temporal and spatial patterns (Fig. 2E,F). It should be noted that problems can arise when using these powerful tools, such as the potential for altering endogenous gene expression (Liu and Lehmann, 2008) or causing lethality or other fitness effects when transgenes are broadly expressed (Potter et al., 2010; Riabinina et al., 2015).
Despite the availability of genomic and transcriptome resources, genome-editing tools and binary expression systems, significant challenges remain when developing useful reagents for studying neural circuits and behavior. Cleanly and reliably restricting the expression of the transcriptional activator to molecularly defined cell types in the appropriate time windows is essential to interpret the results of an experiment. Promoter fusions and other methods that utilize regulatory genomic fragments have been used to great effect to generate cell type-specific expression patterns in D. melanogaster, exemplified by large-scale collections generated at institutions like the HHMI Janelia Research Campus that aim to generate reagents that target every neuronal cell type in the fly brain (Dionne et al., 2018; Jenett et al., 2012).
However, the Ae. aegypti genome is nearly an order of magnitude larger than that of D. melanogaster (Matthews et al., 2018), making it more difficult to identify and deploy relevant and reliable regulatory elements. Further, the integration site has a significant effect on transgene expression, in terms of both magnitude and pattern, and the field has not yet moved to a site-specific integration system, such as phiC31, that would make it possible to compare transgenes expressed from the same location in the genome. Finally, efforts to generate broadly expressed drivers to understand neural circuits and activity in Ae. aegypti result in widespread expression across multiple cell types including neurons, glia and other non-neuronal cells, making it difficult to interpret anatomical and functional data generated with these reagents in the context of neural circuits (Bui et al., 2019). Thus, we believe that, for now, cell type-specific transgenic reporters generated through gene traps of specific marker genes (Fig. 2B,C) are a promising method of generating expression patterns with the necessary specificity to understand neural circuit function in the context of complex behaviors.
Step 9: generate effectors of interest (label, image and manipulate cells)
The development of binary systems makes it possible to characterize the neuroanatomy of molecularly defined neural circuits and to visualize and manipulate activity within these circuits to understand how they are linked to behavior. A panel of driver lines with cell type-specific expression can be generated and crossed to a library of transgenic reporter strains, enabling combinatorial expression of arbitrary transgenes across different populations of cells. These reporter strains encode transgenes that allow researchers to visualize neuronal activity (using genetically encoded calcium or voltage sensors; Simpson and Looger, 2018), trace neuronal circuits using fluorescent reporters (Dunst and Tomancak, 2019), and manipulate neuronal activity with spatial, molecular and temporal precision using thermal, optical or chemical stimuli (Bernstein et al., 2012; Chin et al., 2018; Tobin et al., 2002). The development of new reporters to visualize changes in other cellular messengers like dopamine (Sun et al., 2018) or cAMP (Hackley et al., 2018) will enable an increasingly comprehensive understanding of circuit state and function in behaving animals. The beauty of a binary system is its combinatorial nature and thus its flexibility. New reporter strains can be generated quite easily, either at quasi-random positions through transposon-mediated transgenesis or at precise genomic loci using CRISPR-Cas9 followed by homology-directed repair or by using targeted integration systems like phiC31. These new reporters can be used immediately with the existing library of driver lines, so the genetic toolkit grows swiftly. Building on the progress made in the last several decades in model organisms to develop, test and refine genetic tools for studying neuroscience will allow the mosquito research community to quickly adopt sophisticated approaches while minimizing the number of time-consuming and costly iterations required to generate and validate each new reagent.
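The combinatorial payoff of a binary system can be made concrete with a small counting sketch. The strain names below are hypothetical placeholders, and the sketch assumes that every driver and reporter belongs to the same binary system, so that any driver can be paired with any reporter.

```python
from itertools import product

# Hypothetical strain names, for illustration only.
drivers = ["driver_A", "driver_B", "driver_C"]
reporters = ["calcium_sensor", "fluorescent_tracer"]


def possible_crosses(drivers, reporters):
    """Every driver x reporter pairing available from the strains on hand."""
    return list(product(drivers, reporters))


before = possible_crosses(drivers, reporters)
# Adding one reporter strain yields one new experiment per existing driver.
after = possible_crosses(drivers, reporters + ["optogenetic_actuator"])

print(len(drivers) + len(reporters), "strains ->", len(before), "crosses")
print("one new reporter adds", len(after) - len(before), "crosses")
```

Under these assumptions, the number of strains to build and maintain grows additively while the number of possible experiments grows multiplicatively, which is why each new reporter is immediately useful with every existing driver line.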
Step 10: grow a field with interesting questions using your new model organism
Model organisms will always play an important and irreplaceable role in the biological sciences, as the available tools and resources are unmatched and facilitate science at a large scale, such as genome-wide mutagenesis studies and connectomic descriptions of the wiring of entire nervous systems. However, as the barriers to working directly in new organisms continue to fall, we believe that the primacy of the ‘model organism’ will be complemented by the unrivaled opportunities that come from embracing the biological diversity of different species, combining sophisticated genetic tools with Krogh's principle that for every biological problem there is a species most suited for its study (Lindstedt, 2014).
By enabling genetic experimentation in new organisms, the scientific community can exploit the diversity of behaviors across the animal kingdom to better understand general principles of evolution and brain function. There is a powerful history of identifying and characterizing ‘specialists’ to better understand the neural mechanisms of behavior: echolocating bats, vocal-learning songbirds, socially communicative bees, long-distance navigating dung beetles. Enabling genetic studies in these species of interest and expanding the breadth and depth with which researchers can sample individual branches of the evolutionary tree will allow new and compelling comparative approaches to understanding the evolutionary mechanisms that shape the genes and circuits that generate behavior. As one example, the recent development of comprehensive genomic resources and modern genetic tools in the Drosophila subgroup, comprising over a dozen species closely related to the ‘model’ D. melanogaster, has unlocked the ability to identify subtle changes in specific genes and neural circuits underlying critically important behaviors, including the identification of food sources and the species-specific identification of mates, along with other evolutionary signatures of these behaviors (Auer et al., 2019 preprint; Khallaf et al., 2019 preprint; Seeholzer et al., 2018; Stern et al., 2017).
For our neurobiology research programs, there are additional practical benefits to making mosquitoes into model organisms. Mosquitoes are dangerous vectors of disease, and understanding the biology of their attraction to human hosts and to egg-laying sites is important. We have used genetic tools to study sensory receptor genes and pathways to define the molecular signal transduction machinery of host cue detection and egg-laying site evaluation (DeGennaro et al., 2013; Matthews et al., 2019; McMeniman et al., 2014). We have also used these tools to study mechanisms of action of insect repellents (DeGennaro et al., 2013; Dennis et al., 2019) and mosquito appetite modulation (Duvall et al., 2019). It is gratifying to see the field grow, and we welcome efforts from others to understand the biology of these deadly insects.
Conclusion and future prospects
Although the field has made striking advances in building mosquitoes into a genetic model organism, challenges remain. The big 6 have benefited and continue to benefit from centralized resources to house and distribute reagents, data and other resources. Examples of these include the Bloomington Stock Center for Drosophila and Addgene for DNA reagents, which make reagents widely available at low cost to laboratories around the world. For these and many other reasons, we think that the traditional model organisms will continue to hold a vital role in biological discovery, but we urge the research community to consider the advantages of developing new models as well. BEI and VectorBase partially fill the need for such centralized resources in the mosquito community, but more are needed. Centralized databanks and stock centers are chronically underfunded, understaffed and overlooked, and this needs to change. Although CRISPR-Cas9 mutagenesis is now routine in most species, improvements are urgently needed to increase the efficiency of CRISPR-driven homology-directed repair. Techniques that think ‘outside the microinjection box’, like ReMOT control (Chaverra-Rodriguez et al., 2018), will be extremely helpful for opening up genome editing in new species where it can be difficult to collect or inject eggs. Drosophila genetics has benefited enormously from the availability of balancer chromosomes that suppress recombination (Miller et al., 2019). The general availability of balancer chromosomes in other species would dramatically speed up genetic manipulation because they allow investigators to stabilize mutations and maintain and follow transgenic lines without molecular genotyping.
All of our work to date has involved ‘reverse genetics’, in which previously identified candidate genes are manipulated and specific phenotypes investigated. Building the infrastructure to do ‘forward genetics’, by screening for phenotypes in chemically or genetically mutagenized mosquito populations, would be a huge boon for the field. Large strain collections in bacteria, yeast, worm, fly and mouse carrying engineered mutations, small deletions or transposon insertions (enhancer traps, gene traps) have dramatically simplified the process of linking genotype to phenotype in these species. There are serious challenges to doing forward genetics in Ae. aegypti: the absence of balancers, difficulty in homozygosing even long-colonized lab strains, and the large and repetitive genome that makes it difficult to pinpoint the gene responsible for the phenotype. Nonetheless, this remains an important aspirational goal for the field. In this review, we have described our own decade-long journey to build the mosquito Ae. aegypti into a genetic model organism, and we hope that our example will inspire scientists to pursue mechanistic genetics-driven studies in new and exciting species far outside the big 6.
We thank Meg Younger and Zachary Gilbert, who together with the Aedes Toolkit Group led the generation of genetic tools described here. Chris Potter shared reagents ahead of publication. We thank the Insect Transgenic Facility (ITF) at the University of Maryland and its director, Rob Harrell, for working on protocol development and doing injections to generate reagents. We apologize to all of our colleagues whose work we were unable to cite due to limitations in scope and space.
This work was supported in part by grant UL1 TR000043 from the National Center for Advancing Translational Sciences [NCATS, National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) program]. This work was supported by unrestricted start-up funds provided to B.J.M. by the Department of Zoology and the Faculty of Science at the University of British Columbia. L.B.V. is an investigator of the Howard Hughes Medical Institute.
The authors declare no competing or financial interests.