Xenacoelomorpha is, most probably, a monophyletic group that includes three clades: Acoela, Nemertodermatida and Xenoturbellida. The group still has contentious phylogenetic affinities; though most authors place it as the sister group of the remaining bilaterians, some would include it as a fourth phylum within the Deuterostomia. Over the past few years, our group, along with others, has undertaken a systematic study of the microscopic anatomy of these worms; our main aim is to understand the structure and development of the nervous system. This research plan has been aided by the use of molecular/developmental tools, the most important of which has been the sequencing of the complete genomes and transcriptomes of different members of the three clades. The data obtained has been used to analyse the evolutionary history of gene families and to study their expression patterns during development, in both space and time. A major focus of our research is the origin of ‘cephalized’ (centralized) nervous systems. How complex brains are assembled from simpler neuronal arrays has been a matter of intense debate for at least 100 years. We are now tackling this issue using Xenacoelomorpha models. These represent an ideal system for this work because the members of the three clades have nervous systems with different degrees of cephalization; from the relatively simple sub-epithelial net of Xenoturbella to the compact brain of acoels. How this process of ‘progressive’ cephalization is reflected in the genomes or transcriptomes of these three groups of animals is the subject of this paper.
INTRODUCTION
Phylogenetic affinities of the members of Xenacoelomorpha
Acoel flatworms together with the nemertodermatids constitute the taxon Acoelomorpha: a group of mostly marine worms with a simple body plan. With the xenoturbellids they constitute the phylum Xenacoelomorpha. Acoelomorphs are the best-characterized clade within this phylum and for that reason will be discussed in detail in the following paragraphs. The history of the classification of acoelomorphs has been complex and their assumed affinities within the metazoans have changed. Traditionally (in precladistic analysis) the Acoelomorpha were included within the phylum Platyhelminthes, which was considered the most basal bilaterian taxon (Hyman, 1951). The introduction of molecular characters (18S rDNA) to build phylogenetic trees showed, for the first time, that Platyhelminthes were polyphyletic, with the Acoela and the Catenulida branching as the first two clades of bilaterians and the Rhabditophora nesting within the protostomes (Carranza et al., 1997). A deeper analysis using scores of acoels, some slow-evolving, recovered the three major bilaterian superphyla currently agreed upon (Deuterostomia, Ecdysozoa and Lophotrochozoa) and showed that the Acoela were the sister group of the remaining Bilateria (Ruiz-Trillo et al., 1999). The position of acoels seemed to be clearly established, turning them into good proxies for an ancestral bilateral animal (Baguñà and Riutort, 2004). This position was confirmed in some exhaustive phylogenomic analyses (Hejnol et al., 2009). However, using alternative methodologies and datasets, Philippe and collaborators proposed (Philippe et al., 2007; Philippe et al., 2011) that Acoela could instead be a deuterostomian group, with affinities to the Ambulacraria (Echinoderms and Hemichordates).
Those authors suggested that Acoelomorpha sit within the Ambulacraria, grouped with another taxon, the Xenoturbellida, and thus erected a new monophyletic group, the Xenacoelomorpha, a classification also supported by some morphological data (Tyler and Rieger, 1975; Tyler and Rieger, 1977; Ehlers, 1985; Smith et al., 1986; Lundin and Hendelberg, 1996; Lundin, 1997; Lundin, 2000; Lundin and Sterrer, 2001). Now, these alternative views, one placing the Acoelomorpha as basal bilaterians and the other placing them within the Deuterostomia, continue to provoke debate. However, and since Xenacoelomorpha seems clearly to be a monophyletic taxon, the inference of specific trends within the group remains immune to the specific problems posed by its metazoan affinities.
The acoel nervous system in context
For many years, the structure of the nervous system has been considered to be important for ascertaining phylogenetic information (Haszprunar, 1996). The Acoelomorph nervous system is a pertinent case here because it shows a high degree of variability within the group. It is characterized, according to some authors, by a low degree of centralization in the anterior domain (Raikova et al., 2004a; Hejnol and Martindale, 2008; Raikova, 2008). In the past 10 years, several studies using electron microscopy and immunocytochemistry have added to our knowledge of acoel neuroanatomy. The use of a single set of immunochemical markers, such as 5HT, GYIRF-amide and FMRF-amide has made analysis of the nervous system in this group possible (Raikova et al., 2004a; Achatz and Martinez, 2012). The systematic use of these immunochemical markers, in particular antibodies against 5-HT, led some authors to propose that acoels have a simple ‘commissural brain’ (Raikova, 2004c). However, since it is well known that these antibodies only label particular groups of cells, leaving the rest of the nervous system unstained (Grimmelikhuijzen et al., 2002; Cebrià, 2008), it is necessary to use complementary methodologies to characterize the detailed ultrastructure. The information provided by all these different studies has shown that the nervous system of acoels has a high degree of structural variability, although all share a general pattern: the possession of an intraepidermal plexus and three to five pairs of similar neurite bundles that are regularly distributed and spaced around the anterior–posterior body axis.
There are a few morphological features shared by acoels, nemertodermatids and xenoturbellids; for instance, the lack of a stomatogastric system and the presence of a pervasive peripheral plexus (Raikova et al., 1998; Raikova et al., 2000a; Raikova et al., 2001; Reuter et al., 2001a; Reuter et al., 2001b). The structural characteristics of the nervous system in the different taxa allow us to infer correlations (trends) between their neuroanatomy and their relative taxonomic positions within the group. Xenoturbellida is the most basal clade of Xenacoelomorpha. Its neuroanatomy is the simplest, consisting of a well-developed basal intraepidermal nerve net without any submuscular nervous structures (Fig. 1) (Raikova et al., 2000b). No areas of obvious condensation have been detected in Xenoturbella, yet the animals possess a statocyst (Israelsson, 2007). The Nemertodermatida nervous system comprises, in addition to a subepidermal nervous plexus, a ring of processes located at the statocyst level, which is suggestive of what could be the initial stages of the evolutionary process leading to an anterior concentration of neurons (Fig. 2) (Raikova et al., 2004b). A quick comparison of the general neuroanatomy within the different members of Acoela – we follow the detailed description given by Achatz and Martinez (Achatz and Martinez, 2012) – reveals an increment in the complexity of neural structures, particularly in the most recent groups. In the most basal family of acoels, the Diopisthoporidae, the nervous system is organized as a ring-shaped commissure with two ganglionic lobes posterior to the statocyst, and a smaller ring located more anteriorly. These two rings are linked by ventral tracks (Fig. 1) (Tyler, 2001; Achatz and Martinez, 2012).
The data obtained for Paratomella rubra, which is a member of the Paratomellidae, a group that branched later and is considered the sister group of Bursalia, indicate that the nervous system consists of a dense plexus around the statocyst with two neural extensions that connect, dorsolaterally, two rings and a pair of dorsolateral neurite bundles that extend from the posterior to the end of the animal (Fig. 1) (Crezée, 1978). The clade Bursalia is divided in two major groups, the Crucimusculata and the Prosopharyngida. In the latter group, the nervous system is organized in one to three ring commissures plus eight neurite bundles (Fig. 1) (Crezée, 1975). The sister group of Prosopharyngida is Crucimusculata (Jondelius et al., 2011), a clade that includes all the named ‘higher acoels’. This clade, which includes some of the most studied acoels (e.g. Isodiametra pulchra or Symsagittifera roscoffensis), shows a more-complex nervous system, with anterior concentrations, a bilobed brain with a cellular cortex and a dense internal neuropile (Fig. 1) (Achatz and Martinez, 2012; Achatz et al., 2013). Thus, the condensation seen in some acoels, particularly those belonging to the Crucimusculata, is clearly an evolutionary derived character within the Xenacoelomorpha (Hejnol et al., 2009; Philippe et al., 2011).
Since the xenoturbellids represent the most basal clade within the phylum Xenacoelomorpha, the presence of an intraepithelial nerve net of neurones (Raikova et al., 2000b) is now regarded as ancestral – it is also found in the sister group of the Bilateria: the Cnidaria. The nemertodermatids, instead, have an anterior concentration of neurons and processes in the form of a ring commissure; a condition that could be interpreted as intermediate, in terms of structural complexity, between those found in xenoturbellids and the acoels (mostly those belonging to Crucimusculata). Thus we could visualize the evolution of xenacoelomorphs as following a trend towards the anterior concentration of the nervous system. Unfortunately, very little is known about the embryological origin of the nervous system in Xenacoelomorpha. In the following paragraphs we summarize what is known about the embryology of the acoel nervous system and its molecular regulatory mechanisms.
The acoel nervous system development and its molecular control
The nervous system of the acoels originates from the micromeres located at the animal pole, which give rise to the ectodermal lineages, specifically the epidermis and all the neural cells (Henry et al., 2000). Some of the progeny of these micromeres are internalized at later stages of embryonic development and differentiate into neurons. The differentiation of neurons from these internalized cells is readily accounted for by the expression of a SoxB orthologue, a gene that is early expressed in pro-neural cells in a great variety of taxa, including non-bilaterian taxa such as the cnidarians (Magie et al., 2005; Hejnol and Martindale, 2009). Later on in development, the Hox patterning system plays a clearly important role in providing positional cues to the differentiating neurons. Only three Hox genes (one anterior, one central and one posterior) have been identified in the acoel species studied to date (Hejnol and Martindale, 2009; Moreno et al., 2009; Sikes and Bely, 2010).
The two most anterior Hox genes seem to be restricted to ectodermal–neural expression, although further corroboration is needed, but the posterior Hox gene is expressed in cells encompassing the three germ layers. Besides these genes, few other ‘classical’ neural genes have been studied in the acoel species. The orthologue of the posterior ParaHox gene caudal (Cdx) is also expressed in anterior neural structures in the acoel Convolutriloba longifissura (Hejnol and Martindale, 2009), probably in the same position as, or in close vicinity to, the cells expressing the neural genes Nk2.1 and Otp (Hejnol and Martindale, 2008).
Although scarce, most of the molecular data relates to genes involved in antero–posterior patterning in the acoel nervous system, whereas no information is available regarding the regulation of its dorsal–ventral patterning. The identification of such a network should lead to exciting discoveries because, despite having differential dorsal–ventral expression of BMPs (bone morphogenetic proteins) and their antagonists, ADMPs (anti-dorsalizing morphogenetic proteins) (Srivastava et al., 2014), all acoel species have nerve cords equally distributed along their dorso–ventral axis, showing no preference for dorsal or ventral locations. Nothing is known about the control of neurogenesis in either the xenoturbellids or the nemertodermatids.
Neuronal regulatory and effector gene products (bHLHs and GPCRs)
In order to understand the process of neurogenesis in the different members of the Xenacoelomorpha, we undertook a thorough study of the members of gene families that are well known for their relationship with the formation of neural tissue. Our starting point was two gene families closely related with both the development and the functionality of the nervous system: the so-called basic helix–loop–helix (bHLH) and the G-protein-coupled receptor (GPCR) families. Changes in numbers and subfamilies, and eventually the study of expression patterns for all these genes and within Xenacoelomorpha, should provide us with key insights into the process of progressive ‘cephalization’ that occurs in this clade. A summary description of these families follows.
A key, well-known group of transcription factors involved in developmental processes such as cell proliferation and differentiation is the bHLH gene superfamily. This class is present in a wide range of eukaryotes, including fungi, plants and metazoans (Jones, 2004). bHLHs are regulators of nervous system development; they control aspects such as neural fate commitment, cellular subtype specification, migratory behaviour and axonal guidance (Bertrand et al., 2002; Guillemot, 2007). Moreover, recent studies have detected bHLHs expressed in stem cells and neuronal precursors required for CNS regeneration (Cowles et al., 2013). These neuronal precursor proteins possess a highly conserved bHLH DNA-binding domain which, as the name suggests, is composed of a basic region for DNA binding and two α-helices, interrupted by a variable loop region. They normally activate their target genes as dimeric complexes. Some bHLHs also include additional domains involved in protein–protein interactions such as the ‘leucine zipper’, PAS (Per–Arnt–Sim) or ‘orange’.
Molecular phylogenetic analysis has revealed 45 orthologous families of bHLHs in metazoans (Simionato et al., 2007), all included in higher-order groups named A, B, C, D, E and F, which are related through evolution and have similar structural characteristics (Atchley and Fitch, 1997; Jones, 2004). Groups A and B bind to core DNA sequences named ‘E-boxes’ (CANNTG). Group C includes a PAS domain in addition to the HLH domain, binding to the ACGTG or GCGTG core sequences. Members of Group D are unable to bind DNA, but act as negative regulators of group A proteins. Group E contains the proteins related to the Drosophila HER (Hairy and enhancer of split bHLH) proteins and also includes two different domains: ‘orange’ and WRPW. Proteins with a COE domain but lacking a basic domain form the last group, F.
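The E-box consensus bound by group A and B proteins (CANNTG, where N is any nucleotide) can be located in a DNA sequence with a simple pattern search. The following Python lines are only a minimal sketch of that idea and are not part of the analyses reported here; the function name `find_eboxes` and the example sequence are hypothetical and chosen purely for illustration.

```python
import re

# E-box consensus CANNTG, where N stands for any of A, C, G or T.
# The lookahead lets overlapping motifs be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (0-based start, matched hexamer) for each E-box in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


# Hypothetical example sequence (not taken from any genome in this study).
example = "ttCAGCTGaaCACGTGcc"
for start, motif in find_eboxes(example):
    print(start, motif)
```

The search is case-insensitive only because the input is upper-cased first; it scans one strand, which is sufficient here because the consensus CANNTG is its own reverse-complement pattern.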
GPCRs form the largest and most diverse superfamily of transmembrane receptors, and they are very abundant in most eukaryotic species. Nevertheless, their total number differs widely between species and subtypes. Within neural tissue they are involved in sensory functions such as photosensitivity or olfactory sensation and also in the regulation of homeostasis. GPCRs are known to share a highly conserved seven-transmembrane domain (7TM) that can be used as a motif in genomic analysis to search for and identify this superfamily of genes. Despite the conservation of this domain, GPCRs bind to a very diverse array of ligands and intracellular signalling molecules. This specificity is due to their highly variable N-termini that allow us to classify them into five main families following the so-called GRAFS system: glutamate, rhodopsin, adhesion, frizzled and secretin (Fredriksson et al., 2003; Schiöth and Fredriksson, 2005). The rhodopsin family (also known as class A) is the largest. Rhodopsins have short N-termini but interact with a great variety of ligands (such as amines, purines, lipids, peptides or glycoproteins) through the extracellular portion of the transmembrane domain. This family can be further subdivided into four groups (α, β, γ and δ) and 13 subfamilies. The adhesion family is characterized by long N-termini with different functional domains, such as cadherin, laminin, and calcium-binding epidermal growth factor domains. The secretin 7TM domain is very similar to that of adhesion, from which it evolved. This family, belonging to class B, together with Adhesion, can be recognized by an extracellular hormone-binding domain. The glutamate family (also known as class C) comprises neuronal modulators with a typical long N-terminal ligand-binding domain known as the ‘Venus flytrap module’.
The members of the last family, frizzled (class F), act as receptors of the Wnt pathway, and are well known for their crucial roles in early embryonic development, tissue polarity and cell signalling (Strotmann et al., 2011; Krishnan et al., 2013). Understanding the evolutionary history of GPCRs and bHLHs in Xenacoelomorpha opens a window on knowledge of how upstream regulatory and downstream effector genes have evolved within this phylum.
RESULTS AND DISCUSSION
Structure of the Symsagittifera roscoffensis nervous system revealed through species-specific antibodies
In order to better understand the structure and development of the nervous system of the acoel Symsagittifera roscoffensis von Graff 1891, we generated specific antibodies against ‘pan-neuronal’ epitopes using information derived from the Genome Project. We focused our attention on two well-known proteins: the synaptic protein synaptotagmin and the RNA-binding factor Elav. Here, we provide a summary description of these expression patterns, as an example of how we can generate markers that will aid in the interpretation of gene patterns, such as those of regulatory and terminal differentiation regulators.
Adult synaptotagmin-positive structures
The adult nervous system, as revealed by the antibody against synaptotagmin, consists of six longitudinal neurite bundles (a dorsal pair, a medio-ventral pair and a ventro-lateral pair) running along most of the length of the organism. They are clearly interconnected through very thin commissures, forming a small, mostly irregular, net not completely visible at the posterior end (Fig. 2A, arrows) contrary to the observations made with the anti-5-HT antibody (Semmler et al., 2010). The only prominent commissures are located in the brain area. Immediately posterior to the brain, five commissures can be detected that directly connect all the bundles (Fig. 2A, double arrows). As we move further posterior from the rostral domain, the neurite bundles become less intensely stained. This pattern is reminiscent of that obtained with both the anti-5-HT and the anti-tyrosinated α-tubulin antibodies described in previous studies (Semmler et al., 2010). However, in those patterns at least two more commissures clearly connect the dorsal neurite bundles and the medio-ventral ones, whereas in our experiments, the commissures are less prominent. Common to all known patterns are the three commissures immediately posterior to the brain. Perhaps because of the specificity of the anti-synaptotagmin antibody, the anterior staining appears to be more detailed than has been previously reported (Fig. 2A).
Adult Elav-positive structures
The elav genes, which are also used as neuronal markers, encode highly conserved RNA-binding proteins that are, most probably, involved in neural development and differentiation processes across Bilateria. As has been shown for the cnidarian Nematostella vectensis, post-transcriptional regulation by these proteins is an ancient feature shared by Cnidaria and Bilateria (Nakanishi et al., 2012).
In S. roscoffensis, Elav immunoreactivity is observed in the most anterior part of adult animals (the brain); with two neurite bundles clearly visible compared with the other two pairs that are almost undetectable (Fig. 2B). Through the posterior part of the body axis the pattern becomes progressively less intense and the neurite bundles become thinner and thinner in appearance. We note that the thickness of the dorsal neurite bundles decreases, reflecting the presence of less well-packed fibres. Anterior to the statocyst, on the ventral side, the neurite bundles converge into an area (Fig. 2B). From this structure, neural processes extend frontally, laterally and dorsally.
It is interesting to note here that when we compare the immunochemical patterns obtained by Semmler et al. (Semmler et al., 2010) in the adult S. roscoffensis with those obtained here with the specific anti-Elav antibody, some clear differences arise. The 5HT and the RFamide immunoreactivity patterns described in that study show an extensive set of neural structures covering the whole body of the animal, whereas the Elav immunoreactivity pattern seems to be more limited to the anterior region of the adult. The dorsal neurite bundles are notably labelled with Elav, in clear contrast to the four ventral bundles, which are almost imperceptible, unlike the data published on the use of RFamide and serotonin antibodies (Semmler et al., 2010). Double and triple staining will accurately define the relative domains identified by all these antibodies.
Exploring the genome of Xenacoelomorpha – characterizing gene families involved in neurogenesis
bHLHs
We identified the putative bHLH genes in the sequenced genomes of S. roscoffensis and the xenoturbellid Xenoturbella bocki Westblad 1949. We found 18 such genes in S. roscoffensis and 33 in X. bocki. The phylogenetic analysis performed to classify them used known protein sequences from several metazoans: Homo sapiens, Drosophila melanogaster, N. vectensis, Acropora digitifera and Hydra magnipapillata, courtesy of Dr Gyoja (Gyoja et al., 2012) (Table 1). To the alignments, we added sequences from Saccoglossus kowalevskii and Capitella teleta, all downloaded from the PFAM 27.0 data bank (hosted at the EMBL-EBI; see supplementary material Fig. S1). The genes were categorized into six high-order groups: A, B, C, D, E and F (Atchley and Fitch, 1997). In both genomes, S. roscoffensis and X. bocki, we found no more than one bHLH member per family; we also found that some families had no members. Only one gene from group A had unclear affinity in S. roscoffensis. Interestingly, we found no specific cases of genes not belonging to the families already characterized in other bilaterians. The species chosen, including protostomes, deuterostomes and cnidarians, provided us with good representation – although still limited – of the Metazoa. Our trees (see supplementary material Figs S1, S2, bHLH trees) present a similar topology to those obtained in previous studies (Simionato et al., 2007; Gyoja et al., 2012).
The 18 bHLHs characterized in S. roscoffensis were classified as members of the following groups: 14 in group A (members of the families: ASCa, ASCb, Beta3, E12/E47, MyoD, Net, NeuroD, Beta3, PTFa, PTFb, twist, and a putative Oligo); three in group B (Max, MITF and SRBP); one in group C (ARNT); and one in group E (HES/HEY). No representatives of group D or F were identified (Table 1). In the case of group A, we found one putative Oligo/Beta3 relative that we could not allocate to one of these two related families because of its low branch support. A similar situation occurred for HES/HEY: we were not able to determine the affinities of the acoel sequence. In fact, the sequence aligned with the similarly unclear H. sapiens HES/HEY. Some genes – putative relatives of NeuroD, ASCa, ASCb and ARNT – were sorted with weak statistical support. We therefore ran BLAST on each of them individually and aligned them with the full length of the most closely related sequences, checking by hand for conserved regions. This procedure provided support for our family assignments (supplementary material Fig. S1, SR_tree).
The 33 bHLHs found in X. bocki were also classified into the different groups: 16 in group A (relatives of: ASCa, ASCb, Atonal, Beta3, E12/E47, hand, mist, MyoRa, MyoRb, neurogenin, NeuroD, NSCL, PTFa, PTFb and twist); 7 in group B (Max, MITF, Mlx, Myc, SRBP, TF4 and USF); one in group C (ARNT); and 4 in group E (HES and HEY). Two sequences, which we named Xb_Orphan-HLH1 and Xb_Orphan-HLH2, did not match with certainty any of the 48 bHLHs included in our analysis. A component clustered in the high-order group B, with weak branch support in the phylogenetic tree, seems to be a putative member of the Net family. However, careful analysis of the sequence did not allow us to assign it to the Net family. BLAST analysis suggests that it could also be a diverged orthologue of NeuroD; we provisionally named it Xb_Orphan-HLH3. A similar case occurs with Xb_Orphan-HLH4, with unclear affinities in the phylogenetic tree. Again, the sequences aligned with low bootstrap support (ARNT, NeuroD and Atonal) were checked manually before being assigned to the respective families. A similar thing happened with the acoel sequence related to the HES/HEY families (supplementary material Fig. S2, Xb_tree).
Some species, such as H. sapiens and the cnidarian N. vectensis, have duplicates in some families (Simionato et al., 2007). Duplicates do not seem to be the norm in S. roscoffensis. We found only two orthologues in the PTFa and PTFb families. Even this was not the case for X. bocki, which had only one member belonging to each of these families. Therefore, we suggest that the duplications took place within Acoelomorpha and perhaps are specific to Acoela; complete genomes of Nemertodermatida are needed to clarify the issue.
Xenoturbella bocki ‘orphan’ bHLHs
As mentioned above, in X. bocki we identified four ‘orphan’ sequences: Xb_Orphan-HLH1, Xb_Orphan-HLH2, Xb_Orphan-HLH3 and Xb_Orphan-HLH4, which do not seem to belong to any of the known bilaterian families. Two hypotheses would explain this: the sequences are homologous to an ancestral bilaterian gene that has been lost in all the bilaterians analysed; or they have drifted enormously from known relatives within the Xenacoelomorpha lineage. The introduction of an itemized representation of species may at some point improve the classification of these four genes. We have not determined the bHLH subfamilies yet: this will require more in-depth analyses in the future.
Remarkably, the high-order groups with representation in the X. bocki genome also have members in S. roscoffensis; however, the same families are not present in both of them (Table 1). The genome of S. roscoffensis has fewer families than that of X. bocki. Specifically, the families that are present in xenoturbellids but missing in acoels are: from Group A: Atonal, hand, mist, MyoRa and neurogenin; from Group B: Mlx, Myc, TF4 and USF; and from Group E: HES and HEY. This pattern clearly shows that some losses occurred before the diversification of Acoela. In the absence of comprehensive data from Nemertodermatida, it is difficult to be more accurate as to when these losses occurred. Interestingly, there is one family, and only one, MyoD (plus the problematic case of the putative ‘Oligo’) that has a representative in S. roscoffensis, but not in Xenoturbella.
When we look globally at all the bHLH genes in both taxa, it is important to note that the bootstrap support is always higher in the X. bocki trees than in those including S. roscoffensis sequences, despite the use of the same metazoan reference sequences for the alignments. We think this reflects the fact that Acoela appears to have long branches in most phylogenetic (phylogenomic) analyses, with a higher rate of nucleotide substitutions in most genes analysed. Therefore, if the bHLH sequences follow the same general pattern, one should expect that the genome of S. roscoffensis encodes, in general, very divergent orthologous sequences. In addition, S. roscoffensis belongs to the clade Crucimusculata, which includes the more derived acoels, the so-called ‘higher acoels’. In contrast, Xenoturbella appears to have short branches in all molecular phylogenetic analyses.
Of the 48 bHLH families known, 45 are present in Bilateria. Of these, 44 are the families shared by the protostomes and deuterostomes (and may therefore have been present in the urbilaterian ancestor). A total of 29–33 of these families are apparently present in the ancestor of cnidarians and bilaterians (Ledent and Vervoort, 2001; Ledent et al., 2002; Simionato et al., 2007). Surprisingly, the number of families represented in our dataset for S. roscoffensis is not in this numeric range. In fact, in S. roscoffensis we detected only 16 families. In the genome of X. bocki there were more; a total of 29. Nevertheless, the number of families shared between X. bocki and the cnidarians analysed (Table 1) (Simionato et al., 2007; Gyoja et al., 2012), ranges between 18 and 23. To us, it has become clear that in the Xenacoelomorpha lineage there have been several family losses, particularly in Acoela (see Simionato et al., 2007).
However, we should be cautious about drawing very general conclusions because we have gathered complete data from the genome of only one acoel species and one xenoturbellid. We need to explore a full range of xenacoelomorphs, incorporating members of Nemertodermatida and also some more basal acoels. A more detailed characterization of bHLH complements in different members of this phylum should provide us with a better understanding of the evolutionary dynamics of this important family of transcriptional regulators. Moreover, with access to the complements and a better description of the neuroanatomy of these animals, we should start to get a better picture of how the evolution of the nervous system, and its centralization, is linked to specific changes in genome composition, in particular for regulatory genes.
GPCRs
We mined the genomes of S. roscoffensis and X. bocki for the presence of genes containing the typical GPCR 7TM domain of the GRAFS families (Fredriksson et al., 2003; Schiöth and Fredriksson, 2005), plus the members of the Dicty_CAR (see the Materials and methods for more details). GRAFS GPCRs are typically expanded in mammals, especially the rhodopsin family, although most of them have representatives in the last eukaryotic common ancestor (de Mendoza et al., 2014). For S. roscoffensis, we further phylogenetically analysed each family to classify all the genes found. The GPCR sequences used as references were provided by Alex de Mendoza (de Mendoza et al., 2014) and were chosen to offer an extensive evolutionary perspective. The species used in our analysis were: Amphimedon queenslandica, N. vectensis, S. kowalevskii, H. sapiens, C. teleta and D. melanogaster. The clustering was checked for all the acoel sequences by running BLAST with all the candidates against the NCBI protein database (http://blast.ncbi.nlm.nih.gov/Blast.cgi).
Analysis of GPCR family numbers
In the genome of S. roscoffensis we found for each family the following complements: 9 glutamate, 6 frizzled, 5 adhesion/secretin and 225 rhodopsin receptors; however, we did not find any Dicty_CAR relative, making a total of 245 different GPCRs (Table 2). In the case of X. bocki, the complements were: 17 glutamate receptors, 7 Frizzled, 21 Adhesion/Secretin, 258 Rhodopsin and 1 Dicty_CAR; a total of 304 GPCRs (Table 2). Note that all the X. bocki and the S. roscoffensis rhodopsin values are given as averages of both the predictions based on the genomic sequence information and the AUGUSTUS gene predictions.
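As a quick consistency check on the complements quoted above, the per-family counts can be summed per species. This is a minimal sketch in Python, assuming only the numbers stated in the text (Table 2 itself is not reproduced); the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# GPCR complements per family as quoted in the text for the two genomes.
gpcr_counts = {
    "S. roscoffensis": {
        "glutamate": 9,
        "frizzled": 6,
        "adhesion/secretin": 5,
        "rhodopsin": 225,
        "Dicty_CAR": 0,  # no Dicty_CAR relative was found
    },
    "X. bocki": {
        "glutamate": 17,
        "frizzled": 7,
        "adhesion/secretin": 21,
        "rhodopsin": 258,
        "Dicty_CAR": 1,
    },
}

# Total number of GPCRs per species.
totals = {species: sum(fams.values()) for species, fams in gpcr_counts.items()}

# Fraction of each complement contributed by the rhodopsin family.
rhodopsin_share = {
    species: fams["rhodopsin"] / totals[species]
    for species, fams in gpcr_counts.items()
}

for species in gpcr_counts:
    print(f"{species}: {totals[species]} GPCRs, "
          f"rhodopsin share {rhodopsin_share[species]:.1%}")
```

The sums reproduce the totals given in the text (245 and 304), and the shares make explicit that the rhodopsin family dominates both complements.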
The Dicty_CAR domain corresponds to a cyclic AMP receptor. Frizzled, adhesion and rhodopsin classes are thought to have evolved from such a more primitive receptor type; the former two before the split of unikonts from the common ancestor of eukaryotes, and rhodopsin in the common ancestor of the opisthokonts. After the diversification of these classes in metazoans, the cAMP receptor might have become redundant and consequently been lost, or retained only rarely, in most metazoan species (Krishnan et al., 2012).
Table 2 shows that the two species studied have fewer representatives in all the families than in our reference species, except for D. melanogaster. It is known that GPCRs are subjected to species-specific diversification, which is especially remarkable in H. sapiens, C. teleta and N. vectensis. Despite the higher complexity of genes in X. bocki, compared with A. queenslandica and N. vectensis, we noticed that the former has, in general, fewer members from each GPCR family (although not in all cases). We also observe this trend within the monophyletic group Xenacoelomorpha: S. roscoffensis, a divergent member of Acoela, has a nervous system that is morphologically more complex than that of X. bocki, but it has fewer GPCRs in all the families.
Glutamate family
In the glutamate family (see supplementary material Fig. S3) we detected three glutamate metabotropic receptors (GMR), which represents a simplification compared with the eight found in humans, although this may be a direct consequence of the duplication of genomes specific to vertebrates. Intriguingly, S. roscoffensis has six γ-aminobutyric acid (GABA) receptors, a greater number than the three types found in humans. It is known that GABA receptors play a role in synaptic transmission and neuronal excitability in the CNS of vertebrates (Niswender and Conn, 2010). It is remarkable that a similar specific expansion with eight GABA receptors is observed in S. kowalevskii, which does not have a centralized nervous system and lacks neurons that release GABA (Aronowicz and Lowe, 2006; Krishnan et al., 2013). However, although S. kowalevskii does not have an obviously centralized nervous system and S. roscoffensis has a clear brain, both systems are structurally simpler than those of vertebrates.
Frizzled family
We also found six Frizzled-like receptors: three clustered together with Frizzled 1/2/7, one with Frizzled 5/8 and two more with Frizzled 9/10 (supplementary material Fig. S4). We did not detect any Smoothened-like receptor. Interestingly, the frizzled family is one of the most evolutionarily conserved because of its fundamental role in cell signalling and tissue polarity, in which its members function as Wnt receptors (Lagerström and Schiöth, 2008). With 11 members in humans, 6 in the acoels studied and 7 in the xenoturbellids (Table 2), we did not observe a species-specific expansion in this subfamily (Strotmann et al., 2011).
Adhesion and secretin families
We detected five members of the adhesion/secretin family in the acoel genome, which represents a very low number when compared with the reference species, even with the xenoturbellids, in whose genome we detected 21 members. Of those five (supplementary material Fig. S5), only one clearly clustered with the human homologues GPR123, GPR124 and GPR125. After a BLAST search, we assigned this gene, Sr_GPR125 (supplementary material Fig. S5, fourth arrow), as a putative orthologue to human GPR125 (Bjarnadóttir et al., 2007).
Secretin receptors evolved from adhesion receptors and therefore they share a similar 7TM core sequence. For this reason, we detected and analysed the members of both families together. They are present in most bilaterians, but not in older members of the Metazoa (Strotmann et al., 2011). It would be interesting to ascertain which of the genes detected belong to the adhesion family and whether there is any secretin homologue. In fact, the gene we call Sr_orphan1_aug3sy.g27699.t1 (supplementary material Fig. S5, first arrow) would be a good candidate secretin relative because it clusters with some members of the C. teleta and H. sapiens secretin families, although in a basal position. However, further analyses are necessary to confirm this finding.
Rhodopsin family
The rhodopsin family is the largest and most diversified, and thus the most difficult to analyse. To complicate matters further, some of the reference species genomes are not yet fully annotated, whereas our acoel sequences are extremely divergent. In order to resolve these issues, we aligned all members of this subfamily with reference sequences from the human genome, which is the most reliable and best-studied reference genome. What we observed in this study is that most of the sequences we identified in S. roscoffensis do not have a direct human orthologue; they tend to form independent groups instead. In the future, the detailed analysis performed here for S. roscoffensis should be extended to the different families of GPCRs found in the genome of X. bocki.
Conclusions
The phylum Xenacoelomorpha is formed of animals with a relatively simple morphology, including that of their nervous systems (Raikova et al., 2000a; Raikova et al., 2000b; Raikova et al., 2004a; Raikova et al., 2004b; Achatz et al., 2013). Here, and for the first time using the genome sequence of members of this group, we have undertaken a systematic study of two superfamilies, the bHLHs and GPCRs, involved in the specification and the functionality of nerve tissue. The main aim of this study was to correlate the molecular complexity of these families with the structural complexity of the different nervous systems within the phylum. We understand that the term ‘complexity’ is a loaded one, and there are many putative definitions for such a word, especially in the context of biological systems. However, here we use a very narrow concept of complexity: one that reflects the degree of neuronal concentration in one pole of the animal, which implies a high degree of interconnectivity and a high capacity for processing external inputs. This is a definition that would assume that ‘cephalized’ (centralized) nervous systems are structurally more complex than ‘simpler’ neuronal nets (as seen, for instance, in members of the Cnidaria). Several morphological studies (mentioned above) have shown that whereas the nervous system of Xenoturbella seems to comprise an intraepidermal net of nerve cells, some members of Acoela, particularly those belonging to the clade Crucimusculata, have a condensed bilobed brain with a dense neuropile in the most anterior end of the animal (close to several sensory structures). Our molecular data show that in the case of the two families analysed, the complement of genes (and its diversity) is higher in Xenoturbella than in Symsagittifera. This would mean that there is an inverse correlation between the structural complexity of the nervous system and the number of genes involved in both patterning and downstream sensory functions.
This putative paradox requires and deserves further analysis. We are well aware that the view that we have of both genomes is still incomplete. Moreover, structural complexity, perhaps just superficial, does not have to reflect the functional complexity or the complexity of circuits in the different nervous systems. The study of gene activities and analysis of their evolving regulatory interactions should give us a better, richer understanding of the evolution of these nervous systems; and perhaps it will provide insight into the putative ‘cephalizing’ trends that we observe in this and other clades (Moroz, 2012). Knowledge of the genomes is now, in any case, paving the way to decipher the evolution of the brain primordia.
MATERIALS AND METHODS
Immunohistochemistry
Immunostaining was performed following the protocols outlined in Achatz and Martinez (Achatz and Martinez, 2012). S. roscoffensis was incubated in primary anti-synaptotagmin (dilution 1:200) or anti-Elav (dilution 1:800) antibodies (both previously pre-absorbed) and reacted with the secondary antibody [Alexa Fluor goat anti-rabbit 532 (Molecular Probes, Eugene, OR)].
Image acquisition, processing and preparation
Confocal laser-scanning microscopy was performed with a Leica TCS SPE or a Leica SPII microscope (Leica Microsystems, Wetzlar, Germany). Images were processed using ImageJ (Abramoff et al., 2004) and finished with Adobe Photoshop CS. Diagrams were produced using Adobe Illustrator 7.0.
Genome sequences, annotation and selection
Genome assemblies including scaffolds and contigs (SCFs), transcriptome sequences assembled from RNA-seq reads (ESTs), as well as gene annotation over genome sequences (AUGs), for both S. roscoffensis and X. bocki, were downloaded from the server of the University of Greifswald (sequences will be publicly available along with the publication of the genome). SOAPdenovo (v2) (Luo et al., 2012) and SOAPdenovo-Trans (v1) (Xie et al., 2014) were used in the protocol for the genome and transcriptome assemblies, respectively. A repeat library was generated using RepeatScout (v1.0.5) (Price et al., 2005). RepeatMasker (v3.3.30) (Smit et al., 2014) was used to produce a repeat-masked genome version with the above-mentioned custom library, and to produce a table that lists repeat coordinates. CEGMA (v2.4) (Parra et al., 2007) was used to produce a set of ultra-conserved core protein genes from the unmasked genome. A second initial gene set was generated by WebAUGUSTUS (Hoff and Stanke, 2013), which ran PASA (Haas et al., 2003) on assembled RNA-Seq data and the unmasked genome from Xenoturbella. Both gene sets were merged (non-redundantly) and used for optimizing AUGUSTUS parameters (Stanke et al., 2008). RNA-Seq reads were mapped against the repeat-masked genome, and the alignments were converted to hints for AUGUSTUS. PASA and CEGMA results as well as the RepeatMasker output table were also converted to hints. AUGUSTUS was run on the unmasked genome using the optimized parameter set and all available hints. Genomic and transcriptomic nucleotide sequences were translated into all possible open reading frames (ORFs) larger than 20 amino acids using custom Perl scripts, whereas AUGUSTUS predictions already included putative proteins translated from the predicted transcripts. The described annotation protocol produced the sets of putative transcript and protein sequences that were used in this study.
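The ORF extraction step was done with custom Perl scripts that are not reproduced in the text. As a rough illustration only, the following Python sketch shows one way such a six-frame translation could work. It assumes the standard genetic code, treats an ORF as any stop-to-stop stretch (no start codon required) and uses hypothetical function names; none of these details is stated in the protocol above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(dna, min_aa=20):
    """Return (strand, frame, peptide) for every stop-to-stop stretch
    longer than min_aa residues in the six reading frames.

    Assumption: an ORF is any stretch between stop codons; the original
    Perl scripts may have used a different definition.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            for peptide in translate(seq[frame:]).split("*"):
                if len(peptide) > min_aa:
                    orfs.append((strand, frame, peptide))
    return orfs
```

Calling `six_frame_orfs(sequence)` on each scaffold or transcript would yield peptides that can then be scanned for protein domains; the 20-residue threshold mirrors the "larger than 20 amino acids" cutoff mentioned above.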
A search for specific protein domains was performed on all the amino acid sequences from those three sets: ORFs from genome and transcriptome sequences and proteins from AUGUSTUS predictions. HMMER (v3.0) (Eddy, 2011) was used to scan the sequences using the following Pfam (v27.0) (Finn et al., 2014) hidden Markov models (HMM): PF00001 (7tm_1, Rhodopsin), PF00002 (7tm_2, Adhesion/Secretin), PF00003 (7tm_3, Glutamate), PF01534 (Frizzled) and PF05462 (Dicty_CAR) for the GPCR family; and PF00010 (HLH DNA-binding domain) for the HLH family. hmmsearch was run with the following parameters: ‘--max’ (to disable all heuristics, which entails less speed but more power), ‘-E 1e-5’ (maximum total E-value to report sequences) and ‘--domE 1e-5’ (maximum domain E-value to report a hit). Sequences with a significant hit for any of the previously described HMM models were then selected from the sets of genomic (SCFs), transcriptomic (ESTs) and predicted (AUGs) proteins for further analyses.
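For readers who wish to reproduce a comparable domain scan, the sketch below assembles an hmmsearch call with the three options named above, written with their leading dashes (`--max`, `-E`, `--domE`), and then selects significant targets from a tabular report. The `--tblout` option, the file names and the helper functions are assumptions made for illustration; the text does not say how the hits were parsed.

```python
# Pfam models named in the text, keyed by accession.
PFAM_MODELS = {
    "PF00001": "7tm_1",
    "PF00002": "7tm_2 (Adhesion/Secretin)",
    "PF00003": "7tm_3 (Glutamate)",
    "PF01534": "Frizzled",
    "PF05462": "Dicty_CAR",
    "PF00010": "HLH",
}


def hmmsearch_command(hmm_file, protein_fasta, table_out, evalue="1e-5"):
    """Build the argument list for one hmmsearch run.

    --max, -E and --domE are the options quoted in the text; --tblout
    (per-sequence tabular output) is an assumption added for parsing.
    """
    return [
        "hmmsearch", "--max", "-E", evalue, "--domE", evalue,
        "--tblout", table_out, hmm_file, protein_fasta,
    ]


def significant_targets(tblout_text, max_evalue=1e-5):
    """Return target names whose full-sequence E-value passes the cutoff.

    In the --tblout format, comment lines start with '#', the first
    column is the target name and the fifth is the full-sequence E-value.
    """
    targets = []
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if float(fields[4]) <= max_evalue:
            targets.append(fields[0])
    return targets
```

The argument list can be passed to `subprocess.run(cmd, check=True)` once HMMER is installed; file names such as `PF00001.hmm` or `orfs.fasta` are placeholders.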
Phylogenetic analysis
The best predictions for each domain search and dataset were manually curated and aligned with ClustalW2 (Larkin et al., 2007; McWilliam et al., 2013). We manually filtered the more informative columns of the alignment in order to produce the presented phylogenetic trees. Then, final predictions for each family, or subfamily in the case of GPCRs, were aligned against a database of protein sequences from different species, including H. sapiens, D. melanogaster, S. kowalevskii, N. vectensis and C. teleta. In the case of GPCRs, sequences from A. queenslandica were considered; for the bHLH family, sequences from A. digitifera and H. magnipapillata were also considered. GPCR reference sequences were classified in families as described by de Mendoza et al. (de Mendoza et al., 2014) whereas the Gyoja and Satoh (Gyoja and Satoh, 2013) reference was used for bHLHs. bHLH sequences for S. kowalevskii and C. teleta were directly downloaded from the Pfam database preserving the original access codes.
Full alignments were performed using MAFFT (v6.864b) (Katoh and Toh, 2008) with the ‘--maxiterate 1000’ and ‘--localpair’ (equivalent to L-INS-i) options. Conserved regions from those alignments were manually selected using Geneious software (v5.3.6, Biomatters, http://www.geneious.com). Maximum-likelihood analysis was carried out using RAxML (v8.0.12) (Stamatakis, 2014), with the amino acid evolutionary model LG using a discrete gamma distribution of among-site variation rates and a proportion of invariable sites (‘-m PROTGAMMAILG’ option). The best tree out of 100 bootstrap replicates (‘-N 100’) was chosen. A neighbour-joining approach was also considered and the trees were inferred by Neighbour-Joining Tree Builder (njtree v1.9.1) (Li, 2006). Finally, the computed trees were edited and processed into images with FigTree (v1.4.1) (Rambaud, 2007), with further refinements and details added using Adobe Illustrator 7.0.
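The alignment and tree-building options quoted above can be collected into command lines as in the following sketch. It is only an approximation of the workflow: the helper names, file names, run name and random seed are hypothetical, the executable name `raxmlHPC` is assumed, and because the text does not state which RAxML bootstrap mode was used, only the options it names (the model and the number of runs) are reproduced together with the arguments RAxML requires.

```python
def mafft_command(input_fasta):
    """MAFFT L-INS-i style call; the alignment is written to stdout."""
    return ["mafft", "--localpair", "--maxiterate", "1000", input_fasta]


def raxml_command(alignment, run_name, seed=12345, runs=100):
    """RAxML call with the LG + gamma + invariable-sites protein model.

    -m and -N come from the text; -s (alignment), -n (run name) and
    -p (parsimony random seed) are required by RAxML v8, and their
    values here are placeholders.
    """
    return [
        "raxmlHPC", "-m", "PROTGAMMAILG", "-N", str(runs),
        "-p", str(seed), "-s", alignment, "-n", run_name,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works
    # without MAFFT or RAxML installed.
    print(" ".join(mafft_command("frizzled.fasta")), "> frizzled.aln")
    print(" ".join(raxml_command("frizzled.phy", "frizzled_ml")))
```

Returning argument lists rather than shell strings keeps the commands safe to pass to `subprocess.run`, with the MAFFT output captured from standard output.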
All the alignments generated with the manually filtered conserved columns can be found in FASTA format as supplementary material Figs S6 to S10. All the sequences included in our study keep the original reference number assigned in the preliminary genome/transcriptome analyses. The original whole genome/transcriptome sequences will be made publicly available along with the forthcoming publication of the Xenacoelomorpha Genome Project consortium paper.
Acknowledgements
This work was presented at the ‘Evolution of the First Nervous Systems II’ meeting, which was supported by the National Science Foundation (NSF). The meeting was held at the Whitney Marine Laboratory, St. Augustine, Florida. We would like to thank specifically Alex de Mendoza (IBE CSIC-UPF, Barcelona) and Fuki Gyoja (Okinawa Institute of Science and Technology) for access to their GPCR and bHLH databases. Moreover, A. de Mendoza was especially helpful throughout the whole process of analysing and classifying the GPCRs. We appreciate the support given by Manel Bosch and the access to the CLSM Microscopy facility of the Universitat de Barcelona. We would like to thank the referees for the suggestions and comments that were instrumental in the improvement of the submitted manuscript. Finally, we acknowledge the continuous, insightful discussions with all the members of the Xenacoelomorpha genome consortium.
FOOTNOTES
Funding
We would like to acknowledge the generous support obtained from the EU Programme ‘ASSEMBLE’, which covered some of the costs associated with our acoel collection trips. A grant from the Spanish Ministry of Economy (BFU2012-32806) funds the work in the laboratory of P. Martinez. Funds from the Generalitat de Catalunya (2009-SGR-001018) were also used in this project. E.P.-A. has a PhD (APIF-UB) fellowship from the Universitat de Barcelona.
Competing interests
The authors declare no competing or financial interests.