Insulin signaling plays key roles in development, growth and metabolism through dynamic control of glucose uptake, global protein translation and transcriptional regulation. Altered levels of insulin signaling are known to play key roles in development and disease, yet the molecular basis of such differential signaling remains obscure. Expression of the insulin receptor (InR) gene itself appears to play an important role, but the nature of the molecular wiring controlling InR transcription has not been elucidated. We characterized the regulatory elements driving Drosophila InR expression and found that the generally broad expression of this gene is belied by complex individual switch elements, the dynamic regulation of which reflects direct and indirect contributions of FOXO, EcR, Rbf and additional transcription factors through redundant elements dispersed throughout ∼40 kb of non-coding regions. The control of InR transcription in response to nutritional and tissue-specific inputs represents an integration of multiple cis-regulatory elements, the structure and function of which may have been sculpted by evolutionary selection to provide a highly tailored set of signaling responses on developmental and tissue-specific levels.
The insulin signaling pathway plays essential roles in growth and metabolism in metazoans. In mammals, the insulin receptor (INSR) binds to insulin, leading to autophosphorylation, phosphorylation of adaptor proteins, and subsequent activation of the PI3K-Akt and MAPK pathways (Ebina et al., 1985; Ullrich et al., 1985; Oldham and Hafen, 2003). Akt propagates the metabolic effects of the signaling by targeting downstream substrates, including the glucose transporter GLUT4 (also known as SLC2A4) (Bertrand et al., 2008; Gonzalez et al., 2011). The FOXO transcription factor, which is phosphorylated and excluded from the nucleus as a result of insulin signaling, represents an important target (Puig and Tjian, 2005). FOXO regulation is widely conserved in metazoans, serving to mediate the effect of insulin signaling on growth, aging and metabolism in C. elegans and Drosophila (Taguchi and White, 2008). Interestingly, FOXO directly activates expression of the insulin receptor, representing a negative transcriptional feedback loop (Puig et al., 2003; Puig and Tjian, 2005). The receptor is expressed in most tissues and developmental stages, underscoring the broad physiological relevance of this signaling pathway.
Although insulin levels control pathway activity, varying levels of receptor expression may influence signaling in a tissue-specific manner; the developmental significance of such transcriptional regulation is yet to be understood. Developmental differences in insulin receptor gene expression may be ‘hardwired’ and subject to evolutionary modification, changing the impact of insulin signaling in the control of body size and morphology. In addition, physiological stimuli influence the expression of INSR, although the importance and consequences of this regulation are unknown. Diet, hormone levels and other signals impact INSR expression levels, as can viral infection and diabetes (Mamula et al., 1990; Kriauciunas et al., 1993; Chatterjee, 2001; Iritani et al., 2000; Gunton et al., 2005; Kim et al., 2014). Elevated levels of INSR expression are observed in numerous cancers, leading to an insulin-dependent growth phenotype (Belfiore and Malaguarnera, 2011). Low levels of INSR expression and signaling in the brain are associated with Alzheimer's disease (Frölich et al., 1998; Moloney et al., 2010).
Despite the epidemiological and experimental evidence for changes in expression of INSR in cancer and other diseases, we have limited knowledge about the transcriptional controls, precluding a molecular understanding of how receptor expression impacts physiology and disease. The INSR gene is very broadly expressed, unlike highly tissue-specific developmental genes, but there is limited data to suggest that the expression is not a product of a simple housekeeping promoter (Lee et al., 1992). For the human INSR gene, attention has focused on 2 kb flanking the transcription start site, the activity of which in reporter genes is affected by dexamethasone, glucocorticoids, vitamin D and estrogen (Leal et al., 1992; Lee and Tsai, 1994; García-Arencibia et al., 2005; Calle et al., 2008). This fragment is regulated by Sp1, HMGI, p53 and Rb (Cameron et al., 1992; Shen et al., 1995; Webster et al., 1996; Brunetti et al., 2001). The transcriptional significance of other regions, including extensive introns, remains largely unexplored. Genomic surveys of the mammalian gene reveal functionally uncharacterized chromatin marks and structures consistent with enhancers, and putative intronic enhancers for the gene were identified in mouse T cells (Pasquali et al., 2014; Kundaje et al., 2015; Vanhille et al., 2015).
Drosophila Insulin-like receptor (InR) is activated by a family of insulin-like peptides to control growth and homeostasis (Oldham and Hafen, 2003). The InR gene is crucial for embryonic development, function of the nervous system and regulation of organ growth (Petruzzelli et al., 1986; Garofalo and Rosen, 1988; Brogiolo et al., 2001; Song et al., 2003; Wong et al., 2014). InR mutations result in pleiotropic recessive phenotypes, leading to embryonic lethality (Fernandez et al., 1995). Its broad expression pattern has been classified as among the more ‘transcriptionally stable’, similar to other genes of widespread function and fewer extremes in expression (Pérez-Lluch et al., 2015). However, InR expression is affected by nutrition and by the steroid hormone ecdysone, which acts through its receptor EcR to control growth and development (Koelle et al., 1991; Riddiford et al., 2000; Hu et al., 2003; Gershman et al., 2007). Ecdysone stimulates expression of InR in Drosophila Kc cells, and ChIP-seq studies indicate that this regulation is direct; EcR and USP bind to the InR locus (Gauhar et al., 2009). More recently, STARR-seq technology in Drosophila cell lines identified 20E-responsive elements in the InR gene locus (Shlyueva et al., 2014).
An additional regulatory factor for the insulin receptor is the retinoblastoma co-repressor (Rb); this co-factor binds near the InR promoter, although the physiological role has yet to be investigated (Acharya et al., 2012; Korenjak et al., 2012; Wei et al., 2015). Rb tumor suppressor proteins are key regulators of the cell cycle; thus, coordinate regulation of the insulin receptor with these genes might provide a link between cell proliferation and growth (Du and Pogoriler, 2006; Giacinti and Giordano, 2006; Acharya et al., 2012). From examination of human ChIP-seq datasets, we note that Rb family proteins also bind the human INSR gene (Chicas et al., 2010).
Similar to its mammalian counterpart, the Drosophila InR gene is large, with nearly 40 kb of introns (Casas-Tinto et al., 2007). The gene is characterized by various histone modifications and regions exhibiting DNase I hypersensitivity, FAIRE-seq signals, and STARR-seq candidate regulatory elements. However, as for the mammalian gene, we lack an integrated understanding of the direct transcriptional controls of this central player in cell metabolism and development (Kaplan et al., 2011; Li et al., 2011; Nègre et al., 2011; Thomas et al., 2011; Arnold et al., 2013; McKay and Lieb, 2013). Here, we provide a comprehensive identification and characterization of cis-regulatory elements associated with the InR gene, mapping their dynamic responses to FOXO (also known as dFOXO), ecdysone and Rb. Our detailed mutagenic studies of the active enhancers identify specific elements and motifs required for enhancer activity, providing, in some cases, an incoherent feed-forward regulatory logic. The dynamic regulation of these enhancers by transcriptional inputs indicates that these enhancers play a role in temporal, spatial and critical fine-tuning control of InR gene expression. Our study indicates that this gene is subject to a complex transcriptional circuit extending far beyond the previously described simple model of the FOXO-feedback loop mechanism. This gene circuit analysis transforms our understanding of the insulin receptor gene, in that even such a broadly expressed gene requires exquisite controls that are crucial to the roles of this signaling pathway in metabolism, growth control and cancer.
Candidate regulatory regions in InR introns
The Drosophila InR gene spans ∼50 kb, including ∼40 kb of introns (Fig. S1A). We found that an 80 kb BAC genomic construct (InR-BAC) covering this locus rescued lethality of an InRE19/GC25 temperature-sensitive mutant (Fig. S1B,C); two copies of the BAC increased InR gene expression in these flies 2- to 3-fold (Fig. S1D). Although we cannot rule out possible additional distant cis-regulatory elements, it is clear that sequences relevant for InR expression are located within this region. We therefore investigated the short 5′ intergenic sequence and introns of InR for clues as to transcriptional controls. Data from genome-wide STARR-seq surveys in transfected cells, as well as DNase hypersensitivity data and measurement of open chromatin using FAIRE-seq indicate that InR introns are likely to harbor relevant cis-regulatory elements (Fig. 1B, Fig. S2). To evaluate the regulatory potential of intronic regions in the fly, we tested ten GAL4 lines with fragments from the InR gene (Pfeiffer et al., 2008). Previous measurements in the embryo indicated that some of these elements drive GFP expression in dynamic patterns (Jenett et al., 2012; Jory et al., 2012; Manning et al., 2012; Li et al., 2014). Three of the fragments also express GFP in larvae and adults, in either ubiquitous or tissue-specific patterns (Fig. 1A, Fig. S3, Table S1). We additionally plotted the results of genome-wide enhancer surveys (from S2 and ovarian stem cells), chromatin accessibility at different developmental stages and tissues as measured by FAIRE-seq, and the enhancer-associated histone modifications H3K27ac, H3K4me1 and presence of the p300 coactivator (also known as Nejire or CBP). The resultant patterns do not provide a consistent, easily interpretable set of correlations across different developmental times. Enhancers found using STARR-seq do point to redundantly acting enhancers in InR introns with either shared or cell type-specific patterns (Fig. 1B). 
These enhancers overlap some of the fragments tested as GAL4 drivers, but there was not a complete agreement between these different methods. The two types of assays relied on distinct basal promoters, which might have biased detection because of enhancer-promoter specificity (Marinić et al., 2013; Zabidi et al., 2015; Lorberbaum et al., 2016).
Identification of active enhancers within InR introns
To delineate the exact structure of InR regulatory regions and identify regulation by FOXO and ecdysone, we divided the InR introns into 25 fragments of ∼1.5 kb each (Fig. 2A). As noted above, genome-wide assays for cis-regulatory elements employed synthetic basal promoters, which may lack functional compatibility with the endogenous enhancers. Therefore, we tested the endogenous basal promoter regions. InR has three annotated transcription start sites: T1, T2 and T3 (Casas-Tinto et al., 2007). Genome-wide RNA polymerase II occupancy and the H3K4me3 histone modification, which is linked with transcriptional start sites, showed strong association with the T1 promoter throughout different developmental stages (Fig. S4). This basal promoter appears to be the predominant site of initiation (Casas-Tinto et al., 2007; Graveley et al., 2011; Brown et al., 2014; Attrill et al., 2016). We compared the promoter activities of T1, T2 and T3 in luciferase reporter constructs, using a similarly sized intronic fragment (PT) as a negative control (Fig. 2A). The promoters and the negative control were assayed in S2 and Kc cells; T1 promoter activity was much higher than that of T2 and T3 (Fig. 2B,C, Table S2). T1 was therefore used to assay the 25 intron fragments in reporter constructs. To test for cell type specificity, we used both S2 and Kc cells. Intron fragments 2, 3, 20 and 22 were found to be active in both cell types, and the levels of activity varied. Fragments 4, 12 and 15 were active in one of the two cell types, indicating the presence of cell type-specific enhancers (Fig. 2B,C).
We find active elements in S2 cells in regions 2, 3 and 12, similar to findings from STARR-seq in S2 cells. However, we find no activity in regions 6 or 23-25, where possible enhancers were detected in some STARR-seq assays, whereas regions 20 and 22 were robust activators in our assays but were not consistently identified by STARR-seq (Fig. 2A-C). These differences between the assays might reflect two factors. First, we tested longer fragments (1.5 kb versus 600 bp), which were thus less likely to divide and inactivate an enhancer element. Second, our assays relied on enhancers acting on the endogenous basal promoter region, which may provide compatibility that is lacking in the genome-wide approach.
We assessed the responsiveness of the T2 and T3 promoters to the regulatory elements identified above. Fusion constructs containing regions 2 or 3 robustly activated transcription from T1, but not T2 or T3, suggesting that these basal elements are unlikely to generate much of the overall transcriptional output, a conclusion supported by RNA-seq analysis (Fig. S5) (Graveley et al., 2011; Brown et al., 2014; Attrill et al., 2016). Previous studies have focused largely on the regulatory potential of T2 (Puig et al., 2003; Casas-Tinto et al., 2007); our analysis indicates that much of the regulatory activity of this locus is likely to be channeled through the distal T1 promoter.
Positive and negative regulation of InR enhancers by FOXO
In addition to its role as a downstream effector of insulin signaling, the FOXO transcription factor also impacts InR expression (Jünger et al., 2003; Puig et al., 2003). Binding sites for FOXO are present at T2, and reporter genes containing these sequences are activated by FOXO, leading to the notion that T2 allows for FOXO activation of InR (Puig et al., 2003; Casas-Tinto et al., 2007). We assayed each element for activation by FOXO. Consistent with previous reports, the weak T2 promoter was modestly activated by FOXO in both S2 and Kc cells (Fig. 2B). By contrast, robust activation by FOXO was observed with region 2, and with region 4 in S2 cells. The T1 promoter was itself slightly repressed by FOXO expression in S2 cells. Strikingly, expression of FOXO had a strong and significant negative effect on other elements, including regions 3 and 22, which were repressed in both cell types. Fragment 20 was repressed in S2 cells, whereas it was activated by FOXO expression in Kc cells (Fig. 2B, Table S2).
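As an illustration only, the activation and repression calls described above can be expressed as a fold-change comparison between reporter activity with and without co-expressed FOXO. The following Python sketch is not the analysis used in this study: the function name, the 1.5-fold cutoff, the construct names and the luciferase readings are all hypothetical placeholders.

```python
from typing import Tuple


def classify_response(basal: float, with_factor: float,
                      fold_cutoff: float = 1.5) -> Tuple[float, str]:
    """Return (fold change, call) for one reporter construct.

    basal       -- reporter activity without the factor (arbitrary units)
    with_factor -- reporter activity with the factor co-expressed
    fold_cutoff -- assumed threshold for calling a response (hypothetical)
    """
    if basal <= 0:
        raise ValueError("basal activity must be positive")
    fold = with_factor / basal
    if fold >= fold_cutoff:
        call = "activated"
    elif fold <= 1.0 / fold_cutoff:
        call = "repressed"
    else:
        call = "unchanged"
    return fold, call


# Placeholder readings (basal, +factor); these are NOT data from the study.
readings = {
    "frag_A": (10.0, 40.0),
    "frag_B": (20.0, 5.0),
    "frag_C": (8.0, 9.0),
}

for name, (basal, treated) in readings.items():
    fold, call = classify_response(basal, treated)
    print(f"{name}: {fold:.2f}-fold, {call}")
```

The same comparison could be applied to 20E-treated versus untreated cells; in practice the cutoff would be replaced by a statistical test across replicates.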
Most FOXO response fragments may be indirectly regulated
To determine if the transcriptional effects mediated by FOXO were a consequence of direct interaction of the protein with these regulatory elements, we performed ChIP analysis using anti-FOXO serum. A previously characterized direct target of FOXO, the Thor (4EBP) promoter, was used as a positive control, which showed strong endogenous FOXO binding (Fig. 3A,B) (Teleman et al., 2008; Alic et al., 2011; Bai et al., 2013). T2 (included in region 18) also showed lower, but significant, FOXO enrichment (Fig. 3A,B) (Puig et al., 2003; Casas-Tinto et al., 2007). Surprisingly, none of the other elements that were transcriptionally regulated by FOXO expression exhibited strong binding by the factor (Fig. 3A,B). A prominent peak was observed on fragment 10, an element not activated or repressed by FOXO (Fig. 3A,B).
To further assess whether the signals that we observed represented FOXO binding, we treated cells with insulin to activate the signaling pathway, which should result in phosphorylation and exclusion of endogenous FOXO from the nucleus, or subjected cells to serum starvation, which should reduce signaling and increase FOXO activity (Puig et al., 2003). Insulin treatment resulted in modestly reduced FOXO ChIP signals on Thor, as well as regions 10 and 18, whereas starvation appeared to increase the ChIP signal (Fig. 3C).
Similarity of ecdysone and FOXO responses
Ecdysone treatment increases the expression of InR; however, the molecular mechanism of this regulation has not been elucidated (Gauhar et al., 2009). To determine how the hormone may affect the transcriptional elements of InR, we treated cells with 20-hydroxyecdysone (20E). The T1 promoter was slightly repressed by 20E in both cell types, whereas T2 and T3 were unaffected (Fig. 2C). Fragment 2 was robustly activated, whereas fragments 3 and 20 were significantly repressed by 20E in both cell types. In Kc cells, we observed cell type-specific activation of elements 9 and 10, which alone had not shown significant transcriptional potential. A greater number of elements showed reduction in activity after 20E treatment, although some of these effects were modest (Fig. 2C, Table S2). To determine whether these 20E responses required EcR, we assayed reporters in an EcR-deficient cell line (ΔEcR) that was derived from Kc cells (Swevers et al., 1996). No constructs responded to 20E treatment in these cells, indicating the requirement for EcR (Fig. 2D, Fig. S6). We transfected the ΔEcR cells with EcR and its heterodimeric partner USP and found that 20E responsiveness was restored, confirming the role of EcR in this regulation (Fig. 2D).
Interestingly, many of the elements tested, including 2, 3, 12 and 20, showed similar responses to 20E and FOXO, suggesting the involvement of linked pathways (Fig. 2B,C). Significantly, 20E signaling has been shown to affect FOXO localization by regulating PI3K activity, suggesting that some of the 20E effects might be mediated by FOXO activity (Colombani et al., 2005). In addition, FOXO has been reported to bind directly to the USP co-factor of EcR (Koyama et al., 2014). Thus, 20E might regulate some enhancers in the InR gene via FOXO activity. FOXO overexpression was able to regulate these elements in ΔEcR cells just as in wild-type Kc cells; thus, EcR is not required for FOXO activity (Fig. S7).
To gain further insight into 20E regulation, we compared our data in S2 cells with STARR-seq analysis in the same cell line treated with 20E (Fig. S8) (Shlyueva et al., 2014). Essential features were confirmed in both studies. In contrast to FOXO-responsive enhancers, the 20E-activated enhancers were directly bound by EcR and its binding partner USP, suggesting a mode of regulation involving EcR derepression in the presence of 20E (Fig. S8). None of the 20E-repressed areas correlated with directly bound EcR peaks, and thus these elements might be subject to indirect regulation. One gene induced by EcR is Eip74EF, which functions as a repressor (Shlyueva et al., 2014). ChIP-seq analysis in embryos indicates that this protein may interact with repressed regions 19 and 20, but other 20E-repressed elements may be controlled by a different factor. Eip74EF is also found to bind to region 2, which was activated by 20E. This binding might represent a progressive gene switch, in which initial derepression after loss of EcR binding is later followed by repression, as EcR-driven Eip74EF repressor levels increase.
Impact of Rb binding site on the InR promoter and enhancers
The direct actions of FOXO and EcR in InR expression had previously been supported by genetic and biochemical evidence. More recently, we noted that the T1 proximal promoter region of the InR gene is occupied in vivo by the Rbf1 (Rbf – FlyBase) tumor suppressor protein, the homolog of mammalian Rb (Acharya et al., 2012; Korenjak et al., 2012; Wei et al., 2015). Binding of Rbf1 appears to be of functional significance, as a T1 promoter construct is repressed by Rbf1 expression in S2 cells (Raj et al., 2012). To further explore the significance of Rbf1 protein interaction with the InR T1 promoter, we removed a 100 bp fragment centered on the Rbf1 binding peak (ΔRbf1) (Fig. 4A). This ΔRbf1 promoter showed modest but reproducibly higher activity than the wild-type T1 promoter, indicating that this Rbf1 binding region downregulates expression (Fig. 4B). The T1 promoter proximal region has a relatively modest transcriptional output, so we explored the significance of Rbf1 in the context of more active reporters with elements 2, 3 or 12. Particularly for the fusion containing region 3, the transcriptional impact of the small T1 deletion was much larger in absolute terms than that observed for just the basal promoter itself, suggesting that Rbf1 might not only reduce the functionality of local activators within T1, but also compromise the utility of the basal promoter for element 3 (Fig. 4C). Similar ‘booster’ roles for basal elements have been noted in developmentally active genes (Yuh and Davidson, 1996). The removal of the Rbf1 binding region did not dramatically alter the effects of FOXO expression, which activated element 2 and repressed 3 and 12 (Fig. 4C).
As a co-repressor, Rbf1 binds to E2F1 (also known as dE2F1) to block its activation function (Du and Pogoriler, 2006). Removal of the Rbf1 binding element, which includes E2F motifs, does not abrogate the function of T1, suggesting that other regulatory sites contribute to the activity of this promoter. We tested whether E2F1 activates the T1 promoter, and whether this requires the region involved in Rbf1 recruitment. Cotransfection of E2F1 significantly upregulated reporters containing the wild-type promoter as well as the ΔRbf1 T1 promoter; the fold stimulation was similar to that observed for the control PCNA promoter (Fig. 4D). PCNA was maximally stimulated at lower concentrations of transfected E2F1, suggesting that it might have different affinities for E2F1 binding (Fig. S9). The ΔRbf1 T1 promoter was activated by E2F1 to a higher level than the wild-type T1 promoter, consistent with the removal of the repressive function of Rbf1, and indicating that there might be additional E2F1 binding sites that are not suitable for Rbf1 recruitment, or that the activation occurs through an indirect effect via other transcription factors. Thus, Rbf1 has a repressive function on the T1 promoter and, more than merely interfering with local activators, this co-repressor might generally influence the ability of linked regulatory regions to fully engage and stimulate transcription from the T1 start site. This mode of regulation contrasts with the all-or-nothing effect observed for Rbf1 and Rb family proteins in general on cell cycle target promoters (Raj et al., 2012), and suggests that Rbf proteins might instead invoke a ‘soft’ regulatory function on certain target genes.
Transcriptional circuitry of the InR gene revealed by precise mapping of CREs
To obtain a more precise understanding of the transcriptional circuitry regulating the InR gene, we further analyzed each of the active enhancers and FOXO/20E response enhancers by making serial deletions (∼300 bp each, M1-M5) in each active fragment, testing all in S2 and Kc cells for their response to FOXO or 20E (Fig. 5A). The deletion series revealed the portions of each enhancer necessary for baseline activation, as well as regions that possess inherent repressive potential. Some deletions attenuated or abrogated the response to FOXO or 20E. With enhancer 2, a FOXO- and 20E-activated element, removal of region M1 reduced basal activity to the same level as T1, suggesting that the region contains an essential activator binding site(s) (Fig. 5B). Removal of M3 greatly induced the basal activity, indicating the presence of a repressor binding site(s). FOXO induction was somewhat attenuated by removal of either M3 or M5, indicating potential FOXO-dependent activator binding sites. The effects of these mutations are summarized using symbols for constitutive activators or repressors and for FOXO- or 20E-dependent activator or repressor effects in Fig. 5C.
For enhancer 2, removal of M3 produced a complex effect with 20E treatment: baseline expression increased, but the ability of 20E to activate was lost, and 20E instead caused repression (Fig. 5B). Repression on this element is almost certainly due to the direct binding of EcR, as this protein has been found to bind within this region (Fig. S8) and the removal of M3 has no derepressive effect in cells lacking EcR (Fig. 5B). We propose that the M3 mutant is repressed rather than activated by 20E treatment because the region contains activator sites, in addition to EcR binding sites, and these activator sites are important for overall enhancer activity. 20E treatment removes EcR and simultaneously triggers the expression of repressors (e.g. Eip74EF, which might act on 20E-repressed enhancers such as elements 3 and 12). The weaker complement of activators left on this version of enhancer 2 might be dominantly suppressed by the action of these 20E-induced repressors, whereas those on a wild-type enhancer would not.
We analyzed all of the regulatory fragments using the same deletion analysis in both cell types, as well as ΔEcR cells, and tested for responses to FOXO and 20E (Fig. S10). The results are summarized for all elements using symbols to indicate the presence of activator or repressor activities in subregions M1-M5 (Fig. 5C for results in Kc cells; Fig. S11 for results in S2 cells; Fig. S10 and Table S3).
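The reasoning used to assign activator or repressor activities to subregions M1-M5 can be summarized as a small decision rule. The sketch below is a hypothetical illustration in Python, not the procedure used to generate Fig. 5C: the function name, the 1.5-fold cutoff and the example readings are assumptions made for this example.

```python
from typing import List


def interpret_deletion(wt_basal: float, del_basal: float,
                       wt_induced: float, del_induced: float,
                       cutoff: float = 1.5) -> List[str]:
    """Infer what a deleted subregion contributes to an enhancer.

    wt_* and del_* are reporter activities for the intact and deleted
    enhancer, without (basal) and with (induced) FOXO or 20E.
    The cutoff is an assumed fold threshold (hypothetical).
    """
    if min(wt_basal, del_basal, wt_induced, del_induced) <= 0:
        raise ValueError("all activities must be positive")
    labels = []

    # Effect on baseline: losing activity implies an activator site was
    # removed; gaining activity implies a repressor site was removed.
    basal_ratio = del_basal / wt_basal
    if basal_ratio <= 1.0 / cutoff:
        labels.append("constitutive activator site")
    elif basal_ratio >= cutoff:
        labels.append("constitutive repressor site")

    # Effect on the response to the factor (fold change with/without).
    wt_fold = wt_induced / wt_basal
    del_fold = del_induced / del_basal
    if wt_fold >= cutoff and del_fold <= wt_fold / cutoff:
        labels.append("factor-dependent activator site")
    elif wt_fold <= 1.0 / cutoff and del_fold >= wt_fold * cutoff:
        labels.append("factor-dependent repressor site")
    return labels


# Placeholder readings, NOT data from the study.
print(interpret_deletion(10.0, 2.0, 30.0, 6.0))
print(interpret_deletion(10.0, 40.0, 30.0, 40.0))
```

A subregion can receive more than one label, as in the second call, where the deletion raises baseline activity and also attenuates induction.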
Combinatorial interactions of InR regulatory elements
Our detailed analysis of the cis-regulatory landscape of the InR gene indicates that multiple, parallel-acting elements contribute to the overall regulation of expression. Early studies emphasized the modularity of multiple enhancers acting on developmental genes, but a number of studies have since shown how some discrete cis-regulatory elements function in combinatorial manners (Small et al., 1993; Marinić et al., 2013; Bothma et al., 2015). Are the regulatory units identified in the InR gene independently acting units that function in an additive manner, or might there be higher-order interactions? To test this possibility, we fused cis-regulatory regions together and compared their activities with the individual parts (Fig. S12). For regions 2 and 3, the enhancers showed subadditive behavior, meaning that the activity of the fused construct was somewhat less than the sum of the individual activities. This effect might simply be a function of distance-dependent activation, a well-known property of cis-regulatory elements (despite the generalization that enhancers should work in a distance-independent manner) (Banerji et al., 1981). Although our reductionist analysis of the cis-regulatory elements of this gene serves to identify key properties of each of these molecular switches, a quantitative combinatorial understanding will come from re-integrating this information in the intact locus.
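To make the additivity comparison concrete, the following Python sketch compares a fused construct against the expectation for two independently acting enhancers. It is an illustration under stated assumptions, not the analysis behind Fig. S12: activities are taken to be normalized to the basal promoter (promoter alone = 1.0), each enhancer's contribution is taken as its activity above the promoter, and the 20% tolerance and the example values are hypothetical.

```python
from typing import Tuple


def interaction_mode(act_a: float, act_b: float, act_fused: float,
                     promoter_only: float = 1.0,
                     tolerance: float = 0.2) -> Tuple[float, str]:
    """Compare a fused two-enhancer reporter with the additive expectation.

    Returns (expected additive activity, call). The tolerance is an
    assumed margin for calling a deviation (hypothetical).
    """
    # Additive expectation: promoter plus each enhancer's own contribution.
    expected = promoter_only + (act_a - promoter_only) + (act_b - promoter_only)
    if expected <= 0:
        raise ValueError("expected additive activity must be positive")
    ratio = act_fused / expected
    if ratio < 1.0 - tolerance:
        call = "subadditive"
    elif ratio > 1.0 + tolerance:
        call = "superadditive"
    else:
        call = "additive"
    return expected, call


# Placeholder activities, NOT data from the study.
print(interaction_mode(5.0, 3.0, 4.0))   # fused construct below expectation
print(interaction_mode(5.0, 3.0, 7.0))   # fused construct matches expectation
```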
Gene size and regulatory complexity
With the completion of the first invertebrate metazoan genome sequences, the Carroll laboratory noted a correlation between gene size and functional designation (Nelson et al., 2004). A compact structure was associated with widely expressed genes, such as those for ribosomal proteins, while transcription factors and signaling molecules were on average encoded by genes with larger intergenic spaces (subsequent annotation of distal 5′ start sites meant that some intergenic spaces are actually large introns). The elaborate expression of very large genes, such as those of the Hox clusters, represents the combined action of multiple tissue-specific regulatory elements active at different developmental stages (Montavon and Soshnikova, 2014). For these genes, the amount of non-coding DNA is clearly linked to this regulatory complexity. On the other hand, signaling molecules, such as the insulin receptor, are broadly expressed, with transcriptional expression that is ranked among the more stable (top third of all genes, rating just below ribosomal protein transcripts) (Pérez-Lluch et al., 2015). That study found that these ‘broadly expressed’ genes, both compact and large in size, share common histone modification states, in contrast to the different chromatin states associated with dynamically regulated genes, such as those for transcription factors. Thus, broadly expressed genes of both compact and large size share common regulatory and genomic features.
Why do genes, such as the insulin receptor, span substantial genomic regions? One observation relating to developmental gene expression is that genes harboring large introns demonstrate substantially different regulatory kinetics, whereby the traversing of extra genomic distance necessarily introduces a lag in induction and repression (Arnosti, 2011; Bothma et al., 2011). Although this effect might play a role in InR expression as well, it is clear that the intronic sequences of the gene serve a role that is far more complex than that of a simple spacer; we find that the transcribed space of the gene contains multiple regulatory elements that provide both redundant as well as contrasting regulatory output, some in a cell-specific manner (Fig. 6A). Other broadly expressed genes, such as those encoding ribosomal proteins, can exhibit significant regulatory responses to environmental signals, yet have relatively compact structures (Teleman et al., 2008). We speculate that there are two selective forces at play that explain the amount of regulatory DNA needed. First, simple promoters, such as those driving ribosomal protein genes, might provide the high levels of activity needed for abundant transcripts, with a certain level of dynamic regulation, but the regulation of these genes may reflect only a few types of signaling input (e.g. TOR, cell cycle) (Powers and Walter, 1999; Martin et al., 2004; Wei et al., 2015). Second, InR expression, by contrast, is wired into many different signaling pathways and might need additional regulatory DNA to provide modules suitable for responses in different developmental settings, including growth control of larval tissues, stem cell niches in the larva and adult, and non-proliferating adult neurons. 
In addition to providing a wide spectrum of regulatory inputs, the extensive regions of DNA dedicated to control might provide necessary redundancy, so as to achieve a high degree of precision that is buffered from environmental and genetic noise.
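The kinetic lag associated with long transcription units, noted above, can be estimated with simple arithmetic: the minimum time to complete a transcript is the transcribed length divided by the elongation rate. The Python sketch below is illustrative only. The elongation rate of 1.5 kb/min is an assumed round value, not a measurement from this study, and the 5 kb comparison gene is hypothetical; only the ∼50 kb span of InR comes from the text.

```python
def elongation_lag_minutes(transcribed_kb: float,
                           rate_kb_per_min: float = 1.5) -> float:
    """Minimum time for RNA polymerase II to traverse a transcription unit.

    rate_kb_per_min is an ASSUMED elongation rate (hypothetical default).
    """
    if transcribed_kb < 0 or rate_kb_per_min <= 0:
        raise ValueError("length must be non-negative and rate positive")
    return transcribed_kb / rate_kb_per_min


# ~50 kb InR locus versus a hypothetical compact 5 kb gene.
for label, length_kb in [("InR (~50 kb)", 50.0), ("compact gene (5 kb)", 5.0)]:
    lag = elongation_lag_minutes(length_kb)
    print(f"{label}: at least {lag:.1f} min per transcript")
```

Under this assumption the lag scales linearly with gene length, so a tenfold longer transcription unit delays the first complete transcript tenfold.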
In contrast to the highly specific on/off regulation exhibited by many developmental genes, the regulatory elements of InR appear to be tuned to maintain moderate responses to signals. For example, both InR and the Eip74EF/Eip75B genes are regulated by 20E and EcR (Gauhar et al., 2009; Bernardo et al., 2014). The InR gene contains elements that are either activated or repressed by 20E, whereas Eip74EF/Eip75B contain multiple copies of 20E activator elements. Upon exposure to 20E, Eip74EF/Eip75B expression levels increase dramatically, whereas InR levels increase much more modestly (Fig. S13) (Bernardo et al., 2014; Mirth et al., 2014). The incoherent feed-forward properties of the InR gene, with simultaneous positive and negative effects, might ensure more precise changes in gene expression, preventing pleiotropic impacts on the downstream signaling pathway. Similarly, the presence of FOXO-activated and FOXO-repressed enhancers within the InR locus might allow FOXO to achieve precise temporal and spatial control of InR. In one model, the incoherent signaling may enable a temporally complex expression pattern, whereby the direct action of FOXO first transiently upregulates InR gene expression, followed by a delayed downregulation via the indirectly repressed enhancers (Fig. 6A,B). In addition, the multiple layers of regulation by FOXO might also provide tissue specificity, whereby FOXO-driven activation and repression signals may be weighted differently in different cellular contexts (Fig. 6A,C). Interestingly, a recent study of the broadly expressed ptc gene in Drosophila reveals a similar genetic architecture, with large amounts of regulatory DNA devoted to fine-tuning of gene expression via multiple independent elements (Lorberbaum et al., 2016). 
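The temporal model described above, in which direct activation by FOXO is followed by delayed, indirect repression, can be illustrated with a toy incoherent feed-forward simulation. The Python sketch below is a minimal qualitative model, not a fit to any data in this study: all rate constants, the single hypothetical intermediate repressor, and the simple forward-Euler integration are assumptions chosen only to show the shape of the response.

```python
from typing import List


def simulate_iffl(t_end: float = 60.0, dt: float = 0.01,
                  direct_gain: float = 2.0, constitutive: float = 3.0,
                  k_rep: float = 0.1, d_rep: float = 0.1,
                  K: float = 0.1, d_mrna: float = 1.0) -> List[float]:
    """Toy model: FOXO switches on at t = 0 and stays on.

    FOXO directly adds `direct_gain` to InR transcription and also
    induces a slowly accumulating repressor that damps the otherwise
    constitutive enhancers. All parameters are arbitrary (hypothetical).
    Returns the InR mRNA level at each time step.
    """
    steps = round(t_end / dt)
    foxo = 1.0
    repressor = 0.0
    mrna = constitutive / d_mrna      # steady state before FOXO is on
    trace = [mrna]
    for _ in range(steps):
        # Direct arm (fast) plus indirectly repressed arm (slow decline).
        production = direct_gain * foxo + constitutive / (1.0 + repressor / K)
        d_m = production - d_mrna * mrna
        d_r = k_rep * foxo - d_rep * repressor
        mrna += dt * d_m
        repressor += dt * d_r
        trace.append(mrna)
    return trace


trace = simulate_iffl()
peak = max(trace)
print(f"start {trace[0]:.2f}, peak {peak:.2f}, end {trace[-1]:.2f}")
```

With these arbitrary parameters the mRNA level rises transiently and then settles below its starting value; weighting the two arms differently (for example, a larger `direct_gain`) would change the final level, which is one way to picture the tissue-specific outcomes proposed above.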
Our results suggest that the complex regulatory control found for InR might be representative of certain broadly expressed genes that have functions that necessitate complex developmental and physiological inputs, and thus a high degree of regulatory precision. Whether gene size generally reflects regulatory sophistication in other metazoan genomes remains to be explored; the selection on non-coding DNA varies in different lineages (Hartl, 2000). It is interesting that the mammalian INSR gene spans ∼200 kb, with many features consistent with enhancer complexity, including a predicted ‘super-enhancer’ state within the transcription unit (Wei et al., 2016).
FOXO regulation via direct and indirect pathways
Our study provides new insights into the transcriptional regulation mediated by FOXO, a key player in nutritionally driven developmental plasticity and insulin sensitivity (Tang et al., 2011). Feedback regulation by FOXO transcription factors controls the expression of InR (Jünger et al., 2003; Puig et al., 2003; Puig and Tjian, 2005). This aspect of transcriptional regulation of InR has been studied in molecular detail; based on genetic perturbation studies and transcriptional reporter assays with short segments of the InR gene, a previous model suggested that FOXO regulation consists of direct binding to activate a single internal promoter (Puig et al., 2003; Casas-Tinto et al., 2007). Our data indicate that FOXO regulation is far more complex. We confirm the direct, if modest, activating role of FOXO on an internal promoter, but FOXO also indirectly activates or represses at least half a dozen additional enhancers located within introns of the InR gene (Fig. 2B, Fig. 6A). The majority of this regulation appears to rely on transcriptional intermediates and multiple regulatory layers (Fig. 6A).
How common is such concerted direct and indirect regulation by FOXO? Hundreds of genes are suggested to be direct targets of FOXO regulation in Drosophila, although few have been investigated further for transcriptional regulation (Alic et al., 2011; Bai et al., 2013). For genes with small promoter regions, such as Thor, direct activation by FOXO might represent the bulk of the regulation. However, other genes appear to be subject to incoherent feed-forward regulation, in which a factor confers both positive and negative effects. The RpL24-like promoter is directly repressed by FOXO and is activated by the transcription factor Myc, which is in turn activated by FOXO, establishing a two-layer regulation of this ribosomal protein gene (Teleman et al., 2008; Alic et al., 2011; Herter et al., 2015). The compact promoter of RpL24-like probably does not approach the complexity of regulation seen with InR, which might reflect the importance of fine control of the receptor gene at the apex of this signaling cascade. Thus, it remains to be established how often FOXO target genes are regulated via multiple enhancers through complex direct and indirect paths, but simple direct activation may represent only one class of important FOXO effects.
Regulation of insulin receptor expression in development and disease
Molecular analysis of the transcriptional controls for Hox and other developmental genes has transformed our understanding of the mechanisms of gene expression, development and evolution. The transcriptional control of broadly expressed genes, such as InR, has received less attention. However, the sophistication of InR transcriptional wiring, which produces fewer extremes of expression, clearly points to strong selection for specific types of expression. Distinct variants in the coding regions of this gene are strongly selected in different Drosophila populations, and it is likely that sequence variation within InR intronic enhancers will be targets for evolution at a population and species level (Paaby et al., 2014) (A.S., unpublished). Understanding the impact of standing natural variation is also likely to provide insights into pathological states. In a Drosophila cancer model system, tumorigenesis associated with a high-sugar diet involves InR upregulation via the Wnt signaling pathway, and misregulation of human INSR is noted in cancer, type 2 diabetes and Alzheimer's disease (Gunton et al., 2005; Freude et al., 2009; Belfiore and Malaguarnera, 2011; Hirabayashi et al., 2013). Indeed, human sequence variants associated with type 2 diabetes and Alzheimer's disease lie within candidate INSR enhancers (Pasquali et al., 2014). The role we find for retinoblastoma protein in the regulation of InR points to a coordination of cell cycle and cellular signaling.
Our transcriptional map identifies enhancers that are likely to be targets of functional mutations, although more detailed studies will provide a better measure of candidate transcription factor binding sites. Many genome-wide studies utilize chromatin marks as proxies for active enhancers; however, it is significant that despite the many classes of ENCODE data available for this system, direct tests were essential for the identification and characterization of enhancers, which were not easily identified from general chromatin features. In fact, even dynamic chromatin features may reflect off-target effects of transcription factors, rather than functional interactions (Kok et al., 2015). Although likely to be incomplete, our identification of the cis-regulatory circuitry of InR represents a first step in enabling the construction of computational models, which can be tested in physiological settings to understand the impact of regulatory sequence variation and signaling (Samee et al., 2015; Sayal et al., 2016).
MATERIALS AND METHODS
Fly strains and transgenic lines
Fly strains were obtained from the Bloomington Stock Center as indicated in the supplementary Material and Methods. Transgenic flies were generated using a BAC containing the entire InR locus. mRNA from 3-day-old transgenic flies was analyzed by qPCR. For details, see the supplementary Material and Methods.
Luciferase reporters and assays
Luciferase reporters were constructed as previously described (Zhang et al., 2014). For details of reporters and luciferase assays, see the supplementary Material and Methods, Tables S4 and S5.
Cell culture and transfection
Drosophila S2 cells, Kc cells (Kc167) and ΔEcR cells (derived from Kc cells, obtained from Drosophila Genomics Resource Center, ID: L57-3-11) were cultured in Schneider's medium (Gibco) supplemented with 10% FBS (Gibco) and penicillin-streptomycin (100 units/ml penicillin, 100 μg/ml streptomycin, Gibco). Details of transfections are provided in the supplementary Material and Methods.
Kc or S2 cells were treated with 20E and then subjected to qPCR analysis as described in the supplementary Material and Methods.
Published STARR-seq, DNase hypersensitive site (DHS)-seq, FAIRE and ChIP-seq data were obtained and analyzed as described in the supplementary Material and Methods.
We thank Dr Carla Margulies (Ludwig Maximilian University of Munich, Germany) for FOXO antibodies, Dr Alex Shingleton (Michigan State University/Lake Forest College) for fly strains, and Dr Bartek Wilczyński (University of Warsaw, Poland) and members of the D.N.A. and Henry laboratories (Michigan State University) for helpful discussions.
Y.W. generated the reporter library, conducted all reporter assays in different cell types, carried out FOXO ChIP and analyzed reporter data. R.H.G. conducted the BAC rescue assays and tested Janelia GAL4 lines for tissue-specific enhancers. A.S. carried out population analysis of the InR locus. K.M.M. and A.I. helped to construct the reporter library and test Janelia GAL4 lines, respectively. The project was directed and the manuscript was written by Y.W., R.H.G., A.S. and D.N.A.
This study was supported by the National Institutes of Health [GM056976 to D.N.A.]. Deposited in PMC for release after 12 months.
The authors declare no competing or financial interests.