Within the last 3 years, genome-wide association studies (GWAS) have had unprecedented success in identifying loci that are involved in common diseases. For example, more than 35 susceptibility loci have been identified for type 2 diabetes and 32 for obesity thus far. However, the causal gene and variant at a specific linkage disequilibrium block is often unclear. Using a combination of different mouse alleles, we can greatly facilitate the understanding of which candidate gene at a particular disease locus is associated with the disease in humans, and also provide functional analysis of variants through an allelic series, including analysis of hypomorph and hypermorph point mutations, and knockout and overexpression alleles. The phenotyping of these alleles for specific traits of interest, in combination with the functional analysis of the genetic variants, may reveal the molecular and cellular mechanism of action of these disease variants, and ultimately lead to the identification of novel therapeutic strategies for common human diseases. In this Commentary, we discuss the progress of GWAS in identifying common disease loci for metabolic disease, and the use of the mouse as a model to confirm candidate genes and provide mechanistic insights.
The power of GWAS for studying human disease
Genome-wide association studies (GWAS) in human populations have been carried out for a broad range of common diseases and disease subtypes, including bipolar disorder, coronary heart disease, Crohn’s disease, hypertension, rheumatoid arthritis, obesity, and type 1 and type 2 diabetes (Wellcome Trust Case Control Consortium, 2007). These studies have been huge undertakings, with large numbers of individuals in case-control groups and the application of very-high-density single-nucleotide polymorphism (SNP) genotyping. In many cases, this has required a high degree of international cooperation, including meta-analysis in order to maximise the statistical power for the discovery and replication of newly identified loci that are associated with disease traits (Zeggini et al., 2007). This approach has been very successful, identifying numerous loci and novel candidate genes that are directly implicated in human disease; for type 2 diabetes, the current confirmed total stands at approximately 35 loci (Voight et al., 2010). However, each of the identified genes has a relatively small effect on disease risk, explaining only a very small part of the observed familiality in type 2 diabetes (Saxena et al., 2007; Sladek et al., 2007; Zeggini et al., 2007; Dupuis et al., 2010; Voight et al., 2010). Thus, the predictive value of these loci for assessing an individual’s disease risk is disappointing. There might be additional low-frequency or rare alleles of the identified genes in the population that further contribute to heritability, as is being found for hypertriglyceridemia (Gloyn and McCarthy, 2010; Johansen et al., 2010); however, the true value of identifying these disease-associated loci is that they provide insight into the biology of the disease and uncover potential new targets and pathways for future therapeutic intervention.
Although the magnitude of the effect of common disease variants identified at the population level is small, there is still the possibility that large therapeutic (or detrimental) effects can be generated through pharmacological manipulation of a potential target or, indeed, through more extreme alleles.
The ultimate translation of disease-associated loci identified through GWAS into benefit for patients presents a number of challenges. First, GWAS identify novel loci that are defined by linkage disequilibrium (LD) blocks (also known as haplotype blocks; sequences that are inherited together in a non-random fashion), rather than identifying the causal genetic variant(s), although the precision of high-density SNP panels means that the regions identified are small. Second, the SNPs that are associated with disease are not necessarily functional themselves: they might mark a variant that affects another gene, and it is this gene that gives rise to the phenotypic consequence. Functional identification of the causal gene(s) within a disease-associated locus is the first essential step towards gaining new biological insight and identifying potential therapeutic targets.
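To make the notion of LD between two SNPs concrete, the following minimal Python sketch computes the standard pairwise measures D, |D′| and r² from haplotype counts. The function name and the example counts are illustrative assumptions and are not taken from any of the studies cited here.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Pairwise LD for two biallelic SNPs (alleles A/a and B/b).

    Arguments are counts of the four possible haplotypes.
    Returns (D, |D'|, r_squared).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype must be observed")

    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A
    p_B = (n_AB + n_aB) / total    # frequency of allele B

    # D: departure of the haplotype frequency from independence
    d = p_AB - p_A * p_B

    # D' scales D by its maximum possible magnitude given the allele frequencies
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two SNPs
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# Hypothetical counts: the two SNPs are always inherited together
print(ld_stats(50, 0, 0, 50))
# Hypothetical counts: the two SNPs are inherited independently
print(ld_stats(25, 25, 25, 25))
```

An associated SNP with high r² to an untyped variant is what is meant by a SNP "tagging" that variant: the association signal cannot distinguish between the two.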
GWAS of type 2 diabetes and obesity
Some of the greatest success in identifying previously unknown susceptibility loci for common diseases has been in the identification of loci for type 2 diabetes and obesity. In 2006, the TCF7L2 gene [a key Wnt signalling transcription factor encoded within a previously mapped chromosomal region that was suggestively linked with type 2 diabetes (Reynisdottir et al., 2003)] was directly implicated in the disease, initially through strong association in an Icelandic population that was then replicated in two other populations of European ancestry (Grant et al., 2006). This result was later widely replicated in other populations. Mechanistic work indicates that this gene is involved in insulin secretion, probably at the level of regulating genes that are required for secretory granule exocytosis (see Xavier et al., 2009). The following year, 2007, was a landmark for type 2 diabetes and obesity GWAS. One of the first associations to be published was of SNPs within the first intron of a gene called FTO (Dina et al., 2007; Frayling et al., 2007; Scuteri et al., 2007): these SNPs were strongly linked to type 2 diabetes, but this association was lost when an adjustment for body mass index (BMI) was made (Frayling et al., 2007). The FTO locus predisposes both children and adults to obesity, and consequently increases the risk of type 2 diabetes (Dina et al., 2007; Frayling et al., 2007). Most importantly, FTO was a gene of unknown function in an unknown pathway (Frayling et al., 2007) and, therefore, there was much interest in determining whether FTO was the causal gene underlying the obesity association. This initial set of studies was rapidly followed by the publication of additional loci that were directly associated with type 2 diabetes (Scott et al., 2007; Sladek et al., 2007; Wellcome Trust Case Control Consortium, 2007; Zeggini et al., 2008; Voight et al., 2010). 
Of the loci identified to date, and the putative candidate genes contained within them, many are likely to affect insulin secretion, whereas few are potentially involved in insulin action (Voight et al., 2010) (Table 1). Interestingly, some loci are associated with different pathologies: for example, the KLF14 locus has been identified in studies of both type 2 diabetes and basal cell carcinoma, although the functional significance of these associations remains to be determined (Stacey et al., 2009; Voight et al., 2010).
- Dominant negative:
a mutation that adversely alters the gene product by interfering with the function of the normal gene product (in heterozygotes).
- Effect size (also known as odds ratio):
a measure of relative risk for a disease association compared with controls.
- Genome-wide association study (GWAS):
an unbiased method, using a dense array of single-nucleotide polymorphisms as genetic markers, to detect associations between genotype frequency and a trait of interest in large populations of patients and healthy controls.
- Hypermorph:
a mutation that causes an increase in normal gene function; also known as gain of function.
- Hypomorph:
a mutation that causes a partial loss of gene function.
- Knockout:
complete ablation of gene function (also referred to as null), often involving the removal or replacement of large genomic regions including coding and intronic sequence.
- Linkage disequilibrium:
the tendency of alleles located close to each other to be inherited together due to a recent mutation, genetic drift or selection; often observed as a correlation between genotypes and closely linked markers.
- Overexpression:
the increased expression of a gene of interest or mutant variant controlled by the endogenous regulatory sequence or a defined promoter sequence. Can be either expressed from a defined genomic locus or randomly integrated into the genome.
- Power:
the probability of detecting a statistically significant variant assumed to be causal.
- Replication:
the evaluation of specific association signals in additional independent samples and populations.
- SNP:
single-nucleotide polymorphisms are defined as variations of a single base pair that occur at a significant frequency in a population.
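As a worked illustration of the effect size entry in the glossary above, the sketch below computes an allelic odds ratio and an approximate 95% confidence interval from a 2x2 table of allele counts in cases and controls. The counts are invented for illustration, and the interval uses the usual normal approximation on the log odds ratio; this is a sketch of the arithmetic, not the analysis pipeline of any study cited here.

```python
import math


def odds_ratio(case_risk, case_other, control_risk, control_other, z=1.96):
    """Allelic odds ratio with an approximate confidence interval.

    Arguments are allele counts: risk and non-risk alleles observed in
    cases and in controls. z=1.96 gives an approximate 95% interval.
    All four counts must be greater than zero.
    """
    if min(case_risk, case_other, control_risk, control_other) <= 0:
        raise ValueError("all four counts must be positive")

    # Odds of carrying the risk allele in cases divided by the odds in controls
    or_value = (case_risk * control_other) / (case_other * control_risk)

    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    lower = math.exp(math.log(or_value) - z * se_log_or)
    upper = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, lower, upper


# Hypothetical allele counts, chosen only to illustrate the calculation
print(odds_ratio(case_risk=1100, case_other=900,
                 control_risk=1000, control_other=1000))
```

An odds ratio close to 1 with a confidence interval that excludes 1 corresponds to the situation described in the main text: a statistically robust association with a small effect on individual risk.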
Similar success has been achieved in GWAS of obesity: following a flurry of papers in 2009, the total number of loci associated with obesity is at least 32 (Table 2) (Meyre et al., 2009; Thorleifsson et al., 2009; Willer et al., 2009; Heid et al., 2010; Scherag et al., 2010; Speliotes et al., 2010). Many of these loci contain genes that point towards a role for neurons and the brain in the disease, particularly the hypothalamus [which has a major role in obesity (for a review, see Belgardt et al., 2009)], whereas fewer loci might influence adipocyte biology and still others are completely novel.
Approaches for identifying the underlying disease-associated gene
There are several complementary approaches being applied to identify the underlying causative gene within each disease-associated locus. The availability of the human genome sequence allows a list of genes within a locus to be compiled, and the haplotype map (HapMap) data on linkage disequilibrium between SNPs allows the size of a locus to be defined. Available information about each of the listed genes – for example, about SNP proximity, gene expression, expression quantitative trait loci (eQTL; see below), known animal models and the phenotype of interest – can then be collated, and the plausibility of each gene as a candidate involved in disease can be evaluated.
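The first step described above, defining the extent of a locus from LD data and compiling the genes within it, can be sketched in a few lines of Python. The SNP identifiers, positions, r² values, gene names, the r² threshold of 0.8 and the optional flanking distance are all hypothetical choices made for illustration; in practice the inputs would come from resources such as HapMap and a genome annotation.

```python
def locus_bounds(snps, r2_threshold=0.8):
    """Return (start, end) spanned by SNPs in LD with the lead SNP.

    snps: list of (snp_id, position, r2_with_lead_snp) tuples. The lead SNP
    itself should be included with r2 = 1.0 so the result is never empty.
    """
    positions = [pos for _, pos, r2 in snps if r2 >= r2_threshold]
    if not positions:
        raise ValueError("no SNP meets the r2 threshold")
    return min(positions), max(positions)


def genes_in_locus(genes, start, end, flank=0):
    """Names of genes overlapping [start - flank, end + flank].

    genes: list of (name, gene_start, gene_end) tuples on the same chromosome.
    """
    lo, hi = start - flank, end + flank
    return [name for name, g_start, g_end in genes
            if g_start <= hi and g_end >= lo]


# Hypothetical data for one locus (positions in base pairs)
snps = [
    ("lead_snp", 1000, 1.00),
    ("snp_2", 900, 0.90),
    ("snp_3", 1500, 0.85),
    ("snp_4", 3000, 0.20),   # weak LD with the lead SNP: outside the block
]
genes = [
    ("GENE_A", 800, 950),
    ("GENE_B", 1600, 2000),
    ("GENE_C", 5000, 6000),
]

start, end = locus_bounds(snps)
print(start, end)
print(genes_in_locus(genes, start, end))             # strict overlap only
print(genes_in_locus(genes, start, end, flank=200))  # include nearby genes
```

The flank parameter reflects the point made in the text that a causal gene need not lie inside the LD block itself; the MC4R example that follows, where associated SNPs lie 188 kb from the gene, is a case in which a strict overlap would miss the obvious candidate.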
In some cases there might be very strong candidates; for example, SNPs 188 kb downstream of the MC4R gene were found to be associated with elevated BMI at the population level (Loos et al., 2008). The MC4R gene is well characterised and is known to be involved in the control of food intake by the hypothalamus. Furthermore, there are known monogenic mutations in MC4R that cause severe childhood obesity (Vaisse et al., 1998; Yeo et al., 1998). Although MC4R is an obvious candidate, the functional evidence showing how the DNA marked by the associated SNPs influences MC4R expression or MC4R protein function is missing. Understanding how the SNPs, or the true functional variants that they mark, influence MC4R and BMI is an important scientific question. The fact that this gene is involved in obesity indicates that MC4R might be a potential therapeutic target, although an MC4R agonist has been shown to increase blood pressure in humans (Greenfield et al., 2009).
When there is no obvious candidate gene within an identified locus, the search for a causative gene is more problematic. One approach being used to address this problem is to exploit eQTL data (e.g. Nica et al., 2010; Nicolae et al., 2010). These datasets are constructed by correlating expression levels of genes, usually measured using a highly parallel expression profiling technology, with the presence of different SNPs. Using natural genetic polymorphisms between individuals, differences in expression of specific genes between individuals can be mapped to SNPs. Many of these SNPs are in cis with the gene that is being regulated, although this is not always the case. Using these datasets, it is possible to ask: are the associated SNPs, or flanking DNA, linked with the expression of another gene in the region? If there is a gene being regulated according to the genotype of a GWAS-associated SNP (or one in LD with it), then that gene is a candidate to test for its potential involvement in the disease. One potential limitation of this approach is that much of the available highly parallel expression data is from human lymphoblastoid cell lines, which might not faithfully replicate endogenous patterns of expression in vivo – as in, for example, a neuron in the hypothalamus of the brain that controls food intake (Nicolae et al., 2010). It has also been suggested that many genes have an eQTL associated with them; as these datasets grow in size, the correlations between trait-associated SNPs and eQTLs might occur by chance (Nica et al., 2010). Approaches are being developed to address some of these issues, such as taking account of LD in the analysis (Nica et al., 2010). For example, in a study of fasting diabetes-related traits, eQTL analysis indicated that FADS1 is a candidate gene, because the most closely associated SNP is in LD with an SNP that is thought to be involved in regulating the levels of FADS1 mRNA (Dupuis et al., 2010).
Furthermore, a recent type 2 diabetes GWAS reported a strong IRS-1 cis eQTL (Voight et al., 2010).
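The core calculation behind an eQTL, correlating the expression level of a gene with genotype at a SNP, can be sketched as a simple linear regression of expression on allele dosage (0, 1 or 2 copies of one allele). The dosages and expression values below are invented, and the sketch omits everything a real analysis would need, such as covariates, normalisation and correction for the very large number of SNP-gene pairs tested.

```python
import math


def eqtl_fit(dosages, expression):
    """Least-squares fit of expression on allele dosage for one SNP-gene pair.

    dosages: number of copies (0, 1 or 2) of the tested allele per individual.
    expression: expression level of the gene in the same individuals.
    Returns (slope, intercept, pearson_r).
    """
    if len(dosages) != len(expression) or len(dosages) < 2:
        raise ValueError("need two equal-length lists with at least 2 samples")

    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("dosage and expression must both vary across samples")

    slope = sxy / sxx                    # expression change per allele copy
    intercept = mean_y - slope * mean_x  # expected expression at dosage 0
    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Hypothetical data: expression rises with each copy of the tested allele
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
print(eqtl_fit(dosages, expression))
```

Because a fit of this kind is computed for every SNP-gene pair, some strong correlations are expected by chance, which is the concern raised above about overlaps between trait-associated SNPs and eQTLs in large datasets.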
As we alluded to above, disease-associated SNPs might tag functional variants that are in LD with the SNPs used in mapping. For a small number of genes, the coding variants are known and are in LD with associated SNPs; for example, an SNP that is linked to type 2 diabetes and is located near the glucokinase regulatory protein (GCKR) gene is in LD with a P446L variant that was previously associated with glucose phenotypes, such as elevated fasting glucose and insulin levels, and for which there is some functional evidence that it influences glucokinase (GCK) activity (Beer et al., 2009; Dupuis et al., 2010). Consequently, another potentially valuable approach for identifying causal variants is the sequencing of individuals with and without the disease to investigate a locus for other potentially causal variants in genes. Furthermore, there might be other variants in different population subsets or extreme cases. Such an approach was described above for hypertriglyceridemia (Johansen et al., 2010).
A novel approach using zebrafish to identify causal genes has been described by Ragvin et al. (Ragvin et al., 2010). They took the sequence within LD blocks for three loci from type 2 diabetes associations and searched across species for highly conserved noncoding elements that potentially marked genomic regulatory blocks. They then tested these blocks using a transgenic reporter in zebrafish embryos and correlated expression patterns with other genes in the synteny region (where the gene order is conserved between species). For example, the region containing CDKAL1 contains several additional genes, including SOX4, and the associated SNPs seem to mark a highly conserved element within the fifth intron of CDKAL1. Ragvin et al. took this element and tested it for enhancer activity in zebrafish embryos; they found that the pattern of expression directed by this element closely resembled that of sox4 and not cdkal1. They concluded that SOX4, for which there is biological evidence supporting a role in insulin secretion, was a more likely candidate for involvement in type 2 diabetes than was CDKAL1, which was more probably a bystander (Ragvin et al., 2010). This is a very interesting approach, although it might be limited owing to the requirement to use embryos. In addition, the restricted patterns of expression of some of the other genes within the tested regions might result in some genes being rejected as the causal gene because they are not expressed in zebrafish either at that time or location.
The approaches discussed above only identify candidate genes – thus, it is still essential to functionally validate each gene. Ideally this would include a definition of the causal genetic variant linked to the mapped SNPs within a gene, or its functional regulatory elements, and how these changes affect the function of the target gene(s). Furthermore, this validation would address the function of those genes in an intact organism and how they give rise to increased disease risk. As discussed above, another complementary approach would be to test the candidate genes for a functional role in disease without necessarily understanding the details of the functional polymorphisms marked by the human SNPs. In this latter approach, the mouse, as well as other model organisms, can play a key role.
The value of mouse models in diabetes and obesity research
Mouse models have great potential to help take GWAS data to the next level by investigating the function of associated genes in vivo. This section will focus on the use of the mouse to model human metabolic disease. The mouse is a good mammalian model organism for several reasons: the availability of its complete genomic sequence, genetically defined strains and extensive genetic manipulation tools; the ease of breeding; the ability to control breeding and the phenotyping environment; and the availability of a wide array of phenotyping tests, many of which are standardised and easily applied. But how good a model is the mouse for human obesity and type 2 diabetes? As discussed below, many aspects of the physiology of these disorders appear to be the same in both species, although there are some differences. Notably, the control of blood glucose is similar in many respects: both humans and mice exhibit impaired glucose tolerance and overt diabetes that can be diagnosed with essentially the same symptoms and tests.
Mouse models of type 2 diabetes and dysregulated glucose metabolism
An interesting difference between mice and humans is that mice have high levels of high-density lipoprotein (HDL) cholesterol that might protect them against atherosclerosis. Dyslipidaemia is an important component of metabolic syndrome (diabetes, obesity and hypertension) in patients. The basis of this difference might be explained by two recent papers showing that miR-33a and miR-33b, which are located within the two genes encoding sterol regulatory element-binding proteins (SREBPs), are involved in regulating cholesterol biosynthesis (Najafi-Shoushtari et al., 2010; Rayner et al., 2010) (for a review, see Brown et al., 2010). These microRNAs target several mRNAs for destruction, including that of the ABCA1 gene, which is required for cholesterol transport and, in the liver, for HDL production. Unlike humans, mice lack miR-33b, which is encoded in an intron of SREBP1, and so might have higher levels of ABCA1 and thus higher levels of HDL. It will be interesting to see whether this is indeed the case and, more importantly, what roles these microRNAs have in human metabolic syndrome (Brown et al., 2010; Najafi-Shoushtari et al., 2010).
Mutations in the mouse can give rise to similar phenotypes to those observed in humans carrying the corresponding mutations. For example, point mutations in the GCK gene give rise to maturity-onset diabetes of the young 2 (GCK-MODY2) in humans and to a similar phenotype in the mouse (e.g. Toye et al., 2004). In the case of at least one GCK mutation in the mouse, in addition to inducing a dominant phenotype, the mutation is homozygous viable, and there is a more severe glucose intolerance phenotype in homozygous mice compared with heterozygous mice (Toye et al., 2004). By contrast, mouse homozygous GCK-knockout mutations are lethal within a week of birth due to severe diabetes (Bali et al., 1995; Grupe et al., 1995). Rare homozygous viable human mutations in GCK have also been described, although disease in these patients is managed through insulin treatment (Njolstad et al., 2001). Therefore, how closely the mouse phenotype matches the human phenotype for a disease-associated gene can vary depending on the type of allele.
Not all mutations in mice clearly recapitulate the effects of disease-causing mutations in humans. An example of this is hyperinsulinism of infancy (HI) caused by inactivating mutations in the KATP channel subunit KCNJ11. Patients with HI exhibit life-threatening hypoglycaemia; however, mice carrying the disease-causing mutations show a more varied phenotype (Seino et al., 2000). Some knockouts show transient neonatal hypoglycaemia, whereas some dominant-negative transgenic mice go on to show overt diabetes due to loss of β-cell mass following initial hypersecretion (Miki et al., 1997; Seino et al., 2000). We recently described a homozygous Kcnj11 point mutation in the mouse that was identical to that found in a young HI patient (Hugill et al., 2010). Unlike the patients carrying this mutation, our mouse did not exhibit hypersecretion of insulin or hypoglycaemia, but we did observe impaired glucose tolerance and reduced insulin secretion (Hugill et al., 2010). By contrast, mice carrying gain-of-function mutations in Kcnj11 (i.e. channel-open mutations), which give rise to neonatal diabetes in humans, have a phenotype that closely resembles the human disease (Koster et al., 2000). This is illustrated in the study of the β-cell-specific expression of transgenic mice carrying Kcnj11 with the V59M mutation (corresponding to a well-characterised human mutation); this mutation gives rise to severe neonatal diabetes and the Kcnj11 mutant mice are providing a valuable tool for investigating the pathology and physiology of this disease (Girard et al., 2009). Furthermore, neuronal- or muscle-specific expression of this mutation shows that the muscle weakness observed in some patients with neonatal diabetes is of neurological origin. Overall, these studies show the importance of tissue-specific mouse models in revealing novel therapeutic strategies and drug-tissue specificity. 
In the case of neonatal diabetes, the treatment of muscle problems will require drugs that can pass the blood-brain barrier (Clark et al., 2010).
The morbidity and mortality associated with type 2 diabetes is the result of complications in multiple tissues, such as heart, kidney, retina and peripheral vasculature [for information on the prevalence of diabetes and complications in the UK see, for example, the International Diabetes Federation, IDF Diabetes Atlas 4th Edition (2009) (http://www.diabetesatlas.org)]. Whether the mouse is a good model for diabetic complications is an open question. These complications occur in patients over a long period of time, well beyond the lifespan of a mouse. Understandably, mouse models are usually not aged, as many experiments are done in the first few weeks or months of life. Ideally, mouse models of type 2 diabetes would be aged to a year or 18 months to monitor complications. There are a number of potential mouse models of diabetic nephropathy (Brosius, 2010); for example, susceptible mouse strains, such as FVB, db/db or diabetic DBA/2 mice, have been successfully used to identify the novel nephropathy Dbnph1 locus (Brosius, 2010; Chua et al., 2010).
Mouse models of obesity
The mouse has also been a pivotal model in obesity research. The identification, by the Friedman laboratory, of the genes underlying the diabetes mouse (db/db; LepR/LepR) and obesity mouse (ob/ob; Lep/Lep) as the leptin receptor and leptin itself, respectively, has been essential to the identification of hypothalamic pathways controlling food intake that are highly conserved across species (for a recent review, see Konner et al., 2009).
In summary, the mouse is a reasonably faithful model for type 2 diabetes and obesity, and offers great potential for investigating the underlying molecular and cellular mechanisms of disease development. Clearly, there are differences between mouse and human (although some of these differences might even help to provide insights into disease mechanisms) but, as one of the approaches available alongside work in humans and in other species and systems including in vitro experiments, the mouse is set to be a key contributor to the challenges that we face.
Taking on the functional challenge of GWAS using mouse models
Given that mouse models of type 2 diabetes and obesity are reasonable models of the human disorders, mice can be used to functionally test the genes within a GWAS locus. This can be achieved by examining the phenotypic consequences of genetic differences within candidate genes, including natural polymorphisms, knockout alleles, overexpression alleles, dominant-negative alleles and point mutations. An advantage of using, at least in the initial analysis, alleles that are globally expressed is that no assumptions are made about the tissues involved in a complex multiorgan disease such as type 2 diabetes; this is especially important for genes with poorly characterised or unknown function.
Knockout alleles have been used, for example, to assess whether the FTO gene, in which the associated obesity and/or diabetes SNPs are within intron 1, is in fact causal for these diseases. Homozygous Fto-knockout mice exhibit a lean phenotype characterised by reduced lean mass, reduced adipose tissue, reduced spontaneous locomotor activity, increased energy expenditure and relative hyperphagia, as well as increased postnatal lethality and postnatal growth retardation (Fischer et al., 2009). This suggests that the FTO gene is involved in determining adiposity, although this is complicated by the additional phenotypes of lethality and growth retardation observed when the gene is knocked out. Mice with a missense point mutation in Fto causing a loss of the protein’s demethylase function exhibited a milder phenotype of reduced fat mass without the additional complications, further supporting a role for this gene in determining BMI (Church et al., 2009). In addition, we have observed that overexpression of Fto in mice results in obesity essentially due to increased food intake (Church et al., 2010). Increased food intake, in individuals with the at-risk allele, has been observed in the majority of human FTO studies (Speakman et al., 2008; Timpson et al., 2008). Whether the human at-risk genotype affects expression levels of FTO in humans is unclear; however, there is one report that shows that, in fibroblasts and blood samples from heterozygous individuals, the expression levels of primary mRNA transcripts encoding the at-risk genotype are higher than those encoding the low-risk genotype (Berulava and Horsthemke, 2010). It would be interesting to know what the relative FTO expression levels are in, for example, neurons that are involved in food intake. The next issue that can be addressed using mouse models to investigate the function of FTO is which tissue(s) are involved in causing the phenotype. 
This can be studied using conditional alleles (knockout or overexpression floxed alleles in combination with tissue-specific Cre-recombinase lines) that allow knockout of the Fto gene only in specific tissues. An alternative to generating tissue-specific Cre mouse lines is to use adenovirus technology for targeted delivery of the recombinase to specific tissues or regions, as has been carried out for Fto in the brain (Tung et al., 2010).
Type 2 diabetes GWAS also identified the islet-restricted SLC30A8 gene (which encodes the zinc transporter ZNT8) (Sladek et al., 2007). Inactivation of SLC30A8 in the mouse supports the role of SLC30A8 as the causative gene underlying the susceptibility to type 2 diabetes conferred by the rs13266634 locus: the mice exhibited a defect in insulin secretion as well as glucose intolerance (Lemaire et al., 2009; Nicolson et al., 2009; Pound et al., 2009). Furthermore, β-cell-specific inactivation of SLC30A8 in the mouse also results in glucose intolerance (Wijesekara et al., 2010). SLC30A8 might therefore be a promising therapeutic target for the treatment of type 2 diabetes.
These models that functionally test disease-associated genes illustrate how the mouse can help to identify causative genes among GWAS data by providing functional evidence to support the involvement of a candidate gene in a disease phenotype. These models also provide tools for mechanistic studies that would be difficult to perform in humans, because mice enable access to tissues and the opportunity to carry out detailed in vivo physiological analysis.
There is an increasing availability of different mouse models in which to study disease-associated genes. The International Knockout Mouse Consortium aims to make mutations in all mouse protein-coding genes, and is systematically generating gene-targeting and gene-trap constructs in C57BL/6N embryonic stem cells (see http://www.knockoutmouse.org/about). Other programmes, such as EUMODIC and the International Mouse Phenotyping Consortium, are carrying out or proposing systematic high-throughput phenotyping of these mouse lines, which might provide evidence that disease-associated mutations identified in GWAS directly influence BMI or alter blood glucose (see http://www.eumodic.org/). In addition, phenotype data on mice are being collected in a number of databases, including EUROPHENOME (http://www.europhenome.org/) and the Mouse Phenome Database (http://phenome.jax.org/). Together, these efforts to genotype and phenotype mutant mice should provide additional data that could aid in validating candidate disease genes identified in GWAS.
Approaches to the translational challenges
Even if we succeed in discovering the underlying disease-causing genes, how can we translate this into benefits for patients? Although an animal model is useful for testing potential therapeutic strategies, it is insufficient on its own to bridge this gap. A clear understanding of the mechanisms by which a gene influences a disease, and of its function in the organism, might help to promote it as a pharmacological target to industry. Mouse models can facilitate mechanistic studies by providing tissue for experiments. They can also be used for the testing of other pathway components, such as upstream or downstream molecules and post-translational modifications, by perturbing them pharmacologically or genetically. Through clinical collaboration, the phenotype observed in mouse models can be compared with the human disease phenotype and, similarly, observations in patients can be examined and investigated further in the models.
The mouse models investigating the role of the FTO gene suggest that pharmacological inhibition of the encoded protein’s enzymatic activity might lead to a reduction in adiposity (see above), which could form the basis of an anti-obesity drug. However, the high level of postnatal lethality in the Fto knockout and the severe multiorgan and developmental problems observed in a human carrying a catalytically inactive mutation in FTO suggest that inhibition of FTO as an anti-obesity therapy might have detrimental side effects (Boissel et al., 2009). Therefore, further research is needed on the function of this gene before it can be considered a viable therapeutic target that can be translated to the clinic. Additional experiments involving genetic manipulations in the mouse might aid our understanding of the tissue specificity and function of FTO in vivo.
A future strategy using the mouse
A broad-ranging strategy is needed to exploit the success of GWAS and translate the labour-intensive data they have generated into benefit for patients. Although this might take years, such a strategy may ultimately identify molecular pathways for common human diseases, some of which might overlap, that are potential pharmacological targets. The mouse can contribute to this goal by providing a platform for the functional analysis of candidate genes. Once genes have been selected for testing, there are a number of options for allele selection, including making conditional knockout alleles, or obtaining gene-trap or targeted allele embryonic stem cells from distribution centres. Another approach that we have exploited is to search for point mutations in DNA archives of male mice mutagenised with the powerful point mutagen ENU (Coghill et al., 2002; Quwailid et al., 2004; Church et al., 2009). This is advantageous because it can reveal a series of point mutations in the coding sequence of a candidate gene that give a variety of allele types, including nulls, hypomorphs, hypermorphs and dominant-negative alleles. Investigating the effects of the different allele types could be very informative, as described above for FTO. Moreover, a single base pair change in the mouse genome induced by ENU mutagenesis is unlikely to affect the expression or function of nearby genes. Alleles can then be selected for generation as live mice that are then subjected to detailed metabolic phenotyping (weight, body composition, glucose tolerance, insulin sensitivity, food intake, metabolic rate, etc.) using assays that are well established in the mouse. If phenotypes or subphenotypes relevant to the human disease are detected, then further alleles could be generated (such as overexpression alleles and conditional alleles) to investigate tissue specificity. 
These refined models then provide a resource for further functional analysis of candidate genes and pathways, which can include the application of proteomics, expression profiling and partner-protein analysis. A schematic of this strategy is illustrated in Fig. 1. The combination of multiple disease alleles in separate mouse lines, originally identified through GWAS in human populations, should provide excellent models of human disease for functional and therapeutic analysis.
ENSEMBL; genomic sequence and annotation: http://www.ensembl.org/index.html
HapMap; linkage disequilibrium information: http://hapmap.ncbi.nlm.nih.gov/whatishapmap.html
GWAS; searchable lists of GWAS loci: http://www.genome.gov/26525384
EMPRESS; standardised mouse phenotyping protocols: http://empress.har.mrc.ac.uk/
This strategy is potentially very powerful for uncovering genes that are associated with human disease, but, if applied systematically to all GWAS candidates, it is also resource hungry. Several genes per locus must be analysed in some detail in order to confirm or reject them as causal candidates. However, the potential gains are also great, because new functionally validated targets for development by pharmaceutical and biotech industries will probably be identified. A single approach will not be enough, however, because even “The best laid schemes of mice and men, Go often askew…” (Robert Burns’ poem ‘To A Mouse, On Turning Her Up In Her Nest With The Plough’, 1785). So, success will rely on a flexible and targeted approach, using all available tools as appropriate, as well as close collaboration between basic and clinical researchers.
Research in the authors’ lab was funded by the UK Medical Research Council, Diabetes UK and the Wellcome Trust. Christopher D. Church holds an MRC doctoral training studentship. The funders had no role in the preparation of this article, and the authors declare that there are no competing financial interests.