SUMMARY
Despite the initial discomfort often experienced by visitors to high altitude, humans have occupied the Andean altiplano for more than 10000 years, and millions of people, indigenous and otherwise, currently live on these plains, high in the mountains of South America, at altitudes exceeding 3000m. While, to some extent, acclimatisation can accommodate the one-third decrease in oxygen availability, having been born and raised at altitude appears to confer a substantial advantage in high-altitude performance compared with having been born and raised at sea level. A number of characteristics have been postulated to contribute to a high-altitude Andean phenotype; however, the relative contributions of developmental adaptation (within the individual) and genetic adaptation (within the population of which the individual is part) to the acquisition of this phenotype have yet to be resolved.
A complex trait is influenced by multiple genetic and environmental factors and, in humans, it is inherently very difficult to determine what proportion of the trait is dictated by an individual’s genetic heritage and what proportion develops in response to the environment in which the person is born and raised. Looking for changes in putative adaptations in vertically migrant populations, determining the heritability of putative adaptive traits and genetic association analyses have all been used to evaluate the relative contributions of nurture and nature to the Andean phenotype. As the evidence for a genetic contribution to high-altitude adaptation in humans has been the subject of several recent reviews, this article instead focuses on the methodology that has been employed to isolate the effects of ‘nature’ from those of ‘nurture’ on the acquisition of the high-altitude phenotype in Andean natives (Quechua and Aymara). The principles and assumptions underlying the various approaches, as well as some of the inherent strengths and weaknesses of each, are briefly discussed.
The Andes: the mountains and the people
The Andean altiplano has long been a focus of research into high-altitude adaptation for a number of reasons including relative accessibility, the large numbers of altitude-adapted species that thrive there, the presence of large indigenous human populations and the early involvement in the field by such eminent Latin American biologists as Carlos Monge M., his son Carlos Monge C. and Alberto Hurtado. While high-altitude research in the Andes may have begun with studies such as Francois-Gilbert Viault’s study of the haematopoietic response to hypoxia in the late 1800s, the significant contributions made by Peruvian researchers may be traced back to the indignant response of Carlos Monge M. to the infamous (and often misconstrued) statements made by Joseph Barcroft in which ‘dwellers at high altitude’ were described as having ‘impaired physical and mental powers’. This story, as well as many others, can be found in High Life, John West’s (West, 1998) history of high-altitude research.
The altiplano lies in the central regions of the Andes Mountains and extends from central Peru into Bolivia. This extensive plateau, which ranges between 3000 and 4500m, is the site of numerous human habitations, ranging from small farming villages in central Peru to the towns around Lake Titicaca (3800m) and to the cities of Cusco (3400m), Peru, and La Paz (3800m), the capital of Bolivia (Fig. 1). The Andean high-altitude native population consists primarily of two linguistically defined ethnic populations: the Quechua and the Aymara. As of 1990, there were approximately 6.2 million Quechua speakers living in the highlands of Ecuador, Peru, Bolivia and Argentina and approximately 1.6 million Aymara living in the regions around Lake Titicaca and La Paz (Caviedes and Knapp, 1995). The two populations share similar environments and lifestyles, and the genetic distance separating them, as estimated by pair-wise comparisons of multiple polymorphic genetic loci, is relatively small. Salzano and Callegari-Jacques (Salzano and Callegari-Jacques, 1988) compared 21 South American Indian groups using 13 variable loci encoding red blood cell antigens and found that the genetic distance between the Aymara and the Quechua was less than that between these groups and any other group in the analysis.
Quechua was the language of the Inca empire and, although the Inca were aware of the problems that could arise when lowlanders were transplanted to the higher reaches of the Empire (West, 1998), redistribution of the vanquished throughout the imperial realm was the practice of the conquerors. As a result, the current Quechua-speaking population may have a heterogeneous ancestry that presumably includes some lowland natives. Interbreeding with Europeans will also have contributed to heterogeneity, although a combination of geographic and cultural factors may have limited the influx of Caucasian genes into the native Andean gene pool. Salzano and Callegari-Jacques (Salzano and Callegari-Jacques, 1988) estimate that the average Caucasian admixture in contemporary Quechua is approximately 25%, whereas that in the Aymara is approximately 8%. These values will presumably vary depending on the proximity of the two groups (indigenous and Caucasian) and the amount of time that they have been in contact.
There is an extensive body of literature describing the morphology and physiology of the high-altitude native populations in South America and, although no consensus phenotype has emerged, a number of traits have been postulated to be characteristic of these people (Table 1). Not all these characteristics are necessarily adaptive, and it is likely that there is substantial interdependence between some of them (e.g. the increase in lung capacity may be associated with the increase in chest size and in turn contribute to the increase in pulmonary diffusion capacity). Nevertheless, there is little doubt that Andean highlanders are better adapted to the hypoxic conditions extant at between 3200m and 4000m than are sea-level populations.
Both environment and genetic heritage determine an individual’s phenotype. Delineating the relative influences of these two forces on the development of high-altitude adaptations has been the subject of a number of studies (Moore et al., 1994; Ramirez et al., 1999; Rupert and Hochachka, 2001). This review discusses some of the assumptions and methodologies that underlie these studies.
Has there been time enough for evolution to have occurred in the Andeans?
There is some debate over when the Andean altiplano was first colonised by humans. MacNeish (MacNeish, 1971) maintains that the archaeological evidence supports occupation as far back as 22000 years ago; however, other researchers are sceptical of both the dating and the artefacts on which these claims are based (for a critical evaluation of the evidence for early occupancy of South America, see Lynch (Lynch, 1990)). On the basis of less contentious evidence, 12000 years ago seems to be a reasonable estimate of the earliest substantial human activity on the altiplano, although whether these people were the direct antecedents of the current indigenous populations is unknown. This duration is an important parameter in considering the role of evolution in these populations as it establishes the time frame over which evolutionary changes would have had to occur. While 12000 years (approximately 600 generations) is not a long period by human evolutionary standards, it is sufficient time for selection to alter significantly the frequency of gene variants in a population, even if the selective advantage conferred by the allele is slight (Fig. 2).
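The claim that a slight advantage can shift allele frequencies substantially within roughly 600 generations can be illustrated with a minimal sketch. The model below is an assumption made for illustration only (deterministic genic selection in an effectively infinite population, with no drift, migration or dominance), and the starting frequency and selection coefficient are hypothetical values, not those underlying the figure cited above.

```python
def allele_frequency_trajectory(p0, s, generations):
    """Frequency of a favoured allele under deterministic genic selection.

    Assumed model (illustrative only): the favoured allele has relative
    fitness 1 + s, the alternative allele has fitness 1, and the
    population is large enough that genetic drift can be ignored.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Standard one-generation update: weight p by its relative fitness
        p = p * (1 + s) / (1 + s * p)
        freqs.append(p)
    return freqs


# Hypothetical values: a variant starting at 5% with a 1% advantage
traj = allele_frequency_trajectory(p0=0.05, s=0.01, generations=600)
print(f"start: {traj[0]:.3f}, after 600 generations: {traj[-1]:.3f}")
```

Under these assumptions the variant moves from rare to predominant over the period in question; with a smaller population, drift would add scatter around this deterministic path.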
It is unlikely that genetic adaptation has occurred in Andean populations as a result of the generation and promulgation of new alleles over the last 12000 years. The mutation frequency in humans is approximately 10⁻⁶ per meiosis per gene, and the probability of a beneficial variant arising is much lower. Furthermore, unless the interbreeding population was quite small, any new allele would have to confer a considerable advantage to avoid being eliminated by genetic drift (the stochastic variation of allele frequencies within a population) within the first few generations and, as there is no evidence for a unique and extremely adapted phenotype in human high-altitude populations, this scenario seems unlikely. However, the appearance of new alleles is not a prerequisite for adaptation. There is substantial genetic variability in humans. Extensive sequencing of the human genome indicates that between two people, on average, there is a single nucleotide polymorphism every thousand bases, or approximately 3 million per genome (Bentley, 2000). By convention, a polymorphic locus has at least two variants that are present in more than 1% of the population (Sunyaev et al., 2000). While most variants are silent and do not affect the coding or regulatory sequences of genes, many have associated phenotypes and thus contribute to human phenotypic variability.
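Why a newly arisen allele needs a considerable advantage to escape early loss can be sketched with a simple branching-process calculation. The model is an assumption introduced here for illustration (a single new copy in a large population, with Poisson-distributed offspring number of mean 1 + s); the function name and the selection coefficients are hypothetical.

```python
import math


def escape_probability(s, tol=1e-12, max_iter=1_000_000):
    """Chance that one new copy of an allele avoids early stochastic loss.

    Assumed model (illustrative only): a branching process in which each
    copy leaves a Poisson-distributed number of copies with mean 1 + s.
    """
    if s <= 0:
        return 0.0  # in this model a neutral or deleterious copy is always lost
    q = 0.0  # extinction probability; iterate q = exp((1 + s)(q - 1)) from 0
    for _ in range(max_iter):
        q_next = math.exp((1 + s) * (q - 1))
        if abs(q_next - q) < tol:
            return 1.0 - q_next
        q = q_next
    return 1.0 - q


for s in (0.001, 0.01, 0.1):
    print(f"s = {s}: escape probability = {escape_probability(s):.4f}")
```

In this sketch the escape probability is close to 2s for small s, so even a beneficial new variant is usually lost, which is consistent with the argument that adaptation is more likely to have drawn on pre-existing variation.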
Migration to a new environment may expose a population to new selective pressures and, as a consequence, favour the transmission of pre-existing variants that confer an advantage in the new conditions. Selective transmission would increase the frequency of these alleles and thereby increase the overall fitness of the population. Amplification of pre-existing variants could have contributed to the evolution of Andean populations, especially if there had been substantial genetic variation in the founder populations from which to draw such variants. DNA heterozygosity (a measure of genetic variation) seen in extant Native American populations may be evidence for ancestral diversity. Of the genetic loci examined in these populations, 90% were polymorphic (Kidd et al., 1991). Furthermore, mitochondrial DNA studies of current South American aboriginals (Monsalve et al., 1994) suggest that there was no significant bottleneck during the colonisation of South America, indicating that the diversity in the North could have been retained during the migration to the South. There may have been, however, an important bottleneck in historic times. After (and probably as a result of) the arrival of Europeans in the Americas, the native population declined precipitously. Although population estimates for pre-Columbian New World populations vary considerably (Ubelaker, 1976), some estimates put the decline in Andean populations as high as 93% (from 9×10⁶ to 6×10⁵) between the years 1520 and 1620 (Cook, 1981). The extent to which the gene pool of the survivors differed from that of the pre-colonial populations is unknown, although it would certainly have been altered by the influx of European genes. Furthermore, the social upheaval, displacement and diseases that accompanied the arrival of the Europeans would have subjected the native population to new selective pressures during the critical period in which their populations were being re-established.
Both the archaeological evidence and studies of current aboriginal populations suggest that there was sufficient time and enough pre-existing genetic variation for evolutionary changes to have occurred in the ancestors of the current Andean people. However, for such change to have occurred, there must have been advantageous phenotypes that, at least to some extent, were genetically determined.
Heritability studies
Heritability is the proportion of phenotypic variation that is genetically determined; the remaining variation is due to environmental factors. Phenotypic variation is the raw material upon which selection acts but, for evolutionary change to occur, there must be some genetic factors contributing to the selected phenotype. A strictly developmental trait, however valuable, needs to be re-acquired every generation.
A common method of estimating the heritability of a trait is to compare the resemblance between relatives. This estimate will vary depending on the relationship chosen and, because there is a greater environmental covariance in sibling pairs than in parent/offspring pairs, the latter comparison is generally more sensitive to genetic differences. Values tend to be higher between mothers and offspring than between fathers and offspring as a result of both maternal effects and non-paternity, and a mid-parent mean is often used in an attempt to average out these effects (Vandemark, 1992).
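As a rough illustration of the relative-resemblance approach, the sketch below regresses offspring values on mid-parent means; under standard assumptions (no shared-environment effects, random mating) the slope of that regression is commonly taken as an estimate of heritability. The function name and all measurements are hypothetical and are not data from any study cited here.

```python
def midparent_offspring_slope(fathers, mothers, offspring):
    """Least-squares slope of offspring value on mid-parent mean.

    Under standard assumptions (illustrative only) this slope estimates
    the heritability of the trait.
    """
    midparents = [(f + m) / 2 for f, m in zip(fathers, mothers)]
    n = len(midparents)
    mean_x = sum(midparents) / n
    mean_y = sum(offspring) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(midparents, offspring))
    var = sum((x - mean_x) ** 2 for x in midparents)
    return cov / var


# Hypothetical chest-circumference measurements (cm) for five families
fathers = [88.0, 92.0, 95.0, 90.0, 97.0]
mothers = [84.0, 86.0, 91.0, 88.0, 89.0]
children = [86.5, 88.0, 92.0, 89.5, 91.0]
print(f"estimated heritability: {midparent_offspring_slope(fathers, mothers, children):.2f}")
```

Using the mid-parent mean rather than a single parent is the averaging step described above; a real analysis would also need to adjust for age, sex and shared environment.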
Heritability estimates can be divided into two general categories depending on the sources of variation. Heritability in the broad sense (H²) is an estimate of the proportion of the phenotypic variability that can be attributed to total genetic variability and is defined as:
\[ H^2 = \frac{V_G}{V_P} \]
where V_G is the total genetic variance and V_P is the total phenotypic variance.
Broad-sense heritability includes all sources of genetic variance such as the additive effects of the genes, dominance effects at loci and epistatic effects between genes. As the latter two are genotypic interactions, and therefore not inherited, a more restrictive parameter, heritability in the narrow sense (h²), gives a better estimate of the genetic variability transmitted between parents and offspring:
\[ h^2 = \frac{V_A}{V_P} \]
where V_A is the additive genetic variance and V_P is the total phenotypic variance.
The magnitude of h² is a major determinant of the potential of a trait to respond to directional selection and of the rate at which it will do so (Hedrick, 2000). The greater the heritability, the greater the scope for selection to act. By favouring the inter-generational transmission of the genetic variants that contribute to a beneficial phenotype, selection will increase the frequency of those variants in the population at the expense of the less beneficial variants. In the end, this may result in a loss of genetic variation in the trait as selection eliminates all but the most beneficial variant. Indeed, many traits that are associated with reproductive success (the ultimate arbiter of evolutionary fitness) have low heritability (Hartl, 2000) (see Fig. 3), and a number of studies in wild populations have shown that traits associated with increased fitness have low variability (Mousseau and Roff, 1987).
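The dependence of the response to selection on narrow-sense heritability is often summarised by the breeder's equation, R = h²S, where S is the selection differential (the difference between the mean of the selected parents and the population mean) and R is the expected shift in the offspring mean. The equation is not stated in the text above and is introduced here as a standard textbook relation; the numerical values are hypothetical.

```python
def response_to_selection(h2, selection_differential):
    """Expected one-generation shift in the trait mean (breeder's equation).

    R = h2 * S, where h2 is narrow-sense heritability and S is the
    selection differential. Illustrative only; assumes h2 stays constant.
    """
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return h2 * selection_differential


# Hypothetical case: selected parents exceed the population mean by 2 cm
for h2 in (0.0, 0.25, 0.5):
    shift = response_to_selection(h2, 2.0)
    print(f"h2 = {h2}: expected shift in offspring mean = {shift} cm")
```

With zero heritability the trait does not respond at all, however strong the selection, which is the sense in which a strictly developmental trait must be re-acquired every generation.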
The heritability of any given trait is highly environmental- and population-specific (Hedrick, 2000). The proportion of variance that is determined by the environment is highly influenced by environmental conditions, whereas the genetic variance, which is a characteristic of a population, will depend on both selection and genetic history (i.e. genetic drift, gene flow, mutation rate) and can vary considerably among species and populations.
Heritability studies in the Andes
In the mid-1800s, Denis Jourdanet, an early researcher into high-altitude adaptation, described the high-altitude native as having a ‘vast chest [that] makes him comfortable in the midst of this thin air’ (Houston, 1987). This is one of the earliest descriptions of what may be the most commonly cited characteristic of New World high-altitude natives: the relatively large ‘barrel chest’. Alberto Hurtado (Hurtado, 1932) commented on this characteristic in Andean populations and postulated that the enlarged chest could allow for increased lung volumes and thereby increase oxygen uptake. Whether this chest morphology is a genetic characteristic has been the subject of numerous studies.
In a series of studies of Aymara-speaking natives from Camacani, Peru (3900m) (Eckhardt and Melton, 1992), a number of anthropometric measurements were made (including nine determinants of thoracic morphology), and the percentage heritability of each trait was estimated (Table 2). A number of thoracic traits showed significant heritabilities, suggesting that there was a genetic component contributing to their variation. If there was a similar genetic influence on thoracic dimensions in this population’s antecedents (and assuming that a larger chest conferred some advantage at altitude), then conditions were in place for selection to favour the acquisition of this trait. However, whether the current Andean ‘barrel chest’ represents a genetically determined high-altitude adaptation cannot be determined from these data. Significant heritability can mean that selection has had little effect on these parameters. As Melton (Melton, 1992) points out, if genetic variation was lost as selection shifted the population towards larger chests, then the traits that do not show significant heritability may have been most influenced by selection.
In addition to morphological characteristics, heritability estimates have been made for the ventilatory response to hypoxia in the Aymara. The hypoxic ventilatory response (HVR) is one of the initial responses to altitude. The drop in arterial oxygen pressure resulting from reduced oxygen availability rapidly stimulates a compensatory increase in respiration rate. There is evidence from twin studies for a genetic component in the control of HVR (Collins et al., 1978). A recent review (Moore, 2000) discusses the genetics and development of this trait in humans. Prolonged exposure to hypoxia appears to blunt the HVR in some high-altitude natives (Chiodi, 1957) such that their response, when challenged by further reductions in oxygen level, is reduced. As the resultant hypoventilation can lead to pathologically high haematocrits, this blunted response is believed to be maladaptive. Blunting of the HVR is more pronounced in Andeans than in Tibetans, and some researchers have postulated that this represents evidence for superior hypoxia-adaptation in the latter (Moore et al., 1992; Beall et al., 1997). Furthermore, reduced HVR has also been proposed to contribute to the greater susceptibility to chronic mountain sickness (Monge’s disease) in Andean highlanders. In a comparative study of high-altitude natives from the Himalayas (Tibetan) and the Andes (Aymara), Beall et al. (Beall et al., 1997) reported that the resting ventilatory rate was 50% higher in the Tibetans and that the heritable contribution to variance in both resting ventilatory rate and HVR was also significantly greater in the Asians (31% versus 21% and 35% versus 0% respectively). The authors concluded that, as the genetic component contributing to these characteristics is greater in the Asians than in the Andeans, the potential for evolution to act on these characteristics is also greater in the Tibetans. 
However, although the extant phenotypes appear to be consistent with superior adaptation in the Himalayans, the lower heritabilities in the Andeans may reflect greater selection, assuming similar variation in both ancestral lineages.
As mentioned above, comparison of heritabilities between populations is problematic. While the Andeans and the Himalayans have faced similar hypoxic stresses during their occupation of the highlands, the other selective pressures acting on the populations and the genetic backgrounds of their respective founder populations could have been quite different. The Andeans may have had somewhat more time to adapt than the Himalayans. There is substantial evidence supporting occupation of the Andes extending back at least 12000 years, but archaeological evidence suggests that the Himalayan plateau has only been occupied for approximately 5000 years (Cavalli-Sforza et al., 1994), and recent genetic analysis suggests that the ancestors of the current Himalayan Sino-Tibetan population were living in the upper-middle Yellow River basin (1000–2000m) approximately 10000 years ago (Su et al., 2000).
Heritability estimates are a valuable indicator of the potential for a trait to be subject to evolutionary change and are frequently used by animal and plant breeders to predict the efficacy of a selective breeding program (Hedrick, 2000). However, because the absence of heritability could mean either that selection has eliminated genetic variance or that there was no genetic variance to begin with, heritability estimates are of less value in determining whether evolution has already occurred. Resolving these two possibilities is difficult. Heritability is highly population- and environment-specific (Hartl, 2000), and caution must be taken when extrapolating from current populations to ancestral ones or to other current populations.
Migration studies
Perhaps the best way to separate the effects of genetic background from those of environment is to take advantage of natural population movements and compare genetically related people raised in different environments, or conversely, genetically distinct people raised in similar environments. Traits that are predominantly genetic in origin will vary less between related populations raised in different conditions than between unrelated populations, whereas traits that are predominantly acquired in response to environmental conditions will vary less between populations exposed to common conditions than between related populations raised in different environments. The ideal migration study uses offspring of first-generation migrants, who were conceived, born and raised in the new environment. These children were exposed from birth to the new conditions, so any developmental adaptations should be manifest but the probability of confounding genetic admixture resulting from interbreeding with local natives is eliminated.
Several researchers have taken advantage of population movements up to, and down from, the Andean altiplano to look for genetic contributions to the acquisition of a number of putative adaptive traits. Melton (Melton, 1992) compared physical development in children (Aymara and Quechua) born and raised in Puno, Peru (3900m), or in Tacna, Peru (800m), to parents who were recent migrants from Puno. A number of anthropometric measurements were made including 12 that reflected thoracic growth and development. Despite having been raised at a relatively low altitude, the children in Tacna still had the lengthened sternum characteristic of high-altitude populations and chests as large as those of the children raised at 3900m. The author concludes that this demonstrates ‘a genetic component in the growth patterns of these Andean people’. Both Hoff (Hoff, 1972) and Beall et al. (Beall et al., 1977) compared chest morphology of Quechua living at over 4000m with that of Quechua born, or raised from an early age, at lower altitudes. The results of both studies support the hypothesis that pronounced thoracic development is an intrinsic characteristic of the Quechua that will manifest regardless of the altitude at which the person is born and raised. Conversely, Frisancho et al. (Frisancho et al., 1975) found that Quechua native to 4150m had larger chest circumferences for a given height than Quechua born at a lower altitude (980m) and concluded that ‘the greater maximum chest circumference of the highland subjects is probably of environmental origin’. Whether the lowland Quechua still had larger chests than non-indigenous populations was not addressed. Overall, the lowland children were shorter than the highland children, whereas the opposite was seen in adults. The authors suggest that recent debilitating socio-economic trends that had affected the development of the lowland children could account for this observation. 
This study (Frisancho et al., 1975) illustrates the difficulty in isolating single environmental factors when studying migrant populations.
When interpreting migration studies that show that a translocated population is more like the population in their new home than that in their original environment, differential migration must be considered. Less adapted individuals may be more inclined to leave and therefore be disproportionately represented in the out-bound migrant population, although the magnitude of this effect may be mitigated by the co-migration of associates and family members who themselves were fully adapted. In the case of hypoxia adaptation, this would be more likely to result in downward migration because human populations are presumably well adapted to sea-level oxygen levels. Differential migration in response to other stresses (including those of socio-economic origin) should also be considered.
Another issue confounding comparisons between migrating populations is the possibility that the putative genetic predisposition to a trait manifests itself fully only in response to conditions extant in the original location and will therefore be less pronounced in the migrants regardless of genotype. Brutsaert et al. (Brutsaert et al., 1999) reported that the genetic potential for larger lung volumes at high altitude (3800m) depends upon developmental exposure to hypoxia in both Aymara and Quechua.
Co-segregation of polygenic traits
Polygenic traits are influenced by multiple genes and, in most cases, the variation contributed by each allele is small compared with the overall variance of the phenotype (Hartl, 2000). A polygenic trait tends to manifest itself in a population as a range of phenotypes rather than a small number of discrete phenotypes. Skin colour is an example of a polygenic trait with a number of major genetic determinants (Strum et al., 1998) including the highly polymorphic human melanocortin 1 receptor locus (Rana et al., 1999). Within a population, there is usually a range of skin colours that may, or may not, overlap with other populations. Skin colour has high heritability and, although significantly influenced by environmental factors such as tanning and vascularity, can be used as an indirect assessment of admixture in contrasting populations (Williams-Blangero and Blangero, 1992).
Skin reflectance is one method of quantifying skin colour, and Greksa et al. (Greksa et al., 1991) demonstrated that skin reflectances could be used to estimate the extent of European admixture in a population of Aymara living at 3600m in La Paz. In essence, since the Aymara were darker-skinned than the Europeans, in a mixed population, individuals with lighter skins have a greater percentage of European genes than those with darker skins. This observation was consistent with the distribution of European surnames in the population, a previously confirmed indicator of admixture between the two populations. In a subsequent study, the correlation between skin reflectance and several measures of lung function was determined in the same Aymara population (Greksa, 1996). After removing the effects of tanning and vascularity, there was a significant correlation between several measures of lung function and skin reflectance and, in all such cases, values were greater in the darker-skinned Aymara (Fig. 4) than in those with lighter skin. The author concludes that ‘darker skin colour, which reflects an increase in the genes of Aymara origin, is associated with progressive increases in TLC [total lung capacity] and its components’ and that this is consistent with there being ‘an important genetic component to the enhanced lung volume of Andean highlanders’. However, the author also notes that previous studies in the same population had revealed a developmental component to the acquisition of enhanced lung volumes (Greksa et al., 1994), thereby supporting roles for both genetic and environmental factors in the development of this putative adaptive characteristic.
Association analysis using polymorphisms in candidate genes
If evolution is not relying on the generation and promulgation of new genetic variants but rather on the more rapid process of preferential transmission of pre-existing variants, the resultant changes in allele frequencies may be detectable by comparing allele frequencies between putative genetically adapted populations and those that have not been exposed to the same environmental conditions. Over-representation of alleles in the former population may reflect adaptation. This is a derivation of association analysis, which is commonly used in the search for genes involved in the aetiology of complex diseases (e.g. diabetes) (Cox and Bell, 1989). In this instance, however, the association being searched for is not between an allele and a disease phenotype, but rather between an allele and an adaptive phenotype. As there are approximately 3×10⁹ bases in the human genome and, by some estimates, a variant on average every 1000 bases, there can be a substantial amount of variation between populations simply from chance. Testing large numbers of randomly selected polymorphisms greatly increases the probability of observing chance associations, particularly prior to correcting for multiple tests, and the significance thresholds that remain after correction are often so low (Xu et al., 1998) that achieving adequate statistical power requires unrealistically large sample sizes.
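A comparison of allele frequencies between two populations can be sketched as a test on a 2×2 table of allele counts. The example below uses a Pearson chi-square test with one degree of freedom, chosen here for illustration rather than taken from any study cited; the allele counts are hypothetical.

```python
import math


def allele_count_chi_square(pop1_a, pop1_b, pop2_a, pop2_b):
    """Pearson chi-square (1 d.f.) for allele counts in two populations.

    Arguments are counts of allele A and allele B in each population.
    Returns the statistic and its P value. Illustrative only: assumes
    independent chromosomes and no correction for multiple tests.
    """
    n = pop1_a + pop1_b + pop2_a + pop2_b
    margins = (pop1_a + pop1_b) * (pop2_a + pop2_b) * (pop1_a + pop2_a) * (pop1_b + pop2_b)
    if margins == 0:
        raise ValueError("every row and column must contain at least one allele")
    chi2 = n * (pop1_a * pop2_b - pop1_b * pop2_a) ** 2 / margins
    # For 1 d.f., the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 100 chromosomes sampled from each population
chi2, p = allele_count_chi_square(60, 40, 40, 60)
print(f"chi-square = {chi2:.2f}, P = {p:.4f}")
```

A difference in frequency detected this way shows only that the populations differ at the locus; it does not by itself distinguish selection from drift or admixture.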
By choosing candidate genes that, by definition, have a prior probability of being involved in the phenotype of interest (Cox and Bell, 1989), fewer tests are necessary and the significance threshold that remains after correction for multiple tests is less stringent. This reduces the chance of spurious associations while increasing the chance of detecting a valid association (to the extent that the ‘candidacy’ is legitimate). Typically, candidate genes in high-altitude adaptation would be those whose products are involved in the uptake, transport and utilisation of oxygen, although genes with less direct impact could also be considered.
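The effect of the number of tests on the significance threshold can be made concrete with a short calculation. The sketch assumes a Bonferroni correction, which is one common method and is not necessarily the one used in the studies cited; the figure of 3 million variants follows from the estimate of one variant per 1000 bases given above, whereas the three candidate loci and the nominal level of 0.05 are illustrative choices.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction.

    Assumed correction method (illustrative only): divide the desired
    overall error rate alpha by the number of tests performed.
    """
    if n_tests < 1:
        raise ValueError("at least one test is required")
    return alpha / n_tests


# Roughly 3 million variants genome-wide versus 3 hypothetical candidate loci
genome_wide = bonferroni_threshold(0.05, 3_000_000)
candidate = bonferroni_threshold(0.05, 3)
print(f"genome-wide threshold:    {genome_wide:.2e}")
print(f"candidate-gene threshold: {candidate:.2e}")
```

The candidate-gene threshold in this sketch is a million times less stringent, which is why far smaller samples can suffice when the choice of candidates is well founded.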
While analysis of functional variants may actually shed light on the physiology involved, silent variants can be informative as well. If alleles at phenotypically silent polymorphic loci are in linkage disequilibrium with the causative allele, then they can be used as a surrogate genetic marker to follow the nearby variant. This is particularly useful if there is an easy assay for the marker genotype, such as an altered restriction enzyme recognition site. Linkage disequilibrium refers to the non-random association of alleles at linked loci during meiosis, and the resultant co-segregating allele combinations that appear in populations are known as haplotypes (Fig. 5). Unlike linkage, which is immutable (unless the genes physically move), haplotypic associations between alleles will decay over time as a result of genetic recombination. The rate at which this occurs is dependent on the genetic distance separating the linked loci (which may, or may not, reflect the physical distance separating them on the chromosome). The value of linkage disequilibrium in genetic analysis is substantial because the presence of one allele indicates the presence of the other(s), and therefore any allele in the haplotype can be used as a marker for any other allele. This allows detection of potentially significant associations even if potentially causal variants have not yet been identified, as well as the use of markers in more variable regions of the genome (e.g. introns). The region over which linkage disequilibrium spreads is estimated to average 3000 bases (Keavney, 2000). As recombination leads to equilibrium, the extent of linkage disequilibrium around a given site will depend on the frequency of recombinant events which, in turn, is a function of time (generations), genome location and population size. Maximum disequilibrium will be found in small populations that are recent descendants of a small founder population.
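The decay of haplotypic association through recombination can be sketched with the standard deterministic relation D_t = D_0(1 − r)^t, where D is the disequilibrium coefficient, r is the recombination fraction between the two loci per generation and t is the number of generations. This relation is introduced here as a textbook approximation (large, randomly mating population; no selection, drift or admixture), and the recombination fractions are hypothetical.

```python
def disequilibrium_after(d0, r, generations):
    """Linkage disequilibrium remaining after a number of generations.

    Uses D_t = D_0 * (1 - r) ** t, a deterministic approximation
    (illustrative only) that ignores drift, selection and admixture.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return d0 * (1 - r) ** generations


# Fraction of the initial association left after roughly 600 generations
for r in (0.0001, 0.001, 0.01):
    remaining = disequilibrium_after(1.0, r, 600)
    print(f"r = {r}: fraction of initial disequilibrium remaining = {remaining:.3f}")
```

Under these assumptions, tightly linked markers retain most of their association over the time scale discussed for the Andes, whereas more loosely linked markers lose almost all of it.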
Haplotypic associations can vary from population to population (Kidd et al., 2000). When a population divides, there is a chance that the distribution of haplotypes will differ in the two sub-groups. This is especially likely when a small group splits off in which one haplotype is highly over-represented. The clock is reset to zero, and the decay of the association between alleles in the haplotype has to begin again in this population (Keavney, 2000). Other issues to be considered in the analysis of genetic associations are (i) the possibility that the phenotypically significant unit is the haplotype and not the individual alleles (Fernandez-Vina et al., 1993) and (ii) the fact that haplotypes are not necessarily reciprocal (i.e. while allele A always assorts with allele B, allele B does not always assort with allele A). Association depends on the history of the polymorphism and, as shown in Fig. 5, will reflect the original haplotype.
Rupert et al. (Rupert et al., 1999a) used the candidate gene association analysis approach to determine whether selection against factors that contribute to blood viscosity could have played a role in altitude adaptation in the Quechua. Some degree of polycythaemia is common in Andean populations, and the increase in blood viscosity that accompanies high haematocrits can have pathological consequences (Dintenfass, 1981). Fibrinogen concentration is the primary determinant of plasma viscosity (Lowe et al., 1993), so Rupert et al. (Rupert et al., 1999a) postulated that reduced levels of fibrinogen might offset the increased blood viscosity resulting from elevated haematocrits and that, in high-altitude native populations, this might be reflected in lower frequencies of alleles associated with higher levels of fibrinogen. The results of that study showed that the alleles associated with lower fibrinogen levels at three polymorphic loci in the β-fibrinogen gene were more prevalent in the Quechua than in lowland populations, and the authors concluded that this observation is consistent with these alleles having been selected for in this population. This study serves to illustrate both the strengths and limitations of the candidate gene approach.
Genomic DNA is easy to prepare and store and, with the advent of polymerase chain reaction (PCR) amplification, a single, small blood sample can provide enough DNA for numerous experiments. In addition, an individual’s genotype is unaffected by age, physical condition or lifestyle; thus, comparing genotypes avoids many of the confounding variables that plague phenotypic analyses. In the fibrinogen example, the potential phenotype (blood viscosity) is highly influenced by such diverse factors as gender, birth-control method, smoking history, occupation and physical fitness. However, the drawback to working exclusively with genotypes is that selection acts on the phenotype and not on the genotype. If the association (causal or otherwise) on which the analysis is based does not hold in the studied population, then an over-representation of alleles would not confer any selectable advantage to the population and might instead reflect stochastic fluctuations in allele frequency (i.e. genetic drift).
Another important issue to be addressed in designing association analyses is the choice of controls. Allele frequencies may vary between two populations by chance alone. In the example cited above, the under-represented β-fibrinogen alleles may have been lost in the Quechua, or their ancestors, by chance, and this loss may coincidentally be consistent with the loss predicted from adaptation. The more closely related the control population is to the population under study, the less likely such random deviations in allele frequency are. Rupert et al. (Rupert et al., 1999a) determined β-fibrinogen allele frequencies in the Na-Dene, a Native American population that is not considered to be closely related to the Quechua, as a lowland population control. While this provided some basis for comparison, the ideal control population would be the last group to separate from the Quechua before the latter migrated into the mountains. In that case, differences between the two groups will have arisen subsequent to the change in environment and are therefore more likely to be a consequence of adaptation to the new conditions (especially if frequencies for other markers for which there is no apparent phenotype are similar in the two populations). Alternatively, as postulated by Carlos Monge, the founding population may, by chance, have had a high frequency of beneficial variants prior to its arrival in the Andes, and this genetic composition may have facilitated occupation of the mountains. In this scenario, the difference would not be recognised in comparisons with recently separated lowland populations despite being of adaptive value.
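The basic comparison underlying a case-control design of this kind can be sketched as a test on a 2×2 table of allele counts. The example below is a minimal illustration using Pearson's chi-square statistic with one degree of freedom and no continuity correction; it is not the analysis used in the studies cited, and the function name and all counts are hypothetical. Note that such a test only addresses sampling error in the two samples. A significant difference between a highland and a lowland sample does not, by itself, distinguish selection from drift or from a poorly matched control population.

```python
import math


def allele_frequency_test(a_pop1, other_pop1, a_pop2, other_pop2):
    """Pearson chi-square test (1 d.f.) on a 2x2 table of allele counts.

    Each argument is a count of chromosomes. Assumes every row and column
    total is non-zero. Returns (chi_square, p_value).
    """
    observed = [[a_pop1, other_pop1], [a_pop2, other_pop2]]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 d.f., the upper-tail probability of chi-square equals
    # erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 100 chromosomes sampled from each population
chi2, p = allele_frequency_test(60, 40, 40, 60)
```

With small samples or rare alleles, an exact test would be a more appropriate choice than the chi-square approximation used in this sketch.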
Other issues that should be considered when designing association studies are population admixture, in which allele frequencies are altered by the influx of ‘foreign’ DNA, and population stratification. In the latter situation, a segment of the population in which both an allele and a trait are common, although not linked, skews the analysis and gives a spurious association. This is less of a concern when the trait defines the population itself, as is the case when considering adaptive phenotypes.
Another gene whose role in altitude adaptation has been examined using association studies encodes the angiotensin-converting enzyme (ACE). One variant in this gene, detected by the presence of an intronic Alu repeat (the ‘insertion’ or ‘I’ allele), was reported to be over-represented in elite European climbers (Montgomery et al., 1997), and the authors postulated that ACE represented a gene for ‘human performance’. The alternative allele (the ‘deletion’ or ‘D’ allele) has been associated with hypertension and cardiovascular disease in a number of studies (Schunkert, 1997). Rupert et al. (Rupert et al., 1999b) found that, although the I allele was more prevalent in the Quechua than in Caucasians, the allele was equally common in other Native American populations and, in fact, was less common in the highland natives than in other indigenous South American populations. Neither of these studies (nor many others) took into consideration that phenotypes may depend on the interaction between alleles at more than one locus and that synergistic interactions between the intronic ACE polymorphism (or the functional variant to which it is linked) and alleles at other genes in the renin–angiotensin pathway have been reported. The association between myocardial infarction and homozygosity for the ACE D allele was dependent on the presence of the C allele at the angiotensin 2 type 1 receptor (AT2R1) C/A1166 polymorphism (Tiret et al., 1994), and left ventricular mass index in male cardiovascular disease patients was predicted by the combination of the D/D genotype at the ACE locus and the C/C genotype at the angiotensinogen (AGT) T/C704 locus, but not by individual genotypes (Kim et al., 2000). If specific alleles of the AGT and AT2R1 genes are required to potentiate the effect of the ACE alleles, then frequency data for all loci should be considered. Selection may act to favour one allele only in the presence of the others.
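The reason frequency data for all interacting loci matter can be seen from a simple calculation. The sketch below estimates how common a multi-locus genotype combination would be, assuming Hardy-Weinberg proportions at each locus and independent assortment between loci. Both assumptions, the function names and the allele frequencies are hypothetical and serve only to illustrate the arithmetic.

```python
def homozygote_freq(allele_freq):
    """Expected homozygote frequency under Hardy-Weinberg proportions."""
    return allele_freq ** 2


def joint_homozygote_freq(allele_freqs):
    """Expected frequency of individuals homozygous at every listed locus.

    Assumes Hardy-Weinberg proportions and independence between loci.
    """
    joint = 1.0
    for p in allele_freqs:
        joint *= homozygote_freq(p)
    return joint


# Hypothetical frequencies for one allele at each of two unlinked loci
single_locus = homozygote_freq(0.5)             # about one individual in four
two_locus = joint_homozygote_freq([0.5, 0.4])   # about one individual in 25
```

Under these assumed values, an allele carried in homozygous form by a quarter of the population is paired with the required genotype at the second locus in only a small minority of individuals, so a survey of the first locus alone would greatly overstate the fraction of the population on which selection could act.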
Although differential representation of alleles in a population can mean that one of the alleles is subject to selection, an equally likely explanation is random genetic drift. One way to determine whether the over-representation is due to selective pressure favouring one allele is to establish that there is an associated phenotype that is both present, and of adaptive significance, in the populations under consideration. As mentioned above, this can be a challenge, especially if there are environmental factors involved as well. A less difficult alternative is a comparative approach. Similar allele over-representation observed in diverse high-altitude populations that presumably did not share recent ancestors would support the hypothesis that the allele confers a selective advantage. Lack of correlation between populations is less informative because there are a number of possible explanations. Regardless of benefit, the current frequency of an allele could be low in one of the populations if the allele was absent, or very rare, in the founding stock. Alternatively, the allele may not confer an advantage in one of the populations because of environmental conditions other than altitude, such as diet, social habits, disease, etc. Another possibility is that the same benefits are achieved but by an alternative pathway (e.g. one population reduces the production of a product while a second population increases its degradation). Phenotypically, the two populations would be similar, but genotypically they would differ.
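That drift alone can produce marked differences in allele frequency is easy to demonstrate with a minimal Wright-Fisher simulation. The sketch below assumes a diploid population of constant size with random mating and no selection, mutation or migration. The function name, population size, starting frequency and number of generations are hypothetical choices made for illustration and do not represent any real population.

```python
import random


def wright_fisher(initial_freq, population_size, generations, rng):
    """Simulate neutral drift of one allele in a diploid population.

    Each generation, 2 * population_size gene copies are drawn at random
    from the previous generation. Returns the list of allele frequencies,
    starting with initial_freq.
    """
    n_copies = 2 * population_size
    freq = initial_freq
    trajectory = [freq]
    for _generation in range(generations):
        carriers = sum(1 for _copy in range(n_copies) if rng.random() < freq)
        freq = carriers / n_copies
        trajectory.append(freq)
    return trajectory


# Twenty replicate populations of 50 individuals, all starting at 0.5,
# followed for 100 generations with a fixed seed for reproducibility
rng = random.Random(1)
final_freqs = [wright_fisher(0.5, 50, 100, rng)[-1] for _ in range(20)]
```

Because no selection is modelled, any spread among the values in `final_freqs` is due to chance alone. This is why concordant over-representation of an allele in several independent high-altitude populations is more persuasive than over-representation in one.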
Despite the limitations inherent in association studies and the difficulties in finding the appropriate controls, this approach remains a powerful method of looking for the influence of genetic variants on a complex trait. Significant over-representation of an allele that a priori would be predicted to be beneficial in the population in question may be sufficient reason to undertake the more costly and difficult analysis at the phenotypic level. Similar over-representation of a ‘silent’ mutation could be followed up by gene mapping or sequence analysis in search of the causal variant. In a frequently cited article, Risch and Merikangas (Risch and Merikangas, 1996) argue that association analysis using candidate genes is the most effective way to determine the genetic basis of complex diseases. We consider that this may also be the case for non-pathological traits, such as adaptive phenotypes. Choice of candidate genes could be based on observed characteristics but would be substantially more powerful if the phenotype had a demonstrable genetic basis, as ascertained by heritability and migration studies. Advances in micro-array technology, together with the nearly completed Human Genome Project and the growth in the number of recognised single nucleotide polymorphisms, will increase the efficiency and scope of this type of analysis. The limiting factor may be the procurement of DNA from the adapted populations because there may be a finite window of opportunity to collect samples of this sort. Advances in communication and transportation, as well as changing attitudes to cultural boundaries, will inevitably lead to a homogenisation of the species. As expressed by the eminent population geneticist Luca Cavalli-Sforza in 1991, the founding year of The Human Genome Diversity Project: ‘The genetic diversity of people now living harbours clues to the evolution of our species, but the gate to preserve these clues is closing rapidly’ (Cavalli-Sforza et al., 1991).
Acknowledgements
This manuscript was substantially improved by incorporating many of the suggestions made by anonymous reviewers. J.L.R. is a research fellow supported by The Heart and Stroke Foundation of Canada. P.W.H. is supported, in part, by NSERC Canada.