## ABSTRACT

Aggregation chimaeras and X-inactivation mosaics in mice are alike in general appearance, but chimaeras are very much more variable in the proportions of the cell types (*p*) seen for example in coat pigmentation. The distribution of *p* in chimaeras is not binomial, but is uniform, or flat, between the two extremes. The greater variability of chimaeras arises from two sampling events that occur when cellular heterogeneity is already present in chimaeras but before it arises in mosaics. These are the differentiation of the inner cell mass from the trophectoderm and of the primary ectoderm from the primary endoderm. The second of these generates the flat distribution of chimaeras as a consequence of the two cell types being unmixed at that time. The two sampling events generate single-colour individuals in roughly the proportions observed. Consideration of the second sampling event provides evidence that the primordial germ cells must originate in the primary ectoderm and not in the yolk-sac.

Estimation of numbers of progenitor cells on the supposition of binomial sampling is not valid unless the clone-size is known or the cells in the sample are not contiguous. Data on coat pigmentation are consistent with the assumptions that X-inactivation is random in about 21 cells, that the sampling of melanoblasts is binomial (because they are not contiguous), and that the melanocytes of the head and body are descended from about 34 progenitor cells.

## INTRODUCTION

This paper is concerned with differences between ‘chimaeras’ and ‘mosaics’ in mice. The chimaeras to be considered are those produced by aggregation of cleavage-stage embryos, the strains aggregated differing in some recognizable genetic marker. The mosaics to be considered are those resulting from X-chromosome inactivation in females heterozygous for a sex-linked gene with distinctive phenotypes. Both are therefore heterogeneous for their cell types and both show variegation in the adult tissues, most readily seen in the coat colour, when the marker gene affects pigmentation. On the whole chimaeras and mosaics look very much alike. Many authors, however, have noted that chimaeras tend to be more variable than mosaics (see McLaren, 1976*a*), and our main purpose is to show how this difference of variability arises from the different origins of the cellular heterogeneity.

Most of the work has been done with albino versus coloured as the genetic marker distinguishing the two cell types, and the animals have been classified by the proportion of albino in the coat pigmentation. For the sake of generality and simplicity one cell type will be called ‘white’ and the other ‘black’. The proportion of ‘white’ (e.g. albino) cells in any organ or tissue will be symbolized by *p*. We are concerned, then, with the distribution and the variance of *p* among individuals. Individuals with 0 % or 100 % white will be referred to as ‘pure’ classes. Attempts have been made to deduce the number of ‘progenitor cells’ from the frequency of pure classes, on the supposition that the variation arises from binomial sampling. The second part of the paper will deal more specifically with this problem. But since the whole problem of the variability concerns the sampling of progenitor cells we must be clear at the outset what the idea of progenitor cells implies.

### Progenitor cells

The following account is based on the explanation given by McLaren (1976*a*, p. 112). After the cellular heterogeneity has been created in the embryo the cells proliferate and the two types mingle with each other. The mingling may be complete, leading to a random mixture; or the growth may be partially clonal, leading to a non-random arrangement in patches. Then some cells become ‘allocated’ (McLaren’s term) to the formation of the organ or tissue that is studied in the adult. Allocation means, strictly speaking, that the descendants of these cells make up the whole of the organ, and contribute to no other organ or tissue. The progenitor cells are those cells that have become thus allocated. The cells that make up the embryonic tissue before allocation are not progenitor cells; only after allocation are the allocated cells defined as progenitors. Allocation is an event that is inferred from the observed variation of cell proportions in the adult organ and, as McLaren points out, it need not coincide with any event of cellular differentiation that might be detected by other means. The concept of progenitor cells is therefore a statistical rather than an embryological one. The statistical requirement is that the cell proportions (*p*) among the progenitor cells should not be changed in their descendants by subsequent sampling events during the development of the organ. The variance of *p* in the adult organ is then a measure of the variance due to the sampling of the progenitor cells from the embryonic tissue of which they form a part. An ‘organ’ in this context is any assemblage of cells among which the proportions of the two cell types are observed. Thus when the proportion of one cell type is assessed from the proportion of albino in the coat, the ‘organ’ consists only of the melanocytes that contribute to coat pigmentation. The strict definition of allocation given above is unrealistic and unnecessarily restrictive. The developing organ may gain cells by immigration of cells not descended from the progenitors, or it may lose cells by emigration to other tissues. Neither of these events will affect the issue provided the cell proportions in the developing organ are not changed by them.

## DISTRIBUTION AND VARIANCE

Distributions of mosaics have been recorded by Nesbitt (1971), and of chimaeras by Mullen & Whitten (1971). In Fig. 1 we show some additional distributions to illustrate the differences between mosaics and chimaeras. One group of mosaics (*a*) was marked by the sex-linked gene brindled which affects pigmentation; the data are from an experiment described by Falconer & Isaacson (1972). The other group of mosaics (*c*) was marked by albino with Cattanach’s translocation; the data were kindly supplied by Dr Cattanach. Five groups of chimaeras (*e*-*i*) are plotted separately because they involved different strain combinations and had different means. Groups (*e*), (*f*) and (*g*) are our own, reported by Roberts, Falconer, Bowman & Gauld (1976). The other groups are compiled from published sources, as noted under the figure, and from data kindly supplied by Dr Lyon. Group (*h*) was marked by multiple recessive genes; all the other chimaeras were marked by albino. The coat pigmentation of all mosaics and chimaeras was classified by class-intervals of 5 percentage points, except group (*i*) where the class interval was 10 percentage points. The two mosaic groups (*a*, *c*) and three of the chimaeras (*e*, *f*, *g*) were all classified by the same standards developed by Dr Cattanach and Mr J. H. Isaacson. The differences between them therefore cannot be attributed to different criteria of classification.

It is immediately obvious that the mosaics and chimaeras are very different. The chimaeras are not only much more variable, but their distributions are very clearly not binomial. Both groups of mosaics, in contrast, fit closely to binomial distributions, though they would not be expected to be exactly binomial for reasons to be stated later. For the sake of illustration, binomial distributions are shown in Fig. 1(*b*) and (*d*). These are the binomials best fitting the observed mosaic distributions (*a*) and (*c*) respectively, having the same mean and variance. The means and variances of the distributions shown in Fig. 1, with other parameters for discussion later, are given in Table 1.

The essential feature of the distributions of chimaeras is that, excluding the pure classes (0 % and 100 %), they are more or less flat, as far as can be judged from the rather small numbers. That is to say, all values of *p* are equally probable. We have, therefore, to look for a sampling process that will give a distribution that is flat, or nearly so, between the two extremes. The frequencies of the pure classes seem to differ between groups, though not significantly, ranging from 11 to 37 % for the two classes combined. We hope to show that these single-colour animals are not necessarily technical failures, but could result from the sampling processes in chimaeras.

## SAMPLING EVENTS AND THE ORIGIN OF CELLULAR HETEROGENEITY

The events in the early embryo that we have to consider are the following. The embryonic stages at which they occur are illustrated diagrammatically in Fig. 2.

In chimaeras the cellular heterogeneity is, of course, present from the beginning, when the embryos are stuck together. The aggregated embryo is then composed of two halves with different cell types. It has been shown by Garner & McLaren (1974) that virtually no mingling of the cell types occurs during the two cell divisions after aggregation of 8-cell embryos. Thus in the blastocyst of 64 cells the two cell types are still unmixed. This is one of the two main points in our argument and we shall return to it later.

(1) The first sampling event is the separation of the inner cell mass from the trophectoderm at days. The inner cell mass gives rise later to the whole of the embryo. It is composed of cells lying inside, the outside layer being the trophectoderm. In principle, a circumferential division would not affect the cell proportions, but the number of cells in the inner cell mass is not large, and chance could affect the numbers of the two cell types that find themselves in the inside. Garner & McLaren (1974) counted the cells in chimaeric blastocysts after one of the components had been labelled with [^{3}H]thymidine. In six blastocysts the mean proportion of labelled cells in the inner cell mass was 0·45 ± 0·05. The variance of the cell proportions was 0·0161, and the range was 0·273 to 0·565. Thus it seems that some variation does arise in chimaeras from the sampling of cells to form the inner cell mass, but not enough to generate the flat distribution.

(2) The second sampling is the division of the inner cell mass into ‘primary ectoderm’ and ‘primary endoderm’ at 4–5 days. There are then about 45 cells in the whole inner cell mass of non-chimaeric embryos, and roughly half of these, i.e. about 22, form the primary ectoderm from which the whole embryo develops (McLaren, 1976*b*). Chimaeras at this stage have double the normal number of cells in the whole embryo, and probably about three times as many in the inner cell mass, for reasons of geometry explained by Buehr & McLaren (1974). Thus there are probably roughly 60–70 cells in the primary ectoderm of chimaeras when it differentiates. This is the sampling that we believe gives rise to the flat distribution in chimaeras, and we shall return to it later.

(3) The cellular heterogeneity of mosaics arises when X-inactivation takes place. This occurs sometime between and 7 days (McLaren, 1976*a*, p. 23), and is therefore probably after the second sampling. If X-inactivation takes place just before or just after the separation of the primary ectoderm from primary endoderm, then the number of cells present would be about 22. If it takes place later the number of cells would be greater. The important difference between the origins of cellular heterogeneity in mosaics and chimaeras is that in mosaics it arises randomly among the cells present, so that the two cell types are randomly mixed immediately after X-inactivation. Consequently the variation arising is binomial with a sample size of not less than about 22 cells. Nesbitt (1971) made a penetrating analysis of mosaics and concluded that the variation arising from X-inactivation is binomial with a sample size of about 21 cells at X-inactivation.

The three sampling events considered so far (two in chimaeras and one in mosaics) produce samples that give rise to the whole embryo. The variation that arises from them is variation of the cell-proportions in the animal as a whole. Subsequent events, or processes, give rise to variation between different organs of the same individual, and therefore add to the variation between the same organ in different individuals. The processes that need to be considered are the following.

(4) The allocation of the progenitor cells of the organ. This will be considered later in the paper.

(5) Differential growth of the two cell types: the relative growth rates may vary from one organ to another, as found for example by Barnes, Tuffrey, Drury & Catty (1974).

(6) Finally, there will be variation due to error of measurement of the cell proportions in the organ studied.

## ORIGIN OF FLAT DISTRIBUTION IN CHIMAERAS

The differentiation of the inner cell mass into primary ectoderm and primary endoderm is the second sampling event illustrated in Fig. 2. It takes place two, or at most three cell divisions after Garner & McLaren (1974) found the cell types to be unmixed. We assume that they are still unmixed when the sampling occurs. The consequences are then as follows.

For simplicity, let us suppose that the inner cell mass then consists of a roughly spherical ball of cells, divided in two equal halves by the plane of aggregation, with ‘white’ cells in one half and ‘black’ cells in the other. The cells that are nearest the blastocoel become the primary endoderm, the remainder becoming the primary ectoderm from which the whole of the embryo proper develops. According to the evidence summarized by McLaren (1976*b*), the division is roughly equal, so that roughly half the inner cell mass becomes primary ectoderm. The situation is illustrated in Fig. 2(i). There is no reason to suppose that the plane of aggregation will have any specific orientation with respect to the plane of differentiation. The angle between the two planes will presumably, therefore, be random. The proportion, *p*, of white cells in the sample (i.e. the primary ectoderm) depends on this angle. So if all angles are equally probable, all values of *p* will be equally probable, including the pure classes (0 % and 100 %). Thus, if the conditions specified hold, this sampling can in principle generate a flat distribution. Some further elaboration of the model is, however, necessary.
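The simplest case of this argument can be checked numerically. The sketch below is an illustration only, not the derivation of Appendix I: it assumes a spherical inner cell mass, both planes passing through its centre (equal halves), and an angle `theta` between the two planes drawn uniformly, as the text stipulates. The function names are hypothetical. Under these assumptions the white share of the ectoderm hemisphere is a wedge whose size changes linearly with the angle, so a uniform angle gives a flat distribution of *p*.

```python
import math
import random


def white_fraction(theta):
    """Share of the ectoderm hemisphere on the white side.

    theta is the angle (radians, 0..pi) between the normals of the plane
    of aggregation and the plane of differentiation, both through the centre.
    """
    return 1.0 - theta / math.pi


def white_fraction_mc(theta, n_points=20000, seed=0):
    """Monte Carlo check of white_fraction using random points in a unit ball."""
    rng = random.Random(seed)
    ax, az = math.sin(theta), math.cos(theta)  # normal of the aggregation plane
    white = ecto = 0
    while ecto < n_points:
        x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
        if x * x + y * y + z * z > 1.0 or z <= 0.0:
            continue  # outside the ball, or in the endoderm half
        ecto += 1
        if ax * x + az * z > 0.0:
            white += 1
    return white / ecto


def simulate_p(n_embryos=10000, seed=1):
    """Draw a uniform angle per embryo (assumption) and return the p values."""
    rng = random.Random(seed)
    return [white_fraction(rng.uniform(0.0, math.pi)) for _ in range(n_embryos)]


def histogram(values, n_bins=10):
    """Count values of p in equal-width classes between 0 and 1."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    print(white_fraction_mc(math.pi / 4), white_fraction(math.pi / 4))
    print(histogram(simulate_p()))
```

The histogram classes come out roughly equal in size. The sketch does not cover unequal proportions in the inner cell mass or a plane of differentiation off-centre.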

First, there has already been some variation in cell proportions introduced by the first sampling (inner cell mass/trophectoderm), so the plane of aggregation will not be equatorial in all embryos (Fig. 2, iv and v). Second, the primary ectoderm may not comprise exactly half the cells, so the plane of differentiation may not be through the centre of the sphere (Fig. 2, ii and iii). Dr McLaren kindly provided data on this point, consisting of cell counts in 10 non-chimaeric embryos at the appropriate stage. The total number of inner cell mass cells ranged from 31 to 177. The proportion of cells that had differentiated into primary ectoderm ranged from 0·41 to 0·63, and this proportion was not correlated with the total cell number. The mean proportion was 0·51, with a standard error of 0·02 and a standard deviation of 0·06. Thus it is clear that, on average, half the inner cell mass cells differentiate into primary ectoderm, and that there is some variation in this proportion. It is not known whether chimaeras, with twice as many cells, do the same but we shall assume that they do.

We have worked out the geometry of the sampling at the differentiation of the inner cell mass into primary ectoderm and endoderm. Details of how the distributions were derived are given in Appendix I, and the conclusions are illustrated in Fig. 3. Figure 3 shows frequency distributions of the proportion of white cells, *p*, in the embryonic ectoderm and therefore in the whole of the embryo proper. Values of *p* are grouped in five percentage-point classes, so that the distributions are comparable with the observed distributions in Fig. 1. There are difficulties in showing the frequencies of pure classes in these histograms of theoretical distributions. Pure classes will be dealt with separately later; in Fig. 3 they are combined with the end-classes, 5 % and 95 %. There are two variables that affect the distributions: the proportion, *a*, of white cells in the inner cell mass, which as noted earlier has been seen to vary between about 0·3 and 0·6; and the proportion, *b*, of inner cell mass cells that form primary ectoderm, which we saw in the previous paragraph may vary between about 0·4 and 0·6. The four upper graphs show what happens if *a* is always 0·5 and *b* varies between 0·3 and 0·7. Consideration of Fig. 2 will show that when *b* is less than 0·5 (case ii) all values of *p* can occur; but when *b* is greater than 0·5 (case iii), *p* is restricted within narrower limits. The lower four graphs show what happens when *b* is always 0·5 and *a* varies between 0·3 and 0·7. The distributions are now asymmetrical, reaching one extreme but not the other. The reason for this can be seen from Fig. 2, (iv) and (v): with the proportions shown in (iv) the primary ectoderm can be all-white but not all-black; with the proportions in (v) it can be all-black but not all-white, the greatest possible proportion of white cells in the primary ectoderm being as shown in the drawing. The graph for *a* = 0·5 with *b* = 0·5 is not shown; it is perfectly flat from one extreme to the other. We have not explored the consequences of both *a* and *b* varying simultaneously.

All the distributions in Fig. 3 are very nearly flat between the limiting values of *p*. Any real distribution is likely to be a mixture of distributions such as those shown, and with other combinations of *a* and *b*. Predictions about the expected distributions therefore cannot be precise, but the expectation of a more or less flat distribution seems fully justified, provided the mean values of *a* and *b* are close to 0·5. Certainly the observed distributions in Fig. 1 can easily be accounted for by the theoretical distributions in Fig. 3.

We come now to the question of what will be the frequencies of the pure classes. The number of possible values of *p*, the cell proportions in the sample, is limited by the number of cells in the sample. If there are *n* cells, then there are only *n*+1 possible values of *p*, i.e. 0/*n*, 1/*n*, ..., *n*/*n*. If the distribution is perfectly flat, then all values of *p* are equally probable and the frequency of each pure class will be 1/(*n*+1). As noted earlier, there are probably about 60 or more cells in the primary ectoderm of chimaeras, so a perfectly flat distribution would have pure classes represented with a maximum frequency of about 1/61 = 1·6 % for each class. Table 2 gives the frequencies of pure classes for each of three values of *n*, with different combinations of *a* and *b*. It will be seen that the value of *n*, within the range 20–100, does not very greatly affect the frequency of pure classes. Three points need to be noted about Table 2 before we compare observed with expected frequencies. First, consider what happens if *b* varies, with *a* = 0·5. If *b* is always less than 0·5 (Fig. 2, ii) we shall get both pure classes, each with the frequency given in Table 2. But if *b* varies round a mean of 0·5 it will be greater than 0·5 as often as it is less than 0·5, and when *b* is greater than 0·5 (Fig. 2, iii) we get no pure classes. Thus with variable *b* the overall frequency of each pure class will be only half of the value entered in Table 2. Second, consider what happens if *a* varies, with *b* = 0·5. If *a* is always greater than 0·5 (Fig. 2, iv) there will be only one pure class with the frequency shown in Table 2. If *a* is always less than 0·5 (Fig. 2, v), there will be only the other pure class, with the same frequency. But if *a* varies round 0·5 there will be both pure classes, each with half the frequency shown. In summary, if either *b* or *a* varies round a mean of 0·5 the frequencies in Table 2 are those of both pure classes combined, except in the case of both *a* and *b* = 0·5 when the frequency entered is of one pure class. The third point concerns the generation of asymmetry, i.e. unequal frequencies of the two pure classes. Asymmetry can be generated only by *a* having a mean different from 0·5, such as would result from one cell-type being preferentially included in the inner cell mass when this differentiates from the trophectoderm. If the mean of *b* differs from 0·5 the consequence is an increase or a decrease of both pure classes.
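The counting argument for the perfectly flat case can be written out as a short sketch. It covers only the case *a* = *b* = 0·5 and does not reproduce Table 2; the function names are hypothetical.

```python
from fractions import Fraction


def possible_proportions(n):
    """All values p can take in a sample of n cells: 0/n, 1/n, ..., n/n."""
    return [Fraction(i, n) for i in range(n + 1)]


def flat_pure_class_frequency(n):
    """Frequency of each pure class when all n + 1 values are equally probable."""
    return Fraction(1, n + 1)


if __name__ == "__main__":
    n = 60  # approximate number of primary ectoderm cells in chimaeras (from the text)
    print(len(possible_proportions(n)))                    # 61
    print(round(float(flat_pure_class_frequency(n)) * 100, 1))  # 1.6 (per cent)
```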

The observed frequency of pure classes differs among experimental groups. In the five groups in Fig. 1 and Table 1 the combined frequency of both pure classes ranges from 11 % to 37 %, with an unweighted mean of 23 %. The differences between these five groups are, however, not significant (*χ*^{2}_{[4]} = 8·52, *P* > 0·05). The three ‘balanced’ strain combinations of Mullen & Whitten (1971) are nearly alike, with a mean of 33 %. Sanyal & Zeilmaker (1976) found 36 % of pure classes for coat pigmentation. It is clear from these data that the observed frequencies of pure classes are generally higher than would be expected from the variation of *a* and *b* as shown in Table 2. The expectations in Table 2, however, refer to the whole body, whereas the observed frequencies are ‘single-colour’ individuals classified by coat pigmentation. Single-colour individuals have often been found to be chimaeric in other organs. The data are very scanty, but we have made a rough estimate of the proportion of single-colour individuals that are chimaeric elsewhere. In our own chimaeras (group *f* in Table 1 and others not included) chimaerism of an enzyme marker was looked for in 8 other organs or tissues (I. K. Gauld, unpublished); 2 out of 11 single-colour individuals were chimaeric elsewhere. In group (*h*) of Table 1, 2 out of 6 were chimaeric in hair-follicles. In Sanyal & Zeilmaker’s (1976) data 3 out of 24 were chimaeric in eye-pigmentation or in retinal cells. The total is 7 chimaeric out of 41, or 17 %. This, of course, is a minimal estimate because of the limited search of other tissues. To get the frequency of ‘true’ pure classes we have to reduce the observed frequency of single-colour individuals by 17 % or more. Taking the observed frequency as very roughly 30 %, this gives 25 %, which still seems too high to be accounted for by the variation of *a* and *b*. We think, nevertheless, that it is not necessary to attribute single-colour individuals to technical failure of the aggregation, for four reasons. First, the proportion that are chimaeric elsewhere may be higher. Second, the expectations in Table 2 are for either *a* or *b* varying, but not both together. If both vary simultaneously the proportion of pure classes could be much higher. Third, the expectations are based on the assumption that there is no cell selection. If the two cell types proliferate at different rates the mean cell proportions observed will not be 0·5 and the frequency of pure classes will be increased above the expectations based on no selection. Some of the data cited above on the frequency of pure classes may not meet this requirement. And, finally, the expectations can only be approximate because the inner cell mass is not in reality spherical.
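The arithmetic behind the 17 % correction can be retraced with the counts quoted above. This is only a restatement of those figures; the variable and function names are hypothetical, and the 30 % observed frequency is the rough value adopted in the text.

```python
# (chimaeric elsewhere, single-colour individuals examined), as quoted in the text
groups = [(2, 11), (2, 6), (3, 24)]


def pooled_fraction(pairs):
    """Pooled proportion of single-colour individuals found chimaeric elsewhere."""
    chimaeric = sum(c for c, _ in pairs)
    examined = sum(n for _, n in pairs)
    return chimaeric / examined


def corrected_pure_frequency(observed, fraction_chimaeric):
    """Reduce the observed single-colour frequency by the chimaeric fraction."""
    return observed * (1.0 - fraction_chimaeric)


if __name__ == "__main__":
    frac = pooled_fraction(groups)
    print(round(frac * 100))                                   # 17 (per cent)
    print(round(corrected_pure_frequency(0.30, frac) * 100))   # 25 (per cent)
```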

## SAMPLING FROM CLONES

So far we have considered the sampling of cells that give rise to the whole embryo and adult. In the case of chimaeras we have seen that there are two samplings and they give rise to a large amount of variation, ranging from one extreme to the other. In the case of mosaics there is only one such sampling, at X-inactivation, and this gives rise to binomial variation dependent on the number of cells at the time of X-inactivation. In this section we consider the subsequent sampling of progenitor cells to form particular organs or tissues. This gives rise to variation between different organs of the same individual and adds to the variation between the same organ in different individuals. As Nesbitt (1971) has pointed out, if an analysis of variance is made between and within individuals, the component between individuals, i.e. the covariance, estimates the variation due to the first sampling, and the component within individuals estimates the variation due to the later samplings of the progenitor cells.

The sampling of progenitor cells has in the past been assumed to be binomial, so that the number of progenitor cells, *N*, can be deduced from the variance *σ*^{2} = *pq/N*, or from the frequencies of pure classes, which are *p*^{N} and *q*^{N} respectively, where *p* and *q* are the mean proportions of the two types of cell. The supposition of binomial variation, however, is not valid if the progenitor cells are contiguous cells, sampled from a tissue in which there has been some coherent clonal growth so that the two cell types are not randomly dispersed in the tissue. The variation then depends not only on the sample size but also on the clone size in the tissue from which it is taken. It turns out that the binomial supposition can lead to estimates of cell number that are grossly wrong. Other reasons for doubting the validity of the binomial supposition are discussed by McLaren (1976*a*).
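For reference, the two binomial estimators mentioned above can be sketched as follows. The function names are hypothetical, and the example values (variance 0·05, mean proportion 0·5) are chosen purely for illustration.

```python
def progenitors_from_variance(p, variance):
    """Binomial estimate N = pq / variance, with q = 1 - p."""
    return p * (1.0 - p) / variance


def pure_class_frequencies(p, n_cells):
    """Expected frequencies of the two pure classes, p**N and q**N."""
    return p ** n_cells, (1.0 - p) ** n_cells


if __name__ == "__main__":
    print(progenitors_from_variance(0.5, 0.05))   # about 5
    print(pure_class_frequencies(0.5, 5))         # (0.03125, 0.03125)
```

Both estimators rest on the supposition of random (binomial) sampling, which is the point at issue in this section.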

If, however, the progenitor cells are sampled singly, and not as a contiguous group, then the variation arising will be binomial, whatever the clone size in the tissue may be. It is possible that the progenitor cells of melanocytes become allocated as single cells in the neural crest. If this is so, then the variation of skin pigmentation arising from the sampling of progenitor cells would be binomial, and estimates of their number would not be subject to the type of error discussed below. Melanocyte progenitor cells are the subject of the next section.

A ‘clone’, in the sense we use it, means a group of contiguous cells that are descended from a single cell one or more cell divisions previously. Adjacent clones of the same cell type form a single ‘patch’. There will, of course, be patches even if the arrangement of the cells is completely random (see West, 1975, 1976). Sampling of contiguous cells from a random arrangement will yield a binomial distribution. We are concerned with the embryonic tissue in which the progenitor cells become allocated. In a rapidly growing tissue, cell mingling would have to be very rapid if it were to break up the clones and maintain a random arrangement. So in embryonic tissue one must expect clones to be present and patches to be larger than the random size.

We have worked out the consequences of sampling in a one-dimensional array, i.e. a line, of cells. We assume there is an initial state of random arrangement, corresponding to a clone size of *k* = 1. Then there is cell division with no mixing, so that daughter cells lie side by side, and successive divisions give clone sizes of *k* = 2, 4, 8, etc. We have assumed for simplicity that all cells divide at the same rate, so that all clones are the same size. We assume, further, that after the sampling of progenitor cells the two cell types proliferate at the same rate, so that the cell proportions observed in the adult are the same as they were in the sample. Details of the derivation of the variance are given in Appendix II. The results are shown in Fig. 4. This shows the variance of cell proportions plotted against the sample size, *n*, when the clone size, *k*, is 1, 2, 4 or 8. A clone size of *k* = 1 represents a random arrangement of cells, so the variance shown for *k* = 1 is the binomial variance. It can be seen from the figure that for any sample size, *n*, the variance increases as the clone size, *k*, increases. This means that if we estimate *n* from the binomial variance (*k* = 1) we will always get an underestimate if *k* is greater than 1. For example, suppose the observed variance were 0·05, and the clone size *k* = 4. Reading, correctly, against the curve of *k* = 4 shows that the sample size was *n* = 19. But reading erroneously against the curve for *k* = 1 would give a ‘binomial’ estimate of *n* = 5. The conclusion is that the number of progenitor cells cannot be estimated from the variance unless the clone size in the tissue is known. Table 3 gives some examples from which the range of error can be appreciated. It shows the estimates (*N*) that would be obtained by supposing the sampling to be binomial, with various true values of *n* and clone-sizes of 2, 4 and 8. When the true *n* is large relative to *k*, the binomially estimated *N* approximates to *n*/*k*, which is the mean number of clones in the sample of progenitor cells. The table shows that the approximation is quite close if *n* is greater than about 2*k*. Thus the binomial *N* is much better regarded as an estimate of *n*/*k* than of *n*.
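The effect of clone size on the variance can be sketched numerically. What follows is one possible formalisation of the one-dimensional model, not the derivation of Appendix II: it assumes equal clones of size *k*, each independently ‘white’ with probability *p*, and a run of *n* contiguous cells whose starting position within a clone is equally likely to be any of the *k* positions. The function names are hypothetical.

```python
from fractions import Fraction


def clone_pieces(n, k, offset):
    """Sizes of the clone fragments inside a run of n contiguous cells.

    The run starts `offset` cells into a clone of size k (0 <= offset < k).
    """
    pieces = []
    remaining = n
    take = min(k - offset, remaining)   # partial first clone
    pieces.append(take)
    remaining -= take
    while remaining > 0:                # whole clones, then a partial last one
        take = min(k, remaining)
        pieces.append(take)
        remaining -= take
    return pieces


def variance_of_p(n, k, p=Fraction(1, 2)):
    """Variance of the white proportion in the sample, averaged over start positions."""
    q = 1 - p
    total = sum(sum(c * c for c in clone_pieces(n, k, off)) for off in range(k))
    mean_sum_of_squares = Fraction(total, k)
    return p * q * mean_sum_of_squares / (n * n)


def binomial_estimate(n, k, p=Fraction(1, 2)):
    """Sample size that would be inferred by wrongly assuming binomial sampling."""
    return p * (1 - p) / variance_of_p(n, k, p)


if __name__ == "__main__":
    print(float(variance_of_p(19, 4)))      # about 0.049
    print(float(binomial_estimate(19, 4)))  # about 5.1, against a true n of 19
    print(19 / 4)                           # n/k = 4.75
```

Under these assumptions the sketch reproduces the worked example in the text: a true sample of 19 cells with clones of 4 gives a variance close to 0·05 and a ‘binomial’ estimate of about 5, close to *n*/*k*.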

The foregoing discussion refers to the rather unrealistic situation of a tissue composed of a single line of cells. For a two-dimensional, or for a solid tissue, the details may be different, but the same general conclusion must hold: that the sample size will be underestimated by assuming binomial sampling. If the sample size were larger than the clone size, the binomial estimate would approximate to the number of clones included in the progenitor cells, as we have seen it does in a one-dimensional tissue.

## SEQUENTIAL SAMPLING

One source of variation remains to be examined, and that is the sampling of progenitor cells of melanocytes. In chimaeras, this sampling will give rise to some individuals that are chimaeric in some tissues, but not in the coat melanocytes, i.e. ‘single-colour chimaeras’. We want to find out what proportion of these would be expected. The number of melanocyte progenitor cells has been estimated in different ways from mosaics and chimaeras. We consider mosaics first.

If *n*_{1} and *n*_{2} are the sequential sample sizes, then 1/*N* = 1/*n*_{1} + 1/*n*_{2}, where *N* is the ‘effective’ sample size. The variance resulting from the two samplings is given by *σ*^{2} = *pq/N*, *p* and *q* being the mean cell proportions. Nesbitt, working with Cattanach’s translocation, estimated *n*_{1} = 21 from the covariance of *p* in several different organs, and then estimated *n*_{2} = 22 for coat pigmentation from the observed variance. The observed variance, however, included variance due to error of measurement which was not separately estimated for coat pigmentation. If allowance is made for the error variance, the estimate of *n*_{2} is considerably increased. Falconer & Isaacson (1972) classified a number of brindled mosaics each twice and found the correlation between the two scores, i.e. the repeatability, to be 0·92 in the strain whose distribution is shown here in Fig. 1(*a*). This means that 8 % of the variance of single scores was attributable to measurement error. The variance of *p* in those mice was 0·0141, so the estimated error variance was about 0·0011 (8 % of 0·0141). The classification, however, was made by judgement, not measurement, so the real error variation was probably greater than that arising from inconsistency of classification, but how much greater can only be guessed at. We shall show that a reasonable guess leads to an estimate of *n*_{2} that is consistent with the estimate derived from the striping pattern of chimaeras.

It is easier to think of the standard deviation of percentage scores attributable to error, rather than the variance of proportions. The estimate of error variance given above corresponds to a standard deviation of about three percentage points. Remembering that the classification was made in five percentage-point classes, one might guess that the total error variance might be equivalent to a standard deviation of five percentage points, which means a variance of proportions of 0·0025. Subtracting this from the observed variance gives the ‘true’ variance from which *N* is to be derived. Taking the observed variance of the brindled mosaics from Table 1 gives an estimate of *N* = 13, which is only a little higher than Nesbitt’s estimate of 11. The effect on the estimate of *n*_{2}, however, is greater. If we take Nesbitt’s estimate of *n*_{1} = 21, we get *n*_{2} = 34. The effects of assuming different levels of error variance are as follows.

Turning now to chimaeras, the number of melanocyte progenitors has been estimated from the pattern of striping. From the ‘standard pattern’, Mintz (1967) concluded that there are 34 primordial cells, 6 for the head, 12 for the body and 16 for the tail. Wolpert & Gingell (1970) increased the number to 64, on the ground that the stripes are patches and not clones, and West (1975), on the same grounds but by a different method, arrived at a number of 68. For comparison with the estimate from mosaics, we are concerned only with the head and body, since tails were not taken into consideration in the classification. Taking Wolpert & Gingell’s total number and dividing it in Mintz’ proportions to head and body gives an estimate from the striping pattern of *n*_{2} = 34. This is the same as the estimate based on an error standard deviation of five percentage points. Making a reasonable guess at the error variance thus gives an estimate of the number of melanocyte progenitor cells in the head and body that is consistent with the independent estimate from the striping pattern.

The estimate of *n*_{2} derived from the variance is, of course, dependent on the value taken for *n*_{1}. Confidence limits of *n*_{1} can be calculated from Nesbitt’s data. In her Table 3 she gives 15 estimates of *n*_{1} based on covariance of various tissues in pairs. The mean of these estimates is 20·13 with a standard error of 1·18. Taking 95 % confidence limits as ± 2 S.E. gives upper and lower confidence limits for *n*_{1} of 22·5 and 17·8. Using these with *N* = 13 (corresponding to *σ*_{e} = 0·05) gives lower and upper limits for *n*_{2} of 31 and 48. The real confidence limits must be wider than these because we have disregarded the error variance in estimating *N* from the observed variance. The estimates of *n*_{2} are therefore not to be regarded as being very precise.
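The figures in this passage can be reproduced with a few lines of arithmetic. The sketch assumes that the two successive binomial samplings combine as 1/*N* = 1/*n*_{1} + 1/*n*_{2}. This relation is not restated in this passage and is our assumption, adopted because it reproduces the quoted values. The function names are ours.

```python
def n2_from_variance(N, n1):
    """Solve 1/N = 1/n1 + 1/n2 for n2 (assumed relation, see lead-in)."""
    return 1.0 / (1.0 / N - 1.0 / n1)

def n2_from_striping(total, head, body, tail):
    """Share of the progenitor total allotted to head and body."""
    return total * (head + body) / (head + body + tail)

# Mintz's 6 : 12 : 16 split applied to Wolpert & Gingell's total of 64.
striping = n2_from_striping(64, 6, 12, 16)

# Nesbitt's 15 estimates of n1: mean 20.13, standard error 1.18.
mean_n1, se_n1 = 20.13, 1.18
n1_low, n1_high = mean_n1 - 2 * se_n1, mean_n1 + 2 * se_n1

N = 13  # value of N after allowing for an error s.d. of 0.05
point = n2_from_variance(N, 21)
lower = n2_from_variance(N, n1_high)  # a larger n1 gives a smaller n2
upper = n2_from_variance(N, n1_low)

print(round(striping), round(point), round(lower), round(upper))
```

Rounded to whole cells, the four values are 34, 34, 31 and 48, matching the striping estimate, the point estimate and the two limits given in the text.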

The conclusions to be drawn from the foregoing calculations are that a consistent picture results from the assumptions that (i) there are about 21 cells in the embryo proper at the time of X-inactivation, (ii) the melanocytes in the coat of the head and body are derived from about 34 progenitor cells, and (iii) the sampling at X-inactivation and the sampling at the allocation of melanocyte progenitor cells are both binomial.

We come now to the main question about chimaeras: what proportion of single-colour individuals can be attributed to the sampling of melanocyte progenitor cells? These are individuals that are chimaeric elsewhere in the body but not in the coat pigmentation. In considering single-colour individuals we can no longer exclude the tail, because the pigmentation on the tail is taken into account in distinguishing single-colour individuals from overt chimaeras. For this purpose, therefore, we shall adopt Wolpert & Gingell’s total of 64 melanocyte progenitors.

The proportion of pure classes among all individuals will be denoted by *P* and the proportion of chimaeras by *Q* = 1 − *P*. Let:

*n* = the number of melanocyte progenitor cells, taken to be 64.

*Q*_{1} = the proportion of individuals that are chimaeric before melanocyte sampling.

*Q*_{2} = the proportion that are chimaeric in the coat after melanocyte sampling, i.e. overt chimaeras.

Then the difference, *Q*_{1} − *Q*_{2}, represents the proportion of individuals that have become single-colour by the melanocyte sampling but are chimaeric elsewhere. Let Δ*P* be this increment to the pure classes resulting from melanocyte sampling. It is shown in Appendix III that Δ*P* = 2*Q*_{1}/(*n* + 1). (We have assumed that the number of cells in the tissue from which the *n* melanocytes are sampled is large relative to *n*, and that among those that are chimaeric in this tissue all values of *p* are equally probable, i.e. the distribution is flat.)
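One way to see where the factor 2/(*n* + 1) comes from, assuming binomial sampling of *n* cells from a flat distribution of *p* as stated above: the number of white cells in the sample is then equally likely to take any value from 0 to *n*, so the two single-colour outcomes together have probability 2/(*n* + 1). The sketch below checks this with exact fractions. It is an illustration rather than the derivation of Appendix III, and the function names are ours.

```python
from fractions import Fraction
from math import comb, factorial

def prob_white_count(n, j):
    """P(j white cells among n) when p is uniform on (0, 1).

    Uses the Beta integral: the integral of p^j (1-p)^(n-j) over (0, 1)
    equals j! (n-j)! / (n+1)!.
    """
    return Fraction(comb(n, j) * factorial(j) * factorial(n - j),
                    factorial(n + 1))

def prob_single_colour(n):
    """Probability that all n sampled cells are of one type."""
    return prob_white_count(n, 0) + prob_white_count(n, n)

n = 64
pure = prob_single_colour(n)
print(pure, float(pure))
```

For *n* = 64 the probability is 2/65, about 3 %, so roughly 3 % of the individuals that are chimaeric before melanocyte sampling would be expected to become single-colour in the coat.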

This gives the proportion of single-colour individuals that are expected to be chimaeric elsewhere. The expectations according to the value taken for the frequency of single-colour individuals (*P*_{2}) are as follows:

We saw earlier that *P*_{2} ranges roughly from 10 to 40 %, and that Δ*P*/*P*_{2}, estimated from very limited data, is 17 % or more. The mean value of *P*_{2} in the groups from which the estimate of Δ*P*/*P*_{2} came was about 30 %, for which the expectation of Δ*P*/*P*_{2} would be 7 %. The discrepancy may not be very serious considering the paucity of data; the estimate of Δ*P*/*P*_{2} = 17 % has a standard error of 6 %, and the estimate of *P*_{2} = 30 % has a standard error of 4 %. An adjustment of one standard error in each would bring them into agreement. On the other hand, the discrepancy could be accounted for by an additional sampling event between X-inactivation and the allocation of melanocyte progenitors. This would result in some individuals having only one cell type in the neural crest tissue before melanocyte sampling, though still being chimaeric elsewhere. These individuals would be found as single-colour chimaeras, but would not have resulted from the melanocyte sampling.
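The expectation quoted for *P*_{2} = 30 % can be recomputed from the relation Δ*P* = 2*Q*_{1}/(*n* + 1). The sketch below, with a function name of our own choosing, assumes *n* = 64 and *Q*_{2} = 1 − *P*_{2}; the rearrangement in the docstring is our own working.

```python
def expected_delta_p(p2, n=64):
    """Expected increment to the pure classes, given the final
    frequency P2 of single-colour individuals.

    From Q2 = Q1 - dP and dP = 2*Q1/(n + 1) it follows that
    dP = 2*Q2/(n - 1), where Q2 = 1 - P2.
    """
    q2 = 1.0 - p2
    return 2.0 * q2 / (n - 1)

for p2 in (0.10, 0.20, 0.30, 0.40):
    dp = expected_delta_p(p2)
    print(f"P2 = {p2:.0%}  dP = {dp:.3f}  dP/P2 = {dp / p2:.1%}")
```

At *P*_{2} = 30 % the ratio comes to a little over 7 %, in line with the expectation stated in the text, and it falls as *P*_{2} rises across the observed range.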

## DISCUSSION

Examination of the events in the early embryo by which cells are sampled has provided a coherent explanation of many features of chimaeras and mosaics, and particularly of the much greater variation of cell proportions found in chimaeras. There are three early sampling events: the differentiation of inner cell mass from trophectoderm, the differentiation of primary ectoderm from primary endoderm, and X-chromosome inactivation. The first two events cause variation in chimaeras, but not in mosaics because the cellular heterogeneity is not then present in mosaics. The third event causes variation in mosaics, but not in chimaeras marked by an autosomal gene. Most of the variation in chimaeras comes from the second sampling, and the reason why this is so much greater than the variation arising from X-inactivation in mosaics is that the two cell types are still largely unmixed when the primary ectoderm differentiates.

All three of these sampling events occur before organogenesis has started. Consequently the variation they produce is variation of cell proportions throughout the whole body. If no variation arose subsequently in separate organs, all organs would therefore be completely correlated in respect of cell proportions, in both chimaeras and mosaics. Subsequent variation affecting organs separately arises from the sampling of progenitor cells, and from other causes such as differential proliferation. It is tempting to think that if two organs are correlated in respect of cell proportions they must share a common cell lineage. A great deal of caution, however, is needed in drawing such a conclusion, for two reasons. First, a correlation between two organs by itself tells us no more than that they are both derived from the primary ectoderm. Second, if two organs are more highly correlated than either is with a third, there would be some grounds for concluding that the first two have some cell lineage in common that they do not share with the third. But there could be another explanation: the third organ could have fewer progenitor cells than the first two, and so have a greater variance. This alternative explanation, however, would not apply if it were the covariances, rather than the correlations, that differed in the way described.
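The distinction between correlations and covariances can be made concrete with a small numerical sketch. It assumes, purely for illustration, that each organ's cell proportion is the whole-body proportion plus an independent binomial sampling term *p*(1 − *p*)/*n* for its own progenitor cells. All numerical values and function names are hypothetical.

```python
import math

def organ_variance(var_body, p, n_progenitors):
    """Variance of an organ's cell proportion: whole-body variance plus
    a binomial sampling term p(1-p)/n for its progenitor cells."""
    return var_body + p * (1.0 - p) / n_progenitors

def correlation(var_body, var_a, var_b):
    """Correlation of two organs whose only shared variation is the
    whole-body variation, so their covariance equals var_body."""
    return var_body / math.sqrt(var_a * var_b)

var_body, p = 0.05, 0.5              # hypothetical values
v1 = organ_variance(var_body, p, 50)
v2 = organ_variance(var_body, p, 50)
v3 = organ_variance(var_body, p, 5)  # an organ with few progenitor cells

r12 = correlation(var_body, v1, v2)
r13 = correlation(var_body, v1, v3)
print(f"r12 = {r12:.2f}, r13 = {r13:.2f}, covariance = {var_body} in both")
```

Under these assumptions the third organ is less highly correlated with the first than the second is, although none of the three shares any lineage beyond the whole-body sampling; the covariance is the same for every pair.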

The consequences of the sampling of cells to form the primary ectoderm have a bearing on the origin of germ cells. The primordial germ cells are first seen in the yolk-sac, which is derived from the primary endoderm, but there is doubt about whether this is their site of origin. Evidence from injection chimaeras suggests that they do not originate in the yolk-sac, and Gardner & Rossant (1976) come to the following conclusion: ‘Hence, it would appear that primordial germ cells originate from the embryonic ectoderm rather than from the extra-embryonic endoderm, and that they secondarily migrate into the latter.’ Our evidence supports this conclusion for the following reason. If our description of the sampling that results from the differentiation of primary ectoderm from primary endoderm is correct, then the cell proportions in the primary endoderm must be the complement of the cell proportions in the primary ectoderm, at least roughly. For example, if the inner cell mass contains equal numbers of two cell types and half its cells become primary ectoderm, then if the sampling results in the primary ectoderm having 60 % of white cells the primary endoderm must have 40 % of white cells. Consequently any extra-embryonic tissue derived from the primary endoderm should be negatively correlated in respect of cell proportions with tissues in the embryo proper. If primordial germ cells originate in the yolk-sac, then the gametic output of chimaeras should be negatively correlated with the somatic cell proportions. In fact the correlation is positive. This is clearly seen in the data of Ford *et al*. (1975). Ten of the overt chimaeras made in this department provided data (I. K. Gauld, unpublished). The correlation between their gametic output and coat pigmentation was +0·70 (*P* < 0·05). Thus, if our description of the sampling event is right, the primordial germ cells must originate from the primary ectoderm, and not from the yolk-sac.
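The complementarity argument is a simple conservation of cell numbers, which the following sketch spells out. It assumes that the inner cell mass is partitioned entirely into primary ectoderm and primary endoderm with no cell loss; the function name and argument names are ours.

```python
def endoderm_proportion(a, b, p_ectoderm):
    """White-cell proportion left in the primary endoderm.

    a: proportion of white cells in the inner cell mass
    b: proportion of inner-cell-mass cells that become primary ectoderm
    Conservation of white cells: a = b*p_ectoderm + (1 - b)*p_endoderm.
    """
    return (a - b * p_ectoderm) / (1.0 - b)

# The example in the text: equal numbers of the two cell types,
# half the cells sampled, 60 % white in the primary ectoderm.
print(endoderm_proportion(0.5, 0.5, 0.6))

# A higher proportion in the ectoderm always leaves a lower one
# in the endoderm, hence the expected negative correlation.
for p_ect in (0.3, 0.5, 0.7):
    print(p_ect, round(endoderm_proportion(0.5, 0.5, p_ect), 3))
```

The first call returns 0·4 to within rounding, the 40 % of the worked example.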

## APPENDIX I

The inner cell mass is assumed to be spherical, with the proportion of white cells being *a* and the proportion sampled to form the primary ectoderm being *b*. Let us also assume that there are many cells in the inner cell mass, so that *p*, the proportion of white cells in the primary ectoderm, can take all real values from 0 to 1. A correction due to discreteness will be made later.

Let *α* and *β*, measured in radians, be defined as in Fig. 5 and let the radius of the sphere be 1. First let us consider *a* = 0·5, *b* ⩽ 0·5, as is illustrated in Fig. 5. Then by using cylindrical polar coordinates (the *Z*-axis being perpendicular to the plane of differentiation), it can be proved that *V*, the volume cut off by the plane of aggregation, the plane of differentiation and the surface of the sphere, is given by

[equation (A 1), with its accompanying definitions]

Thus for any particular value of *β*,

[equation (A 2)]

When *b* = 0·5, (A 1) and (A 2) simplify to give

[equation (A 3)]

By putting *β* = 0 in the two expressions for *V* and equating them, we find that

[equation (A 4)]

Thus given any value for *b*, cos *α*, and thus *α*, can be found by using the Newton-Raphson iterative method.

*β* can have any value between −π/2 and +π/2 with equal likelihood. Thus *β* is a uniform random variable on (−π/2, π/2) and its probability density function, *f*(*β*), is given by

$$f(\beta) = \frac{1}{\pi}, \qquad -\frac{\pi}{2} \le \beta \le \frac{\pi}{2}. \tag{A 5}$$

If *α* ⩽ *β* ⩽ π/2, then *p* = 0, while if −*α* ⩾ *β* ⩾ −π/2, *p* = 1. For all other values of *β*, (A 2) defines a unique value of *p* lying in (0, 1). From (A 5), we can see that

$$\operatorname{Prob}(p = 0) = \operatorname{Prob}(p = 1) = \frac{1}{\pi}\left(\frac{\pi}{2} - \alpha\right) = \frac{1}{2} - \frac{\alpha}{\pi}. \tag{A 6}$$

To find Prob (*p′* ⩽ *p* ⩽ *p″*), the values *β′* and *β″* need to be found, where *β′* and *β″* are those values of *β* which on substitution in equation (A 2) give *p* = *p′* and *p″* respectively. They can be found by use of the Newton-Raphson iterative technique applied to equations (A 1) and (A 2). Then

$$\operatorname{Prob}(p' \le p \le p'') = \frac{|\beta' - \beta''|}{\pi}, \tag{A 7}$$

*p′* and *p″* being the limits of the intervals. The probabilities of the end intervals (e.g. (0, 0·05)) must be increased by the addition of the discrete probabilities given by (A 6).
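The Newton-Raphson step and the resulting pure-class probability can be sketched as follows. Fig. 5 is not available here, so the sketch rests on an assumption of ours: that *α* is the half-angle subtended at the centre of the sphere by the cap which the plane of differentiation cuts off, so that the cap's share of the volume of a unit sphere is *b* = (2 − 3 cos *α* + cos³ *α*)/4. This cubic is a standard spherical-cap result and is not claimed to be the paper's own equation relating *b* and cos *α*. Function names are ours.

```python
import math

def cos_alpha(b, tol=1e-12, max_iter=200):
    """Solve (2 - 3c + c**3)/4 = b for c = cos(alpha) by Newton-Raphson.

    Assumes 0 < b <= 0.5 and that alpha is the half-angle of the cap
    (an assumption of this sketch, not taken from the source).
    """
    if not 0.0 < b <= 0.5:
        raise ValueError("b must lie in (0, 0.5]")
    c = 0.0  # start with the cutting plane through the centre
    for _ in range(max_iter):
        f = c**3 - 3.0 * c + 2.0 - 4.0 * b
        if abs(f) < tol:
            break
        c -= f / (3.0 * c**2 - 3.0)  # Newton step
    return c

def pure_class_probability(b):
    """Prob(p = 0) = Prob(p = 1) = 1/2 - alpha/pi for beta uniform
    on (-pi/2, pi/2), before any correction for discreteness."""
    alpha = math.acos(cos_alpha(b))
    return 0.5 - alpha / math.pi

for b in (0.1, 0.25, 0.5):
    print(b, round(cos_alpha(b), 4), round(pure_class_probability(b), 4))
```

Under this reading, no pure classes arise from the continuous model when *b* = 0·5, and they become more frequent as the sampled fraction *b* falls.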

For each circumstance, the required probabilities (cf. (A 7)) can be found from the solution of a case with parameters *a′* and *b′* where *a′* = 0·5 and *b′* ⩽ 0·5. Let this second case give solution , then:

As with the simplest case, corrections can be made for discreteness when looking at pure classes.

## APPENDIX II

Let us assume that, originally, cells were randomly distributed along a conceptually infinite line with the proportion of white cells being *p*_{0}. Let each original cell be replaced by *k* identical clonal descendants. A run of *n* consecutive cells is now sampled.

Suppose first that *n* ⩽ *k*. Then the probability of choosing all *n* from one clone = (*k* − *n* + 1)/*k*, as the first sampled cell can be the 1st, 2nd, …, *k*th member of the clone with equal probability. Otherwise *i* cells are chosen from one clone and (*n* − *i*) from a neighbouring clone (where *i* is an integer in the range (1, *n* − 1), with probability 1/*k* for each value of *i*). Let *X* be the number of white cells in the sample of size *n*. Then

$$\operatorname{Var}(X) = \frac{k-n+1}{k}\,V_n + \frac{1}{k}\sum_{i=1}^{n-1}\left(V_i + V_{n-i}\right), \tag{A 8}$$

where *V*_{i} = the variance of the number of white cells in a sample of size *i* which is all white with probability *p*_{0} and all black with probability (1 − *p*_{0}). The variances can be added as consecutive clones are independent of each other. *V*_{i} = *i*^{2}*p*_{0}(1 − *p*_{0}). Thus (A 8) gives us that

$$\operatorname{Var}(X) = \frac{n\,(3kn - n^{2} + 1)}{3k}\,p_0(1-p_0). \tag{A 9}$$

Now suppose that *n* ⩾ *k*. Then

$$\operatorname{Var}(X) = \frac{3kn - k^{2} + 1}{3}\,p_0(1-p_0). \tag{A 10}$$

This can be proved by induction as follows. Let Var (*X*/*n*) be the variance of the number of white cells from a sample of size *n*. Let us sample *n* cells and assume the variance is given by (A 10). Let us look at the next cell. It can be the 1st, 2nd, 3rd, …, *k*th member of a clone with equal probability. If it were the 2nd member, say, then to get Var (*X*/*n* + 1) from Var (*X*/*n*), we would need to add *V*_{2} and subtract *V*_{1}. Thus averaging over all possibilities, and taking *V*_{0} = 0,

$$\operatorname{Var}(X/n+1) = \operatorname{Var}(X/n) + \frac{1}{k}\sum_{j=1}^{k}\left(V_j - V_{j-1}\right) = \operatorname{Var}(X/n) + k\,p_0(1-p_0).$$

Thus

$$\operatorname{Var}(X/n+1) = \frac{3k(n+1) - k^{2} + 1}{3}\,p_0(1-p_0).$$

Thus if (A 10) is true for *n*, it is true for *n* + 1. (A 10) is true for *n* = *k* by comparison with (A 9), thus by induction (A 10) is true for all *n* ⩾ *k*.
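The variance of *X* under this clone model can be checked by direct enumeration. In the sketch below, `var_brute_force` averages over the *k* equally likely starting positions and over the colours of every clone touched by the sample. `var_closed_form` encodes the expressions that the argument of this appendix leads to, namely *n*(3*kn* − *n*² + 1)*p*_{0}(1 − *p*_{0})/(3*k*) for *n* ⩽ *k* and (3*kn* − *k*² + 1)*p*_{0}(1 − *p*_{0})/3 for *n* ⩾ *k*. These forms are our own working from the description and should be treated as an assumption of the sketch; the function names are ours.

```python
from fractions import Fraction
from itertools import product

def var_brute_force(n, k, p0):
    """Exact Var(X) for n consecutive cells on a line of clones of size k.

    Averages over the k equally likely start positions within a clone and
    enumerates the colours (white with probability p0) of every clone
    that the sample touches.
    """
    mean = Fraction(0)
    mean_sq = Fraction(0)
    for start in range(k):
        # Number of sampled cells falling in each successive clone.
        counts = {}
        for cell in range(start, start + n):
            counts[cell // k] = counts.get(cell // k, 0) + 1
        sizes = list(counts.values())
        for colours in product((0, 1), repeat=len(sizes)):
            prob = Fraction(1, k)
            x = 0
            for size, white in zip(sizes, colours):
                prob *= p0 if white else 1 - p0
                x += size * white
            mean += prob * x
            mean_sq += prob * x * x
    return mean_sq - mean**2

def var_closed_form(n, k, p0):
    """Closed forms obtained from the argument in the text (our reading)."""
    q = p0 * (1 - p0)
    if n <= k:
        return Fraction(n * (3 * k * n - n * n + 1), 3 * k) * q
    return Fraction(3 * k * n - k * k + 1, 3) * q

p0 = Fraction(1, 2)
print(var_brute_force(3, 2, p0), var_closed_form(3, 2, p0))
```

With *k* = 1 the closed form reduces to the binomial variance *np*_{0}(1 − *p*_{0}), and for larger *k* the variance grows roughly in proportion to the clone size, which is why progenitor numbers estimated on the supposition of binomial sampling depend on knowing *k*.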

## APPENDIX III

*n* melanocytes are sampled from a large number (a conceptually infinite number) in which the proportion, *p*, of white cells takes any value between 0 and 1 with equal likelihood but is not zero or 1, i.e. using the flat distribution discussed previously. Thus *p* is a uniform random variable on (0, 1) and has a probability density function, *f*(*p*), given by

$$f(p) = 1, \qquad 0 < p < 1.$$

## ACKNOWLEDGEMENTS

We are very grateful to Dr B. M. Cattanach, Dr Mary F. Lyon, Dr Anne McLaren and Mr I. K. Gauld for providing us with unpublished data, and to Mr E. D. Roberts for drawing the figures.

P. J. Avery is grateful to the Science Research Council for financial support.

## REFERENCES

*Differentiation*

*J. Embryol. exp. Morph.*

*Genet. Res.*

*Proc. R. Soc.*B

*Embryogenesis in Mammals. Ciba Fdn Symp.*

*J. Embryol. exp. Morph.*

*Proc. R. Soc.*B

*Embryogenesis in Mammals. Ciba Fdn Symp.*

*Proc. natn. Acad. Sci., U.S.A.*

*J. exp. Zool.*

*Devl Biol.*

*Nature, Lond.*

*J. Embryol. exp. Morph.*

*J. theor. Biol.*

*J. Embryol. exp. Morph.*

*J. theor. Biol.*