## ABSTRACT

A method is described for recording and analysing the projected shape of mouse vertebrae. The image of the shape is captured by a television camera, cleaned, digitized and subjected to mathematical analysis. A visual representation is obtained by reconstructing the shape in polar coordinates about its centre of area. Further statistical analysis of the whole shape is performed after a Fourier transform. This allows the shape to be represented by and reconstructed from 15 numbers. The method does not rely on homologous points or expert opinion and allows mean shapes to be constructed. It correctly classified 92 % of the test data (T1 and T2 vertebrae from two strains of mice).

## INTRODUCTION

A common problem facing the morphologist is the comparison of the shapes of complex biological structures in a manner that allows full account to be taken of natural variation. The traditional answer to this problem has been to derive simple quantitative data which are suitable for univariate or multivariate statistical analysis. Traditionally these data have been in the form of linear and angular measurements taken between defined homologous points, or ratios of such measurements. In addition to the problems inherent in defining homologous points, this approach suffers from the additional disadvantages that it ignores the frequently large intervening regions and that it produces measurements which may be disconnected from each other. In consequence so much information is lost that the original shape, or even an approximation to it, cannot be reconstructed from the data.

This paper describes a system which retains as much of the information present in a shape as possible for mathematical analysis and does not depend on homologous points. The description falls into two parts, first the process of image capture and storage and secondly a discussion of the possible methods of analysis of the data so produced.

## MATERIALS AND METHODS

### 1. Image capture, processing and storage

Our first attempts to capture outlines of bones relied on a digitizing pad. The outline of a shape produced via a camera lucida or from a photographic print was digitized by tracing around it with a cursor. The accurate tracing of an outline proved to be excessively slow and laborious while the transfer of an outline to a tracing and thence to a computer multiplied errors.

We found that the process of image capture could be speeded appreciably by the use of a simple video camera interface (VCI, Educational Electronics, 30 Lake St, Leighton Buzzard, Beds LU7 8RX, England). This low-cost unit (less than £200) allows a standard video signal to be digitized by a microcomputer.

As examples of biological shapes we have chosen the anteroposterior projections of the first and second thoracic vertebrae (T1 and T2 respectively) of two strains of mice: (1) the multiple recessive strain (REC), which is homozygous for the genes short ear (*se*), vestigial tail (*vt*), non-agouti (*a*), brown (*b*), dilute (*d*), pink eye (*p*), chinchilla (*c*^{ch}) and waved-2 (*wa-2*); (2) the F_{1} (DOM) of two inbred strains, C57BL and C3H, which carries dominant alleles at all these loci. The papain-digested skeletons used are part of the material of Grüneberg & McLaren (1972) and were loaned by the British Museum (Natural History).

The mouse vertebra is placed on the illuminated base of a Wild M5a dissecting microscope (Fig. 1) and back lit. The microscope carries a standard C mount and is equipped with a black and white video camera. The image of the bone, suitably magnified to fill the screen and reversed black/white (for the convenience of the operator) appears on a monitor screen and is also fed to the VCI, which is in turn connected to a BBC (Acorn) microcomputer.

Once the synchronization and gain controls of the interface have been set to match the camera an image can be captured and displayed on the computer screen (Fig. 2). This is achieved by repeated sampling of the video signal and takes about four seconds. The BBC microcomputer works in various display modes. The software supplied with the VCI allows an image consisting of 160 × 256 pixels (in 4 shades) to be digitized.

Once the image has been captured the standard software allows it to be dumped to printer or to disc. We have added further programs which allow additional manipulations:

#### (i) Clean up

This program converts the four shades of the original image to two (i.e. produces a black and white image) and sharpens the edge by an averaging method similar to that used in many computer-enhancement programs. The screen memory is searched for a shade change and when one is detected the new shade is compared with that of its immediate neighbours and reset to conform to the majority. As well as cleaning the outline this program (which takes 45 secs) removes the image of dust from the background. We found accidentally that the image of a human hair laid across the vertebra was also removed.
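The clean-up step can be illustrated with a minimal Python sketch. It is not the authors' BBC microcomputer program: the function name `clean_up`, the threshold value and the use of the eight surrounding pixels as "immediate neighbours" are all assumptions, and for simplicity every pixel is checked rather than only those at a detected shade change.

```python
def clean_up(image, threshold=2):
    """Reduce a 4-shade image (values 0..3) to two shades, then majority-smooth.

    `image` is a list of rows; `threshold` (an assumed value) separates
    background (0) from object (1).
    """
    rows, cols = len(image), len(image[0])
    # Step 1: collapse four shades to two.
    binary = [[1 if px >= threshold else 0 for px in row] for row in image]
    # Step 2: reset each pixel to the majority shade of its neighbours.
    cleaned = [row[:] for row in binary]
    for y in range(rows):
        for x in range(cols):
            neighbours = [
                binary[j][i]
                for j in range(max(0, y - 1), min(rows, y + 2))
                for i in range(max(0, x - 1), min(cols, x + 2))
                if (j, i) != (y, x)
            ]
            ones = sum(neighbours)
            if ones * 2 > len(neighbours):
                cleaned[y][x] = 1
            elif ones * 2 < len(neighbours):
                cleaned[y][x] = 0
            # A tie leaves the pixel unchanged.
    return cleaned
```

A single isolated pixel (a dust speck) has no like-shaded neighbours and is reset to background, while pixels well inside the bone image are untouched. Thin features such as a hair are removed for the same reason.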

#### (ii) Digitizing

This program finds the edge of a cleaned up image by sampling the screen diagonally. A vertebra is made up of two outlines, an outer representing the edge of the bone and an inner representing the border of the neural canal. The program allows the two outlines to be digitized automatically from one object by default. A ‘wandering probe’ starts at the bottom left corner of the screen and runs diagonally sampling locations 1,1; 2,2; 3,3 etc. until it locates a shade change at the edge of the shape. It then follows the edge of the outline until it has returned to the start point. A second iteration of this program starts at the centre of the screen by default (but can be preset anywhere). If the preset is within a foramen the outline of the latter is found and digitized as before. The outer outline is digitized clockwise and the inner anticlockwise to aid later identification (Fig. 3). More foramina can be identified by further defaults, or the ‘wandering probe’ which is visible on screen can be set by means of a joystick (50 secs for two outlines).
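The diagonal probe and edge following can be sketched as below. This is an illustrative Python version using Moore-neighbour boundary tracing, a standard technique swapped in here because the original edge-following rule is not given. The names `diagonal_probe` and `trace_outline` are hypothetical, the image is assumed to be a cleaned two-shade grid indexed as `binary[y][x]` with y increasing upwards, and the simple "back at the start pixel" stopping rule can end early on outlines that pass through the start pixel more than once.

```python
# Moore neighbourhood listed clockwise (y increasing upwards), starting at west.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def is_object(binary, x, y):
    """True if (x, y) lies inside the grid and belongs to the shape."""
    return 0 <= y < len(binary) and 0 <= x < len(binary[0]) and binary[y][x] == 1


def diagonal_probe(binary):
    """Sample locations (0,0), (1,1), (2,2) ... until the shape is met."""
    for k in range(min(len(binary), len(binary[0]))):
        if binary[k][k] == 1:
            return (k, k)
    return None


def trace_outline(binary):
    """Follow the edge clockwise from the probe's entry point back to it."""
    start = diagonal_probe(binary)
    if start is None:
        return []
    outline = [start]
    current = start
    # The background location the probe sampled just before hitting the shape.
    backtrack = (start[0] - 1, start[1] - 1)
    while True:
        idx = RING.index((backtrack[0] - current[0], backtrack[1] - current[1]))
        for step in range(1, 9):
            dx, dy = RING[(idx + step) % 8]
            if is_object(binary, current[0] + dx, current[1] + dy):
                # Remember the last background neighbour examined.
                px, py = RING[(idx + step - 1) % 8]
                backtrack = (current[0] + px, current[1] + py)
                current = (current[0] + dx, current[1] + dy)
                break
        else:
            return outline  # isolated pixel: no neighbour belongs to the shape
        if current == start:
            return outline
        outline.append(current)
```

The result is a list of (x, y) pairs, the same form of Cartesian coordinate data that is stored for later analysis. An inner outline would need a start point inside the foramen and a test for background rather than object pixels.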

The data are stored on a disc as pairs of Cartesian coordinates then transferred to the mainframe computer for further analysis.

### 2. Mainframe manipulation of the data

### Superimposition of outlines and fitting

In order to achieve a mean outline from a series of individual outlines the latter must be superimposed. We do this in three stages. The outlines are first scaled to a standard area. The centre of area (centroid) of each is then found by integration and the shape re-expressed as 128 polar coordinates centred on this point (128 is a power of two, and a power of two is needed for the fast Fourier transform, see below). This process allows superimposition of outlines upon their centroids.
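The scaling, centroid and polar re-expression steps can be sketched in Python as follows. This is a sketch under stated assumptions, not the mainframe program: the outline is treated as a closed polygon, area and centroid come from the shoelace formula rather than the authors' integration routine, the centroid is assumed to lie inside the outline, and where a ray crosses the outline more than once the outermost crossing is taken. The function names are hypothetical.

```python
import math


def polygon_area_centroid(points):
    """Signed area and centroid of a closed polygon (shoelace formula)."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return area2 / 2.0, (cx / (3.0 * area2), cy / (3.0 * area2))


def to_polar(points, n_radii=128, standard_area=1.0):
    """Scale to a standard area and return radii at equal angles from the centroid."""
    area, (cx, cy) = polygon_area_centroid(points)
    scale = math.sqrt(standard_area / abs(area))
    pts = [((x - cx) * scale, (y - cy) * scale) for x, y in points]
    edges = list(zip(pts, pts[1:] + pts[:1]))
    radii = []
    for k in range(n_radii):
        theta = 2.0 * math.pi * k / n_radii
        dx, dy = math.cos(theta), math.sin(theta)
        hits = []
        for (px, py), (qx, qy) in edges:
            ex, ey = qx - px, qy - py
            denom = dx * ey - dy * ex
            if abs(denom) < 1e-12:
                continue  # ray is parallel to this edge
            r = (px * ey - py * ex) / denom  # distance along the ray
            s = (px * dy - py * dx) / denom  # position along the edge, 0..1
            if r > 0 and -1e-9 <= s <= 1 + 1e-9:
                hits.append(r)
        radii.append(max(hits))  # outermost crossing (an assumption)
    return radii
```

For a square of any size the function returns a radius of 0.5 along the axes and about 0.707 along the diagonals, since the square is first rescaled to unit area.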

where r_{n}, R_{n} are corresponding polar coordinates on two shapes, N = number of polar coordinates used and n is large.

Standard shapes were chosen by trial and error according to symmetry. A circle is obviously inappropriate because the residual area is equal in all orientations, but a semicircle is a possibility. In practice we used a simple polygon derived from a vertebral outline. Fig. 4 illustrates the process of fitting. The reference axis was taken as the midline of the standard shape and the start point for Fourier analysis (see below) as the point at which the ventral side of the outer outline of the vertebra crosses this axis.
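Fitting to a standard shape can be sketched as a search over rotations for the smallest residual. The residual used below, (π/N) Σ |r_n² − R_n²|, is an assumed form: it approximates the area enclosed between two outlines sharing a centroid and is consistent with the symbols r_{n}, R_{n} and N defined above, but the text does not confirm it. Rotation is restricted to whole steps of one polar coordinate, and the function name is hypothetical.

```python
import math


def best_rotation(radii, standard):
    """Rotate `radii` in whole steps to minimise the residual against `standard`.

    Returns (shift, rotated_radii, residual). The residual is an assumed
    approximation to the area between the two outlines.
    """
    n = len(radii)

    def residual(shift):
        return math.pi / n * sum(
            abs(radii[(i + shift) % n] ** 2 - standard[i] ** 2) for i in range(n)
        )

    shift = min(range(n), key=residual)
    rotated = radii[shift:] + radii[:shift]
    return shift, rotated, residual(shift)
```

A circular standard would give the same residual at every shift, which is why such a shape cannot fix the orientation.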

### Errors

Errors in the procedure described above may arise from two sources, videodigitization and manipulation of data. To ascertain the size of these errors a test of technique was performed. This consisted of sequentially capturing 20 images of the same bone which was removed from the microscope stage and replaced in a different position and orientation after each capture. The 20 sets of Cartesian coordinates so generated were passed to the Amdahl mainframe and fitted to a standard shape as described above.

### Further manipulation

Once all outlines in a group have been orientated with respect to a standard in this way the mean value of each polar coordinate is calculated and a mean outline generated (Fig. 5).
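As a minimal sketch, the mean outline and its standard error at each polar coordinate can be computed as below, assuming each outline is already a list of radii of equal length in the fitted orientation. The function name is hypothetical.

```python
import math
import statistics


def mean_outline(outlines):
    """Mean and standard error of the mean at each polar coordinate."""
    n_shapes = len(outlines)
    means, sems = [], []
    for radii in zip(*outlines):  # all shapes' values at one polar angle
        means.append(statistics.mean(radii))
        sems.append(statistics.stdev(radii) / math.sqrt(n_shapes))
    return means, sems
```

Plotting `means[i] ± k * sems[i]` against angle gives the kind of band drawn around a mean outline.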

### Comparison of group means

Group mean outlines can now be plotted on top of each other. The significance of the difference at any polar radius can be estimated by a *t* test, or an estimate of total shape similarity can be made.

### Fitting an unknown outline to group means

A single individual of unknown provenance can be assessed for fit to any number of group means. A bone having a good fit (low total sum of differences of squares) with one group and a high total sum of differences of squares with another is likely to be a member of the first group.
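The assignment of an unknown outline to the best-fitting group mean might look like the sketch below. The score is one literal reading of "total sum of differences of squares", namely Σ |r_n² − M_n²| over the polar coordinates; this reading is an assumption, and a sum of squared differences could be substituted in `fit_score` without changing the rest. Function and group names are illustrative.

```python
def fit_score(radii, group_mean):
    """Assumed fit measure: sum over polar coordinates of |r^2 - M^2|."""
    return sum(abs(r * r - m * m) for r, m in zip(radii, group_mean))


def best_group(radii, group_means):
    """Return the name of the group mean with the lowest score, plus all scores."""
    scores = {name: fit_score(radii, mean) for name, mean in group_means.items()}
    return min(scores, key=scores.get), scores
```

Returning all scores, not just the winner, makes it possible to see cases where an outline fits two group means almost equally well.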

### Fourier transforms

where a_{0} is a constant, a_{1} –a_{n} are known as cosine components, b_{1} –b_{n} are known as sine components, and F(θ) is the magnitude of a polar radius r.

where R_{0} –R_{n} are known as amplitude components and ϕ_{1} – ϕ_{n} are known as phase lag components. The amplitude/phase lag notation has the advantage over sine/cosine notation that the amplitude coefficients are independent of the start point of the waveform. It does not, however, allow shape reconstruction.
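For reference, the standard Fourier series forms consistent with the symbols defined above can be written as follows. The summation index k and the conversion formulas in the last line are assumptions of this sketch, not the authors' notation.

```latex
% Sine/cosine form
F(\theta) = a_0 + \sum_{k=1}^{n} \left( a_k \cos k\theta + b_k \sin k\theta \right)

% Amplitude/phase lag form
F(\theta) = R_0 + \sum_{k=1}^{n} R_k \cos\left( k\theta - \phi_k \right)

% Conversion between the two forms
R_0 = a_0, \qquad R_k = \sqrt{a_k^{2} + b_k^{2}}, \qquad
\phi_k = \operatorname{atan2}(b_k, a_k)
```

Moving the start point by an angle α changes each phase lag by kα but leaves each R_k unchanged, which is the start-point independence referred to above.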

A simple shape is adequately described by the early harmonics of the Fourier series: a more complex one will require more components to describe it accurately. A corollary of this is that early components of the series describe gross features of the shape and later ones fine detail. In practice the ‘fine detail’ may represent noise in the measurement process and may often be discarded without detriment to shape analysis. Since we are dealing with several pairs of components, a multivariate statistical approach, which considers all variates simultaneously, is appropriate. We used the DISCRIM procedure within SAS (Statistical Analysis System; SAS users guide 1982) which calculates the generalized squared distances between each test individual and the calibration groups, and classifies them on the basis of ‘nearest group’.

The first pair of Fourier coefficients (a_{0}, b_{0}) are excluded because the first cosine component is constant (since areas are equalized, Lestrel, 1974) and the first sine component is always zero. The first fifteen pairs of coefficients referred to hereafter are therefore coefficient numbers 2–31 inclusive.
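The calculation of coefficient pairs and the reconstruction of an outline from them can be sketched in Python. The sketch uses direct summation rather than the fast Fourier transform used in the study, which gives the same coefficients more slowly. Function names are hypothetical, and the radii are assumed to be equally spaced in angle from the start point.

```python
import math


def fourier_coefficients(radii, n_pairs=15):
    """Return cosine (a) and sine (b) coefficients, indices 0..n_pairs."""
    n = len(radii)
    a = [sum(radii) / n]  # a_0: the mean radius
    b = [0.0]             # the zeroth sine component is always zero
    for k in range(1, n_pairs + 1):
        a.append(2.0 / n * sum(
            r * math.cos(2.0 * math.pi * k * i / n) for i, r in enumerate(radii)))
        b.append(2.0 / n * sum(
            r * math.sin(2.0 * math.pi * k * i / n) for i, r in enumerate(radii)))
    return a, b


def reconstruct(a, b, n_radii=128):
    """Rebuild polar radii from the coefficient pairs."""
    radii = []
    for i in range(n_radii):
        theta = 2.0 * math.pi * i / n_radii
        radii.append(a[0] + sum(
            a[k] * math.cos(k * theta) + b[k] * math.sin(k * theta)
            for k in range(1, len(a))))
    return radii
```

Passing only the cosine list with the sine coefficients set to zero reconstructs a shape that is symmetrical about the reference axis.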

## RESULTS

### Errors

### Capture

The unit of resolution of the television screen is the illuminated dot, the pixel. If the image of a bone occupies any part of a pixel the latter will be illuminated. This is obviously the source of a small error. With the magnification used (×50 objective plus ×0.3 correcting lens on the microscope, 16″ colour monitor) the image of a test T2 vertebra had a maximum height of 152 mm and width of 172 mm; 1 pixel measured 0.5 mm high × 1.0 mm wide. The maximum error due to this source was thus less than 1 %.

### Computer rounding error

### Vertebral comparisons

Fig. 5 shows the computed mean outer outlines for 22 T1s and 14 T2s from the dominant strain. The computer-generated plot gives a polar reconstruction of vertebral shape (left) and an opened out linear plot (right). The trace represents the mean outline ± 10 standard errors of the mean. The more usual representation of mean ± 2 standard errors plots as a single line.

Fig. 6 shows comparison plots of T1s (upper) and T2s (lower) from dominant and recessive strains. Significant differences (*P* < 0.05) are present in many areas.

Table 1 shows the results of fitting individual T1s to group mean shapes. 44 out of 53 bones (83 %) fitted best to their ‘correct’ group. A similar comparison amongst the T2s gave 32 out of 34 (94 %). Overall 87 % were correctly classified.

The Fourier series will adequately describe a shape in fewer than 128 variables (the number of polar coordinates chosen) and so potentially simplifies the data. Fig. 7 shows reconstructions of a T2 vertebra based on 5–60 sine/cosine coefficient pairs. It can be seen that 15 pairs subjectively appear to describe the shape adequately and further coefficients add little to definition.

Univariate analysis of the first 15 coefficient pairs was undertaken. Examples of bar charts showing the upper and lower 95 % confidence limits and mean ± 2 S.E.M. for each population (group) are reproduced in Fig. 8. All cosine components showed significant differences between group means at the level of *P* < 0.0001 (Table 2), some discriminating between T1 and T2 and some between DOM and REC. Only 4 of 15 sine components were significant at this level. No single coefficient split all 4 groups unequivocally.

An objective test of the decision to analyse only 15 pairs of coefficients is to perform a discriminant function analysis which compares all variates simultaneously. For this a random sample of 10 % of all vertebrae was removed from the data set and an attempt made to classify them with respect to the remainder. This was repeated 10 times using 5, 10, 13, 15, 18 and 20 coefficient pairs. The best classification of this data set (92 % correct) was obtained using 15 sine/cosine pairs (Fig. 9). If fewer or more pairs are used definition suffers; below 15 pairs the shapes are poorly defined, above 15 pairs the effects of sample size and noise intrude. 79 out of 87 (91 %) of shapes were correctly classified using 15 cosine components only.
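The repeated 10 % hold-out test can be sketched as below. This is not the SAS DISCRIM procedure: as a plainly named substitute, each held-out individual is assigned to the group whose mean is nearest in ordinary squared Euclidean distance, whereas DISCRIM uses generalized squared distances that take the covariance of the variates into account. The function name, the seed and the rounding of the sample size are assumptions.

```python
import random


def holdout_accuracy(samples, labels, fraction=0.1, repeats=10, seed=0):
    """Fraction of held-out individuals assigned to their own group.

    `samples` is a list of coefficient vectors, `labels` the matching group names.
    """
    rng = random.Random(seed)
    n = len(samples)
    n_test = max(1, round(n * fraction))
    correct = total = 0
    for _ in range(repeats):
        test_idx = set(rng.sample(range(n), n_test))
        # Group means are calculated from the remainder only.
        groups = {}
        for i in range(n):
            if i not in test_idx:
                groups.setdefault(labels[i], []).append(samples[i])
        means = {
            g: [sum(col) / len(col) for col in zip(*rows)]
            for g, rows in groups.items()
        }
        for i in test_idx:
            nearest = min(
                means,
                key=lambda g: sum((x - m) ** 2 for x, m in zip(samples[i], means[g])),
            )
            correct += nearest == labels[i]
            total += 1
    return correct / total
```

Running the function on vectors truncated to 5, 10, 13, 15, 18 and 20 coefficient pairs would give the kind of comparison summarized in Fig. 9.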

## DISCUSSION

The final shape attained by a bone must be dependent upon a host of factors, both genetical and environmental. The almost universal occurrence of pleiotropy (multiple effects of genes on characters) has led to the hypothesis that total phenotype is acted upon by selection and that it is this which evolves rather than individual characters or genes (Wright, 1968; Cheverud, 1982). Integrated systems are now emphasized in morphogenesis (Waddington, 1957; Leamy, 1977; Riedl, 1978; Lande, 1979; Atchley, Rutledge & Cowley, 1981; Cheverud, 1982; Bonner, 1982). If we view a bone as an integrated system then we must ask how best to measure its total shape.

In the conventional methods of comparing bone shapes homologous points are defined in such a manner as to permit measurements which reflect individual features thought to be of biological significance and which can be taken quickly and consistently. In practice we suspect that the latter consideration often outweighs the former. Thus Festing (1972) chose 13 measurements of the mouse mandible which could be read off ‘as quickly as they could be recorded by an assistant’, Atchley (1983) used eight traits ‘chosen because they are easily measured and the measurements are highly repeatable’ (rat mandible) and Leamy & Atchley (1983) used 19 scapular measurements ‘taken from well defined landmarks to optimise repeatability’. Multivariate analysis will remove correlations between such measurements so that they are mathematically respectable, but their biological significance must remain in doubt.

The technique described here does not rely upon homologous points. No start point for the coordinate stream is specified: the only defined point is the centre of area, which is a relatively neutral property of the shape. The number of polar coordinates generated depends upon the video system used. More than 128 points could, of course, be generated from a video system giving higher resolution. These are easily obtainable, but expensive.

Because the system generates a reconstructed shape rather than a series of numbers we need to think about analysis of results in a different way, deriving functions which relate to the shape as a whole rather than arbitrary measurements within it. Our simple polar plot superimposition allows variation to be taken into account, provides acceptable discrimination between shapes and tells us whether the difference is significant at a particular point or in a particular area. Using polar coordinates homologous points (the tips of the transverse processes, for example) will not necessarily map on the same radius in two compared shapes. The nth polar radius, encompassing the tip of the transverse process in shape A, should not, therefore, be compared blindly with the nth polar radius in shape B which misses it. If the nth radii between two shapes differ, then the shapes differ.

It can be seen from the data of Table 1 that a good fit to the dominant strain does not necessarily indicate a poor fit to the recessive shape and vice versa. This is because the shapes are highly irregular and a bone fitting the dominant shape well in some areas may fit a recessive shape well in others. Because of this, further statistical analysis (such as probability of group membership) using this routine was not attempted. We also suspect that the populations of shapes may overlap in some cases, so that some individuals could be members of either population. Moore & Mintz (1972) however were able to identify coded bones from C3H and C57BL mice with 85–100 % accuracy. The variation in inbred strains would, of course, be lower than in our material.

Each Fourier component, unlike each polar radius, represents a property of the whole outline. The problem of homologies is thus minimized in that each Fourier component is dependent only on the centroid (and in the case of sine/cosine components on the start point which is derived from our fitting routine, not arbitrarily specified by eye). Fifteen Fourier coefficient pairs can classify the shape as well as 128 polar coordinates and fifteen cosine components as well as fifteen sine/cosine pairs. The number 15 is probably a function of the particular shapes used and the variance within the data set: other sets of data describing bones of different shapes might well need more or fewer components to best describe them.

Since the Fourier procedure is a simple transformation of the original data its discrimination cannot exceed that of the original polar coordinates. Fewer than 15 pairs of components will simplify the shape (Fig. 7) and thus inhibit discrimination. The use of more than 15 component pairs cannot increase the discrimination further, but should not diminish it. In fact more pairs reduce discrimination a little: we suggest that this is due to noise, i.e. cumulative errors in the system.

It is a property of the Fourier series that sine components describe axial asymmetry (Zahn & Roskies, 1972; Lestrel, 1974): since the vertebral shapes are essentially symmetrical about their midline they can be reconstructed minus any asymmetry from the first 15 cosine components alone (Fig. 10, cf. Fig. 7).

Fourier components as used in this context are based upon a centroid and thus reflect the disposition of the shape about this point relative to the start point. Use of amplitude coefficients only would remove the start point dependency, but not that on the centroid. Because we are dealing with similar shapes the effects of centroid dependency are minimized and differences in Fourier components reflect differences in shape. The Fourier analysis of an edge-based decomposition of a shape (e.g. the tangent/angle function, Bookstein, 1977; Zahn & Roskies, 1972) would remove centroid dependency also.

We suggest that an ideal system for shape measurement should conform to the following criteria:

1. It should be practical and practicable.

2. It should accurately measure the form or any part of it.

3. It should allow reconstruction of the original shape (i.e. the derived measures should be related to the shape by a determinable function).

4. The data should be suitable for statistical analysis, so that biological variation can be accommodated.

5. Measurements of size should be independent of shape and vice versa.

6. Measurements of shape should, if required, be independent of any necessity to define ‘homologous points’.

The Fourier method described here conforms to the first five of these desiderata: Fourier analysis of a curvature function would conform to all six.

Any new method of measuring shape must offer significant advantages over existing methods. The system described here is of the same order of accuracy as existing methods and, we think, offers considerable advantages. For further comparison of traditional and more modern methods of analysis the reader is referred to Ashton, Flinn, Moore & O’Higgins (in preparation).

The ultimate test of shape measurement must be its ability to ‘recognize’ unknown shapes. The Fourier method described here classified 92 % of the sample of outlines correctly. 15 coefficients describe total shape with a high measure of accuracy, with no reliance upon expert opinion or ‘homologous points’. We suggest that it should now be possible, using suitable material, to partition size and shape allowing the complex interrelations of these properties to be studied in a biological context.

## REFERENCES

*Cranial shape and hominoid classification*

*J. Craniofac. Genet. devl Biol.*

*Evolution, Lawrence, Kans*

*The Measurement of Biological Shape and Shape Change*

*Evolution and Development*

*Evolution, Lawrence, Kans*

*Nature, Lond.*

*Proc. R. Soc. Lond. B*

*Evolution, Lawrence, Kans*

*Evolution, Lawrence, Kans*

*J. Zool. (Lond.)*

*Yearb. Phys. Anthrop.*

*Factors and Mechanisms Influencing Bone Growth*

*Devl Biol.*

*SAS USERS GUIDE: STATISTICS*

*J. Zool. (Lond.)*

*Evolution and the Genetics of Populations 1. Genetic and Biometric Foundations*

*IEEE Trans. Comput*