Delphinoids (Delphinidae, Odontoceti) produce tonal sounds and clicks by forcing pressurized air past phonic lips in the nasal complex. It has been proposed that homologous, hypertrophied nasal structures in the deep-diving sperm whale (Physeter macrocephalus) (Physeteridae, Odontoceti) are dedicated to the production of clicks. However, air volumes in diving mammals are reduced with increasing ambient pressure, which seems likely to influence pneumatic sound production at depth. To study sperm whale sound production at depth, we attached ultrasound time/depth-recording tags to sperm whales by means of a pole and suction cup. We demonstrate that sperm whale click production in terms of output and frequency content is unaffected by hydrostatic reduction in available air volume down to less than 2% of the initial air volume in the nasal complex. We present evidence suggesting that the sound-generating mechanism has a bimodal function, allowing for the production of clicks suited for biosonar and clicks more suited for communication. Shared click features suggest that sound production in sperm whales is based on the same fundamental biomechanics as in smaller odontocetes and that the nasal complexes are therefore not only anatomically but also functionally homologous in generating the initial sound pulse.
Sperm whales (Physeter macrocephalus) are among the largest, yet most elusive, creatures inhabiting deep ocean waters. Adult sperm whales undertake long, deep dives (Watkins et al., 1993) into the darkness and high pressure of the meso- and bathypelagic depths. They do this to locate and catch approximately 1000 kg (Lockyer, 1981) of medium-sized squid and fish (Clarke et al., 1993) each day. The most prominent feature of the sperm whale physique is the large nasal complex (Fig. 1), accounting for up to one-third of the body length of large males (Nishiwaki et al., 1963). The entire forehead is heavily innervated by cranial nerves V and VII (Oelschläger and Kemp, 1998), and the potential level of activity in the muscle complex controlling the forehead is indicated by the highest density of arteries found in any muscle tissue of the sperm whale (Melnikov, 1997).
Norris and Harvey (1972) proposed that the sperm whale nose, homologous with the sound-producing nasal complex of smaller odontocetes (Cranford et al., 1996), is a pneumatic sound generator (Fig. 1). Recent investigations have corroborated some of the basic concepts of the Norris and Harvey theory by showing that clicks are produced in the anterior part of the nasal complex (Ridgway and Carder, 2001) and that sound can be transmitted through the spermaceti compartments (Møhl, 2001). The sperm whale sound generator is believed to be driven by air which, when recycled, allows for continuous sound production throughout a dive (Norris and Harvey, 1972). However, air volumes contained in soft structured tissue (Ridgway et al., 1969) are reduced in proportion to increasing ambient pressure (Boyle's law: PV=C, where P is pressure, V is volume and C is a constant), so the available volume for sound production varies considerably with depth.
Sperm whales are vociferous animals and, unlike most odontocete species that have been investigated, their vocal repertoire is made up solely of clicks. It has been suggested that the so-called usual clicks (Weilgart and Whitehead, 1988) are involved in echolocation (Gordon, 1987), whereas stereotyped patterns of clicks, termed codas (Watkins and Schevill, 1977), are allegedly involved in communication to maintain the complex social structure in female groups (Weilgart and Whitehead, 1993). Recent investigations have demonstrated that sperm whale usual clicks are highly directional and have the highest biologically produced source levels ever recorded (Møhl et al., 2000). Clicks of high sound pressure levels and directionality serve biosonar purposes well (Au, 1993) but seem a poor choice for communication because directionality reduces the communicative space.
Because of the directional properties of sperm whale usual clicks, far-field recordings cannot quantify changes in the acoustic output of the sound generator since scanning movements of a directional source rather than output modulations may be the cause of the observed changes. By placing a calibrated recording unit in a fixed position on a phonating sperm whale, directional and/or hydro-acoustic effects on the recorded signals can be ruled out, and any observed changes will reflect actual changes in the acoustic output of the sound generator. Sound-recording tags have successfully been placed on elephant seals (Fletcher et al., 1996; Burgess et al., 1998) and sperm whales (Malakoff, 2001) to register levels of low-frequency noise impinging on the tagged animal and how the behaviour of the animal is affected. Of interest in the present study are the acoustics and biomechanics of the sperm whale sound generator. To study these, we developed a tag that allows for absolute sound pressure recordings of clicks for 30 min and combination of these data with the real time and depth of the whale.
Here, we report that sperm whales can maintain and regulate acoustic outputs even when they have a very limited volume of air in the nasal complex. We also present evidence to suggest that the sound-generating mechanism has a bimodal function that allows for the production of clicks suited for biosonar and clicks more suited for communication.
Materials and methods
Investigations were carried out in the Bismarck Sea off Papua New Guinea from the research vessel R/V Odyssey in May 2001. The voyage of the R/V Odyssey is a multiyear, collaborative program designed to gather the first-ever coherent set of baseline data on levels of synthetic contaminants throughout the world's oceans and to measure the effects of these substances on ocean life. The voyage is coordinated by The Ocean Alliance/The Whale Conservation Institute. The Bismarck Sea (centre 5°S, 150°E) is an important habitat for sperm whales and other odontocetes. Several mother/calf pairs and sexually mature males have been observed, indicating that the area is a breeding ground for sperm whales (Physeter macrocephalus). In this study, only adult or semi-adult specimens were approached for tagging.
The tag was based on an aluminium housing (diameter 100 mm) with a Syntactic foam tail (MacArtney, Denmark) pressure-tested to a depth of 1100 m. Signals from a custom-built hydrophone were highpass-filtered (-12 dB per octave, cut-off frequency 1 kHz) and relayed, via an adjustable gain/anti-alias filter unit, to a 12-bit ADC (Analog Devices: AD7870) and microcontroller (Maxim Integrated Products, Inc. DS5000T) unit (sampling at 62.5 kHz) writing acoustic, real-time and depth data to a 192 MB Sandisk Compact flash card. The hydrophone was calibrated relative to a B&K 8101 hydrophone in an anechoic tank before and after deployment. Sound recording (bandwidth 30 kHz) was triggered at a depth of 20 m. The depth transducer was a calibrated Keller PA-7-200 transducer providing depth information in the range 0-1500 m with an accuracy of 3 m. The suction cup (diameter 25 cm) was moulded from Wacker silicone (Elastosil M-4440) in a custom-built cast.
Attachment and retrieval
The tag was deployed with a 4.5 m pole from a special boom rigged on the R/V Odyssey; the tag was attached to the whale with a suction cup (Fig. 2). The whales were approached from behind, and the ship drifted the last 30-50 m with the engine turned off to make a silent approach. Four whales were successfully tagged in 45 trials. After detachment, the tag was retrieved by taking a bearing with four-element Yagi antennae (Televilt, Y-4FL) to signals from a Telonic MOD-305, Cast 3C, transmitter integrated in the Syntactic foam tail. A B&K 8101 hydrophone was deployed to record the far-field signatures of the clicks recorded by the tag. Signals from the B&K 8101 hydrophone were recorded on a Sony TCD-D8 DAT recorder. This recording chain had a flat (within 2 dB) frequency response from 0.01 kHz to 22 kHz. From video footage of the tag attachments, it was possible to calculate the size of the whale from the diameter of the attached suction cup (Whitehead and Payne, 1981).
Data were transferred via the Flash card and a PCMCIA slot to a laptop. The anti-alias filter was compensated for during analysis, giving a flat frequency response of the tag in the range 0.1-30 kHz. Analysis was performed with Cool Edit 2000 (Syntrillium) and routines written in Matlab 5.3 (MathWorks). Inter-click intervals (ICI) were derived with a peak detector looking for suprathreshold values of the envelope of the recorded signals. The spectral content of the clicks was described by the end points of the -10 dB bandwidth. Centroid frequency was derived as the frequency dividing the spectrum into halves of equal energy. The duration of a click was defined as the interval between the -10 dB points relative to the peak of the envelope function.
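The click measures defined above can be illustrated with a minimal Python sketch. This is not the Matlab code used in the study: the function names, the dead time that keeps later pulses of the same click from being counted as new clicks, and the toy input values are all assumptions made for illustration.

```python
def centroid_frequency(freqs_hz, power):
    """Frequency that divides the power spectrum into two halves of equal energy."""
    total = sum(power)
    running = 0.0
    for f, p in zip(freqs_hz, power):
        running += p
        if running >= total / 2.0:
            return f
    raise ValueError("empty spectrum")


def bandwidth_minus_10_db(freqs_hz, power):
    """End points of the band whose power lies within 10 dB of the spectral peak."""
    threshold = max(power) / 10.0  # -10 dB corresponds to a factor of 10 in power
    above = [f for f, p in zip(freqs_hz, power) if p >= threshold]
    return above[0], above[-1]


def inter_click_intervals(envelope, sample_rate_hz, threshold, dead_time_s=0.01):
    """Intervals (s) between successive suprathreshold detections in the envelope.

    dead_time_s is a hypothetical parameter: detections closer together than
    this are treated as belonging to the same click.
    """
    dead_samples = int(dead_time_s * sample_rate_hz)
    last = -dead_samples - 1
    onsets = []
    for i, value in enumerate(envelope):
        if value > threshold and i - last > dead_samples:
            onsets.append(i / sample_rate_hz)
            last = i
    return [b - a for a, b in zip(onsets, onsets[1:])]


# Toy spectrum (arbitrary power units), for illustration only.
freqs = [1000, 2000, 3000, 4000, 5000]
power = [0.5, 2.0, 10.0, 2.0, 0.5]
print(centroid_frequency(freqs, power))
print(bandwidth_minus_10_db(freqs, power))
```

The same -10 dB criterion, applied to the envelope in the time domain rather than to the spectrum, would give the click duration as defined above.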
Four whales were tagged in 45 attempts. Here, we present data mainly from the fourth tagging event since that tag gathered acoustic data from an entire dive cycle. Tag IV was placed behind the crest of the skull (see Fig. 1). The whale initiated a deep dive (Fig. 3) 2 min after attachment of the tag. At a depth of 50 m, the whale started to produce codas. After emitting 11 codas during descent to 265 m, the whale switched to the production of usual clicks after 10 s of silence. When the air volumes are pressurized during descent, the volume of air will be reduced in accord with Boyle's law, and the density of the air will increase, whereas its viscosity will remain largely unchanged. When the whale started to produce coda clicks at a depth of 50 m, it would have had less than 20% of its initial air volume; it would have had less than 4% when it switched to producing usual clicks at a depth of 265 m (Boyle's law) (Fig. 3 inset). Of the 1804 usual clicks, 80% were made at a depth of more than 600 m and thus were produced by the whale when it had less than 2% of the initial air volume available to it for sound production. After 23 min of submergence, the whale stopped clicking and remained silent during ascent. Descent rate was 60 m min⁻¹ and ascent rate 75 m min⁻¹ (Fig. 3).
The production of usual clicks is initiated with an ICI of approximately 1 s, but as the whale approaches the depth at which its dive levels off, the ICIs drop to a stable 0.5 s (Fig. 4). During descent, the ICIs decrease by 100-200 ms and subsequently increase almost back to the starting level in 3-4 repeated cycles (Fig. 4). Click trains are interrupted by periods of silence lasting 5-30 s.
Recorded levels of all 1804 usual clicks are plotted in Fig. 5. The recorded levels of the first usual clicks are less than 170 dB re. 1 μPa (peak to peak, pp), and the amplitudes of the following clicks increase to approximately 178 dB re. 1 μPa (pp). The acoustic output is independent of depth within a 20 dB range from 170 to 190 dB re. 1 μPa (pp) (Fig. 5).
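The air-volume percentages quoted above follow directly from Boyle's law. The sketch below reproduces them under the assumption, not stated explicitly in the text, that ambient pressure is 1 atm at the surface and rises by 1 atm for every 10 m of sea water; the function name is illustrative.

```python
def air_fraction_at_depth(depth_m, metres_per_atm=10.0):
    """Fraction of the surface air volume remaining at depth (Boyle's law, PV = C).

    Assumes 1 atm at the surface plus 1 atm per `metres_per_atm` of sea water.
    """
    surface_pressure_atm = 1.0
    pressure_atm = surface_pressure_atm + depth_m / metres_per_atm
    return surface_pressure_atm / pressure_atm


for depth in (50, 265, 600):
    print(f"{depth} m: {100 * air_fraction_at_depth(depth):.1f} % of surface volume")
```

Under these assumptions the remaining fractions are about 16.7%, 3.6% and 1.6% at 50 m, 265 m and 600 m, in line with the limits of 20%, 4% and 2% given in the text.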
As seen from the data presented in Table 1, there are marked differences between the waveforms of usual clicks and coda clicks. The coda clicks (N=54) have a mean recorded level of 165±5 dB re. 1 μPa (pp), which is significantly lower than the mean recorded level of usual clicks (N=1804) of 178±4 dB re. 1 μPa (pp) (P<0.001). Also, the centroid frequency of the coda clicks is 7-9 kHz with a -10 dB bandwidth of 3-4 kHz, compared with a higher and more variable centroid frequency for the usual clicks between 8 and 25 kHz and a -10 dB bandwidth of 10-15 kHz.
The duration of the individual pulses within a click is approximately 100 μs for the initial sound pulse (p0) in usual clicks and approximately 300 μs for p0 in coda clicks. A distinct difference between usual clicks and coda clicks is seen in the decay rate (peak amplitude) between the successive pulses within a click (Fig. 6). It is evident from Fig. 6A that there is a decay rate of the order of 20 dB between p0 and the second pulse (p1), and that no third pulse (p2) can be detected above background noise in usual clicks. The decay rate of usual clicks is largest for the most powerful clicks but independent of depth because both low (15 dB) and high (23 dB) decay rates between p0 and p1 are seen at the deepest part of the dive. In coda clicks, the decay rate is approximately 4-8 dB between p0 and p1 (Fig. 6B) irrespective of the whale's depth.
The far-field signature of the clicks was recorded from the research vessel. The waveforms of usual clicks differed significantly from the tag recordings, with the centroid frequency occurring at lower frequencies. The inter-pulse interval (IPI) denotes the period between two successive pulses within a click (Norris and Harvey, 1972). The IPI of both coda clicks and usual clicks was 3.4 ms irrespective of depth. The centroid frequency of usual clicks is independent of depth because both high and low centroid frequencies are found in clicks during shallow and deeper parts of the dive. There is, however, a positive relationship (r=0.70, P<0.001) between the acoustic output (recorded level) and centroid frequency (Fig. 7).
With a body length of 10 m and an estimated mass of 9800 kg (Lockyer, 1981), the whale tagged with tag IV probably contained some 200 l of air after inhalation while at the surface (inferred from Clarke, 1978). If the lungs of a sperm whale collapse (Ridgway, 1971) as they do in smaller odontocetes (Ridgway et al., 1969), the whale would have had, at most, 3.5 l of air available to it for sound production at a depth of 600 m. Thus, sperm whales must recycle the air after each click or group of clicks (as demonstrated in Tursiops sp.; Dormer, 1979) and/or use very small volumes of air to generate each click. Considering the highly reduced air volume available for sound production when the whale is at a depth of 700 m and that sperm whales have been reported to phonate at depths of more than 2000 m (Whitney, 1968), it is conceivable that air simply is not involved in sperm whale sound production. That view, however, is not supported by experimental data on sound production in the homologous structures of smaller odontocetes (Ridgway and Carder, 1988) or by anatomical evidence (Cranford, 1999). Accordingly, we propose that air is indeed involved in sperm whale click production and that the reduction in air volume may not be significant for click production even at the extreme depths to which sperm whales dive.
The adjustment in ICI with depth during a dive (Fig. 4) may be explained by a longer sonar range at the beginning of the dive and by the fact that the ICI is reduced as the whale approaches sonar targets (e.g. prey or bottom), thereby reducing the two-way travel time of the clicks and the echo (Au, 1993). This adjustment in ICI has also been reported in other sperm whale studies (e.g. Gordon, 1987; M. Wahlberg, manuscript submitted), suggesting that it is an integrated part of sperm whale ecophysiology during feeding dives. However, the sound pressure levels are not reduced accordingly (Fig. 5), indicating that sonar range alone does not dictate the magnitude of the acoustic outputs.
In the near field of what is considered to be 180° off the acoustic axis of the sound generator (Møhl et al., 2000), the mean recorded level of usual clicks is 178±4 dB re. 1 μPa (pp). This is consistent with off-axis levels reported from array recordings of usual clicks made by male sperm whales (Møhl et al., 2000). The recorded levels are within a 20 dB range of 170-190 dB re. 1 μPa (pp) (Fig. 5), and it is feasible that the source levels (the sound pressure at a distance of 1 m on the acoustic axis) are emitted within the same 20 dB dynamic range but that they are some 40 dB higher (Møhl et al., 2000). There is no apparent link between available volumes of air and sound pressure since both high- and low-sound-pressure clicks are produced during the deepest part of the dive (Fig. 5). Thus, sperm whales can regulate the sound pressure levels of their clicks, and it is sonar or feeding demands rather than available air volume that dictate acoustic output levels at these depths.
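To give a sense of scale for the link between ICI and sonar range, the following sketch converts an ICI into the greatest target range whose echo could return before the next click is emitted. Two assumptions are made that are not taken from the data presented here: a sea-water sound speed of 1500 m s⁻¹, and that the whale waits for the echo before emitting the next click. The function name is illustrative.

```python
def max_two_way_range_m(ici_s, sound_speed_m_s=1500.0):
    """Greatest range (m) from which an echo returns within one inter-click interval.

    The click travels to the target and back, hence the division by two.
    The default sound speed is an assumed nominal value for sea water.
    """
    return sound_speed_m_s * ici_s / 2.0


for ici in (1.0, 0.5):
    print(f"ICI {ici} s -> up to {max_two_way_range_m(ici):.0f} m")
```

With these assumptions, the ICIs of approximately 1 s and 0.5 s reported above correspond to maximum ranges of about 750 m and 375 m, respectively.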
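For readers less familiar with decibel arithmetic, the short sketch below shows what the quoted level differences mean in terms of sound pressure. It uses only the standard definition of sound pressure level (20 log10 of a pressure ratio, here relative to 1 μPa); the function names are illustrative and the numbers are taken from the text above.

```python
def db_to_pressure_ratio(delta_db):
    """Pressure ratio corresponding to a level difference in dB."""
    return 10.0 ** (delta_db / 20.0)


def level_to_pascal(level_db_re_1upa):
    """Sound pressure (Pa) for a level given in dB re. 1 uPa."""
    return 1e-6 * 10.0 ** (level_db_re_1upa / 20.0)


print(db_to_pressure_ratio(20))   # the 20 dB range of recorded levels
print(db_to_pressure_ratio(40))   # the suggested offset to on-axis source levels
print(level_to_pascal(178))       # mean recorded level of usual clicks, in Pa (pp)
```

The 20 dB range of recorded levels thus spans a tenfold range in sound pressure, and a 40 dB offset corresponds to a hundredfold difference in pressure.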
Data from Møhl (2001) suggest that the multipulses in sperm whale clicks are the result of a single pulse (p0) being reflected on the air surfaces of the distal and frontal air sacs (Fig. 1). From this, it can be inferred from the decay rate data presented here that the bulk of the energy of the initial pulse, p0, in usual clicks is directed forwards into the water after a single round trip through the spermaceti organ and the junk, and that only a small fraction is intercepted by the distal air sac, giving rise to the low amplitude of p1 shown in Fig. 6A.
As noted above, the recorded levels of p0 in coda clicks are approximately 13 dB lower than those of usual clicks, suggesting that the overall acoustic output in coda clicks is reduced compared with that of usual clicks or that a smaller fraction of the initial energy is directed backwards into the spermaceti organ and consequently towards the recording tag. When generating coda clicks, a large fraction of the returning pulse (p1) from the frontal sac appears to be intercepted by the distal air sac and contained in the nasal complex for further round trips, thereby giving rise to a large number of pulses with small decay rates within each coda click. We propose that these two different ways of handling the initial sound pulse represent a bimodal generation of clicks depending on whether they are intended by the whale for use in biosonar or for communication. In usual clicks, most of the energy is put into a single pulse, directed into the water in front of the whale after traversing the spermaceti complex twice. In coda clicks, the energy is recycled in the nasal complex by multiple reflections that seem to result in less-directional clicks that are better suited for communication. In addition to the inferred low directionality, the narrow-band nature, longer pulse duration and low decay rate of coda clicks may offer useful information about the transmitter to conspecifics. We suggest that the initial pulse of the two click types is generated in the same way and that the marked differences between coda clicks and usual clicks are caused by different sound propagation in the nasal complex. The difference in click structure and the inferred difference in directionality between coda clicks and usual clicks may also explain in part the substantial discrepancy between reports of low directionality in clicks from coda-producing sperm whales (Watkins, 1980) and the high directionality observed in usual clicks from foraging male sperm whales (Møhl et al., 2000).
If the distinct multipulse structure of the coda clicks is generated by repetitive reflections on the air sacs, it may explain why coda clicks are produced in the shallow part of the dive cycle when more than 4% of the initial air volume is still present. It is possible that a certain air volume is needed to maintain the production of coda clicks and that sperm whales are accordingly limited by depth in coda production. However, the fact that the whale switched from the production of coda clicks to usual clicks within 10 s, at a depth of 265 m, suggests that shifts between the two modes of click generation are not determined solely by the available air volume. It is feasible that, during the formation of a usual click, muscle action in the complex muscle/tendon system covering the dorso-lateral part of the spermaceti organ could be changing the conformation of the sound-transmitting structures and the distal air sac, thereby causing most of the energy to be projected forwards into the water after one round trip through the spermaceti complex. On the basis of observations of several other pulsed sound types from sperm whales (Gordon, 1987; Weilgart and Whitehead, 1988), the possibility that the sperm whale sound generator may have modes in addition to the two deduced from this study cannot be excluded.
The far-field signature of the clicks revealed a different waveform and emphasis at lower frequencies compared with the tag recording. The waveform differences between the near field (the tag) and the far field cannot be explained solely by surface reflections and hydrodynamic effects because the decay rate of the usual clicks was lower in the far field than when recorded in the near field from the crest of the skull. It is tempting to suggest that the lower centroid frequency observed in the far field relates to lowpass-filtering of the clicks by frequency-dependent absorption. However, considering the physical limits of the range between the tagged animal and the research vessel during 10-20 min of swimming (1-5 m s⁻¹), frequency-dependent absorption in the relevant frequency range of sperm whale clicks cannot account entirely for the observed changes (Urick, 1983). It appears that the main contributing factor to the waveform and frequency differences is the directional effects of the sperm whale sound generator.
The Gordon equation (Gordon, 1991) describes the relationship between IPI and the size of a whale. With an IPI of 3.4 ms, the Gordon equation predicts a body length of 9.8 m, which matches the visual estimate of 10 m from video recordings of the whale and the tag. Consequently, the data presented here lend weight to the Gordon equation as a reliable acoustic means of measuring the size of sperm whales from their clicks.
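The length estimate can be reproduced with a few lines of Python. The coefficients below are those commonly quoted for the Gordon equation (body length in m, IPI in ms); they are not given in the present text and should be checked against Gordon (1991) before being relied upon. The function name is illustrative.

```python
def gordon_body_length_m(ipi_ms):
    """Body length (m) from the inter-pulse interval (ms).

    Coefficients as commonly quoted for Gordon (1991); treat them as an
    assumption here, since they are not stated in this paper.
    """
    return 4.833 + 1.453 * ipi_ms - 0.001 * ipi_ms ** 2


print(round(gordon_body_length_m(3.4), 1))
```

For an IPI of 3.4 ms this gives approximately 9.8 m, in agreement with the value reported above.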
The inter-pulse interval (IPI) is 3.4 ms in both click types and constant throughout the dive. Clarke (1970) has proposed that the nasal complex of the sperm whale is a buoyancy regulator that facilitates descent and ascent during dives by cooling and heating the spermaceti oil. Assuming a pressure range of 7000 kPa (70 atmospheres) (0-700 m depth) and a temperature difference of 22-37°C, it can be calculated that the sound speed would differ by 7% between the start and the deepest point of a dive (on the basis of data from Goold et al., 1996). In a sperm whale with an estimated two-way sound travel path of 4.7 m (Fig. 1), such differences in sound speed would change the IPI by more than 200 μs during a dive to 700 m. We did not observe IPI fluctuations of that order of magnitude, so the theory (Clarke, 1970) proposing that ascent and descent of sperm whales are assisted by changes in buoyancy of the head due to heating and cooling of the spermaceti oil is not supported.
The centroid frequencies of the usual clicks vary between 8 and 26 kHz. These values are consistent with previous reports on the frequency content of sperm whale clicks (Watkins, 1980; Madsen and Møhl, 2000). It is, however, surprising that centroid frequencies above 10 kHz can be found in usual clicks recorded from what is believed to be 180° off the acoustic axis of the sound generator (Møhl et al., 2000). It can be conjectured that the high centroid frequencies recorded from the crest of the skull are due to near-field phenomena and the peculiar sound transmission in the sperm whale nasal complex, where the bulk of the initial pulse is directed backwards into the spermaceti organ by the distal sac and anatomy of the monkey lips. This problem calls for further investigations.
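The expected IPI shift can be checked with a simple calculation, sketched here under the assumption that the IPI equals the two-way travel path divided by the sound speed in the spermaceti oil. The reference sound speed is not measured; it is back-calculated from the 4.7 m path and the observed 3.4 ms IPI, and the names are illustrative.

```python
PATH_M = 4.7            # estimated two-way sound travel path (from the text)
IPI_OBSERVED_MS = 3.4   # observed inter-pulse interval (from the text)

# Sound speed implied by the two values above (an inference, not a measurement).
C_REF_M_S = PATH_M / (IPI_OBSERVED_MS / 1000.0)


def ipi_shift_us(fractional_speed_change):
    """Change in IPI (microseconds) if the sound speed changes by the given fraction."""
    new_ipi_ms = 1000.0 * PATH_M / (C_REF_M_S * (1.0 + fractional_speed_change))
    return 1000.0 * (new_ipi_ms - IPI_OBSERVED_MS)


print(f"implied sound speed: {C_REF_M_S:.0f} m/s")
print(f"+7% sound speed: {ipi_shift_us(0.07):.0f} us")
print(f"-7% sound speed: {ipi_shift_us(-0.07):.0f} us")
```

A 7% change in sound speed in either direction shifts the IPI by more than 200 μs under these assumptions, which is the magnitude that was not observed in the recordings.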
There are no apparent correlations between the spectrum of the usual clicks and the whale's depth because both high and low centroid frequencies were recorded from clicks at the deepest part of the dive. This contrasts with investigations on white whale (Delphinapterus leucas) whistles at depth (Ridgway et al., 2001). Ridgway and co-workers found that the peak frequency of whistle spectra increased with depth and proposed that this effect is the result of increased air density and a reduction in total air volume at depth. The absence of a similar effect in sperm whale clicks emphasises, in our view, the difference in how clicks and whistles are generated in the odontocete nasal complex.
When centroid frequency is plotted against recorded sound level (Fig. 7), it appears that there is a positive correlation between acoustic output and frequency. This correlation should not be confused with the fact that the on-axis parts of the clicks contain more high-frequency components than the off-axis parts (Møhl et al., 2000). Investigations on smaller odontocetes have revealed a positive correlation between acoustic output and centroid frequency in clicks from D. leucas, P. crassidens and Tursiops truncatus (Au, 2001). That a similar relationship has been found in the present study supports the conclusion that sound production in sperm whales is based on the same fundamental biomechanics as in smaller odontocetes and that the nasal complexes are, therefore, not only anatomically (Cranford, 1999) but also functionally homologous in generating the initial sound pulse.
In conclusion, sperm whale click production in terms of output and frequency content is unaffected by hydrostatic reductions in available air volume down to depths of at least 700 m. Evidence is presented to suggest that the sound-generating mechanism has a bimodal function, allowing for the production of clicks suited for biosonar and clicks more suited for communication. Shared click features suggest that sound production in sperm whales is based on the same fundamental biomechanics as in smaller odontocetes. This project has shown that it is possible to gather information about the physiology and biomechanics of sound production from free-ranging animals not suited for study in captivity. Together with other approaches, the development of this technique can provide further insight into the mechanics of the largest biological sound generator, the sperm whale nose, and may prove to be heuristic in the development of biomimetic sound sources in man-made sonars.
We thank M. Bjørn, P. T. Sørensen, M. F. Christoffersen and B. K. Nielsen for their help and engineering skills during the development of the tag. Drs R. Baird, S. Hooker and H. Whitehead gave valuable advice. Special thanks go to first mate J. Jones for his professionalism and persistence during tagging. We wish to thank the entire crew/staff of the R/V Odyssey/Ocean Alliance: Captain R. Olsen, C. Johnson, G. Johnson, J. Cavanaugh, A. Furst, Dr Celine Godard and K. Marshall-Tilas for their help during this project. We thank the Papua New Guinean authorities for their collaboration and assistance. Earlier versions of this manuscript benefited from comments by B. Dahl, P. Frederiksen, J. P. Lomholt, B. K. Nielsen, J. Tougaard and two anonymous referees. The Novo Nordisk Science Foundation funded the development of the tag and the Whale Conservation Institute/Ocean Alliance funded fieldwork and ship time. P.T.M. was funded by the Department of Zoophysiology, University of Århus, Denmark. This work was conducted under NMFS authorization no. 1004, Cites 00US19824/9 and a scientific permit afforded to the Whale Conservation Institute by the authorities of Papua New Guinea.