Insects have evolved a great diversity of sound-producing mechanisms largely attributable to their hardened exoskeleton, which can be rubbed, vibrated or tapped against different substrates to produce acoustic signals. However, sound production by forced air, while common in vertebrates, is poorly understood in insects. We report on a caterpillar that ‘vocalizes’ by forcing air into and out of its gut. When disturbed, larvae of the Nessus sphinx hawkmoth (Sphingidae: Amphion floridensis) produce sound trains comprising a stereotyped pattern of long (370 ms) followed by multiple short-duration (23 ms) units. Sounds are emitted from the oral cavity, as confirmed by close-up videos and comparing sound amplitudes at different body regions. Numerical models using measurements of the caterpillar foregut were constructed to test hypotheses explaining sound production. We propose that sound is generated by ring vortices created as air flows through the orifice between two foregut chambers (crop and oesophagus), a mechanism analogous to a whistling kettle. As air flows past the orifice, certain sound frequencies are amplified by a Helmholtz resonator effect of the oesophagus chamber. Long sound units occur during inflation, while short sound units occur during deflation. Several other insects have been reported to produce sounds by forced air, but the aeroacoustic mechanisms of such sounds remain elusive. Our results provide evidence for this mechanism by showing that caterpillars employ mechanisms similar to rocket engines to produce sounds.
Acoustic communication is widespread among insects, where airborne sounds and solid-borne vibrations play vital roles in mating, predator–prey interactions, aggression and group information transfer (Haskell, 1961; Greenfield, 2002; Cocroft and Rodríguez, 2005; Hill, 2008). Specialized structures for generating acoustic signals have evolved multiple times and on almost every part of the body wall including legs, wings, mouthparts, head and even genitals (Haskell, 1961; Dumortier, 1963; Greenfield, 2002; Hill, 2008). This diversity of sound-producing mechanisms in insects is mostly attributable to their hardened exoskeletons which can be rubbed, vibrated or struck against other body parts or substrates to generate signals (Haskell, 1961; Dumortier, 1963; Ewing, 1989; Greenfield, 2002). However, the source of sound in insects is not always limited to solid body parts – mechanisms involving the movement of a fluid (air or liquid) through a tube, chamber or orifice have also been reported but are poorly understood (Haskell, 1961; Ewing, 1989; Greenfield, 2002).
Sound production by airflow is common in terrestrial vertebrates, where it takes the form of vocalizations. Broadly defined, vocalizations are acoustic byproducts of eating or breathing, and include sounds made by air leaving the animal from the respiratory system or the gut (Clark, 2016). These sounds can be generated in two ways: by a mechanically vibrating element or by an aerodynamic mechanism (Fletcher, 1992). Sounds generated by mechanical vibration result from oscillations of a taut, thin membrane, and are a function of the tension and material properties of the membrane (Dowell, 1977). Most vertebrate vocalizations result from mechanical vibrations whereby vocal folds vibrate when air flows outwards from the lungs (Fletcher, 1992; Bradbury and Vehrencamp, 2011). Sounds generated by aerodynamic mechanisms, in contrast, are produced by vortices caused by turbulence (Fletcher, 1992; Mongeau et al., 1997). As fluid passes over an edge, some of the fluid curls around the edge forming a vortex, a low-pressure region that collapses, generating a sound wave. Examples in vertebrates include human whistles and consonants, hissing, and ultrasound vocalization in mice (Fletcher, 1992; Mahrt et al., 2016).
In insects, sound production by airflow is less common because they do not breathe in the same way as vertebrates. However, this mechanism has been reported for several insects including some cockroaches, caterpillars, hawkmoths, flies and wasps (see Haskell, 1961; Nelson, 1979; Brehm et al., 2015; Bura et al., 2016). In most species, experimental evidence for sound production by airflow is lacking (Haskell, 1961), and even in two species that have been experimentally confirmed to produce sounds by airflow – the Death's-head hawkmoth and the Madagascar hissing cockroach – the aeroacoustic mechanisms remain unverified (see Discussion).
This study introduces a novel form of sound production by airflow in an insect and investigates the mechanisms responsible for sound generation and amplification. In a previous comparative study on defence sounds in Bombycoidea caterpillars (Bura et al., 2016), several species (Sphecodina abbottii, Amphion floridensis, Pachygonidia drucei and Nyceryx magna) belonging to the subfamily Macroglossinae were reported to produce sounds by ‘vocalization’, a mechanism proposed based on the observation that mandibles were held open during sound production. However, in that study, the mechanisms of sound production were not verified experimentally. Here, we investigated this novel mechanism in caterpillars of the Nessus sphinx hawkmoth, Amphion floridensis. Specific objectives were to: (1) describe the sound characteristics; (2) test the hypothesis that sounds are emitted from the oral cavity; and (3) test hypotheses on the mechanisms of sound production by constructing models based on foregut morphology and sound characteristics.
MATERIALS AND METHODS
Insect collection and rearing
Larvae of the Nessus sphinx hawkmoth, Amphion floridensis Clark 1920, were reared from eggs laid by wild-caught females captured at ultraviolet lights in Florida, USA, during May–June 2012–2015. Larvae were reared on cuttings of Virginia creeper (Parthenocissus quinquefolia) or wild grape (Vitis spp.). All experiments were performed on the final (4th) larval instar. A total of 60 larvae were used in different experiments.
Sound and video recording set-up
To assess relationships between attack, sound production and other defensive behaviours, caterpillars were videotaped during simulated attack trials. A caterpillar was placed on a sprig of host plant held in a water-filled vial and left undisturbed for 15–30 min prior to the experiment. Attacks were conducted by squeezing the posterior end of the caterpillar using blunt forceps, simulating the attack of a predator (Cornell et al., 1987; Bura et al., 2011). Sequential attacks were applied at 5 s intervals or until the caterpillar ceased signalling. Trials were videotaped using a high-definition Handycam HDR-HC7 (Sony, Tokyo, Japan) equipped with a Sony ECM-MS957 microphone or a bat detector (type D240x; Pettersson, Uppsala, Sweden). Videos were analysed using iMovie 7.1.4 (Apple, Cupertino, CA, USA).
A modification of the set-up described above was used to record sounds for analysis of sound characteristics. Because caterpillars often thrashed when attacked, it was necessary to hold the specimen in place to control for distance and orientation to the microphone. The caterpillar rested on its host-plant sprig as described above, and its head capsule was held between the fingers of one experimenter to position the mouth at specific distances from the microphone. Attacks were simulated as described above. Sounds were recorded using a ¼ in microphone [type 4939; Bruel & Kjaer (B&K), Naerum, Denmark], amplified with a B&K Nexus conditioning amplifier (type 2690) and recorded to a FR-2 Field Memory Recorder (Fostex, Gardena, CA, USA) at a sampling rate of 192 kHz. All recordings were conducted in an acoustic chamber (Eckel Industries Ltd, Cambridge, MA, USA).
Analysis of sound characteristics
Sound files were analysed to assess the relationships between attack and sound train characteristics, as well as the temporal, spectral and amplitude characteristics of sound units. A train is defined as the sequence of all sound units following a single attack. We define a unit as an uninterrupted sound as perceived by the human ear, equivalent to the term ‘chirp’ used by others (Broughton, 1976); a unit can be formed by one or more pulses. A pulse is a transient waveform with a distinct rise and fall component. Analyses were conducted using Avisoft-SASLab Pro (Avisoft Bioacoustics, Berlin, Germany) or Raven Pro 1.4 (Cornell Laboratory of Ornithology, Ithaca, NY, USA).
To assess temporal characteristics of sound trains in response to attack, we measured latency, duration and duty cycle. Latency was the interval between the forceps coming into contact with the caterpillar and the onset of the first sound unit. Train duration was the interval between the onset of the first unit and the end of the last unit. Duty cycle, defined as the proportion of the train occupied by sound, was the sum of all sound unit durations within a train divided by train duration.
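The three train measures defined above reduce to simple arithmetic on unit onset and offset times. The following is a minimal sketch in Python, not the procedure used in the study (the measurements were made in Avisoft-SASLab Pro or Raven Pro); the function name, the `TrainMetrics` container and the example times are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrainMetrics:
    latency_s: float          # forceps contact -> onset of first unit
    train_duration_s: float   # onset of first unit -> end of last unit
    duty_cycle: float         # proportion of the train occupied by sound


def train_metrics(contact_time_s: float,
                  units: List[Tuple[float, float]]) -> TrainMetrics:
    """Compute latency, train duration and duty cycle for one sound train.

    `units` holds (onset, offset) times in seconds for every sound unit
    that follows a single attack (hypothetical input format).
    """
    if not units:
        raise ValueError("a train needs at least one sound unit")
    ordered = sorted(units)
    first_onset = ordered[0][0]
    last_offset = max(offset for _, offset in ordered)
    train_duration = last_offset - first_onset
    sound_time = sum(offset - onset for onset, offset in ordered)
    # Guard against a train made of a single zero-length unit.
    duty = sound_time / train_duration if train_duration > 0 else 0.0
    return TrainMetrics(first_onset - contact_time_s, train_duration, duty)


if __name__ == "__main__":
    # Made-up example: one long unit followed by two short units.
    example = train_metrics(0.0, [(0.5, 0.9), (1.0, 1.02), (1.1, 1.12)])
    print(example)
```

Keeping the contact time separate from the unit list mirrors the definitions above: latency depends on the attack, whereas train duration and duty cycle depend only on the units.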
Sound units were analysed for specific temporal, spectral and amplitude characteristics. Temporal characteristics included unit duration, number of pulses per unit and pulse rate. Durations were measured from all sound units following the first two consecutive attacks for 15 individuals. It appeared that units differed categorically based on duration. To verify this, we conducted a regression analysis (see ‘Statistical analyses’, below). This analysis revealed two clusters of unit durations, which were characterized as long or short (see Results). Subsequent measurements of temporal sound characteristics were performed on long and short units by randomly sampling five long units and then all short units that followed the long unit from 15 individuals. Long and short unit durations and the number of pulses were measured using Avisoft-SASLab Pro Pulse Train Analysis, and pulse rates were calculated by dividing the number of pulses by the unit duration. Units composed of single pulses were not considered for the pulse rate calculation. Spectral characteristics analysed included dominant frequency and bandwidths at −6 and −12 dB from peak. Five randomly selected long units and the first short unit following each long unit were analysed from each of 15 individuals. Power spectra and spectrograms were produced using a 1024-point fast Fourier transform (Hann window, 50% overlap) in Avisoft-SASLab Pro. In the amplitude domain, we measured relative amplitudes of long and short units and amplitude envelopes. We randomly sampled five long units and all short units that followed the long unit. Relative amplitudes were obtained by measuring the maximum peak-to-zero amplitude of each unit, as well as the peak-to-zero amplitude of all pulses from each unit. We used all pulse amplitudes to describe the envelope of long and short units (see ‘Statistical analyses’, below). For this envelope description, a total of five long and all following short units were sampled from 10 animals.
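As an illustration of the spectral measures described above, the sketch below computes the frequency spacing implied by a 1024-point FFT at the 192 kHz sampling rate, and extracts a dominant frequency and a bandwidth from a power spectrum given in dB. It is a sketch under stated assumptions, not the Avisoft-SASLab Pro algorithm: in particular, the bandwidth is assumed here to be the width of the contiguous region around the peak that stays within the chosen drop (6 or 12 dB), and the function names are hypothetical.

```python
from typing import List, Tuple

SAMPLE_RATE_HZ = 192_000  # sampling rate stated in the recording set-up
FFT_SIZE = 1024           # FFT length stated in the analysis settings


def bin_spacing_hz(sample_rate_hz: float = SAMPLE_RATE_HZ,
                   fft_size: int = FFT_SIZE) -> float:
    """Frequency spacing between adjacent FFT bins."""
    return sample_rate_hz / fft_size


def dominant_and_bandwidth(freqs_hz: List[float],
                           power_db: List[float],
                           drop_db: float) -> Tuple[float, float]:
    """Return (dominant frequency, bandwidth at `drop_db` below the peak).

    Assumption: bandwidth is measured over the contiguous run of bins
    around the peak whose level is at or above (peak - drop_db).
    """
    if not freqs_hz or len(freqs_hz) != len(power_db):
        raise ValueError("need equal-length, non-empty spectra")
    peak = max(range(len(power_db)), key=power_db.__getitem__)
    threshold = power_db[peak] - drop_db
    lo = peak
    while lo > 0 and power_db[lo - 1] >= threshold:
        lo -= 1
    hi = peak
    while hi < len(power_db) - 1 and power_db[hi + 1] >= threshold:
        hi += 1
    return freqs_hz[peak], freqs_hz[hi] - freqs_hz[lo]


if __name__ == "__main__":
    print(bin_spacing_hz())  # 192000 / 1024 = 187.5 Hz per bin
```

With these settings each spectral estimate is resolved to 187.5 Hz, which is fine relative to the kHz-scale difference between long and short units reported in the Results.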
Localizing sound source
To investigate which body parts are involved in sound production, we first videotaped entire caterpillars on their host plant while they made sounds. We then focused the camera on specific body regions, including the spiracles, the anterior prothoracic region and mouthparts, using previously described video and audio equipment.
To narrow down the location of sound emission, we compared relative sound amplitudes along the length of the body from the mouth to the anus. Caterpillars were positioned horizontally on a stem of their host plant with leaves removed so that microphones could be positioned at set locations and distances from the caterpillar. Sound production was evoked by a pinch attack as described above, but if the caterpillar thrashed, the trial was discarded. Two miniature condenser microphones (Cold Gold Audio, Nanaimo, BC, Canada) were positioned on stands in three different configurations: 1 cm from the mouth and 1 cm from the anus; 1 cm from the mouth and 1 cm from the middle of the animal (lateral side between spiracles 4 and 5); and 0.5 cm from the mouth and 0.5 cm from first spiracle. Microphones were connected to a PC laptop and sounds were recorded to Raven Pro 1.4. To ensure that the two microphones were equally sensitive, we generated a clicking sound at equal distances between the two microphones and compared the peak-to-peak amplitudes. Microphone positions were alternated after each recording so that each caterpillar was recorded twice with each configuration. We measured peak amplitudes and root mean square (RMS) amplitudes of the longest unit produced for each recording in five caterpillars.
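The two amplitude measures used to compare microphone positions can be written compactly. This is a minimal Python sketch assuming each unit is available as a sequence of zero-centred sample values; the function names are hypothetical, and the study itself took these measurements in Raven Pro 1.4.

```python
import math
from typing import Sequence


def peak_amplitude(samples: Sequence[float]) -> float:
    """Largest absolute excursion from zero within the unit."""
    if not samples:
        raise ValueError("empty unit")
    return max(abs(s) for s in samples)


def rms_amplitude(samples: Sequence[float]) -> float:
    """Root mean square amplitude of the unit."""
    if not samples:
        raise ValueError("empty unit")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mouth_to_site_ratio(mouth: Sequence[float],
                        other_site: Sequence[float]) -> float:
    """RMS ratio between simultaneous recordings at the mouth and at
    another body region; values above 1 mean the mouth was louder."""
    return rms_amplitude(mouth) / rms_amplitude(other_site)
```

A ratio is only meaningful if both microphones are equally sensitive, which is why the set-up above includes a calibration click generated at equal distances from the two microphones.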
The internal anatomy was examined to identify any structures that could potentially be involved in sound production, including accessory air sacs, lobes or membranes. Five 4th instar larvae preserved in 80% ethanol were dissected to expose parts of the alimentary canal and associated musculature. Anatomical structures were identified following Eaton (1988) and Snodgrass (1935). Specimens were photographed using a stereomicroscope (M205C; Leica Microsystems, Wetzlar, Germany) equipped with a Leica DMC4500 camera. Lengths and diameters of the crop, oesophagus, pharynx and buccal cavity were measured using Leica Application Suite LAS X v.4.8.
Statistical analyses
Sound and attack
Differences in temporal characteristics of sound trains following the first and second attacks were assessed using a paired Student's t-test with α=0.05. To determine whether consecutive sound units following the first attack differed in their durations, we used ANOVA followed by Tukey's HSD tests executed in R.
To determine whether sound units could be categorized as long and short, we plotted the durations of all units following two consecutive attacks onto a frequency histogram and predicted a bimodal distribution. A linear regression analysis was executed on the histogram data, considering a bin's upper duration measured in seconds as the independent variable and frequency as dependent. Because duration was defined as the interval between the first and last pulses of a unit, single pulse units were attributed a duration of 0 ms and included in the first bin. The best-fit equation, based on higher R2, F-value and all parameters significant for α=0.05, was selected using Table Curve 2D v5.01 (Systat Software Inc., San Jose, CA, USA). The equation curve was plotted on the histogram to verify that it matched the bin distribution.
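The histogram step that feeds the regression can be sketched as follows. This is an illustrative Python sketch only: the bin width used in the study is not stated here, so `bin_width_ms` is a hypothetical parameter, and the curve fitting performed in Table Curve 2D is not reproduced.

```python
from typing import List, Tuple


def duration_histogram(durations_ms: List[float],
                       bin_width_ms: float) -> List[Tuple[float, int]]:
    """Bin unit durations and return (upper bin edge in seconds, count).

    Single-pulse units carry a duration of 0 ms and therefore fall in
    the first bin, as described in the text. The returned pairs are the
    (independent, dependent) values for the regression.
    """
    if bin_width_ms <= 0:
        raise ValueError("bin width must be positive")
    if not durations_ms:
        return []
    n_bins = int(max(durations_ms) // bin_width_ms) + 1
    counts = [0] * n_bins
    for duration in durations_ms:
        counts[int(duration // bin_width_ms)] += 1
    return [((k + 1) * bin_width_ms / 1000.0, count)
            for k, count in enumerate(counts)]
```

A bimodal distribution would appear in this output as two groups of non-empty bins separated by a run of empty or nearly empty bins.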
Correlation between the number of pulses and unit duration was examined by plotting five randomly selected long units and the consecutive four units that followed them, independently of their duration, from 15 caterpillars.
Envelope shapes were characterized through normalization of pulse amplitudes and times. Each pulse peak-to-zero amplitude was divided by the maximum amplitude of its unit. Similarly, each pulse time was divided by the unit total duration. In this way, the highest pulse of each unit had a normalized amplitude of 1.0 and the first and last pulses were attributed normalized times of 0.0 and 1.0, respectively. Units with one or two pulses were excluded from the envelope analysis. Envelopes of long and short units were assessed separately by executing two regression analyses in Table Curve 2D v5.01, considering normalized time as the independent variable and normalized amplitude as dependent. The best-fit equation was selected using the same criteria as described above.
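The normalization described above can be expressed in a few lines. The following Python sketch assumes pulse times are given in seconds in chronological order and that amplitudes are peak-to-zero values; the function name is hypothetical.

```python
from typing import List, Optional, Tuple


def normalize_envelope(pulse_times_s: List[float],
                       pulse_amplitudes: List[float]
                       ) -> Optional[List[Tuple[float, float]]]:
    """Return (normalized time, normalized amplitude) for each pulse.

    Times are rescaled so the first pulse sits at 0.0 and the last at
    1.0; amplitudes are divided by the largest pulse in the unit.
    Units with one or two pulses are excluded (None is returned).
    """
    if len(pulse_times_s) != len(pulse_amplitudes):
        raise ValueError("times and amplitudes must have equal length")
    if len(pulse_times_s) < 3:
        return None
    start = pulse_times_s[0]
    duration = pulse_times_s[-1] - start
    max_amp = max(abs(a) for a in pulse_amplitudes)
    return [((t - start) / duration, abs(a) / max_amp)
            for t, a in zip(pulse_times_s, pulse_amplitudes)]
```

Because every unit is mapped onto the same unit square, pulses from units of very different durations and loudness can be pooled in a single regression of normalized amplitude on normalized time.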
Differences between short and long units were tested using Student's t-tests with paired samples for spectral characteristics, and two independent samples for amplitude and temporal characteristics, using α=0.05.
Numerical simulation methods for sound-producing mechanism
Numerical models were constructed to test hypotheses for sound production using the measurements of the caterpillar foregut. The numerical simulation was completed using custom-written MATLAB scripts (R2016a, MathWorks, Natick, MA, USA; available from the corresponding author on request). Sounds recorded from caterpillars were analysed using MATLAB's Signal Processing Toolbox, and compared with the simulated sounds using the frequencies and amplitudes of each proposed model against the measured data.
RESULTS
Response to attack
All caterpillars were silent prior to the first attack, and all responded by making sounds (N=15) (Fig. 1; Movie 1, Audio 1). The latency of sound in response to the second attack was significantly shorter than that to the first (t10=2.31; P=0.04), whereas there was no difference in train duration (t10=−1.69; P=0.12), duty cycle (t14=−1.77; P=0.10) or number of units (t10=−1.50; P=0.16) (Table 1). A single thrash of the anterior body frequently occurred, but regurgitation was never observed during the first two attacks (Table 1; Movie 1). Regurgitation was rarely observed even following multiple (>6) attacks. Once a caterpillar was attacked, other physical stimuli such as touching the plant or blowing on the caterpillar sometimes evoked an acoustic response, indicating that the caterpillar became sensitized. During rearing of >100 caterpillars over the course of this study, there was no evidence of caterpillars responding acoustically to the presence of conspecifics, even under crowded conditions.
Sound unit characteristics
Sound units comprised a series of 1–501 regularly spaced pulses (Fig. 2; Audio 1). Unit durations were bimodally distributed into two distinct clusters: short and long (F8,58=14,002.2; R2>0.99) (Fig. 3A). The first and second clusters included units ranging from 0 to ∼120 ms and ∼150 to 855 ms, respectively. Pulse rates of long and short units did not differ significantly (t364.8=−0.54; P=0.59) (Table 2). There was a positive relationship between the number of pulses and unit duration (Fig. 3B), indicating that long units result from more pulses rather than a decrease in pulse rate.
Following a first attack, a resting caterpillar begins its sound train with a long sound unit followed by multiple short units (Fig. 3C). With the exception of one trial, the first unit was always longer than 150 ms and its average duration was significantly greater than that of the following four units (F4,69=13.51; P<0.001). In turn, the second to fifth units following attack were most often shorter than 150 ms and did not differ significantly from one another.
Sound units were broadband with dominant frequencies in the ultrasound range. Dominant frequencies differed significantly between long (32.7 kHz) and short units (26.8 kHz) (t74=4.20; P<0.001), but bandwidths measured at −6 and −12 dB from peak did not differ significantly (Fig. 4, Table 2).
Long sound units were significantly higher in amplitude than short units as measured by maximum relative amplitude (t80.6=8.73; P<0.001) and average amplitude of all pulses (t9108.4=16.42; P<0.001) (Fig. 2, Table 2). Amplitude envelopes of long and short units also differed: long units were typically bell shaped (F1,12555=5215.6, R2=0.29), whereas short units were descending (F1,3697=491.27, R2=0.12) (Fig. 2).
Localization of sound source
Our results support the hypothesis that sounds are emitted from the mouth. First, video recordings showed that no external body parts move consistently during sound production (Movies 1 and 2) (N=16). Close-up videos of mouthparts showed that mandibles were entirely or partially held open during sound production (Movie 2) (N=12). Second, sounds recorded simultaneously from the mouth and any other body regions were always higher in amplitude at the mouth (Fig. 5). The RMS amplitude measured at the mouth differed significantly from that measured at all other body regions (F4,55=47.90; P<0.001).
The foregut of the digestive tract comprises the buccal cavity, the pharynx and oesophagus chamber, and the crop (Fig. 6). The tract begins with a hypognathous mouth enclosed by the labrum dorsally, one mandible on each side, and the hypopharynx ventrally. At the base of these mouthparts lies the opening to the buccal cavity. The buccal cavity (mean length 0.56 mm, diameter 0.58 mm) is surrounded by several thick, transverse muscle bands, and has four pairs of dorsal dilator muscles arising from the clypeus. The beginning of the pharynx is marked by the frontal ganglion and ends at the brain. There are four pairs of dorsal pharyngeal dilator muscles located between the brain and the frontal ganglion, arising from the postclypeal region of the head, and three pairs of ventral pharyngeal dilator muscles. The pharynx (mean length 0.69 mm, diameter 0.70 mm) consists of four pairs of thick, transverse muscle bands, and curves towards the oesophagus. The oesophagus (mean length 2.02 mm, diameter 1.26 mm) has thick circular muscles along its entire length and two large sets of both dorsolateral and ventrolateral dilator muscles. Together, the pharynx and oesophagus comprise one chamber, while a very narrow constriction, probably a sphincter, distinctly separates the oesophagus from the crop. The crop is highly expandable (mean length 10.68 mm, diameter 5.38 mm), with many circular and some longitudinal thin muscle fibres. We found no evidence of a structure in the tract that could function as a vibrating lobe, nor did we find accessory air sacs.
Models for sound-producing mechanisms
We hypothesized that sound production in A. floridensis is a two-stage process involving, first, the generation of sound and, second, the amplification of certain frequencies. Based on our knowledge of gut morphology, we proposed and tested three sub-hypotheses that could explain sound generation, as detailed below. We then proposed a hypothesis to explain sound amplification. To test these hypotheses, we used numerical modelling based on aeroacoustic principles.
Our three sub-hypotheses for the generation of sound were as follows: (1) vibration of a membrane, which in A. floridensis could potentially result from one end of the crop acting as a drum skin that vibrates through muscle contraction; (2) vibration of a chamber, which in A. floridensis could potentially result from the whole crop or oesophagus vibrating as a chamber through muscle contraction; and (3) pulsating jet flow through an orifice, which in A. floridensis could potentially arise from airflow through the orifice between the oesophagus and crop. For a sub-hypothesis to be accepted, the natural frequencies of the proposed model should match the experimental data. In each case, the natural frequency of vibration of these structures is of interest because the natural frequency and associated vibration mode require the least amount of energy to trigger. Only the third model, which tested the pulsating jet flow hypothesis, resulted in values that matched the recorded sounds. This hypothesis is explained below. Further details of all numerical models are given in the Appendix.
All hypotheses and numerical models use the terms ‘sound generator’ and ‘amplification’ rather than primary and secondary resonator. The nomenclature sound generator and sound amplification is more accurate for the proposed mechanisms because the hypothesized pulsating jet flow does not involve resonance. Resonance is the result of a forced mechanical vibration matching the natural frequency of a structure, such as a file and scraper mechanism. The jet flow does not have a mechanical structure.
The pulsating jet flow hypothesis proposes that sound is generated by air flowing through a small hole, similar to the steam whistle on a kettle. A whistle works by forcing steam or air from a chamber at higher pressure through a narrow hole into a region of lower pressure. The flow curls around the edge of the hole, creating a series of vortex rings. When these rings are formed from a pulsating high-speed flow, they are unstable and collapse. Sound is emitted during the collapse of a vortex, so what is heard as a whistle is a series of vortices collapsing (Kierkegaard et al., 2012). The vortices generate a broad band of frequencies, enabling a range of sounds (Zhang and Mongeau, 2006).
The foregut anatomy of A. floridensis shows a sharp constriction forming a hole between the crop and oesophagus (Figs 6 and 7A) that can act as an orifice for air flowing in either direction. Frequencies of a whistle generated by an orifice are expressed using the Strouhal number, a non-dimensional parameter for frequencies that accounts for the diameter of the hole and the flow speed (see Appendix). Sound generation by pulsating jet flow through an orifice is a realistic hypothesis as there are orifice dimensions that provide feasible Strouhal numbers matching recorded frequencies (see Appendix). Further, the flow speed is feasible given that 4th instar A. floridensis are more than 6 cm long, providing sufficient volume to generate the flow. The model proposes that the orifice is large during inflation and the resulting whistling frequency corresponds to the long sound unit frequency. The orifice is narrower during deflation and the whistling frequency corresponds to the short sound unit frequency (see Appendix). Therefore, we accept this hypothesis to explain sound generation in A. floridensis. This hypothesis does introduce an additional concern: orifices will generate a broad range of frequencies as the flow speed fluctuates, but the recorded amplitudes are nearly uniform. To produce the recorded sound spectra for A. floridensis (Table 2), this hypothesis must be coupled with an explanation for sound amplification.
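For readers unfamiliar with the Strouhal number, the relation between whistle frequency, orifice diameter and flow speed can be sketched as below. This uses the standard textbook definition St = f·d/U; the exact formulation and parameter values used in the Appendix are not reproduced here, and the numbers in the usage example are purely illustrative and are not measurements from A. floridensis.

```python
def strouhal_number(frequency_hz: float, diameter_m: float,
                    flow_speed_m_s: float) -> float:
    """Standard definition St = f * d / |U| (sign of U only marks direction)."""
    if flow_speed_m_s == 0:
        raise ValueError("flow speed must be non-zero")
    return frequency_hz * diameter_m / abs(flow_speed_m_s)


def whistle_frequency_hz(strouhal: float, flow_speed_m_s: float,
                         diameter_m: float) -> float:
    """Frequency predicted for a given Strouhal number: f = St * |U| / d."""
    if diameter_m <= 0:
        raise ValueError("orifice diameter must be positive")
    return strouhal * abs(flow_speed_m_s) / diameter_m


if __name__ == "__main__":
    # Illustrative values only (hypothetical, not from the study).
    f = whistle_frequency_hz(strouhal=0.5, flow_speed_m_s=2.0, diameter_m=1e-3)
    print(f)
```

The relation makes the model's logic explicit: at a fixed Strouhal number, the frequency scales with flow speed and inversely with orifice diameter, so a change in orifice size between inflation and deflation shifts the whistle frequency.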
We propose that the foregut in A. floridensis operates as a multi-chamber, multi-throat Helmholtz resonator. A simple Helmholtz resonator is encountered when you blow over the top of an empty bottle. When blowing over a bottle, the geometry of the throat filters the frequencies, and the large chamber of the bottle further amplifies a subset of the frequencies. For a multi-chamber, multi-throat Helmholtz resonator, the filtering and amplification effects are the collective result of the geometries of each chamber and each throat. The chambers will amplify only certain frequencies, and the throats will only allow certain frequencies to be emitted.
The amplified frequencies are a function of the resonator dimensions. The size of A. floridensis restricts the allowable resonator dimensions to the millimetre scale. Millimetre scale resonators amplify only ultrasonic frequencies. Ultrasonic Helmholtz resonators achieved amplification ratios of 17 and quality factors of 8 for frequencies up to 50 kHz (Suzuki et al., 2009). A Helmholtz resonator mechanism has been proposed for sound amplification of the neotropical bush-cricket Acanthacara acuta (Jonsson et al., 2017) and the cicada Cyclochila australasiae (Bennet-Clark and Young, 1992). The bush-cricket study examined a 10 mm-long resonator with frequencies between 10 and 30 kHz. The cicada study involved a 15 mm-long resonator with correspondingly lower frequencies. Neither study involved a multi-chamber, multi-throat Helmholtz resonator.
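The link between millimetre-scale dimensions and ultrasonic frequencies can be illustrated with the textbook formula for a single-chamber Helmholtz resonator, f = (c/2π)·√(A/(V·L)). The sketch below is not the multi-chamber, multi-throat model described in the Appendix. It makes several simplifying assumptions that are ours for illustration only: the oesophagus is treated as a cylindrical chamber, the pharynx as its neck, the mean dimensions reported in the morphology results are used directly, no end correction is applied, and the speed of sound is taken as 343 m s−1.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def cylinder_volume_m3(length_m: float, diameter_m: float) -> float:
    """Volume of a cylinder, used here as a crude chamber shape."""
    return math.pi * (diameter_m / 2.0) ** 2 * length_m


def helmholtz_frequency_hz(neck_length_m: float, neck_diameter_m: float,
                           chamber_volume_m3: float,
                           c: float = SPEED_OF_SOUND_M_S) -> float:
    """Single-chamber Helmholtz resonance without end correction."""
    neck_area = math.pi * (neck_diameter_m / 2.0) ** 2
    return (c / (2.0 * math.pi)) * math.sqrt(
        neck_area / (chamber_volume_m3 * neck_length_m))


if __name__ == "__main__":
    # Mean foregut dimensions from the morphology results, in metres.
    oesophagus_volume = cylinder_volume_m3(2.02e-3, 1.26e-3)  # assumed chamber
    estimate = helmholtz_frequency_hz(0.69e-3, 0.70e-3,       # assumed neck
                                      oesophagus_volume)
    print(f"{estimate / 1000:.1f} kHz")
```

Under these assumptions the estimate falls in the ultrasonic range (above 20 kHz), which is consistent with the statement that millimetre-scale resonators favour ultrasonic frequencies. It should not be read as a prediction of the recorded spectra, which depend on the coupled chambers and throats.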
A multi-chamber, multi-throat Helmholtz resonator hypothesis is feasible for A. floridensis because the crop and the oesophagus are connected via a sphincter. The anterior end of the oesophagus has a second constriction through the pharynx. The crop and oesophagus are the chambers, while the sphincter and the pharynx are the throats (Figs 6 and 7B). For A. floridensis, this hypothesis is verified by numerical simulation of a multi-chamber, multi-throat Helmholtz resonator (see Appendix).
The simulations use the whistling frequencies and the estimated flow speed of 0.02 m s−1 to remain consistent with the conditions stipulated by the Strouhal numbers for inflation and deflation. During inflation, the whistle is generated on the crop side of the orifice (Fig. 7B). During deflation, the whistle is generated on the oesophagus side of the orifice (Fig. 7B). The simulation calculates the variations in flow speeds and the resulting sound waves that exit the Helmholtz resonator. The simulated sound waves are compared with the recorded sounds, as shown in Fig. 8. Fig. 8A represents A. floridensis inflating the crop (flow speed, U=−0.02 m s−1) and matches the long sound unit. Fig. 8B represents A. floridensis deflating the crop (U=0.02 m s−1) and matches the short sound unit.
DISCUSSION
This study introduces a novel mechanism for sound production in insects. Caterpillars ‘vocalize’ by forcing air into and out of the foregut. By integrating acoustic analyses, functional morphology and modelling, we conclude that sounds are generated by pulsating jet flow through the orifice between foregut chambers, which in turn function as Helmholtz resonators to amplify sounds.
Mechanism of sound production
Our results confirm that sounds are emitted from the mouth region, as sounds recorded from this region have higher amplitudes than those from any other part of the body. Sounds emitted from the mouth could conceivably be produced by one of three mechanisms: stridulation, regurgitation (with sound as a byproduct) or airflow from the gut. Maxillo-mandibular stridulation, where sclerotized mouthparts are rubbed together to produce sounds, occurs in some adult Orthoptera, larval Coleoptera and larval Lepidoptera (Dumortier, 1963; Bura et al., 2012, 2016). This mechanism is rejected for A. floridensis because the mandibles do not interact but, rather, are held open during sound production. The second possibility – that sound is a byproduct of emitting a liquid or froth (Haskell, 1961; Dumortier, 1963) – is also rejected because regurgitation does not coincide with sound production in A. floridensis. However, the third mechanism – that sounds are produced by airflow from the gut – is supported by our results. This has been proposed for some species of adult hawkmoths (Zagorinsky et al., 2012), but the mechanism has only been studied in the Death's-head hawkmoth, Acherontia atropos. When disturbed, these moths produce a series of ‘squeaks’ (Busnel and Dumortier, 1959; Ewing, 1989). The most recent study by Brehm et al. (2015) reported that each squeak always consisted of two ‘phases’: a larger amplitude phase followed by a second phase of lower amplitude and shorter duration. Brehm et al. (2015) proposed that the first phase results from air being drawn into the pharynx (inflation) and the second phase by air being pushed out of the pharynx (deflation) through the proboscis. It was also proposed that sound pulses result from vibrations of an epipharyngeal lobe. Using ablation methods, Brehm et al. (2015) confirmed that the head and proboscis were important for sound emission. 
They also demonstrated using in vivo CT scanning and high-speed videography that rhythmic movements of the thorax accompanied sound production, providing support for the proposal that the two sound phases resulted from ‘inflation’ and ‘deflation’. However, we argue that the specific aeroacoustic mechanisms (e.g. whether sounds are generated by membrane vibration or aerodynamic flow) for A. atropos, or any other insect proposed to produce sounds using forced air, remain unverified. The lack of information on such mechanisms is partly owing to the difficulty in visualizing movements of internal body structures in live insects. To investigate these mechanisms in A. floridensis, we used an alternative approach by constructing models based on morphological measurements, simulating airflow, and comparing results with measured sounds.
Based on our model of A. floridensis sound production, we propose that sound waves are generated by airflow through the orifice between two foregut chambers, and that certain frequencies are subsequently amplified. Additionally, we propose that long and short sound units result from air flowing into and out of the chambers, respectively. As an analogy, consider inflating a balloon where the injection of air into the balloon produces a long continuous sound. During inflation, the neck, or orifice, of the balloon is at a relatively large diameter. If the air is subsequently let out of the balloon, the neck is held at a smaller diameter and a series of short sounds results. In A. floridensis, long units are almost always the first to occur following an attack, even when a caterpillar is eating. If the first sound were produced by air flowing out of the gut, we would expect that sometimes food would be expelled; however, regurgitation of gut contents did not coincide with sound production. Also, it is unlikely that the caterpillar has a store of air readily available to expel because we found no evidence of accessory air sacs. Therefore, A. floridensis must have a mechanism to draw air into and out of the foregut. We speculate that A. floridensis employs musculature that in other caterpillars is used in the contexts of feeding and regurgitation, although the exact mechanisms for these behaviours are not well understood for any larval Lepidoptera to the best of our knowledge (Miles and Booker, 1994; Barbehenn and Kristensen, 2003). One possibility is that the oesophageal dilator muscles (see Fig. 6B) could contract, opening the oesophagus and causing air to be sucked in. The circular muscles of the crop could then contract, forcing the air out. Various hypotheses on muscle involvement in sound production could be further tested through comparative analyses with other vocalizing and non-vocalizing caterpillars, electromyograms or possibly live micro-CT scanning. 
Mechanisms of sound generation and amplification could be further tested by constructing mechanical 3D models of the foregut chambers and simulating airflow while modifying the sizes of the orifices (e.g. Zhang et al., 2002), or by measuring air pressure directly at the mouth using sensitive probes.
The specific mechanism described here for sound production in A. floridensis appears to be unique to caterpillars. Insects that produce sounds using airflow through spiracles, including the Madagascar hissing cockroach and the walnut sphinx caterpillar, force air outward in one direction by contracting their bodies (Nelson, 1979; Sueur and Aubin, 2006; Bura et al., 2011). Nelson (1979) proposed that sounds in the hissing cockroach are generated by turbulence as air is forced through the narrow neck of the trachea, and then amplified by a tracheal horn. While the mechanism of sound generation by turbulence was not verified, the sound frequencies predicted from anatomical measurements of the tracheal horn did match the recorded sound frequencies. Therefore, some aspects of this system may resemble sound production in A. floridensis caterpillars. The previously mentioned Death's-head hawkmoth (Brehm et al., 2015), like A. floridensis, produces sounds from the oral cavity. However, based on current knowledge, we surmise that the Death's-head hawkmoth and A. floridensis employ different mechanisms, for three reasons: first, we found no evidence of an epipharyngeal lobe in A. floridensis; second, the temporal patterns of the sounds differ markedly between these two insects; and third, adult and larval Lepidoptera have very different foregut anatomy because they feed on liquids and solids, respectively (Barbehenn and Kristensen, 2003). In vertebrates, examples of sounds generated by aerodynamic mechanisms include various hissing noises, consonants in human speech and ultrasonic vocalizations in mice (Fletcher, 1992; Mahrt et al., 2016). Fletcher (1992) also proposed that human and some bird whistles may result from turbulence as air flows through an aperture followed by amplification of certain frequencies by a Helmholtz resonator. However, no mechanisms comparable to that of A. floridensis, involving multi-chamber resonators, seem to exist.
The best analogy to sound production in vocalizing caterpillars is a rocket engine. Sound generation and amplification within a series of chambers connected by throats has been reported in acoustic examinations of rocket motors. A solid rocket motor is composed of a series of large, cylindrical connected chambers leading to a nozzle at the bottom of the rocket, with the most well-known example being the white booster rockets that were used with the Space Shuttle (Systems Dynamics Laboratory (US), 1984). For some rocket motors, the internal chambers are connected by a narrow hole that acts as an orifice for the exhaust to flow through. Some rocket motors have encountered unintended noise problems because of this internal geometry, leading to a number of aeroacoustic and combustion studies (Phillips, 1968; Phillips et al., 1969; Combs et al., 1974). The internal geometry and fluid flow of rocket motors are similar to the morphology and airflow of A. floridensis.
Function and evolution
Defensive sound production has now been reported for 20 species of Bombycoidea caterpillars (Bura et al., 2016). Sound production is taxonomically widespread, with four distinctive mechanisms: mandible stridulation, mandible clicking, whistling through spiracles, and, confirmed by this study, ‘vocalization’. Bombycoidea caterpillar sounds are believed to function in defence against vertebrate predators. However, the specific effects that these and most other insect defence sounds have on predators remain poorly understood (Conner, 2014; Dookie et al., 2017). It has been recently proposed that different sound characteristics may communicate different meanings to predators (Bura et al., 2016). We propose that acoustic signalling in A. floridensis functions in startle displays to deter vertebrate predators, including birds, lizards, rodents and gleaning bats. It is unlikely that these sounds function to warn of a chemical defence as A. floridensis lacks chemical-releasing spines or scoli, and regurgitation is not coupled to sound production. Compared with visual defensive ecology in insects, little is known about the functional significance of signal variation in defence sounds (Conner, 2014), and future studies should employ live predators and comparative analyses to test hypotheses (Bura et al., 2016; Dookie et al., 2017).
In vertebrates, vocalizations evolved as byproducts of breathing and eating (Clark, 2016). Because insects do not breathe in the same way as vertebrates, it is worthwhile considering the evolutionary origins of caterpillar ‘vocalization’. In adult hawkmoths, pharyngeal sound production is proposed to have evolved from strong sucking mouthparts (Brehm et al., 2015). However, this cannot be the case for caterpillars because they lack sucking mouthparts. We propose two evolutionary scenarios that may have led to sound production. First, sound production may have been co-opted from a similar mechanism in adults of the same species. However, no adults of species reported to vocalize as caterpillars have been reported to produce sounds, and caterpillars of sound-producing adults do not vocalize (Zagorinsky et al., 2012; Bura et al., 2016). Therefore, this hypothesis is not supported. Second, sound production may have evolved from defensive regurgitation. Regurgitation is a common defence strategy in Bombycoidea larvae (Bura et al., 2016), and some species, deemed primary regurgitators, can direct their regurgitant towards an attacker, control the volume, and re-imbibe the fluid. This control of fluid may be a precursor to controlling airflow.
Air expulsion has been described as the most interesting but by far the least investigated mechanism for producing sounds in insects (Haskell, 1961). Several insects have been proposed to produce sounds by airflow, but the aeroacoustic mechanisms remain elusive for most species. In this study, we describe a novel mechanism of sound production in caterpillars that use their gut chambers to make sounds, akin to a whistling kettle or rocket engine. These results provide a framework for further investigations on the mechanisms and evolutionary origins of sound production by airflow in insects.
Sound-generation hypothesis 1: vibration of a membrane
For the end of the crop, the diameter (=2a) is estimated from the dissections to be 5.29 mm and the thickness (h) to be 0.1 mm. The material properties of A. floridensis specimens were not tested; therefore, the crop of A. floridensis is assumed to be cuticle with a specific stiffness (E/ρ) of 0.3 GPa/(Mg m−3) (Wegst and Ashby, 2007). The value of 0.3 for specific stiffness is used as a representative value. Specific stiffness of insect cuticle ranges from 0.05 to 10 GPa/(Mg m−3) (Vincent and Wegst, 2004). The Poisson's ratio for insect cuticle is not available, but a value of 0.3 may be assumed (Goyens et al., 2014).
The frequencies associated with the first three modes are calculated as shown in Table A2 for specific stiffness values of 0.05, 0.3 and 10 GPa/(Mg m−3). The frequencies measured experimentally are 32.7 kHz for long units and 26.8 kHz for short units. The circular membrane frequencies shown in Table A2 are below these values, except for modes 2 and 3 for E/ρ=10 GPa/(Mg m−3). These higher modes cannot exist without mode 1, which is below the recorded frequencies; therefore, the hypothesis of vibration of a membrane may be discarded.
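The comparison above can be illustrated with a minimal sketch. It assumes the end of the crop behaves as a thin, clamped circular plate; this is an illustrative model chosen here and is not necessarily the formula behind Table A2. The dimensionless mode parameters are standard textbook values for a clamped plate, and the function and constant names are hypothetical. The radius, thickness, Poisson's ratio and specific stiffness values are those stated in the text.

```python
import math

# Dimensionless frequency parameters (lambda^2) for the first three modes of a
# clamped circular plate. ASSUMPTION: the clamped-plate model and these textbook
# values are used for illustration; the source does not state its formula.
CLAMPED_PLATE_LAMBDA_SQ = (10.22, 21.26, 34.88)

# Unit conversion: 1 GPa/(Mg m^-3) = 1e9 Pa / 1e3 kg m^-3 = 1e6 m^2 s^-2
M2_S2_PER_GPA_MG_M3 = 1.0e6

RADIUS_M = 5.29e-3 / 2.0   # a, from the 5.29 mm diameter given in the text
THICKNESS_M = 0.1e-3       # h, from the text
POISSON = 0.3              # assumed value cited in the text


def plate_mode_frequencies_hz(radius_m, thickness_m, specific_stiffness_m2_s2,
                              poisson=POISSON):
    """Return the first three natural frequencies (Hz) of a clamped thin plate.

    Uses f = lambda^2 / (2 pi a^2) * sqrt(D / (rho h)) with
    D = E h^3 / (12 (1 - nu^2)), which simplifies to
    f = lambda^2 * h / (2 pi a^2) * sqrt((E/rho) / (12 (1 - nu^2))).
    """
    wave_term = math.sqrt(specific_stiffness_m2_s2 / (12.0 * (1.0 - poisson ** 2)))
    scale = thickness_m * wave_term / (2.0 * math.pi * radius_m ** 2)
    return [lam_sq * scale for lam_sq in CLAMPED_PLATE_LAMBDA_SQ]


def frequencies_for_specific_stiffness(gpa_per_mg_m3):
    """Convenience wrapper taking E/rho in GPa/(Mg m^-3), as quoted in the text."""
    return plate_mode_frequencies_hz(
        RADIUS_M, THICKNESS_M, gpa_per_mg_m3 * M2_S2_PER_GPA_MG_M3
    )


if __name__ == "__main__":
    for stiffness in (0.05, 0.3, 10.0):
        modes_khz = [f / 1e3 for f in frequencies_for_specific_stiffness(stiffness)]
        print(stiffness, [round(f, 1) for f in modes_khz])
```

Under these assumptions the fundamental mode stays below the recorded 26.8 kHz and 32.7 kHz for all three stiffness values, and only the higher modes at the stiffest value exceed them, which is the same qualitative pattern described for Table A2.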
Sound-generation hypothesis 2: vibration of a chamber
Sound-generation hypothesis 3: pulsating jet flow through an orifice
Pulsating jet flows occur when a fluid passes through an orifice, as shown in Fig. 7, similar to the sound generation by a tea kettle's whistle. The centre core of the flow increases in velocity, while the portions of the flow closer to the walls of the tube curl around the edges of the orifice, forming a three-dimensional vortex ring similar to a smoke ring. The ring walls collapse, which generates a sound wave; therefore, if many vortex rings are formed sequentially, a steady production of sound waves occurs (Kierkegaard et al., 2012; Zhang et al., 2004).
The calculated values in Table A4 assume a constant orifice thickness of 0.1 mm for all cases because manipulation of the thickness is less likely than manipulation of the orifice diameter. The long sound unit corresponds to a larger orifice diameter, with a Strouhal number in the whistling range occurring during inflation. The short sound unit likewise has a Strouhal number within the whistling range, but during deflation.
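As a rough illustration of the Strouhal-number reasoning, the sketch below relates a sound frequency, a characteristic length and a flow velocity. It assumes the orifice thickness is the characteristic length (St = f t / U); the source does not state the exact definition used for Table A4. The whistling range of 0.2 to 0.4 is a hypothetical placeholder, not a value taken from the paper, and the function names are illustrative.

```python
def strouhal_number(frequency_hz, length_m, velocity_m_s):
    """Strouhal number St = f * L / U for a characteristic length L."""
    return frequency_hz * length_m / velocity_m_s


def velocity_for_strouhal(frequency_hz, length_m, strouhal):
    """Flow velocity U = f * L / St that yields a given Strouhal number."""
    return frequency_hz * length_m / strouhal


ORIFICE_THICKNESS_M = 0.1e-3   # constant thickness assumed in the text
LONG_UNIT_HZ = 32.7e3          # recorded frequency of long units
SHORT_UNIT_HZ = 26.8e3         # recorded frequency of short units

# ASSUMPTION: hypothetical whistling range, for illustration only.
ASSUMED_WHISTLING_RANGE = (0.2, 0.4)


def velocity_window(frequency_hz, length_m=ORIFICE_THICKNESS_M,
                    st_range=ASSUMED_WHISTLING_RANGE):
    """Return (min, max) orifice velocities that keep St inside st_range."""
    st_low, st_high = st_range
    # A higher Strouhal number corresponds to a lower velocity, and vice versa.
    return (velocity_for_strouhal(frequency_hz, length_m, st_high),
            velocity_for_strouhal(frequency_hz, length_m, st_low))


if __name__ == "__main__":
    for label, freq in (("long", LONG_UNIT_HZ), ("short", SHORT_UNIT_HZ)):
        print(label, velocity_window(freq))
```

Because the thickness is held constant, the only way to move the Strouhal number at a given frequency in this sketch is to change the flow velocity, which in the paper's account is tied to the orifice diameter.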
Sound amplification hypothesis: multi-chamber, multi-throat Helmholtz resonator
We are grateful to Akito Kawahara and Megan Goulding for assistance with insect and data collection, and to Violet Yacksmith and Margy Nelson for line illustrations.
Conceptualization: C.G.M., J.E.Y.; Methodology: C.A.R.-D., M.L.S., C.G.M., J.E.Y.; Software: C.G.M.; Validation: C.A.R.-D., M.L.S., C.G.M., J.E.Y.; Formal analysis: C.A.R.-D., C.G.M., J.E.Y.; Investigation: C.A.R.-D., M.L.S., C.G.M., J.E.Y.; Resources: J.E.Y.; Data curation: M.L.S., J.E.Y.; Writing - original draft: C.A.R.-D., M.L.S., C.G.M., J.E.Y.; Writing - review & editing: C.A.R.-D., M.L.S., C.G.M., J.E.Y.; Visualization: J.E.Y.; Supervision: J.E.Y.; Project administration: J.E.Y.; Funding acquisition: J.E.Y.
This work was supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada [RGPIN 2014-05947], a New Opportunities Award from the Canadian Foundation for Innovation, and an Ontario Ministry of Research, Innovation and Science Early Researcher Award [ERO7-04-1-44], to J.E.Y.
The authors declare no competing or financial interests.