Drosophila melanogaster hear with their antennae: sound evokes vibration of the distal antennal segment, and this vibration is transduced by specialized mechanoreceptor cells. The left and right antennae vibrate preferentially in response to sounds arising from different azimuthal angles. Therefore, by comparing signals from the two antennae, it should be possible to obtain information about the azimuthal angle of a sound source. However, behavioral evidence of sound localization has not been reported in Drosophila. Here, we show that walking D. melanogaster do indeed turn in response to lateralized sounds. We confirm that this behavior is evoked by vibrations of the distal antennal segment. The rule for turning is different for sounds arriving from different locations: flies turn toward sounds in their front hemifield, but they turn away from sounds in their rear hemifield, and they do not turn at all in response to sounds from 90 or −90 deg. All of these findings can be explained by a simple rule: the fly steers away from the antenna with the larger vibration amplitude. Finally, we show that these behaviors generalize to sound stimuli with diverse spectro-temporal features, and that these behaviors are found in both sexes. Our findings demonstrate the behavioral relevance of the antenna's directional tuning properties. They also pave the way for investigating the neural implementation of sound localization, as well as the potential roles of sound-guided steering in courtship and exploration.
Sound localization is a basic function of auditory systems. In organisms with tympanal ears, sound localization depends primarily on inter-aural differences in the amplitude of eardrum vibrations, as well as inter-aural differences in the timing of those vibrations (Ashida and Carr, 2011; Grothe et al., 2010; Middlebrooks, 2015). Lord Rayleigh (1907) was the first to realize that amplitude differences are mainly used for localizing high-frequency sounds, whereas timing differences are mainly used for localizing low-frequency sounds.
For insects with tympanal ears, sound localization can be a heroic achievement, because the insect's body is small, and so interaural differences are small (Michelsen, 1992; Robert, 2005; Robert and Hoy, 1998). Some insects, such as the tiny fly Ormia ochracea, have specialized tympanal ears that allow them to detect inter-aural timing differences as small as 50 ns (Mason et al., 2001; Miles et al., 1995; Robert et al., 1998). Specializations for directional hearing can also be found at the level of the insect central nervous system, as demonstrated by electrophysiological studies in crickets, locusts and katydids (Atkins and Pollack, 1987; Brodfuehrer and Hoy, 1990; Horseman and Huber, 1994a,b; Marsat and Pollack, 2005; Molina and Stumpner, 2005; Rheinlaender and Römer, 1980; Schildberger and Hörner, 1988; Selverston et al., 1985). In insects, the most well-studied behavioral evidence of sound localization ability is phonotaxis, defined as sound-guided locomotion (Atkins et al., 1984; Bailey and Thomson, 1977; Hedwig and Poulet, 2004; Mason et al., 2001; Schildberger and Hörner, 1988; Schildberger and Kleindienst, 1989; Schmitz et al., 1982).
Because Drosophila melanogaster are small (even tinier than Ormia ochracea), sound localization might seem impossible. However, D. melanogaster have evolved a non-tympanal auditory organ that is well suited to directional hearing. Protruding from the distal antennal segment (a3) is a hairy planar branching structure called the arista (Fig. 1A). The arista is rigidly coupled to a3, so when air particles push the arista, a3 rotates freely (around its long axis) relative to the proximal antenna (Göpfert and Robert, 2002). Sound waves are composed of air particle velocity oscillations as well as pressure oscillations (Kinsler and Frey, 1962), and it is the air particle velocity component of sound that drives sound-locked antennal vibrations (Göpfert and Robert, 2002).
The directional tuning of the Drosophila auditory organ arises from two factors. First, the movement of the arista–a3 structure is most sensitive to air particle movement perpendicular to the plane of the arista (Morley et al., 2012). The two antennae are intrinsically tuned to different air movement directions, because the two aristae are oriented at different azimuthal angles (Fig. 1B). Second, boundary layer effects distort the flow of air particles around the head. Specifically, the shape of the head creates high air particle velocities at the arista contralateral to the sound source, with comparatively lower particle velocities at the ipsilateral arista (Fig. 1C) (Morley et al., 2012). These boundary layer effects reinforce the left–right asymmetry in antennal vibration amplitudes when a sound source is lateralized. Taken together, the intrinsic directionality of the antennae (Fig. 1B) and these boundary layer effects (Fig. 1C) can produce large inter-antennal differences in vibration amplitudes when a sound source is lateralized (Fig. 1D). Specialized mechanoreceptors in Johnston's organ transduce these vibrations, with larger-amplitude vibrations producing larger neural responses (Effertz et al., 2011; Kamikouchi et al., 2009; Lehnert et al., 2013; Patella and Wilson, 2018).
In short, there are good reasons why Drosophila should be capable of sound localization. However, this prediction has not been tested. Most studies of auditory behavior in Drosophila have focused on the effects of auditory stimuli on locomotor speed. For example, when a courting male sings to a receptive walking female, it causes her to gradually slow down, with more singing producing a higher probability of slowing (Bussell et al., 2014; Clemens et al., 2015; Coen et al., 2014). Drosophila also transiently suppress locomotion and other movements in response to nonspecific sounds; these behaviors are termed acoustic startle responses (Lehnert et al., 2013; Menda et al., 2011). No studies have described evidence of sound localization ability in Drosophila.
Here, we report that walking D. melanogaster turn in response to lateralized sounds. They turn toward sounds in their front hemifield (positive phonotaxis), but they turn away from sounds in their rear hemifield (negative phonotaxis), and they do not turn at all in response to sounds originating from 90 or −90 deg. All of these results can be explained by a simple heuristic: D. melanogaster compare vibration amplitudes at the two antennae, and they turn away from the antenna with larger-amplitude vibrations. Although this heuristic is simple, we argue that it can produce potentially adaptive outcomes during courtship and exploration.
MATERIALS AND METHODS
Fly strains and culture conditions
Experiments were performed using cultures of Drosophila melanogaster Meigen 1830 established from 200 wild-caught individuals (Frye and Dickinson, 2004). Flies were cultured in 175 ml plastic bottles on custom food consisting of: 83.40% water, 7.42% molasses solids, 5.50% cornmeal, 2.30% inactivated yeast, 0.51% agar, 0.32% ethanol, 0.28% propionic acid, 0.19% phosphoric acid and 0.08% Tegosept (Archon Scientific, Durham, NC, USA). Culture bottles were started with five female and three male flies, and these parental flies were left in the bottle until the first progeny eclosed. Bottles were stored in a 25°C incubator with a 12 h:12 h light:dark cycle and 50–70% humidity. Progeny were collected on the day of eclosion (0 days old) on CO2 pads and housed (grouped by sex) in vials at 25°C. Females were used for all experiments except those detailed in Fig. 9. Flies were aged 2 days (Figs 2–5) or 1 day (Figs 6–9). A few flies in Figs 4–5 were aged 3 days, and a few flies in Figs 6–9 were aged 2 days.
Gluing was performed the day before an experiment. First, a fly was cold-anaesthetized and moved to a metal block cooled by ice water and covered with a damp low-lint wipe. The fly was immobilized with two glass slides placed at its sides. A small drop of glue (KOA-300, Poly-Lite, York, PA, USA) was mouth-pipetted onto the antenna(e). To immobilize the a1–a2 joint, we placed flies ventral-side down on the metal block, and we used glue to attach a2 to the head. We ensured that the glue did not change the antenna's resting position. To immobilize the a2–a3 joint, we placed flies dorsal-side down, and the glue drop was placed on the medial side of the antenna over the joint. The glue was cured with 3–5 s of UV light (LED-100 or LED-200, Electro-Lite, Bethel, CT, USA; held ∼1 cm from the fly). ‘Sham-glued’ flies underwent the same steps except that no glue was placed anywhere on the fly.
Flies were tethered immediately before an experiment. Flies were immobilized on a cool block as described above. A third glass slide was placed, like a bridge, over the two lateral slides and the abdomen. A drop of glue was placed on the end of a tungsten wire tether, and the wire was lowered onto the thorax with a micromanipulator. The glue was UV-cured as described above. Next, bilateral drops of glue were used to attach the posterior-lateral eye to the thorax and again cured as above.
Tethered walking experiments
The room had a temperature ranging from 21.3 to 22.9°C with a mean of 22.3°C. The humidity ranged from 21 to 51% with a mean of 30%. Most experiments were started 0–6 h before the light→dark transition of the fly's light:dark cycle, but occasionally experiments were started up to 8 h before or 5 h after this time. The fly was lowered onto the spherical treadmill using a micromanipulator attached to the tether. Three cameras with zoom lenses (anterior, dorsal and lateral views) were used to align the center of the fly's thorax with the center of the ball and to adjust the fly's height from the ball. The cameras were one of two USB 2.0 models: FMVU-03MTM-CS or FMVU-13S2C-CS (FLIR Integrated Imaging Solutions Inc., Richmond, BC, Canada). The lenses were also one of two models: MLM3X-MP (Computar, Cary, NC, USA) or JZ1169M mold (SPACECOM, Whittier, CA, USA). The MLM3X-MP lens was mounted to the camera with a 5 mm spacer. The JZ1169M mold was mounted with a 10 mm spacer to increase magnification. The fly was then left to habituate for ∼30 min. The fine alignment of the fly was sometimes adjusted during this period to reduce systematic biases in walking direction.
During an experiment, stimuli were presented in a block design: the order of stimuli within the block was random, and within a block each stimulus condition was presented the same number of times. Stimuli with the same waveform but delivered from a different sound source location were treated as different stimulus conditions. The block size used was either two times (Figs 2, 6–8) or four times (Figs 2–5, 9) the number of different stimulus conditions used in an experiment. Some experiments were run with a ‘no stimulus’ condition (all except Fig. 6 and part of Fig. 7), but the data for the no stimulus condition are shown only in Figs 2–5. Each trial began with 2 s of silence, followed by the stimulus, and concluded with another 2 s of silence. Trials were separated by a variable interval (∼10–20 s).
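The block design described above can be sketched as follows. This is an illustrative Python sketch, not the acquisition code used in the experiments; the condition labels, the function name `make_block` and the fixed random seed are hypothetical.

```python
import random

def make_block(conditions, repeats, rng):
    """Return one block: every condition appears `repeats` times, in random order."""
    block = [c for c in conditions for _ in range(repeats)]
    rng.shuffle(block)
    return block

# Hypothetical condition labels: the same waveform from two speaker
# positions counts as two conditions, plus a 'no stimulus' condition.
conditions = ["pips_+45deg", "pips_-45deg", "no_stimulus"]

rng = random.Random(0)  # fixed seed only so that the sketch is reproducible
block = make_block(conditions, repeats=4, rng=rng)  # block size = 4 x 3 = 12 trials
```

With `repeats=2` the same function gives the smaller block size (two times the number of conditions).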
Each experiment was run for as long as possible (maximum=9.5 h). For each fly and each stimulus, at least 56 ‘accepted’ trials were acquired, where accepted trials were defined as trials in which the resultant velocity was above a threshold of 10 mm s−1 but not saturated (see below). The maximum number of accepted trials per stimulus was 884 and the mean was 296. Mean forward velocity during the pre-stimulus period for most trials was generally ≥10 mm s−1. Experiments were performed on 59 flies, and 49 were included in analyses; four experiments were stopped because the resultant velocity never consistently reached threshold (10 mm s−1), three were stopped because the ball of the spherical treadmill got stuck before sufficient trials were acquired, and three were excluded post hoc because the flies did not consistently run at or above 10 mm s−1 for any portion of the experiment.
Spherical treadmill apparatus
A hollow plastic ball (0.25 inches=0.635 cm diameter) was held in a plenum chamber, supported by a cushion of air under positive pressure, and an optical sensor was positioned below the ball. Spacers 0.125 inches (=0.3175 cm) thick were placed between the sensor lens and the plenum so that the reference surface of the lens was ∼0.125 inches from the ball surface. Data were acquired using an ADNS-9800 High-Performance LaserStream™ Gaming Sensor (Avago, San Jose, CA, USA) and breakout board (JACK Enterprises, Cookeville, TN, USA). An Arduino Due read data from the sensor and sent a digital output to a USB-6343 DAQ (National Instruments). The Arduino-clockwork library (https://github.com/UniTN-Mechatronics/arduino-clockwork) was used to ensure data were read every 10 ms. The sensor outputs x and y velocities with a resolution of 0.31 mm s−1 (Configuration_1 register was set to 8200 counts per inch). The Arduino sent an 8-bit signal to the DAQ for each axis and so velocity was saturated at ±39.37 mm s−1. For later experiments, to reduce saturation, the output velocity range of the Arduino was shifted so that the output range was −15.5 to 63.24 mm s−1. The sensor was factory-calibrated.
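The quoted velocity resolution and saturation limits follow from the sensor settings, as the short Python check below illustrates. The 77-count offset used for the shifted range is not stated in the text; it is inferred here from the quoted limits and should be read as an assumption.

```python
MM_PER_INCH = 25.4
COUNTS_PER_INCH = 8200     # Configuration_1 setting given in the text
SAMPLE_INTERVAL_S = 0.010  # sensor read every 10 ms

# Velocity represented by one count per 10 ms sample (mm/s).
resolution_exact = MM_PER_INCH / COUNTS_PER_INCH / SAMPLE_INTERVAL_S
resolution = round(resolution_exact, 2)  # rounds to the quoted 0.31 mm/s

def output_range(min_count, max_count, res=resolution):
    """Velocity limits (mm/s) for a given range of integer counts."""
    return (round(min_count * res, 2), round(max_count * res, 2))

# Symmetric 8-bit signed range used in the earlier experiments.
symmetric = output_range(-127, 127)

# ASSUMPTION: the later experiments shifted the same span by 77 counts,
# which reproduces the quoted limits of -15.5 to 63.24 mm/s.
SHIFT_COUNTS = 77
shifted = output_range(-127 + SHIFT_COUNTS, 127 + SHIFT_COUNTS)
```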
Yaw velocity was not measured because the sensor was placed under the ball; the sensor only measured forward velocity (pitch) and lateral velocity (roll). Orienting behaviors can be measured by monitoring roll (Gaudry et al., 2013) because roll and yaw are correlated; however, because we did not measure yaw, we underestimated the magnitude of turns.
The apparatus and speakers were all contained in a sound-absorbing, light-proof box (Lehnert et al., 2013). The floor consisted of a smooth surface with an optomechanical breadboard. There were no light sources except for the laser used by the ball motion sensor (λ=832–865 nm), which is outside the visible range for D. melanogaster (Salcedo et al., 1999).
Design of sound stimuli
Every figure contains data obtained with a stimulus consisting of 10 pips with a carrier frequency of 225 Hz. Each pip lasted for ∼15 ms (adjusted slightly to be a multiple of half a wavelength). The duration between pip onsets was 34 ms. The amplitude envelope was cosine-shaped with a wavelength equal to the duration of the pip.
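A pip train with these parameters could be synthesized along the following lines. The stimuli were generated in MATLAB; the Python sketch below is only an illustration. It assumes that the cosine-shaped envelope is a raised cosine spanning one pip, and that the pip duration is the multiple of a half carrier period nearest to 15 ms. All function names are hypothetical.

```python
import math

FS = 40_000               # sampling rate (Hz), from the text
CARRIER_HZ = 225.0        # carrier frequency (Hz)
N_PIPS = 10
ONSET_INTERVAL_S = 0.034  # time between pip onsets
NOMINAL_PIP_S = 0.015     # nominal pip duration

def pip_duration(carrier_hz, nominal_s=NOMINAL_PIP_S):
    """Nearest multiple of half a carrier period to the nominal duration."""
    half_period = 1.0 / (2.0 * carrier_hz)
    n_half = max(1, round(nominal_s / half_period))
    return n_half * half_period

def make_pip(carrier_hz, fs=FS):
    """One pip: sine carrier under a raised-cosine envelope (ASSUMED shape)."""
    dur = pip_duration(carrier_hz)
    n = round(dur * fs)
    pip = []
    for i in range(n):
        t = i / fs
        envelope = 0.5 * (1.0 - math.cos(2.0 * math.pi * t / dur))
        pip.append(envelope * math.sin(2.0 * math.pi * carrier_hz * t))
    return pip

def make_pip_train(carrier_hz=CARRIER_HZ, n_pips=N_PIPS, fs=FS):
    """Place n_pips identical pips at fixed onset intervals."""
    pip = make_pip(carrier_hz, fs)
    step = round(ONSET_INTERVAL_S * fs)
    train = [0.0] * (step * (n_pips - 1) + len(pip))
    for k in range(n_pips):
        start = k * step
        train[start:start + len(pip)] = pip
    return train

train = make_pip_train()
duration_s = len(train) / FS  # close to the 0.322 s total duration given for Fig. 8
```

Under these assumptions a 225 Hz pip spans seven half-periods (about 15.6 ms), and nine onset intervals plus one pip give a total duration of about 0.322 s.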
For the experiments shown in Fig. 8, additional stimuli were used. Pips were identical to those in other figures except the carrier frequency was 100, 140, 300 or 800 Hz, in addition to 225 Hz. Sustained tone stimuli were delivered at the same frequencies; these had the same total duration as the pip trains (0.322 s) and were also modulated by a cosine-shaped envelope, with a wavelength equaling the duration of a pip from the pip stimulus, so that the tones and pips had the same onset and offset profile. All stimuli were synthesized using MATLAB 2017a and sampled at 40 kHz.
Sound delivery and sound intensity measurements
Sound stimuli were delivered from four speakers (ScanSpeak Discovery 10F/4424G00, 89.5 mm diameter) placed 22 cm from the fly, centered on the horizontal plane of the fly. Stimuli were only delivered from one speaker at a time. These speakers were able to produce the frequencies we used with minimal distortions (Fig. S1). Speakers were driven by either a Crown D-45 amplifier (HARMAN Professional Solutions, Northridge, CA, USA) or an SLA-1 amplifier (Applied Research and Technology, Niagara Falls, NY, USA). During calibration, each speaker was driven by the same amplifier and channel that was used during experiments.
All stimuli were calibrated to produce a peak particle velocity of 1.25 mm s−1 at the fly (88 dB SVL), verified for all speakers and all carrier frequencies (Fig. S1). This value was chosen because it is close to the intensity of male song experienced by females during courtship (Bennet-Clark, 1971; Morley et al., 2018). Sound intensity at the fly's location was measured using a particle velocity microphone (Knowles Electronics NR-23158) and pre-amplifier (Stanford Research Systems SR560) as described by Lehnert et al. (2013). The pre-amplifier amplified (500× gain) and band-pass filtered the signal (6 dB/octave roll-off, 3 Hz and 30 kHz cut-offs). Sound intensity was measured in the same box in which behavioral experiments were performed. The particle velocity microphone was placed in the same position, relative to the speaker, as the fly during behavioral experiments. The front face of the particle velocity microphone was parallel to the front face of the speaker. Data from 10 trials were averaged and the pre-stimulus mean was subtracted. The data were then integrated and high-pass filtered with a 10 Hz cut-off. Peak particle velocity was estimated by taking the mean of the peak from several sound cycles. Finally, the command voltage waveform for each stimulus was adjusted (by rescaling the digital command waveform) until the measured peak particle velocity was within 10% of 1.25 mm s−1 for all stimuli. This was necessary to compensate for the frequency characteristics of the speakers.
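The correspondence between 1.25 mm s−1 and 88 dB SVL, and the 10% calibration criterion, can be checked with a few lines of Python. The reference velocity of 5×10−8 m s−1 is the conventional reference for sound velocity level and is not stated in the text. The rescaling helper assumes that speaker output scales linearly with command amplitude. Both are assumptions of this sketch, and the function names are hypothetical.

```python
import math

V_REF_M_PER_S = 5e-8     # ASSUMPTION: conventional SVL reference velocity
TARGET_MM_PER_S = 1.25   # target peak particle velocity, from the text

def db_svl(peak_velocity_mm_s):
    """Sound velocity level in dB re 5e-8 m/s."""
    return 20.0 * math.log10(peak_velocity_mm_s * 1e-3 / V_REF_M_PER_S)

def within_tolerance(measured_mm_s, target=TARGET_MM_PER_S, tol=0.10):
    """True if the measured peak velocity is within 10% of the target."""
    return abs(measured_mm_s - target) <= tol * target

def rescale_command(scale, measured_mm_s, target=TARGET_MM_PER_S):
    """New command scale factor, ASSUMING output is linear in the command."""
    return scale * target / measured_mm_s

level = db_svl(TARGET_MM_PER_S)  # approximately 88 dB SVL
```

In practice the measure-and-rescale step would be repeated for each speaker and carrier frequency until `within_tolerance` holds.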
Data were analyzed offline using custom routines in MATLAB 2016b. Raw velocities were processed by first converting each of the 8-bit binary vectors, output by the Arduino, to signed integers in units of mm s−1. A mode filter was used to remove errors caused by asynchronous updating of the Arduino's digital output channels. Velocities were integrated to obtain x and y displacements. The x and y displacements were set to 0 at the start of the sound stimulus. The data were then downsampled from 40 kHz to 100 Hz.
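The processing steps listed above might look like the following in outline. The analysis itself was done in MATLAB; this Python sketch is only illustrative. The mode-filter window length and the use of simple decimation for downsampling are assumptions, as the text does not specify them, and the function names are hypothetical.

```python
from collections import Counter

def mode_filter(samples, window=5):
    """Replace each sample by the most common value in a sliding window.

    The window length is an ASSUMPTION; it only needs to be longer than
    the brief glitches produced by asynchronous channel updates.
    """
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        out.append(Counter(samples[lo:hi]).most_common(1)[0][0])
    return out

def integrate(velocity, dt):
    """Cumulative sum of velocity * dt, giving displacement."""
    displacement, total = [], 0.0
    for v in velocity:
        total += v * dt
        displacement.append(total)
    return displacement

def zero_at(displacement, onset_index):
    """Shift displacement so that it equals 0 at stimulus onset."""
    ref = displacement[onset_index]
    return [d - ref for d in displacement]

def downsample(samples, factor=400):
    """Keep every `factor`-th sample (40 kHz -> 100 Hz when factor is 400)."""
    return samples[::factor]

# Hypothetical example: a single-sample glitch (9) in a constant signal.
cleaned = mode_filter([3, 3, 3, 9, 3, 3, 3])
position = zero_at(integrate([10.0, 10.0, 10.0], dt=0.01), onset_index=1)
```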
Trials were excluded from further analysis if the mean pre-stimulus resultant velocity was below threshold (10 mm s−1; Fig. S2). For most flies, 7 to 32% of trials were excluded for this reason. For two flies, ∼70% of trials were excluded because these flies ran consistently at the beginning of the experiment but stopped running consistently later in the experiment. Trials were also excluded if the velocity exceeded the maximum output from the Arduino: 0.1 to 14% of trials (mean=4%) were excluded for this reason. In the figures, only a portion of the 2 s pre-stimulus period is plotted; movement outside the plotted period affects the mean pre-stimulus resultant velocity that was used to select trials.
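The two exclusion criteria can be expressed as a simple acceptance test. The Python sketch below assumes that resultant velocity is the magnitude of the forward and lateral velocity vector, averaged over the pre-stimulus samples, and that a trial counts as saturated if any sample reaches an output limit. The function and argument names are hypothetical.

```python
import math

THRESHOLD_MM_S = 10.0  # minimum mean pre-stimulus resultant velocity

def mean_resultant_speed(forward, lateral):
    """Mean magnitude of the (forward, lateral) velocity vector (ASSUMED definition)."""
    speeds = [math.hypot(f, l) for f, l in zip(forward, lateral)]
    return sum(speeds) / len(speeds)

def accept_trial(pre_forward, pre_lateral, all_forward, all_lateral, v_min, v_max):
    """Accept a trial if the fly ran fast enough and no sample was saturated."""
    fast_enough = mean_resultant_speed(pre_forward, pre_lateral) >= THRESHOLD_MM_S
    saturated = any(v <= v_min or v >= v_max for v in all_forward + all_lateral)
    return fast_enough and not saturated

# Hypothetical trials, using the symmetric output limits of +/-39.37 mm/s.
ok = accept_trial([12.0, 12.0], [0.0, 0.0], [12.0, 12.0, 13.0], [0.0, 0.0, 1.0], -39.37, 39.37)
slow = accept_trial([4.0, 5.0], [0.0, 0.0], [4.0, 5.0, 6.0], [0.0, 0.0, 0.0], -39.37, 39.37)
clipped = accept_trial([12.0, 12.0], [0.0, 0.0], [12.0, 39.37], [0.0, 0.0], -39.37, 39.37)
```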
Measurements were corrected so that the mean pre-stimulus running direction was straight ahead. Specifically, for each trial, the median x and y displacements during the pre-stimulus period of the surrounding 50 trials were calculated to obtain a ‘median trajectory’. The mean x–y displacement of this ‘median trajectory’ was then calculated, and the angle between this trajectory and a straight-ahead trajectory was measured. This angle was then used to rotate the x and y displacements for that trial. This same angle was also used to rotate the x and y velocities for that trial.
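The heading correction amounts to estimating a bias angle from neighboring trials and applying a planar rotation. The Python sketch below is a hedged illustration rather than the original routine. It assumes that x is lateral and y is forward, and that pre-stimulus displacements are expressed so that forward travel is along +y. The function names are hypothetical.

```python
import math
from statistics import mean, median

def median_trajectory(trials_x, trials_y):
    """Median across surrounding trials at each pre-stimulus time point."""
    med_x = [median(col) for col in zip(*trials_x)]
    med_y = [median(col) for col in zip(*trials_y)]
    return med_x, med_y

def bias_angle(med_x, med_y):
    """Angle (rad) between the mean median-trajectory vector and straight ahead (+y)."""
    return math.atan2(mean(med_x), mean(med_y))

def rotate(xs, ys, angle):
    """Rotate points so that a heading at `angle` from +y maps onto +y."""
    c, s = math.cos(angle), math.sin(angle)
    new_x = [x * c - y * s for x, y in zip(xs, ys)]
    new_y = [x * s + y * c for x, y in zip(xs, ys)]
    return new_x, new_y

# Hypothetical example: three neighboring trials drifting 45 deg to the right.
trials_x = [[0.0, 1.0, 2.0]] * 3
trials_y = [[0.0, 1.0, 2.0]] * 3
angle = bias_angle(*median_trajectory(trials_x, trials_y))
corrected_x, corrected_y = rotate([1.0], [1.0], angle)
```

The same `angle` would be passed to `rotate` for both the displacements and the velocities of a trial.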
To summarize each experiment in stripchart format, we measured lateral velocity and the decrease in forward velocity at two specific time points. Namely, lateral velocity was measured at stimulus offset. The decrease in forward velocity was computed as the forward velocity just before stimulus onset, minus the forward velocity 120 ms after stimulus onset. In trials where no stimulus was delivered, these values were measured at the equivalent time point within the trial epoch.
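The two summary measures reduce to indexing the velocity traces at fixed time points. In the Python sketch below, the 100 Hz sample rate follows from the downsampling step, while taking the sample immediately preceding onset as "just before stimulus onset" is an assumption. The function names and the example trace are hypothetical.

```python
SAMPLE_RATE_HZ = 100  # data rate after downsampling

def lateral_velocity_at_offset(lateral, offset_index):
    """Lateral velocity at the sample corresponding to stimulus offset."""
    return lateral[offset_index]

def forward_velocity_decrease(forward, onset_index, delay_s=0.120, fs=SAMPLE_RATE_HZ):
    """Forward velocity just before onset minus forward velocity 120 ms after onset."""
    after_index = onset_index + round(delay_s * fs)
    # ASSUMPTION: 'just before onset' is the sample immediately preceding onset.
    return forward[onset_index - 1] - forward[after_index]

# Hypothetical trace: steady walking, a brief slowing after onset, then recovery.
forward = [15.0] * 200 + [5.0] * 20 + [15.0] * 100
decrease = forward_velocity_decrease(forward, onset_index=200)
```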
Statistical analyses were performed in MATLAB 2016b and R version 3.5.1. For Figs 2E, 5C and 7F, paired two-sided t-tests were performed. For Fig. 7E, a two-sided one-sample t-test was performed on the combined data for the two cardinal speakers (90 and −90 deg).
For Fig. 4C,D, a Welch's ANOVA for unequal variances was performed using the oneway function in the lattice package (version 0.20-35) in R. This test showed there were significant differences between the three conditions for both lateral and forward velocity (P<0.005 in both cases). Given the significant differences, the Games–Howell post hoc test was run using the userfriendlyscience package (version 0.7.1) in R.
For Figs 6C,D and 7H, a linear mixed model was used to model the data, with speaker angle as a fixed effect and fly identity as a random effect (to take account of the repeated-measures design). The linear mixed model was implemented with the lme function in the nlme package (version 3.1-13). To test whether speaker angle had an effect on velocity, we compared this model with a baseline model with the angle fixed effect removed. The equations of the two models are written as follows in R:
baseline_model <- lme(velocity ~ 1, random = ~1 | fly, data = my_data, method = "ML")
augmented_model <- lme(velocity ~ angle, random = ~1 | fly, data = my_data, method = "ML")
We performed a likelihood ratio test to determine whether the augmented model was significantly better than the baseline model (implemented with the built-in anova function). Given that the augmented model was significantly better, we performed post hoc Tukey tests to determine which speaker angles had significantly different effects (implemented with the multcomp package version 1.4-8).
To analyze the data shown in Fig. 7G, we used the same linear mixed-model approach except that an extra term was added to both the baseline model and the augmented model. This extra term allows the estimated variances to be different for the data for each speaker angle. The models with this additional term are written in R as:
baseline_model <- lme(velocity ~ 1, random = ~1 | fly, weights = varIdent(form = ~1 | angle), data = my_data, method = "ML")
augmented_model <- lme(velocity ~ angle, random = ~1 | fly, weights = varIdent(form = ~1 | angle), data = my_data, method = "ML")
Lateralized sounds elicit phonotaxis as well as acoustic startle
To determine whether walking D. melanogaster alter their locomotor behavior in response to sounds, we placed tethered flies on a spherical treadmill (Fig. 2A), and we delivered sounds from azimuthal angles of 45 and −45 deg (Fig. 2B). In each trial, we delivered a sound waveform from one of the two speakers; sounds consisted of 10 pips at 225 Hz, with an inter-pip interval of 34 ms (Fig. 2C). This stimulus was designed to approximate the pulse song of male D. melanogaster. Conspecific song is a sound likely to be encountered by both females (which we focused on initially) and males (which we tested in separate experiments described later). The sound intensity (peak particle velocity) was 1.25 mm s−1 (88 dB SVL) at the fly's location, which is comparable to the intensity of natural courtship song according to classic theoretical predictions (80–95 dB SVL; Bennet-Clark, 1971) and recent measurements (88–99 dB SVL, median sound level for sine and pulse song, respectively; Morley et al., 2018).
In a cohort of 19 female flies, all individuals turned toward these sound stimuli (Fig. 2D–F). Turning was detectable as early as the second or third pip, and it persisted throughout the pip train. After the offset of the pip train, some flies simply returned to walking straight, while others executed a compensatory turn that partially cancelled their deviation from their initial path. For example, in a fly that had turned toward a sound on the right, sound offset often elicited a left turn (Fig. 2G). This compensatory behavior was observed in some but not all flies.
Behavioral responses were variable from trial to trial (Fig. 3, Fig. S3). In some trials, flies did not turn, and on rare occasions they even turned in the ‘wrong’ direction. Overall, however, lateral velocities were clearly shifted in the direction of the stimulus.
In addition to turning in response to sound, flies also tended to stop walking briefly after sound onset (Fig. 3B). In trial-averaged data, this appears as a decrease in forward velocity (Fig. 2D). This decrease in forward velocity can be dissociated from turning (Fig. S4), and so it is not a mere by-product of turning. In many individual trials, flies briefly stopped walking just after sound onset, and then resumed walking, often turning toward the sound as walking resumed (Fig. 3B). We interpret the initial pause as an ‘acoustic startle’ behavior (Lehnert et al., 2013; Menda et al., 2011). The subsequent turn we call ‘phonotaxis’.
Phonotaxis requires vibration of the distal antennal segment
Next, we examined the role of antennal movement in phonotaxis. The antenna has two mobile joints, a distal joint (a3–a2) and a proximal joint (a2–a1). The distal joint vibrates freely in response to sound and transmits these vibrations to Johnston's organ neurons (Göpfert and Robert, 2002). The proximal joint does not vibrate in response to sound; however, muscular control of the proximal joint can indirectly affect sound-induced vibrations of the distal joint. For example, a flying fly can use its antennal muscles to position an antenna so that it is more sensitive to the sound of the ipsilateral wing, thereby increasing the vibrational response of the antenna to that wing's beating rhythm (Mamiya et al., 2011). More relevant to our experiments, a fly can also use its antennal muscles to change the angle of the antenna relative to the sound source (Mamiya et al., 2011), and this could alter sound-evoked antennal vibrations even if the antenna is not positioned appreciably closer to the sound source.
Therefore, we sought to test how both joints contribute to the auditory behaviors we were studying. We divided sibling flies into three groups. In the first two groups, we used drops of glue to bilaterally immobilize the proximal joint or the distal joint (Fig. 4A). In the last group, we handled and cold-anesthetized the flies just as in the other two groups, but we did not immobilize the antennae (‘sham-glued’).
We found that eliminating voluntary movements of the antennae had no effect on sound-induced turning (Fig. 4A–C). It also had no effect on acoustic startle (Fig. 4A,D). Thus, neither behavior requires muscular control of the antennae.
By contrast, eliminating sound-evoked vibrations of the distal antennal segment completely abolished sound-evoked turning (Fig. 4A–C). It also abolished acoustic startle (Fig. 4A,D). These results indicate that both behaviors are responses to sound-evoked vibrations of the distal antennal segment, and not responses to sound-evoked vibrations of the spherical treadmill (given that insects also have vibration sensors in their legs and/or tarsi; Fabre et al., 2012; Michelsen, 1992).
Turning is contralateral to the antenna with larger vibrations
When a sound is in the front hemifield, it is the contralateral antenna that vibrates more (Morley et al., 2012). For example, a speaker at 45 deg should produce larger vibrations in the left antenna; conversely, a speaker at −45 deg should produce larger vibrations in the right antenna (Fig. 1D). We therefore hypothesized that the nervous system compares vibration amplitudes at the two antennae, and steers away from the antenna with the larger amplitude.
To test this idea, we asked what happens when the speaker is placed directly in front of the fly, but only one antenna is allowed to vibrate. We eliminated vibrations in the other antenna by immobilizing the distal antennal joint. Under these conditions, we found that every fly steered away from the intact antenna (Fig. 5). This result supports the hypothesis that the fly turns away from the antenna with the larger vibration amplitude.
As an aside, we note that this rule – turning away from the antenna with the larger vibration amplitude – also occurs in flying D. melanogaster (Mamiya et al., 2011). However, in that case, the sound source is not an object in the external environment, but the fly's own wing. When the antennae ‘hear’ that the two wings are beating with asymmetric amplitudes, this drives a reflex that amplifies the wingbeat amplitude on the side where it is already larger. The proposed function of this reflex is to reinforce the fly's own ongoing turning maneuver in flight.
Lateralized sounds arriving from the back elicit negative phonotaxis
We next asked what happens when sounds originate from behind the fly. We placed two speakers in the back hemifield, at 135 and −135 deg. These are the two positions in the back hemifield where auditory sensitivity is highest (Morley et al., 2012) and they are the two positions predominantly occupied by the wing of a singing male in the coordinate frame of a courted female (Morley et al., 2018). For comparison, we also placed two speakers in the front hemifield (at 45 and −45 deg).
We found that sounds arriving from the front-right and back-left elicited indistinguishable right turns (45 and −135 deg; Fig. 6A–C). Conversely, sounds arriving from the front-left and back-right elicited indistinguishable left turns (−45 and 135 deg). In other words, sounds in the front hemifield elicited positive phonotaxis, whereas sounds in the back hemifield elicited negative phonotaxis. All four stimuli elicited similar acoustic startle responses (Fig. 6A,D), suggesting that all four stimuli had similar perceived intensity.
This pattern of phonotaxis fits with the antenna's vibration amplitude tuning (Morley et al., 2012, 2018). Antennal vibration amplitudes should not change when the sound source moves from 45 to −135 deg (Fig. 6E). The finding that these two speaker positions elicit the same phonotaxis behavior is therefore evidence that phonotaxis depends on vibration amplitude cues alone.
It should be noted that vibration amplitude cues are not the only cues available for phonotaxis. When the speaker position moves from 45 to −135 deg, the direction (phase) of all antennal movements should be inverted (Fig. 6F), and in principle, this could have inverted the fly's behavior. Specifically, we might imagine a rule whereby the fly steers toward the first detectable vibration in either antenna. If the first detectable vibration was rightward for a speaker positioned at 45 deg, then the first detectable vibration would be leftward for a speaker positioned at −135 deg playing the same sound waveform. Our results imply that phonotaxis is not guided by this type of vibration-direction rule. Phonotaxis can be most parsimoniously explained by vibration amplitude cues alone. That said, Drosophila might rely on vibration-direction cues in other contexts, given that there are neurons in the brain that keep track of vibration direction (phase) information (Azevedo and Wilson, 2017).
Sounds from any of the four cardinal directions elicit no phonotaxis
Next, we tried delivering the same sounds from speakers at 90 or −90 deg. We found that both of these speaker locations elicited no phonotaxis (Fig. 7A,B). Importantly, flies were not deaf to these speaker locations, because both stimuli elicited acoustic startle behavior that was similar to the acoustic startle elicited by speakers at 45 and −45 deg, measured in the same flies.
We noticed that the speakers at 90 and −90 deg often evoked small turns, but a given fly typically made small turns in the same direction in both cases. For example, flies 1 and 2 made right turns to both the 90 deg stimulus and the −90 deg stimulus; these flies were also biased rightward in general (i.e. the turn towards the 45 deg stimulus was larger than the turn towards the −45 deg stimulus; Fig. 7A,B). It seems likely that the small turns in response to the 90 and −90 deg stimuli were due to some idiosyncratic ‘handedness’ in each fly, either biological handedness (Buchanan et al., 2015) or else a slight artifactual asymmetry in the way the fly interacted with the spherical treadmill apparatus. When a fly resumed walking after an acoustic startle response, this handedness evidently produced a small nonspecific bias in its walking behavior. The key point here is that no fly turned in opposite directions in response to the 90 and −90 deg stimuli; thus, turning was not guided by the position of the stimulus, meaning it was not phonotaxis.
In a separate set of flies, we compared responses to speakers at 0, 45, 90, and 180 deg. As expected, flies always turned toward the 45 deg speaker. By contrast, speakers at 0, 90, and 180 deg did not elicit consistent turning. All three of the latter stimuli elicited either straight walking or small idiosyncratic turns, with a given fly typically making these idiosyncratic turns in the same direction for all three stimuli (Fig. 7C,D).
In summary, we found no phonotactic response to stimuli arriving from any of the cardinal directions (90, −90, 0 and 180 deg); however, all these stimuli elicited an acoustic startle response, confirming that they are all audible (Fig. 7E–H). Why do flies not phonotax in response to sounds arriving from 90 or −90 deg? A speaker at 90 deg should elicit equal left–right vibration amplitudes; the same should be true for any stimulus arriving from a cardinal direction. What distinguishes these four stimuli is the direction (phase) of antennal vibrations. For example, speakers at 0 and 180 deg should cause the antennae to move toward the midline at the same phase of the sound cycle. By contrast, speakers at 90 or −90 deg should cause the antennae to move toward the midline at opposite phases of the sound cycle (Fig. 7I). Which antenna initially moves towards the midline will depend on whether the speaker is at 90 or −90 deg. Our results indicate that none of these phase differences matter for the behaviors we measured in our experiments. All that seems to matter is the amplitude of antennal vibration, and if left–right amplitudes are equal, there is no systematic tendency to turn relative to the sound source location.
Phonotaxis generalizes to sounds with diverse spectro-temporal features
Thus far, we have used a train of sound pips with a fixed carrier frequency (225 Hz). We initially selected this frequency because it is close to the dominant frequencies in D. melanogaster pulse song (Murthy, 2010). However, phonotaxis might have relevance for other situations beyond courtship. This idea motivated us to test a wider range of sound carrier frequencies (100, 140, 225, 300 and 800 Hz; Fig. 8A). In the same experiments, we also tried varying the temporal structure of the sound stimulus: in addition to delivering pips, we delivered sustained tones (322 ms in duration, the same duration as the pip trains; Fig. 8A). We used a particle velocity microphone to verify that all stimuli had the same intensity at the fly's location. All stimuli were delivered from two speaker positions, 45 and −45 deg.
We observed phonotaxis behavior in response to both pip trains and sustained tones, at every carrier frequency we tested (Fig. 8B–D). Every carrier frequency also elicited acoustic startle behavior (Fig. 8B,E). In general, behavioral responses were similar for all carrier frequencies. The one clear exception was 800 Hz, which evoked weaker responses.
Both males and females display phonotaxis
Courtship is one potential natural situation in which phonotaxis would be relevant. In the context of courtship, females listen to male song (Hall, 1994), but males may also listen to the songs of nearby males (Boekhoff-Falk and Eberl, 2014; Tauber and Eberl, 2002). This motivated us to compare the behavior of males and females.
We returned to our standard sound stimulus (10 pips at 225 Hz, with an inter-pip interval of 34 ms), and we again positioned speakers at 45 and −45 deg. We found that phonotaxis was similar in males and females (Fig. 9A–C), as was acoustic startle behavior (Fig. 9A,D). Thus, these behaviors are not sex specific.
Hearing with one ear
If one ear is transiently plugged in a normal human subject, the subject will consistently mislocalize sounds to the side of the intact ear (Middlebrooks, 2015). The same occurs in crickets and grasshoppers (Moiseff et al., 1978; Ronacher et al., 1986). Crickets and grasshoppers, like humans, have tympanal ears. The tympanum closer to the sound vibrates with larger amplitude and/or leading phase. Thus, in order to orient toward a sound, humans and crickets should turn toward the ear with the larger (and/or leading) response. When one ear is blocked, this rule produces turning toward the intact ear.
By contrast, in unilaterally deafened D. melanogaster, we observed the opposite reaction: flies turned away from the intact side. This tells us that D. melanogaster use a flipped rule: they turn away from the auditory organ with the larger response. This rule makes sense for D. melanogaster, because they have flagellar rather than tympanal auditory organs, and each flagellar auditory organ is optimally stimulated by sound sources in the contralateral front hemifield (Morley et al., 2012, 2018). In short, the flip in auditory mechanics likely explains the flipped outcome of the unilateral deafening experiment.
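The contrast between the two steering rules can be made concrete with a minimal Python sketch. The function name, the rule labels and the response values (1.0 for an intact organ, 0.0 for a silenced one) are illustrative assumptions, not quantities measured in this study.

```python
def turn_direction(left_response, right_response, rule):
    """Return 'left', 'right' or 'none' for a pair of auditory responses.

    rule is a hypothetical label:
      'toward_larger'    -- tympanal-style rule (humans, crickets)
      'away_from_larger' -- rule proposed here for D. melanogaster
    """
    if left_response == right_response:
        return "none"
    larger_side = "left" if left_response > right_response else "right"
    if rule == "toward_larger":
        return larger_side
    if rule == "away_from_larger":
        return "right" if larger_side == "left" else "left"
    raise ValueError(f"unknown rule: {rule}")


# Right organ silenced (assumed response 0.0), left organ intact (assumed 1.0).
tympanal_turn = turn_direction(1.0, 0.0, "toward_larger")
fly_turn = turn_direction(1.0, 0.0, "away_from_larger")
print(tympanal_turn, fly_turn)
```

With the right organ silenced, the tympanal-style rule yields a turn toward the intact (left) side, whereas the flipped rule yields a turn away from it, which is the pattern described above.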
Ambiguities in binaural cues
Even when both ears are functional, it is still possible to find systematic errors in sound localization. For example, in vertebrates, every azimuthal location in the front hemifield maps onto another location in the back hemifield that elicits the same inter-aural cues. This can cause front–back ambiguities in perception when the stimulus is a low-frequency pure tone (Rayleigh, 1876; Schnupp et al., 2011).
In D. melanogaster, we would not expect to find front–back ambiguity, because the relevant cues are not symmetric in the front and back. Instead, we would predict a different type of ambiguity. Each antenna has a vibration amplitude tuning curve that is symmetrical about both azimuthal diagonals. Thus, any pair of antennal vibration amplitudes maps onto a set of azimuthal locations that are reflections across the diagonals (Fig. 1D). For example, the same pair of antennal vibration amplitudes maps to 45 deg and also to its reflection across the diagonals (−135 deg), and accordingly, we found indistinguishable behavioral responses to these two stimulus locations. Similarly, 0 deg reflects to 90 deg (across one diagonal), −90 deg (across the other diagonal) and 180 deg (across both diagonals), and again we observed the same behavioral responses to all four of these speaker locations. In short, the behavioral ambiguities we found are just what we would predict from antennal vibration tuning curves (Fig. 1D). Therefore, our findings support the conclusion that antennal vibration amplitudes are the physical cues that specify phonotaxis behavior.
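These predictions can be reproduced with a short Python sketch. It assumes an idealized tuning curve that is not fitted to any data: each antenna's vibration amplitude is taken to be |cos| of the azimuth, peaking for sources in the contralateral front hemifield (45 deg for the left antenna, −45 deg for the right), with positive azimuth on the fly's right. The function names and the tolerance are hypothetical.

```python
import math


def antennal_amplitudes(azimuth_deg):
    """Return (left, right) vibration amplitudes, normalized to a peak of 1.

    Assumed idealization: |cos| tuning, symmetric about both diagonals,
    with each antenna driven best from the contralateral front hemifield.
    """
    theta = math.radians(azimuth_deg)
    left = abs(math.cos(theta - math.radians(45)))
    right = abs(math.cos(theta + math.radians(45)))
    return left, right


def predicted_turn(azimuth_deg, tol=1e-9):
    """Apply the rule 'steer away from the antenna with the larger amplitude'."""
    left, right = antennal_amplitudes(azimuth_deg)
    diff = left - right
    if abs(diff) < tol:  # equal amplitudes: no systematic turn
        return "none"
    return "right" if diff > 0 else "left"


for azimuth in (45, -135, -45, 135, 0, 90, -90, 180):
    print(azimuth, predicted_turn(azimuth))
```

Under these assumptions, 45 and −135 deg yield the same amplitude pair and the same rightward turn (toward the front speaker, away from the rear one), and all four cardinal directions yield equal amplitudes and no turn.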
Fine discrimination of nearby sound source locations
High-acuity discrimination of nearby sound source positions has been well documented in crickets. These insects can localize sound sources with an azimuthal precision close to 10 deg (Bailey and Thomson, 1977; Latimer and Lewis, 1986; Pollack, 1982). In this regard, the fly O. ochracea is a particular virtuoso: walking O. ochracea can orient toward sound sources with a precision as fine as 1 deg (Mason et al., 2001).
In the future, it will be interesting to investigate whether Drosophila can also discriminate between sound source locations separated by these small angles. However, successful phonotaxis should not require a fly to precisely identify a sound source location. When a walking insect encounters a lateralized sound, it can simply turn toward the sound until it is no longer lateralized (Bailey and Stephen, 1984). As long as the fly can detect small deviations from the midline in sound source position, it does not need to precisely localize the sound in order to approach it.
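The idea that iterative turning can substitute for precise localization can be sketched as a simple closed loop. The following Python example is a toy model only: the |cos| antennal tuning, the gain, the number of steps and the function names are all assumptions chosen for illustration, and the sketch is intended only for a source that starts in the front hemifield.

```python
import math


def amplitude_difference(azimuth_deg):
    """Left minus right antennal amplitude under an assumed |cos| tuning
    (left antenna peaks at 45 deg, right at -45 deg; positive = fly's right)."""
    theta = math.radians(azimuth_deg)
    left = abs(math.cos(theta - math.radians(45)))
    right = abs(math.cos(theta + math.radians(45)))
    return left - right


def simulate_approach(initial_azimuth_deg, gain_deg=20.0, n_steps=20):
    """Return the source azimuth (relative to the fly) after each turn.

    At every step the fly turns away from the antenna with the larger
    amplitude, by gain_deg per unit of amplitude difference (hypothetical).
    A rightward turn reduces the azimuth of a source on the right.
    """
    azimuth = initial_azimuth_deg
    trajectory = [azimuth]
    for _ in range(n_steps):
        turn = gain_deg * amplitude_difference(azimuth)  # > 0 means rightward
        azimuth -= turn
        trajectory.append(azimuth)
    return trajectory


trajectory = simulate_approach(30.0)
print([round(a, 2) for a in trajectory[:6]])
```

In this toy model the fly never computes the source angle; it only senses which antenna vibrates more, and the source azimuth still shrinks toward the midline with each turn.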
Sound versus wind
Drosophila sense the particle velocity component of a sound wave – i.e. the movement of air that accompanies each sound cycle (Göpfert and Robert, 2002; Robert and Hoy, 2007). Wind is also simply the movement of air. However, there are three key differences between sound and wind. First, air particle velocities are lower in sound: a female fly experiences an air speed on the order of 0.1 cm s−1 as she listens to a male courtship song (Bennet-Clark, 1971; Morley et al., 2018), whereas air speeds more than 100× larger are typical of atmospheric conditions in natural environments where Drosophila are active (Budick and Dickinson, 2006). Second, wind is a spectrally broadband non-harmonic stimulus, whereas sound stimuli are narrowband and typically harmonic (Robert and Hoy, 2007). Third, steady wind will produce a large sustained displacement of the antennae, whereas sound produces zero net displacement of the antennae. A corollary of this last point is that wind generates substantial bulk displacement of air particles, whereas sound does not.
These differences between sound and wind are evidently decisive, because walking Drosophila treat wind and sound differently. Here, we showed that walking D. melanogaster make systematic turns toward sounds. However, walking D. melanogaster do not make systematic turns upwind, except when odor is present (Álvarez-Salvado et al., 2018; Bell and Wilson, 2016; Steck et al., 2012). Thus, the fly's behavior clearly discriminates sound from wind. Johnston's organ neurons also discriminate sound from wind (Yorozu et al., 2009), although it should be noted that many Johnston's organ neurons (Mamiya and Dickinson, 2015; Patella and Wilson, 2018) and some central neurons (Chang et al., 2016) respond to both sound and wind.
Phonotaxis in courtship
During courtship, phonotaxis could help a male locate a female. Specifically, a male may turn toward the song of a competing male who is standing near a female (Boekhoff-Falk and Eberl, 2014). A male may also turn toward the sounds that females make during courtship (Ejima and Griffith, 2008; Ewing and Bennet-Clark, 1968).
In contrast, phonotaxis may cause females to turn away in response to male song. This is because a courting male is generally behind the female he is targeting (Hall, 1994; Morley et al., 2018). We find that the generic behavioral response to a sound in the back hemifield is to turn away from the sound. Turning away would fit the observed trend toward ‘female coyness’ during courtship: even virgin females typically display continual mild rejection behaviors in response to male pursuit (Hall, 1994). A male will often initiate song dozens of times before copulation begins (Zhang et al., 2016). Female coyness ensures that females mate only with males who are fit enough to maintain pursuit in the face of mild rejection. Ultimately, song tends to cause virgin females to slow down, but only if the male sings for a long time (Talyn and Dowse, 2004; Coen et al., 2014). In short, negative phonotaxis to sound sources in the back hemifield may not simply be a ‘perceptual error’: it may be an adaptive trait causing females to select fitter males.
Phonotaxis in exploration
D. melanogaster have a set of basic rules for exploring arbitrary visual objects. For example, one rule is to prioritize close visual objects over distant ones (Götz, 1994; Schuster, 1996; Schuster et al., 2002). Another rule is to orient toward visual objects in the front hemifield, while ignoring visual objects in the back hemifield (Horn and Wehner, 1975) or turning away from objects in the back hemifield (Mronz and Strauss, 2008). If objects behind the fly were not deprioritized, then the fly could become permanently ‘captured’ by any object it approached (Bülthoff et al., 1982). Thus, the ‘front-not-back’ rule promotes visual exploration, because it allows the fly to avoid recapture by an unrewarding object it has just turned away from.
Our results suggest a similarity between visually guided walking and sound-guided walking. Namely, we showed that flies turn toward sound objects in the front hemifield, but they turn away from sound objects in the back hemifield. Thus, vision and hearing both use a simple ‘front-not-back’ rule. The potential utility of this rule is the same in both cases: it allows the fly to avoid being recaptured by an unrewarding object it has just turned away from. This again makes the point that negative phonotaxis to sound sources behind the fly may not simply be an ‘error’, because it may be adaptive in some situations.
Neural basis of phonotaxis
In crickets, inter-aural comparisons begin at the level of cells postsynaptic to peripheral auditory afferents. These cells receive antagonistic input from the two ears (Selverston et al., 1985). By analogy, we might imagine that inter-antennal comparisons could occur at the very first stage of auditory processing in the fly brain. Indeed, the first auditory relay in the D. melanogaster brain contains many interhemispheric projections. This relay is called the antennal mechanosensory and motor center (AMMC) (Matsuo et al., 2016).
However, a recent pan-neuronal calcium imaging study showed that the AMMC is unresponsive to vibration of the contralateral antenna; rather, AMMC vibration responses are strictly unilateral (Patella and Wilson, 2018). By contrast, vibration responses in the brain's secondary auditory center (the wedge) are driven by both ipsilateral and contralateral antennae. This result suggests that inter-antennal vibration comparisons might begin within the brain's secondary auditory center.
An interesting – and complicating – consideration is that the mechanical resonant frequency of the antennae depends on stimulus intensity (Göpfert and Robert, 2002). Recall that rotating the azimuthal angle of a sound source generally produces anticorrelated changes in the effective sound intensity at the two antennae (Fig. 1D) (Morley et al., 2012). Therefore, rotating the azimuthal angle of a sound source should generally produce anticorrelated changes in the frequency tuning of the two antennae (Morley et al., 2018). Future work will be needed to understand how this might affect the neural implementation of inter-antennal vibration comparisons.
From phonotaxis to navigation
Ultimately, sound localization cues must be integrated with other sensory cues that provide spatial guidance for walking flies. These guidance cues include visual objects (Horn and Wehner, 1975; Robie et al., 2010; Schuster et al., 2002), global visual motion signals (Götz and Wenking, 1973; Katsov and Clandinin, 2008; Strauss et al., 1997), tactile guidance cues (Ramdya et al., 2015), wind direction cues (Bell and Wilson, 2016; Steck et al., 2012) and instantaneous samples of olfactory spatial gradients (Borst and Heisenberg, 1982; Gaudry et al., 2013).
Meanwhile, sensory guidance cues must also be integrated with the fly's internal representation of its heading direction state (Seelig and Jayaraman, 2015). Ultimately, steering decisions must be governed by flexible ‘policies’ dictating the current preferred heading direction, depending on idiothetic coordinates (Kim and Dickinson, 2017; Neuser et al., 2008; Strauss and Pichler, 1998) and the prioritization of guidance cues (Bülthoff et al., 1982; Robie et al., 2017; Schuster et al., 2002). Describing the contributions of individual sensory guidance cues is a step toward understanding navigation as a whole.
We thank Allison Baker Chang for contributing to pilot studies, and members of the Wilson lab for helpful discussions and feedback on the manuscript. We thank Ofer Mazor and Pavel Gorelik (Harvard Medical School Research Instrumentation Core) for technical support.
Conceptualization: A.V.B., R.I.W.; Methodology: A.V.B.; Software: A.V.B.; Validation: A.V.B.; Formal analysis: A.V.B.; Investigation: A.V.B.; Resources: A.V.B.; Data curation: A.V.B.; Writing - original draft: A.V.B., R.I.W.; Writing - review & editing: A.V.B., R.I.W.; Visualization: A.V.B., R.I.W.; Supervision: A.V.B., R.I.W.; Project administration: A.V.B., R.I.W.; Funding acquisition: A.V.B., R.I.W.
This work was supported by a Kennedy Scholarship and a Boehringer Ingelheim Fonds PhD Fellowship (to A.V.B.), and the National Institutes of Health (R01 NS101157 to R.I.W.). R.I.W. is a Howard Hughes Medical Institute Investigator. Deposited in PMC for release after 12 months.
Data and analysis routines are available upon request.
The authors declare no competing or financial interests.