ABSTRACT
We carried out ex vivo and in vivo experiments to explore the functional role of the ventricular folds in sound production in macaques. In the ex vivo experiments, 29 recordings out of 67 showed that the ventricular folds co-oscillated with the vocal folds. Transitions from normal vocal fold oscillations to vocal–ventricular fold co-oscillations as well as chaotic irregular oscillations were also observed. In the in vivo experiments, vocal–ventricular fold co-oscillations were also observed in two macaque individuals. In both ex vivo and in vivo experiments, the vocal–ventricular fold co-oscillations significantly lowered the fundamental frequency. A mathematical model revealed that the lowering of the fundamental frequency was caused by a low oscillation frequency inherent in the ventricular folds, which entrained the vocal folds to their low-frequency oscillations. From a physiological standpoint, the macaques may utilize the ventricular fold oscillations more frequently than humans. The advantages as well as disadvantages of using the ventricular folds as an additional element of the vocal repertoire are discussed.
INTRODUCTION
In the study of human speech, the physics and physiology underlying the mechanism of sound production have been thoroughly investigated and are well understood (Baken and Orlikoff, 2000; Titze, 2000; Titze and Alipour, 2006). In contrast, only limited information is available on vocalizations in non-human primates. Their sound production mechanism has been inferred mainly from acoustical recordings of calls in natural environments. Nevertheless, anatomical studies have revealed that humans and non-human primates share a striking similarity in their vocal organs, implying that their sound production mechanisms should also be similar (Herbst et al., 2018; Nishimura, 2020; Nishimura et al., 2022).
In human speech, the voiced sounds are produced from flow-induced oscillations of the vocal folds and the generated source sounds excite resonances of the vocal tract (Titze, 2000). In non-human primates, oscillations of the vocal folds as well as the vocal membranes play an important role in sound production (Regner et al., 2010; Alipour et al., 2013; Titze et al., 2016; Herbst et al., 2018; Nishimura et al., 2022). The vocal membrane is an appendage extending from the vocal fold and its sole oscillations or co-oscillations with the vocal fold contribute to the generation of the source sounds (Mergell et al., 1999; Zhang et al., 2019; Nishimura et al., 2022). Hereafter, for simplicity, we refer to the whole system of the vocal fold and the vocal membrane simply as the ‘vocal fold’. Whether the animals utilize the vocal tract resonances in their communications is under debate (Lieberman et al., 1969; Lieberman, 1977; Riede and Zuberbühler, 2003; Fitch et al., 2016) but, at least, they are capable of regulating the vocal tract shape to modify the acoustical characteristics of the vocalized sounds (Nishimura et al., 2003; Reby et al., 2005; Riede et al., 2005).
Here, we focus on another organ, the ventricular folds. In humans, the ventricular folds (also called the ‘vestibular folds’ or ‘false vocal folds’) are located just above the true vocal folds. Although they do not vibrate during normal phonations, recent studies showed that they do vibrate and influence the vocalizations under special circumstances. For instance, in throat singing, growling or shouting, the ventricular folds co-oscillate with the vocal folds and produce low-pitched sounds (Fuks et al., 1998; Lindestad et al., 2001; Sakakibara et al., 2001, 2002; Bailly et al., 2010). In these cases, the ventricular folds vibrate at half (or a third) of the vocal fold frequency, making the whole laryngeal system vibrate at half (or a third) of that frequency. Vibrations of the ventricular folds may also induce voice pathology, in which involuntary oscillations of the ventricular folds interfere with the vocal fold oscillations to produce a hoarse voice (Jackson and Jackson, 1935; Voelker, 1942; Fred, 1962; Lindestad et al., 2004). In animal vocalizations, the role of the ventricular fold oscillations is yet to be clarified. In excised canine larynges, Finnegan and Alipour (2009) observed co-oscillations of the vocal and ventricular folds, which added some irregularity into the acoustic output. Herbst et al. (2020) compared excised larynges of pigs with and without the ventricular folds and observed lowered fundamental frequencies during co-oscillations of the vocal and ventricular folds. The laryngeal anatomy of pigs implies that the ventricular folds may inevitably oscillate together with the vocal folds, thereby making a possible contribution to their vocalized sounds. In an excised larynx experiment on echolocating bats, Håkansson et al. (2022) reported that the ventricular folds are used to expand the vocal range to a low frequency.
Despite these ex vivo studies, to our knowledge, no in vivo experiment has been carried out to clarify the potential role of the ventricular folds in animal vocalizations. Moreover, their importance in non-human primates is largely unknown.
The aim of the present study was to examine ventricular fold oscillations in the sound production of macaques and to clarify their acoustic function. To date, macaque vocal communications have been studied based on the acoustic analysis and classification of their calls in a social context (Itani, 1963; Beecher et al., 1979; Le Prell and Moody, 1997; Cheney et al., 1992; Katsu et al., 2016; Bouchet et al., 2017). Herbst et al. (2018) performed ex vivo and in vivo measurements of vocal fold oscillations during various phonations in Japanese macaques and found a similarity of their phonatory mechanism to that of humans. In the present study, we carried out ex vivo and in vivo measurements of the ventricular fold oscillations of rhesus macaques to test our hypothesis that the rhesus macaques utilize ventricular fold oscillations as a variant of vocalizations to produce low pitch sounds. By direct observation of the vocal–ventricular fold system, we examined whether the ventricular folds co-oscillate with the vocal folds. The preceding studies on the singing voice and animal vocalization suggest that the ventricular folds may lower the vocal pitch compared with vocalizations without ventricular fold oscillations (Fuks et al., 1998; Lindestad et al., 2001; Sakakibara et al., 2001, 2002; Bailly et al., 2010; Håkansson et al., 2022). Through ex vivo and in vivo experiments, we examined whether the vocal pitch is indeed lowered in the presence of the ventricular fold oscillations. A computational model was further simulated to elucidate the experimental findings.
MATERIALS AND METHODS
Ex vivo experiments
This study was carried out on six adult rhesus macaques, Macaca mulatta (Zimmermann 1780) (males: nos 1, 2, 4; females: nos 3, 5, 6; see Table 1); their experimental euthanasia was approved for other studies by the Animal Welfare and Animal Care Committee of the Primate Research Institute, Kyoto University (permission 2020-001 for macaques 1, 3, 6; permission 2019-162 for macaque 2; permission 2018-177 for macaques 4 and 5) and was undertaken according to the 3rd edition of the Guide for the Care and Use of Laboratory Primates of the Primate Research Institute. No ethical approval was required for the present study. The laryngeal samples were extracted from the fresh cadavers, flash-frozen in liquid nitrogen, and stored at −80°C at the Primate Research Institute, before they were transferred to Ritsumeikan University, Kusatsu, Japan, where the ex vivo experiment was conducted.
Each larynx was thawed and mounted on a vertical tracheal tube, into which warm humid air (∼37°C, 100% relative humidity) was injected from an air pump (SilentAirCompressor Sc820, Hitachi Koki Co., Ltd, Tokyo, Japan). The flow rate was controlled by a pressure regulator (10202U, Fairchild, Winston-Salem, NC, USA) and a digital mass flow controller (CMQ-V, Azbil, Santa Clara, CA, USA). To induce vocal fold oscillations, the glottal air space was closed by manually adducting the arytenoid cartilages. To further induce ventricular fold vibrations, the supraglottal space just above the true vocal folds was also narrowed.
The dynamics of the vocal and ventricular folds was monitored using a high-speed video camera (Fastcam Nova S6, Photron, Tokyo, Japan; sampling rate: 10,000–20,000 frames s−1; image resolution: 160×256, 256×256 pixels) with a borescope (BAL-72718HT, Shodensha, Osaka, Japan). The acoustic sound and the sound pressure level (SPL) were measured by an omnidirectional microphone (Type 4192, Brüel and Kjær, Tokyo, Japan; frequency range: 3.15 Hz to 20 kHz; sensitivity: 12.5 mV Pa−1) (Nexus conditioning amplifier, Brüel and Kjær; low-pass filter with a cut-off frequency of 100 kHz; high-pass filter with a cut-off frequency of 0.1 Hz) and a sound level meter (Type 2250–A, Brüel and Kjær; frequency range: 5 Hz to 20 kHz; sensitivity: 50 mV Pa−1 ±2 dB), respectively, both located 30 cm from the larynx. The subglottal pressure was monitored using a pressure transducer (differential pressure transducer, PDS 70GA, Kyowa, Osaka, Japan; signal conditioner, CDV 700A, Kyowa; frequency range: 0 Hz to 170 kHz; sensitivity: 0.3612 kPa mV−1), which was mounted flush with the inner wall of the tracheal tube, 2 cm upstream of the excised larynx. All signals were stored on a digital recorder (controller, PXIe-8840; input/output card, BNC-2110; software, LabView; all from National Instruments, Austin, TX, USA) with a sampling frequency of 12.5 kHz. From the high-speed videos, which captured surface movements of the vocal and ventricular folds on medial–lateral and anterior–posterior axes, information on the medial–lateral axis was extracted into the kymogram (Svec and Schutte, 1996; Qiu et al., 2003; Svec et al., 2007) using Matlab software (R2020a, MathWorks, Natick, MA, USA).
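As an illustration of the kymogram construction described above, the following Python sketch stacks one pixel line per frame over time. It is not the authors' Matlab code: the function names, the frame layout (`frame[row][column]`, with columns running along the medial–lateral axis) and the synthetic frames are assumptions made for the example.

```python
def extract_kymogram(frames, scan_row):
    """Return one pixel line per frame; row index of the result is time."""
    return [list(frame[scan_row]) for frame in frames]


def synthetic_frame(width, height, gap):
    """Hypothetical frame: bright tissue (1) with a dark central gap (0)."""
    left = (width - gap) // 2
    row = [1] * left + [0] * gap + [1] * (width - left - gap)
    return [row[:] for _ in range(height)]


# Five synthetic frames in which the opening widens and then narrows again.
frames = [synthetic_frame(8, 4, gap) for gap in (0, 2, 4, 2, 0)]
kymo = extract_kymogram(frames, scan_row=2)

# Width of the dark opening along the scan line, frame by frame.
gap_widths = [line.count(0) for line in kymo]
print(gap_widths)
```

Each row of `kymo` is the medial–lateral intensity profile at one time point, so an oscillating opening appears as a periodic pattern along the time axis.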
To examine the acoustic features (i.e. fundamental frequency, vocal efficiency and onset pressure) as characteristics of the ventricular fold oscillations, unbalanced 2-factor analysis of variance (ANOVA) was performed (Dunn and Clark, 1974) using the anovan function of Matlab. As the two factors, vocalization type (vocal fold oscillation and vocal–ventricular fold co-oscillation) and individuality were considered. The sample size was N=6 for individuality (as six macaque individuals were used); those for the vocalization types are listed in Table 1. When a significant difference was detected, Tukey's honestly significant difference (HSD) test was carried out to assess the individual differences.
In vivo experiments
In vivo experiments were conducted at the Primate Research Institute according to the 3rd edition of the Guide for the Care and Use of Laboratory Primates of the Primate Research Institute (2010). Three adult male rhesus macaques (nos 7, 8, 9) were examined (see Table 2). Experimental euthanasia for each subject was performed for the purpose of other neuroanatomical experiments, in which neural tracers were injected to examine the neuroanatomical bases for the vocalization. These neuroanatomical experiments required euthanasia, but no animals were killed solely for the present studies. Our studies were planned along with the neuroanatomical experiments, and the experimental procedures involved in our studies were approved alongside those of the neuroanatomical experiments (permission 2020-198 for macaque 7; permission 2020-217 and 2021-063 for macaque 8; permission 2021-063 for macaque 9).
Here, a previously described experimental procedure was utilized (see Nishimura et al., 2022, for details) and the same data were reused for two of the three subjects (nos 7, 8). The macaque was anesthetized with ketamine hydrochloride (2.5 mg kg−1 body mass) and medetomidine (0.1 mg kg−1 body mass) and seated in a primate chair with its head fixed in a stereotaxic frame attached to the chair. Following partial removal of the skull, a tungsten microelectrode (FHC Inc., Bowdoin, ME, USA) was inserted into the brain. The periaqueductal gray and its surrounding area – the brain region evoking vocalizations – was mapped using electric stimulation (cathodal pulse trains; typical parameters: 0.2 ms pulse duration, 400 Hz, 0.5 s, 20–250 μA). When the brain was stimulated, the glottal behavior was recorded using the borescope attached to the high-speed camera (sampling rate: 8000–20,000 frames s−1; image resolution: 256×256, 336×384, 384×384, 480×640 pixels). The acoustic signals were captured by a microphone (Okwint; frequency range: 35 Hz to 18 kHz; sensitivity: −30±2 dB) with audio software (Audacity, Audacity Team, https://audacityteam.org/), digitized at 44.1 kHz. At the end of the first stimulation session, a plastic chamber was fixed onto the skull to cover the dural surface.
Computational model
To elucidate the findings of the ex vivo and in vivo experiments, a mathematical model, proposed by Fuks et al. (1998) and Sakakibara et al. (2002), was simulated. The four-mass model (see below) is composed of two sets of two-mass models, where the lower two masses (M1 and M2) and the upper two masses (M3 and M4) represent the vocal and ventricular folds, respectively. The model assumes the following: (1) the left and right vocal (or ventricular) folds move in a symmetric manner; (2) the sub- and supra-glottal systems do not influence the vocal and ventricular fold dynamics; and (3) the glottal pressure exists only below the narrowest part of the glottis and obeys the Bernoulli principle (Steinecke and Herzel, 1995).
To simulate the main features of the vocal–ventricular fold oscillations, parameter values of the model were adopted from the standard setting of the two-mass model (Ishizaka and Flanagan, 1972; Steinecke and Herzel, 1995): m1=0.03125 g; m3=m1/T g; m2=0.0625 g; m4=m2/T g; d1=d3=0.125 cm; d2=d4=0.025 cm; k1=0.08 g ms−2; k3=Tk1 g ms−2; k2=0.008 g ms−2; k4=Tk2 g ms−2; k1,2=k2,1=k3,4=k4,3=0.025 g ms−2; c1=3k1; c2=3k2; c3=3k3; c4=3k4; a01=0.0125 cm2; a02=0.0125 cm2; L=0.7 cm. The tension parameter T was introduced to scale the mass and the stiffness of the ventricular folds (Ishizaka and Flanagan, 1972; Steinecke and Herzel, 1995). Because the natural frequency of a mass–spring element scales as √(k/m), dividing the masses by T and multiplying the stiffnesses by T makes the oscillation frequency of the ventricular folds T times that of the vocal folds. The damping coefficients were determined as ri=2ζi√(miki) (i=1,2,3,4) using a damping ratio of ζi=0.15. Although the standard parameter setting was intended for the human voice, it has also been widely applied to animal vocalizations (Mergell et al., 1999; Koda et al., 2012). The model was simulated by a 4th order Runge–Kutta algorithm with an integration time step of Δt=0.05 ms.
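The frequency scaling by T can be checked numerically. The Python sketch below treats each mass as an uncoupled mass–spring element (ignoring the coupling springs, damping and aerodynamic forces, a simplification that the model itself does not make) and uses T=0.55, the value adopted in the simulation study. The function names are hypothetical.

```python
import math

T = 0.55                    # tension parameter (value from the simulation study)
m1, m2 = 0.03125, 0.0625    # vocal fold masses, g
k1, k2 = 0.08, 0.008        # vocal fold stiffnesses, g ms^-2
ZETA = 0.15                 # damping ratio


def natural_frequency_hz(m, k):
    """Uncoupled natural frequency; sqrt(k/m) is in rad/ms, hence the factor 1000."""
    return 1000.0 * math.sqrt(k / m) / (2.0 * math.pi)


def damping_coefficient(m, k, zeta=ZETA):
    """r = 2 * zeta * sqrt(m * k), as in the text."""
    return 2.0 * zeta * math.sqrt(m * k)


# Ventricular fold parameters obtained by scaling with T.
m3, m4 = m1 / T, m2 / T
k3, k4 = T * k1, T * k2

ratio_lower = natural_frequency_hz(m3, k3) / natural_frequency_hz(m1, k1)
ratio_upper = natural_frequency_hz(m4, k4) / natural_frequency_hz(m2, k2)
r1, r3 = damping_coefficient(m1, k1), damping_coefficient(m3, k3)
print(ratio_lower, ratio_upper, r1, r3)
```

Under these assumptions the ratio of the ventricular to the vocal fold natural frequency equals T, while the damping coefficient ri is unchanged by T because the product miki does not depend on it.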
RESULTS
Ex vivo experiments
First, the laryngeal anatomy was inspected in one macaque individual to confirm that the ventricular fold was indeed located above the vocal fold (see Fig. 1). Then, experiments on flow-induced oscillations were carried out and recorded for six macaque larynges (see Materials and Methods for their preparation). A total of 67 vocalizations were collected (see Table 1). Fig. 2A shows a sequence of high-speed images measured in one instance of the excised larynx experiments. Surface images of the vocal and ventricular folds are successively displayed on the medial–lateral and anterior–posterior axes. In this example, the ventricular folds (white region) did not show any vibratory movement. In contrast, the vocal folds (gray region), which are seen below the ventricular folds, exhibited clear oscillations. They initially touched each other (3 ms), started to open, reached the maximal opening area (15–21 ms), and then started to close.
Frontal section of a hemi-larynx in one macaque. The ventricular fold is located above the vocal fold.
Sequence of high-speed images capturing surface movement of a macaque larynx ex vivo. In each image, the ventricular and vocal folds are displayed on the medial–lateral and anterior–posterior axes. The labels indicate the recording time corresponding to the images. (A) Only the vocal folds oscillate, while the ventricular folds do not move much. (B) The vocal and ventricular folds co-oscillate. The vocal fold movement is discernible behind the ventricular folds. See also Movies 1 and 2.
Fig. 2B shows another instance of the excised larynx experiments from the same individual. Here, both vocal and ventricular folds co-oscillated. The ventricular folds, which were initially open (6 ms), started to close, and the posterior parts collided with each other (24 ms). Then, they started to open again. Below the ventricular folds, the vocal fold oscillation is also discernible. Initially, when the ventricular folds were widely open, the vocal folds were closed (6 ms). In the following images, they started to open, reached their maximal opening (48 ms), and then started to close, implying that the vocal and ventricular folds were in an anti-phase relationship.
Kymograms and the corresponding spectrograms of macaque excised larynx experiments. (A,B) Only the vocal folds oscillate, while the ventricular folds do not move (corresponding to sequential images of Fig. 2A). (C,D) The vocal and ventricular folds co-oscillate (corresponding to sequential images of Fig. 2B). (E,F) Chaos observed in vocal–ventricular fold co-oscillations. See also Movies 1 and 2.
Ventricular fold ratio as a quantity to detect vocal–ventricular fold co-oscillations. (A) Definition of the ventricular fold ratio. Using a (distance from the minimum opening point to the maximum opening point of the ventricular fold) and b (distance from the glottal middle line to the maximal opening point of the ventricular fold), the ventricular fold ratio is given as Rvent=a/b. (B) Comparison of the fundamental frequency fo between the vocal fold oscillations (i.e. Rvent<0.3) and the vocal–ventricular fold co-oscillations (i.e. Rvent≥0.3) for six macaque individuals (nos 1–6). (C) Comparison of the phonation onset pressure between the vocal fold oscillations and the vocal–ventricular fold co-oscillations for six macaque individuals. (D) Comparison of the vocal efficiency between the vocal fold oscillations and the vocal–ventricular fold co-oscillations for six macaque individuals.
By setting the threshold value to Rth=0.3, the vocal–ventricular fold oscillations are detected when Rvent>Rth. According to this criterion, n=38 and n=29 instances were classified into the vocal fold oscillations and the vocal–ventricular fold co-oscillations, respectively. In Fig. 4B–D, the fundamental frequency fo, the onset pressure and the vocal efficiency are drawn for six macaque individuals. Compared with the case in which only the vocal folds oscillate, the fundamental frequency fo was significantly lower in the vocal–ventricular fold oscillations in all individuals (2-factor ANOVA: F1,60=31.91, P=4.71×10−7 for vocalization type; F5,60=1.45, P=0.219 for individuality) (Tukey's HSD: difference between the group means=−220.5, confidence interval CI=[−353.2, −87.8], P<3×10−5 for vocalization type; difference between the group means=109.4, CI=[−62.9, 281.6], P>0.585 for individual differences). This result was not strongly influenced by the sex and body mass of the macaque individuals (see Fig. S1).
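The classification rule can be written compactly. The following Python sketch is a hypothetical helper, not the authors' analysis code. It assumes that a and b have already been measured from a kymogram in the same units, and it treats the boundary case Rvent=Rth as a co-oscillation, following the caption of Fig. 4.

```python
R_TH = 0.3  # threshold value from the text


def ventricular_fold_ratio(a, b):
    """Rvent = a / b, with a and b measured in the same units."""
    if b <= 0:
        raise ValueError("b must be positive")
    return a / b


def classify(a, b, r_th=R_TH):
    """Label one recording from its ventricular fold ratio."""
    if ventricular_fold_ratio(a, b) >= r_th:
        return "vocal-ventricular fold co-oscillation"
    return "vocal fold oscillation"


# Hypothetical measurements (arbitrary units), not data from the study.
print(classify(a=0.1, b=1.0))
print(classify(a=0.5, b=1.0))
```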
The onset pressure increased in the vocal–ventricular fold co-oscillations in four individuals. However, this increase was not statistically significant (2-factor ANOVA: F1,60=0.99, P=0.324 for vocalization type; F5,60=2.84, P=0.023 for individuality) (Tukey's HSD: difference between the group means=0.086, CI=[−0.21, 0.38], P>0.997 for vocalization type; difference between the group means=−0.51, CI=[−1.01, −0.009], P>0.042 for individual differences). The effect of vocalization type on the vocal efficiency was also unclear. It increased in the vocal–ventricular fold co-oscillations in four individuals, whereas it decreased in the vocal–ventricular fold co-oscillations in two individuals (2-factor ANOVA: F1,60=0.0036, P=0.952 for vocalization type; F5,60=11.14, P=1.29×10−7 for individuality) (Tukey's HSD: difference between the group means=−0.073, CI=[−4.2, 4.05], P>0.999 for vocalization type; difference between the group means=13.8, CI=[6.75, 20.79], P>5.7×10−7 for individual differences). Our image analysis revealed that, in the vocal–ventricular fold co-oscillations, the phase difference between the vocal and ventricular folds was 3.37±0.41 rad, confirming their anti-phase relationship.
In several experiments, we saw a transition from vocal fold oscillations to vocal–ventricular fold co-oscillations, as the subglottal pressure was gradually increased. After such a transition, the fundamental frequency dropped significantly (see Fig. S2). This provides further evidence that the ventricular fold oscillations have a strong influence on the fundamental frequency.
In addition to the periodic oscillations of the vocal and ventricular folds, irregular oscillations were also observed in some experiments. One example is displayed in Fig. 3E, where both vocal and ventricular folds show irregular waveforms with a large variability in their cycle-to-cycle periods.
In vivo experiments
In the in vivo experiments, vocalizations were successfully induced by applying electrical stimulation to the periaqueductal gray and its surrounding area of the midbrain in three adult male macaques under anesthesia. The phonatory process was well documented with high-speed videos for 34 vocalizations (see Table 2). As the recorded audio data were not very clear, only the video data were analyzed.
As shown in Fig. S3, both vocal fold oscillations and vocal–ventricular fold co-oscillations were observed in two sequences of the high-speed images. Kymograms in Fig. 5A,C were drawn by extracting line images from the medial–lateral axis of the high-speed images (blue lines in Fig. S3A,B, respectively). In Fig. 5A, only the vocal folds exhibited an oscillation pattern, while the ventricular folds moved only slightly. Fig. 5B indicates that the corresponding fundamental frequency was fo=430 Hz. In Fig. 5C, although the vocal fold movements are partly hidden behind the ventricular folds, both of them oscillated in an anti-phase relationship. Fig. 5D indicates that their fundamental frequency was fo=145 Hz.
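One way to read a fundamental frequency from a kymogram-derived signal is to locate the strongest peak of its power spectrum. The Python sketch below does this with a plain discrete Fourier transform applied to a synthetic 150 Hz sine sampled at 10,000 frames s−1. The signal, the frame rate and the function names are illustrative assumptions, and the procedure actually used for Fig. 5B,D is not specified here.

```python
import cmath
import math


def power_spectrum(signal, fs):
    """Return (frequencies, power) for bins 1 .. n/2 - 1 of a direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]          # remove the DC offset
    freqs, power = [], []
    for k in range(1, n // 2):
        c = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(c) ** 2)
    return freqs, power


def peak_frequency(signal, fs):
    """Frequency of the strongest spectral peak."""
    freqs, power = power_spectrum(signal, fs)
    best = max(range(len(power)), key=power.__getitem__)
    return freqs[best]


fs = 10000.0   # assumed frame rate, frames per second
n = 400        # 40 ms of signal, i.e. 25 Hz frequency resolution
signal = [math.sin(2.0 * math.pi * 150.0 * t / fs) for t in range(n)]
fo = peak_frequency(signal, fs)
print(fo)
```

Taking the strongest peak as fo is a simplification; when a harmonic carries more power than the fundamental, a more careful estimator would be needed.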
Kymograms of in vivo experiments. (A) Only the vocal folds oscillate, while the ventricular folds do not move (corresponding to sequential images of Fig. S3A). (B) Power spectrum indicating fo=430 Hz. (C) The vocal and ventricular folds co-oscillate (corresponding to sequential images of Fig. S3B; see also Movie 3). (D) Power spectrum indicating fo=145 Hz. (E) Chaos was observed during vocal–ventricular fold co-oscillations. (F) Fundamental frequency fo compared between the vocal fold oscillations and the vocal–ventricular fold co-oscillations for two macaque individuals (7 and 9).
Using the ventricular fold ratio Rvent of Eqn 7 applied to the kymograms, 23 and 11 vocal samples were classified as vocal fold oscillations and vocal–ventricular fold co-oscillations, respectively (see Table 2). As shown in Fig. 5F, the fundamental frequency fo was much lower in the vocal–ventricular fold co-oscillations than in the vocal fold oscillations in two individuals (2-factor ANOVA: F1,30=43.45, P=2.70×10−7 for vocalization type; F1,30=13.61, P=0.0009 for individuality) (Tukey's HSD: difference between the group means=−255.2, CI=[−360.5, −149.9], P=1.57×10−6 for vocalization type; difference between the group means=−137.8, CI=[−239.4, −36.2], P=0.005 for individual difference). As no vocal–ventricular fold co-oscillation was observed in macaque 8, its data were not included in the statistical analysis.
Our image analysis of the vocal–ventricular fold oscillations indicated that the phase difference between the vocal and ventricular folds was 3.35±0.83 rad, i.e. close to π rad, corresponding to an anti-phase relationship. As shown in Fig. 5E, irregular oscillations of the vocal and ventricular folds were also observed in some vocalizations.
Simulation study
To reproduce the observed vocal–ventricular fold co-oscillations, the four-mass model (Fig. 6A) was simulated. According to the ex vivo and in vivo experiments, involvement of the ventricular folds in the laryngeal dynamics almost halved the fundamental frequency fo of the vocal fold oscillations. This implies that the oscillation frequency of the ventricular folds is about half of that of the vocal folds. To realize such a situation, the tension parameter was set to T=0.55. As the bifurcation parameters to control the vocal–ventricular fold system, the subglottal pressure Ps and the ventricular fold adduction aprep (i.e. pre-phonatory area of the ventricular folds aprep=a03=a04) were varied. Results of the computational model are depicted in Fig. 6. In Fig. 6B, the fundamental frequency fo and the oscillation amplitudes of the vocal and ventricular folds are plotted against the subglottal pressure Ps (here, the ventricular fold adduction was set to aprep=0.025 cm2). As the pressure was increased, phonation was induced around Ps=0.53 kPa, giving rise to a fundamental frequency of fo=310 Hz. Note that, in this region, only the vocal folds oscillated, while the ventricular folds stayed still (as indicated by the zero amplitude of the ventricular folds). As the pressure was further increased, the fundamental frequency dropped to fo=182 Hz at around Ps=1.26 kPa. In this region, the ventricular folds started to co-oscillate with the vocal folds (as indicated by the non-zero ventricular fold amplitude). This explains why transitions from the vocal fold oscillations to the vocal–ventricular fold co-oscillations were observed when the subglottal pressure was increased in the ex vivo experiment (see Fig. S2).
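The integration scheme (4th order Runge–Kutta with Δt=0.05 ms) can be illustrated on a much simpler system than the four-mass model. The Python sketch below integrates a single damped mass–spring oscillator with the values of m1, k1 and ζ=0.15 from Materials and Methods and compares the result with the analytical solution. It omits the aerodynamic forces, collisions and coupling of the full model, and all function names are hypothetical.

```python
import math

M, K, ZETA = 0.03125, 0.08, 0.15        # g, g ms^-2, damping ratio
R = 2.0 * ZETA * math.sqrt(M * K)       # damping coefficient
DT = 0.05                               # integration time step, ms


def rk4_step(f, t, y, dt):
    """One classical 4th order Runge-Kutta step for y' = f(t, y)."""
    s1 = f(t, y)
    s2 = f(t + dt / 2, [yi + dt / 2 * si for yi, si in zip(y, s1)])
    s3 = f(t + dt / 2, [yi + dt / 2 * si for yi, si in zip(y, s2)])
    s4 = f(t + dt, [yi + dt * si for yi, si in zip(y, s3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, s1, s2, s3, s4)]


def oscillator(t, y):
    """State y = [displacement, velocity] of M x'' + R x' + K x = 0."""
    x, v = y
    return [v, -(R * v + K * x) / M]


def simulate(x0, n_steps):
    """Displacement after n_steps, starting from rest at x0."""
    y, t = [x0, 0.0], 0.0
    for _ in range(n_steps):
        y = rk4_step(oscillator, t, y, DT)
        t += DT
    return y[0]


def exact(x0, t):
    """Analytical underdamped solution for the same initial condition."""
    w0 = math.sqrt(K / M)
    wd = w0 * math.sqrt(1.0 - ZETA ** 2)
    return x0 * math.exp(-ZETA * w0 * t) * (
        math.cos(wd * t) + ZETA * w0 / wd * math.sin(wd * t))


numeric = simulate(0.1, 200)            # 200 steps = 10 ms
analytic = exact(0.1, 200 * DT)
print(numeric, analytic)
```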
Results of the computational model. (A) Schematic illustration of the four-mass model, composed of a pair of two-mass models. The lower two masses (M1 and M2) and the upper two masses (M3 and M4) represent the vocal and ventricular folds, respectively. (B) Dependence of the fundamental frequency fo and oscillation amplitudes of the vocal and ventricular folds on the subglottal pressure Ps. (C–E) Time traces of the opening areas of the vocal (avo) and ventricular (ave) folds. The pre-phonatory opening area of the ventricular folds was set to 0.025 cm2. The subglottal pressure was set to Ps=1.1 kPa in C, Ps=1.4 kPa in D, and Ps=1 kPa in E. The tension parameter T was set to 0.55 in C,D and 0.565 in E. (F) Dependence of the fundamental frequency fo on the subglottal pressure Ps and the pre-phonatory opening area aprep of the ventricular folds.
To examine the oscillation patterns, time traces of the opening areas of the vocal (avo) and ventricular (ave) folds were drawn by setting the subglottal pressure to Ps=1.1 kPa in Fig. 6C and Ps=1.4 kPa in Fig. 6D. For the low pressure, only the vocal folds oscillated in Fig. 6C, whereas both the vocal and ventricular folds oscillated for the high pressure in Fig. 6D. The observed anti-phase relationship between the vocal and ventricular folds is consistent with the kymograms of ex vivo and in vivo experiments (see Figs 3C and 5C). As shown in Fig. 6E, chaotic oscillations were also observed when the parameters were set to Ps=1 kPa and T=0.565. This reproduces the irregular oscillations of the vocal and ventricular folds observed ex vivo and in vivo (see Figs 3E and 5E). Finally, in Fig. 6F, the fundamental frequency fo was drawn by varying the subglottal pressure Ps and the ventricular fold adduction aprep. As the pre-phonatory area aprep was decreased (i.e. as the ventricular folds were more strongly adducted), the existence domain of the vocal fold oscillations (Fig. 6F, red region) shrank and disappeared for aprep<0.015 cm2. In contrast, the domain of the vocal–ventricular fold co-oscillations (Fig. 6F, green region) expanded monotonically as aprep was decreased. This is because the ventricular fold oscillations are induced more easily when the folds are more adducted. In the ex vivo study, occurrence of the vocal–ventricular fold co-oscillation was dependent upon the experimental conditions. The computational model suggests that the degree of ventricular fold adduction could have been one of the primary factors that determined the occurrence of such co-oscillations.
DISCUSSION
We carried out ex vivo and in vivo experiments to explore the functional role of the ventricular folds in the sound production of macaques. In the ex vivo experiments, larynges were extracted from six macaque individuals and their flow-induced oscillations were recorded. Of the 67 recorded sounds, 29 showed the occurrence of vocal–ventricular fold co-oscillations. Chaotic irregular oscillations appeared occasionally. In the in vivo experiments, vocalizations of three macaque individuals were recorded by the high-speed camera. Of the 34 recorded sounds, 11 exhibited vocal–ventricular fold co-oscillations. Chaotic oscillations were also observed. As one of the main features of the vocal–ventricular fold co-oscillations, our analysis revealed that the fundamental frequency was lowered significantly in both ex vivo and in vivo experiments. Our computational model suggests that the ventricular folds vibrate at half the oscillation frequency of the vocal folds to halve the vocal pitch. Regarding the observed irregular dynamics, it is a generic feature of coupled non-linear systems that interacting oscillators can induce bifurcations and chaos (Berg et al., 1986; Glass and Mackey, 2020). It is thus natural that, depending upon the frequency ratio between the vocal and ventricular folds, their interaction may lead to desynchronized chaotic dynamics. The computational model also suggests that one of the key parameters to induce the vocal–ventricular fold co-oscillations is the level of the ventricular fold adduction, as the pressure to generate vocal–ventricular fold co-oscillations was significantly lowered as the ventricular folds were adducted (see Fig. 6F).
The physics and physiology of the observed ventricular fold oscillations are consistent with those of human vocalizations, which give rise to vocal–ventricular fold co-oscillations under special circumstances such as singing (Fuks et al., 1998; Lindestad et al., 2001; Sakakibara et al., 2001, 2002; Bailly et al., 2010) and voice pathologies (Jackson and Jackson, 1935; Voelker, 1942; Fred, 1962; Lindestad et al., 2004). One of the physiological characteristics of macaques is that their ventricular space is relatively narrow compared with that of humans. Such a narrow ventricular space may naturally strengthen the ventricular fold adduction and consequently provide a condition in which ventricular fold oscillations are induced more easily. As an additional note, the vocal fold cover layer in macaques is known to be thinner than that of humans, and the lamina propria is dense in fibrous tissue (Kurita et al., 1983; Riede, 2010). Mucosal waves may not form strongly on this thin and hard cover. With such weak mucosal wave propagation, the ventricular fold oscillations could help to strengthen the laryngeal oscillations. These physiological properties suggest that macaques may utilize the ventricular folds more often than humans do in their vocalizations.
The advantage for macaques of using the ventricular folds is the ability to produce sounds with low fundamental frequencies fo in their vocal repertoires. In terms of animal social behavior, such low-frequency vocalizations are closely related to the hypothesis that fo may provide an acoustic cue to the vocalizer's body size (Morton, 1977). Studies of interspecific size–frequency allometry demonstrated a strong inverse relationship between body size and vocalization frequency in primate species (Hauser, 1993; Bowling et al., 2017) and mammal species (Charlton and Reby, 2016). It has also been suggested that, rather than the body size, the larynx size is more directly correlated with fo in primate vocalizations (Garcia et al., 2017). This may explain, for example, the relatively high fo of bonobos, the body size of which deviates from that expected from the acoustic allometry (Grawunder et al., 2018). Within a single species, the inverse relationship between body size and fo has been confirmed in a number of animals including toads and frogs (Martin, 1972; Davies and Halliday, 1978; Ryan, 1988), hamadryas baboons (Pfefferle and Fischer, 2006), Japanese macaques (Inoue, 1988), human males (Evans et al., 2006) and others (Ey et al., 2007). Several studies, however, reported that no clear inverse relationship was observed between body size and fo, e.g. in humans (Lass and Brown, 1978; Künzel, 1989) and macaques living in two different regions (Tanaka et al., 2006). This could be partly due to the variability of call types (Itani, 1963; Green, 1975). Furthermore, the fo range is determined not only by the size of the vocal folds (Garcia et al., 2017) but also by the laryngeal muscles that elongate the vocal folds and by the non-linear tissue properties (Titze et al., 2016). As shown in the present study, low-frequency vocalizations induced by the ventricular folds may provide an additional factor that allows further variation in the fo range of the macaques.
Another point to note is that the interaction of the ventricular folds with the vocal folds may lead more frequently to voice instabilities, which could make precise control of the vocal pitch more difficult. A similar effect arises from the interaction of the vocal membranes and the vocal folds in non-human primates including macaques (Nishimura et al., 2022). It has been argued, however, that such chaotic phonations are ubiquitous aspects of vocal repertoires in many primate species and may even have an evolutionary significance (Fitch et al., 2002). The ventricular folds may contribute to the production of such potentially important vocalizations.
As reported recently, the vocal membranes contribute to animal voice production, especially by adding high-frequency components when they oscillate alone (Nishimura et al., 2022; Håkansson et al., 2022). Although the present study did not focus on the detailed movements of the vocal membranes, we did observe a tendency for the ventricular folds to induce a more drastic drop in fo when the vocal membranes alone oscillated as the main vocal source than when the vocal membranes did not participate much in the vocalization. It will be important in future work to clarify the interaction between the vocal membranes and the ventricular folds. Another issue is that, in the present study, the in vivo recordings were conducted on subjects under anesthesia. It would be of interest to examine the ventricular folds of freely moving macaques. Utilization of the ventricular folds in the vocalizations of other species should also be investigated. The acoustical role of the ventricular folds in animal communication, in terms of either animal social behavior or evolution, should be further discussed. Finally, in the human growl voice, not only the ventricular folds but also other structures such as the aryepiglottic folds have been shown to interact with the vocal folds (Sakakibara et al., 2004). Investigating the influence of such structures may provide additional insight into animal vocalizations, e.g. screaming.
Footnotes
Author contributions
Conceptualization: T.N., I.T.T.; Methodology: R.M., S.M., A.K., T.N., I.T.T.; Validation: T.N., I.T.T.; Formal analysis: R.M., T.Y., I.T.T.; Investigation: R.M., T.Y., M.K., S.M., A.K., Y.K., K.N., T.N., I.T.T.; Resources: S.M., A.K., T.N.; Data curation: I.T.T.; Writing - original draft: I.T.T.; Writing - review & editing: T.N., I.T.T.; Visualization: I.T.T.; Supervision: T.N., I.T.T.; Project administration: T.N., I.T.T.; Funding acquisition: T.N., I.T.T.
Funding
This work was partially supported by Grant-in-Aid for Scientific Research (nos 17H06313, 19H01002, 20K11875, 23H03424) from the Japan Society for the Promotion of Science (JSPS).
Data availability
All data obtained and analyzed in the present study are available from the BioStudies database (https://www.ebi.ac.uk/biostudies/) under accession number S-BSST1008. The computer source code to simulate the vocal and ventricular fold co-oscillations was written in the C programming language and is available online at https://github.com/isaotokuda/Vocal_Ventricular_Folds.
References
Competing interests
The authors declare no competing or financial interests.