In humans, speech is a complex process that requires the coordinated involvement of various components of the phonatory system, which are monitored by the central nervous system. The larynx in particular plays a crucial role, as it enables the vocal folds to meet and converts the exhaled air from our lungs into audible sounds. Voice production requires precise and sustained exhalation, which generates the airflow that builds up the pressure at the glottis required for phonation. Voluntary vocal production begins in the laryngeal motor cortex (LMC), a structure found in all mammals, although the specific location in the cortex varies in humans. The LMC interfaces with various structures of the central autonomic network associated with cardiorespiratory regulation to allow precise coordination between breathing and vocalization. The main subcortical structure involved in this relationship is the mesencephalic periaqueductal grey matter (PAG). The PAG provides the link to autonomic pontomedullary structures such as the parabrachial complex (PBc), the Kölliker–Fuse nucleus (KF), the nucleus tractus solitarius (NTS), and the nucleus retroambiguus (nRA), which modulate cardiovascular autonomic activity in the vasomotor centers and respiratory activity at the level of the generators of the laryngeal-respiratory motor patterns that are essential for vocalization.
1. Introduction
Communication in humans is exceptionally complex, encompassing cultural and social aspects, while communication in animals is primarily linked to survival and reproduction. Although both convey information, the form and scope of this communication vary significantly. Several related concepts are important to distinguish. Phonation refers to the production of basic sounds through the vibration of the vocal folds. Vocalization, on the other hand, involves the articulation or production of specific sounds to communicate, representing a combination of sounds. Finally, speech is the most comprehensive and complex expression of language, encompassing meaningful communication through words and phrases, and represents the highest level in the hierarchy of vocal communication.
Vocalization in animals is a complex phenomenon involving an intricate network of brain connections that coordinate the production of characteristic sounds. The ability to make sounds and communicate is crucial for most species, including humans. Communication in mammals is essential for transmitting information related to reproduction, location, identity, and the presence of predators, making it a survival-linked activity. When discussing the ability of some mammal species to develop vocalizations, it is important to consider that the structures and mechanisms they employ for this purpose are diverse and vary significantly
[1].
In mammals, the main structure that allows phonation is the larynx, a stable structure with moving parts that, in humans, is composed of nine cartilages: three unpaired and three paired. In rats, an additional cartilage, the laryngeal alar cartilage, has been identified. While both humans and rodents use the larynx for vocalization, their vocalization mechanisms differ. Rats and mice possess muscles and cartilages not found in humans, enabling them to produce ultrasounds
[2]. Despite these structural differences, there are similarities in the neural circuits involved in controlling phonation and both species share similar vocal fold properties due to their similar composition
[3].
Laryngeal activity in rodents is controlled by the vagus nerve for the intrinsic musculature and by the hypoglossal nerve, pharyngeal plexus, and cervical nerves for the extrinsic musculature. The vagus nerve, a component of the autonomic nervous system, is subject to cortical influence to some extent, allowing voluntary regulation of certain laryngeal functions such as phonation. It gives rise to the superior laryngeal nerve, whose external branch provides motor innervation to the cricothyroid muscle involved in vocal fold tension
[4], and the internal branch offers somatosensory innervation to the glottis and the area above it for airway protection. The recurrent laryngeal nerve, another division of the vagus nerve, provides motor innervation to the remaining intrinsic laryngeal muscles, excluding the cricothyroid muscle, and sensory innervation to the upper trachea and lower glottis
[5].
Within the human larynx, a distinction can be made between intrinsic and extrinsic musculature. The intrinsic muscles influence the opening and closing dynamics of the vocal folds as well as their elongation or contraction during phonation. Conversely, the extrinsic muscles are responsible for mobilizing the larynx for its ascent or descent. Intrinsic musculature includes the cricothyroid muscle, thyroarytenoid muscle, posterior and lateral cricoarytenoid muscles, and transverse and oblique arytenoid muscles. In rats, two bellies of the thyroarytenoid muscle (lateral and medial), an alar cricoarytenoid muscle, and a superior cricoarytenoid muscle have also been identified
[2].
Various aspects of vocalization have been studied in different mammals, from rats to apes, aiming to better understand the development of speech in humans. Vocalizations require coordination between phonation and respiration, involving a large neural network that includes both anterior brain regions and the brainstem
[6][7][8] (Figure 1).
In mammals, the respiratory system not only supplies the necessary oxygen to meet metabolic demands throughout life but also plays a fundamental role in the entire process of vocalization
[9]. The muscles of the rib cage provide the force necessary for inspiration, with the diaphragm and external intercostal muscles serving as the main inspiratory muscles
[10][11][12]. Expiration typically occurs passively when these inspiratory muscles relax, allowing air to be expelled. However, during activities such as vocalization, expiration can become an active process when the expiratory muscles, specifically the internal intercostal and abdominal muscles, are activated
[10][12][13]. The respiratory system acts as a bellows, with its intercostal and abdominal musculature controlled by motor neurons in the ventral horn of the thoracic and upper lumbar spinal cord. The diaphragm, in turn, is controlled by the motoneurons of the phrenic nucleus in the cervical spinal cord. Impaired function of these muscles can disrupt airflow, affecting activities such as breathing, phonation, and speech
[14][15][16]. The laryngeal muscles also play a significant role in this airflow regulation. They widen the glottis to allow the passage of air into the lungs at the onset of inspiration, while at the end of inspiration they narrow it, decreasing airflow and prolonging expiratory time
[17].
2. Central Autonomic Network Involved in the Control of Respiration and Vocal Emission
Precise sequential contraction of the respiratory muscles is crucial and requires control by different nuclei located in the brainstem. This control optimizes airflow by adapting muscle activity to meet respiratory or vocalization requirements
[18]. Vocalization in humans shares a common physiological and acoustic foundation with the vocal emission of other vertebrates. However, in humans, enhanced neural control of the laryngeal muscles contributes to the stability of vocal fold oscillations, regulating airflow from the lungs
[6]. This enhanced stability in human phonation is attributed to an evolutionary reduction in anatomical complexity, such as the loss of structures like laryngeal sacs or vocal membranes present in non-human primates
[19].
The coordination of vocalization processes is orchestrated by the central nervous system, with an intricate network extending from the laryngeal motor cortex to mesencephalic and pontomedullary regions. Key structures involved in the central control of vocalization include the periaqueductal grey matter (PAG), parabrachial complex (PBc), Kölliker–Fuse nucleus (KF), nucleus ambiguus (nA), and nucleus retroambiguus (nRA). Lesion studies have found that bilateral lesions in the PAG cause mutism, not only in animals but also in humans
[6][20][21][22]. Studies in cats confirmed the importance of the caudal half of the mesencephalon, and of the PAG in particular, for vocal emission. Stimulation of the PAG or nRA produces vocal emissions, and neuronal activity is observed within the PBc during vocalization
[23][24].
Additionally, the researchers and others have demonstrated the role of several pontine nuclei in the control of the sympathoexcitatory response evoked from hypothalamic and mesencephalic regions in rats. Specifically, the researchers have shown how both the PB-KF complex and A5 area maintain a functional relationship and modulate the defence responses evoked by the dorsomedial hypothalamic nucleus and PAG stimulation, with glutamate as the neurotransmitter involved
[25][26][27]. It was shown that a blockade of the lateral parabrachial (lPB) and A5 area attenuated or abolished the cardiorespiratory response evoked from the dorsomedial hypothalamic nucleus or dorsolateral periaqueductal grey matter (dlPAG), while the medial parabrachial region and Kölliker–Fuse (mPB-KF) only mediated the cardiovascular response
[28][29][30] (Figure 2).
Figure 2. Extracellular recordings of three putative cells recorded from the A5 region. (a) Silent neuron (upper trace, four superimposed sweeps). The lower trace shows constant latency responses (four superimposed sweeps) to the dlPAG stimulation. (b) Spontaneously active cell (upper trace, five superimposed sweeps). The lower trace shows excitations with short latency responses from dlPAG stimulation (five superimposed sweeps). Instantaneous respiratory rate (upper trace, rpm), respiratory flow (mL/s), pleural pressure (cmH2O), instantaneous heart rate (bpm), and blood pressure (mmHg) in a spontaneously breathing rat showing the cardiorespiratory response evoked on dlPAG stimulation (c) before the microinjection of muscimol in the A5 region (50 nL over 5 s) and (d) after the microinjection of muscimol in the A5 region (50 nL over 5 s). Black line shows the onset of the dlPAG electrical stimulation (5 s). Researchers’ figure modified from [30].
This dichotomy in the modulation of the respiratory response between the mPB-KF and lPB was reflected in another study, in which activation of the mPB and A5 area modified the respiratory and laryngeal responses, resulting in an increase in subglottic pressure and a decrease in respiratory rate and phrenic nerve activity (an expiratory facilitatory response). Conversely, lPB activation resulted in the opposite response, decreasing subglottic pressure and increasing respiratory rate and phrenic nerve activity (an inspiratory facilitatory response)
[31] (Figure 3). This assigns an important role to these structures not only at the sympathoexcitatory level but also in laryngeal and respiratory control, as they can modify subglottic pressure and adjust it to the animal’s respiratory requirements. This fits well with the historical role assigned to the mPB-KF in respiratory control, where it is referred to as the pneumotaxic center. Together, the parabrachial and Kölliker–Fuse nuclei constitute the pontine respiratory group (PRG), which is involved in the switch from inspiration to expiration through reciprocal connections with the dorsal respiratory group (DRG) and ventral respiratory group (VRG) within the medulla oblongata
[32]. The rhythmic activity required to drive inspiration is carried out by a neural network located within the pre-Bötzinger complex (pre-BötC)
[33], which is triggered in the inspiration phase and indirectly drives the inspiratory muscles
[9][34]. This inspiration is also determined by the activation of the KF
[35]. Finally, the lateral parafacial nucleus is activated during active expiration
[9].
Figure 3. Laryngeal and respiratory responses to (a) electrical stimulation in the mPB, (b) electrical stimulation in the lPB, and (c) glutamate microinjection in the A5 region. Phrenic nerve discharge, respiratory airflow, pleural pressure, subglottic pressure, and integrated phrenic nerve discharge show an expiratory facilitatory response with an increase in subglottic pressure during electrical stimulation (20 μA, 0.4 ms pulses, 50 Hz for 5 s) in the medial parabrachial nucleus, an inspiratory facilitatory response with a decrease in subglottic pressure during electrical stimulation (10 μA, 0.4 ms pulses, 50 Hz for 5 s) in the lateral parabrachial nucleus, and an expiratory facilitatory response with an increase in subglottic pressure during a glutamate injection (10 nL over 5 s) in the A5 region. The arrow shows the onset of the injection. Researchers’ figure modified from [31].
3. Vocalization in Apes: Connectivity between the PAG and the Laryngeal Motor Cortex
Voluntary voice production in humans involves sound modulation and directly depends on the laryngeal motor cortex, situated in the dorsal portion of the ventral zone of the primary motor cortex. Its direct connection with laryngeal motor neurons of the nA/nRA governs the laryngeal muscles for learned vocal pattern emission
[7]. However, it has been demonstrated that during vocal emissions, the voluntary and involuntary systems are activated concurrently
[36]. Automatic involuntary activation of the pathway originating in the primary motor cortex and passing through the PAG and the caudal ventral respiratory group (cVRG) is necessary to impart appropriate emotional character to vocal emissions. Word and phrase production additionally requires activation of the direct pathway from the laryngeal motor cortex, via corticomedullary fibers, to the motor neurons controlling the face, mouth, tongue, larynx, and pharynx
[37].
In apes, the exploration of pathways governing voluntary and involuntary vocalizations has yielded a model that explains vocal control through two hierarchically organized and related networks. Involuntary vocalizations, such as crying or laughing, are regulated by limbic mechanisms distinct from those governing voluntary vocalizations or speech
[6]. These emotional expressions are directed by the emotional system, composed of specific pathways targeting the brainstem and spinal cord
[36][38].
Results obtained from the squirrel monkey suggest that the system includes the cingulate gyrus, PAG, and various pontine and medullary nuclei
[39]. Vocal control ultimately hinges on the primary motor area, a bilateral structure responsible for laryngeal control and orofacial musculature
[40], alongside activation of the superior temporal gyrus to compensate for alterations in auditory feedback during phonation. Two feedback loops, involving the basal ganglia and cerebellum, provide the motor cortex with the necessary information for executing motor commands in phonation. However, these structures appear unnecessary to produce innate vocal patterns. Hence, in humans, the neural system that is involved in emotional expressions like laughter and crying includes different regions from the anterior cingulate cortex, the PAG, nRA, and nA, yet its contribution appears less pivotal in deliberate expressions such as voice and speech, where cortical control seems to exert greater influence
[41]. Furthermore, the speech production system involves predominant activation of the left hemisphere, encompassing the superior temporal gyrus, anterior insula, basal ganglia, and cerebellum. For this production, the activity of the cingulate gyrus and PAG is also necessary, to varying degrees, to associate emotional character with vocal production
[42].
The PAG receives projections from upper limbic regions and cortical areas, including the anterior cingulate gyrus, insula, and orbitofrontal cortex. It maintains connections with the cVRG, which, in turn, has direct access to motor neurons governing vocalization. Specifically, it controls motor neuron groups governing the soft palate, pharynx, larynx, diaphragm, intercostal, abdominal, and pelvic muscles. The primary objective is to regulate/modify intra-abdominal, intrathoracic, and subglottic pressure, critical for vocalization generation
[43][44][45]. In apes, vocalizations can be elicited not only by PAG activation but also by electrical stimulation of various brain regions, such as the hypothalamus, amygdala, bed nucleus of the stria terminalis, orbitofrontal cortex, and anterior cingulate gyrus, all strongly connected to the PAG. The prerequisite is the integrity of the PAG
[46]. Conversely, stimulation of areas not connected to the PAG, such as the motor or premotor cortex, fails to induce vocalizations
[47], underscoring the pivotal role of the PAG in vocalization in non-human primates and humans. The coordination extends to partial vocalizations generated by activating caudal PAG levels through its connection with the cVRG
[48][49][50].
4. Clinical Implications
All the central structures described above mediate autonomic responses to environmental stress and support vocalization. Recent studies demonstrate that laryngeal microstructure and its innervation undergo similar changes during development in rodents and humans
[51]. Central circuits responsible for vocalization are dysregulated in certain central speech disorders, such as spasmodic dysphonia due to laryngeal dystonia, paradoxical laryngeal adduction movements, or muscle tension dysphonia, among others. A more detailed description of the individual contribution of each nucleus in this network to the central control of laryngeal motoneurons would not only enhance our understanding of normal phonatory control but also clarify the central alterations underlying these voice-related disorders, which are described in more detail below.
Specifically, laryngeal respiratory apnea often constitutes a clinically severe manifestation, as seen in newborn apnea or central sleep apneas, caused by immaturity or abnormalities in the central respiratory control in individuals, leading to an exaggerated response of the laryngeal adduction reflex
[52][53]. Moreover, it is also known that spasmodic dysphonia, a focal form of dystonia, is a neurological voice disorder characterized by involuntary “spasms” of the vocal folds, resulting in speech interruptions and affecting voice quality. The cause of spasmodic dysphonia is unknown, although there is some consensus that it involves a central nervous system alteration, particularly in motor control, as alterations in the basal ganglia, cerebellum, and sensorimotor cortex circuitry have been described, along with structural changes in corticobulbar and corticospinal tracts, which are the nerve tracts in contact with bulbar neurons responsible for phonation
[54]. Paradoxical laryngeal adduction movements are characterized by the adduction or approximation of the vocal folds during the respiratory cycle (especially during the inspiratory phase), causing obstruction of the laryngeal airway. The resulting dyspnea and stridor are often confused with asthma but do not respond to treatment with steroids and bronchodilators, as glottic narrowing is independent of bronchial lumen caliber. The origin of this intermittent interruption of transglottic airflow due to paradoxical laryngeal adduction remains to be elucidated. It has been associated with laryngeal irritation by agents such as gastroesophageal reflux or acute severe stress
[55]. Muscle tension dysphonia is a voice disorder caused by inappropriate laryngeal muscle use. This pathology involves increased muscle tension in the larynx and, more specifically, insufficient relaxation of the posterior cricoarytenoid muscle (the abductor of the vocal folds), which is controlled by the nA, during phonation. There is also an imbalance of synergistic and antagonistic muscle forces, the persistence of which produces organic alterations at the level of the vocal folds, exacerbating the condition associated with stress
[56].