Denoising Techniques for High-Performance MEMS Microphones: Comparison
Please note this is a comparison between Version 1 by Zhuoyue Zheng and Version 2 by Catherine Yang.
The MEMS (Micro-Electro-Mechanical Systems) microphone is a representative device of the MEMS family that has attracted substantial research interest, and microphones tailored for the human voice have achieved distinct commercial success. With the advancement of microphone technology and the market, microphones have become carriers for various intelligent applications and therefore face stricter requirements for noise suppression. For instance, a hearing aid should suppress ambient noise while delivering relevant sounds to the user. Additionally, in specific environments, the noise performance of microphones is crucial for communication and for intelligent voice-related needs such as voice activation and speech recognition.
  • MEMS microphone
  • environmental noise-cancelling
  • directional microphone
  • capacitive microphone

1. Introduction

The microphone, a device capable of sensing acoustic vibrations and transducing them into electrical signals, has evolved rapidly since its emergence. From its origins, it has expanded into various domains, including commercial communication, medical applications, industrial usage, and both the surveillance and military sectors.
The inception of the electret condenser microphone (ECM), pioneered by Gerhard Sessler and James West in the early 1960s [1], marked a pivotal milestone. Subsequently, the advancement of MEMS (Micro-Electro-Mechanical Systems) technology, coupled with strides in materials science, drove a shift in this area by enabling the miniaturization of macro-scale counterparts. This shift not only reduced the size of the devices but also propelled substantial innovations in the operational mechanisms of microphones, offering new perspectives and increased design flexibility [2].
The merits of miniaturization are distinct, particularly in delivering better noise performance, reduced power consumption, and multifunctional capabilities for electronic devices. As such, MEMS microphones prevail in consumer electronics such as smartphones, tablets, wearables, and computers, as well as in automobiles and IoT devices, and are expected to see continued growth in demand. According to a report from Maximize Market Research, the MEMS microphone market was worth USD 1.82 billion in 2022 and is expected to reach USD 3.97 billion by 2029 at a CAGR of 11.8 percent [3].
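The market figures quoted above can be cross-checked with the standard compound-growth formula. The short sketch below is only an arithmetic check; the function names are illustrative, and the seven-year horizon is an assumption taken from the 2022 and 2029 endpoints.

```python
def project_value(start_value, cagr, years):
    """Compound a starting value forward at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that links two values a number of years apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


YEARS = 2029 - 2022  # assumed horizon, taken from the two quoted years

projected_2029 = project_value(1.82, 0.118, YEARS)   # USD billion
rate = implied_cagr(1.82, 3.97, YEARS)

print(f"Projected 2029 market size: USD {projected_2029:.2f} billion")
print(f"Implied CAGR from the two endpoints: {rate:.2%}")
```

Under these assumptions the two endpoints and the quoted growth rate agree with each other to within rounding.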
Presently, the most economically competitive transduction mechanisms of MEMS microphones in use are the capacitive, piezoresistive [4], and piezoelectric [5] types. Among these, capacitive microphones stand out for offering relatively higher signal strength, reaching up to hundreds of microvolts. However, they require additional power, are relatively sensitive to environmental factors such as dust and humidity, involve a complex manufacturing process, and can be more susceptible to electromagnetic interference. Despite these drawbacks, the simplicity of their structural design and their compatibility with CMOS technology allow capacitive MEMS microphones to retain the leading position in the commercial market. In contrast, although piezoresistive microphones provide a wider dynamic range, they are limited in sensitivity and power consumption.
Utilizing piezoelectric materials for acoustic sensing offers the advantage of passive devices that need no additional input power, rendering them well-suited for portable applications. Furthermore, their robust design makes them less susceptible to environmental factors and requires a simpler manufacturing process than capacitive MEMS microphones. However, achieving performance comparable to that of capacitive microphones has posed a persistent challenge, in particular for CMOS-compatible materials such as aluminum nitride (AlN) or zinc oxide (ZnO) [6]. In addition, the operational range of piezoelectric MEMS microphones is contingent upon the material used, indicating promising potential in the future market. Collectively, microphone applications must strike a delicate balance between the demands of various usage scenarios, process intricacies, reliability, and environmental sustainability.

2. Utilizing the Resonant Responses of Membranes

Due to the inherent resonant responses of membrane devices, strategically configuring mechanical structures and quality factors (Q-factors) theoretically enables passive noise filtering and amplification of the desired signal. Reger et al. [7][60] implemented this approach in piezoelectric MEMS microphones based on aluminum nitride (AlN). They fine-tuned the resonant frequency by suspending a diaphragm using etched tethers anchored to the boundary. It should be noted, however, that a flat frequency response is of great importance for accurately reproducing speech characteristics. Although distorted speech may not suit most speech recognition-based applications, the zero-power-consumption feature of the piezoelectric principle suggests that passive filters might find appropriate usage in certain wake-up applications.
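The trade-off between passive amplification near resonance and a flat response can be illustrated with a generic second-order resonator model. This is a minimal sketch under the assumption that the membrane behaves as a single-mode mass-spring-damper; the 1 kHz resonant frequency and the Q values are hypothetical and are not taken from Reger et al.

```python
import math


def resonator_gain(f_hz, f0_hz, q):
    """Magnitude response of a second-order resonator, normalized to 1 at low frequency."""
    r = f_hz / f0_hz
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (r / q) ** 2)


def gain_db(f_hz, f0_hz, q):
    """Same response expressed in decibels."""
    return 20.0 * math.log10(resonator_gain(f_hz, f0_hz, q))


F0 = 1000.0  # hypothetical resonant frequency in Hz

for q in (2.0, 10.0, 50.0):
    at_resonance = gain_db(F0, F0, q)       # passive gain at the resonant frequency
    off_resonance = gain_db(F0 / 4, F0, q)  # gain two octaves below resonance
    print(f"Q={q:5.1f}  on-resonance {at_resonance:6.1f} dB  off-resonance {off_resonance:5.1f} dB")
```

In this model the gain at resonance equals Q, so a higher Q gives stronger passive selection of the target band at the cost of a less flat response.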
In addition to the inherent resonance of thin films for filtering, acoustic resonators have been explored [8][61]. Kusano [9][62], among others, took inspiration from the human cochlea and employed a 3D-printed spiral-shaped structure. The structure is used in conjunction with a microphone to select a specific frequency range while suppressing others through resonance and anti-resonance frequencies. However, the attenuation level can significantly reduce the quality factor of the resonance, potentially limiting its suitability to specific application scenarios. Moreover, the relatively large size of the assembled device is a disadvantage for applications aimed at miniaturized microphones.

3. Utilizing BF-Compliant Directional Microphones

Among the noise reduction schemes for MEMS microphones, beamforming has proven a highly effective technique. Such a method suppresses noise by weighting audio signals from various directions, particularly enhancing specific sound directions while suppressing others. Presently, beamforming primarily relies on omnidirectional microphone arrays, in which each output undergoes digital signal processing (DSP) techniques to manipulate specific time delays and phase adjustments. This necessitates an additional DSP module in the interface or ASIC circuitry. Furthermore, integrating microphone arrays into compact packages poses significant challenges. To tackle those problems, implementing noise suppression on the basis of the mechanical or sensor system design with directional selectivity would further drive device miniaturization. For instance, bi-directional sound sensors are able to achieve acoustic beamforming in practice and the directional characteristics can be easily changed according to the weighted sum of the signals acquired from only a pair of sensors [10][63].
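The array-based approach described above can be sketched as a delay-and-sum beamformer. The example below is a simplified illustration under stated assumptions: a far-field plane wave, a uniform linear array of ideal omnidirectional microphones, and hypothetical values for the microphone count, spacing, and speed of sound. It is not a description of any specific device in the cited works.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def array_response(theta_deg, steer_deg, freq_hz, n_mics=4, spacing_m=0.02):
    """Normalized delay-and-sum magnitude response of a uniform linear array.

    Angles are measured from broadside (the normal to the array axis).
    """
    theta = math.radians(theta_deg)
    steer = math.radians(steer_deg)
    total = 0j
    for n in range(n_mics):
        # Arrival delay of the plane wave at microphone n, minus the steering delay
        # that the DSP stage applies to that channel.
        delay = n * spacing_m * (math.sin(theta) - math.sin(steer)) / SPEED_OF_SOUND
        total += cmath.exp(-2j * math.pi * freq_hz * delay)
    return abs(total) / n_mics


# Steer the beam to 30 degrees and sample the response at a few arrival angles.
for angle in (-60, -30, 0, 30, 60):
    print(angle, round(array_response(angle, 30.0, 4000.0), 3))
```

Sound arriving from the steering direction adds coherently and keeps unit gain, while other directions are attenuated by an amount that depends on frequency, spacing, and the number of microphones.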
As previously mentioned, whether through piezoelectric or capacitive transductions, teetertotter-style microphones inherently possess directional selectivity and can generate bi-polar directional patterns. This is attributed to both sound pressure intensity and sound pressure-gradient information, and can be described as
 
in which the first and second parts of the equation describe the omnidirectional load and the gradient load, respectively. The Ormia ochracea-inspired teetertotter-style microphone, initially proposed by Miles et al. [11][12][13][14][15][42,43,44,45,46] (see also Refs. [16][17][18][19][64,65,66,67]), enabled the creation of a figure-of-eight polar pattern. Subsequent research involved adjusting the relative sizes of the two wings [20][68], varying the diaphragm thickness to modulate sensitivity [21][69], and integrating a force feedback setup to manage thermal–mechanical noise and provide active Q control [22][70]. For more comprehensive insights into this subject, additional relevant literature can be found in the review paper compiled by Ishfaque [23][7].
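As a generic illustration of how an omnidirectional term and a pressure-gradient term combine into a polar pattern, the sketch below uses the textbook first-order directivity a + (1 - a) cos(theta). This form and the weight name omni_weight are assumptions introduced for illustration and are not necessarily the expression used by the authors.

```python
import math


def first_order_pattern(theta_deg, omni_weight):
    """First-order directivity a + (1 - a) * cos(theta), with 0 <= a <= 1.

    omni_weight (a) scales the pressure (omnidirectional) term;
    1 - a scales the pressure-gradient term.
    """
    theta = math.radians(theta_deg)
    return omni_weight + (1.0 - omni_weight) * math.cos(theta)


patterns = (("omnidirectional", 1.0), ("cardioid", 0.5), ("figure-of-eight", 0.0))
for name, a in patterns:
    samples = [round(first_order_pattern(t, a), 3) for t in (0, 90, 180)]
    print(f"{name:16s} response at 0/90/180 degrees: {samples}")
```

With the gradient term alone, the response has nulls at 90 degrees and opposite polarity at 0 and 180 degrees, which corresponds to the figure-of-eight pattern discussed in the text.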
The teetertotter-style microphone primarily operates within two resonant frequencies (two vibration modes) and their adjacent bands, which restricts the sensor’s working bandwidth (usually <1 kHz). Nevertheless, signals below 1 kHz are crucial for speech applications and environmental noise localization [24][71]. As illustrated in Figure 1, Zhang et al. [25][72] achieved low-frequency operation at 500 Hz and 2 kHz by adjusting the central axis position of the device to modify its resonant frequencies, and they utilized piezoelectric detection with capacitive auxiliary detection. Ren et al. further optimized Zhang’s design by tuning the two modal frequencies to 395 Hz and 739 Hz while leveraging the high vibration sensitivity of a fiber-optic Fabry-Perot interferometer (FPI) at the diaphragm’s distal edge [26][73]. These advancements aim to implement cost-effective miniature directional microphones with exceptional low-frequency Sound Source Localization (SSL) capability.
Figure 1. SEM images of the asymmetric microphone (Reprinted with permission from Ref. [25][72]).
Whether operating with piezoelectric or capacitive transduction, the teetertotter-style microphone primarily suffers from a poor signal-to-noise ratio, a narrow frequency bandwidth, and an insufficiently flat response [27][74]. Although capacitive sensors have certain limitations in terms of device space compared to piezoelectric ones [28][37], they offer an alternative route to a low-frequency sound pressure-referred noise floor and frequency selectivity.
Inspired by the human cochlea, Kang et al. [29][75] proposed a bipolar (figure-of-8 pattern) directional sound sensor using 16 cantilevers operating in a resonant mode, as illustrated in Figure 2. As in the previous work of Baumgartel [30][49], the cantilevers have individual resonance frequencies and acquire signals separately; the signals are then combined for sound sensing to cover a frequency range of 100 Hz to 8000 Hz. The directional ambiguity introduced by bipolar directionality is overcome using a Canted Angle Design [10][63]. Another merit of cantilevers is their relatively low processing requirement, since simple signal processing is highly relevant for subsequent applications such as human voice localization and control design for wearable devices. However, in Kang’s work the frequency response ripples by approximately 15 dB in sensitivity magnitude, a significant proportion relative to the sensitivity range (−20 dB to −40 dB). Nonetheless, Kang’s work undoubtedly offers valuable insights for the subsequent design of directional microphones.
Figure 2. Design of proposed sensor. (a) Simulation model and results of cantilever displacements by a sound wave. (b) Bipolar directivity of proposed sensor (Reprinted with permission from Ref. [29][75]).
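To give a feel for how 16 resonant elements might tile the 100 Hz to 8000 Hz range, the sketch below spaces the resonance frequencies logarithmically. The logarithmic spacing is an assumption made here by analogy with cochlea-like designs; the text does not state how Kang et al. distributed the cantilever resonances, and the function name is illustrative.

```python
def log_spaced_resonances(f_low_hz, f_high_hz, count):
    """Return `count` frequencies spaced by a constant ratio between the two limits."""
    ratio = (f_high_hz / f_low_hz) ** (1.0 / (count - 1))
    return [f_low_hz * ratio ** i for i in range(count)]


# Assumed layout: 16 resonators spanning the band quoted in the text.
resonances = log_spaced_resonances(100.0, 8000.0, 16)
step_ratio = resonances[1] / resonances[0]

print([round(f) for f in resonances])
print(f"Ratio between neighbouring resonances: {step_ratio:.3f}")
```

Under this assumption each cantilever sits roughly a third higher in frequency than its neighbour, and the response between adjacent resonances is where ripple in the combined sensitivity would be expected to appear.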

4. Other Applications in Noise Cancelation

For more specific applications, such as in-vehicle noise reduction, directional signal selection can also be achieved at the packaging level. As illustrated in Figure 3a, Yoo et al. [31][23] presented a unidirectional microphone that suppresses noise signals from undesired directions. As illustrated in Figure 3b, the directional characteristic of the microphone is realized by attaching a porous SU-8 filter that introduces a delay in one of the two acoustic ports on the package. Experimental data indicated that the proposed unidirectional MEMS microphone with the devised packaging shows a front-back ratio of 27.1 dB, resulting in effective suppression of fixed-directional noise. However, its drawback compared to device-based designs lies in the difficulty of controlling the directional selection through circuits. Additionally, packaging-level filtering does not place high manufacturing requirements on the sensor, but it does place high demands on the control of the filter’s hole ratio and on the packaging assembly.
Figure 3. Directional MEMS microphone package: (a) Outside and inside images of package and (b) SU-8 filter attached on upper hole of package (Reprinted with permission from Ref. [32][30]).
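The reported front-back ratio can be converted from decibels into a linear ratio to make its size more tangible. The sketch below assumes the usual conventions (20 log10 for amplitude quantities such as sensitivity, 10 log10 for power); the text does not state which convention the 27.1 dB figure follows.

```python
def db_to_amplitude_ratio(level_db):
    """Linear ratio for an amplitude-type quantity (20*log10 convention)."""
    return 10.0 ** (level_db / 20.0)


def db_to_power_ratio(level_db):
    """Linear ratio for a power-type quantity (10*log10 convention)."""
    return 10.0 ** (level_db / 10.0)


FRONT_BACK_DB = 27.1  # value quoted in the text

print(f"Amplitude ratio: {db_to_amplitude_ratio(FRONT_BACK_DB):.1f}")
print(f"Power ratio:     {db_to_power_ratio(FRONT_BACK_DB):.0f}")
```

If the figure is an amplitude ratio, sound from the front is picked up a little over twenty times more strongly than sound from the back.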
Apart from suppressing noise from specific directions, research also focuses on noise suppression in different frequency bands of omnidirectional microphones. Such designs primarily target applications akin to hearing aids. Although noise can be reduced using analog filters, digital signal processing [33][76], or a resonant microphone array (RMA), these approaches cannot eliminate the original noise that directly enters the ear.
Hence, to address noise leakage, active noise cancellation (ANC) is carried out by picking up the noise in a specific frequency band [34][50]. Liu et al. [35][77] presented ANC based on a MEMS RMA and demonstrated a better noise reduction level than with flat-band microphones. They used two sets of resonant microphone arrays composed of multiple piezoelectric cantilever microphones with different resonance frequencies covering two frequency ranges: one from 0.8 kHz to 5 kHz for vocal sensing and the other from 5 kHz to 9 kHz for ANC. The ANC was implemented with an analog inverter, a digital phase compensator, a digital adaptive filter, and a deep learning technique, with the deep learning technique outperforming the digital adaptive filter.
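The digital adaptive filter mentioned above can be illustrated with a least-mean-squares (LMS) noise canceller, one common adaptive-filter algorithm. The text does not state which algorithm Liu et al. used, so this is a sketch under assumptions: the signals are synthetic, the two-tap leakage path is hypothetical, and the cancellation happens in the electrical domain rather than acoustically.

```python
import math
import random


def lms_cancel(primary, reference, taps=8, step=0.01):
    """Subtract an adaptively filtered reference from the primary signal.

    Returns the error signal, which serves as the noise-reduced output.
    """
    weights = [0.0] * taps
    history = [0.0] * taps
    output = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]                      # newest reference sample first
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate                              # primary minus estimated noise
        weights = [w + 2.0 * step * error * h for w, h in zip(weights, history)]
        output.append(error)
    return output


def mean_square(a, b):
    """Mean squared difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
N = 4000
speech = [0.5 * math.sin(2 * math.pi * 0.01 * i) for i in range(N)]  # stand-in for the wanted signal
noise = [random.uniform(-1.0, 1.0) for _ in range(N)]                # reference noise pickup
# Noise reaching the primary microphone through a hypothetical 2-tap acoustic path.
leak = [0.8 * noise[i] + 0.3 * (noise[i - 1] if i else 0.0) for i in range(N)]
primary = [s + v for s, v in zip(speech, leak)]

cleaned = lms_cancel(primary, noise)

# Compare residual noise power over the last 1000 samples, after adaptation has settled.
before = mean_square(primary[-1000:], speech[-1000:])
after = mean_square(cleaned[-1000:], speech[-1000:])
print(f"Noise power before: {before:.4f}, after: {after:.4f}")
```

The reference channel plays the role of the microphones that pick up noise in the ANC band, while the primary channel carries the wanted signal plus leaked noise; the step size trades convergence speed against residual error.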
The results showed that, in all the tested cases, the word error rate improved with ANC, and the best performance was attained around the resonance frequencies of the resonant microphones. This suggests that active noise cancellation at specific frequencies can be achieved through mechanical structural design.