PUBMED FOR HANDHELDS


  • Title: Comparison of ideal mask-based speech enhancement algorithms for speech mixed with white noise at low mixture signal-to-noise ratios.
    Author: Graetzer S, Hopkins C.
    Journal: J Acoust Soc Am; 2022 Dec; 152(6):3458. PubMed ID: 36586840.
    Abstract:
    The literature shows that the intelligibility of noisy speech can be improved by applying an ideal binary or soft gain mask in the time-frequency domain for signal-to-noise ratios (SNRs) between -10 and +10 dB. In this study, two mask-based algorithms are compared when applied to speech mixed with white Gaussian noise (WGN) at lower SNRs, that is, SNRs from -29 to -5 dB. These comprise an Ideal Binary Mask (IBM) with a Local Criterion (LC) set to 0 dB and an Ideal Ratio Mask (IRM). The performance of three intrusive Short-Time Objective Intelligibility (STOI) variants, namely STOI, STOI+, and Extended Short-Time Objective Intelligibility (ESTOI), is compared with that of other monaural intelligibility metrics that can be used before and after mask-based processing. The results show that IRMs can be used to obtain near maximal speech intelligibility (>90% for sentence material) even at very low mixture SNRs, while IBMs with LC = 0 dB provide limited intelligibility gains for SNR < -14 dB. It is also shown that, unlike STOI, STOI+ and ESTOI are suitable metrics for speech mixed with WGN at low SNRs and processed by IBMs with LC = 0 dB, even when speech is high-pass filtered to flatten the spectral tilt before masking.
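
The two masks named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used textbook definitions, which the abstract itself does not spell out: the IBM is 1 in a time-frequency unit when the local SNR exceeds the LC and 0 otherwise, and the IRM is (S / (S + N))^beta with beta = 0.5, where S and N are the speech and noise powers in that unit. The function names, the strict ">" comparison, the beta value, and the nested-list layout (frames by frequency bins) are illustrative assumptions, not the authors' implementation.

```python
def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Return 1.0 where the local SNR exceeds lc_db, else 0.0.

    Assumed definition: local SNR = 10*log10(S/N) per time-frequency unit.
    The comparison S > N * 10^(LC/10) is equivalent and avoids log(0).
    """
    threshold = 10.0 ** (lc_db / 10.0)  # LC converted from dB to a power ratio
    return [
        [1.0 if s > n * threshold else 0.0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Return the soft gain (S / (S + N)) ** beta per time-frequency unit.

    beta = 0.5 is an assumed, commonly used exponent; units with no
    energy at all are given a gain of 0.0.
    """
    return [
        [(s / (s + n)) ** beta if (s + n) > 0.0 else 0.0
         for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def apply_mask(mask, mixture):
    """Multiply each time-frequency unit of the mixture by its mask gain."""
    return [
        [g * x for g, x in zip(g_row, x_row)]
        for g_row, x_row in zip(mask, mixture)
    ]
```

Both masks are "ideal" in the sense that they require the clean speech and the noise separately, so they serve as upper-bound references rather than deployable enhancers. In practice the powers would come from short-time Fourier transforms of the speech and noise signals, and the masked mixture would be resynthesised by an inverse transform; those steps are omitted here.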