Application of Adversarial Domain Adaptation to Voice Activity Detection

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Voice Activity Detection (VAD) is becoming an essential front-end component in various speech processing systems. Because those systems are commonly deployed in environments with diverse noise types and low signal-to-noise ratios (SNRs), an effective VAD method should robustly detect speech regions against noisy background signals. In this paper, we propose applying an adversarial domain adaptation technique to VAD. The proposed method trains deep neural network (DNN) models for the VAD task in a supervised manner while simultaneously mitigating the domain mismatch between noisy and clean audio streams in an unsupervised manner. The experimental results show that the proposed method improves detection robustness in noisy environments compared to other DNN-based models trained on hand-crafted acoustic features.
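
The abstract does not say how the supervised and unsupervised objectives are combined. As a rough illustration only, the sketch below uses a gradient reversal term in the style of domain-adversarial training, which is one common way to do this and is an assumption here, not the authors' confirmed method. The toy scalar model, the function name `feature_gradient`, the weight names, the domain labels (0 = clean, 1 = noisy), and the trade-off weight `lam` are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used by both toy classifier heads."""
    return 1.0 / (1.0 + math.exp(-z))


def feature_gradient(x, w_f, w_vad, w_dom, speech_label, domain_label, lam):
    """Gradient of the combined objective w.r.t. the shared feature weight w_f.

    Toy model (assumed for illustration, not taken from the paper):
        feature      f     = w_f * x
        speech prob  p_vad = sigmoid(w_vad * f)   # supervised VAD head
        domain prob  p_dom = sigmoid(w_dom * f)   # domain head, 0 = clean, 1 = noisy

    speech_label is None for samples without a speech/non-speech label;
    such samples contribute only through the domain term.
    """
    f = w_f * x
    grad = 0.0

    if speech_label is not None:
        # Binary cross-entropy gradient for the VAD head: (p - y) * w_vad * x
        p_vad = sigmoid(w_vad * f)
        grad += (p_vad - speech_label) * w_vad * x

    # Gradient reversal: the domain-loss gradient reaches the feature
    # extractor with its sign flipped and scaled by lam, pushing the
    # features toward being indistinguishable across domains.
    p_dom = sigmoid(w_dom * f)
    grad -= lam * (p_dom - domain_label) * w_dom * x

    return grad


# A labeled clean frame with the adversarial term switched off (lam = 0)
g_supervised = feature_gradient(1.0, 0.5, 1.0, 2.0, speech_label=1, domain_label=0, lam=0.0)

# An unlabeled noisy frame: only the reversed domain gradient is present
g_adversarial = feature_gradient(1.0, 0.5, 1.0, 2.0, speech_label=None, domain_label=1, lam=1.0)
```

With `lam = 0` the update reduces to ordinary supervised VAD training. Raising `lam` trades VAD accuracy on labeled data against domain invariance of the shared features.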

Original language: English
Title of host publication: Intelligent Systems and Applications - Proceedings of the 2021 Intelligent Systems Conference, IntelliSys
Editors: Kohei Arai
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 823-829
Number of pages: 7
ISBN (Print): 9783030821982
DOIs
State: Published - 2022
Event: Intelligent Systems Conference, IntelliSys 2021 - Virtual, Online
Duration: 2 Sep 2021 to 3 Sep 2021

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 296
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: Intelligent Systems Conference, IntelliSys 2021
City: Virtual, Online
Period: 2/09/21 to 3/09/21

Keywords

  • Domain adversarial adaptation
  • Generative adversarial network
  • VAD
  • Voice activity detection
