Web Education
Viewing 1-10 of 16 total results
Enhanced voice activity detection in kernel subspace domain
This paper proposes a voice activity detection (VAD) method in a kernel subspace domain to improve the performance of the kernel-based VAD. A linear transform matrix that can simultaneously diagonalize the two covariance matrices using kernel principal component analysis is presented to generate the kernel subspace. The likelihood ratio test based on Gaussian distributions is applied for the ......
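The likelihood ratio test mentioned in this snippet can be illustrated in its simplest scalar form. The sketch below is not the paper's kernel-subspace method: it assumes zero-mean Gaussian features whose variance is `noise_var` under the noise-only hypothesis and `noise_var + speech_var` under the speech hypothesis. The function names, the per-frame averaging, and the threshold are all illustrative assumptions.

```python
import math

def gaussian_log_likelihood(x, variance):
    """Log density of a zero-mean Gaussian with the given variance at x."""
    return -0.5 * (math.log(2 * math.pi * variance) + x * x / variance)

def frame_log_likelihood_ratio(features, noise_var, speech_var):
    """Average log-likelihood ratio of 'speech + noise' versus 'noise only'.

    Assumption: every feature is zero-mean Gaussian, with variance
    noise_var under H0 and noise_var + speech_var under H1.
    """
    v0 = noise_var
    v1 = noise_var + speech_var
    total = sum(
        gaussian_log_likelihood(x, v1) - gaussian_log_likelihood(x, v0)
        for x in features
    )
    return total / len(features)

def is_speech(features, noise_var, speech_var, threshold=0.0):
    """Declare speech when the averaged ratio exceeds an illustrative threshold."""
    return frame_log_likelihood_ratio(features, noise_var, speech_var) > threshold
```

With `noise_var=1.0` and `speech_var=4.0`, a frame of large-amplitude features such as `[3.0, -3.0, 2.5]` is flagged as speech, while `[0.1, -0.2, 0.05]` is not. In practice the variances would have to be estimated from data, which this sketch leaves out.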
Noise Cancellation in Presence of Transient Noise using ...
Keywords: Gaussian mixture model, spectral clustering, transient noise, log-likelihood algorithm, voice activity detection. 1. Introduction: Voiced and unvoiced classification is an unsolved problem in speech processing and affects diverse applications including robust speech recognition, discontinuous transmission, real-time speech communication on ......
An improved noise-robust voice activity detector based on ...
In this paper, HSMM is introduced to explicitly model the duration distribution. Signals composed of speech and noise are regarded as a time duration hidden Markov chain with two states (speech and non-speech), and are modeled by the HSMM λ = (A, B, τ, π) shown in Fig. 1. Similar to conventional HMM-based VADs, three critical aspects including speech features, feature distributions and ...
A Robust Voice Activity Detection Based on Noise ...
A robust voice activity detector (VAD) is expected to increase the accuracy of ASR in noisy environments. This study focuses on how to extract robust information for designing a robust VAD. To do so, we construct a noise eigenspace by the principal component analysis of the noise covariance matrix....
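As a rough illustration of building an eigenspace from a noise covariance matrix, the sketch below estimates the covariance of noise-only feature vectors and finds its leading principal direction by power iteration. How the study uses the eigenspace for detection is not visible in this snippet, so that step is not shown. The function names and the choice of power iteration are assumptions made for this example.

```python
import math

def covariance_matrix(vectors):
    """Return (mean, sample covariance) for a list of equal-length feature vectors."""
    n = len(vectors)
    d = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]
    centered = [[v[i] - mean[i] for i in range(d)] for v in vectors]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return mean, cov

def dominant_eigenpair(matrix, iterations=200):
    """Leading eigenvalue and eigenvector of a symmetric matrix via power iteration.

    Caveat: the start vector must not be orthogonal to the leading
    eigenvector; the fixed start [1, 2, ..., d] is only a convenience.
    """
    d = len(matrix)
    vec = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
    eigval = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)
    )
    return eigval, vec
```

Further principal directions could be obtained by deflating the covariance matrix and repeating the iteration. A numerical library's symmetric eigensolver would be the usual choice outside a sketch like this.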
Robust voice activity detection based on noise eigenspace
Dongwen Ying (1, 2), Yu Shi, Xugang Lu, Jianwu Dang and Frank Soong (2). (1) School of Information Science, Japan Advanced Institute of Science and Technology; (2) Microsoft Research Asia. (Received 5 March 2007, accepted for publication 1 June 2007.) Abstract: In this study, we propose a voice activity detector (VAD) based on a noise eigenspace,...
Features for voice activity detection: a comparative ...
Voice activity detection usually addresses a binary decision on the presence of speech for each frame of the noisy signal. Approaches that locate speech portions in time and frequency domain, such as speech presence probability (SPP) or ideal binary mask (IBM) estimation, can be considered as extensions of VAD that exceed the scope of this article....
Supervised/Unsupervised Voice Activity Detectors for Text ...
2. Voice Activity Detectors. The voice activity detection (VAD) problem considers detecting the presence of speech in an utterance. A VAD usually has the following three modules [1]: 1. Feature extraction: The objective of this module is to extract discriminative features from the observed signal for detection....
Youngmoon Jung, Yeunju Choi, Hoirin Kim School of ...
world environments by using self-adaptive soft VAD. Index Terms: speaker verification, voice activity detection, unsupervised domain adaptation, soft VAD. 1. INTRODUCTION: Speaker verification (SV) is the task of verifying a person’s claimed identity based on his or her voice. An important component of a practical SV system is voice ......
https://arxiv.org/pdf/1909.11886v1.pdf
An Empirical Mode Decomposition-based detection and ...
In this work, we present a detection process using empirical mode decomposition (EMD). EMD is an adaptive tool that breaks down time-domain signals into amplitude-modulated and frequency-modulated (AM-FM) components called intrinsic mode functions (IMFs). EMD is the foundation of the Hilbert-Huang Transform (HHT) (Huang, 2014 ...
Advice for Audio classifier based on Voice Activity Detection
I am writing a program to classify recorded phone-call audio files (WAV) as containing at least some human voice or no voice (only DTMF, dial tones, ringtones, noise). I tried implementing a simple VAD (voice activity detector) using ZCR (zero-crossing rate) and energy, but these parameters confuse DTMF and dial-tone files with voice....
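For reference, the kind of simple energy-plus-ZCR detector described in this question might look like the sketch below. The frame length, hop size, and both thresholds are illustrative assumptions, not values from the post. Note that it reproduces the problem the poster describes: a steady tone has high energy and a low zero-crossing rate, so it is flagged as active just like voiced speech.

```python
def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, hop)
    ]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def simple_vad(samples, frame_len=160, hop=80,
               energy_thresh=0.01, zcr_thresh=0.25):
    """Flag a frame as active when its energy is high and its ZCR is low.

    All thresholds are illustrative and assume samples scaled to [-1, 1].
    """
    return [
        short_time_energy(f) > energy_thresh
        and zero_crossing_rate(f) < zcr_thresh
        for f in frame_signal(samples, frame_len, hop)
    ]
```

Because both features respond to tones in the same way as to voiced speech, separating DTMF or dial tones from voice would need some additional cue beyond these two per-frame measurements.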