Details
Poster
Presenter(s)
Display Name
Ning Pu
Affiliation
Tsinghua University
Country
China
Abstract
This paper proposes an analog-intensive voice feature extraction method based on time-domain filtering, which avoids the computational complexity of Mel-frequency cepstral coefficient (MFCC) extraction. Compared with MFCC features, the estimated power consumption of the proposed feature extraction is reduced by approximately 4×. In addition, a binarized neural network model with few parameters, called DSXNOR-Net, is proposed together with the generalized end-to-end (GE2E) loss function for speaker verification (SV). The feature extraction method and DSXNOR-Net are verified in Python. The total estimated power consumption is reduced by approximately 2× for SV and approximately 4× for keyword spotting (KWS).
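As a rough illustration of why binarized networks are attractive for low-power hardware, the sketch below shows the generic XNOR-and-popcount dot product used by XNOR-style binarized layers. It is not the DSXNOR-Net implementation, whose details are not given here; the function names `binarize`, `xnor_dot`, and `sign_dot` and the sample vectors are hypothetical and chosen only for this example.

```python
def binarize(values):
    """Encode real values as bits: 1 stands for +1 (value >= 0), 0 stands for -1."""
    return [1 if v >= 0 else 0 for v in values]


def xnor_dot(a_bits, b_bits):
    """Dot product of two {-1, +1} vectors stored as bits.

    XNOR of two bits is 1 exactly when they match, so counting matches
    (a popcount of the XNOR result) gives: dot = 2 * matches - n.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("vectors must have the same length")
    n = len(a_bits)
    matches = sum(1 for a, b in zip(a_bits, b_bits) if a == b)
    return 2 * matches - n


def sign_dot(x, w):
    """Reference result: multiply-accumulate on the sign values directly."""
    def to_pm1(v):
        return 1 if v >= 0 else -1
    return sum(to_pm1(a) * to_pm1(b) for a, b in zip(x, w))


# Hypothetical activations and weights, for illustration only.
x = [0.3, -1.2, 0.0, 2.5]
w = [-0.7, -0.1, 0.9, 1.1]
print(xnor_dot(binarize(x), binarize(w)))  # matches sign_dot(x, w)
```

The point of the sketch is that the multiply-accumulate over sign values reduces to bitwise matching and counting, which is the usual reason binarized models are considered for power-constrained inference.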