For template-matching-based keyword recognition, we propose using bottleneck (BN) features, taken from an intermediate layer of a deep neural network, as the input features that replace traditional acoustic parameters when generating posteriorgrams. First, a deep neural network with a very narrow intermediate layer is trained following the conventional speech recognition procedure, and all speech features are passed through this network to obtain robust BN features. A Gaussian mixture model is then used to convert the BN features into posteriorgrams. During recognition, the posteriorgrams serve as the feature parameters, and a simplified segmental dynamic time warping algorithm performs the keyword matching. On the TIMIT database, the system using BN features improves recognition accuracy by 30% compared with a system using traditional perceptual linear prediction (PLP) parameters.
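The matching step can be illustrated with a minimal sketch, which is not the simplified segmental DTW of this work but a plain sliding-window DTW over posteriorgrams. It assumes each frame is already a posterior vector over Gaussian components, and it assumes the negative log inner product as the frame distance, a common choice for posteriorgrams that the abstract does not specify. The names `frame_distance`, `dtw_cost`, and `best_match` are hypothetical and chosen only for illustration.

```python
import math


def frame_distance(p, q, eps=1e-10):
    """Assumed local distance: -log of the inner product of two posterior vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))


def dtw_cost(query, segment):
    """Length-normalized DTW alignment cost between two posteriorgrams.

    Each argument is a list of frames; each frame is a list of posteriors.
    """
    n, m = len(query), len(segment)
    inf = float("inf")
    # acc[i][j] = minimum accumulated cost aligning query[:i] with segment[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], segment[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)


def best_match(query, utterance, hop=1):
    """Slide a query-length window over the utterance.

    Returns (start_frame, cost) of the lowest-cost window,
    or (None, inf) if the utterance is shorter than the query.
    """
    win = len(query)
    best_start, best_cost = None, float("inf")
    for start in range(0, len(utterance) - win + 1, hop):
        cost = dtw_cost(query, utterance[start:start + win])
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_cost


if __name__ == "__main__":
    # Toy posteriorgrams over three Gaussian components (illustrative values only).
    query = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]]
    utterance = [
        [0.05, 0.05, 0.9],
        [0.05, 0.05, 0.9],
        [0.9, 0.05, 0.05],
        [0.05, 0.9, 0.05],
        [0.05, 0.05, 0.9],
    ]
    print(best_match(query, utterance))
```

In a keyword detector built this way, a window would be accepted as a hit when its cost falls below a threshold tuned on held-out data. A fixed window length equal to the query is the simplest choice; allowing the window length to vary would tolerate differences in speaking rate at higher computational cost.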