Paper Title

Atss-Net: Target Speaker Separation via Attention-based Neural Network

Authors

Tingle Li, Qingjian Lin, Yuanyuan Bao, Ming Li

Abstract

Recently, Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) based models have been introduced to deep learning-based target speaker separation. In this paper, we propose an attention-based neural network (Atss-Net) in the spectrogram domain for the task. Compared with the CNN-LSTM architecture, it allows the network to compute the correlations between features in parallel and to extract more features with shallower layers. Experimental results show that our Atss-Net yields better performance than VoiceFilter, although it contains only half as many parameters. Furthermore, our proposed model also demonstrates promising performance in speech enhancement.
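
To make the idea of computing feature correlations in parallel concrete, here is a minimal sketch of generic scaled dot-product self-attention over spectrogram frames. It is not the Atss-Net architecture: the identity query/key/value projections, the function names `softmax` and `self_attention`, and the toy input values are all assumptions made for illustration. A real model would add learned projections, multiple heads, and conditioning on the target speaker.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention with identity projections (an assumption).

    frames: list of T feature vectors (e.g. spectrogram frames), each of length D.
    Returns (outputs, weights); weights[i][j] is how strongly frame i attends to frame j.
    """
    d = len(frames[0])
    scale = math.sqrt(d)
    # Correlation of every frame with every other frame. Each row depends only
    # on the inputs, not on other rows, so all rows could be computed in parallel.
    scores = [[sum(a * b for a, b in zip(q, k)) / scale for k in frames]
              for q in frames]
    weights = [softmax(row) for row in scores]
    # Each output frame is a weighted average of all input frames.
    outputs = [[sum(w * v[c] for w, v in zip(row, frames)) for c in range(d)]
               for row in weights]
    return outputs, weights


# Toy "spectrogram": 3 time frames, 2 frequency bins (made-up values).
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = self_attention(toy)
```

Unlike a recurrent layer, which processes frames one after another, every row of `attn` here can be evaluated independently, which is the property the abstract contrasts with the CNN-LSTM design.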
