Application of Electronic Technique
High precision DOA estimation of microphone array for shipboard teleconferencing
Application of Electronic Technique, 2022, Issue 3
Liu Yuji1,2,3, Tong Feng1,2,3, Chen Dongsheng1,2,3, Lu Rongfu4, Feng Wanjian4
1. Key Laboratory of Underwater Acoustic Communication and Marine Information Technique of the Ministry of Education, Xiamen University, Xiamen 361002, China; 2. College of Earth and Ocean Sciences, Xiamen University, Xiamen 361002, China; 3. Shenzhen Research Institute of Xiamen University, Shenzhen 518000, China; 4. Xiamen Yilian Network Technology Co., Ltd., Xiamen 361000, China
Abstract: As ships become more intelligent, shipboard teleconferencing systems are of great significance for improving emergency handling capability and advancing the construction of ship-shore integrated networks. The microphone array is an important speech front end that ensures the voice quality of a teleconferencing system and supports multimodal interaction. However, the cramped dimensions of ship cabins mean that only small-size arrays can be used, while the strong reverberation caused by small cabins and the loud cabin noise seriously degrade the performance of traditional microphone array algorithms. Considering the direction of arrival (DOA) estimation scenario of a small-size array in the complex environment of a ship cabin, this paper proposes a lightweight Mask-DOA estimation neural network model. The method introduces a Mask algorithm into the DOA estimation neural network to reduce interference from noise and reverberation, then extracts the enhanced GCC-PHAT as the network feature, thereby achieving high-precision DOA estimation on a small-size microphone array. Simulation and experimental results show that the proposed Mask-DOA model is more robust and generalizes better in the complex ship cabin environment.
Key words: direction of arrival estimation; ship cabin noise and reverberation environment; neural network; time-frequency masking
CLC number: TN912.3
Document code: A
DOI: 10.16157/j.issn.0258-7998.212108
Citation format: Liu Yuji, Tong Feng, Chen Dongsheng, et al. High precision DOA estimation of microphone array for shipboard teleconferencing[J]. Application of Electronic Technique, 2022, 48(3): 32-36, 77.

0 Introduction

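    Shipboard teleconferencing systems play a significant role in ship intelligence, in particular by improving emergency handling capability and advancing the construction of ship-shore integrated networks. In recent years, shipboard teleconferencing and monitoring systems have developed rapidly[1-3]. By providing accurate direction of arrival (DOA) estimates, a microphone array enables speech enhancement and can also supply speaker position information to the teleconferencing system's camera for multimodal interaction; it has become an important speech front end for teleconferencing systems[4-5].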

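    Ordinary teleconferencing venues are acoustically fairly benign, so a relatively large microphone array is usually used to secure DOA estimation and to improve speech enhancement performance and multimodal interaction. The ship cabin that hosts shipboard teleconferencing, however, is a typical complex acoustic scene. On the one hand, the cabin is cramped, which both causes severe reverberation and makes it hard to accommodate a large teleconferencing microphone array. On the other hand, it suffers from severe ship cabin noise[6], including internal noise from the many electrical devices, engines and other equipment concentrated in the limited space of each cabin, and external noise from other vessels, waves and similar sources. All of this complicates the acoustic characteristics of the ship cabin and makes microphone array DOA estimation more challenging.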

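    In recent years, with the development of artificial intelligence, Xiao et al. proposed using a multilayer perceptron (MLP) for DOA estimation[7], relying on deep networks and large amounts of data to raise DOA estimation accuracy far beyond that of traditional DOA estimation algorithms. Diaz-Guerra et al. used steered response power with phase transform (SRP-PHAT) as the input feature and built a neural network model that casts DOA estimation as a regression problem[8]. Nguyen et al. used a 2D convolutional neural network with multi-task learning to robustly estimate the number of sound sources and their directions of arrival from short-time spatial pseudo-spectra[9]; this approach reduces the unnecessary associations the network learns between sound class and direction information and speeds up model convergence.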
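For readers unfamiliar with the GCC-PHAT feature named in the abstract, the following is a minimal sketch of plain GCC-PHAT for a single microphone pair, written with NumPy. It is not the paper's Mask-DOA model: the mask and the network are not specified in this excerpt and are therefore not shown. The function names `gcc_phat` and `tdoa_to_angle`, the 16 kHz sampling rate, the 4 cm microphone spacing, the 343 m/s speed of sound, and the far-field two-microphone geometry are all illustrative assumptions.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` in seconds using GCC-PHAT.

    Returns (tau, cc), where cc holds the correlation values for lags
    -max_shift .. +max_shift (in samples).
    """
    n = len(sig) + len(ref)               # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)            # cross-power spectrum
    cross /= np.abs(cross) + 1e-12        # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)         # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags (at least one sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs, cc


def tdoa_to_angle(tau, mic_distance, c=343.0):
    """Far-field broadside angle in degrees for a two-microphone pair (assumed geometry)."""
    ratio = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


if __name__ == "__main__":
    fs = 16000                            # assumed sampling rate (Hz)
    mic_distance = 0.04                   # assumed spacing of a small array (m)
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(4096)       # synthetic broadband source
    sig = np.concatenate((np.zeros(1), ref[:-1]))   # same signal, one sample later
    tau, _ = gcc_phat(sig, ref, fs, max_tau=mic_distance / 343.0)
    print("estimated delay (s):", tau)
    print("estimated angle (deg):", tdoa_to_angle(tau, mic_distance))
```

With a 4 cm spacing at 16 kHz the largest physically possible delay is under two samples, so the correlation peak carries very coarse angular information. This illustrates why small arrays in noisy, reverberant cabins are hard for classical estimators and why the abstract's approach enhances the GCC-PHAT feature before passing it to a network.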




The full text of this article can be downloaded at: http://ihrv.cn/resource/share/2000003998








