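Crowd counting algorithm based on multi-attention guidance
Cyber Security and Data Governance, 2022, Issue 1
Yang Qianqian, He Qing, Peng Sifan, Yin Baoqun
(School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, Anhui, China)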
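Abstract: To address the non-uniform crowd distribution found in real-world scenes, a crowd counting algorithm based on multi-attention guidance is proposed. First, a top-down feature fusion path is built on a lightweight pyramid split attention mechanism to promote the fusion of high-level semantic information and low-level spatial details, producing high-quality feature maps that carry both high-level semantics and spatial detail. Next, multi-scale context information is extracted and fused to generate attention weight maps that focus on different density distribution patterns. Finally, the attention weight maps guide the density regression network to recognize pedestrians under different distribution states, which improves the model's adaptability to density variation and yields high-quality crowd density maps. Extensive experiments on three datasets, ShanghaiTech, UCF_QNRF, and JHU-CROWD++, demonstrate the advanced performance of the proposed algorithm.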
CLC number: TP309
Document code: A
DOI: 10.20044/j.csdg.2097-1788.2022.01.017
Citation: Yang Qianqian, He Qing, Peng Sifan, et al. Crowd counting algorithm based on multi-attention guidance[J]. Cyber Security and Data Governance, 2022, 41(1): 108-116.
Multi-attention convolutional network for crowd counting
Yang Qianqian, He Qing, Peng Sifan, Yin Baoqun
(School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China)
Abstract: To address the problem of non-uniform crowd distribution in practical scenes, this paper proposes a crowd counting algorithm based on a multi-attention mechanism. First, a top-down feature fusion path is constructed from a lightweight pyramid split attention mechanism to promote the fusion of high-level semantic features and low-level spatial details, yielding high-quality feature maps that retain both semantics and spatial details. Then, multi-scale context information is extracted and fused to generate attention weight maps that focus on different density distribution patterns. Finally, the attention weight maps guide the density regression network to identify pedestrian targets under different distributions, which enhances the model's adaptability to density variation and produces high-quality crowd density maps. Extensive experiments on three datasets, ShanghaiTech, UCF_QNRF, and JHU-CROWD++, were conducted to demonstrate the effectiveness of the proposed network.
Key words: crowd counting; density map estimation; attention mechanism; feature pyramid network
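0 Introduction
Because the positions and movement trajectories of people are highly subjective and largely unconstrained, images captured by surveillance video contain large numbers of irregularly distributed people, and crowd density differs greatly from one local region to another, which makes estimation harder for crowd counting algorithms. As shown in Figure 1, within a single crowd scene several local regions are extremely densely populated, while other regions are sparse or contain only isolated individuals. Unpredictable pedestrian positions lead to large differences among the density values at different locations in the density map, which places higher demands on an algorithm's ability to perceive different density distribution patterns. To address this problem, this paper proposes a crowd counting algorithm based on multi-attention guidance, which combines the feature pyramid mechanism with the attention mechanism to promote the fusion of semantic information and spatial details, and uses attention maps to guide the model in generating density maps that correspond to different distribution states.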
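The idea of letting attention maps steer the density estimate can be illustrated with a small, dependency-free sketch. It is not the authors' network: it assumes, purely for illustration, that several branches each output a candidate density map for one distribution pattern, and that per-pixel attention scores are normalized with a softmax before the maps are blended. The function names `softmax_weights`, `fuse_density_maps`, and `crowd_count` are hypothetical. The one standard fact relied on is that, in density-map-based counting, the estimated count is the sum of the density map.

```python
import math


def softmax_weights(logit_maps):
    """Per-pixel softmax across K attention score maps (each H x W).

    Assumption: attention scores are normalized with a softmax; the
    source does not state which normalization is used.
    """
    k = len(logit_maps)
    h = len(logit_maps[0])
    w = len(logit_maps[0][0])
    weights = [[[0.0] * w for _ in range(h)] for _ in range(k)]
    for i in range(h):
        for j in range(w):
            logits = [logit_maps[b][i][j] for b in range(k)]
            peak = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            for b in range(k):
                weights[b][i][j] = exps[b] / total
    return weights


def fuse_density_maps(branch_maps, logit_maps):
    """Blend K branch density maps using per-pixel attention weights."""
    weights = softmax_weights(logit_maps)
    k = len(branch_maps)
    h = len(branch_maps[0])
    w = len(branch_maps[0][0])
    return [
        [
            sum(weights[b][i][j] * branch_maps[b][i][j] for b in range(k))
            for j in range(w)
        ]
        for i in range(h)
    ]


def crowd_count(density_map):
    """The estimated count is the sum of all density values."""
    return sum(sum(row) for row in density_map)


if __name__ == "__main__":
    # Hypothetical 2 x 2 outputs of a "dense" branch and a "sparse" branch.
    dense_branch = [[4.0, 0.0], [0.0, 0.0]]
    sparse_branch = [[0.0, 0.0], [0.0, 2.0]]
    # Attention scores that favor the dense branch at the top-left pixel
    # and the sparse branch at the bottom-right pixel.
    scores = [
        [[10.0, 0.0], [0.0, -10.0]],
        [[-10.0, 0.0], [0.0, 10.0]],
    ]
    fused = fuse_density_maps([dense_branch, sparse_branch], scores)
    print(crowd_count(fused))
```

With uniform scores every branch contributes equally at each pixel, so the fused map is the plain average of the branch maps. Sharper scores let each region be dominated by the branch suited to its local density, which is the behavior the attention guidance is meant to encourage.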
The full text of this paper can be downloaded from: http://ihrv.cn/resource/share/2000004620