Research on an intelligent recommendation algorithm integrating user interest modeling

Information Technology and Network Security, Issue 11

Hong Zhili, Lai Jun, Cao Lei, Chen Xiliang

(Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, Jiangsu, China)

Abstract: Reinforcement learning is increasingly applied to recommender systems. This paper proposes DDPG-LA, a recommendation method based on DDPG that incorporates dynamic user interest modeling. An LSTM network extracts the user's long-term interest, an attention mechanism extracts the user's short-term interest, and the two are combined to form the agent's state. In addition, a state enhancement unit is added to the LSTM network to speed up the modeling of long-term interest, and a module that alleviates recommendation delay is added to the attention mechanism to address the shortcomings that arise when attention is applied to recommender systems. The model is evaluated on two MovieLens datasets and compared with traditional methods on a range of evaluation metrics; the results show that the proposed algorithm performs better.

Key words: reinforcement learning; recommendation system; DDPG; DDPG-LA; LSTM; attention mechanism; long-term interest; short-term interest

CLC number: TP18
Document code: A
DOI: 10.19358/j.issn.2096-5133.2021.11.006
Citation format: Hong Zhili, Lai Jun, Cao Lei, et al. Research on intelligent recommendation algorithm integrating user interest modeling[J]. Information Technology and Network Security, 2021, 40(11): 37-48.
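The state construction described in the abstract (a long-term interest vector from an LSTM, a short-term interest vector from attention, and the two combined as the agent's state) can be illustrated with a minimal sketch. It is not the authors' implementation. All names here (LSTMCell, attention_pool, build_state, recent_k) are hypothetical, the weights are random and untrained, biases are omitted, and the choice of the most recent item as the attention query is an assumption. The state enhancement unit and the delay-alleviation module are not shown, because this excerpt does not specify them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def attention_pool(items, query):
    """Short-term interest: scaled dot-product attention over item embeddings."""
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, item)) / scale for item in items]
    weights = softmax(scores)
    return [sum(w * item[k] for w, item in zip(weights, items))
            for k in range(len(query))]


class LSTMCell:
    """A plain LSTM cell without biases (an assumption made for brevity)."""

    def __init__(self, input_dim, hidden_dim, rng):
        def mat():
            return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                    for _ in range(hidden_dim)]
        self.hidden_dim = hidden_dim
        self.W_i, self.W_f, self.W_o, self.W_g = mat(), mat(), mat(), mat()

    def step(self, x, h, c):
        z = x + h  # concatenate the input and the previous hidden state
        i = [sigmoid(a) for a in matvec(self.W_i, z)]    # input gate
        f = [sigmoid(a) for a in matvec(self.W_f, z)]    # forget gate
        o = [sigmoid(a) for a in matvec(self.W_o, z)]    # output gate
        g = [math.tanh(a) for a in matvec(self.W_g, z)]  # candidate cell state
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h, c


def build_state(history, cell, recent_k=3):
    """Concatenate long-term (LSTM) and short-term (attention) interest vectors."""
    h = [0.0] * cell.hidden_dim
    c = [0.0] * cell.hidden_dim
    for item in history:               # the whole history feeds the LSTM
        h, c = cell.step(item, h, c)
    recent = history[-recent_k:]       # only recent items feed the attention
    short_term = attention_pool(recent, query=recent[-1])
    return h + short_term


rng = random.Random(0)
history = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(10)]  # 10 item embeddings
cell = LSTMCell(input_dim=4, hidden_dim=8, rng=rng)
state = build_state(history, cell)
print(len(state))  # 8 long-term dimensions + 4 short-term dimensions = 12
```

In a DDPG setup, a vector like `state` would be the input to the actor network, which outputs a continuous action that is then matched against candidate items. That step is left out here.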

0 Introduction

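Recommender systems [1] are tools that, in the era of big data, help people quickly and accurately locate items of interest among a vast number of options. The basic idea is to build a model that extracts user and item features from users' historical data, and then to use the trained model to recommend items tailored to each user.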

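With the rapid development of reinforcement learning in recent years, research on applying it to recommender systems has attracted growing attention. DRN [2] was the first exploratory model to apply deep reinforcement learning to recommender systems, and it established a basic framework for that application. Figure 1 shows a block diagram of a recommender system based on deep reinforcement learning.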

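Research on recommender systems based on deep reinforcement learning has already produced many results. Tong Xiangrong et al. [3] applied DQN to a trust-based recommender system built on social networks, where the agent learns a dynamic representation of the trust between users and makes recommendations based on these trust values. Liu Shuaishuai [4] applied DDQN to movie recommendation to address low recommendation accuracy, slow speed, and cold start. Munemasa et al. [5] applied the DDPG algorithm to store recommendation to address the sparsity of user data. Zhao et al. [6] applied the Actor-Critic algorithm to list-wise recommendation to address the limitation that traditional recommendation models can only treat recommendation as a static process. These studies, along with many others not listed here, solve recommendation problems by exploiting the properties of reinforcement learning itself, and rarely approach the problem from the perspective of recommendation.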

For the full text of this article, please download: http://ihrv.cn/resource/share/2000003846

