Knowledge base error detection with relation sensitive embedding
Information Technology and Network Security, 2020, No. 10
Miao Qi, Yang Xinyue
School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China
Abstract: Accuracy and quality are especially important for a knowledge base. Although the incompleteness of knowledge bases has been studied extensively, little work has addressed detecting the errors they contain, and traditional methods usually fail to capture the intrinsic correlations among erroneous facts. This paper proposes NSIL, a relation-sensitive embedding method for knowledge bases that captures the correlations among relations in order to detect errors and thereby improve the accuracy and quality of the knowledge base. The method has two stages: correlation processing and error detection. In the correlation processing stage, NSIL's correlation function produces a score for the degree of correlation between relations. In the error detection stage, errors are detected from these correlation scores, and the missing component is predicted for triples that lack a subject or an object. The method is evaluated extensively on "FB15K", a benchmark dataset generated from Freebase, one of the largest knowledge bases, and the results show that it achieves high performance in detecting erroneous knowledge in knowledge bases.
Key words: knowledge base; embedding model; error detection
CLC number: TP183
Document code: A
DOI: 10.19358/j.issn.2096-5133.2020.10.005
Citation format: Miao Qi, Yang Xinyue. Knowledge base error detection with relation sensitive embedding[J]. Information Technology and Network Security, 2020, 39(10): 23-27, 37.

0 Introduction

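    Knowledge bases have become an increasingly important and widely used data source for research and applications such as semantic search, entity linking, question answering, and natural language processing. To make large knowledge bases easier to manipulate, researchers have proposed a new research direction: knowledge base embedding. The key idea is to embed the components of a KB (Knowledge Base), that is, to map its entities and relations into a continuous vector space, which simplifies manipulation while preserving the original structure of the KB. The resulting entity and relation embeddings can then be applied to tasks such as KB completion, relation extraction, entity classification, and entity resolution. Although large knowledge bases contain hundreds of millions of facts, this is still far from sufficient in an age of information explosion. Most research focuses on extending knowledge bases with missing edges, and few studies consider the outdated or incorrect information they contain [1-3]. Many KB-extension studies project facts into a k-dimensional vector space and use clustering to find correlations between relations, which makes efficient and effective processing difficult to achieve.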
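    To make the embedding idea concrete, the following is a minimal sketch of how embedded triples can be scored, flagged as suspect, and completed when a component is missing. It is not the NSIL method: this excerpt does not give NSIL's correlation function, so the sketch assumes a generic translation-style distance ||h + r - t|| as a stand-in. The entity names, the 2-D vectors, the threshold, and the function names score, detect_errors, and predict_tail are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 2-D embeddings, hand-picked for illustration.
# They are NOT learned and NOT taken from the paper.
ENTITY = {
    "Paris": (1.0, 0.0),
    "France": (1.0, 1.0),
    "Berlin": (3.0, 0.0),
    "Germany": (3.0, 1.0),
}
RELATION = {"capital_of": (0.0, 1.0)}


def score(head, rel, tail):
    """Translation-style distance ||h + r - t||; smaller means more plausible.

    Assumed stand-in scoring function, not NSIL's correlation function.
    """
    h, r, t = ENTITY[head], RELATION[rel], ENTITY[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def detect_errors(triples, threshold=0.5):
    """Return the triples whose distance exceeds the (assumed) threshold."""
    return [triple for triple in triples if score(*triple) > threshold]


def predict_tail(head, rel):
    """Predict a missing object by picking the entity with the smallest distance."""
    candidates = (e for e in ENTITY if e != head)
    return min(candidates, key=lambda e: score(head, rel, e))


if __name__ == "__main__":
    triples = [
        ("Paris", "capital_of", "France"),
        ("Paris", "capital_of", "Germany"),  # deliberately wrong fact
        ("Berlin", "capital_of", "Germany"),
    ]
    print(detect_errors(triples))
    print(predict_tail("Berlin", "capital_of"))
```

    In this toy setup the deliberately wrong triple lies far from its expected position in the vector space and is the only one flagged, while the missing object of (Berlin, capital_of, ?) is recovered by ranking candidate entities. A real system would learn the embeddings from data and tune the threshold on a validation set.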




The full text of this article can be downloaded from: http://ihrv.cn/resource/share/2000003133




