Application of Electronic Technique (AET)
Fault feature extraction method based on weighted discriminative stochastic neighbor embedding
Information Technology and Network Security, Issue 12
Xia Lisha1, Liu Bing2
(1. School of Business, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China)
Abstract: Big data in fault diagnosis is high-dimensional, strongly nonlinear, and noise-sensitive; its fault feature information is redundant, and class labels are available for part of the historical data. To address these characteristics, this paper improves t-SNE, an unsupervised manifold learning method suited to nonlinear data, and proposes a fault feature extraction algorithm named Weighted Discriminative Stochastic Neighbor Embedding (WDSNE). First, data similarities that incorporate class information are defined in both the original high-dimensional space and the corresponding low-dimensional subspace. Second, the Manhattan distance is used as the distance measure in order to enlarge the relative distance differences between samples. Third, the similarities are weighted according to how near or far the samples are from one another. Class-label constraints are thereby used to guide the dimensionality reduction, so that different classes become more dispersed and each class becomes more compact. Classification experiments with the KNN method on UCI datasets and a network fault diagnosis experiment on KDD99 indicate that the improved algorithm achieves more effective fault diagnosis.
Key words: category information; stochastic neighbor embedding; weighted distance; fault feature extraction
CLC number: TP277
Document code: A
DOI: 10.19358/j.issn.2096-5133.2021.12.005
Citation: Xia Lisha, Liu Bing. Fault feature extraction method based on weighted discriminative stochastic neighbor embedding[J]. Information Technology and Network Security, 2021, 40(12): 26-31, 39.
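
The role of the distance measure in a stochastic-neighbor similarity can be illustrated with a short sketch. This is not the WDSNE formulation, which this excerpt does not give: it uses only the standard SNE conditional similarity p(j|i), a Gaussian kernel normalised over the neighbours of point i, with a fixed bandwidth `sigma` (an assumption; t-SNE normally tunes it per point from a perplexity target). The toy points and the function names are hypothetical.

```python
import math


def manhattan(a, b):
    """L1 (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """L2 distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def conditional_similarities(points, i, dist, sigma=1.0):
    """Standard SNE p(j|i) for every j != i under the given distance.

    sigma is fixed here for simplicity (an assumption of this sketch).
    """
    weights = {
        j: math.exp(-dist(points[i], p) ** 2 / (2 * sigma ** 2))
        for j, p in enumerate(points)
        if j != i
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


# Hypothetical toy data: point 1 is near point 0, point 2 is farther away.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 1.0)]

p_l1 = conditional_similarities(points, 0, manhattan)
p_l2 = conditional_similarities(points, 0, euclidean)

print("Manhattan p(j|0):", p_l1)
print("Euclidean p(j|0):", p_l2)
```

On this toy data the far point is 3 units away under the Manhattan distance but about 1.73 under the Euclidean distance, so the near neighbour receives a larger share of the similarity mass in the Manhattan case. Whether the same holds on real fault data depends on the data and on the bandwidth.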

0 Introduction

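As the Internet and other new-generation information technologies are integrated into every field, big data has become central to making industries intelligent and is an important driver of the associated technologies and applications. In fault diagnosis, data generated in real time provides strong evidence, but it often brings the curse of dimensionality, which leads to high computational complexity, large storage requirements, and degraded algorithm performance, all of which hinder diagnosis. Feature extraction methods are therefore needed as a dimensionality-reduction preprocessing step: they project data from the high-dimensional space onto a low-dimensional subspace, which reduces data redundancy and improves the efficiency of fault diagnosis.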

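Early feature extraction methods rest on a linearity assumption, namely that the data come from a globally linear space and that the variables are mutually independent. Typical representatives are principal component analysis (PCA), independent component analysis (ICA), multidimensional scaling (MDS), and linear discriminant analysis (LDA). PCA aims to minimize the loss of feature information and suits raw data that follow a Gaussian distribution. ICA aims to maximize the independence of the attributes and can handle raw data that are not Gaussian. MDS produces a low-dimensional visualization based on sample similarity and, like PCA and ICA, is an unsupervised feature extraction method. LDA aims to improve classification accuracy, suits Gaussian-distributed data, and is a supervised feature extraction method.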
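
The difference between unsupervised and supervised linear feature extraction described above can be made concrete with a small sketch. It uses plain Python on hypothetical 2-D toy data (not taken from the paper): two classes that vary widely along x but are separated along y. The names `pca_direction` and `lda_direction` are illustrative, and the closed-form 2x2 algebra is an assumption that only holds in two dimensions with two classes.

```python
import math


def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, center):
    """Entries (a, b, d) of the symmetric 2x2 scatter matrix [[a, b], [b, d]]."""
    a = sum((p[0] - center[0]) ** 2 for p in points)
    b = sum((p[0] - center[0]) * (p[1] - center[1]) for p in points)
    d = sum((p[1] - center[1]) ** 2 for p in points)
    return a, b, d


def unit(v):
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)


def pca_direction(points):
    """First principal axis: leading eigenvector of the total scatter (labels ignored)."""
    a, b, d = scatter(points, mean(points))
    if b == 0:
        return (1.0, 0.0) if a >= d else (0.0, 1.0)
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return unit((lam - d, b))


def lda_direction(class0, class1):
    """Two-class Fisher direction: Sw^{-1} (m1 - m0), using the class labels."""
    m0, m1 = mean(class0), mean(class1)
    a0, b0, d0 = scatter(class0, m0)
    a1, b1, d1 = scatter(class1, m1)
    a, b, d = a0 + a1, b0 + b1, d0 + d1  # within-class scatter Sw
    det = a * d - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    return unit(((d * dx - b * dy) / det, (a * dy - b * dx) / det))


def project(points, w):
    """1-D coordinates of the points along direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


def ranges_overlap(u, v):
    return max(min(u), min(v)) <= min(max(u), max(v))


# Hypothetical toy data: wide spread along x, class gap along y.
class0 = [(-4.0, 0.0), (-2.0, 0.2), (0.0, -0.1), (2.0, 0.1), (4.0, -0.2)]
class1 = [(-4.0, 1.0), (-2.0, 1.2), (0.0, 0.9), (2.0, 1.1), (4.0, 0.8)]

pca_w = pca_direction(class0 + class1)
lda_w = lda_direction(class0, class1)

print("PCA direction:", pca_w)
print("LDA direction:", lda_w)
print("classes overlap after PCA projection:",
      ranges_overlap(project(class0, pca_w), project(class1, pca_w)))
print("classes overlap after LDA projection:",
      ranges_overlap(project(class0, lda_w), project(class1, lda_w)))
```

On this toy data the PCA axis follows the high-variance x direction and the projected classes overlap, whereas the Fisher direction follows y and keeps them apart. Real data will rarely be this clear-cut.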



For the full text of this paper, please download: http://ihrv.cn/resource/share/2000003892






This content is original to the AET website; reproduction without authorization is prohibited.