
Design and Implementation of a Subspace Outlier Data Mining System (.doc)

   
About 33 pages, DOC format



Uploaded by member 淘寶大夢


Length: about 14,700 Chinese characters

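Abstract: Outlier mining is one of the main research topics in data mining. It can uncover knowledge that is real yet unexpected, reveal rare events and phenomena, and discover interesting patterns. In recent years, outlier mining has become an active branch of information science and has attracted wide attention in databases, data mining, machine learning, and statistics.
As data acquisition techniques have developed, the data that describe the real world have become increasingly complex, and the problem of "rich data but poor knowledge" has become increasingly prominent. Much useful information and knowledge lies hidden in these data, and the question of how to extract it has driven extensive research on data mining techniques. However, such data are generally of very high dimensionality. High dimensionality is the hardest aspect to handle and poses a challenge to existing outlier mining algorithms. To address this problem, this work adopts a subspace-based outlier mining approach: the high-dimensional data are first projected onto low-dimensional subspaces and examined there, and a particle swarm optimization (PSO) algorithm is used to search for sparse subspaces and the optimal partition, from which the outliers are determined. The research focuses on outlier mining in high-dimensional data sets and mainly covers the following:
1. Distance-based relevant-subspace outlier mining algorithms are presented. These fall into two classes. The first class searches for all relevant subspaces first and then mines outliers within them, as in HiCS. The second class first determines the set of relevant subspaces for a given data point and then computes the corresponding outlier degree, as in OUTRES. The second approach is usually more meaningful, because it better explains why a data point is an outlier.
2. An outlier mining algorithm based on particle swarm optimization and subspaces is presented. Its core idea is that, in practical applications, the anomalous behavior of high-dimensional data usually occurs only on a subset of the attributes and has almost nothing to do with the remaining dimensions. The algorithm first projects the high-dimensional data onto low-dimensional subspaces and computes a sparsity coefficient for each subspace, which serves as the measure of how anomalous the subspace is. A PSO algorithm with a mutation operator is used to search the subspaces.
On the basis of this research, an outlier data mining system is designed and implemented using Eclipse as the development tool, and its software modules and key technologies are described in detail.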
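
The abstract does not state the formula for the sparsity coefficient. A common definition, from Aggarwal and Yu's work on outlier detection in high-dimensional data and assumed here, divides each dimension into φ equi-depth ranges and scores a k-dimensional cell D that holds n(D) of the N points as S(D) = (n(D) - N·f^k) / sqrt(N·f^k·(1 - f^k)), with f = 1/φ. Strongly negative values mark cells that are much sparser than expected. The Python sketch below illustrates that definition only; the function names `equi_depth_bins` and `sparsity_coefficients` are hypothetical and are not taken from the thesis's system.

```python
import math
from collections import Counter


def equi_depth_bins(values, phi):
    """Assign each value to one of phi equi-depth ranges, by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        # Ranks 0..n-1 are spread evenly over range indices 0..phi-1.
        bins[idx] = rank * phi // len(values)
    return bins


def sparsity_coefficients(data, dims, phi):
    """Return {cell: S(D)} for every occupied cell of the subspace `dims`.

    Assumed definition (Aggarwal and Yu):
        S(D) = (n(D) - N * f**k) / sqrt(N * f**k * (1 - f**k)),  f = 1 / phi
    Empty cells are not listed; their coefficient would be the most negative.
    """
    n = len(data)
    k = len(dims)
    f_k = (1.0 / phi) ** k
    expected = n * f_k                      # expected count under independence
    std = math.sqrt(n * f_k * (1.0 - f_k))  # standard deviation of that count
    # Discretise each selected dimension, then count points per cell.
    per_dim = [equi_depth_bins([row[d] for row in data], phi) for d in dims]
    counts = Counter(zip(*per_dim))
    return {cell: (count - expected) / std for cell, count in counts.items()}


if __name__ == "__main__":
    # Toy data: eight points, two attributes, two ranges per attribute.
    data = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 0.5)]
    for cell, s in sorted(sparsity_coefficients(data, dims=(0, 1), phi=2).items()):
        print(cell, round(s, 3))
```

In a search procedure such as the mutation-equipped PSO mentioned above, a coefficient like this could serve as the fitness value to be minimised. How the thesis encodes subspaces as particles is not described in this excerpt.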

Keywords: outlier data; subspace; data mining

Design and Implementation of a Management System for Subspace-Based Outlier Mining Algorithms
Abstract: Outlier mining is one of the most important topics in data mining. It can help people discover true but unexpected information, and it has attracted the interest of many researchers. Most traditional outlier mining methods examine outliers from a global point of view, so it is difficult for them to find deviating data or outliers that appear only in subspaces. This thesis studies outlier mining in subspaces by partitioning the high-dimensional space into low-dimensional subspaces. The main research is as follows:
(1) HiCS searches for high-contrast subspaces as a preprocessing step for subspace outlier mining, then combines the outlier scores computed in the individual high-contrast subspaces into a final outlier ranking. HiCS searches for subspaces globally; it does not determine the relevant subspaces of each individual data point.
(2) An outlier mining algorithm based on PSO (particle swarm optimization) and subspaces is presented. The algorithm treats candidate outlier subspaces as a particle swarm and searches for outlier subspaces with a PSO algorithm that includes a mutation operator, guided by the sparsity coefficient of each subspace. Data in an outlier subspace are regarded as outliers. Finally, experiments on stellar spectra data from the LAMOST project validate the PSO algorithm.
(3) A local outlier mining algorithm based on subspace partitioning is presented. First, the data set is divided into disjoint subspaces; the quality of a partition is measured by its skew, and PSO is used to search for the best partition. Second, the degree to which a point is a local outlier is measured by its SPLOF value. Finally, experiments on spectral data show that the PSO-LOF algorithm does not depend on user-supplied parameters and is scalable and efficient.
(4) On the basis of the above, a subspace-based outlier mining system is designed and implemented using Eclipse as the development tool. Its function modules and key technologies are described in detail.
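
Item (3) measures local outliers with an SPLOF value, which this excerpt does not define. As background only, the sketch below computes the standard local outlier factor (LOF) that the table of contents lists in Section 3.3. It is not SPLOF or the thesis's PSO-LOF algorithm. The function name `lof_scores` is hypothetical, and the sketch assumes the data set contains no duplicate points and more than k points.

```python
import math


def lof_scores(points, k):
    """Standard LOF of every point; values well above 1 suggest outliers.

    Assumes no duplicate points (otherwise a reachability sum can be zero).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_dist = []      # distance from each point to its k-th nearest neighbour
    neighbors = []   # indices within the k-distance (ties are included)
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_dist.append(kd)
        neighbors.append([j for d, j in others if d <= kd])

    def local_reach_density(i):
        # reach-dist(i, o) = max(k-distance(o), d(i, o))
        reach = [max(k_dist[o], dist[i][o]) for o in neighbors[i]]
        return len(reach) / sum(reach)

    lrd = [local_reach_density(i) for i in range(n)]

    # LOF(i) = mean over neighbours o of lrd(o) / lrd(i)
    return [
        sum(lrd[o] for o in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    # Four corners of a unit square plus one distant point.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
    for p, s in zip(pts, lof_scores(pts, k=2)):
        print(p, round(s, 3))
```

The neighbourhood size k is exactly the kind of user-supplied parameter that the abstract says the PSO-LOF algorithm avoids. How it does so is not described here.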



Keywords: outlier; subspace; data mining

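Contents
Chapter 1 Introduction
1.1 Research Background
1.2 Current State of Research
1.3 Research Content
1.4 Thesis Structure
Chapter 2 Related Technologies
2.1 Data Mining Technology
2.2 Java Technology
2.3 The Eclipse Development Tool
Chapter 3 Distance-Based Relevant-Subspace Outlier Algorithms
3.1 The HiCS Algorithm
3.2 The OUTRES Algorithm
3.3 The LOF Algorithm
Chapter 4 An Outlier Data Mining Algorithm Based on Particle Swarm Optimization and Subspaces
4.1 Introduction
4.2 The PSO Algorithm
Chapter 5 Implementation of the Subspace-Based Outlier Data Mining System
5.1 System Function Modules
5.2 Main Interface
5.3 Analysis of Results
Chapter 6 Summary and Outlook
6.1 Conclusions
6.2 Outlook
Acknowledgments
References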