A weighted clustering algorithm based on the degree of attribute value variation
YANG Yang (杨扬), XU Houze (许厚泽), CHANG Jun (常军)
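Abstract:
The classical K-means clustering algorithm uses Euclidean distance as its similarity criterion when partitioning data, but it does not account for the fact that the attributes of the objects being clustered differ in how strongly they influence the partition. To address this problem, this paper proposes a clustering algorithm that weights each attribute according to the degree of variation in its values. In experiments on the Iris dataset, the algorithm achieved better clustering results than the other clustering algorithms it was compared with. The algorithm is suited to fields such as biological species classification and remote sensing image recognition, where it can improve clustering accuracy.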
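The abstract does not state the weighting formula, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes, purely for illustration, that each attribute's weight is its coefficient of variation (standard deviation divided by the absolute mean), normalized so the weights sum to 1, and that these weights scale the squared differences inside the Euclidean distance. The function names `attribute_weights`, `weighted_distance`, and `weighted_kmeans`, along with the toy data, are hypothetical.

```python
import math
import random


def attribute_weights(data):
    """Assumed weighting: coefficient of variation per attribute,
    normalized to sum to 1. The paper's actual formula may differ."""
    n, dims = len(data), len(data[0])
    raw = []
    for j in range(dims):
        column = [row[j] for row in data]
        mean = sum(column) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
        raw.append(std / abs(mean) if mean != 0 else 0.0)
    total = sum(raw)
    if total == 0:  # every attribute is constant: fall back to equal weights
        return [1.0 / dims] * dims
    return [r / total for r in raw]


def weighted_distance(a, b, weights):
    """Euclidean distance with a weight on each squared attribute difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def weighted_kmeans(data, k, weights, max_iter=100, seed=0):
    """Lloyd-style K-means that assigns points using weighted_distance."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: weighted_distance(p, centroids[c], weights))
            for p in data
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data (made up): the first attribute varies a lot, the second barely changes.
points = [[1.0, 10.0], [1.2, 10.1], [0.9, 9.9],
          [5.0, 10.0], [5.1, 10.2], [4.9, 9.8]]
weights = attribute_weights(points)
labels, centroids = weighted_kmeans(points, k=2, weights=weights)
```

Under this assumed scheme, an attribute whose values barely change relative to their mean contributes little to the distance, while a strongly varying attribute dominates the partition. The same functions could be applied to the four Iris measurements by passing each sample as a list of floats.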
Keywords: clustering algorithm; K-means; weighting; attribute value
Foundation: National Natural Science Foundation of China (41374021)
Author: YANG Yang, XU Houze, CHANG Jun
DOI: 10.16251/j.cnki.1009-2307.2018.05.001