国产探花免费观看_亚洲丰满少妇自慰呻吟_97日韩有码在线_资源在线日韩欧美_一区二区精品毛片,辰东完美世界有声小说,欢乐颂第一季,yy玄幻小说排行榜完本

首頁 > 編程 > Python > 正文

Python聚類算法之DBSACN實例分析

2019-11-25 17:01:37
字體:
來源:轉載
供稿:網友

本文實例講述了Python聚類算法之DBSACN。分享給大家供大家參考,具體如下:

DBSCAN:是一種簡單的,基于密度的聚類算法。本次實現中,DBSCAN使用了基于中心的方法。在基于中心的方法中,每個數據點的密度通過對以該點為中心以邊長為2*EPs的網格(鄰域)內的其他數據點的個數來度量。根據數據點的密度分為三類點:

核心點:該點在鄰域內的密度超過給定的閥值MinPs。
邊界點:該點不是核心點,但是其鄰域內包含至少一個核心點。
噪音點:不是核心點,也不是邊界點。

有了以上對數據點的劃分,聚合可以這樣進行:各個核心點與其鄰域內的所有核心點放在同一個簇中,把邊界點跟其鄰域內的某個核心點放在同一個簇中。

# scoding=utf-8import pylab as plfrom collections import defaultdict,Counterpoints = [[int(eachpoint.split("#")[0]), int(eachpoint.split("#")[1])] for eachpoint in open("points","r")]# 計算每個數據點相鄰的數據點,鄰域定義為以該點為中心以邊長為2*EPs的網格Eps = 10surroundPoints = defaultdict(list)for idx1,point1 in enumerate(points):  for idx2,point2 in enumerate(points):    if (idx1 < idx2):      if(abs(point1[0]-point2[0])<=Eps and abs(point1[1]-point2[1])<=Eps):        surroundPoints[idx1].append(idx2)        surroundPoints[idx2].append(idx1)# 定義鄰域內相鄰的數據點的個數大于4的為核心點MinPts = 5corePointIdx = [pointIdx for pointIdx,surPointIdxs in surroundPoints.iteritems() if len(surPointIdxs)>=MinPts]# 鄰域內包含某個核心點的非核心點,定義為邊界點borderPointIdx = []for pointIdx,surPointIdxs in surroundPoints.iteritems():  if (pointIdx not in corePointIdx):    for onesurPointIdx in surPointIdxs:      if onesurPointIdx in corePointIdx:        borderPointIdx.append(pointIdx)        break# 噪音點既不是邊界點也不是核心點noisePointIdx = [pointIdx for pointIdx in range(len(points)) if pointIdx not in corePointIdx and pointIdx not in borderPointIdx]corePoint = [points[pointIdx] for pointIdx in corePointIdx] borderPoint = [points[pointIdx] for pointIdx in borderPointIdx]noisePoint = [points[pointIdx] for pointIdx in noisePointIdx]# pl.plot([eachpoint[0] for eachpoint in corePoint], [eachpoint[1] for eachpoint in corePoint], 'or')# pl.plot([eachpoint[0] for eachpoint in borderPoint], [eachpoint[1] for eachpoint in borderPoint], 'oy')# pl.plot([eachpoint[0] for eachpoint in noisePoint], [eachpoint[1] for eachpoint in noisePoint], 'ok')groups = [idx for idx in range(len(points))]# 各個核心點與其鄰域內的所有核心點放在同一個簇中for pointidx,surroundIdxs in surroundPoints.iteritems():  for oneSurroundIdx in surroundIdxs:    if (pointidx in corePointIdx and oneSurroundIdx in corePointIdx and pointidx < oneSurroundIdx):      for idx in range(len(groups)):        if groups[idx] == groups[oneSurroundIdx]:          groups[idx] = groups[pointidx]# 邊界點跟其鄰域內的某個核心點放在同一個簇中for pointidx,surroundIdxs in surroundPoints.iteritems():  for oneSurroundIdx in surroundIdxs:    if (pointidx in borderPointIdx and oneSurroundIdx in corePointIdx):      groups[pointidx] = groups[oneSurroundIdx]      break# 取簇規模最大的5個簇wantGroupNum = 3finalGroup = Counter(groups).most_common(3)finalGroup = [onecount[0] for onecount in finalGroup]group1 = [points[idx] for idx in xrange(len(points)) if groups[idx]==finalGroup[0]]group2 = [points[idx] for idx in xrange(len(points)) if groups[idx]==finalGroup[1]]group3 = [points[idx] for idx in xrange(len(points)) if groups[idx]==finalGroup[2]]pl.plot([eachpoint[0] for eachpoint in group1], [eachpoint[1] for eachpoint in group1], 'or')pl.plot([eachpoint[0] for eachpoint in group2], [eachpoint[1] for eachpoint in group2], 'oy')pl.plot([eachpoint[0] for eachpoint in group3], [eachpoint[1] for eachpoint in group3], 'og')# 打印噪音點,黑色pl.plot([eachpoint[0] for eachpoint in noisePoint], [eachpoint[1] for eachpoint in noisePoint], 'ok')  pl.show()

運行效果截圖如下:

希望本文所述對大家Python程序設計有所幫助。

發表評論 共有條評論
用戶名: 密碼:
驗證碼: 匿名發表
主站蜘蛛池模板: 清河县| 平利县| 靖州| 甘泉县| 博野县| 太仓市| 阳新县| 兴业县| 深圳市| 同仁县| 石嘴山市| 新疆| 成都市| 吴堡县| 会昌县| 醴陵市| 霍州市| 绥江县| 大庆市| 荆门市| 璧山县| 丰顺县| 临朐县| 罗江县| 垫江县| 铅山县| 赤峰市| 永平县| 昭平县| 武宁县| 本溪市| 白水县| 安顺市| 赫章县| 汾阳市| 浏阳市| 新建县| 德江县| 三门峡市| 余姚市| 勃利县|