
Implementing the k-means algorithm in Python

2020-01-04 15:49:08
Contributed by: a site user

This article shares a concrete Python implementation of the k-means algorithm for reference. The details follow.

This is also Exercise 9.4 in Zhou Zhihua's book Machine Learning (《机器学习》).

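The dataset is watermelon dataset 4.0, listed below. The columns are the sample ID (编号), density (密度), and sugar content (含糖率); the header is kept in Chinese because the script selects columns by these names.

编号,密度,含糖率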
1,0.697,0.46
2,0.774,0.376
3,0.634,0.264
4,0.608,0.318
5,0.556,0.215
6,0.403,0.237
7,0.481,0.149
8,0.437,0.211
9,0.666,0.091
10,0.243,0.267
11,0.245,0.057
12,0.343,0.099
13,0.639,0.161
14,0.657,0.198
15,0.36,0.37
16,0.593,0.042
17,0.719,0.103
18,0.359,0.188
19,0.339,0.241
20,0.282,0.257
21,0.784,0.232
22,0.714,0.346
23,0.483,0.312
24,0.478,0.437
25,0.525,0.369
26,0.751,0.489
27,0.532,0.472
28,0.473,0.376
29,0.725,0.445
30,0.446,0.459
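
As a minimal sketch of getting this listing into (density, sugar content) pairs, the snippet below parses a few of the rows above from an in-memory string with the standard csv module. It assumes the header row reads 编号,密度,含糖率; the names SAMPLE_ROWS and load_points are hypothetical, and only the first three data rows are reproduced here.

```python
import csv
import io

# First three rows of the listing above; the real file holds all 30 samples.
# Assumption: the header row uses the column names 编号, 密度, 含糖率.
SAMPLE_ROWS = """编号,密度,含糖率
1,0.697,0.46
2,0.774,0.376
3,0.634,0.264
"""


def load_points(text):
    """Return a list of (density, sugar_content) float pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["密度"]), float(row["含糖率"])) for row in reader]


points = load_points(SAMPLE_ROWS)
print(points)
```

The same function could be pointed at the contents of a CSV file read from disk, provided the file is opened with UTF-8 encoding.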

The algorithm is simple, so it is not explained here. The code is not complicated either, so here it is:

# -*- coding: utf-8 -*-
"""Exercise 9.4"""
import random
import sys

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data = pd.read_csv('../dataset/watermelon4.0.csv', sep=',')[["密度", "含糖率"]].values

########################################## K-means #######################################
k = int(sys.argv[1])
# Randomly choose k samples from data as the initial mean vectors
mean_vectors = data[random.sample(range(len(data)), k)]


def dist(p1, p2):
    """Euclidean distance between two points."""
    return np.sqrt(np.sum((p1 - p2) ** 2))


while True:
    print(mean_vectors)
    # Assignment step: put every sample into the cluster of its nearest mean vector
    clusters = [[] for _ in range(k)]
    for sample in data:
        distances = [dist(sample, m) for m in mean_vectors]
        min_index = int(np.argmin(distances))
        clusters[min_index].append(sample)
    # Update step: recompute the mean vector of each cluster
    new_mean_vectors = []
    for c, v in zip(clusters, mean_vectors):
        if not c:
            # An empty cluster keeps its old mean vector
            new_mean_vectors.append(v)
            continue
        new_mean_vector = np.mean(c, axis=0)
        # If the relative change between the new and the old mean vector is
        # below 0.0001 in every coordinate, do not update the mean vector
        if np.all(np.abs((new_mean_vector - v) / v) < 0.0001):
            new_mean_vectors.append(v)
        else:
            new_mean_vectors.append(new_mean_vector)
    new_mean_vectors = np.array(new_mean_vectors)
    # Stop once no mean vector was updated in this pass
    if np.array_equal(mean_vectors, new_mean_vectors):
        break
    mean_vectors = new_mean_vectors

# Show the clustering result
total_colors = ['r', 'y', 'g', 'b', 'c', 'm', 'k']
colors = random.sample(total_colors, k)
for cluster, color in zip(clusters, colors):
    density = [arr[0] for arr in cluster]
    sugar_content = [arr[1] for arr in cluster]
    plt.scatter(density, sugar_content, c=color)
plt.show()
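
The loop in the script alternates two steps: assign every sample to its nearest mean vector, then recompute each mean vector from the samples assigned to it. As a hedged, dependency-free sketch of just those two steps (the function names assign and update are hypothetical, math.dist requires Python 3.8 or later, and letting an empty cluster keep its old mean is an assumption of this sketch):

```python
import math


def assign(points, means):
    """Group points by the index of their nearest mean (Euclidean distance)."""
    clusters = [[] for _ in means]
    for p in points:
        nearest = min(range(len(means)), key=lambda i: math.dist(p, means[i]))
        clusters[nearest].append(p)
    return clusters


def update(clusters, old_means):
    """Average each cluster coordinate-wise; an empty cluster keeps its old mean."""
    new_means = []
    for members, old in zip(clusters, old_means):
        if not members:
            new_means.append(old)
            continue
        dims = len(old)
        new_means.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dims))
        )
    return new_means


# Samples 1, 2, 11 and 12 from the dataset, with samples 1 and 11 as starting means.
points = [(0.697, 0.46), (0.774, 0.376), (0.245, 0.057), (0.343, 0.099)]
means = [(0.697, 0.46), (0.245, 0.057)]
clusters = assign(points, means)
means = update(clusters, means)
print(clusters)
print(means)
```

Repeating assign and update until the means stop changing gives the full algorithm; this sketch runs a single pass so that each step can be inspected on its own.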

To run it, enter python k_means.py 4 on the command line, where 4 is the value of k.
Below are the results for k = 3, 4, and 5. Because the initial mean vectors are chosen at random, the result can differ from run to run.
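
Since the starting mean vectors are drawn at random, one way to make a run repeatable is to seed the random number generator before sampling. The sketch below is an assumption about how that could look and is not part of the original script; pick_initial_indices and the seed value 42 are hypothetical.

```python
import random


def pick_initial_indices(n_samples, k, seed=None):
    """Pick k distinct sample indices; a fixed seed gives the same choice every run."""
    rng = random.Random(seed)
    return rng.sample(range(n_samples), k)


# Same seed, same starting samples, and therefore the same clustering.
first = pick_initial_indices(30, 4, seed=42)
second = pick_initial_indices(30, 4, seed=42)
print(first == second)  # True
```

The returned indices would then select the starting rows of the data array. Leaving seed as None keeps the run-to-run variation described above.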

[Figures: clustering results for k = 3, 4, and 5]

That is the whole article. I hope it is useful for your studies.

