Python KNN MNIST Handwritten Digit Recognition

    Technology  2022-07-15

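    This post does handwritten digit recognition with the KNN algorithm, using the MNIST dataset. KNN (k-nearest neighbors) assigns a sample the label that occurs most often among the K training samples closest to it. The core of the implementation: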

    import operator

    from numpy import tile


    def knn(k, train_images, train_labels, test_images, test_labels):
        errorCount = 0.0  # number of misclassified samples
        m = test_images.shape[0]  # number of test samples (m=100)
        for i in range(m):
            # Call the k-nearest-neighbor classifier to make the decision
            classifierResult = classify(test_images[i], train_images, train_labels, k)
            print("the classifier %d came back with:%d, the real answer is: %d" % (i + 1, classifierResult, test_labels[i]))
            if classifierResult != test_labels[i]:
                errorCount += 1.0
        print("\nthe total number of errors is: %d" % errorCount)
        print("\nthe total error rate is:%f" % (errorCount / float(m)))


    def classify(testOne, dataSet, labels, k):
        dataSetSize = dataSet.shape[0]
        # Repeat the test sample once per training row, then subtract
        diffMat = tile(testOne, (dataSetSize, 1)) - dataSet
        sqDiffMat = diffMat**2
        sqDistances = sqDiffMat.sum(axis=1)
        distances = sqDistances**0.5  # Euclidean distance
        # Sort the distances to the training samples in ascending order
        sortedDistIndicies = distances.argsort()
        classCount = {}
        # Majority vote among the k nearest points decides the result
        for i in range(k):
            voteIlabel = labels[sortedDistIndicies[i]]
            classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
        sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
        return sortedClassCount[0][0]
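The same distance-then-vote idea can also be written with NumPy broadcasting and `collections.Counter`. The following is only a minimal sketch, not the code from the download: the name `knn_predict` and the tiny two-pixel "images" are made up for illustration. It also assumes the pixels may be stored as `uint8` (how the original data is loaded is not shown), in which case casting to float before subtracting avoids wrap-around.

```python
from collections import Counter

import numpy as np


def knn_predict(test_one, train_x, train_y, k):
    """Return the majority label among the k nearest training points.

    Hypothetical helper for illustration; not part of the original code.
    """
    # Cast to float so uint8 pixel values cannot wrap around on subtraction.
    diff = train_x.astype(np.float64) - test_one.astype(np.float64)
    # Squared Euclidean distance; the square root is skipped because it
    # does not change the ordering of the neighbours.
    sq_dist = (diff ** 2).sum(axis=1)
    # Indices of the k training points with the smallest distance.
    nearest = np.argsort(sq_dist)[:k]
    # Majority vote over the labels of those k points.
    votes = Counter(train_y[nearest].tolist())
    return votes.most_common(1)[0][0]


# Toy stand-in for MNIST: four 2-pixel "images" with labels 0 and 1.
train_x = np.array([[0, 0], [10, 10], [250, 250], [240, 255]], dtype=np.uint8)
train_y = np.array([0, 0, 1, 1])

pred_low = knn_predict(np.array([5, 5], dtype=np.uint8), train_x, train_y, k=3)
pred_high = knn_predict(np.array([245, 250], dtype=np.uint8), train_x, train_y, k=3)
print(pred_low, pred_high)
```

With k=3, each query has two neighbours of its own class and one of the other, so the vote settles on the nearer class. On real MNIST arrays the only change would be passing flattened 784-pixel rows instead of the toy data.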

    The full implementation, including the dataset, has been packaged for download (it earns me a few points); help yourself. https://download.csdn.net/download/qq_35498696/12569928
