AI in Practice: Classifying Cargo Ships and Cruise Ships

Technology, 2022-07-13

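Environment: TensorFlow 2.0 + PyCharm + CUDA

Experiment: Model 1

At the start of the experiment we designed a simple image classification model. The project consists of the following files:

my_model_001.h5: the model produced by training on the training set
Testing.py: the script that measures the model's accuracy
Training.py: the script that trains the model

The training, validation, and test sets live in the following folders:

test_set: the test set
training_set: the training set
validation_set: the validation set

The data is split into two classes, images with a ship and images without one, so each of the three datasets has two subfolders:

non-ships: images without a ship
ships: images with a ship

(All images were downloaded from Google Images and Baidu Images with Fatkun, a Chrome browser extension.)

With the dataset collected, the next task was to design the neural network. Since this was our first hand-built design, we kept it small: three convolutional layers alternating with pooling layers, followed by a 512-unit fully connected layer and a single sigmoid-activated output. The closer this output is to 1, the more likely the image contains a ship; the closer it is to 0, the less likely.

With the dataset and the model design in place, the next step is the implementation, which is written entirely in Python. First, import the modules the design needs: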

    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    import os
    import matplotlib.pyplot as plt
    import glob

First, point the script at the dataset folders:

    trainingDir = r'D:\project_material\shipReco\training_set\\'      # training set path
    validationDir = r'D:\project_material\shipReco\validation_set\\'  # validation set path

    trainShipDir = os.path.join(trainingDir, 'ships')          # training images with a ship
    trainNonShipDir = os.path.join(trainingDir, 'non-ships')   # training images without a ship
    num_ship_tr = len(os.listdir(trainShipDir))                # number of training ship images
    num_non_ship_tr = len(os.listdir(trainNonShipDir))         # number of training non-ship images

    val_ShipDir = os.path.join(validationDir, 'ships')         # validation images with a ship
    val_NonShipDir = os.path.join(validationDir, 'non-ships')  # validation images without a ship
    num_ship_val = len(os.listdir(val_ShipDir))                # number of validation ship images
    num_non_ship_val = len(os.listdir(val_NonShipDir))         # number of validation non-ship images
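The counts above come from len(os.listdir(...)), which counts every entry in a folder. The test script later skips .txt and .gif files, so files like these may also be present in the training folders. A minimal sketch of a stricter count follows; the helper name count_images and the extension list are assumptions for illustration, not part of the original project.

```python
import os

# Assumed set of image extensions; adjust to what the dataset really contains.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')


def count_images(directory, extensions=IMAGE_EXTENSIONS):
    """Count regular files in `directory` whose name ends with an image extension."""
    count = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Skip subfolders and anything that does not look like an image file.
        if os.path.isfile(path) and name.lower().endswith(extensions):
            count += 1
    return count
```

Used in place of len(os.listdir(trainShipDir)), this would keep steps_per_epoch from being computed from files the generator never loads.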

With the paths set up, define the global hyperparameters:

    batch_size = 48   # number of images per batch
    epochs = 20       # number of passes over the training set
    IMG_HEIGHT = 170  # height of the model's input images
    IMG_WIDTH = 170   # width of the model's input images

Create the ImageDataGenerator objects for the training and validation images:

    # Training images: rescale to [0, 1] and apply random augmentation
    image_gen_train = ImageDataGenerator(rescale=1./255,
                                         rotation_range=15,
                                         width_shift_range=.10,
                                         height_shift_range=.10,
                                         fill_mode='nearest',
                                         horizontal_flip=True,
                                         zoom_range=0.5)

    # Validation images: rescale only
    image_gen_val = ImageDataGenerator(rescale=1./255, fill_mode='nearest')
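To make the augmentation settings concrete, the sketch below converts them into ranges for a 170-pixel image. It relies on the Keras conventions that a float zoom_range z samples zoom factors from [1 - z, 1 + z] and that a shift range below 1 is a fraction of the image size. The helper name augmentation_ranges is hypothetical.

```python
def augmentation_ranges(img_size, rotation_range, shift_range, zoom_range):
    """Translate ImageDataGenerator-style settings into concrete ranges."""
    max_shift_px = shift_range * img_size  # shift is a fraction of width/height
    return {
        'rotation_deg': (-rotation_range, rotation_range),
        'shift_px': (-max_shift_px, max_shift_px),
        'zoom': (1 - zoom_range, 1 + zoom_range),
    }


ranges = augmentation_ranges(img_size=170, rotation_range=15,
                             shift_range=0.10, zoom_range=0.5)
for name, (low, high) in ranges.items():
    print(f'{name}: {low} to {high}')
```

A zoom factor anywhere between 0.5 and 1.5 is a fairly wide range for images this small, so it may be worth narrowing if the augmented ships become hard to recognize.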

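Use flow_from_directory to stream batches from the folders; the training generator applies the augmentation configured above, which effectively enlarges the dataset: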

    # Training batches (arguments assumed to mirror the validation generator)
    train_data_gen = image_gen_train.flow_from_directory(batch_size=batch_size,
                                                         directory=trainingDir,
                                                         target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                         class_mode='binary')

    # Validation batches
    val_data_gen = image_gen_val.flow_from_directory(batch_size=batch_size,
                                                     directory=validationDir,
                                                     target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                     class_mode='binary')

Next, define the model designed earlier, using the ready-made layers in tensorflow.keras.layers:

    model = Sequential([
        Conv2D(16, 3, padding='same', activation='relu',
               input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
        MaxPooling2D(),
        Conv2D(32, 3, padding='same', activation='relu'),
        MaxPooling2D(),
        Conv2D(64, 3, padding='same', activation='relu'),
        MaxPooling2D(),
        Flatten(),
        Dense(512, activation='relu'),
        Dense(1, activation='sigmoid')
    ])

    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
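As a sanity check on this architecture, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used above: 3x3 kernels with 'same' padding (spatial size unchanged) and 2x2 max pooling (size halved, rounded down). The helper names are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus biases of a Dense layer."""
    return in_units * out_units + out_units


size, channels, total = 170, 3, 0
for filters in (16, 32, 64):
    total += conv_params(channels, filters)  # 'same' padding keeps the size
    channels = filters
    size //= 2                               # 2x2 max pooling, rounded down

flat = size * size * channels                # length of the Flatten output
total += dense_params(flat, 512) + dense_params(512, 1)

print(size, flat, total)  # 21 28224 14475297
```

Of the roughly 14.5 million parameters, 14,451,200 sit in the 512-unit Dense layer, so that layer dominates both the size of the saved model and the risk of overfitting.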

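With the model defined, train it with fit_generator() (this is the TensorFlow 2.0 API; from TensorFlow 2.1 on, model.fit() accepts generators directly and fit_generator() is deprecated):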

    total_train = num_ship_tr + num_non_ship_tr   # total number of training images
    total_val = num_ship_val + num_non_ship_val   # total number of validation images

    history = model.fit_generator(
        train_data_gen,
        steps_per_epoch=total_train // batch_size,   # number of batches per training epoch
        epochs=epochs,
        validation_data=val_data_gen,
        validation_steps=total_val // batch_size     # number of batches per validation pass
    )

    model.save('my_model_001.h5')  # save the trained model
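Note that steps_per_epoch and validation_steps count batches, not images, and that floor division leaves out a partial final batch. The sketch below contrasts floor and ceiling division; the image count of 1000 is a made-up figure for illustration, since the article does not state the dataset sizes.

```python
import math


def steps_floor(total_images, batch_size):
    """Number of full batches only; a partial last batch is not counted."""
    return total_images // batch_size


def steps_ceil(total_images, batch_size):
    """Number of batches needed to cover every image at least once."""
    return math.ceil(total_images / batch_size)


total_images_example = 1000  # hypothetical count, not from the article
print(steps_floor(total_images_example, 48))  # 20 batches cover 960 images
print(steps_ceil(total_images_example, 48))   # 21 batches cover all 1000
```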

At this point the model can be trained. To get a visual picture of the fitting process, plot the training curves with matplotlib:

    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    epochs_range = range(epochs)

    plt.figure(figsize=(8, 8))

    plt.subplot(1, 2, 1)
    plt.plot(epochs_range, acc, label='Training Accuracy')
    plt.plot(epochs_range, val_acc, label='Validation Accuracy')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')

    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, loss, label='Training Loss')
    plt.plot(epochs_range, val_loss, label='Validation Loss')
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')

    plt.show()

Next comes the test script:

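    import os

    import cv2
    import numpy as np
    import tensorflow as tf

    IMG_HEIGHT = 170
    IMG_WIDTH = 170

    # [:-1] drops one of the two trailing backslashes of each raw string
    testDir = r'D:\project_material\shipReco\test_set\\'[:-1]
    testShipDir = os.path.join(testDir, r'ships\\'[:-1])
    testNonShipDir = os.path.join(testDir, r'non-ships\\'[:-1])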

ImageDataGenerator shuffles the images by default, so we define our own function to turn the test images into arrays:

    def collectImg(directory):
        imgMatrix = []                      # holds the converted images (3-D arrays)
        file_names = os.listdir(directory)  # names of the files in directory
        for i in file_names:
            file_path = os.path.join(directory, i)  # join folder path and file name
            if file_path[-3:] == "txt":     # skip txt files
                continue
            if file_path[-3:] == "gif":     # skip gif files
                continue
            img = cv2.imread(file_path)     # read the image with OpenCV
            if img is None:                 # skip files OpenCV cannot decode
                continue
            img = img / 255                 # rescale pixel values to [0, 1]
            img = cv2.resize(img, (IMG_WIDTH, IMG_HEIGHT))  # dsize is (width, height)
            imgMatrix.append(img.tolist())  # append to the list of images
        return imgMatrix
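One detail worth checking here: cv2.imread returns channels in BGR order, whereas flow_from_directory loads images as RGB by default, so the test images may reach the model with red and blue swapped relative to training. A minimal NumPy sketch of the conversion follows; the helper name bgr_to_rgb is hypothetical, and with OpenCV available cv2.cvtColor(img, cv2.COLOR_BGR2RGB) does the same job.

```python
import numpy as np


def bgr_to_rgb(img):
    """Reverse the last (channel) axis of an H x W x 3 image array."""
    return img[..., ::-1]


# A single pure-blue pixel in BGR order is (255, 0, 0).
pixel_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
print(bgr_to_rgb(pixel_bgr)[0, 0].tolist())  # [0, 0, 255]
```

Whether this mismatch contributes to the gap between validation and test accuracy is not something the article measures, so it is only a candidate explanation.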

Next, load the model and run it on the test set:

    wholeN = 0
    wholeS = 0
    model = tf.keras.models.load_model('my_model_001.h5')  # load the saved model

    imgS = collectImg(testShipDir)              # convert the ship test images
    imgS_array = np.array(imgS)                 # stack into a NumPy array
    predictionShip = model.predict(imgS_array)  # predict with the model

    imgN = collectImg(testNonShipDir)           # convert the non-ship test images
    imgN_array = np.array(imgN)                 # stack into a NumPy array
    predictionNon = model.predict(imgN_array)   # predict with the model

    # Score the predictions. Each output is the predicted probability of "ship",
    # so the error is 1 - p on a ship image and p on a non-ship image.
    for each in predictionShip:
        wholeS = wholeS + 1 - each
    accS = 1 - wholeS / len(predictionShip)

    for each in predictionNon:
        wholeN = wholeN + each
    accN = 1 - wholeN / len(predictionNon)

Finally, print the results:

    print("For images with ships the accuracy is:", accS)
    print("For images with no ships the accuracy is:", accN)
    print("The accuracy for whole testing set is:",
          1 - (wholeS + wholeN) / (len(predictionShip) + len(predictionNon)))
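The score computed by the loops above is the mean probability the model assigns to the correct class, which is not the same as the fraction of images classified correctly. The sketch below shows both measures side by side; the function names, the 0.5 threshold, and the sample probabilities are assumptions for illustration.

```python
def soft_score(probabilities, positive):
    """Mean probability assigned to the correct class (the measure used above)."""
    correct = [p if positive else 1 - p for p in probabilities]
    return sum(correct) / len(correct)


def thresholded_accuracy(probabilities, positive, threshold=0.5):
    """Fraction of images whose thresholded prediction matches the true class."""
    hits = sum((p >= threshold) == positive for p in probabilities)
    return hits / len(probabilities)


ship_probs = [0.9, 0.6, 0.4]  # hypothetical model outputs for three ship images
print(soft_score(ship_probs, positive=True))
print(thresholded_accuracy(ship_probs, positive=True))
```

With the arrays returned by model.predict, the inputs would be predictionShip.ravel().tolist() with positive=True and predictionNon.ravel().tolist() with positive=False.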

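When training finishes, the script displays the accuracy and loss curves. They show a training accuracy of roughly 84% and a validation accuracy above 80%; the training loss is around 0.3 and the validation loss around 0.4. Finally, running the test script gives an accuracy of 70.4% on the test set.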
