2. Downloading a Pre-trained Model and Performing Transfer Learning

Technology, 2022-07-10

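Training the Model

1. Preface
    1.1 Conventions Used in This Article
    1.2 Prerequisites
    1.3 Video Tutorial
    1.4 Overview of the Whole Process
2. Final Result
3. Creating the Project Folder Structure
4. Annotating the Images
    4.1 Editing predefined_classes.txt
    4.2 Copying Images to the train Directory
    4.3 Annotating Images with LabelImg
    4.4 Moving 10% of the Images and Annotations from train to eval
    4.5 Copying Images to the test Directory
5. Converting the xml Annotation Files to tfrecord Files
    5.1 Creating the Label Map File (pbtxt)
    5.2 Converting xml to csv
    5.3 Converting csv to tfrecord
6. Editing the Pre-trained Model's Configuration File
7. Training the Model
    7.1 Extracting the Pre-trained Model
    7.2 Starting Training
    7.3 Watching Training with TensorBoard (optional)
    7.4 Evaluating the Model (optional)
    7.5 Viewing Evaluation Results with TensorBoard (optional)
8. Exporting the Model
9. Running Object Detection with the Model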

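1. Preface

1.1 Conventions Used in This Article

Action: means you need to carry out the operation described in the instructions.
Enter command xxxxxx: means you need to type the command into the console.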

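1.2 Prerequisites

This tutorial requires a working deep-learning object detection development environment. If you have not set one up yet, first read: 1. Setting Up a Deep Learning Image Recognition Development Environment https://blog.csdn.net/lemon4869/article/details/106818808

Download the files needed for this tutorial in advance, before you start the hands-on steps. Link: https://pan.baidu.com/s/1iBvOLAUamBTd70xRf5m2Ag Extraction code: 5bt7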

1.3 Video Tutorial

For the video tutorial, see: https://www.bilibili.com/video/BV13Z4y1p7di/

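1.4 Overview of the Whole Process

The overall workflow for training the model is: annotate the images, convert the xml annotation files to tfrecord files, edit the pre-trained model's configuration file, train the model, export the model, and finally run object detection with it.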

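2. Final Result

Each image is shown in its own window. Every detected object is outlined with a rectangle, and a text label giving its class and confidence score is drawn next to the box.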

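3. Creating the Project Folder Structure

Action: Create folders following the directory structure below. The cats_dogs directory and its subdirectories must be created by hand.

-tf_train
    -addons (COCO API and the annotation software)
    -models (the complete TensorFlow Object Detection API)
    -scripts (helper scripts)
        -preprocessing (preprocessing scripts)
    -workspaces (working directory)
        -cats_dogs (project directory)
            -annotations (annotation files for the images)
            -images (images)
                -train (training images)
                -eval (evaluation images)
                -test (test images)
            -pre_trained_model (the pre-trained model)
            -trained_frozen_models (frozen graphs)
            -training (files produced during training)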
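If you would rather not create the project folders by hand, the cats_dogs part of the tree above could be created with a short script. This is only a sketch: the function name create_project_dirs and its parameters are made up for illustration, and it assumes the tf_train folder itself already exists wherever you keep it.

```python
from pathlib import Path

# Subdirectories of the project directory, taken from the tree above.
PROJECT_SUBDIRS = [
    "annotations",
    "images/train",
    "images/eval",
    "images/test",
    "pre_trained_model",
    "trained_frozen_models",
    "training",
]


def create_project_dirs(tf_train_root, project_name="cats_dogs"):
    """Create workspaces/<project_name> and its subdirectories under tf_train_root."""
    project_dir = Path(tf_train_root) / "workspaces" / project_name
    for sub in PROJECT_SUBDIRS:
        # parents=True creates intermediate folders; exist_ok=True makes reruns harmless.
        (project_dir / sub).mkdir(parents=True, exist_ok=True)
    return project_dir
```

For example, create_project_dirs(r"D:\tf_train") would create D:\tf_train\workspaces\cats_dogs and everything below it; running it a second time changes nothing.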

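4. Annotating the Images

4.1 Editing predefined_classes.txt

Action: Open the file tf_train\addons\windows_v1.8.0\data\predefined_classes.txt and change the predefined labels in it to cat and dog.

4.2 Copying Images to the train Directory

Action: Unzip images.zip and copy all the images and annotation files from its train directory into tf_train\workspaces\cats_dogs\images\train.

4.3 Annotating Images with LabelImg

Action: Run tf_train\addons\windows_v1.8.0\labelImg.exe, open the directory that contains the training images, and annotate them. (Because the annotation files were copied over together with the images, the annotations are already there when you open the directory.)

4.4 Moving 10% of the Images and Annotations from train to eval

Action: Cut 10% of the images in the train folder, together with their annotations, and paste them into the eval directory. These images are used to evaluate the model. There are originally 100 cat images and 100 dog images, so taking 10% means 10 of each.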
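The cut-and-paste step can also be scripted. The sketch below is an illustration rather than part of the downloaded tutorial files: move_eval_split is a hypothetical helper, and it assumes the images are .jpg files whose LabelImg annotation has the same name with an .xml extension. Note that it samples at random from all images, so it does not guarantee exactly 10 cats and 10 dogs; pick the files by hand if you need that balance.

```python
import random
import shutil
from pathlib import Path


def move_eval_split(train_dir, eval_dir, fraction=0.1, seed=0):
    """Move a random `fraction` of the .jpg images (and their .xml files) to eval_dir."""
    train_dir, eval_dir = Path(train_dir), Path(eval_dir)
    eval_dir.mkdir(parents=True, exist_ok=True)

    images = sorted(train_dir.glob("*.jpg"))
    count = round(len(images) * fraction)
    # A fixed seed makes the selection reproducible between runs.
    chosen = random.Random(seed).sample(images, count)

    for image_path in chosen:
        shutil.move(str(image_path), str(eval_dir / image_path.name))
        xml_path = image_path.with_suffix(".xml")
        if xml_path.exists():  # keep the annotation next to its image
            shutil.move(str(xml_path), str(eval_dir / xml_path.name))
    return [p.name for p in chosen]
```

It would be called with the train and eval folders of the project, for example move_eval_split(r"D:\tf_train\workspaces\cats_dogs\images\train", r"D:\tf_train\workspaces\cats_dogs\images\eval").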

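4.5 Copying Images to the test Directory

Action: The images.zip archive unzipped earlier also contains a test directory. Copy the images in it into tf_train\workspaces\cats_dogs\images\test. These images are used to test the model.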

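5. Converting the xml Annotation Files to tfrecord Files

First convert the hand-made *.xml annotation files into a single *.csv file, then convert that *.csv file into a *.tfrecord file.

*.xml: the annotation file for a single image.
*.csv: a file containing the annotation information for all images.
*.tfrecord: a binary file that also contains the annotation information for all images. Many of TensorFlow's data-processing mechanisms are optimized around TFRecord files.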

5.1 Creating the Label Map File (pbtxt)

Action: Create the file tf_train\workspaces\cats_dogs\annotations\label_map.pbtxt and write the following content into it:

item {
    id: 1
    name: 'cat'
}

item {
    id: 2
    name: 'dog'
}
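For projects with more classes, the label map text can be generated instead of typed. The following is a minimal sketch with a hypothetical helper name, build_label_map; it numbers the classes from 1, because the Object Detection API reserves id 0 for the background.

```python
def build_label_map(class_names):
    """Return label_map.pbtxt text with ids assigned in list order, starting at 1."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {\n    id: %d\n    name: '%s'\n}\n" % (index, name))
    return "\n".join(items)
```

Writing the result of build_label_map(["cat", "dog"]) to label_map.pbtxt gives the same two items as above.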

5.2 Converting xml to csv

Action: Copy the two downloaded scripts, xml_to_csv.py and generate_tfrecord.py, into tf_train\scripts\preprocessing.

Action: Open a console with the tf_gpu environment activated.

Enter command: Both scripts depend on the pandas library, so install pandas first:
conda install pandas

Enter command: Change into the tf_train\scripts\preprocessing directory. First convert the *.xml annotation files for train into a csv file:
python xml_to_csv.py -i=D:\tf_train\workspaces\cats_dogs\images\train -o=D:\tf_train\workspaces\cats_dogs\annotations\train_labels.csv

Enter command: Then do the same for the eval folder:
python xml_to_csv.py -i=D:\tf_train\workspaces\cats_dogs\images\eval -o=D:\tf_train\workspaces\cats_dogs\annotations\eval_labels.csv
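To make the conversion less of a black box, here is a rough sketch of what an xml-to-csv step of this kind might do. It is not the downloaded xml_to_csv.py: the function names and the column layout are assumptions, it uses only the standard library instead of pandas, and it assumes the Pascal VOC style xml that LabelImg writes by default.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed column layout; the downloaded xml_to_csv.py may use different names.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def read_voc_boxes(xml_path):
    """Yield one row per <object> in a Pascal VOC annotation file."""
    root = ET.parse(str(xml_path)).getroot()
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        yield [
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ]


def xml_dir_to_csv(image_dir, csv_path):
    """Collect every *.xml in image_dir into a single csv file; return the row count."""
    rows = []
    for xml_path in sorted(Path(image_dir).glob("*.xml")):
        rows.extend(read_voc_boxes(xml_path))
    with open(str(csv_path), "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return len(rows)
```

Each bounding box becomes one csv row, so an image containing two animals contributes two rows.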

5.3 Converting csv to tfrecord

Enter command: First convert the csv annotation file for train into a tfrecord file:
python generate_tfrecord.py --label0=cat --label=dog --csv_input=D:\tf_train\workspaces\cats_dogs\annotations\train_labels.csv --output_path=D:\tf_train\workspaces\cats_dogs\annotations\train.tfrecord --img_path=D:\tf_train\workspaces\cats_dogs\images\train

Enter command: Then convert the one for eval:
python generate_tfrecord.py --label0=cat --label=dog --csv_input=D:\tf_train\workspaces\cats_dogs\annotations\eval_labels.csv --output_path=D:\tf_train\workspaces\cats_dogs\annotations\eval.tfrecord --img_path=D:\tf_train\workspaces\cats_dogs\images\eval

When this is done, the annotations directory should contain 5 files:

label_map.pbtxt
train_labels.csv
train.tfrecord
eval_labels.csv
eval.tfrecord
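Before moving on, it can be worth checking that all five files really exist and are not empty. The helper below is a small sketch written for this tutorial; the name missing_annotation_files is hypothetical.

```python
from pathlib import Path

EXPECTED_FILES = [
    "label_map.pbtxt",
    "train_labels.csv",
    "train.tfrecord",
    "eval_labels.csv",
    "eval.tfrecord",
]


def missing_annotation_files(annotations_dir):
    """Return the expected files that are absent or empty in annotations_dir."""
    base = Path(annotations_dir)
    return [
        name for name in EXPECTED_FILES
        if not (base / name).is_file() or (base / name).stat().st_size == 0
    ]
```

An empty list means everything is in place; any names returned point to a conversion step that needs to be rerun.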

As shown in the figure below:

6. Editing the Pre-trained Model's Configuration File

Action: First copy tf_train\models\research\object_detection\samples\configs\ssd_inception_v2_coco.config into tf_train\workspaces\cats_dogs\training, then make the following changes. The paths below begin with E:/1-tf_train, the location of tf_train on the author's machine; replace that prefix with the location of your own tf_train folder.

Line 9: change to num_classes: 2
Line 136: change to batch_size: 12
Line 151: change to fine_tune_checkpoint: "E:/1-tf_train/workspaces/cats_dogs/pre_trained_model/ssd_inception_v2_coco_2018_01_28/model.ckpt"
Line 157: change to num_steps: 200
Line 170: change to input_path: "E:/1-tf_train/workspaces/cats_dogs/annotations/train.tfrecord"
Line 172: change to label_map_path: "E:/1-tf_train/workspaces/cats_dogs/annotations/label_map.pbtxt"
Line 176: change to num_examples: 20
Line 179: change to max_evals: 1
Line 184: change to input_path: "E:/1-tf_train/workspaces/cats_dogs/annotations/eval.tfrecord"
Line 186: change to label_map_path: "E:/1-tf_train/workspaces/cats_dogs/annotations/label_map.pbtxt"
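Line numbers can shift between versions of the sample config, so an alternative to editing by line number is to replace fields by name. The function below is only a sketch under that assumption, and set_scalar_field is a hypothetical name. It changes the first occurrence of a field, so it suits the fields that appear once (num_classes, batch_size, num_steps, num_examples, max_evals); input_path and label_map_path appear in both the train and eval readers and are better edited by hand.

```python
import re


def set_scalar_field(config_text, key, value):
    """Replace the value of the first `key: <value>` line found in config_text."""
    pattern = re.compile(r"^([ \t]*%s:[ \t]*)\S+" % re.escape(key), flags=re.MULTILINE)
    # A function replacement avoids any backslash handling in the new value.
    new_text, count = pattern.subn(lambda m: m.group(1) + str(value), config_text, count=1)
    if count == 0:
        raise KeyError("field not found: %s" % key)
    return new_text
```

Typical use would be to read the config file into a string, call set_scalar_field once per field, and write the string back.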

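7. Training the Model

Action: First open a console with the tf_gpu environment activated, then change into the cats_dogs directory. All of the commands below are run from the cats_dogs directory.

7.1 Extracting the Pre-trained Model

Action: Extract ssd_inception_v2_coco_2018_01_28.tar.gz into tf_train\workspaces\cats_dogs\pre_trained_model.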

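7.2 Starting Training

Action: Copy tf_train\models\research\object_detection\legacy\train.py into tf_train\workspaces\cats_dogs, then open the file and add the following code at the very beginning of the main function:

# Limit GPU memory use: this process may take at most 30% of the GPU memory
config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
config.gpu_options.per_process_gpu_memory_fraction = 0.3
tf.compat.v1.keras.backend.set_session(tf.compat.v1.Session(config=config))

Enter command:
python train.py --logtostderr --train_dir=training --pipeline_config_path=training\ssd_inception_v2_coco.config

Screenshot after training completes:

Key points:

--logtostderr: write the log to standard error, which means the screen
--train_dir: the directory that stores the files produced during training
--pipeline_config_path: the model's configuration file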

7.3 Watching Training with TensorBoard (optional)

Enter command:
tensorboard --logdir=training --host=127.0.0.1

Then open a browser and go to 127.0.0.1:6006 to access TensorBoard.

7.4 Evaluating the Model (optional)

Action: Open tf_train\models\research\object_detection\utils\object_detection_evaluation.py and make the following changes:

Line 213: change to category_name = str(category_name, 'utf-8')
Line 351: change to category_name = str(category_name, 'utf-8')
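The edit above decodes a bytes value into text. As a plain-Python illustration of what str(value, 'utf-8') does (this is a sketch, not code from the Object Detection API, and to_text is a hypothetical name), note that the call only accepts bytes; a value that is already a str has to be passed through unchanged:

```python
def to_text(category_name):
    """Decode bytes to str; return str values unchanged."""
    if isinstance(category_name, bytes):
        return str(category_name, "utf-8")  # same call as in the edit above
    return category_name
```

Calling str("cat", "utf-8") on a value that is already text raises a TypeError, which is why the sketch checks the type first.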

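Action: Copy tf_train\models\research\object_detection\legacy\eval.py into tf_train\workspaces\cats_dogs, then open the file and add the same GPU memory configuration code used in section 7.2 at the very beginning of the main function.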

Enter command:
python eval.py --logtostderr --checkpoint_dir=training --eval_dir=evaluation --pipeline_config_path=training\ssd_inception_v2_coco.config

7.5 Viewing Evaluation Results with TensorBoard (optional)

Enter command:
tensorboard --logdir=evaluation --host=127.0.0.1

Then open a browser and go to 127.0.0.1:6006 to access TensorBoard and view the evaluation results.

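8. Exporting the Model

Action: First copy tf_train\models\research\object_detection\export_inference_graph.py into tf_train\workspaces\cats_dogs.

Enter command:
python export_inference_graph.py --input_type=image_tensor --pipeline_config_path=training\ssd_inception_v2_coco.config --trained_checkpoint_prefix=training\model.ckpt-100 --output_directory=trained_frozen_models\cats_dogs_model

The number in model.ckpt-100 is a training step and must match a checkpoint that actually exists in the training directory; use the step number of the checkpoint you want to export.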
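If you are unsure which checkpoint prefix to pass to --trained_checkpoint_prefix, the training directory can be scanned for the highest step number. This is a sketch with a hypothetical helper name, latest_checkpoint_prefix; it assumes the checkpoints were saved as model.ckpt-N.index files alongside their data and meta files.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(training_dir):
    """Return the 'model.ckpt-N' prefix with the largest step N, or None if none exist."""
    steps = []
    for index_file in Path(training_dir).glob("model.ckpt-*.index"):
        match = re.fullmatch(r"model\.ckpt-(\d+)\.index", index_file.name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return str(Path(training_dir) / ("model.ckpt-%d" % max(steps)))
```

Steps are compared as numbers rather than as text, so step 1000 correctly ranks above step 200.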

The resulting frozen-graph model:

9. Running Object Detection with the Model

Action: Create a new file named object_detection_example_2.py in tf_train\workspaces\cats_dogs and copy the following content into it:

# object_detection_example_2.py demonstrates a complete inference process
# -----------------------------------------------------------
# Step 1: import the required packages
# (the "utils" imports assume models\research\object_detection is on PYTHONPATH)
# -----------------------------------------------------------
import numpy as np
import os
import tensorflow as tf
import matplotlib.pyplot as plt
from PIL import Image
from utils import label_map_util
from utils import visualization_utils as vis_util
from utils import ops as utils_ops

# Check the TensorFlow version; it must be >= 1.12.0
from pkg_resources import parse_version
if parse_version(tf.__version__) < parse_version('1.12.0'):
    raise ImportError("parse_version:Please upgrade your TensorFlow to V1.12.* or higher")
print("The version of installed TensorFlow is {0:s}".format(tf.__version__))

# Limit GPU memory use: this process may take at most 30% of the GPU memory
config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
config.gpu_options.per_process_gpu_memory_fraction = 0.3
tf.compat.v1.keras.backend.set_session(tf.compat.v1.Session(config=config))

# -----------------------------------------------------------
# Step 2: load the exported frozen graph (cats_dogs_model) into memory
# -----------------------------------------------------------
MODEL_NAME = 'E:/1-tf_train/workspaces/cats_dogs/trained_frozen_models/cats_dogs_model'
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
PATH_TO_LABELS = os.path.join('annotations', 'label_map.pbtxt')

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# -----------------------------------------------------------
# Step 3: load the label map, so that when the network outputs 1
# we know it corresponds to 'cat'
# -----------------------------------------------------------
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

# -----------------------------------------------------------
# Step 4: run inference and detect the objects in each image
# -----------------------------------------------------------
# Helper: load image data into a numpy array
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

# Test images are named 1.jpg, 2.jpg, ..., 10.jpg and stored in images/test
PATH_TO_IMAGES_DIR = 'images/test'
TEST_IMAGE_PATHS = [os.path.join(PATH_TO_IMAGES_DIR, '{0:d}.jpg'.format(i)) for i in range(1, 11)]

# Size of the displayed figure, in inches
IMAGE_SIZE = (12, 8)

# Helper: detect objects in a single image (already expanded to [1, height, width, 3])
def run_inference_for_single_image(image, graph):
    with graph.as_default():
        with tf.Session() as sess:
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in ['num_detections', 'detection_boxes', 'detection_scores',
                        'detection_classes', 'detection_masks']:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
            if 'detection_masks' in tensor_dict:
                detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
                real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[1], image.shape[2])
                detection_masks_reframed = tf.cast(tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

            # Run inference
            output_dict = sess.run(tensor_dict, feed_dict={image_tensor: image})

            # The outputs carry a batch dimension; take the first (only) entry
            output_dict['num_detections'] = int(output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.int64)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict

for image_path in TEST_IMAGE_PATHS:
    image = Image.open(image_path)
    image_np = load_image_into_numpy_array(image)
    # Add a batch dimension, because the model expects images shaped [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Run detection
    output_dict = run_inference_for_single_image(image_np_expanded, detection_graph)
    # Visualize the detection results
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image_np)
    plt.show()
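The script above only draws the detections. To inspect them as text as well, a small helper along the following lines could be applied to the output_dict and category_index that the script builds. It is a sketch, not part of the Object Detection API: summarize_detections is a hypothetical name, and the 0.5 threshold is an assumption chosen to mirror the score cut-off commonly used when drawing boxes.

```python
def summarize_detections(output_dict, category_index, min_score=0.5):
    """Return (class name, score, box) for each detection scoring at least min_score."""
    results = []
    for i in range(int(output_dict["num_detections"])):
        score = float(output_dict["detection_scores"][i])
        if score < min_score:
            continue
        class_id = int(output_dict["detection_classes"][i])
        # category_index maps a class id to a dict such as {'id': 1, 'name': 'cat'}.
        name = category_index.get(class_id, {}).get("name", "unknown")
        # Boxes are normalized [ymin, xmin, ymax, xmax] values between 0 and 1.
        box = [float(v) for v in output_dict["detection_boxes"][i]]
        results.append((name, score, box))
    return results
```

Inside the loop over TEST_IMAGE_PATHS, printing summarize_detections(output_dict, category_index) after the inference call would list the class, score and box of every confident detection in that image.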

Enter command:
python object_detection_example_2.py

The result:
