TensorFlow: Google Deep Learning Framework in Practice (6): Image Data Processing

Published: 2023/12/15 · 豆豆


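Chapter 7: Image Data Processing
- 7.1 TFRecord Input Data Format
  - 7.1.1 The TFRecord Format
  - 7.1.2 TFRecord Sample Programs
- 7.2 Image Data Processing
  - 7.2.1 TensorFlow Image Processing Functions
    - 1. Image Encoding and Decoding
    - 2. Resizing Images
    - 3. Flipping Images
    - 4. Adjusting Image Color
    - 5. Handling Bounding Boxes
  - 7.2.2 Complete Image Preprocessing Example
- 7.3 Multithreaded Input Data Processing Framework
  - 7.3.1 Queues and Multithreading
  - 7.3.2 Input File Queues
  - 7.3.3 Combining Training Data (Batching)
  - 7.3.4 Input Data Processing Framework
  - Summary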

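Chapter 7: Image Data Processing

Chapter 6 covered convolutional neural networks in detail and described the breakthroughs they brought to image recognition. This chapter approaches the problem from a different angle and looks at how to further improve both recognition accuracy and training speed.

In many image recognition problems, external factors such as lighting and contrast strongly affect the recognition result. This chapter therefore shows how to preprocess image data so that the trained neural network model is affected as little as possible by such irrelevant factors.

Complex preprocessing can slow training down. To reduce the impact of preprocessing on training speed, this chapter also introduces TensorFlow's multithreaded approach to processing input data.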

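7.1 TFRecord Input Data Format

TensorFlow provides a unified format for storing data: the TFRecord format.

7.1.1 The TFRecord Format

All data in a TFRecord file is stored as tf.train.Example Protocol Buffers.

The definition of tf.train.Example is shown below:

message Example {
  Features features = 1;
};
message Features {
  map<string, Feature> feature = 1;
};
message Feature {
  oneof kind {
    BytesList bytes_list = 1;
    FloatList float_list = 2;
    Int64List int64_list = 3;
  }
};

A tf.train.Example contains a dictionary that maps feature names to feature values:
Feature name: a string
Feature value: a list of byte strings (BytesList), a list of floating-point numbers (FloatList), or a list of integers (Int64List)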
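
The name-to-value mapping can be mimicked with plain Python dictionaries. The sketch below only illustrates the structure and is not TensorFlow's API: `make_feature` and the dictionary layout are hypothetical names chosen for this example, whereas the real classes are tf.train.Feature and tf.train.Features.

```python
def make_feature(value):
    """Wrap one Python value in the list kind that matches its type.

    Hypothetical helper: it imitates the `oneof kind` rule of Feature,
    where exactly one of bytes_list, float_list, int64_list is set.
    """
    if isinstance(value, bool):
        raise TypeError("bool is ambiguous here; pass an int instead")
    if isinstance(value, bytes):
        return {"bytes_list": [value]}
    if isinstance(value, float):
        return {"float_list": [value]}
    if isinstance(value, int):
        return {"int64_list": [value]}
    raise TypeError("unsupported feature type: %r" % type(value))


# An Example is a map from feature name (a string) to one Feature.
example = {
    "features": {
        "pixels": make_feature(784),
        "label": make_feature(7),
        "image_raw": make_feature(b"\x00\x01\x02"),
    }
}
print(sorted(example["features"]))
```

Each feature holds exactly one of the three list kinds, which is what the `oneof` in the Protocol Buffer definition expresses.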

7.1.2 TFRecord Sample Programs

1. Converting the MNIST input data to TFRecord format

import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# 1. Convert the input to TFRecord format and save it.
# Helper functions that wrap a value in a tf.train.Feature.
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

# Load the MNIST data.
mnist = input_data.read_data_sets("../../datasets/MNIST_data", dtype=tf.uint8, one_hot=True)
images = mnist.train.images
labels = mnist.train.labels
pixels = images.shape[1]
num_examples = mnist.train.num_examples

# Path of the output TFRecord file (the Records directory must already exist).
filename = "Records/output.tfrecords"
writer = tf.python_io.TFRecordWriter(filename)
for index in range(num_examples):
    # Serialize the pixel array of one image to raw bytes.
    image_raw = images[index].tobytes()
    example = tf.train.Example(features=tf.train.Features(feature={
        'pixels': _int64_feature(pixels),
        'label': _int64_feature(np.argmax(labels[index])),
        'image_raw': _bytes_feature(image_raw)}))
    writer.write(example.SerializeToString())
writer.close()
print("TFRecord file saved.")

2. Reading data from a TFRecord file

import tensorflow as tf

reader = tf.TFRecordReader()
# Create a queue that holds the list of input files.
filename_queue = tf.train.string_input_producer(["Records/output.tfrecords"])
# Read one serialized example from the file.
_, serialized_example = reader.read(filename_queue)

# Parse the example that was read.
features = tf.parse_single_example(
    serialized_example,
    features={
        'image_raw': tf.FixedLenFeature([], tf.string),
        'pixels': tf.FixedLenFeature([], tf.int64),
        'label': tf.FixedLenFeature([], tf.int64)})

images = tf.decode_raw(features['image_raw'], tf.uint8)
labels = tf.cast(features['label'], tf.int32)
pixels = tf.cast(features['pixels'], tf.int32)

sess = tf.Session()

# Start the threads that process the input data.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

for i in range(10):
    image, label, pixel = sess.run([images, labels, pixels])

# Stop the input threads once reading is done.
coord.request_stop()
coord.join(threads)

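7.2 Image Data Processing

7.2.1 TensorFlow Image Processing Functions

1. Image Encoding and Decoding

An image file does not store the numbers of the pixel matrix directly; it stores the result of compression encoding. Restoring an image to a three-dimensional matrix therefore requires a decoding step. TensorFlow provides encoding and decoding functions for JPEG and PNG images.

Using TensorFlow's encoding and decoding functions for JPEG images: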

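# matplotlib.pyplot is a Python plotting tool.
import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    # Decode the JPEG data into the image's three-dimensional matrix.
    # TensorFlow also provides tf.image.decode_png for PNG images.
    # The result is a tensor, so it must be evaluated explicitly before its value can be used.
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())

    # Display the image with pyplot.
    plt.imshow(img_data.eval())
    plt.show()

    # dtype=tf.uint8 leaves the decoded data as uint8, which encode_jpeg below requires.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)

    # Re-encode the three-dimensional matrix as JPEG and write it to a file.
    # Opening that file shows the same image as the original.
    encode_image = tf.image.encode_jpeg(img_data)  # the input must be uint8, otherwise an error is raised
    with tf.gfile.GFile("E:\\Opencv Image\\an.jpg", 'wb') as f:
        f.write(encode_image.eval())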

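2. Resizing Images

Images come in varying sizes, but the number of input nodes of a neural network is fixed, so image sizes must be unified before the pixels are fed to the network.

1. Resizing with an interpolation algorithm, so that the new image keeps as much of the original image's information as possible

TensorFlow implementation: four different algorithms are provided, all wrapped in the tf.image.resize_images function.

tf.image.resize_images(images, size, method=0) # Resize images to size = [new_height, new_width] using the specified method.

Values of the method argument of tf.image.resize_images:

Method value	Resizing algorithm

0	Bilinear interpolation
1	Nearest-neighbour interpolation
2	Bicubic interpolation
3	Area interpolation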
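
To make the simplest of these algorithms concrete, here is a minimal nearest-neighbour resize in plain Python on a 2-D grid. It is a sketch under stated assumptions: `resize_nearest` is a hypothetical name, the floor-based index mapping is one common convention, and TensorFlow's own kernel may map pixels slightly differently.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D grid (a list of rows) with nearest-neighbour sampling."""
    old_height, old_width = len(image), len(image[0])
    resized = []
    for y in range(new_height):
        # Map the output row back to a source row (floor of the scaled index).
        src_y = y * old_height // new_height
        row = [image[src_y][x * old_width // new_width] for x in range(new_width)]
        resized.append(row)
    return resized


small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, 4, 4)
for row in big:
    print(row)
```

Every output pixel copies exactly one source pixel, so no new values are created; the other three methods blend neighbouring pixels instead.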

Code:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    # Decode the JPEG data into the image's three-dimensional matrix.
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # dtype=tf.uint8 leaves the decoded data as uint8.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)

with tf.Session() as sess:
    resized = tf.image.resize_images(img_data, [300, 300], method=3)
    print(img_data.get_shape())
    # The resized data is float32 here; convert it to uint8 so the image displays correctly.
    print("Digital type: ", resized.dtype)
    angelababy2 = np.asarray(resized.eval(), dtype='uint8')
    plt.imshow(angelababy2)
    plt.show()

[Figures: the original image, followed by the results of bilinear, nearest-neighbour, bicubic, and area interpolation]

2. Cropping and padding

tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)
# Crops and/or pads an image to a target width and height.
# Resizes an image to a target width and height by either centrally cropping the image or
# padding it evenly with zeros.
# image: the original image
# target_height, target_width: the target size
# If the original image is larger than the target, the central part is cropped automatically.
# If the original image is smaller than the target, the borders are padded with zeros.

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # dtype=tf.uint8 leaves the decoded data as uint8.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)

with tf.Session() as sess:
    # The target is larger than the image, so the image is padded.
    padded = tf.image.resize_image_with_crop_or_pad(img_data, 1000, 1000)
    # The target is smaller than the image, so the image is cropped.
    cropped = tf.image.resize_image_with_crop_or_pad(img_data, 500, 500)
    plt.imshow(padded.eval())
    plt.show()
    plt.imshow(cropped.eval())
    plt.show()

[Figures: padded to 1000*1000; cropped to 500*500]
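
The crop-or-pad rule can be written out in a few lines of plain Python. This is a minimal sketch on a 2-D grid, assuming floor division when the size difference is odd; `crop_or_pad` is a hypothetical name, and TensorFlow's exact placement of the odd pixel may differ.

```python
def crop_or_pad(image, target_height, target_width, fill=0):
    """Centrally crop and/or zero-pad a 2-D grid (list of rows) to the target size."""
    height, width = len(image), len(image[0])

    # Step 1: crop the centre of every dimension that is too large.
    top = max(0, (height - target_height) // 2)
    left = max(0, (width - target_width) // 2)
    kept_rows = image[top:top + min(height, target_height)]
    cropped = [row[left:left + min(width, target_width)] for row in kept_rows]

    # Step 2: place the result in the middle of a canvas filled with `fill`.
    pad_top = max(0, (target_height - height) // 2)
    pad_left = max(0, (target_width - width) // 2)
    canvas = [[fill] * target_width for _ in range(target_height)]
    for y, row in enumerate(cropped):
        for x, value in enumerate(row):
            canvas[pad_top + y][pad_left + x] = value
    return canvas


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
print(crop_or_pad(grid, 1, 1))
print(crop_or_pad(grid, 1, 5))
```

Height and width are handled independently, so a single call can crop one dimension while padding the other.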

Resizing by ratio: cropping the central region

crop = tf.image.central_crop(image, central_fraction=0.5)
# First argument: the original image
# Second argument: the fraction to keep, a real number in (0, 1]

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # dtype=tf.uint8 leaves the decoded data as uint8.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)

# 5. Crop the central 50% of the image.
with tf.Session() as sess:
    central_cropped = tf.image.central_crop(img_data, 0.5)
    plt.imshow(central_cropped.eval())
    plt.show()

3. Flipping Images

Flipping an image vertically, horizontally, and along the diagonal:

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # dtype=tf.uint8 leaves the decoded data as uint8.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)

# 6. Flip the image.
with tf.Session() as sess:
    # Flip vertically (up/down).
    # flipped1 = tf.image.flip_up_down(img_data)
    # Flip horizontally (left/right).
    # flipped2 = tf.image.flip_left_right(img_data)
    # Flip along the diagonal (transpose).
    transposed = tf.image.transpose_image(img_data)
    plt.imshow(transposed.eval())
    plt.show()

    # Flip vertically with some probability.
    # flipped = tf.image.random_flip_up_down(img_data)
    # Flip horizontally with some probability.
    # flipped = tf.image.random_flip_left_right(img_data)

4. Adjusting Image Color

1. Brightness

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # dtype=tf.uint8 leaves the decoded data as uint8.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)
    plt.imshow(img_data.eval())
    plt.show()

    # Decrease the brightness by 0.5.
    # adjusted = tf.image.adjust_brightness(img_data, -0.5)
    # Increase the brightness by 0.5.
    adjusted = tf.image.adjust_brightness(img_data, 0.5)
    # Randomly adjust the brightness within [-max_delta, max_delta).
    # adjusted = tf.image.random_brightness(img_data, max_delta=0.6)
    plt.imshow(adjusted.eval())
    plt.show()

[Figure: brightness increased by 0.5]

2. Contrast

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # dtype=tf.uint8 leaves the decoded data as uint8.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)
    plt.imshow(img_data.eval())
    plt.show()

    # Adjust the contrast with a contrast factor of -5.
    # adjusted = tf.image.adjust_contrast(img_data, -5)
    # Adjust the contrast with a contrast factor of 5.
    adjusted = tf.image.adjust_contrast(img_data, 5)
    # Randomly adjust the contrast with a factor drawn from [lower, upper].
    # adjusted = tf.image.random_contrast(img_data, lower, upper)
    plt.imshow(adjusted.eval())
    plt.show()

[Figure: contrast adjusted with a factor of 5]

3. Hue

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # dtype=tf.uint8 leaves the decoded data as uint8.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)

    adjusted = tf.image.adjust_hue(img_data, 0.1)
    # adjusted = tf.image.adjust_hue(img_data, 0.3)
    # adjusted = tf.image.adjust_hue(img_data, 0.6)
    # adjusted = tf.image.adjust_hue(img_data, 0.9)
    # Randomly adjust the hue within [-max_delta, max_delta]; max_delta must lie in [0, 0.5].
    # adjusted = tf.image.random_hue(image, max_delta)
    plt.imshow(adjusted.eval())
    plt.show()

4. Saturation

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # dtype=tf.uint8 leaves the decoded data as uint8.
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.uint8)

    # Adjust the saturation with a saturation factor of -5.
    adjusted = tf.image.adjust_saturation(img_data, -5)
    # Adjust the saturation with a saturation factor of 5.
    # adjusted = tf.image.adjust_saturation(img_data, 5)
    # Randomly adjust the saturation with a factor drawn from [lower, upper].
    # adjusted = tf.image.random_saturation(img_data, lower, upper)
    plt.imshow(adjusted.eval())
    plt.show()

TensorFlow also provides an API for image standardization, which shifts the pixel values of an image to zero mean and unit variance:

# Standardize the three-dimensional matrix of one image to zero mean and unit variance.
adjusted = tf.image.per_image_standardization(img_data)
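
The arithmetic behind standardization is short enough to spell out. The following is a plain-Python sketch on a flat list of pixel values; `standardize` is a hypothetical name. The lower bound on the standard deviation follows the formula in the TensorFlow documentation, max(stddev, 1/sqrt(N)), which avoids dividing by zero for uniform images.

```python
import math


def standardize(pixels):
    """Shift and scale a flat list of pixel values to zero mean and unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # Guard against a zero standard deviation (an image with one flat colour).
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]


values = standardize([0.0, 2.0, 4.0, 6.0])
new_mean = sum(values) / len(values)
new_variance = sum(v * v for v in values) / len(values)
print(new_mean, new_variance)
```

For a uniform image the numerator is zero everywhere, so the result is all zeros rather than a division error.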

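5. Handling Bounding Boxes

In many image recognition datasets, the objects of interest in an image are marked with bounding boxes. The following function draws them:

tf.image.draw_bounding_boxes(images, boxes, name=None)
# images: a 4-D tensor of shape [batch, height, width, depth] with dtype float32 or half;
#   the leading batch dimension exists because the function processes a batch of images.
# boxes: a 3-D tensor of shape [batch, num_bounding_boxes, 4] with dtype float32;
#   num_bounding_boxes is the number of boxes, and each box is described by the four
#   numbers [y_min, x_min, y_max, x_max].
#   For example, tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
#   has shape [1, 2, 4] and describes two boxes in one image;
#   tf.constant([[[0., 0., 1., 1.]]]) has shape [1, 1, 4] and describes one box in one image.
# name: an optional name for the operation.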
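
Box coordinates are relative positions, so they are multiplied by the image height and width to obtain pixel positions. A minimal sketch of that conversion follows; `box_to_pixels` is a hypothetical helper, and rounding to the nearest integer is an assumption made here, not necessarily what TensorFlow does internally.

```python
def box_to_pixels(box, height, width):
    """Convert a relative [y_min, x_min, y_max, x_max] box to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (int(round(y_min * height)), int(round(x_min * width)),
            int(round(y_max * height)), int(round(x_max * width)))


# A 180 x 260 image (height x width) with two relative boxes.
boxes = [[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]
pixel_boxes = [box_to_pixels(b, 180, 260) for b in boxes]
print(pixel_boxes)
```

Because the boxes are relative, the same box tensor remains valid after the image has been resized.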

Code example:

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    # Shrink the image first so that the bounding boxes are easier to see.
    img_data = tf.image.resize_images(img_data, [180, 260], method=1)
    # tf.image.draw_bounding_boxes expects a batch, that is, a 4-D tensor made of several images,
    # so one dimension is added to the decoded image.
    # tf.image.draw_bounding_boxes also expects real-valued input.
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
    # Each box is (y_min, x_min, y_max, x_max), given as positions relative to the image size.
    boxes = tf.constant([[[0.01, 0.2, 0.5, 0.7], [0.25, 0.4, 0.32, 0.55]]])
    result = tf.image.draw_bounding_boxes(batched, boxes)
    plt.subplot(121), plt.imshow(img_data.eval())
    plt.subplot(122), plt.imshow(result[0].eval())
    plt.show()

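Randomly cropping an informative part of the image is another way to make a model more robust: it makes the trained model less sensitive to the size of the object being recognized.

tf.image.sample_distorted_bounding_box(image_size, bounding_boxes, seed=None, seed2=None, min_object_covered=None, aspect_ratio_range=None, area_range=None, max_attempts=None, use_image_if_no_bounding_boxes=None, name=None)
# image_size: a 1-D tensor holding the three values [height, width, channels]. Its dtype must be one of uint8, int8, int16, int32, int64.
# bounding_boxes: a 3-D tensor of shape [batch, N, 4] with dtype float32. The leading batch dimension exists because the function processes a batch of images; N is the number of bounding boxes associated with the image, and each box is given by the four numbers [y_min, x_min, y_max, x_max]. For example, tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]]) has shape [1, 2, 4] and describes two boxes in one image; tf.constant([[[0., 0., 1., 1.]]]) has shape [1, 1, 4] and describes one box in one image.
# seed: (optional) int, default 0. If either seed or seed2 is non-zero, the random number generator is seeded with the given seed; otherwise it is seeded randomly.
# seed2: (optional) int, default 0. A second seed to avoid seed collisions.
# min_object_covered: (optional) float, default 0.1. The cropped region must contain at least this fraction of at least one of the supplied bounding boxes. The value must be non-negative; when it is 0, the cropped region does not need to overlap any supplied box.
# aspect_ratio_range: (optional) list of floats, default [0.75, 1.33]. The aspect ratio (width / height) of the cropped region must lie in this range.
# area_range: (optional) list of floats, default [0.05, 1]. The cropped region must contain a fraction of the image area that lies in this range.
# max_attempts: (optional) int, default 100. The number of attempts at generating a cropped region that satisfies the constraints. After max_attempts failures, the whole image is returned.
# use_image_if_no_bounding_boxes: (optional) bool, default False. Controls the behaviour when no bounding boxes are supplied. If True, an implicit bounding box covering the whole input is assumed; if False, an error is raised.
# name: an optional name for the operation.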

Code:

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw bytes of the image file.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    img_data = tf.image.resize_images(img_data, (330, 200), method=1)
    boxes = tf.constant([[[0.01, 0.2, 0.5, 0.7], [0.25, 0.4, 0.32, 0.55]]])

    # Randomly sample a crop of the image.
    begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(
        tf.shape(img_data), bounding_boxes=boxes, min_object_covered=0.1)
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
    image_with_box = tf.image.draw_bounding_boxes(batched, bbox_for_draw)
    distorted_image = tf.slice(img_data, begin, size)
    plt.imshow(distorted_image.eval())
    plt.show()

7.2.2 Complete Image Preprocessing Example

# "TensorFlow: Google Deep Learning Framework in Practice", 07 Image Data Processing
# win10 Tensorflow1.0.1 python3.5.3
# CUDA v8.0 cudnn-8.0-windows10-x64-v5.1
# filename:ts07.03.py
# Complete image preprocessing example

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# 1. Randomly adjust the colors of an image; two orderings are defined.
def distort_color(image, color_ordering=0):
    if color_ordering == 0:
        image = tf.image.random_brightness(image, max_delta=32./255.)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
    else:
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32./255.)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
    return tf.clip_by_value(image, 0.0, 1.0)

# 2. Preprocess an image into input-layer data for the neural network.
# Given a decoded image, a target size, and the bounding boxes on the image,
# this function preprocesses the image.
# Input: an original training image
# Output: data for the input layer of the neural network model
# Note: only training data is handled here; prediction data does not need the random transformations.
def preprocess_for_train(image, height, width, bbox):
    # If no bounding box is given, treat the whole image as the region of interest.
    if bbox is None:
        bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
    # Convert the dtype of the image tensor.
    if image.dtype != tf.float32:
        image = tf.image.convert_image_dtype(image, dtype=tf.float32)

    # Randomly crop a patch of the image to reduce the effect of object size on recognition.
    bbox_begin, bbox_size, _ = tf.image.sample_distorted_bounding_box(
        tf.shape(image), bounding_boxes=bbox, min_object_covered=0.1)
    distorted_image = tf.slice(image, bbox_begin, bbox_size)

    # Resize the random crop to the size of the input layer; the resizing algorithm is chosen at random.
    distorted_image = tf.image.resize_images(distorted_image, [height, width], method=np.random.randint(4))
    # Randomly flip the image left to right.
    distorted_image = tf.image.random_flip_left_right(distorted_image)
    # Adjust the colors using a randomly chosen ordering.
    distorted_image = distort_color(distorted_image, np.random.randint(2))
    return distorted_image

# 3. Read the image.
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\dog.jpg", "rb").read()
with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    # Run six times to obtain six different images.
    for i in range(6):
        result = preprocess_for_train(img_data, 299, 299, boxes)
        plt.imshow(result.eval())
        plt.show()

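7.3 Multithreaded Input Data Processing Framework

The preprocessing methods above reduce the influence of irrelevant factors on an image recognition model, but these complex operations slow down the whole training process. To keep preprocessing from becoming a bottleneck, TensorFlow provides a multithreaded framework for processing input data.

[Figure: classic input data processing pipeline]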

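7.3.1 Queues and Multithreading

A queue is a "stateful" node in the computation graph whose contents other nodes can modify: other nodes can insert new elements at the rear of the queue and remove elements from its front.

1. TensorFlow provides two kinds of queues: FIFOQueue and RandomShuffleQueue

The main operations that modify the state of a queue are Enqueue, EnqueueMany, and Dequeue. They need a handle to the queue rather than an ordinary value, because only then can they modify its contents. In the Python API they are methods of the queue object, for example q.enqueue().

EnqueueMany: enqueues several elements at once; used here to initialize the queue
Dequeue: removes an element from the queue
Enqueue: adds an element to the queue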

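FIFOQueue: a first-in, first-out queue

import tensorflow as tf

# Create a first-in, first-out queue that holds at most two elements of integer type.
q = tf.FIFOQueue(2, "int32")
# Initialize the elements of the queue with enqueue_many.
# As with variable initialization, this step must be run explicitly before the queue is used.
init = q.enqueue_many(([0, 10],))
# Dequeue the first element of the queue; its value is stored in x.
x = q.dequeue()
# Add 1 to the value.
y = x + 1
# Put the incremented value back into the queue.
q_inc = q.enqueue([y])

with tf.Session() as sess:
    # Run the queue initialization.
    init.run()
    for _ in range(5):
        # Running q_inc dequeues an element, adds 1 to it, and enqueues it again.
        v, _ = sess.run([x, q_inc])
        # Print the value of the dequeued element.
        print(v)

Output:

0
10
1
11
2

The queue starts with the two elements [0, 10]. The first element dequeued is 0; after adding 1 it is enqueued again, which gives [10, 1]. The second element dequeued is 10, which becomes 11 and gives [1, 11], and so on.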
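
The same dequeue, increment, enqueue cycle can be traced without TensorFlow. The sketch below uses collections.deque as a stand-in for the FIFO queue; `run_queue` is a hypothetical helper and involves no computation graph or session.

```python
from collections import deque


def run_queue(initial, steps):
    """Repeat: dequeue the front element, add 1, enqueue the result at the rear."""
    q = deque(initial)
    dequeued = []
    for _ in range(steps):
        x = q.popleft()      # Dequeue from the front.
        q.append(x + 1)      # Enqueue the incremented value at the rear.
        dequeued.append(x)
    return dequeued, list(q)


values, remaining = run_queue([0, 10], 5)
print(values)     # [0, 10, 1, 11, 2]
print(remaining)  # [12, 3]
```

The printed values match the output of the TensorFlow program, because both follow strict first-in, first-out order.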

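RandomShuffleQueue: shuffles the elements in the queue, so each dequeue returns a random element from those currently in the queue.

Neural network training generally benefits from training data that is as random as possible, so this queue is used more often.

2. The role of queues

As shown above, a queue is a data structure.
It is also an important mechanism for computing tensor values asynchronously (for example, several threads can write elements to one queue at the same time, or read elements from one queue at the same time).

3. Multithread coordination

Two classes, tf.train.Coordinator and tf.train.QueueRunner, provide multithread coordination.

tf.train.Coordinator is mainly used to make multiple threads stop together (it provides the three functions should_stop, request_stop, and join).
How it works:
Create a tf.train.Coordinator object and pass it to every thread that is created.
Each running thread keeps polling should_stop and must exit when it returns True.
Any running thread can tell the others to exit by calling request_stop (once one thread calls request_stop, should_stop returns True, so the other threads can terminate as well).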

import tensorflow as tf
import numpy as np
import threading
import time

# The function run in each thread: every second it checks whether it should stop and prints its own ID.
def MyLoop(coord, worker_id):
    # Use the tf.train.Coordinator helper to check whether this thread should stop.
    while not coord.should_stop():
        # Stop all threads at random.
        if np.random.rand() < 0.1:
            print("stop from id: %d\n" % worker_id)
            # Call coord.request_stop() to tell the other threads to stop.
            coord.request_stop()
        else:
            # Print the ID of the current thread.
            print("working on id: %d" % worker_id)
        # Pause for 1 second.
        time.sleep(1)

# Create a tf.train.Coordinator to coordinate the threads.
coord = tf.train.Coordinator()
# Create 5 threads.
threads = [threading.Thread(target=MyLoop, args=(coord, i,)) for i in range(5)]
# Start all threads.
for t in threads:
    t.start()
# Wait for all threads to exit.
coord.join(threads)

Output:

working on id: 0
working on id: 1
working on id: 2
working on id: 3
working on id: 4
working on id: 0
working on id: 3
working on id: 2
working on id: 1
working on id: 4
working on id: 3
working on id: 0
working on id: 2
working on id: 4
working on id: 1
stop from id: 4

Once all threads have started, each one prints its own ID, which accounts for the first five lines. After pausing for 1 second, all threads print their IDs a second time, and so on until one of them requests a stop.
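
The stop protocol itself needs nothing beyond a shared flag. The following sketch rebuilds it on top of threading.Event; `MiniCoordinator` and `worker` are hypothetical names for illustration, not the implementation of tf.train.Coordinator, and the stop condition is made deterministic (worker 2 stops on its third iteration) instead of random.

```python
import threading
import time


class MiniCoordinator:
    """A stripped-down stand-in with should_stop, request_stop and join."""

    def __init__(self):
        self._stop_event = threading.Event()

    def should_stop(self):
        return self._stop_event.is_set()

    def request_stop(self):
        self._stop_event.set()

    def join(self, threads):
        for t in threads:
            t.join()


def worker(coord, worker_id, log, stop_after):
    iterations = 0
    while not coord.should_stop():
        iterations += 1
        if stop_after is not None and iterations >= stop_after:
            log.append(("stop", worker_id))
            coord.request_stop()   # Every other worker sees this on its next check.
        else:
            time.sleep(0.01)


coord = MiniCoordinator()
log = []
threads = [threading.Thread(target=worker, args=(coord, i, log, 3 if i == 2 else None))
           for i in range(5)]
for t in threads:
    t.start()
coord.join(threads)
print(log)
```

Each worker only ever reads the flag or sets it, which is why a single Event is enough to stop all five threads.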

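tf.train.QueueRunner is mainly used to start multiple threads that operate on the same queue.
All the started threads can be managed together through tf.train.Coordinator.
The following code shows how to use tf.train.QueueRunner and tf.train.Coordinator to manage multithreaded queue operations:

import tensorflow as tf

# Create a first-in, first-out queue that holds at most 100 real-valued elements.
queue = tf.FIFOQueue(100, "float")
# Define the enqueue operation.
enqueue_op = queue.enqueue([tf.random_normal([1])])

# Use tf.train.QueueRunner to create several threads that run the enqueue operation.
# The first argument is the queue being operated on.
# [enqueue_op] * 5 starts 5 threads, each of which runs enqueue_op.
qr = tf.train.QueueRunner(queue, [enqueue_op] * 5)

# Add the QueueRunner to a collection of the TensorFlow computation graph.
# When no collection is specified, tf.train.add_queue_runner uses the default
# collection tf.GraphKeys.QUEUE_RUNNERS, as it does here for qr.
tf.train.add_queue_runner(qr)

# Define the dequeue operation.
out_tensor = queue.dequeue()

with tf.Session() as sess:
    # Use tf.train.Coordinator to coordinate the started threads.
    coord = tf.train.Coordinator()
    # With tf.train.QueueRunner, tf.train.start_queue_runners must be called explicitly
    # to start all threads. Otherwise no thread runs the enqueue operation, and the
    # program waits forever when the dequeue operation is called.
    # tf.train.start_queue_runners starts all QueueRunners in the
    # tf.GraphKeys.QUEUE_RUNNERS collection by default. Because it only starts the
    # QueueRunners of the given collection, tf.train.add_queue_runner and
    # tf.train.start_queue_runners are normally given the same collection.
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    # Fetch values from the queue.
    for _ in range(3):
        print(sess.run(out_tensor)[0])

    # Use tf.train.Coordinator to stop all threads.
    coord.request_stop()
    coord.join(threads)

Output:

-0.574549
1.83348
-0.67578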
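
The pattern of several threads feeding one queue while the main thread dequeues can be sketched with the standard library. This is an analogy under stated assumptions, not how tf.train.QueueRunner is implemented: queue.Queue replaces the graph queue, a threading.Event replaces the coordinator, and `producer` is a hypothetical name.

```python
import queue
import random
import threading


def producer(q, stop_event):
    """Keep enqueueing random normal values until asked to stop."""
    while not stop_event.is_set():
        try:
            # The timeout lets the thread notice the stop request when the queue is full.
            q.put(random.gauss(0.0, 1.0), timeout=0.05)
        except queue.Full:
            continue


q = queue.Queue(maxsize=100)          # At most 100 elements, as in the example above.
stop_event = threading.Event()
producers = [threading.Thread(target=producer, args=(q, stop_event)) for _ in range(5)]
for t in producers:
    t.start()

samples = [q.get() for _ in range(3)]  # Blocks until producers have enqueued values.
stop_event.set()
for t in producers:
    t.join()
print(samples)
```

If the producer threads were never started, the first `q.get()` would block forever, which mirrors the warning about forgetting tf.train.start_queue_runners.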

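7.3.2 Input File Queues

This section shows how to use TensorFlow queues to manage the list of input files.

Although a single TFRecord file can store many training examples, when the training set is large the data can be split across several TFRecord files to improve processing efficiency.

Getting all files that match a pattern (a glob pattern such as data.tfrecords-*): tf.train.match_filenames_once

Managing the file list efficiently: tf.train.string_input_producer

tf.train.string_input_producer creates an input queue from the file list supplied at initialization; the initial elements of the queue are all the files in that list. The queue can then be passed to a file-reading function. Each time the reading function is called, it first checks whether a file is already open with data left to read. If not, or if the open file has been read completely, it dequeues a file from the input queue and reads from that file.

When shuffle=True, the files are shuffled before being added to the queue, so the dequeue order is random as well.

Shuffling the files and adding them to the input queue runs in a separate thread, so it does not slow down file reading.
Once all files in the input queue have been processed, all files in the initial list are added to the queue again.

num_epochs: limits the maximum number of times the initial file list is loaded

When it is set to 1, the program stops automatically after one full pass.
When testing a neural network model, all test data needs to be used only once, so num_epochs is set to 1.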
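
The interplay of shuffle and num_epochs can be imitated with a generator. A minimal sketch, assuming a hypothetical `filename_producer` function: it reshuffles once per epoch and simply ends after the last epoch, whereas in TensorFlow reading past the final epoch raises tf.errors.OutOfRangeError.

```python
import random
from itertools import islice


def filename_producer(files, num_epochs=None, shuffle=True, seed=None):
    """Yield file names epoch by epoch; num_epochs=None repeats forever."""
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        order = list(files)
        if shuffle:
            rng.shuffle(order)   # A fresh order for every pass over the list.
        for name in order:
            yield name
        epoch += 1


files = ["A", "B", "C"]
print(list(filename_producer(files, num_epochs=2, shuffle=False)))
print(list(islice(filename_producer(files, shuffle=True, seed=0), 6)))
```

With num_epochs=1 every file is produced exactly once, which is the behaviour wanted for test data.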

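A simple program that generates sample data:

import tensorflow as tf

# Helper function for creating TFRecord features.
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

# Simulate writing a large dataset to several files.
# num_shards: the total number of files to write
# instances_per_shard: the number of examples in each file
num_shards = 2
instances_per_shard = 2
for i in range(num_shards):
    # When data is split across several files, the files can be told apart by a
    # suffix of the form 0000n-of-0000m.
    # m: the total number of files the data is stored in
    # n: the index of the current file
    # This makes it easy to collect the file list with a pattern and also puts
    # more information into the file name.
    filename = ("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-%.5d-of-%.5d" % (i, num_shards))
    writer = tf.python_io.TFRecordWriter(filename)
    # Wrap the data in Example structures and write them to the TFRecord file.
    for j in range(instances_per_shard):
        # Each Example only records which file it belongs to and its position within that file.
        example = tf.train.Example(features=tf.train.Features(feature={
            'i': _int64_feature(i),
            'j': _int64_feature(j)}))
        writer.write(example.SerializeToString())
    writer.close()

After the program runs, it produces two files in the specified directory: data.tfrecords-00000-of-00002 and data.tfrecords-00001-of-00002.

Each file stores two examples. Once they have been generated, the following code shows how tf.train.match_filenames_once and tf.train.string_input_producer are used: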

import tensorflow as tf

# Get the file list with tf.train.match_filenames_once.
files = tf.train.match_filenames_once("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-*")
filename_queue = tf.train.string_input_producer(files, shuffle=False)
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    features={
        'i': tf.FixedLenFeature([], tf.int64),
        'j': tf.FixedLenFeature([], tf.int64),
    })

with tf.Session() as sess:
    # Running only tf.global_variables_initializer() raises an error here, because
    # match_filenames_once also needs its local variables initialized.
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    print(sess.run(files))
    # Create a tf.train.Coordinator to coordinate the threads, then start them.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    # Fetch data several times.
    for i in range(6):
        print(sess.run([features['i'], features['j']]))
    coord.request_stop()
    coord.join(threads)

Output:

[b'E:\\pycharm\\TensorFlow chap7\\data.tfrecords-00000-of-00002'
 b'E:\\pycharm\\TensorFlow chap7\\data.tfrecords-00001-of-00002']
[0, 0]
[0, 1]
[1, 0]
[1, 1]
[0, 0]
[0, 1]
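
The shard names in this output follow the `%.5d-of-%.5d` suffix convention, and a glob pattern picks them out of a directory listing. The sketch below checks both with the standard library; `shard_name` is a hypothetical helper, and fnmatch stands in for the matching that tf.train.match_filenames_once performs on real files.

```python
import fnmatch


def shard_name(prefix, index, num_shards):
    """Build a shard file name such as data.tfrecords-00000-of-00002."""
    return "%s-%.5d-of-%.5d" % (prefix, index, num_shards)


names = [shard_name("data.tfrecords", i, 2) for i in range(2)]
# Simulate a directory listing that also contains an unrelated file.
listing = names + ["notes.txt"]
matched = sorted(fnmatch.filter(listing, "data.tfrecords-*"))
print(matched)
```

Zero-padding the indices keeps the alphabetical order of the file names identical to their numeric order.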

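7.3.3 Combining Training Data (Batching)

Grouping several input examples into a batch improves training efficiency, so after individual examples have been preprocessed they still need to be organized into batches before being fed to the input layer of the neural network.
1. tf.train.batch: organizes examples into batches. It creates a queue whose enqueue operation is the operation that produces a single example; each dequeue returns one batch of examples.
2. tf.train.shuffle_batch: also organizes examples into batches through a queue in the same way, but it additionally shuffles the order of the data.

tf.train.batch code example: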

import tensorflow as tf

# Get the file list.
files = tf.train.match_filenames_once("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-*")
# Create the input file queue.
filename_queue = tf.train.string_input_producer(files, shuffle=False)

# Read and parse an Example.
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    features={
        'i': tf.FixedLenFeature([], tf.int64),
        'j': tf.FixedLenFeature([], tf.int64)})

# i stands for the feature vector, j for the label.
example, label = features['i'], features['j']
# Number of examples in one batch.
batch_size = 3
# Maximum number of examples the batching queue can hold.
capacity = 1000 + 3 * batch_size
# Combine the examples into batches.
example_batch, label_batch = tf.train.batch([example, label], batch_size=batch_size, capacity=capacity)

with tf.Session() as sess:
    # match_filenames_once requires local_variables_initializer to initialize some variables.
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    # Coordinate the threads with a Coordinator and start them.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    # Fetch and print the combined examples. In a real problem they would be the input of the neural network.
    for i in range(2):
        cur_example_batch, cur_label_batch = sess.run([example_batch, label_batch])
        print(cur_example_batch, cur_label_batch)
    coord.request_stop()
    coord.join(threads)

Output:

[0 0 1] [0 1 0]
[1 0 0] [1 0 1]

tf.train.batch groups the individual examples into batches of three. The data read into example and label is, in order:
example: 0, label: 0
example: 0, label: 1
example: 1, label: 0
example: 1, label: 1
Because the function does not shuffle the data, combining these examples in order produces the output above.
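
The grouping can be reproduced in plain Python by cutting an endlessly repeating stream of (example, label) pairs into chunks of three. This is a sketch of the ordering only; `batches` is a hypothetical helper, and itertools.cycle stands in for the file queue that keeps re-adding the files.

```python
from itertools import cycle, islice


def batches(stream, batch_size):
    """Group consecutive items of a stream into lists of batch_size items."""
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []


# The four (i, j) records stored in the two TFRecord files, in reading order.
records = [(0, 0), (0, 1), (1, 0), (1, 1)]
first_two = list(islice(batches(cycle(records), 3), 2))
for batch in first_two:
    examples = [i for i, _ in batch]
    labels = [j for _, j in batch]
    print(examples, labels)
```

The second batch starts with the last record of the first pass and continues with the first two records of the second pass, which is why its examples are 1, 0, 0.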

tf.train.shuffle_batch code example:

import tensorflow as tf

# Get the file list.
files = tf.train.match_filenames_once("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-*")
# Create the input file queue.
filename_queue = tf.train.string_input_producer(files, shuffle=False)

# Read and parse an Example.
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    features={
        'i': tf.FixedLenFeature([], tf.int64),
        'j': tf.FixedLenFeature([], tf.int64)})

# i stands for the feature vector, j for the label.
example, label = features['i'], features['j']
# Number of examples in one batch.
batch_size = 3
# Maximum number of examples the batching queue can hold.
capacity = 1000 + 3 * batch_size

# Combine the examples into batches.
# min_after_dequeue is specific to shuffle_batch. It sets the minimum number of elements
# that must remain in the queue after a dequeue; when the queue holds too few elements,
# shuffling has little effect.
example_batch, label_batch = tf.train.shuffle_batch(
    [example, label], batch_size=batch_size, capacity=capacity, min_after_dequeue=30)

with tf.Session() as sess:
    # match_filenames_once requires local_variables_initializer to initialize some variables.
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    # Coordinate the threads with a Coordinator and start them.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    # Fetch and print the combined examples. In a real problem they would be the input of the neural network.
    for i in range(2):
        cur_example_batch, cur_label_batch = sess.run([example_batch, label_batch])
        print(cur_example_batch, cur_label_batch)
    coord.request_stop()
    coord.join(threads)

Output:

[0 1 1] [0 1 0]
[1 0 0] [0 0 1]
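
The role of capacity and min_after_dequeue becomes clearer with a small shuffle buffer written by hand. The following is a simplified, single-threaded sketch with a hypothetical `shuffle_batches` function: it refills a buffer up to capacity, draws random elements, and only emits a batch if at least min_after_dequeue elements would remain afterwards. Unlike TensorFlow, it drops the leftover elements of a finite stream.

```python
import random
from itertools import cycle, islice


def shuffle_batches(stream, batch_size, capacity, min_after_dequeue, seed=None):
    """Yield randomly drawn batches from a buffer of at most `capacity` items."""
    rng = random.Random(seed)
    source = iter(stream)
    buffer = []
    while True:
        # Refill the buffer up to its capacity.
        try:
            while len(buffer) < capacity:
                buffer.append(next(source))
        except StopIteration:
            pass
        # Dequeue only if enough elements would remain to keep the mixing effective.
        if len(buffer) < min_after_dequeue + batch_size:
            return
        batch = []
        for _ in range(batch_size):
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            batch.append(buffer.pop())
        yield batch


batch_size = 3
min_after_dequeue = 30
capacity = min_after_dequeue + 3 * batch_size
records = [(0, 0), (0, 1), (1, 0), (1, 1)]
two_batches = list(islice(
    shuffle_batches(cycle(records), batch_size, capacity, min_after_dequeue, seed=0), 2))
print(two_batches)
```

A larger min_after_dequeue mixes the data more thoroughly but needs more memory and a longer warm-up, which is the trade-off behind the `capacity = min_after_dequeue + 3 * batch_size` rule used in the next section.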

7.3.4 Input Data Processing Framework

This section gives the code that results from combining the steps above.

The framework covers three main topics:

The TFRecord input data format
Image data processing
Multithreaded input data processing

The code below only outlines an input data processing framework and has to be adapted to the actual environment in which it is used.

import tensorflow as tf

# Create the file list.
files = tf.train.match_filenames_once("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-*")
# Create the input file queue.
filename_queue = tf.train.string_input_producer(files, shuffle=False)

# Parse the data. Here image is assumed to hold the image data and label the label;
# height, width, and channels give the dimensions of the image.
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    features={
        'image': tf.FixedLenFeature([], tf.string),
        'label': tf.FixedLenFeature([], tf.int64),
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'channels': tf.FixedLenFeature([], tf.int64)})
image, label = features['image'], features['label']
height, width = features['height'], features['width']
channels = features['channels']

# Decode the pixel matrix from the raw image data and restore the image shape.
# set_shape expects statically known dimensions, so this line must be adapted
# when height, width, and channels are tensors read from the file.
decoded_image = tf.decode_raw(image, tf.uint8)
decoded_image.set_shape([height, width, channels])

# Size of the images at the input layer of the neural network.
image_size = 299
# preprocess_for_train is the image preprocessing function from section 7.2.2.
distorted_image = preprocess_for_train(decoded_image, image_size, image_size, None)

# Combine the examples into batches.
min_after_dequeue = 10000
batch_size = 100
capacity = min_after_dequeue + 3 * batch_size
image_batch, label_batch = tf.train.shuffle_batch(
    [distorted_image, label], batch_size=batch_size,
    capacity=capacity, min_after_dequeue=min_after_dequeue)

# Define the structure of the neural network and the optimization process.
# inference, calc_loss, learning_rate, and TRAINING_ROUNDS are not defined here.
logit = inference(image_batch)
loss = calc_loss(logit, label_batch)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    # Training loop of the neural network.
    for i in range(TRAINING_ROUNDS):
        sess.run(train_step)
    coord.request_stop()
    coord.join(threads)

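Summary

Input data processing follows roughly the same flow in most cases, which can be summarized as follows:

Convert the data into several files in TFRecord format

Use tf.train.match_filenames_once() to create the file list ({A, B, C} in the figure)

Use tf.train.string_input_producer() to create the input file queue; it can shuffle the input files and add them to the queue (shuffling is optional; the function also creates and maintains the input file queue, which file-reading functions in different threads can share)

Use tf.TFRecordReader() to read the data from the files

Use tf.parse_single_example() to parse the data

Decode and preprocess the data

Use tf.train.shuffle_batch() to combine the data into batches

Use the batches for training

The figure above shows the complete input data processing flow of the code above.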
