FCN - Data Preparation


Starting with this post, we will walk through the full process of using a fully convolutional network (FCN) for semantic segmentation.
Code: https://github.com/shelhamer/fcn.berkeleyvision.org

This post covers three topics:
1. The publicly available datasets
2. How to prepare your own dataset, in particular how to annotate labels
3. How to colorize the results after training

Public datasets

Here we cover the SIFT Flow dataset and the PASCAL VOC dataset.
1. PASCAL VOC
From the pascal README in the data folder of the FCN repository:

# PASCAL VOC and SBD

PASCAL VOC is a standard recognition dataset and benchmark with detection and semantic segmentation challenges. The semantic segmentation challenge annotates 20 object classes and background. The Semantic Boundary Dataset (SBD) is a further annotation of the PASCAL VOC data that provides more semantic segmentation and instance segmentation masks.

PASCAL VOC has a private test set and [leaderboard for semantic segmentation](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6).

The train/val/test splits of the PASCAL VOC segmentation challenge and SBD diverge. Most notably, VOC 2011 segval intersects with SBD train. Care must be taken for proper evaluation by excluding images from the train or val splits.

We train on the 8,498 images of SBD train. We validate on the non-intersecting set defined in the included `seg11valid.txt`.

Refer to `classes.txt` for the listing of classes in model output order. Refer to `../voc_layers.py` for the Python data layer for this dataset.

See the dataset sites for download:

- PASCAL VOC 2012: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/
- SBD: see [homepage](http://home.bharathh.info/home/sbd) or [direct download](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz)

Download the training data (SBD) and the test data (PASCAL VOC 2012).
Then go to fcn/data, create an sbdd folder (if it does not exist), extract the benchmark's dataset into sbdd, and extract VOC2012 into the pascal folder under data. Both folders already ship with train.txt for training and seg11valid.txt for testing.
2. SIFT Flow
Download the dataset and extract it into /fcn.berkeleyvision.org/data/, overwriting the folder named sift-flow.
The FCN source code already provides train.txt and the other split files, so they do not need to be regenerated. A quick check of the resulting layout is sketched below.
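
Before training, it is worth a quick check that everything landed where the repo expects it. A minimal sketch; the exact paths are assumptions based on the layout described above, not guaranteed by the repo:

import os

# sanity-check the data layout described above (paths assumed, adjust to your setup)
expected = [
    'data/sbdd/dataset',    # SBD training data
    'data/pascal/VOC2012',  # PASCAL VOC 2012 test data
    'data/sift-flow',       # SIFT Flow data
]
for p in expected:
    print('%-22s %s' % (p, 'OK' if os.path.isdir(p) else 'MISSING'))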

Preparing your own dataset

Training your own FCN segmentation model roughly takes three steps:

1. Create labels for your data;

2. Split your data into train, val, and test sets;

3. Write your own input data layer, modeled on voc_layers.py (a minimal sketch follows this list).
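
For step 3, here is a minimal sketch of such a layer, modeled on the structure of voc_layers.py in the FCN repository. The img/ and label/ subfolder names and the param_str keys are illustrative assumptions, not the repo's exact interface:

import caffe
import numpy as np
from PIL import Image


class MySegDataLayer(caffe.Layer):
    """Load one (image, label) pair per forward pass, i.e. batch size 1."""

    def setup(self, bottom, top):
        # params come from the prototxt, e.g.
        # param_str: "{'data_dir': 'data/mydata', 'split': 'train', 'mean': (104.0, 116.7, 122.7)}"
        params = eval(self.param_str)
        self.data_dir = params['data_dir']
        self.split = params['split']
        self.mean = np.array(params['mean'])
        split_f = '{}/{}.txt'.format(self.data_dir, self.split)
        self.indices = open(split_f, 'r').read().splitlines()
        self.idx = 0

    def reshape(self, bottom, top):
        # load one image + label and reshape the tops to fit them
        self.data = self.load_image(self.indices[self.idx])
        self.label = self.load_label(self.indices[self.idx])
        top[0].reshape(1, *self.data.shape)
        top[1].reshape(1, *self.label.shape)

    def forward(self, bottom, top):
        top[0].data[...] = self.data
        top[1].data[...] = self.label
        self.idx += 1
        if self.idx == len(self.indices):
            self.idx = 0

    def backward(self, top, propagate_down, bottom):
        pass  # a data layer has no backward pass

    def load_image(self, idx):
        im = Image.open('{}/img/{}.jpg'.format(self.data_dir, idx))
        in_ = np.array(im, dtype=np.float32)
        in_ = in_[:, :, ::-1]              # RGB -> BGR
        in_ -= self.mean                   # subtract the dataset mean
        return in_.transpose((2, 0, 1))    # H x W x C -> C x H x W

    def load_label(self, idx):
        label = np.array(Image.open('{}/label/{}.png'.format(self.data_dir, idx)),
                         dtype=np.uint8)
        return label[np.newaxis, ...]      # 1 x H x W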

FCN places no restriction on image size. If the images in your dataset vary in size, you can only train on one image at a time, which is the FCN code's default setting (batch_size = 1). To train in batches, however, all images must have the same size, so we resize them, typically to 256x256 or 500x500.

1. Resizing images

Here are a few resize functions, adapted from http://blog.csdn.net/u010402786/article/details/72883421:
(1) Resizing a single image

from PIL import Image

def convert(width, height):
    im = Image.open("C:\\xxx\\test.jpg")
    out = im.resize((width, height), Image.ANTIALIAS)
    out.save("C:\\xxx\\test.jpg")

if __name__ == '__main__':
    convert(256, 256)

(2) Resizing every image in a folder

import os
from PIL import Image

def convert(dir, width, height):
    file_list = os.listdir(dir)
    print(file_list)
    for filename in file_list:
        path = os.path.join(dir, filename)
        im = Image.open(path)
        out = im.resize((width, height), Image.ANTIALIAS)
        print("%s has been resized!" % filename)
        out.save(path)

if __name__ == '__main__':
    dir = raw_input('please input the operate dir:')
    convert(dir, 256, 256)

(3) Resizing while preserving the aspect ratio

from PIL import Image

def convert(width, height):
    im = Image.open("C:\\workspace\\PythonLearn1\\test_1.jpg")
    (x, y) = im.size
    x_s = width
    y_s = int(y * x_s / x)  # scale height by the same factor as width
    out = im.resize((x_s, y_s), Image.ANTIALIAS)
    out.save("C:\\workspace\\PythonLearn1\\test_1_out.jpg")

if __name__ == '__main__':
    convert(256, 256)

Creating the image labels

Step 1: annotate with the open-source labelme tool

地址:https://github.com/wkentaro/labelme

Usage

Annotation

Run labelme --help for detail.

labelme  # Open GUI
labelme static/apc2016_obj3.jpg  # Specify file
labelme static/apc2016_obj3.jpg -O static/apc2016_obj3.json  # Close window after the save

The annotations are saved as a JSON file. The file includes the image itself.
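
Since the image is embedded in the JSON (labelme stores it base64-encoded under the imageData key), you can recover it without the original file. A small sketch, reusing the example file name from above:

import base64
import json

# recover the embedded image from a labelme annotation file
with open('static/apc2016_obj3.json') as f:
    ann = json.load(f)
with open('restored.jpg', 'wb') as f:
    f.write(base64.b64decode(ann['imageData']))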

Visualization

To view the JSON file quickly, you can use the utility script:

labelme_draw_json static/apc2016_obj3.json

Convert to Dataset

To convert the JSON file to a set of image and label files, run the following:

labelme_json_to_dataset static/apc2016_obj3.json

Step 2: colorize the generated label.png
After the annotation tool converts the JSON file to a dataset, it produces a label.png file: a 16-bit grayscale image.
We therefore need to color it according to the VOC segmentation palette, and the colors must match exactly. MATLAB code:

function cmap = labelcolormap(N)

if nargin==0
    N = 256;
end
cmap = zeros(N,3);
for i = 1:N
    id = i-1; r = 0; g = 0; b = 0;
    for j = 0:7
        r = bitor(r, bitshift(bitget(id,1), 7 - j));
        g = bitor(g, bitshift(bitget(id,2), 7 - j));
        b = bitor(b, bitshift(bitget(id,3), 7 - j));
        id = bitshift(id, -3);
    end
    cmap(i,1) = r; cmap(i,2) = g; cmap(i,3) = b;
end
cmap = cmap / 255;

Or in Python:

import numpy as np

# Get the specified bit value
def bitget(byteval, idx):
    return ((byteval & (1 << idx)) != 0)

# Create label-color map, label --- [R G B]
#  0 --- [  0   0   0],  1 --- [128   0   0],  2 --- [  0 128   0]
#  3 --- [128 128   0],  4 --- [  0   0 128],  5 --- [128   0 128]
#  6 --- [  0 128 128],  7 --- [128 128 128],  8 --- [ 64   0   0]
#  9 --- [192   0   0], 10 --- [ 64 128   0], 11 --- [192 128   0]
# 12 --- [ 64   0 128], 13 --- [192   0 128], 14 --- [ 64 128 128]
# 15 --- [192 128 128], 16 --- [  0  64   0], 17 --- [128  64   0]
# 18 --- [  0 192   0], 19 --- [128 192   0], 20 --- [  0  64 128]
def labelcolormap(N=256):
    color_map = np.zeros((N, 3))
    for n in xrange(N):
        id_num = n
        r, g, b = 0, 0, 0
        for pos in xrange(8):
            r = np.bitwise_or(r, (bitget(id_num, 0) << (7 - pos)))
            g = np.bitwise_or(g, (bitget(id_num, 1) << (7 - pos)))
            b = np.bitwise_or(b, (bitget(id_num, 2) << (7 - pos)))
            id_num = (id_num >> 3)
        color_map[n, 0] = r
        color_map[n, 1] = g
        color_map[n, 2] = b
    return color_map / 255

if __name__ == "__main__":
    color_map = labelcolormap(21)
    print color_map

This generates the following matrix (taking the Python output as an example):

[[ 0.          0.          0.        ]
 [ 0.50196078  0.          0.        ]
 [ 0.          0.50196078  0.        ]
 [ 0.50196078  0.50196078  0.        ]
 [ 0.          0.          0.50196078]
 [ 0.50196078  0.          0.50196078]
 [ 0.          0.50196078  0.50196078]
 [ 0.50196078  0.50196078  0.50196078]
 [ 0.25098039  0.          0.        ]
 [ 0.75294118  0.          0.        ]
 [ 0.25098039  0.50196078  0.        ]
 [ 0.75294118  0.50196078  0.        ]
 [ 0.25098039  0.          0.50196078]
 [ 0.75294118  0.          0.50196078]
 [ 0.25098039  0.50196078  0.50196078]
 [ 0.75294118  0.50196078  0.50196078]
 [ 0.          0.25098039  0.        ]
 [ 0.50196078  0.25098039  0.        ]
 [ 0.          0.75294118  0.        ]
 [ 0.50196078  0.75294118  0.        ]
 [ 0.          0.25098039  0.50196078]]

These rows correspond to the PASCAL VOC colormap:

background    0   0   0
aeroplane   128   0   0
bicycle       0 128   0
bird        128 128   0
boat          0   0 128
bottle      128   0 128
bus           0 128 128
car         128 128 128
cat          64   0   0
chair       192   0   0
cow          64 128   0
diningtable 192 128   0
dog          64   0 128
horse       192   0 128
motorbike    64 128 128
person      192 128 128
pottedplant   0  64   0
sheep       128  64   0
sofa          0 192   0
train       128 192   0
tvmonitor     0  64 128

The function above generates the color for each label, where the labels are the integers 0, 1, 2, ..., 20 (PASCAL VOC has 21 classes including background).
The label.png generated in step 1 contains exactly these values, with at most 256 distinct values, which is why it is stored as a grayscale image.
We therefore need to use this colormap to convert that grayscale image into an RGB image.

Method 1: modify skimage's colormap
skimage already ships with some colormaps, but they do not follow the PASCAL VOC scheme, so we have to specify our own.
Locate the following path:

/*/anaconda2/lib/python2.7/site-packages/skimage/color/

Edit colorlabel.py and add:

DEFAULT_COLORS1 = ('maroon', 'lime', 'olive', 'navy', 'purple', 'teal',
                   'gray', 'fcncat', 'fcnchair', 'fcncow', 'fcndining',
                   'fcndog', 'fcnhorse', 'fcnmotor', 'fcnperson', 'fcnpotte',
                   'fcnsheep', 'fcnsofa', 'fcntrain', 'fcntv')

Then modify the _label2rgb_overlay function:

if colors is None:
    colors = DEFAULT_COLORS1

Finally, add the following variables to rgb_colors.py:

fcnchair = (0.753, 0, 0)
fcncat = (0.251, 0, 0)
fcncow = (0.251, 0.502, 0)
fcndining = (0.753, 0.502, 0)
fcndog = (0.251, 0, 0.502)
fcnhorse = (0.753, 0, 0.502)
fcnmotor = (0.251, 0.502, 0.502)
fcnperson = (0.753, 0.502, 0.502)
fcnpotte = (0, 0.251, 0)
fcnsheep = (0.502, 0.251, 0)
fcnsofa = (0, 0.753, 0)
fcntrain = (0.502, 0.753, 0)
fcntv = (0, 0.251, 0.502)

If this seems like too much trouble, just download https://github.com/315386775/FCN_train
and replace skimage's color folder with the skimge-color folder under Add_colortoimg.
Then run the conversion:

#!usr/bin/python
# -*- coding:utf-8 -*-
import PIL.Image
import numpy as np
from skimage import io, color

img = PIL.Image.open('xxx.png')
img = np.array(img)
dst = color.label2rgb(img, bg_label=0, bg_color=(0, 0, 0))
io.imsave('xxx.png', dst)

Method 2: without modifying the source code

#!usr/bin/python
# -*- coding:utf-8 -*-
import PIL.Image
import numpy as np
from skimage import io, color

# Get the specified bit value
def bitget(byteval, idx):
    return ((byteval & (1 << idx)) != 0)

# Create label-color map, label --- [R G B]
# (same mapping as listed above)
def labelcolormap(N=256):
    color_map = np.zeros((N, 3))
    for n in xrange(N):
        id_num = n
        r, g, b = 0, 0, 0
        for pos in xrange(8):
            r = np.bitwise_or(r, (bitget(id_num, 0) << (7 - pos)))
            g = np.bitwise_or(g, (bitget(id_num, 1) << (7 - pos)))
            b = np.bitwise_or(b, (bitget(id_num, 2) << (7 - pos)))
            id_num = (id_num >> 3)
        color_map[n, 0] = r
        color_map[n, 1] = g
        color_map[n, 2] = b
    return color_map / 255

color_map = labelcolormap(21)

img = PIL.Image.open('label.png')
img = np.array(img)
dst = color.label2rgb(img, colors=color_map[1:], bg_label=0, bg_color=(0, 0, 0))
io.imsave('xxx.png', dst)

This method loads the colormap directly and is simpler and clearer.

Note that method 1 slightly alters some colors: for example, the second color in DEFAULT_COLORS1 should be (0, 128, 0), i.e. (0, 0.502, 0), which skimage calls green, but lime = (0, 1, 0) is used instead. The difference is minor.

Step 3: the most critical step
Convert the 24-bit PNG into an 8-bit PNG. MATLAB code:

dirs = dir('F:/xxx/*.png');
map = labelcolormap(256);
for n = 1:numel(dirs)
    strname = strcat('F:/xxx/', dirs(n).name);
    img = imread(strname);
    x = rgb2ind(img, map);
    newname = strcat('F:/xxx/', dirs(n).name);
    imwrite(x, map, newname, 'png');
end
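
If you prefer to stay in Python, the same conversion can be sketched with numpy and PIL. This assumes the labels were colored with the exact VOC palette above; the file names are placeholders:

import numpy as np
from PIL import Image

def voc_palette(N=256):
    # the same bit-twiddling colormap as above, as 0-255 uint8 values
    pal = np.zeros((N, 3), dtype=np.uint8)
    for n in range(N):
        c, r, g, b = n, 0, 0, 0
        for j in range(8):
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        pal[n] = (r, g, b)
    return pal

pal = voc_palette()
rgb = np.array(Image.open('label_rgb.png').convert('RGB'))

# map each RGB pixel back to its class index by exact color match
idx = np.zeros(rgb.shape[:2], dtype=np.uint8)
for i, c in enumerate(pal[:21]):
    idx[np.all(rgb == c, axis=-1)] = i

# save as an 8-bit indexed PNG that carries the VOC palette
out = Image.fromarray(idx, mode='P')
out.putpalette(pal.flatten().tolist())
out.save('label_8bit.png')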

At this point we have generated the 8-bit color image.

Note: read one of the generated images back and check whether the output below matches what VOC produces.

In [23]: img = PIL.Image.open('F:/DL/000001_json/test/dstfcn.png')
In [24]: np.unique(img)
Out[24]: array([0, 1, 2], dtype=uint8)

Pay particular attention to the [0, 1, 2] output: if it appears, the labels were generated successfully.

To recap the pipeline so far: generate the grayscale label image -> generate the colormap -> convert to RGB -> convert to an 8-bit indexed image.

Next, prepare the following files for training:
test.txt is the test set, train.txt the training set, val.txt the validation set, and trainval.txt the combined train + validation set.
You can follow the proportions used with Faster R-CNN: in VOC2007, trainval is about 50% of the whole dataset and test the other 50%; train is about 50% of trainval and val the rest. The following code can serve as a reference.

Reference: http://blog.csdn.net/sinat_30071459/article/details/50723212

%% This script uses the generated XML annotations to create trainval.txt, train.txt,
% test.txt and val.txt as in the VOC2007 dataset.
% trainval is 50% of the whole dataset, test the other 50%;
% train is 50% of trainval, val the other 50%.
% Adjust these percentages for your own dataset; if the dataset is small,
% test and val can be smaller.
%% Modify the following four values as needed
xmlfilepath = 'E:\Annotations';
txtsavepath = 'E:\ImageSets\Main\';
trainval_percent = 0.5; % fraction of the whole dataset in trainval; the rest is test
train_percent = 0.5;    % fraction of trainval in train; the rest is val

%%
xmlfile = dir(xmlfilepath);
numOfxml = length(xmlfile) - 2; % subtract . and .. to get the dataset size

trainval = sort(randperm(numOfxml, floor(numOfxml * trainval_percent)));
test = sort(setdiff(1:numOfxml, trainval));

trainvalsize = length(trainval); % size of trainval
train = sort(trainval(randperm(trainvalsize, floor(trainvalsize * train_percent))));
val = sort(setdiff(trainval, train));

ftrainval = fopen([txtsavepath 'trainval.txt'], 'w');
ftest = fopen([txtsavepath 'test.txt'], 'w');
ftrain = fopen([txtsavepath 'train.txt'], 'w');
fval = fopen([txtsavepath 'val.txt'], 'w');

for i = 1:numOfxml
    if ismember(i, trainval)
        fprintf(ftrainval, '%s\n', xmlfile(i+2).name(1:end-4));
        if ismember(i, train)
            fprintf(ftrain, '%s\n', xmlfile(i+2).name(1:end-4));
        else
            fprintf(fval, '%s\n', xmlfile(i+2).name(1:end-4));
        end
    else
        fprintf(ftest, '%s\n', xmlfile(i+2).name(1:end-4));
    end
end
fclose(ftrainval);
fclose(ftrain);
fclose(fval);
fclose(ftest);

That script works from the XML annotations; for segmentation we can work directly from the image folder instead, as in the sketch below.
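
A minimal Python sketch of the same split driven directly by the image folder; the folder names are placeholders and the 50/50 ratios mirror the MATLAB script above:

import os
import random

img_dir = 'JPEGImages'              # folder holding the images
out_dir = 'ImageSets/Segmentation'  # where the split files are written

names = [os.path.splitext(f)[0] for f in sorted(os.listdir(img_dir))
         if f.endswith(('.jpg', '.png'))]
random.seed(0)  # make the split reproducible
random.shuffle(names)

half = len(names) // 2
trainval, test = names[:half], names[half:]
q = len(trainval) // 2
train, val = trainval[:q], trainval[q:]

for split, items in [('trainval', trainval), ('test', test),
                     ('train', train), ('val', val)]:
    with open(os.path.join(out_dir, split + '.txt'), 'w') as f:
        f.write('\n'.join(sorted(items)) + '\n')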

Coloring the test results

This step mainly amounts to modifying infer.py.
Method 1:

import numpy as np
from PIL import Image
import caffe

# load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
im = Image.open('pascal/VOC2010/JPEGImages/2007_000129.jpg')
in_ = np.array(im, dtype=np.float32)
in_ = in_[:, :, ::-1]
in_ -= np.array((104.00698793, 116.66876762, 122.67891434))
in_ = in_.transpose((2, 0, 1))

# load net
net = caffe.Net('voc-fcn8s/deploy.prototxt', 'voc-fcn8s/fcn8s-heavy-pascal.caffemodel', caffe.TEST)
# shape for input (data blob is N x C x H x W), set data
net.blobs['data'].reshape(1, *in_.shape)
net.blobs['data'].data[...] = in_
# run net and take argmax for prediction
net.forward()
out = net.blobs['score'].data[0].argmax(axis=0)

arr = out.astype(np.uint8)
im = Image.fromarray(arr)
palette = []
for i in range(256):
    palette.extend((i, i, i))
palette[:3*21] = np.array([[0, 0, 0],
                           [128, 0, 0],
                           [0, 128, 0],
                           [128, 128, 0],
                           [0, 0, 128],
                           [128, 0, 128],
                           [0, 128, 128],
                           [128, 128, 128],
                           [64, 0, 0],
                           [192, 0, 0],
                           [64, 128, 0],
                           [192, 128, 0],
                           [64, 0, 128],
                           [192, 0, 128],
                           [64, 128, 128],
                           [192, 128, 128],
                           [0, 64, 0],
                           [128, 64, 0],
                           [0, 192, 0],
                           [128, 192, 0],
                           [0, 64, 128]], dtype='uint8').flatten()
im.putpalette(palette)
im.show()
im.save('test.png')

Or use the same approach as in the data preparation step:

import numpy as np
from PIL import Image

import caffe

from scipy.misc import imsave
from skimage.color import label2rgb

# Get the specified bit value
def bitget(byteval, idx):
    return ((byteval & (1 << idx)) != 0)

# Create label-color map, label --- [R G B]
# (same mapping as listed in the data preparation section)
def labelcolormap(N=256):
    color_map = np.zeros((N, 3))
    for n in xrange(N):
        id_num = n
        r, g, b = 0, 0, 0
        for pos in xrange(8):
            r = np.bitwise_or(r, (bitget(id_num, 0) << (7 - pos)))
            g = np.bitwise_or(g, (bitget(id_num, 1) << (7 - pos)))
            b = np.bitwise_or(b, (bitget(id_num, 2) << (7 - pos)))
            id_num = (id_num >> 3)
        color_map[n, 0] = r
        color_map[n, 1] = g
        color_map[n, 2] = b
    return color_map

def main():
    # load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
    im = Image.open('data/pascal/VOCdevkit/VOC2012/JPEGImages/2007_000346.jpg')
    in_ = np.array(im, dtype=np.float32)
    in_ = in_[:, :, ::-1]
    in_ -= np.array((104.00698793, 116.66876762, 122.67891434))
    in_ = in_.transpose((2, 0, 1))

    # load net
    net = caffe.Net('voc-fcn8s/deploy.prototxt', 'ilsvrc-nets/fcn8s-heavy-pascal.caffemodel', caffe.TEST)
    # shape for input (data blob is N x C x H x W), set data
    net.blobs['data'].reshape(1, *in_.shape)
    net.blobs['data'].data[...] = in_
    # run net and take argmax for prediction
    net.forward()
    out = net.blobs['score'].data[0].argmax(0).astype(np.uint8)

    color_map = labelcolormap(21)
    label_mask = label2rgb(out, colors=color_map[1:], bg_label=0)
    label_mask[out == 0] = [0, 0, 0]
    imsave('data/pascal/VOCdevkit/VOC2012/JPEGImages/test_prediction.png', label_mask.astype(np.uint8))

if __name__ == '__main__':
    main()

