[CNTK Quick Start] CNTK Image Classification: From Custom Model to Testing

Published: 2025/3/20

This article was first published on the WeChat official account 有三AI.

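Welcome to the column "Mastering Open-Source Frameworks in 2 Hours". This is the seventh installment; the earlier ones covered Caffe, TensorFlow, PyTorch, MXNet, Keras, and PaddlePaddle.

Today's topic is CNTK. The data and code used in this article are available in our official Git repository:

https://github.com/longpeng2008/LongPeng_ML_Course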

Author | 言有三

Editor | 言有三

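01 What Is CNTK

Repository: https://github.com/Microsoft/CNTK

CNTK is Microsoft's open-source deep learning toolkit. It describes a neural network as a series of computational steps in a directed graph. In that graph, leaf nodes represent input values or network parameters, and all other nodes represent matrix operations on their inputs.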
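
To make the directed-graph description concrete, here is a minimal pure-Python sketch. It is not CNTK code: the `Node` class and the `matvec` and `add` helpers are hypothetical names invented for this illustration. Leaf nodes store an input or a parameter, and interior nodes apply an operation to the values of their inputs.

```python
# Toy computation graph: leaves hold values, interior nodes hold operations.
# Illustrative sketch only; this is not how CNTK itself is implemented.

class Node:
    def __init__(self, op=None, inputs=(), value=None):
        self.op = op            # callable for interior nodes, None for leaves
        self.inputs = inputs    # upstream nodes this node depends on
        self.value = value      # stored value for leaf nodes

    def eval(self):
        if self.op is None:     # leaf: input value or parameter
            return self.value
        args = [n.eval() for n in self.inputs]
        return self.op(*args)


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def add(a, b):
    """Element-wise vector addition."""
    return [ai + bi for ai, bi in zip(a, b)]


x = Node(value=[1.0, 2.0])                    # input leaf
w = Node(value=[[1.0, 0.0], [0.5, 0.5]])      # parameter leaf
b = Node(value=[0.1, 0.2])                    # parameter leaf
z = Node(op=add, inputs=(Node(op=matvec, inputs=(w, x)), b))

result = z.eval()   # evaluates w*x + b by walking the graph
print(result)
```

Evaluating `z` recursively pulls values from the leaves, which is the same idea a framework uses for its forward pass before adding automatic differentiation on top.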

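CNTK lets users implement and combine popular model types with little effort, including feed-forward DNNs, convolutional networks (CNN), and recurrent networks (RNN/LSTM). Like most current frameworks, it implements automatic differentiation and optimizes with stochastic gradient descent.

What distinguishes CNTK?

1.1 High performance

According to the official claims, CNTK outperforms the other open-source frameworks.

In my own experiments, training did indeed turn out to be fairly fast.

1.2 Well suited to speech

CNTK was open-sourced by Microsoft's speech team, so it is naturally a good fit for speech tasks. Using models such as RNNs, or convolving separately along the time and spatial dimensions, is very easy.

That said, today's Python-backed frameworks have become much alike, and eventual unification is not out of the question.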

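02 Training a CNTK Model

Installation takes a single pip command; choose either the CPU or the GPU build.

pip install cntk        # CPU build
pip install cntk-gpu    # GPU build

Next come data preparation, model definition, and saving and analyzing the results.

Before that, let's look at the official classification example to get a feel for the API. The code is fairly long.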

from __future__ import print_function

import numpy as np
import cntk as C
from cntk.learners import sgd
from cntk.logging import ProgressPrinter
from cntk.layers import Dense, Sequential


def generate_random_data(sample_size, feature_dim, num_classes):
    # Create synthetic data using NumPy.
    Y = np.random.randint(size=(sample_size, 1), low=0, high=num_classes)

    # Make sure that the data is separable
    X = (np.random.randn(sample_size, feature_dim) + 3) * (Y + 1)
    X = X.astype(np.float32)
    # converting class 0 into the vector "1 0 0",
    # class 1 into vector "0 1 0", ...
    class_ind = [Y == class_number for class_number in range(num_classes)]
    Y = np.asarray(np.hstack(class_ind), dtype=np.float32)
    return X, Y


def ffnet():
    inputs = 2
    outputs = 2
    layers = 2
    hidden_dimension = 50

    # input variables denoting the features and label data
    features = C.input_variable((inputs), np.float32)
    label = C.input_variable((outputs), np.float32)

    # Instantiate the feedforward classification model
    my_model = Sequential([
        Dense(hidden_dimension, activation=C.sigmoid),
        Dense(outputs)])
    z = my_model(features)

    ce = C.cross_entropy_with_softmax(z, label)
    pe = C.classification_error(z, label)

    # Instantiate the trainer object to drive the model training
    lr_per_minibatch = C.learning_parameter_schedule(0.125)
    progress_printer = ProgressPrinter(0)
    trainer = C.Trainer(z, (ce, pe), [sgd(z.parameters, lr=lr_per_minibatch)], [progress_printer])

    # Get minibatches of training data and perform model training
    minibatch_size = 25
    num_minibatches_to_train = 1024

    aggregate_loss = 0.0
    for i in range(num_minibatches_to_train):
        train_features, labels = generate_random_data(minibatch_size, inputs, outputs)
        # Specify the mapping of input variables in the model to actual minibatch data to be trained with
        trainer.train_minibatch({features: train_features, label: labels})
        sample_count = trainer.previous_minibatch_sample_count
        aggregate_loss += trainer.previous_minibatch_loss_average * sample_count

    last_avg_error = aggregate_loss / trainer.total_number_of_samples_seen

    test_features, test_labels = generate_random_data(minibatch_size, inputs, outputs)
    avg_error = trainer.test_minibatch({features: test_features, label: test_labels})
    print(' error rate on an unseen minibatch: {}'.format(avg_error))
    return last_avg_error, avg_error


np.random.seed(98052)
ffnet()

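The code above is a two-layer fully connected network. It wraps the data with input_variable, defines the model with Sequential, and feeds data with train_minibatch({features: train_features, label: labels}). This works the same way as in TensorFlow, PyTorch, and other frameworks, so there is little more to say about it.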
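
The one-hot label conversion that generate_random_data performs with NumPy can be seen in isolation in the pure-Python sketch below. The helper name `one_hot` is an assumption made for this illustration and is not part of CNTK.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows, e.g. class 1 of 3 -> [0, 1, 0]."""
    return [[1.0 if label == c else 0.0 for c in range(num_classes)]
            for label in labels]


encoded = one_hot([0, 2, 1], num_classes=3)
for row in encoded:
    print(row)
```

Each row has exactly one 1.0, at the index of its class, which is the label layout that cross_entropy_with_softmax receives in the example above.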

2.1 Reading data

This step uses the interfaces C.io.ImageDeserializer, C.io.StreamDefs, and C.io.StreamDef.

ImageDeserializer can directly read a txt file in the following format for image classification.

../../../../datas/mouth/1/182smile.jpg	1
../../../../datas/mouth/1/435smile.jpg	1
../../../../datas/mouth/0/40neutral.jpg	0
../../../../datas/mouth/1/206smile.jpg	1

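Note that the separator between the image path and the label is '\t'. This is the same as MXNet and different from Caffe. The complete parsing code is as follows: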

C.io.MinibatchSource(C.io.ImageDeserializer(map_file, C.io.StreamDefs(
    features=C.io.StreamDef(field='image', transforms=transforms),
    labels=C.io.StreamDef(field='label', shape=num_classes)
)))
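
A tab-separated map file of the kind read by ImageDeserializer could be generated with a short script like the one below. This is a sketch under assumptions: the function name `write_map_file` is hypothetical, and it assumes one subdirectory per class whose name is the integer label, as in the datas/mouth/0 and datas/mouth/1 paths shown earlier. The demo uses empty placeholder files in a temporary directory rather than real images.

```python
import os
import tempfile


def write_map_file(image_root, map_path, extensions=(".jpg", ".png")):
    """Write one '<path>\\t<label>' line per image; subdirectory names are the labels."""
    count = 0
    with open(map_path, "w") as f:
        for label in sorted(os.listdir(image_root)):
            class_dir = os.path.join(image_root, label)
            if not os.path.isdir(class_dir):
                continue  # skip stray files such as the map file itself
            for name in sorted(os.listdir(class_dir)):
                if name.lower().endswith(extensions):
                    f.write("{}\t{}\n".format(os.path.join(class_dir, name), label))
                    count += 1
    return count


# Demo on a throwaway directory tree with empty placeholder files.
demo_root = tempfile.mkdtemp()
for label, name in [("0", "40neutral.jpg"), ("1", "182smile.jpg"), ("1", "435smile.jpg")]:
    os.makedirs(os.path.join(demo_root, label), exist_ok=True)
    open(os.path.join(demo_root, label, name), "w").close()

demo_map = os.path.join(demo_root, "train.txt")
num_written = write_map_file(demo_root, demo_map)
with open(demo_map) as f:
    demo_lines = f.read().splitlines()
print(num_written, "lines written")
```

Writing the tab explicitly with "\t" avoids the most common failure with this format, which is a map file separated by spaces.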

The image stream is declared with a transforms argument, so data preprocessing can be done at this point.

Commonly used crop and scale transforms look like this:

# "transform" here is assumed to be an alias: import cntk.io.transforms as transform
transform.crop(crop_type='randomside', side_ratio=0.8)
transform.scale(width=image_width, height=image_height, channels=num_channels, interpolations='linear')

C.io.MinibatchSource returns a data reader that can be used directly for training.

2.2 Network definition

The definition closely resembles TensorFlow and PyTorch:

def simpleconv3(input, out_dims):
    with C.layers.default_options(init=C.glorot_uniform(), activation=C.relu):
        net = C.layers.Convolution((3,3), 12, pad=True)(input)
        net = C.layers.MaxPooling((3,3), strides=(2,2))(net)

        net = C.layers.Convolution((3,3), 24, pad=True)(net)
        net = C.layers.MaxPooling((3,3), strides=(2,2))(net)

        net = C.layers.Convolution((3,3), 48, pad=True)(net)
        net = C.layers.MaxPooling((3,3), strides=(2,2))(net)

        net = C.layers.Dense(128)(net)
        net = C.layers.Dense(out_dims, activation=None)(net)

    return net
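
To see how the feature maps shrink through simpleconv3, the shape arithmetic can be worked out by hand. The sketch below makes two assumptions that the article does not state: the input is 48x48 pixels, and MaxPooling uses no padding (its default), so its output size is floor((size - window) / stride) + 1. The helper names are hypothetical.

```python
def conv_same(size):
    """3x3 convolution with stride 1 and pad=True: spatial size is unchanged."""
    return size


def pool_valid(size, window=3, stride=2):
    """Pooling without padding: floor((size - window) / stride) + 1."""
    return (size - window) // stride + 1


def simpleconv3_shapes(height, width):
    """Return (channels, height, width) after each conv + pool stage."""
    shapes = []
    h, w = height, width
    for channels in (12, 24, 48):
        h, w = conv_same(h), conv_same(w)
        h, w = pool_valid(h), pool_valid(w)
        shapes.append((channels, h, w))
    return shapes


shapes = simpleconv3_shapes(48, 48)   # 48x48 is an assumed input size
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes, flattened)
```

Under these assumptions the first Dense layer receives 48 x 5 x 5 = 1200 inputs. A different input size changes only the spatial numbers, not the channel counts.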

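2.3 Defining the loss function and the classification error metric

In the code below, model_func is the network defined above, and input_var_norm and label_var are the data and the labels respectively.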

z = model_func(input_var_norm, out_dims=2)
ce = C.cross_entropy_with_softmax(z, label_var)
pe = C.classification_error(z, label_var)
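
For a single sample, the two quantities above can be computed by hand. The following pure-Python sketch shows the standard definitions of softmax cross-entropy and top-1 error. The function names `softmax_cross_entropy` and `top1_error` are hypothetical, and the code illustrates the math rather than CNTK's implementation.

```python
import math


def softmax_cross_entropy(logits, one_hot_label):
    """-sum(t_i * log(softmax(z)_i)), computed stably via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return sum(t * (log_sum_exp - v) for v, t in zip(logits, one_hot_label))


def top1_error(logits, one_hot_label):
    """0.0 if the highest-scoring class matches the label, else 1.0."""
    predicted = max(range(len(logits)), key=lambda i: logits[i])
    actual = max(range(len(one_hot_label)), key=lambda i: one_hot_label[i])
    return 0.0 if predicted == actual else 1.0


loss = softmax_cross_entropy([2.0, 0.0], [1.0, 0.0])
err = top1_error([2.0, 0.0], [1.0, 0.0])
wrong_err = top1_error([0.0, 2.0], [1.0, 0.0])
print(loss, err, wrong_err)
```

The loss is what the optimizer minimizes, while the error is only reported, which is why the two are passed as a (ce, pe) pair to the trainer.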

2.4 Training parameters

These are the learning rate, the optimizer, the number of epochs, and similar settings.

epoch_size = 900
minibatch_size = 64

lr_per_minibatch = C.learning_rate_schedule([0.01]*100 + [0.003]*100 + [0.001],
                                            C.UnitType.minibatch, epoch_size)
m = C.momentum_schedule(0.9)
l2_reg_weight = 0.001

learner = C.momentum_sgd(z.parameters,
                         lr=lr_per_minibatch,
                         momentum=m,
                         l2_regularization_weight=l2_reg_weight)
progress_printer = C.logging.ProgressPrinter(tag='Training', num_epochs=max_epochs)
trainer = C.Trainer(z, (ce, pe), [learner], [progress_printer])

Note that the learning rate is configured flexibly through the learning_rate_schedule interface. The call C.learning_rate_schedule([0.01]*100 + [0.003]*100 + [0.001], ...) above means: use a learning rate of 0.01 for the first 100 epochs, 0.003 for the next 100 epochs, and 0.001 from then on.
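
The lookup implied by that schedule can be sketched in plain Python. The function `lr_for_sample` is a hypothetical helper, and it assumes each list entry stays in effect for epoch_size samples with the last entry persisting afterwards, which is how the schedule is described above.

```python
def lr_for_sample(schedule, epoch_size, samples_seen):
    """Pick the learning rate in effect after `samples_seen` training samples."""
    index = min(samples_seen // epoch_size, len(schedule) - 1)
    return schedule[index]


schedule = [0.01] * 100 + [0.003] * 100 + [0.001]
epoch_size = 900

lr_first = lr_for_sample(schedule, epoch_size, 0)                  # start of training
lr_end_99 = lr_for_sample(schedule, epoch_size, 100 * 900 - 1)     # last sample of epoch index 99
lr_epoch_100 = lr_for_sample(schedule, epoch_size, 100 * 900)      # first sample of epoch index 100
lr_late = lr_for_sample(schedule, epoch_size, 500 * 900)           # far past the end of the list
print(lr_first, lr_end_99, lr_epoch_100, lr_late)
```

Because the list is indexed by epoch, changing epoch_size shifts when each rate takes effect in terms of samples without touching the list itself.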

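2.5 Training and saving

Use the data reader's next_minibatch to fetch training data and the trainer's train_minibatch to run a training step. CNTK clearly places heavy emphasis on the minibatch concept, although the learning rate and optimizer settings can in fact also be specified per sample.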

for epoch in range(max_epochs):
    sample_count = 0
    while sample_count < epoch_size:
        data = reader_train.next_minibatch(min(minibatch_size, epoch_size - sample_count), input_map=input_map)
        trainer.train_minibatch(data)
        # advance the counter; without this the inner loop never terminates
        sample_count += trainer.previous_minibatch_sample_count
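
The expression min(minibatch_size, epoch_size - sample_count) in the loop caps the final minibatch so that an epoch never reads more than epoch_size samples. The standalone sketch below reproduces only that bookkeeping with the values used in this article; `minibatch_sizes` is a hypothetical helper and no CNTK objects are involved.

```python
def minibatch_sizes(epoch_size, minibatch_size):
    """List the minibatch sizes requested during one epoch."""
    sizes = []
    sample_count = 0
    while sample_count < epoch_size:
        size = min(minibatch_size, epoch_size - sample_count)
        sizes.append(size)
        sample_count += size   # the counter must advance or the loop never ends
    return sizes


sizes = minibatch_sizes(900, 64)
print(len(sizes), sizes[-1], sum(sizes))
```

With epoch_size = 900 and minibatch_size = 64, an epoch consists of fourteen full minibatches followed by one short minibatch of four samples.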

Saving the model takes a single line:

z.save("simpleconv3.dnn")

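2.6 Visualization

There is not much to visualize, only the loss curve and the accuracy curve, so you can simply add the plotting code yourself. The original post showed the final training loss of the model above as a figure; better hyperparameters are left for you to tune.

03 Testing a CNTK Model

Testing means loading the model, applying the same preprocessing as during training, and running a forward pass.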

import sys

import cv2
import numpy as np
import cntk as C
from PIL import Image

model_file = sys.argv[1]
image_list = sys.argv[2]
model = C.load_model(model_file)

# image_width and image_height must be set to the input size used in training
count = 0
acc = 0
imagepaths = open(image_list, 'r').readlines()
for imagepath in imagepaths:
    imagepath, label = imagepath.strip().split('\t')
    im = Image.open(imagepath)
    print(imagepath)
    print("im size", im.size)
    image_data = np.array(im, dtype=np.float32)
    image_data = cv2.resize(image_data, (image_width, image_height))
    # reorder from height x width x channels to channels x height x width
    image_data = np.ascontiguousarray(np.transpose(image_data, (2, 0, 1)))
    output = model.eval({model.arguments[0]: [image_data]})[0]
    print(output)
    print(label, np.argmax(np.squeeze(output)))
    if str(label) == str(np.argmax(np.squeeze(output))):
        acc = acc + 1
    count = count + 1
print("acc=", float(acc) / float(count))
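
The accuracy bookkeeping in the test script can be isolated into a small function. The sketch below uses made-up score vectors in place of real model outputs, and the helper names `argmax` and `accuracy` are hypothetical. Labels are strings because they come from the tab-separated list file.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(outputs, labels):
    """Fraction of samples whose top-scoring class equals the label."""
    correct = sum(1 for scores, label in zip(outputs, labels)
                  if argmax(scores) == int(label))
    return correct / float(len(labels))


# Made-up scores for four samples of a two-class problem.
outputs = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
labels = ["1", "0", "0", "0"]
acc = accuracy(outputs, labels)
print("acc=", acc)
```

Comparing integers instead of strings, as done here with int(label), avoids mismatches caused by stray whitespace in the list file.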

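The final model reaches 91% accuracy on the training set and 88% on the test set. Feel free to experiment further on your own.

Summary

Compared with TensorFlow and PyTorch, CNTK is certainly a niche choice, but it is still an excellent platform, especially for speech tasks. If you are interested, try it yourself. The code has been uploaded to https://github.com/longpeng2008/LongPeng_ML_Course.

To reprint this article, please contact us through the official account.

Infringement will be pursued.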

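All articles in this series:

Part 1: [Caffe Quick Start] Caffe Image Classification: From Custom Model to Testing

Part 2: [TensorFlow Quick Start] TensorFlow Image Classification: From Custom Model to Testing

Part 3: [PyTorch Quick Start] PyTorch Image Classification: From Custom Model to Testing

Part 4: [PaddlePaddle Quick Start] PaddlePaddle Image Classification: From Custom Model to Testing

Part 5: [Keras Quick Start] Keras Image Classification: From Custom Model to Testing

Part 6: [MXNet Quick Start] MXNet Image Classification: From Custom Model to Testing

Part 7: [CNTK Quick Start] CNTK Image Classification: From Custom Model to Testing

Part 8: [Chainer Quick Start] Chainer Image Classification: From Custom Model to Testing

Part 9: [DL4J Quick Start] Deeplearning4j Image Classification: From Custom Model to Testing

Part 10: [MatConvNet Quick Start] MatConvNet Image Classification: From Custom Model to Testing

Part 11: [Lasagne Quick Start] Lasagne/Theano Image Classification: From Custom Model to Testing

Part 12: [Darknet Quick Start] Darknet Image Classification: From Custom Model to Testing

Thank you for reading. Corrections and suggestions are welcome. More content will follow from time to time; you are welcome to follow the official account 有三AI.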
