Dog Breed Identification (ImageNet Dogs) on Kaggle

Published: 2023/11/28


In this section, we tackle the dog breed identification challenge in a Kaggle competition. The competition's web address is

https://www.kaggle.com/c/dog-breed-identification

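In this competition, we attempt to identify 120 different breeds of dogs. The dataset used in this competition is actually a subset of the famous ImageNet dataset. Unlike the images in the CIFAR-10 dataset, the images in the ImageNet dataset are taller and wider, and their dimensions are not consistent.

Fig. 1 shows the information on the competition's web page. To submit results, first register an account on the Kaggle website.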

Fig. 1 Dog breed identification competition website. The dataset for the competition can be accessed by clicking the “Data” tab.

First, import the packages or modules required for the competition.

import collections
from d2l import mxnet as d2l
import math
from mxnet import autograd, gluon, init, npx
from mxnet.gluon import nn
import os
import time

npx.set_np()

Obtaining and Organizing the Dataset

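The competition data is divided into a training set and a test set. The training set contains 10,222 images and the test set contains 10,357 images. The images in both sets are in JPEG format. They have three RGB (color) channels and varying heights and widths. The training set covers 120 dog breeds, including Labradors, Poodles, Dachshunds, Samoyeds, Huskies, Chihuahuas, and Yorkshire Terriers.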

1.1. Downloading the Dataset

After logging in to Kaggle, click the "Data" tab on the dog breed identification competition page shown in Fig. 1 and download the dataset by clicking the "Download All" button. After unzipping the downloaded file in …/data, you will find the entire dataset in the following paths:

· …/data/dog-breed-identification/labels.csv
· …/data/dog-breed-identification/sample_submission.csv
· …/data/dog-breed-identification/train
· …/data/dog-breed-identification/test

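You may have noticed that the above structure is quite similar to that of the CIFAR-10 competition in Section 13.13: the folders train/ and test/ contain the training and test dog images, respectively, and labels.csv holds the labels for the training images.

Similarly, to make it easier to get started, we provide a small-scale sample of the dataset mentioned above, "train_valid_test_tiny.zip". If you are going to use the full dataset for the Kaggle competition, you will also need to change the demo variable below to False.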

#@save
d2l.DATA_HUB['dog_tiny'] = (d2l.DATA_URL + 'kaggle_dog_tiny.zip',
                            '7c9b54e78c1cedaa04998f9868bc548c60101362')

# If you use the full dataset downloaded for the Kaggle competition, change
# the variable below to False
demo = True
if demo:
    data_dir = d2l.download_extract('dog_tiny')
else:
    data_dir = os.path.join('…', 'data', 'dog-breed-identification')
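Each DATA_HUB entry pairs a download URL with a 40-character hexadecimal string, which is the length of a SHA-1 digest. The text does not say how d2l.download_extract uses that string, so the following is only a sketch of the general idea, assuming it serves as a checksum for the downloaded archive. The helper names sha1_of_file and matches_expected, and the temporary file standing in for the archive, are hypothetical.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    sha1 = hashlib.sha1()
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            sha1.update(chunk)
    return sha1.hexdigest()


def matches_expected(path, expected_sha1):
    """True if the file exists and its digest equals the expected one."""
    return os.path.exists(path) and sha1_of_file(path) == expected_sha1


# Hypothetical stand-in for a downloaded archive
payload = b'example archive bytes'
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

expected = hashlib.sha1(payload).hexdigest()
ok = matches_expected(tmp_path, expected)     # digest agrees
bad = matches_expected(tmp_path, '0' * 40)    # digest disagrees
os.remove(tmp_path)
print(ok, bad)
```

A check of this kind lets a cached archive be reused safely and a corrupted or partial download be detected before extraction.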

1.2. Organizing the Dataset

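We organize the dataset by splitting a validation set off from the training set and moving the images into subfolders grouped by label.

The reorg_dog_data function below reads the training data labels, splits off the validation set, and organizes the training set.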

def reorg_dog_data(data_dir, valid_ratio):
    labels = d2l.read_csv_labels(os.path.join(data_dir, 'labels.csv'))
    d2l.reorg_train_valid(data_dir, labels, valid_ratio)
    d2l.reorg_test(data_dir)

batch_size = 1 if demo else 128
valid_ratio = 0.1
reorg_dog_data(data_dir, valid_ratio)
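The split itself is hidden inside d2l.reorg_train_valid. As an illustration of what a per-label split can look like, here is a minimal sketch that works on image ids only and moves no files. It assumes every label contributes the same number of validation images, derived from the size of the rarest label; that policy, the function name split_train_valid, and the toy labels are assumptions of this sketch, not a description of the d2l helper.

```python
import collections
import math
import random


def split_train_valid(labels, valid_ratio, seed=0):
    """Split {image_id: label} into train and validation id lists.

    Every label contributes the same number of validation images, derived
    from the size of the rarest label (an assumption of this sketch).
    """
    counts = collections.Counter(labels.values())
    n_smallest = min(counts.values())
    n_valid_per_label = max(1, math.floor(n_smallest * valid_ratio))

    ids = sorted(labels)
    random.Random(seed).shuffle(ids)

    taken = collections.Counter()
    train_ids, valid_ids = [], []
    for image_id in ids:
        label = labels[image_id]
        if taken[label] < n_valid_per_label:
            valid_ids.append(image_id)
            taken[label] += 1
        else:
            train_ids.append(image_id)
    return train_ids, valid_ids


# Hypothetical toy labels: 30 images of 'beagle', 20 of 'pug'
toy_labels = {f'img{i:03d}': ('beagle' if i < 30 else 'pug')
              for i in range(50)}
train_ids, valid_ids = split_train_valid(toy_labels, valid_ratio=0.1)
print(len(train_ids), len(valid_ids))
```

Fixing the count per label keeps rare breeds represented in the validation set, at the cost of a validation set that is smaller than valid_ratio times the full training set when classes are imbalanced.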

Image Augmentation

The images in this section are larger than those in the previous section. Here are some image augmentation operations that might be useful.

transform_train = gluon.data.vision.transforms.Compose([
    # Randomly crop the image to obtain an image with an area of 0.08 to 1 of
    # the original area and height to width ratio between 3/4 and 4/3. Then,
    # scale the image to create a new image with a height and width of 224
    # pixels each
    gluon.data.vision.transforms.RandomResizedCrop(224, scale=(0.08, 1.0),
                                                   ratio=(3.0/4.0, 4.0/3.0)),
    gluon.data.vision.transforms.RandomFlipLeftRight(),
    # Randomly change the brightness, contrast, and saturation
    gluon.data.vision.transforms.RandomColorJitter(brightness=0.4,
                                                   contrast=0.4,
                                                   saturation=0.4),
    # Add random noise
    gluon.data.vision.transforms.RandomLighting(0.1),
    gluon.data.vision.transforms.ToTensor(),
    # Standardize each channel of the image
    gluon.data.vision.transforms.Normalize([0.485, 0.456, 0.406],
                                           [0.229, 0.224, 0.225])])
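To make the first augmentation more concrete, the sketch below samples a crop box whose area is 0.08 to 1 of the original and whose aspect ratio lies between 3/4 and 4/3. It is one common way to implement such sampling, not a transcription of Gluon's RandomResizedCrop: the log-uniform ratio sampling, the retry limit, the whole-image fallback, and the function name sample_crop_box are all assumptions. The ratio is treated here as width divided by height; since the interval is symmetric, the other convention covers the same range.

```python
import math
import random


def sample_crop_box(height, width, scale=(0.08, 1.0),
                    ratio=(3.0 / 4.0, 4.0 / 3.0), rng=None, max_tries=10):
    """Sample (top, left, crop_h, crop_w) for a random-resized crop."""
    rng = rng or random.Random()
    area = height * width
    for _ in range(max_tries):
        target_area = rng.uniform(*scale) * area
        # Sample the aspect ratio log-uniformly so that ratios r and 1/r
        # are equally likely
        log_ratio = rng.uniform(math.log(ratio[0]), math.log(ratio[1]))
        aspect = math.exp(log_ratio)
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    # Fallback: use the whole image if no valid box was found
    return 0, 0, height, width


# Hypothetical 375 x 500 input image
top, left, crop_h, crop_w = sample_crop_box(375, 500, rng=random.Random(0))
print(top, left, crop_h, crop_w)
```

The sampled region would then be resized to 224 by 224 pixels, so the network always sees a fixed input size regardless of the original image dimensions.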

During testing, we use only deterministic image preprocessing operations.

transform_test = gluon.data.vision.transforms.Compose([
    gluon.data.vision.transforms.Resize(256),
    # Crop a square of 224 by 224 from the center of the image
    gluon.data.vision.transforms.CenterCrop(224),
    gluon.data.vision.transforms.ToTensor(),
    gluon.data.vision.transforms.Normalize([0.485, 0.456, 0.406],
                                           [0.229, 0.224, 0.225])])
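Both pipelines end with ToTensor and Normalize. The NumPy sketch below shows the arithmetic these two steps perform: scaling 8-bit pixel values to floats, moving the channel axis to the front, and standardizing each channel. The function names to_tensor and normalize and the all-gray example image are hypothetical; the mean and standard deviation values are the ones used in the transforms above.

```python
import numpy as np

# Per-channel statistics, as used in the transforms above
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_tensor(image_hwc_uint8):
    """(H, W, C) uint8 in [0, 255] -> (C, H, W) float32 in [0, 1]."""
    scaled = image_hwc_uint8.astype(np.float32) / 255.0
    return scaled.transpose(2, 0, 1)


def normalize(tensor_chw, mean=MEAN, std=STD):
    """Subtract the channel mean and divide by the channel std."""
    return (tensor_chw - mean[:, None, None]) / std[:, None, None]


# Hypothetical 224 x 224 image in which every pixel is mid-gray (128)
image = np.full((224, 224, 3), 128, dtype=np.uint8)
x = normalize(to_tensor(image))
print(x.shape, x.dtype)
```

Because the three channels have different means and standard deviations, the same gray level maps to a different standardized value in each channel.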

Reading the Dataset

As in the previous section, we can create an ImageFolderDataset instance to read the dataset containing the original image files.

train_ds, valid_ds, train_valid_ds, test_ds = [
    gluon.data.vision.ImageFolderDataset(
        os.path.join(data_dir, 'train_valid_test', folder))
    for folder in ('train', 'valid', 'train_valid', 'test')]
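This is why the images were moved into one subfolder per label earlier: a folder-per-class dataset can derive each image's label from the name of its parent folder. The sketch below illustrates that convention with the standard library only. It assumes class names are sorted alphabetically and numbered from zero; the function index_image_folder and the two-breed miniature dataset are hypothetical, and no image decoding is done.

```python
import os
import tempfile


def index_image_folder(root):
    """Return (classes, items); items are (path, label_index) pairs."""
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d)))
    items = []
    for label_index, class_name in enumerate(classes):
        class_dir = os.path.join(root, class_name)
        for filename in sorted(os.listdir(class_dir)):
            items.append((os.path.join(class_dir, filename), label_index))
    return classes, items


# Hypothetical miniature dataset with two breeds and empty placeholder files
layout = {'pug': ['b.jpg'], 'beagle': ['a.jpg', 'c.jpg']}
with tempfile.TemporaryDirectory() as root:
    for breed, names in layout.items():
        os.makedirs(os.path.join(root, breed))
        for name in names:
            open(os.path.join(root, breed, name), 'w').close()
    classes, items = index_image_folder(root)

labels_only = [label for _, label in items]
print(classes, labels_only)
```

The list of class names plays the same role as the synsets attribute that the submission code reads later: it fixes the order of the output columns.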

Here, we create DataLoader instances.

train_iter, train_valid_iter = [gluon.data.DataLoader(
    dataset.transform_first(transform_train), batch_size, shuffle=True,
    last_batch='keep') for dataset in (train_ds, train_valid_ds)]

valid_iter, test_iter = [gluon.data.DataLoader(
    dataset.transform_first(transform_test), batch_size, shuffle=False,
    last_batch='keep') for dataset in (valid_ds, test_ds)]
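All four loaders pass last_batch='keep'. The following sketch shows what that choice means when the dataset size is not a multiple of the batch size, compared with discarding the incomplete final batch. The generator batch_indices is hypothetical and only handles these two modes.

```python
def batch_indices(num_samples, batch_size, last_batch='keep'):
    """Yield lists of sample indices, one list per mini-batch."""
    if last_batch not in ('keep', 'discard'):
        raise ValueError("this sketch only handles 'keep' and 'discard'")
    for start in range(0, num_samples, batch_size):
        batch = list(range(start, min(start + batch_size, num_samples)))
        if len(batch) < batch_size and last_batch == 'discard':
            return
        yield batch


# 10 samples in batches of 4
kept = [len(b) for b in batch_indices(10, 4, last_batch='keep')]
dropped = [len(b) for b in batch_indices(10, 4, last_batch='discard')]
print(kept, dropped)
```

Keeping the short final batch matters most for the test loader, since every test image needs a prediction in the submission file.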

Defining the Model

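The dataset for this competition is a subset of the ImageNet dataset. We can therefore select a model pre-trained on the entire ImageNet dataset and use it to extract image features, which are then fed into a small custom output network. Gluon provides a range of pre-trained models. Here, we use a pre-trained ResNet-34 model. Because the competition dataset is a subset of the pre-training dataset, we simply reuse the input of the pre-trained model's output layer, i.e., the extracted features. We can then replace the original output layer with a small custom output network that can be trained, such as two fully connected layers in series. We do not retrain the pre-trained model used for feature extraction, which reduces the training time and the memory needed to store the gradients of the model parameters.

Note that, during image augmentation, we standardize using the means and standard deviations of the three RGB channels computed over the entire ImageNet dataset. This is consistent with the normalization used by the pre-trained model.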

def get_net(ctx):
    finetune_net = gluon.model_zoo.vision.resnet34_v2(pretrained=True)
    # Define a new output network
    finetune_net.output_new = nn.HybridSequential(prefix='')
    finetune_net.output_new.add(nn.Dense(256, activation='relu'))
    # There are 120 output categories
    finetune_net.output_new.add(nn.Dense(120))
    # Initialize the output network
    finetune_net.output_new.initialize(init.Xavier(), ctx=ctx)
    # Distribute the model parameters to the CPUs or GPUs used for computation
    finetune_net.collect_params().reset_ctx(ctx)
    return finetune_net
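To get a feel for how small the trainable part is, the NumPy sketch below mirrors the two dense layers of the output network. It assumes the frozen extractor emits a 512-dimensional feature vector per image, which is the usual width of ResNet-34's final stage but is not stated in the text. The random weights are stand-ins for the Xavier-initialized parameters, and the batch of four feature vectors is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 512 features per image, 256 hidden units, 120 breeds
num_features, num_hidden, num_classes = 512, 256, 120

# Small random weights stand in for the Xavier-initialized parameters
W1 = rng.normal(0.0, 0.01, size=(num_features, num_hidden))
b1 = np.zeros(num_hidden)
W2 = rng.normal(0.0, 0.01, size=(num_hidden, num_classes))
b2 = np.zeros(num_classes)


def output_head(features):
    """Dense(256, relu) followed by Dense(120), returning raw logits."""
    hidden = np.maximum(features @ W1 + b1, 0.0)
    return hidden @ W2 + b2


features = rng.normal(size=(4, num_features))  # hypothetical batch of 4
logits = output_head(features)
num_trainable = W1.size + b1.size + W2.size + b2.size
print(logits.shape, num_trainable)
```

Under these assumptions the head has 162,168 trainable parameters, a small fraction of a full ResNet-34, which is where the savings in training time and gradient memory come from.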

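When calculating the loss, we first use the member variable features to obtain the input of the pre-trained model's output layer, i.e., the extracted features. We then use these features as the input to the small custom output network and compute the output.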

loss = gluon.loss.SoftmaxCrossEntropyLoss()

def evaluate_loss(data_iter, net, ctx):
    l_sum, n = 0.0, 0
    for X, y in data_iter:
        y = y.as_in_ctx(ctx)
        output_features = net.features(X.as_in_ctx(ctx))
        outputs = net.output_new(output_features)
        l_sum += float(loss(outputs, y).sum())
        n += y.size
    return l_sum / n
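The quantity that evaluate_loss returns is the average softmax cross-entropy per example. The NumPy sketch below reproduces that computation on plain arrays, using the same "sum per batch, divide by the total count" pattern. The helper names and the batch of all-zero logits are hypothetical; the zero logits model a network that guesses uniformly among the 120 breeds.

```python
import math

import numpy as np


def softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between softmax(logits) and int labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]


def mean_loss(batches):
    """Sum per-example losses over all batches, then divide by the count."""
    l_sum, n = 0.0, 0
    for logits, labels in batches:
        l_sum += float(softmax_cross_entropy(logits, labels).sum())
        n += labels.size
    return l_sum / n


# Identical logits for all 120 classes amount to a uniform guess
uniform_batches = [(np.zeros((8, 120)), np.arange(8))]
chance_loss = mean_loss(uniform_batches)
print(chance_loss, math.log(120))
```

A uniform guess over 120 classes gives a loss of ln 120, roughly 4.79. That value is a useful reference point: a loss near it means the output network has not yet learned much beyond chance.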

Defining the Training Functions

We will select the model and tune hyperparameters according to the model's performance on the validation set. The training function train only trains the small custom output network.

def train(net, train_iter, valid_iter, num_epochs, lr, wd, ctx, lr_period,
          lr_decay):
    # Only train the small custom output network
    trainer = gluon.Trainer(net.output_new.collect_params(), 'sgd',
                            {'learning_rate': lr, 'momentum': 0.9, 'wd': wd})
    for epoch in range(num_epochs):
        train_l_sum, n, start = 0.0, 0, time.time()
        if epoch > 0 and epoch % lr_period == 0:
            trainer.set_learning_rate(trainer.learning_rate * lr_decay)
        for X, y in train_iter:
            y = y.as_in_ctx(ctx)
            output_features = net.features(X.as_in_ctx(ctx))
            with autograd.record():
                outputs = net.output_new(output_features)
                l = loss(outputs, y).sum()
            l.backward()
            trainer.step(batch_size)
            train_l_sum += float(l)
            n += y.size
        time_s = "time %.2f sec" % (time.time() - start)
        if valid_iter is not None:
            valid_loss = evaluate_loss(valid_iter, net, ctx)
            epoch_s = ("epoch %d, train loss %f, valid loss %f, "
                       % (epoch + 1, train_l_sum / n, valid_loss))
        else:
            epoch_s = ("epoch %d, train loss %f, "
                       % (epoch + 1, train_l_sum / n))
        print(epoch_s + time_s + ', lr ' + str(trainer.learning_rate))
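In the training loop, the loss is summed over the batch and trainer.step(batch_size) rescales the gradient by 1/batch_size before the update. The sketch below applies a textbook form of SGD with momentum and weight decay to a single scalar parameter to show how those pieces combine. It is an illustration of the update rule under that textbook formulation, not necessarily identical to MXNet's internal implementation, and the function sgd_momentum_step is hypothetical.

```python
def sgd_momentum_step(weight, velocity, grad_sum, batch_size,
                      lr, momentum, wd):
    """One update for a single scalar parameter.

    grad_sum is the gradient of the summed batch loss, so it is divided
    by batch_size to obtain the gradient of the mean loss.
    """
    grad = grad_sum / batch_size + wd * weight
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


# Two steps with a constant summed gradient of 2.0 over a batch of 2
w, v = 1.0, 0.0
w, v = sgd_momentum_step(w, v, grad_sum=2.0, batch_size=2,
                         lr=0.1, momentum=0.9, wd=0.0)
first_w = w
w, v = sgd_momentum_step(w, v, grad_sum=2.0, batch_size=2,
                         lr=0.1, momentum=0.9, wd=0.0)
second_w = w
print(first_w, second_w)
```

With the same gradient on both steps, the second step moves the weight further than the first (0.19 versus 0.10), which is the accumulating effect of momentum.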

Training and Validating the Model

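Now we can train and validate the model. The following hyperparameters can be tuned. For example, the number of epochs can be increased. Because lr_period and lr_decay are set to 10 and 0.1 respectively, the learning rate of the optimization algorithm is multiplied by 0.1 every 10 epochs.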

ctx, num_epochs, lr, wd = d2l.try_gpu(), 1, 0.01, 1e-4
lr_period, lr_decay, net = 10, 0.1, get_net(ctx)
net.hybridize()
train(net, train_iter, valid_iter, num_epochs, lr, wd, ctx, lr_period,
      lr_decay)

epoch 1, train loss 4.879428, valid loss 4.834594, time 8.79 sec, lr 0.01
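With num_epochs set to 1 the decay never triggers, so it can help to see the schedule in isolation. The sketch below gives a closed form for the learning rate at a given 0-based epoch and compares it with the cumulative multiplication used in the train function. The function learning_rate_at and the 25-epoch horizon are hypothetical choices for illustration.

```python
def learning_rate_at(epoch, base_lr, lr_period, lr_decay):
    """Learning rate used during a 0-based epoch under step decay."""
    return base_lr * lr_decay ** (epoch // lr_period)


# Closed form over 25 epochs with the hyperparameters from the text
schedule = [learning_rate_at(e, 0.01, 10, 0.1) for e in range(25)]

# The same schedule, built the way the training loop updates it
looped, current = [], 0.01
for epoch in range(25):
    if epoch > 0 and epoch % 10 == 0:
        current *= 0.1
    looped.append(current)

print(schedule[0], schedule[10], schedule[20])
```

Epochs 0 to 9 use the base rate, epochs 10 to 19 use one tenth of it, and epochs 20 onward use one hundredth.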

Classifying the Testing Set and Submitting Results on Kaggle

After obtaining a satisfactory model design and hyperparameters, we use all the training data (including the validation set) to retrain the model and then classify the test set. Note that predictions are made by the output network we just trained.

net = get_net(ctx)
net.hybridize()
train(net, train_valid_iter, None, num_epochs, lr, wd, ctx, lr_period,
      lr_decay)

preds = []
for data, label in test_iter:
    output_features = net.features(data.as_in_ctx(ctx))
    output = npx.softmax(net.output_new(output_features))
    preds.extend(output.asnumpy())
ids = sorted(os.listdir(
    os.path.join(data_dir, 'train_valid_test', 'test', 'unknown')))
with open('submission.csv', 'w') as f:
    f.write('id,' + ','.join(train_valid_ds.synsets) + '\n')
    for i, output in zip(ids, preds):
        f.write(i.split('.')[0] + ',' + ','.join(
            [str(num) for num in output]) + '\n')

epoch 1, train loss 4.848448, time 10.14 sec, lr 0.01

After executing the above code, a "submission.csv" file is generated. The format of this file meets the Kaggle competition requirements.
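The layout written by the code above is one header row (id followed by the class names) and then one row per test image holding its id and one probability per class. The sketch below builds the same layout in memory with the csv module so it can be inspected without touching the file system. The three class names, the two file names, and the logits are hypothetical stand-ins for the 120 breeds and the real test images.

```python
import csv
import io

import numpy as np


def softmax(logits):
    """Row-wise softmax, so each row of probabilities sums to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def write_submission(stream, file_names, class_names, probs):
    """Write one header row, then one row of probabilities per image."""
    writer = csv.writer(stream, lineterminator='\n')
    writer.writerow(['id'] + list(class_names))
    for file_name, row in zip(file_names, probs):
        image_id = file_name.split('.')[0]  # drop the file extension
        writer.writerow([image_id] + [str(p) for p in row])


# Hypothetical example: two test images, three breeds instead of 120
class_names = ['beagle', 'pug', 'samoyed']
file_names = ['0a1b.jpg', '0c2d.jpg']
probs = softmax(np.array([[2.0, 0.5, 0.1], [0.0, 0.0, 3.0]]))

buffer = io.StringIO()
write_submission(buffer, file_names, class_names, probs)
lines = buffer.getvalue().splitlines()
print(lines[0])
```

The pairing of ids and predictions relies on the test loader not shuffling, so that the sorted file names line up with the order in which predictions were produced.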

Summary

We can use a model pre-trained on the ImageNet dataset to extract features and only train a small custom output network. This will allow us to classify a subset of the ImageNet dataset with lower computing and storage overhead.
