

VGG16 Network Structure and Code


VGG16 has 16 weight layers in total (pooling layers are not counted): 13 convolutional layers and 3 fully connected layers. The input first passes through two convolutions with 64 kernels each, followed by one pooling; then two convolutions with 128 kernels, followed by pooling; then three convolutions with 256 kernels, followed by pooling; then three convolutions with 512 kernels, followed by pooling; then another three convolutions with 512 kernels, followed by pooling; and finally three fully connected layers.
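As a quick cross-check of the 13 + 3 = 16 count, the layers of torchvision's reference VGG16 can be counted (a small sketch; it assumes torchvision is installed):

import torch.nn as nn
from torchvision.models import vgg16

model = vgg16()  # randomly initialized weights are fine for counting layers
convs = sum(1 for m in model.modules() if isinstance(m, nn.Conv2d))
fcs = sum(1 for m in model.modules() if isinstance(m, nn.Linear))
print(convs, fcs)  # 13 convolutional layers and 3 fully connected layers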

Module                    Layers in the module
Input                     224 × 224 × 3
Block 1                   conv3-64, conv3-64, maxpool
Block 2                   conv3-128, conv3-128, maxpool
Block 3                   conv3-256, conv3-256, conv3-256, maxpool
Block 4                   conv3-512, conv3-512, conv3-512, maxpool
Block 5                   conv3-512, conv3-512, conv3-512, maxpool
Block 6 (FC + output)     FC-4096 (preceded by a Flatten layer), FC-4096,
                          FC-1000 (one output per class), softmax (output activation)
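The five convolutional blocks in this table can also be generated from a compact configuration list, in the style of the cfg idiom torchvision's VGG uses internally (a sketch under that assumption; the net.py further below instead writes every block out explicitly, with BatchNorm added):

from torch import nn

# "M" marks a 2x2/stride-2 max pool; numbers are conv output channel counts.
cfg = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
       512, 512, 512, "M", 512, 512, 512, "M"]

def make_features(cfg, in_channels=3):
    layers = []
    for v in cfg:
        if v == "M":
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            layers += [nn.Conv2d(in_channels, v, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True)]
            in_channels = v
    return nn.Sequential(*layers)

features = make_features(cfg)  # maps 224 x 224 x 3 input to 7 x 7 x 512 output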

Step-by-step walkthrough
Below is the output-size and parameter calculation for each layer (bias terms are omitted from the parameter counts):
Input: 224 × 224 × 3

conv3-64 (64 = number of kernels) --------------------- kernel size: 3, stride: 1, padding: 1
Size: (224 + 2×1 - 1×(3 - 1) - 1) / 1 + 1 = 224 ------- output: 224 × 224 × 64
Parameters: (3 × 3 × 3) × 64 = 1728
conv3-64 ---------------------------------------------- kernel size: 3, stride: 1, padding: 1
Size: (224 + 2×1 - 2 - 1) / 1 + 1 = 224 --------------- output: 224 × 224 × 64
Parameters: (3 × 3 × 64) × 64 = 36864
pool2 ------------------------------------------------- kernel size: 2, stride: 2, padding: 0
Size: (224 - 2) / 2 + 1 = 112 ------------------------- output: 112 × 112 × 64
Parameters: 0
conv3-128 (128 = number of kernels) ------------------- kernel size: 3, stride: 1, padding: 1
Size: (112 + 2×1 - 2 - 1) / 1 + 1 = 112 --------------- output: 112 × 112 × 128
Parameters: (3 × 3 × 64) × 128 = 73728
conv3-128 --------------------------------------------- kernel size: 3, stride: 1, padding: 1
Size: (112 + 2×1 - 2 - 1) / 1 + 1 = 112 --------------- output: 112 × 112 × 128
Parameters: (3 × 3 × 128) × 128 = 147456
pool2 ------------------------------------------------- kernel size: 2, stride: 2, padding: 0
Size: (112 - 2) / 2 + 1 = 56 -------------------------- output: 56 × 56 × 128
Parameters: 0
conv3-256 (256 = number of kernels) ------------------- kernel size: 3, stride: 1, padding: 1
Size: (56 + 2×1 - 2 - 1) / 1 + 1 = 56 ----------------- output: 56 × 56 × 256
Parameters: (3 × 3 × 128) × 256 = 294912
conv3-256 --------------------------------------------- kernel size: 3, stride: 1, padding: 1
Size: (56 + 2×1 - 2 - 1) / 1 + 1 = 56 ----------------- output: 56 × 56 × 256
Parameters: (3 × 3 × 256) × 256 = 589824
conv3-256 --------------------------------------------- kernel size: 3, stride: 1, padding: 1
Size: (56 + 2×1 - 2 - 1) / 1 + 1 = 56 ----------------- output: 56 × 56 × 256
Parameters: (3 × 3 × 256) × 256 = 589824
pool2 ------------------------------------------------- kernel size: 2, stride: 2, padding: 0
Size: (56 - 2) / 2 + 1 = 28 --------------------------- output: 28 × 28 × 256
Parameters: 0
conv3-512 (512 = number of kernels) ------------------- kernel size: 3, stride: 1, padding: 1
Size: (28 + 2×1 - 2 - 1) / 1 + 1 = 28 ----------------- output: 28 × 28 × 512
Parameters: (3 × 3 × 256) × 512 = 1179648
conv3-512 --------------------------------------------- kernel size: 3, stride: 1, padding: 1
Size: (28 + 2×1 - 2 - 1) / 1 + 1 = 28 ----------------- output: 28 × 28 × 512
Parameters: (3 × 3 × 512) × 512 = 2359296
conv3-512 --------------------------------------------- kernel size: 3, stride: 1, padding: 1
Size: (28 + 2×1 - 2 - 1) / 1 + 1 = 28 ----------------- output: 28 × 28 × 512
Parameters: (3 × 3 × 512) × 512 = 2359296
pool2 ------------------------------------------------- kernel size: 2, stride: 2, padding: 0
Size: (28 - 2) / 2 + 1 = 14 --------------------------- output: 14 × 14 × 512
Parameters: 0
conv3-512 (512 = number of kernels) ------------------- kernel size: 3, stride: 1, padding: 1
Size: (14 + 2×1 - 2 - 1) / 1 + 1 = 14 ----------------- output: 14 × 14 × 512
Parameters: (3 × 3 × 512) × 512 = 2359296
conv3-512 --------------------------------------------- kernel size: 3, stride: 1, padding: 1
Size: (14 + 2×1 - 2 - 1) / 1 + 1 = 14 ----------------- output: 14 × 14 × 512
Parameters: (3 × 3 × 512) × 512 = 2359296
conv3-512 --------------------------------------------- kernel size: 3, stride: 1, padding: 1
Size: (14 + 2×1 - 2 - 1) / 1 + 1 = 14 ----------------- output: 14 × 14 × 512
Parameters: (3 × 3 × 512) × 512 = 2359296
pool2 ------------------------------------------------- kernel size: 2, stride: 2, padding: 0
Size: (14 - 2) / 2 + 1 = 7 ---------------------------- output: 7 × 7 × 512
Parameters: 0
FC ---------------------------------------------------- 4096 neurons
Size: 1 × 1 × 4096
Parameters: 7 × 7 × 512 × 4096 = 102760448
FC ---------------------------------------------------- 4096 neurons
Size: 1 × 1 × 4096
Parameters: 4096 × 4096 = 16777216
FC ---------------------------------------------------- 1000 neurons
Size: 1 × 1 × 1000
Parameters: 4096 × 1000 = 4096000


In PyTorch the general output-size formula is

H_out = (H_in + 2 × padding - dilation × (kernel_size - 1) - 1) / stride + 1

and since dilation defaults to 1, it simplifies to

H_out = (H_in + 2 × padding - kernel_size) / stride + 1

The parameter count of a convolution (ignoring biases) is

parameters = kernel_size × kernel_size × in_channels × out_channels
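These two formulas reproduce the whole table above. A minimal sketch in plain Python (no framework needed; the block grouping follows the module table earlier):

def conv_out(h, k, s, p):
    # Simplified output size: (h + 2p - k) // s + 1, valid when dilation = 1.
    return (h + 2 * p - k) // s + 1

def conv_params(k, c_in, c_out):
    return k * k * c_in * c_out  # biases excluded, as in the table

h, c_in = 224, 3
for c_out, reps in [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]:
    for _ in range(reps):
        h = conv_out(h, 3, 1, 1)  # 3x3 conv, stride 1, padding 1 keeps h unchanged
        print(f"conv3-{c_out}: {h} x {h} x {c_out}, params = {conv_params(3, c_in, c_out)}")
        c_in = c_out
    h = conv_out(h, 2, 2, 0)      # 2x2 max pool, stride 2 halves h
    print(f"maxpool: {h} x {h} x {c_in}")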

Max pooling (kernel size 2, stride 2, padding 0) does not change the number of channels; it only halves the height and width of the feature map.

The number of kernels in a convolution equals the number of output channels.

BatchNorm's num_features must match the number of output channels of the preceding layer.
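As a minimal illustration of the conv → BatchNorm → ReLU → pool pattern used throughout net.py below:

from torch import nn

# num_features of BatchNorm2d must equal out_channels of the conv before it.
block = nn.Sequential(
    nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1),  # 128 channels out
    nn.BatchNorm2d(128),                     # num_features = 128, matching the conv output
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),   # halves H and W; channel count stays 128
)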

Number the feature layers 1, 2, 3 from bottom to top. A 1×1 region in feature layer 3 (F(i+1) = 1) corresponds to a 2×2 receptive field in feature layer 2, and a 2×2 region in feature layer 2 (F(i+1) = 2) corresponds to a 5×5 receptive field in feature layer 1. These numbers follow the recurrence F(i) = (F(i+1) - 1) × stride + kernel_size applied top-down, for example with a 2×2/stride-2 pooling between layers 2 and 3 and a 3×3/stride-2 convolution between layers 1 and 2.
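A small sketch of that recurrence (the two layers below are an assumption chosen to reproduce the 1 → 2 → 5 sequence above):

# Receptive field recurrence: F(i) = (F(i+1) - 1) * stride + kernel_size,
# applied from the top feature layer down toward the input.
layers = [("3x3 conv, stride 2", 3, 2),   # between feature layers 1 and 2
          ("2x2 pool, stride 2", 2, 2)]   # between feature layers 2 and 3
f = 1  # a single unit in the top feature layer
for name, k, s in reversed(layers):
    f = (f - 1) * s + k
    print(f"{name}: receptive field {f} x {f}")  # prints 2, then 5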

With the default stride of 1, two stacked 3×3 convolutions produce a feature map of the same size as a single 5×5 convolution, and three stacked 3×3 convolutions match a single 7×7 convolution, so the stacks cover the same receptive field with fewer parameters and more non-linearities.
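A quick parameter comparison at equal receptive field (C input and C output channels, biases ignored):

C = 512
print(2 * (3 * 3 * C * C), 5 * 5 * C * C)  # two 3x3 convs: 18*C^2 = 4718592 vs one 5x5: 25*C^2 = 6553600
print(3 * (3 * 3 * C * C), 7 * 7 * C * C)  # three 3x3 convs: 27*C^2 = 7077888 vs one 7x7: 49*C^2 = 12845056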

net.py is as follows:

import torch
from torch import nn


# Input: 224 * 224 * 3
class Vgg16_net(nn.Module):
    def __init__(self, num_classes=10):  # num_classes added so train.py/predict.py can pass num_classes=2
        super(Vgg16_net, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1),   # 224 * 224 * 64
            nn.BatchNorm2d(64),   # BatchNorm pulls activations back toward zero mean and unit variance, keeping the distribution consistent and helping to avoid vanishing gradients
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1),  # 224 * 224 * 64
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2)   # 112 * 112 * 64
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1),   # 112 * 112 * 128
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, stride=1, padding=1),  # 112 * 112 * 128
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2)   # 56 * 56 * 128
        )
        self.layer3 = nn.Sequential(
            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=1, padding=1),  # 56 * 56 * 256
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1),  # 56 * 56 * 256
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1),  # 56 * 56 * 256
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2)   # 28 * 28 * 256
        )
        self.layer4 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=512, kernel_size=3, stride=1, padding=1),  # 28 * 28 * 512
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1),  # 28 * 28 * 512
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1),  # 28 * 28 * 512
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2)   # 14 * 14 * 512
        )
        self.layer5 = nn.Sequential(
            nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1),  # 14 * 14 * 512
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1),  # 14 * 14 * 512
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1),  # 14 * 14 * 512
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2)   # 7 * 7 * 512
        )
        self.conv = nn.Sequential(
            self.layer1,
            self.layer2,
            self.layer3,
            self.layer4,
            self.layer5
        )
        # Note: this classifier is deliberately smaller than the original
        # VGG16 head (FC-4096 / FC-4096 / FC-1000).
        self.fc = nn.Sequential(
            nn.Linear(7 * 7 * 512, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes)   # one output per class
        )

    def forward(self, x):
        x = self.conv(x)
        # -1 lets view() infer a dimension: each sample must be flattened to
        # 7*7*512 features, so the batch dimension is inferred automatically.
        # x = x.view(x.size(0), -1) is equivalent; x.size(0) is the batch size.
        x = x.view(-1, 7 * 7 * 512)
        x = self.fc(x)
        return x
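A quick shape check of the network (a minimal sketch, assuming net.py is on the import path):

import torch
from net import Vgg16_net

net = Vgg16_net(num_classes=2)
x = torch.randn(1, 3, 224, 224)  # one dummy RGB image at VGG's 224 x 224 input size
print(net(x).shape)              # expected: torch.Size([1, 2])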

train.py is as follows:

import json
import os
import sys

import numpy as np                  # used by the commented-out visualization block
import matplotlib.pyplot as plt    # used by the commented-out visualization block
import torch
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import ImageFolder
from tqdm import tqdm

from net import Vgg16_net
# import torchvision.models.vgg  # pretrained weights can be downloaded through this module

os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'

ROOT_TRAIN = r'E:/cnn/AlexNet/data/train'
ROOT_TEST = r'E:/cnn/AlexNet/data/val'


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    # Data preprocessing
    data_transform = {
        "train": transforms.Compose([
            transforms.RandomResizedCrop(224),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
        "val": transforms.Compose([
            transforms.Resize((224, 224)),  # must be (224, 224), not 224
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

    train_dataset = ImageFolder(ROOT_TRAIN, transform=data_transform["train"])  # load the training set
    train_num = len(train_dataset)  # number of training images

    animal_list = train_dataset.class_to_idx  # class names and their indices
    cla_dict = dict((val, key) for key, val in animal_list.items())  # invert the mapping
    json_str = json.dumps(cla_dict, indent=4)  # write classes and indices to class_indices.json in the working directory
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 32
    train_loader = DataLoader(train_dataset, batch_size=batch_size,
                              shuffle=True, num_workers=0)

    validate_dataset = ImageFolder(ROOT_TEST, transform=data_transform["val"])  # load the validation set
    val_num = len(validate_dataset)  # number of validation images
    validate_loader = DataLoader(validate_dataset, batch_size=16,
                                 shuffle=False, num_workers=0)
    print("using {} images for training, {} images for validation.".format(train_num, val_num))

    # To inspect the dataset, set validate_loader's batch_size above to the number
    # of images to view at once and shuffle=True to randomize the order:
    # test_data_iter = iter(validate_loader)
    # test_image, test_label = next(test_data_iter)
    #
    # def imshow(img):
    #     img = img / 2 + 0.5  # unnormalize
    #     npimg = img.numpy()
    #     plt.imshow(np.transpose(npimg, (1, 2, 0)))
    #     plt.show()
    #
    # print(' '.join('%5s' % cla_dict[test_label[j].item()] for j in range(4)))
    # imshow(utils.make_grid(test_image))

    net = Vgg16_net(num_classes=2)  # instantiate the network; num_classes is the number of classes

    # Load pretrained weights for transfer learning (kept commented out here: without
    # transfer learning, simply pass num_classes to the constructor as above; with it,
    # uncomment the five lines below and drop the num_classes argument).
    # model_weight_path = "./vgg16-pre.pth"  # pretrained weights
    # assert os.path.exists(model_weight_path), "file {} does not exist.".format(model_weight_path)
    # net.load_state_dict(torch.load(model_weight_path, map_location='cpu'))
    # in_channel = net.fc.in_features
    # net.fc = nn.Linear(in_channel, 2)  # replace the fully connected layer; 2 is the number of classes
    # Note: loading pretrained weights for this VGG did not succeed.

    net.to(device)  # move the network to the GPU or CPU
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(net.parameters(), lr=0.0002)

    epochs = 1
    save_path = './VGGNet.pth'
    best_acc = 0.0
    train_steps = len(train_loader)
    for epoch in range(epochs):
        # train
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader, file=sys.stdout)
        for step, data in enumerate(train_bar):
            images, labels = data
            optimizer.zero_grad()
            outputs = net(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.item()
            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1, epochs, loss)

        # validate
        net.eval()
        acc = 0.0  # accumulate the number of correct predictions per epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:  # iterate over the validation set
                val_images, val_labels = val_data  # split into images and labels
                outputs = net(val_images.to(device))  # forward pass on the chosen device
                predict_y = torch.max(outputs, dim=1)[1]  # most likely class (highest score)
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()  # count correct predictions

        val_accurate = acc / val_num  # correct predictions / validation set size
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')


if __name__ == '__main__':
    main()

predict.py is as follows:

import os
import json

import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

from net import Vgg16_net

os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    data_transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    # load image
    img_path = "./7.jpg"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path)
    plt.imshow(img)
    # [N, C, H, W]
    img = data_transform(img)
    # expand batch dimension
    img = torch.unsqueeze(img, dim=0)

    # read class_indices.json
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)
    with open(json_path, "r") as f:
        class_indict = json.load(f)

    # create model
    model = Vgg16_net(num_classes=2).to(device)
    # load model weights
    weights_path = "./VGGNet.pth"
    assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
    model.load_state_dict(torch.load(weights_path, map_location=device))

    model.eval()
    with torch.no_grad():
        # predict class
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).numpy()

    print_res = "class: {}   prob: {:.3}".format(class_indict[str(predict_cla)],
                                                 predict[predict_cla].numpy())
    plt.title(print_res)
    for i in range(len(predict)):
        print("class: {:10}   prob: {:.3}".format(class_indict[str(i)],
                                                  predict[i].numpy()))
    plt.show()


if __name__ == '__main__':
    main()

