PyTorch basics + building LeNet to train on CIFAR-10 + counting model size and parameters with PyTorch-OpCounter + saving and loading models
For the full environment setup, see my other blog post: installing Python 3.5 + PyCharm + Anaconda + OpenCV + Docker + nvidia-docker + TensorFlow + PyTorch + CMake 3.8 on Ubuntu (CSDN blog).
Chinese documentation: torch - PyTorch Chinese docs
Simple GitHub example: a multi-GPU distributed tutorial, with multi-GPU distributed MNIST training and single-GPU training.
I. Basics
PyTorch uses dynamic computational graphs, while TensorFlow uses static computational graphs.
With a dynamic graph, every time we build a computational graph and finish the backward pass, the entire graph is freed from memory; to use it again, it must be rebuilt from scratch. With a static graph, as in TensorFlow, the graph is designed up front, instantiated when needed, fed various inputs, and reused; the graph is only freed when the session ends.
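A minimal sketch of what "dynamic" means in practice: the graph is rebuilt on every forward pass, so ordinary Python control flow can change the graph between iterations (the loop bound below is an arbitrary illustration):

import torch

x = torch.ones(3, requires_grad=True)
for step in range(3):
    y = x * 2
    # Data-dependent control flow: the number of multiplications (and hence
    # the graph) differs between iterations.
    for _ in range(step):
        y = y * 2
    loss = y.sum()
    loss.backward()   # the graph is consumed/freed after this call
    print(step, x.grad)
    x.grad.zero_()    # gradients accumulate, so clear them each step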
1. Tensors
import torch as t

print(t.__version__)
# Build a 5x3 matrix; this only allocates memory, uninitialized
x = t.Tensor(5, 3)
# Initialize a 2-D array with uniform random values in [0, 1)
x = t.rand(5, 3)
print(x)
print(x.size())   # shape of x
print(x.size(1))

2. view and squeeze
# `tensor.view` reshapes a tensor; the total number of elements must stay the same.
# `view` does not copy data: the returned tensor shares memory with the source,
# so modifying one also modifies the other.
# In practice you often need to add or remove a dimension of size 1;
# that is what `squeeze` and `unsqueeze` are for.
a = t.arange(0, 6)
b = a.view(2, 3)    # similar to reshape
print(a)
print(b)
c = b.unsqueeze(-1)
print(c.shape)
d = c.squeeze(-1)   # squeeze out the trailing dimension of size 1
print(d.shape)
d = c.squeeze()     # squeeze out all dimensions of size 1
print(d.shape)
# Modifying a also changes b, since b is a view of a
a[1] = 100
print(b)

3. Three ways to write addition
y = t.rand(5, 3)
# First way
print(x + y)
# Second way
z = t.add(x, y)
print(z)
# Third way: write the result into a pre-allocated output tensor
result = t.Tensor(5, 3)   # pre-allocate
t.add(x, y, out=result)   # write into result
print(result)

4. add vs. add_, and converting a tensor directly to a list
print('x=')
x = t.Tensor([[1, 2], [3, 4]])
print(x)
print('y=')
y = t.Tensor([[1, 2], [3, 4]])
print(y)
print('in-place addition, result stored in y')
y.add_(x)   # in-place addition: y is modified
print(y)
z = y.tolist()
print(z)

Tensors support many more operations, including math, linear algebra, selection, slicing, and so on; the interface is designed to closely mirror NumPy's.
Interoperating between Tensors and NumPy arrays is easy and fast: for operations a Tensor does not support, convert to a NumPy array, process, and convert back. Note that on CPU both conversions share the underlying memory.
5. Tensor -> NumPy
a = t.ones(5)   # new tensor of all ones
print(a)
b = a.numpy()   # Tensor -> NumPy
print(b)

6. NumPy -> Tensor
import numpy as np

a = np.ones(5)
b = t.from_numpy(a)   # NumPy -> Tensor
print(a)
print(b)
scalar = b[0]
print(scalar)
print('scalar.size()=', scalar.size())   # 0-dim
print(scalar.item())   # scalar.item() extracts the value as a Python number
print(scalar.numpy())
tensor = t.tensor([2])   # note the difference from a scalar
print(tensor)
print('tensor.size()=', tensor.size())
a = t.rand(1)   # fixed: the original wrote torch.rand, but torch was imported as t
print(a)
print(type(a))
print(a.item())
print(type(a.item()))

7. Using the GPU for acceleration
a = t.ones(5)
b = t.ones(5)
# On a machine without CUDA, the following still runs on the CPU
device = t.device("cuda:0" if t.cuda.is_available() else "cpu")
x = a.to(device)
y = b.to(device)
z = x + y
print(z)

Putting a tensor on the GPU and controlling GPU ordering:
import os
import torch

os.environ["CUDA_VISIBLE_DEVICES"] = '0'
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
prior_boxes = [[1, 2, 3, 4], [3, 4, 5, 6]]
prior_boxes = torch.FloatTensor(prior_boxes).to(device)
prior_boxes.clamp_(0, 1)   # clamp to avoid out-of-range values
# print('len(prior_boxes)', len(prior_boxes))
print(prior_boxes)

8. Some GPU operations with pycuda
torch.version.cuda shows the CUDA version in use.
import torch
import pycuda.driver as cuda

def test_gpu():
    cuda.init()
    print('cuda is:', torch.cuda.is_available())
    print('current device id:', torch.cuda.current_device())   # id of current cuda device
    print('cuda device count:', cuda.Device.count())
    num = cuda.Device.count()
    for i in range(num):
        print(cuda.Device(i).name(), "(Id: %d)" % i)
    available, total = cuda.mem_get_info()
    print("Available: %.2f GB\nTotal: %.2f GB" % (available / 1e9, total / 1e9))
    # for i in range(num):
    #     print('cuda attrib:', cuda.Device(i).get_attributes())
    print('memory allocated:', torch.cuda.memory_allocated())
    Mat_cpu = torch.FloatTensor([1., 2.])
    print('Mat_cpu:', Mat_cpu)
    Mat_gpu = Mat_cpu.cuda()
    print('Mat_gpu:', Mat_gpu)

if __name__ == '__main__':
    test_gpu()

9. Remember to zero gradients when differentiating
x = t.ones(2, 2, requires_grad=True)
print(x)
# The line above is equivalent to:
x = t.ones(2, 2)
x.requires_grad = True
print(x)
y = x.sum()
print(y)   # y.grad_fn
y.backward()   # backward pass, compute gradients
print(x.grad)
# Note: `grad` is accumulated during backprop; every backward pass adds to
# the existing gradients, so zero them before each backward pass.
y.backward()
print(x.grad)
# Functions ending in an underscore are in-place and modify the tensor itself, like add_
x.grad.data.zero_()
y.backward()
print(x.grad)

Autograd implements backpropagation, but writing deep-learning code directly on top of it is often cumbersome. torch.nn is a modular interface designed specifically for neural networks; it is built on top of autograd and is used to define and run networks. nn.Module is the most important class in nn: think of it as a wrapper around a network that contains the layer definitions and a forward method. Calling forward(input) returns the result of the forward pass.
10. torch.cat and torch.stack
cat keeps the number of dimensions, while stack adds a new one. For cat, dim=0 behaves like np.vstack and dim=1 like np.hstack.
import torch

print(torch.version.cuda)
print(torch.cuda.is_available())
print(torch.__version__)
x = torch.tensor([[1, 2, 3]])
print('x=', x)
# Concatenate along dim 0: for matrices, this stacks "vertically"
y = torch.cat((x, x, x), 0)
print('y=', y)
# Concatenate along dim 1: for matrices, this joins "horizontally"
z = torch.cat((x, x, x), 1)
print('z=', z)

import numpy as np
x = np.array([[1, 2, 3]])
print('x=', x)
y = np.vstack((x, x, x))
print('y=', y)
z = np.hstack((x, x, x))
print('z=', z)

Comparing cat and stack:
reg_mask = torch.tensor([1, 2, 3])
reg_mask_list = []
for i in range(5):
    reg_mask_list.append(reg_mask)
stack_mask = torch.stack(reg_mask_list, dim=0)
print(stack_mask)
cat_mask = torch.cat(reg_mask_list, dim=0)
print(cat_mask)

11. torch.chunk(tensor, chunks, dim=0): split into chunks; the count is given by chunks, and a tuple is returned
a = torch.arange(10)
print('a=', a)
b = torch.chunk(a, 4, dim=0)
print('b=', b)

Mapping the input through N different linear projections; a single layer plus chunk performs better:
import torch
from torch import nn

d = 1024
batch = torch.rand((8, d))
layers = (nn.Linear(d, 128, bias=False),
          nn.Linear(d, 128, bias=False),
          nn.Linear(d, 128, bias=False))
out1 = layers[0](batch)
out2 = layers[1](batch)
out3 = layers[2](batch)
print('===out1.shape', out1.shape)
print('===out2.shape', out2.shape)
print('===out3.shape', out3.shape)
print('====method 2===')
one_layer = nn.Linear(d, 128 * 3, bias=False)
out1, out2, out3 = torch.chunk(one_layer(batch), 3, dim=1)
print('===out1.shape', out1.shape)
print('===out2.shape', out2.shape)
print('===out3.shape', out3.shape)

12. tensor.split() returns a tuple
Splits the input tensor into equally shaped chunks (if divisible). If the size along the given dimension is not divisible by split_size, the last chunk will be smaller than the others. Note the difference from chunk: chunk takes the number of chunks, while split takes the size of each chunk.
a = torch.arange(10)
print('a=', a)
b = torch.chunk(a, 4, dim=0)
print('b=', b)
b = torch.split(a, 4, dim=0)
print('b=', b)

13. torch.nn.functional.pairwise_distance: L2 distance between row vectors
x1 = torch.tensor([[1], [2]], dtype=torch.float32)
print('x1=', x1)
x2 = torch.tensor([[2], [3]], dtype=torch.float32)
y = torch.nn.functional.pairwise_distance(x2, x1)
print('y=', y)
x1 = torch.tensor([[1, 2]], dtype=torch.float32)
print('x1=', x1)
x2 = torch.tensor([[2, 3]], dtype=torch.float32)
y = torch.nn.functional.pairwise_distance(x2, x1)
print('y=', y)

Computing cosine similarity, either by hand or with F.cosine_similarity:
import torch.nn.functional as F

output1 = torch.tensor([[1, 2], [4, 0]], dtype=torch.float32)
output2 = torch.tensor([[2, 2], [2, 0]], dtype=torch.float32)
dot = torch.sum(output1 * output2, dim=1)
norm = torch.norm(output1, dim=1) * torch.norm(output2, dim=1)
print('dot:', dot)
print(torch.norm(output1, dim=1))
print(torch.norm(output2, dim=1))
print('norm:', norm)
print('cos:', dot / norm)
print(F.cosine_similarity(output1, output2, dim=1))

14. Several ways to compute norms
torch.norm
(1) The default is the 2-norm
a = torch.tensor([[1, 2, 3, 4], [1, 3, 2, 2]], dtype=torch.float32)
print(torch.norm(a))
print(torch.norm(a, dim=1))
print(torch.norm(a, dim=0))

(2) The 1-norm: sum of absolute values
a = torch.tensor([[1, 2, 3, 4], [1, 3, 2, 2]], dtype=torch.float32)
# print(torch.norm(a))
# print(torch.norm(a, dim=1))
print(torch.norm(a, dim=0, p=1))

torch.renorm and F.normalize:
import torch

x = torch.tensor([[1, 1, 1],
                  [3, 4, 5],
                  [4, 5, 6]]).float()
res = x.renorm(2, 0, 1)   # operates on rows
print('==res:', res)
res = x.renorm(2, 0, 1e-5).mul(1e5)
print('==res:', res)

import torch.nn.functional as F
res = F.normalize(x, p=2, dim=1)   # operates on rows
print('==res:', res)

15. torch.nn.functional.adaptive_avg_pool2d: average pooling
# batch, channel, h, w
a = torch.rand(16, 128, 7, 7)
out = torch.nn.functional.adaptive_avg_pool2d(a, (1, 1))
print(out.shape)
print(out.view(a.size(0), -1).shape)

16. nn.RNN
import torch

rnn = torch.nn.RNN(input_size=20, hidden_size=50, num_layers=1)
# (sequence, batch, channels)
input = torch.randn(100, 1, 20)
# (num_layers, batch, hidden_size)
h_0 = torch.randn(1, 1, 50)
output, hn = rnn(input, h_0)
print('input:', input.size())
print('output', output.size())
print('hn', hn.size())

17. nn.LSTM
(1) Unidirectional LSTM
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=20, hidden_size=50, num_layers=1)
# (sequence, batch, channels)
input = torch.randn(100, 1, 20)
# (num_layers, batch, hidden_size)
h_0 = torch.randn(1, 1, 50)
c_0 = torch.randn(1, 1, 50)
output, (hn_1, c_1) = lstm(input, (h_0, c_0))
print('input:', input.size())
print('output', output.size())
print('hn_1:', hn_1.size())
print('c_1:', c_1.size())

(2) Bidirectional LSTM (the output size doubles)
# test lstm
lstm = torch.nn.LSTM(input_size=32, hidden_size=50, bidirectional=True)
# (seq_len, batch, input_size)
input = torch.randn(64, 16, 32)
# (num_layers * num_directions, batch, hidden_size)
h0 = torch.randn(2, 16, 50)
# (num_layers * num_directions, batch, hidden_size)
c0 = torch.randn(2, 16, 50)
output, (hn, cn) = lstm(input, (h0, c0))
print('========output=============')
print(output.size())
print('===========hn===============')
print(hn.size())
print('=========cn=================')
print(cn.size())

18. Finetuning a pretrained model
(1) Finetune, method 1
import torch
import torch.nn as nn
from torchvision import models

print(torch.__version__)
# CUDA = torch.cuda.is_available()
# DEVICE = torch.device("cuda" if CUDA else "cpu")
DEVICE = torch.device("cpu")
model_ft = models.resnet50(pretrained=True)   # downloads the official pretrained model
# Freeze all parameter layers
for param in model_ft.parameters():
    param.requires_grad = False
# Print info about the fully connected layer
print('=========fc info===================')
print(model_ft.fc)
num_fc_ftr = model_ft.fc.in_features   # input features of the fc layer
print('num_fc_ftr:', num_fc_ftr)
model_ft.fc = nn.Linear(num_fc_ftr, 10)   # define a new FC layer
model_ft = model_ft.to(DEVICE)   # move to device
print('=========after fine tune model=====================')
print(model_ft)   # print the new model

(2) Finetune, method 2
This version also concatenates max pooling and average pooling: max pooling focuses on salient local features, while average pooling captures more global features.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class res18(nn.Module):
    def __init__(self, num_classes):
        super(res18, self).__init__()
        self.base = resnet18(pretrained=True)
        print('resnet18:', resnet18())
        self.feature = nn.Sequential(
            self.base.conv1,
            self.base.bn1,
            self.base.relu,
            self.base.maxpool,
            self.base.layer1,
            self.base.layer2,
            self.base.layer3,
            self.base.layer4)
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.reduce_layer = nn.Conv2d(1024, 512, 1)
        self.fc = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(512, num_classes))

    def forward(self, x):
        bs = x.shape[0]
        x = self.feature(x)
        print('feature.shape:', x.shape)
        print('self.avg_pool(x).shape:', self.avg_pool(x).shape)
        print('self.max_pool(x).shape:', self.max_pool(x).shape)
        x = torch.cat([self.avg_pool(x), self.max_pool(x)], dim=1)
        print('cat x.shape', x.shape)
        x = self.reduce_layer(x).view(bs, -1)
        print('reduce x.shape', x.shape)
        logits = self.fc(x)
        return logits

def test_resnet_18():
    model = res18(2)
    # print('model:', model)
    # b, c, h, w
    x = torch.rand(32, 3, 224, 224)
    print('input.shape:', x.shape)
    model(x)

if __name__ == '__main__':
    test_resnet_18()

(3) Modifying pretrained weights
import torch
from torch import nn

def change_shape_of_coco_wt_lstm():
    load_from = './mixed_second_finetune_acc97p7.pth'
    save_to = './pretrained_model.pth'
    weights = torch.load(load_from)
    print('weights.keys():', weights.keys())
    for key, values in weights.items():
        print('key:', key)
    print(weights['rnn.1.embedding.weight'].shape)
    print(weights['rnn.1.embedding.bias'].shape)
    weights['rnn.1.embedding.weight'] = nn.init.kaiming_normal_(
        torch.empty(5146, 512), mode='fan_in', nonlinearity='relu')
    weights['rnn.1.embedding.bias'] = torch.rand(5146)
    torch.save(weights, save_to)

if __name__ == '__main__':
    change_shape_of_coco_wt_lstm()
19. Three ways to build a network
(1) The most common way: concise and clear.
(2) Quick construction with an nn.Sequential() container; layers are added to the container in order. The drawback is that layers get default integer names, which are hard to tell apart. (Minimal sketches of methods 1 and 2 are shown below.)
(3) An improvement on the second method: add each layer with add_module() and give every layer its own name.
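Only the Method 3 code survived in this post, so here are minimal sketches of methods 1 and 2 for comparison; the layer sizes mirror the Method 3 example and are otherwise arbitrary:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Method 1: define layers as attributes, wire them up in forward
class Net1(nn.Module):
    def __init__(self):
        super(Net1, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, 1, 1)
        self.dense1 = nn.Linear(32 * 3 * 3, 128)
        self.dense2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = x.view(x.size(0), -1)
        x = F.relu(self.dense1(x))
        return self.dense2(x)

# Method 2: nn.Sequential with default integer layer names
class Net2(nn.Module):
    def __init__(self):
        super(Net2, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, 1, 1),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.dense = nn.Sequential(
            nn.Linear(32 * 3 * 3, 128),
            nn.ReLU(),
            nn.Linear(128, 10))

    def forward(self, x):
        conv_out = self.conv(x)
        res = conv_out.view(conv_out.size(0), -1)
        return self.dense(res)

print(Net1())
print(Net2())   # layers inside the containers are named 0, 1, 2, ...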
# Method 3 -------------------------------
class Net3(nn.Module):
    def __init__(self):
        super(Net3, self).__init__()
        self.conv = nn.Sequential()
        self.conv.add_module("conv1", nn.Conv2d(3, 32, 3, 1, 1))
        self.conv.add_module("relu1", nn.ReLU())
        self.conv.add_module("pool1", nn.MaxPool2d(2))
        self.dense = nn.Sequential()
        self.dense.add_module("dense1", nn.Linear(32 * 3 * 3, 128))
        self.dense.add_module("relu2", nn.ReLU())
        self.dense.add_module("dense2", nn.Linear(128, 10))

    def forward(self, x):
        conv_out = self.conv(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out

print("==========Method 3================")
model3 = Net3()
print(model3)
# (B, c, h, w)
input = torch.rand((32, 3, 6, 6))
print('input.size():', input.size())
output = model3(input)
print(output.size())

20. Inspecting each layer's output with torchsummary
import torch
from torchvision.models import resnet18
from torchsummary import summary

def check_output_size():
    model = resnet18()
    summary(model, (3, 224, 224))

if __name__ == '__main__':
    # test_resnet_18()
    check_output_size()

21. Gradient clipping
import torch.nn as nn

# Assuming model, data, target, loss_fn and optimizer are already defined
outputs = model(data)
loss = loss_fn(outputs, target)
optimizer.zero_grad()
loss.backward()
nn.utils.clip_grad_norm_(model.parameters(), max_norm=20, norm_type=2)
optimizer.step()

Parameters of nn.utils.clip_grad_norm_:
parameters – an iterable of tensors whose gradients will be normalized
max_norm – the maximum norm of the gradients
norm_type – the type of norm to use; defaults to L2
The idea (a manual sketch follows this list):
- First set a gradient threshold clip_gradient
- During the backward pass, compute the gradients of all parameters; instead of using them directly for the update, compute their L2 norm
- Compare the gradient L2 norm ||g|| with clip_gradient
- If the norm is larger, compute the scaling factor clip_gradient/||g||; the larger the gradient, the smaller the factor, which keeps the gradient nicely in range
- Finally multiply the gradients by the scaling factor to obtain the gradients actually used
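A minimal sketch of that procedure, clipping the total L2 norm by hand (the threshold value is arbitrary); this mirrors what nn.utils.clip_grad_norm_ does internally:

import torch

def clip_gradient(parameters, clip_norm):
    # Total L2 norm over all parameter gradients
    grads = [p.grad for p in parameters if p.grad is not None]
    total_norm = torch.norm(torch.stack([torch.norm(g) for g in grads]))
    # If the norm exceeds the threshold, scale every gradient down in place
    scale = clip_norm / (total_norm + 1e-6)
    if scale < 1:
        for g in grads:
            g.mul_(scale)
    return total_norm

w = torch.randn(5, requires_grad=True)
loss = (w * 100).sum()
loss.backward()
print('norm before:', clip_gradient([w], clip_norm=20))
print('norm after :', torch.norm(w.grad))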
22. Freezing layers
import torch.nn as nn
import torch.optim as optim
from torchvision.models import resnet18

def freeze_parameters():
    model = resnet18(pretrained=True)
    for name, value in model.named_parameters():
        print('name={}, value.requires_grad={}'.format(name, value.requires_grad))
    # Layers to freeze
    no_grad = ['conv1.weight', 'bn1.weight', 'bn1.bias']
    for name, value in model.named_parameters():
        if name in no_grad:
            value.requires_grad = False
        else:
            value.requires_grad = True
    print('================================')
    for name, value in model.named_parameters():
        print('name={}, value.requires_grad={}'.format(name, value.requires_grad))
    # Then define the optimizer over the trainable parameters only
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.01)
    # ...

if __name__ == '__main__':
    freeze_parameters()

23. nn.Conv2d output size (with dilated convolutions) and grouped convolutions
PyTorch 1.0 tutorial: torch.nn · PyTorch Chinese docs
1. Dilated convolution examples: conv_arithmetic/README.md at master · vdumoulin/conv_arithmetic · GitHub
The general form is (n - k + 2p)/s + 1, which holds when dilation is 1. With dilation d, the effective kernel size becomes d*(k - 1) + 1, so the output size is (n + 2p - d*(k - 1) - 1)/s + 1.
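A quick sanity check of that formula against nn.Conv2d, including a dilated case (sizes chosen arbitrarily):

import torch
import torch.nn as nn

def out_size(n, k, p, s, d=1):
    # (n + 2p - d*(k-1) - 1) / s + 1, floored
    return (n + 2 * p - d * (k - 1) - 1) // s + 1

x = torch.rand(1, 3, 32, 32)
for k, p, s, d in [(3, 1, 1, 1), (3, 1, 2, 1), (3, 2, 1, 2)]:
    conv = nn.Conv2d(3, 8, kernel_size=k, padding=p, stride=s, dilation=d)
    print(conv(x).shape[-1], out_size(32, k, p, s, d))   # the two should match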
2. Grouped convolutions
The extreme case of grouped convolution is depthwise (separable) convolution, where the number of groups equals the number of input channels.
nn.Conv2d(inchannles * expansion, inchannles * expansion, kernel_size=3,
          padding=1, stride=stride, groups=inchannles * expansion)

Summary of weight costs (a parameter-count check follows this list):
Standard convolution: c1*k*k*c2
Grouped convolution: (c1/g)*k*k*(c2/g)*g, i.e. 1/g of a standard convolution
Depthwise separable convolution: k*k*c1 + c1*c2; relative to standard convolution this is 1/c2 + 1/k², roughly 1/k² of the cost
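A small sketch verifying those parameter counts with nn.Conv2d (c1 = c2 = 32, k = 3, g = 4 are arbitrary; bias disabled so only kernel weights are counted):

import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

c1, c2, k, g = 32, 32, 3, 4
standard = nn.Conv2d(c1, c2, k, padding=1, bias=False)
grouped = nn.Conv2d(c1, c2, k, padding=1, groups=g, bias=False)
depthwise_sep = nn.Sequential(
    nn.Conv2d(c1, c1, k, padding=1, groups=c1, bias=False),   # depthwise: k*k*c1
    nn.Conv2d(c1, c2, 1, bias=False))                         # pointwise: c1*c2

print(n_params(standard))        # c1*k*k*c2      = 9216
print(n_params(grouped))         # standard / g   = 2304
print(n_params(depthwise_sep))   # k*k*c1 + c1*c2 = 1312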
24. Index lookup with index_select
x = torch.linspace(1, 12, steps=12).reshape(3, 4)
print('==x', x)
indices = torch.LongTensor([0, 2])
y = torch.index_select(x, 0, indices)   # select rows
print('==y', y)
z = torch.index_select(x, 1, indices)   # select columns
print('==z', z)
z = torch.index_select(y, 1, indices)   # select columns
print('==z', z)

25. Converting fully connected weights to convolutional weights
def decimate(tensor, m):
    """
    Decimate a tensor by a factor 'm', i.e. downsample by keeping every 'm'th value.
    This is used when we convert FC layers to equivalent Convolutional layers, BUT of a smaller size.
    :param tensor: tensor to be decimated
    :param m: list of decimation factors for each dimension of the tensor; None if not to be decimated along a dimension
    :return: decimated tensor
    """
    assert tensor.dim() == len(m)
    for d in range(tensor.dim()):
        if m[d] is not None:
            tensor = tensor.index_select(
                dim=d,
                index=torch.arange(start=0, end=tensor.size(d), step=m[d]).long())
            print('==tensor.shape:', tensor.shape)
    return tensor

def test_fc_conv():
    """fc (4096, 25088) --> conv (1024, 512, 3, 3)"""
    fc_weight_init = torch.rand(4096, 25088)
    fc_weight = fc_weight_init.reshape(4096, 512, 7, 7)
    m = [4, None, 3, 3]
    conv_weight = decimate(fc_weight, m)
    print('==conv_weight.shape', conv_weight.shape)

26. Model weight initialization
1. For methods one and two, see my other post: ResNet series + MobileNet v2 PyTorch implementation (CSDN blog).
class AuxiliaryConvolutions(nn.Module):
    """Extra conv layers added on top of the VGG base."""
    def __init__(self):
        super(AuxiliaryConvolutions, self).__init__()   # parent-class init
        self.conv8_1 = nn.Conv2d(1024, 256, kernel_size=1, stride=1)
        self.conv8_2 = nn.Conv2d(256, 512, kernel_size=3, stride=2, padding=1)
        self.conv9_1 = nn.Conv2d(512, 128, kernel_size=1, stride=1)
        self.conv9_2 = nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1)
        self.conv10_1 = nn.Conv2d(256, 128, kernel_size=1, stride=1)
        self.conv10_2 = nn.Conv2d(128, 256, kernel_size=3, stride=1)
        self.conv11_1 = nn.Conv2d(256, 128, kernel_size=1, stride=1)
        self.conv11_2 = nn.Conv2d(128, 256, kernel_size=3, stride=1)
        self.init_conv2d()

    def init_conv2d(self):
        for c in self.children():
            if isinstance(c, nn.Conv2d):
                nn.init.xavier_uniform_(c.weight)
                nn.init.constant_(c.bias, 0)

    def forward(self, input):
        out = F.relu(self.conv8_1(input))   # (B, 256, 19, 19)
        out = F.relu(self.conv8_2(out))     # (B, 512, 10, 10)
        conv8_2feats = out
        out = F.relu(self.conv9_1(out))     # (B, 128, 10, 10)
        out = F.relu(self.conv9_2(out))     # (B, 256, 5, 5)
        conv9_2feats = out
        out = F.relu(self.conv10_1(out))    # (B, 128, 5, 5)
        out = F.relu(self.conv10_2(out))    # (B, 256, 3, 3)
        conv10_2feats = out
        out = F.relu(self.conv11_1(out))    # (B, 128, 3, 3)
        out = F.relu(self.conv11_2(out))    # (B, 256, 1, 1)
        conv11_2feats = out
        # print(out.size())
        return conv8_2feats, conv9_2feats, conv10_2feats, conv11_2feats

def test_vgg_base():
    model = VGGbase()   # VGGbase comes from the original SSD code, not shown here
    x = torch.rand((10, 3, 300, 300))
    model(x)

def test_AUx_conv():
    model = AuxiliaryConvolutions()
    # (B, 1024, 19, 19)
    x = torch.rand((10, 1024, 19, 19))
    model(x)

27. Converting (cx, cy, w, h) to (xmin, ymin, xmax, ymax) with torch, vectorized for speed
def cxcy_to_xy(cxcy):
    """
    Convert bounding boxes from center-size coordinates (c_x, c_y, w, h) to boundary coordinates (x_min, y_min, x_max, y_max).
    :param cxcy: bounding boxes in center-size coordinates, a tensor of size (n_boxes, 4)
    :return: bounding boxes in boundary coordinates, a tensor of size (n_boxes, 4)
    """
    return torch.cat([cxcy[:, :2] - (cxcy[:, 2:] / 2),    # x_min, y_min
                      cxcy[:, :2] + (cxcy[:, 2:] / 2)], 1)  # x_max, y_max

cxcy = torch.tensor([[3, 3, 6, 6]])
res = cxcy_to_xy(cxcy)
print('==res', res)

28. Computing IoU with torch, vectorized for speed
def find_intersection(set_1, set_2):
    """
    Find the intersection of every box combination between two sets of boxes that are in boundary coordinates.
    :param set_1: set 1, a tensor of dimensions (n1, 4)
    :param set_2: set 2, a tensor of dimensions (n2, 4)
    :return: intersection of each of the boxes in set 1 with respect to each of the boxes in set 2, a tensor of dimensions (n1, n2)
    """
    # PyTorch auto-broadcasts singleton dimensions
    # print('set_1[:, :2].unsqueeze(1).shape', set_1[:, :2].unsqueeze(1).shape)
    # print('set_2[:, :2].unsqueeze(0).shape', set_2[:, :2].unsqueeze(0).shape)
    lower_bounds = torch.max(set_1[:, :2].unsqueeze(1), set_2[:, :2].unsqueeze(0))  # (n1, n2, 2)
    # print('lower_bounds', lower_bounds.shape)
    upper_bounds = torch.min(set_1[:, 2:].unsqueeze(1), set_2[:, 2:].unsqueeze(0))  # (n1, n2, 2)
    intersection_dims = torch.clamp(upper_bounds - lower_bounds, min=0)  # (n1, n2, 2)
    return intersection_dims[:, :, 0] * intersection_dims[:, :, 1]  # (n1, n2)

def find_jaccard_overlap(set_1, set_2):
    """
    Find the Jaccard Overlap (IoU) of every box combination between two sets of boxes that are in boundary coordinates.
    :param set_1: set 1, a tensor of dimensions (n1, 4)
    :param set_2: set 2, a tensor of dimensions (n2, 4)
    :return: Jaccard Overlap of each of the boxes in set 1 with respect to each of the boxes in set 2, a tensor of dimensions (n1, n2)
    """
    # Find intersections
    intersection = find_intersection(set_1, set_2)  # (n1, n2)
    # Find areas of each box in both sets
    areas_set_1 = (set_1[:, 2] - set_1[:, 0]) * (set_1[:, 3] - set_1[:, 1])  # (n1)
    areas_set_2 = (set_2[:, 2] - set_2[:, 0]) * (set_2[:, 3] - set_2[:, 1])  # (n2)
    # Find the union; PyTorch auto-broadcasts singleton dimensions
    union = areas_set_1.unsqueeze(1) + areas_set_2.unsqueeze(0) - intersection  # (n1, n2)
    return intersection / union  # (n1, n2)

objects = 3
box = torch.rand(objects, 4)
priors_xy = torch.rand(8732, 4)
iou = find_jaccard_overlap(box, priors_xy)
print('==iou.shape:', iou.shape)

29. Computing detection offsets with torch
g is the ground truth, p is the prediction.
# Returns the offsets
def cxcy_to_gcxgcy(cxcy, priors_cxcy):
    """
    Inputs: box [cx, cy, w, h] and priors_cxcy [cx, cy, w, h]
    :return: [dx, dy, dw, dh]
    """
    return torch.cat([(cxcy[:, :2] - priors_cxcy[:, :2]) / (priors_cxcy[:, 2:]),  # g_c_x, g_c_y
                      torch.log(cxcy[:, 2:] / priors_cxcy[:, 2:])], 1)  # g_w, g_h

cxcy = torch.rand((8732, 4))
priors_cxcy = torch.rand((8732, 4))
res = cxcy_to_gcxgcy(cxcy, priors_cxcy)
print('==res:', res.shape)

30. torch.nn.functional.unfold
Extracts sliding 3x3 local blocks from the input sample into rows, zero-padding elsewhere; this is used in CTPN.
import torch
from torch.nn import functional as f
import numpy as np

# x = torch.arange(0, 1 * 3 * 15 * 15).float()
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]).astype(np.float32)
x = torch.from_numpy(a)
x = x.view(1, 1, 3, 3)
print('===input x.shape:', x.shape)
print('==x', x)
height = x.shape[2]
# (h - k + 2p)/s + 1
x1 = f.unfold(x, kernel_size=3, dilation=1, stride=1, padding=1)
print('===x1.shape', x1.shape)
print('===x1', x1)
x1 = x1.reshape((x1.shape[0], x1.shape[1], height, -1))
print('===final x1.shape', x1.shape)

31. Generating one-hot vectors with torch.scatter
scatter(dim, index, src) writes the values from src into the destination at the positions given by index, along dimension dim.
# y = y.scatter(dim, index, src) means:
#   y[index[i][j]][j] = src[i][j]   # if dim == 0
#   y[i][index[i][j]] = src[i][j]   # if dim == 1
import torch

index = torch.tensor([[1], [2], [0], [3]])
onehot = torch.zeros(4, 4)
onehot.scatter_(1, index, 1)
print('==onehot:', onehot)
data = torch.tensor([1, 2, 3, 4, 5])
index = torch.tensor([0, 1, 4])
values = torch.tensor([-1, -2, -3, -4, -5])
data.scatter_(0, index, values)
print('==data:', data)
data = torch.zeros((4, 4)).float()
index = torch.tensor([[0, 1],
                      [2, 3],
                      [0, 3],
                      [1, 2]])
values = torch.arange(1, 9).float().view(4, 2)
print('===values:', values)
data.scatter_(1, index, values)
print('===data:', data)

32. Built-in one-hot
import torch
import torch.nn.functional as F

tensor = torch.arange(0, 5)
one_hot = F.one_hot(tensor)
print('==one_hot:', one_hot)

33. Interpolation with F.interpolate; usable for upsampling in U-Net
import torch.nn.functional as F

input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
print('==input:', input)
print('==input.shape:', input.shape)
x = F.interpolate(input, scale_factor=2, mode='nearest')
print(x)
x = F.interpolate(input, size=(4, 4), mode='nearest')
print(x)

Upsampling an image:
import cv2
import numpy as np
import torch.nn.functional as F
from torchvision.transforms.functional import to_tensor, to_pil_image

img = cv2.imread('./111.png')
new_img = to_pil_image(
    F.interpolate(to_tensor(img).unsqueeze(0),   # batch of size 1
                  mode="bilinear",
                  scale_factor=2.0,
                  align_corners=False).squeeze(0)  # remove batch dimension
)
print('==new_img.shape:', np.array(new_img).shape)
cv2.imwrite('./new_img.jpg', np.array(new_img))

34. nn.functional.binary_cross_entropy with OHEM
Basic usage:
import numpy as np
import torch
from torch import nn

res = torch.log(torch.tensor(np.exp(1)))
print('==res:', res)
gt = np.array([[0.]]).astype(np.float32)
pred = np.array([[1.]]).astype(np.float32)
pred = torch.from_numpy(pred)
gt = torch.from_numpy(gt)
loss = nn.functional.binary_cross_entropy(pred, gt, reduction='none')
print('===loss:', loss)
loss1 = -(1. * (torch.log(torch.tensor(0.) + 1e-12))
          + (1. - 1.) * (torch.log(torch.tensor(1.) - torch.tensor(0.) + 1e-12)))
print('===loss1:', loss1)

OHEM:
import numpy as np
import torch
from torch import nn

gt = np.array([[0, 0, 1],
               [1, 0, 0],
               [0, 0, 0]]).astype(np.float32)
pred = np.array([[1, 0, 1],
                 [0, 0, 0],
                 [0, 1, 0]]).astype(np.float32)
negative_ratio = 2
pred = torch.from_numpy(pred)
gt = torch.from_numpy(gt)
print('=====pred:', pred)
print('=====gt:', gt)
loss = nn.functional.binary_cross_entropy(pred, gt, reduction='none')
print('=====loss:', loss)
positive = (gt).byte()
negative = ((1 - gt)).byte()
print('==positive:', positive)
print('==negative:', negative)
positive_count = int(positive.float().sum())
negative_count = min(int(negative.float().sum()), int(positive_count * negative_ratio))
print('==positive_count:', positive_count)
print('==negative_count:', negative_count)
positive_loss = loss * positive.float()
negative_loss = loss * negative.float()
print('==positive_loss:', positive_loss)
print('==negative_loss:', negative_loss)
negative_loss, _ = negative_loss.view(-1).topk(negative_count)
print('==negative_loss:', negative_loss)
balance_loss = (positive_loss.sum() + negative_loss.sum()) / (positive_count + negative_count + 1e-8)
print('==balance_loss:', balance_loss)

35-1. Loss when multiple pixels share the same class
import torch
import torch.nn.functional as F

src_logits = torch.rand((2, 2, 5))   # (bs, cls, h*w)
target_classes = torch.tensor([[0, 1, 0, 1, 0],   # (bs, h*w)
                               [1, 1, 0, 0, 0]])
loss = F.cross_entropy(src_logits, target_classes)
print('==loss:', loss)
soft_x = F.softmax(src_logits, dim=1)
print('==soft_x:', soft_x)
log_soft_out = torch.log(soft_x)
loss = F.nll_loss(log_soft_out, target_classes)
print('==loss:', loss)

35-2. nn.Embedding: converting words to embedding vectors
import torch
import torch.nn as nn

# Get the embedding vector for each word
vocab_size = 6000   # vocabulary size
model_dim = 5       # embedding dimension per word
# Indices for two words
word_to_ix = {'hello': 0, 'world': 1}

embedding = nn.Embedding(vocab_size, model_dim)
hello_idx = torch.LongTensor([word_to_ix['hello']])
input = torch.LongTensor([[1, 2, 4, 5],
                          [4, 3, 2, 9]])
hello_embed = embedding(input)
print('====hello_embed:', hello_embed.shape)

36. Normalizing feature vectors to unit length
# Normalize feature vectors to unit norm
import torch

feature_list = []
epochs = 2
batch_size = 4
for i in range(epochs):
    feature = torch.rand(batch_size, 2)
    feature_list.append(feature)
feat = torch.cat(feature_list, 0)   # cat the list into features for all samples
print('==feat.shape:', feat.shape)
res = feat.norm(2, 1).unsqueeze(1)   # compute the norm
print('res:', res.shape)
res = feat.norm(2, 1).unsqueeze(1).repeat(1, 2)   # repeat twice along dim 1
print('==res:', res.shape)
feat = feat / res
print(feat)
print('==feat.shape:', feat.shape)

37. Computing distances with dist and cdist
import torch

x1 = torch.tensor([[1, 1],
                   [2, 2]]).float()
x2 = torch.tensor([[1, 3],
                   [2, 3]]).float()
res = torch.dist(x1, x2, p=2)    # Euclidean distance between corresponding elements
print('==res:', res)
res = torch.cdist(x1, x2, p=2)   # pairwise Euclidean distance between row vectors
print('==res:', res)

38. Computing KL divergence
p = np.array([0.4, 0.4, 0.2])
q = np.array([0.5, 0.1, 0.4])
kl_np = (p * np.log(p / q)).sum()
print('=kl_np:', kl_np)
p = torch.tensor([0.4, 0.4, 0.2])
q = torch.tensor([0.5, 0.1, 0.4])
kl_torch_F = F.kl_div(q.log(), p, reduction='sum')
print('==kl_torch_F:', kl_torch_F)
criterion = nn.KLDivLoss(reduction='sum')
kl_torch_nn = criterion(q.log(), p)
print('==kl_torch_nn:', kl_torch_nn)

39. unsqueeze: adding a dimension
import torch

ind = torch.tensor([[1, 2, 3],
                    [2, 3, 4]])
dim = 3
print(ind.unsqueeze(2))

40. expand: broadcasting a tensor to a new shape
import torch

ind = torch.tensor([[1, 2, 3],
                    [2, 3, 4]])
dim = 3
print(ind.unsqueeze(2))
ind = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), dim)
print('==ind.shape:', ind.shape)
print('==ind:', ind)

41. gather: picking values along dim according to index
b = torch.Tensor([[1, 2, 3],
                  [4, 5, 6]])
print(b)
index_2 = torch.LongTensor([[0, 1, 1],   # dim=0 means gathering along columns
                            [0, 0, 0]])
print('====dim=0', torch.gather(b, dim=0, index=index_2))
index_1 = torch.LongTensor([[0, 1],
                            [2, 0]])
print('====dim=1', torch.gather(b, dim=1, index=index_1))   # dim=1 means gathering along rows

Example 1: given each channel's top-k within a batch, find the batch-wide top-k:
def _gather_feat(feat, ind, mask=None):
    # feat: (bs, C*topk, 1)
    # ind: (bs, topk)
    print('===feat:', feat)
    print('===ind:', ind)
    dim = feat.size(2)
    ind = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), dim)   # (bs, topk, 1)
    print('===ind:', ind)
    feat = feat.gather(1, ind)   # (bs, topk, 1)
    print('===feat:', feat)
    if mask is not None:
        mask = mask.unsqueeze(2).expand_as(feat)
        feat = feat[mask]
        feat = feat.view(-1, dim)
    return feat

import torch

bs = 2
topk = 2
# (bs, c, topk): per-channel topk indices within a batch
topk_inds = torch.randint(1, 100, (bs, 3, topk))
# (bs, topk): batch-wide topk indices
topk_ind = torch.randint(1, 4, (bs, topk))
# Goal: use the batch-wide topk indices to locate each channel's topk index positions
print('===before topk_inds', topk_inds)
print('==before topk_ind', topk_ind)
topk_inds = _gather_feat(topk_inds.view(bs, -1, 1), topk_ind).view(bs, topk)
print('===after topk_inds:', topk_inds.shape)
print('===after topk_inds:', topk_inds)

Example 2: picking the corresponding output locations for regression:
import random
import torch

def _gather_feat(feat, ind):
    # feat: (bs, C*topk, 1)
    # ind: (bs, topk)
    dim = feat.size(-1)
    ind = ind.unsqueeze(len(ind.shape)).expand(*ind.shape, dim)   # (bs, topk, 1)
    print('===ind.shape:', ind.shape)
    print('===ind:', ind)
    feat = feat.gather(dim=1, index=ind)   # (bs, topk, 1)
    return feat

# Network predictions: regress only at the specified positions
bs = 2
w_h = 4
# (bs, objects, 2)
preds = torch.rand(bs, w_h, 2)
print('==before preds:', preds)
print('===before preds.shape:', preds.shape)

max_objs = 5
regress_index_list = []        # indices of the center points to regress, for later use with positives
regress_index_mask_list = []   # records of which boxes to regress
for i in range(bs):
    regress_index = torch.zeros(max_objs)
    regress_index_mask = torch.zeros(max_objs)
    for k in range(random.randint(1, 3)):
        regress_index[k] = random.randint(1, 3)
        regress_index_mask[k] = 1
    regress_index_list.append(regress_index)
    regress_index_mask_list.append(regress_index_mask)

regress_indexs = torch.stack(regress_index_list, dim=0).long()
regress_masks = torch.stack(regress_index_mask_list, dim=0)
print('===regress_indexs', regress_indexs)
print('===regress_indexs.shape', regress_indexs.shape)
print('===regress_masks:', regress_masks)
print('===regress_masks.shape', regress_masks.shape)

preds = _gather_feat(preds, regress_indexs)
print('==after preds:', preds)
print('===after preds.shape:', preds.shape)

regress_masks = regress_masks.unsqueeze(dim=2).expand_as(preds).float()
print('==regress_masks expand:', regress_masks)
real_need_pres = preds * regress_masks
print('==real_need_pres', real_need_pres)
print('-==real_need_pres.shape:', real_need_pres.shape)
42. Analyzing how nn.CrossEntropyLoss() computes the loss
By default it takes the mean, averaging over the positive samples that contribute to the loss; the difference from focal loss is that here only the positive samples participate in training.
import torch
import torch.nn as nn
import numpy as np
import torch.nn.functional as F

x_input = torch.rand(2, 3)   # random input
print('x_input:\n', x_input)
y_target = torch.tensor([1, 2])
y_one_hot = F.one_hot(y_target)
print('==y_one_hot:', y_one_hot)

crossentropyloss = nn.CrossEntropyLoss()
crossentropyloss_output = crossentropyloss(x_input, y_target)
print('=====torch loss:', crossentropyloss_output)

softmax_func = nn.Softmax(dim=1)
soft_output = softmax_func(x_input)
print('softmax_output:\n', soft_output)
# Take the log on top of softmax
logsoft_output = torch.log(soft_output)
print('logsoft_output:\n', logsoft_output)

# logsoftmax_func = nn.LogSoftmax(dim=1)
# logsoftmax_output = logsoftmax_func(x_input)
# print('logsoftmax_output:\n', logsoftmax_output)

multiply_softmax = (y_one_hot * logsoft_output).numpy()
print('==multiply_softmax:', multiply_softmax)
index_y, index_x = np.nonzero(multiply_softmax)
print(index_y, index_x)
sum_loss = []
for i in range(len(index_y)):
    sum_loss.append(multiply_softmax[index_y[i]][index_x[i]])
print('===self compute loss:', -sum(sum_loss) / len(sum_loss))

gts = y_one_hot
alpha = 0.25
beta = 2
cls_preds = soft_output
pos_inds = (gts == 1.0).float()
print('==pos_inds:', pos_inds)
neg_inds = (gts != 1.0).float()
print('===neg_inds:', neg_inds)
# pos_loss = -pos_inds * alpha * (1.0 - cls_preds) ** beta * torch.log(cls_preds)
# neg_loss = -neg_inds * (1 - alpha) * ((cls_preds) ** beta) * torch.log(1.0 - cls_preds)
pos_loss = -pos_inds * torch.log(cls_preds)
neg_loss = -neg_inds * torch.log(1.0 - cls_preds)
num_pos = pos_inds.float().sum()
print('==num_pos:', num_pos)
print('==pos_loss:', pos_loss)
# print('==neg_loss:', neg_loss)
pos_loss = pos_loss.sum()
print('=pos_loss / num_pos:', pos_loss / num_pos)
# neg_loss = neg_loss.sum()
# if num_pos == 0:
#     mean_batch_focal_loss = neg_loss
# else:
#     mean_batch_focal_loss = (pos_loss + neg_loss) / num_pos
# print('==mean_batch_focal_loss:', mean_batch_focal_loss)

43. torch.topk
a = torch.tensor([[1, 2, 3, 5],
                  [6, 4, 4, 6],
                  [3, 2, 1, 0]]).float()
topk = 2
topk_score, topk_ind = torch.topk(a, topk)
print('==topk_score:', topk_score)
print('==topk_ind:', topk_ind)
# (B, h*w)
scores = torch.tensor([[1, 2, 3, 5],
                       [6, 4, 4, 6],
                       [3, 2, 1, 0]]).float()
top_k = 2
scores, indexes = torch.topk(scores, top_k, dim=1, largest=True, sorted=True)   # (N, topk), sorted descending
print('===scores:', scores)
print('===indexes:', indexes)

Getting the scores and class ids of the top-k boxes:
# (B, h*w, cls), 4 classes
per_level_cls_head = torch.tensor([[[0.1, 0.2, 0.3, 1],
                                    [0.6, 0.4, 2, 0.5],
                                    [0.1, 0.2, 3, 0.5],
                                    [0.6, 0.4, 5, 0.6]],
                                   [[0.1, 0.2, 0.3, 6],
                                    [0.6, 0.4, 4, 0.5],
                                    [0.1, 0.2, 4, 0.5],
                                    [0.6, 6, 0.6, 0.6]],
                                   [[0.1, 0.2, 3, 0.6],
                                    [0.6, 2, 0.4, 0.5],
                                    [1, 0.2, 0.4, 0.5],
                                    [-0.6, 0, -0.6, -0.6]]]).float()
# (B, h*w) and (B, h*w)
scores, score_classes = torch.max(per_level_cls_head, dim=2)   # (N, h*w)
print('====scores:====', scores)
print('====score_classes:====', score_classes)

top_k = 2   # keep only the top two
scores, indexes = torch.topk(scores, top_k, dim=1, largest=True, sorted=True)   # (N, topk), sorted descending
print('===scores:', scores)
print('===indexes:', indexes)
score_classes = torch.gather(score_classes, 1, indexes)   # (N, topk)
print('==after score_classes:', score_classes)
repeat_indexs = indexes.unsqueeze(-1).repeat(1, 1, 4)
print('===repeat_indexs:', repeat_indexs)

Selecting the corresponding boxes by score:
min_score_threshold = 0.8
# (B, h*w, 4)
pred_bboxes = torch.tensor([[[0.1, 0.2, 0.3, 1],
                             [0.6, 0.4, 2, 0.5],
                             [0.1, 0.2, 3, 0.5]],
                            [[0.1, 0.2, 0.3, 6],
                             [0.6, 0.4, 4, 0.5],
                             [0.1, 0.2, 4, 0.5]],
                            [[0.1, 0.2, 3, 0.6],
                             [0.6, 2, 0.4, 0.5],
                             [-0.6, 0, -0.6, -0.6]]])
# (B, h*w)
scores = torch.tensor([[0.88, 0.9, 0.5],
                       [0.6, 0.9, 0.3],
                       [0.1, 0.4, 0.88]])
# (B, h*w)
score_classes = torch.tensor([[2, 3, 1],
                              [6, 5, 3],
                              [7, 8, 9]])
print('===scores > min_score_threshold===:', scores > min_score_threshold)
score_classes = score_classes[scores > min_score_threshold].float()
print('====score_classes:', score_classes)
pred_bboxes = pred_bboxes[scores > min_score_threshold].float()
print('===pred_bboxes===', pred_bboxes)

44. torch.meshgrid
Generating grid points:
import torch

hs = 3
ws = 2
print(torch.arange(hs))
grid_y, grid_x = torch.meshgrid([torch.arange(hs), torch.arange(ws)])
print('==grid_y', grid_y)
print('==grid_x:', grid_x)
grid_xy = torch.stack([grid_x, grid_y], dim=-1).float()
print('==grid_xy:', grid_xy)
print('==grid_xy.shape:', grid_xy.shape)
grid_xy = grid_xy.view(1, hs * ws, 2)
print('==grid_xy:', grid_xy)

Mapping a 4x4 feature map back to the original image:
import torch

hs = 4.
ws = 4.
stride = 2.
print(torch.arange(hs))
grid_y, grid_x = torch.meshgrid([torch.arange(hs) + 0.5, torch.arange(ws) + 0.5])
print('==grid_y', grid_y)
print('==grid_x:', grid_x)
grid_xy = torch.stack([grid_x, grid_y], dim=-1).float()
print('==grid_xy:', grid_xy)
print('==grid_xy.shape:', grid_xy.shape)
grid_xy *= stride
grid_xy = grid_xy.view(1, int(hs) * int(ws), 2)
print('==grid_xy:', grid_xy)

45. torch.max
Getting the max values and their indices:
a = torch.tensor([[1, 5, 62, 54],
                  [2, 6, 2, 6],
                  [2, 65, 2, 6]])
values1 = torch.max(a, dim=0).values
print('==values1:', values1)
values2 = torch.max(a, dim=0, keepdim=True).values
print('==values2:', values2)
indices = torch.max(a, dim=0).indices
print('==indices:', indices)

Example: taking the max along the channel dimension:
a = torch.rand((2, 3, 100, 100))
values1 = torch.max(a, dim=1, keepdim=True).values
print('==values1:', values1)
print(values1.shape)

46. nn.Dropout
When a model uses a dropout layer, only a fraction 1-p of the hidden units participates in training. At prediction time all hidden units participate, so the result would on average be 1/(1-p) times larger than during training. Hence, during training, the surviving weights are scaled up by 1/(1-p); this keeps the output scale unchanged and avoids any extra work at prediction time, which is more convenient.
Drop = nn.Dropout(0.8)
value = torch.tensor([[-1, 2, 1],
                      [3, 4, 3]]).float()
print('==Drop(value):', Drop(value))   # surviving entries are scaled to x / (1 - p)

47. nn.ReLU
Relu = nn.ReLU()
value = torch.tensor([[-1, 2, 1],
                      [3, 4, 3]]).float()
print('==Relu(value):', Relu(value))

48. nonzero: getting the indices of nonzero elements
final_sample_flag = torch.tensor([1, 0, 3, -3])
final_sample_flag = final_sample_flag > 0   # entries greater than 0 are positives
print('==(final_sample_flag == True):', (final_sample_flag == True))
print('==(final_sample_flag == True).nonzero():', (final_sample_flag == True).nonzero())
positive_index = (final_sample_flag == True).nonzero().squeeze(dim=-1)
print('==positive_index:', positive_index)

49. Understanding different kinds of slicing
positive_candidates = torch.tensor([[[-1, -2, -3, 0],
                                     [1, 2, 3, 0],
                                     [1, 2, -1, 0]],
                                    [[-1, -2, -3, 0],
                                     [1, 5, 3, 0],
                                     [1, 9, -1, 0]]])
candidate_indexes = (torch.linspace(1, positive_candidates.shape[0],
                                    positive_candidates.shape[0]) - 1).long()
print('===candidate_indexes:', candidate_indexes)
min_index = [1, 2]
final_candidate_reg_gts = positive_candidates[candidate_indexes, min_index, :]
print('===final_candidate_reg_gts:', final_candidate_reg_gts)
print('===positive_candidates[:, min_index, :]===', positive_candidates[:, min_index, :])

50-1. nn.Module.register_buffer
Sometimes you want to inject values into the network that are used in forward but do not need to be learned: for example a "weight" that scales the loss, or some fixed tensor that never changes but is used every time. For this, use nn.Module.register_buffer, which tells PyTorch to store the value inside the module and move it along with the module. If you initialize your module and then move it to the GPU, these values move automatically too. Moreover, if you save the module's state, the buffers are saved as well!
Once registered, these values can be accessed inside the forward function just like any other module attribute.
class ModuleWithCustomValues(nn.Module):
    def __init__(self, weights, alpha):
        super().__init__()
        self.register_buffer("weights", torch.tensor(weights))
        self.register_buffer("alpha", torch.tensor(alpha))

    def forward(self, x):
        print('===self.weights:', self.weights)
        print('===self.alpha:', self.alpha)
        return x * self.weights + self.alpha

ValueClass = ModuleWithCustomValues(weights=[1.0, 2.0], alpha=1e-4)
res = ValueClass(torch.tensor([1.23, 4.56]))
print('==res:', res)
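A quick check of the claims above, extending the example: buffers show up in state_dict() and move with the module, but are not parameters:

print(ValueClass.state_dict())        # contains 'weights' and 'alpha'
print(list(ValueClass.parameters()))  # empty: buffers are not parameters
if torch.cuda.is_available():
    ValueClass = ValueClass.cuda()
    print(ValueClass.weights.device)  # buffers moved to the GPU too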
50-2. Hook functions

A hook function's three parameters cannot be changed (the names are arbitrary); it is essentially an internal PyTorch callback.
# Hook function: its three parameters cannot be changed (names are arbitrary);
# essentially an internal PyTorch callback.
# module: the module object itself
# input: the input to the module's forward
# output: the output of the module's forward
def forward_hook_fn(module, input, output):
    print('weight:', module.weight.data)
    print('bias:', module.bias.data)
    print('input:', input)
    print('output:', output)

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.fc = nn.Linear(3, 1)
        self.fc.register_forward_hook(forward_hook_fn)
        # The original called mmcv's constant_init(self.fc, 1); the equivalent
        # direct initialization (weight = 1, bias = 0) is used here instead
        nn.init.constant_(self.fc.weight, 1)
        nn.init.constant_(self.fc.bias, 0)

    def forward(self, x):
        # print('===x.shape', x.shape)
        o = self.fc(x)
        return o

model = Model()
x = torch.Tensor([[0.0, 1.0, 2.0]])
y = model(x)

51. torch.masked_select
Sometimes you only need to compute on part of an input tensor; for example, you want to compute the loss only on entries that satisfy some condition. Use torch.masked_select for this; note that it can also be used when gradients are required.
data = torch.rand((3, 3)).requires_grad_()
print('==data:', data)
mask = data > data.mean()
print('==mask:', mask)
data1 = torch.masked_select(data, mask)
print('==data1:', data1)
data2 = data[mask]
print('==data2:', data2)

52. torch.where
x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0], requires_grad=True)
y = -x
condition_or_mask = x <= 3.0
res = torch.where(condition_or_mask, x, y)
print('=res:', res)

53. make_grid
Displaying images:
import cv2
import numpy as np
from torchvision.utils import make_grid
from torchvision.transforms.functional import to_tensor, to_pil_image

img = cv2.imread("./111.png")
img = to_pil_image(
    make_grid([to_tensor(i) for i in [img, img, img]],
              nrow=2,      # number of images in a single row
              padding=5))  # "frame" size
cv2.imwrite('./show_img.jpg', np.array(img))

54. Slicing a 2-D tensor into windows
window_size = 5
shift_size = 3
H = 15
W = 15
img_mask = torch.ones((H, W)) * 0   # 1 H W 1
h_slices = (slice(0, -window_size),
            slice(-window_size, -shift_size),
            slice(-shift_size, None))
w_slices = (slice(0, -window_size),
            slice(-window_size, -shift_size),
            slice(-shift_size, None))
cnt = 0
for h in h_slices:
    for w in w_slices:
        img_mask[h, w] = cnt
        cnt += 1
print('===img_mask:\n', img_mask)
x = img_mask
x = x.view(H // window_size, window_size, W // window_size, window_size)
print(x)
windows = x.permute(0, 2, 1, 3).contiguous().view(-1, window_size, window_size)
print('==windows.shape:', windows.shape)
print('==windows:', windows)

55. Getting relative positions within a 2x2 window
window_size = [2, 2]
coords_h = torch.arange(window_size[0])
coords_w = torch.arange(window_size[1])
coords = torch.stack(torch.meshgrid([coords_h, coords_w]))   # 2, Wh, Ww
print('==coords:', coords)
coords_flatten = torch.flatten(coords, 1)   # 2, Wh*Ww
print('===coords_flatten:', coords_flatten)
print('===coords_flatten[:, :, None]:', coords_flatten[:, :, None])
print('==coords_flatten[:, None, :]:', coords_flatten[:, None, :])
relative_coords = coords_flatten[:, :, None] - coords_flatten[:, None, :]   # 2, Wh*Ww, Wh*Ww
print('==relative_coords:', relative_coords)
relative_coords = relative_coords.permute(1, 2, 0).contiguous()   # Wh*Ww, Wh*Ww, 2
print('==relative_coords.shape:', relative_coords.shape)
print('==relative_coords:', relative_coords)

56. Turning a 3-D tensor into 4-D one-hot
def _expand_onehot_labels(labels, label_weights, target_shape, ignore_index):
    """Expand onehot labels to match the size of prediction."""
    bin_labels = labels.new_zeros(target_shape)
    valid_mask = (labels >= 0) & (labels != ignore_index)
    print('==valid_mask:', valid_mask)
    inds = torch.nonzero(valid_mask, as_tuple=True)
    print('==inds:', inds)
    if inds[0].numel() > 0:
        if labels.dim() == 3:
            bin_labels[inds[0], labels[valid_mask], inds[1], inds[2]] = 1
        else:
            bin_labels[inds[0], labels[valid_mask]] = 1
    valid_mask = valid_mask.unsqueeze(1).expand(target_shape).float()
    if label_weights is None:
        bin_label_weights = valid_mask
    else:
        bin_label_weights = label_weights.unsqueeze(1).expand(target_shape)
        bin_label_weights *= valid_mask
    return bin_labels, bin_label_weights

weight = None
ignore_index = 255
label = torch.tensor([[[255, 0, 1, 255],
                       [255, 2, 3, 255],
                       [255, 4, 5, 255]]])
pred = F.softmax(torch.rand((1, 10, 3, 4)), dim=1)
label, weight = _expand_onehot_labels(label, weight, pred.shape, ignore_index)
print('=label:', label)
print('=label.shape:', label.shape)
one_hot = label.transpose(1, 2).transpose(2, 3)
print(one_hot.shape)
print('==one_hot:', one_hot)
print('==weight.shape:', weight.shape)

57. Computing accuracy from multi-channel (4-D) outputs and 3-D ground truth
def accuracy(pred, target, topk=1, thresh=None):
    """Calculate accuracy according to the prediction and target.

    Args:
        pred (torch.Tensor): The model prediction, shape (N, num_class, ...)
        target (torch.Tensor): The target of each prediction, shape (N, , ...)
        topk (int | tuple[int], optional): If the predictions in ``topk``
            matches the target, the predictions will be regarded as
            correct ones. Defaults to 1.
        thresh (float, optional): If not None, predictions with scores under
            this threshold are considered incorrect. Default to None.

    Returns:
        float | tuple[float]: If the input ``topk`` is a single integer,
            the function will return a single float as accuracy. If
            ``topk`` is a tuple containing multiple integers, the
            function will return a tuple containing accuracies of
            each ``topk`` number.
    """
    assert isinstance(topk, (int, tuple))
    if isinstance(topk, int):
        topk = (topk, )
        return_single = True
    else:
        return_single = False
    maxk = max(topk)
    if pred.size(0) == 0:
        accu = [pred.new_tensor(0.) for i in range(len(topk))]
        return accu[0] if return_single else accu
    assert pred.ndim == target.ndim + 1
    assert pred.size(0) == target.size(0)
    assert maxk <= pred.size(1), \
        f'maxk {maxk} exceeds pred dimension {pred.size(1)}'
    pred_value, pred_label = pred.topk(maxk, dim=1)   # (b, 1, h, w)
    # transpose to shape (maxk, N, ...)
    pred_label = pred_label.transpose(0, 1)   # (1, b, h, w)
    print('==pred_label:', pred_label)
    print('=target.unsqueeze(0):', target.unsqueeze(0))
    correct = pred_label.eq(target.unsqueeze(0).expand_as(pred_label))
    if thresh is not None:
        # Only prediction values larger than thresh are counted as correct
        correct = correct & (pred_value > thresh).t()
    res = []
    for k in topk:
        correct_k = correct[:k].view(-1).float().sum(0, keepdim=True)
        res.append(correct_k.mul_(100.0 / target.numel()))
    return res[0] if return_single else res

pred = F.softmax(torch.rand((1, 10, 3, 4)), dim=1)
print('==pred.shape:', pred.shape)
# per-pixel class indices after collapsing the 10 channels
target = torch.tensor([[[1, 0, 1, 9],
                        [2, 2, 3, 8],
                        [6, 4, 5, 6]]])
acc = accuracy(pred, target, topk=1, thresh=None)
print('==acc:', acc)

58. F.nll_loss: usually applied to -log of softmax outputs
import torch.nn.functional as F

a = torch.tensor([[0, 1, 2, 3],
                  [5, 4, 5, 6]], dtype=torch.float32)
b = torch.tensor([0, 1])
# -(0 + 4) / 2 = -2
res = F.nll_loss(a, b)
print('==res:', res)

a = torch.tensor([[[0, 1],
                   [2, 3],
                   [5, 4],
                   [5, 6]],
                  [[0, 2],
                   [2, 3],
                   [3, 3],
                   [4, 6]]], dtype=torch.float32)
print('=a.shape:', a.shape)
print('==a.transpose(1, 2):', a.transpose(1, 2))
b = torch.tensor([[1, 1],
                  [0, 1]])
res = F.nll_loss(a, b)   # -(2 + 3 + 0 + 3) / 4 = -2
print('==res:', res)

59. transforms.Compose
import numpy as np
import cv2
from PIL import Image
from torchvision import transforms as trans

train_trans = trans.Compose([
    trans.ToTensor(),
    # trans.Resize((h, w)),
    # trans.Normalize(mean, std),
])
img = np.array([[[255, 255, 255],
                 [255, 255, 255],
                 [0, 0, 0]],
                [[255, 255, 255],
                 [255, 255, 255],
                 [0, 0, 0]]], dtype=np.uint)
print('==img.shape:', img.shape)
cv2.imwrite('./img.jpg', img)
img = cv2.imread('./img.jpg')
print('==img:', img)   # (h, w, c)
img = train_trans(Image.fromarray(img))   # (c, h, w)
print('==tensor img', img)

60. soft-argmax
# soft argmax
import numpy as np
import torch
import torch.nn.functional as F

heatmap_size = 10
heatmap1d = np.array([[1, 5, 5, 2, 0, 1, 0, 1, 3, 2],
                      [9, 6, 2, 8, 2, 1, 0, 1, 0, 2],
                      [3, 7, 9, 1, 0, 2, 1.3, 2.3, 0, 1]]).astype(np.float32)
print('==np.argmax(heatmap1d):', np.argmax(heatmap1d, axis=1))
heatmap1d = torch.from_numpy(heatmap1d)
heatmap1d = heatmap1d * 10   # multiply by 10 to sharpen the distribution
heatmap1d = F.softmax(heatmap1d, 1)
print('==heatmap1d:', heatmap1d)
accu = heatmap1d * torch.arange(heatmap_size, dtype=heatmap1d.dtype,
                                device=heatmap1d.device)[None, :]
print('==accu:', accu)
coord = accu.sum(dim=1)
print('==coord:', coord)

61. CAM
import os
import json
import numpy as np
import cv2
from PIL import Image
from torchvision import models, transforms
from torch.autograd import Variable
from torch.nn import functional as F

model_path = './torch_models'
os.environ['TORCH_HOME'] = model_path
os.makedirs(model_path, exist_ok=True)

# input image
# LABELS_URL = 'https://s3.amazonaws.com/outcome-blog/imagenet/labels.json'
# IMG_URL = 'http://media.mlive.com/news_impact/photo/9933031-large.jpg'
# Use a local image and a locally downloaded labels.json
# LABELS_PATH = "labels.json"

# Networks such as googlenet, resnet, densenet already use global average
# pooling at the end, so CAM can be used directly.
model_id = 1   # choose the network
if model_id == 1:
    net = models.squeezenet1_1(pretrained=True)
    finalconv_name = 'features'   # last conv layer of the network
elif model_id == 2:
    net = models.resnet18(pretrained=True)
    finalconv_name = 'layer4'
elif model_id == 3:
    net = models.densenet161(pretrained=True)
    finalconv_name = 'features'
net.eval()   # fixes parameters such as the norm statistics

# Grab the feature map of a specific layer
features_blobs = []
# print('=before net:', net)
def hook_feature(module, input, output):
    features_blobs.append(output.data.cpu().numpy())

finalconv_name = 'features'   # capture the output of this layer
net._modules.get(finalconv_name).register_forward_hook(hook_feature)
print('=after net:', net)

# Get the softmax weights
params = list(net.parameters())   # parameters as a list
# print('==params[-2].shape:', params[-2].shape)   # (1000, 512, 1, 1)
weight_softmax = np.squeeze(params[-2].data.numpy())   # weights of the softmax layer

def returnCAM(feature_conv, weight_softmax, class_idx):
    # Generate the class activation maps, upsampled to 256x256
    size_upsample = (256, 256)
    # e.g. 1, 512, 13, 13
    bz, nc, h, w = feature_conv.shape
    output_cam = []
    # class_idx holds the predicted class ids; N objects -> N entries
    for idx in class_idx:
        # Multiply the class-idx softmax weights w by the feature map
        # (reshaped so they can be multiplied)
        cam = weight_softmax[idx].dot(feature_conv.reshape((nc, h * w)))   # 512 x (13*13)
        cam = cam.reshape(h, w)   # back to (13, 13)
        # Normalize to [0, 1]
        cam = cam - np.min(cam)
        cam_img = cam / np.max(cam)
        # Convert to 0-255 image data
        cam_img = np.uint8(255 * cam_img)
        # Resize to match the input image
        output_cam.append(cv2.resize(cam_img, size_upsample))
    return output_cam

# Preprocessing: resize to 224x224, convert to tensor, then normalize
normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    normalize
])

img_pil = Image.open('./cam.png')
img_pil.save('test.jpg')
img_tensor = preprocess(img_pil)
img_variable = Variable(img_tensor.unsqueeze(0))
# Run the image through the network to get class scores
logit = net(img_variable)

# Class labels: either load the ImageNet labels.json, or just use numeric ids
# with open(LABELS_PATH) as f:
#     data = json.load(f).items()
#     classes = {int(key): value for (key, value) in data}
classes = {i: str(i) for i in range(0, 1000)}
# Score with softmax
h_x = F.softmax(logit, dim=1).data.squeeze()
# Sort the class scores; get values and positions
probs, idx = h_x.sort(0, True)
probs = probs.numpy()
idx = idx.numpy()
# Print the top-5 predictions and their class names
for i in range(0, 5):
    print('{:.3f} -> {}'.format(probs[i], classes[idx[i]]))

# Generate class activation mapping for the top-1 prediction;
# output a CAM image matching the input size
for i in range(len(features_blobs)):
    print('==features_blobs[{}].shape={}:'.format(i, features_blobs[i].shape))
CAMs = returnCAM(features_blobs[0], weight_softmax, [idx[0]])
print('output CAM.jpg for the top1 prediction: %s' % classes[idx[0]])
# Overlay the CAM on the image to show the localization result
img = cv2.imread('test.jpg')
height, width, _ = img.shape
# Generate the heatmap
heatmap = cv2.applyColorMap(cv2.resize(CAMs[0], (width, height)), cv2.COLORMAP_JET)
result = heatmap * 0.3 + img * 0.5
cv2.imwrite('CAM.jpg', result)

II. Example: training on the CIFAR-10 dataset
import os
import cv2
import numpy as np
import torch as t
import torchvision as tv
import torchvision.transforms as transforms
from torchvision.transforms import ToPILImage

show = ToPILImage()   # converts a Tensor to an Image for visualization

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),   # to Tensor, scaled to [0, 1]
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),   # normalize
])
path = './data'
if not os.path.exists(path):
    os.mkdir(path)
# Training set
trainset = tv.datasets.CIFAR10(
    root=path,
    train=True,
    download=True,
    transform=transform)
trainloader = t.utils.data.DataLoader(
    trainset,
    batch_size=4,
    shuffle=True,
    num_workers=2)
# Test set
testset = tv.datasets.CIFAR10(
    path,
    train=False,
    download=True,
    transform=transform)
testloader = t.utils.data.DataLoader(
    testset,
    batch_size=4,
    shuffle=False,
    num_workers=2)
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
(data, label) = trainset[100]
print(data.shape)
print(classes[label])

def vis_data_cv2():
    new_data = data.numpy()
    new_data = (new_data * 0.5 + 0.5) * 255
    print(new_data.shape)
    new_data = new_data.transpose((1, 2, 0))
    new_data = cv2.resize(new_data, (100, 100))
    new_data = cv2.cvtColor(new_data, cv2.COLOR_RGB2BGR)
    print(new_data.shape)
    cv2.imwrite('1.jpg', new_data)

def vis_data_mutilpy():
    dataiter = iter(trainloader)
    images, labels = dataiter.next()   # returns 4 images and labels
    print(' '.join('%11s' % classes[labels[j]] for j in range(4)))
    img = show(tv.utils.make_grid((images + 1) / 2)).resize((400, 100))
    img = np.array(img)
    img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
    print(img.shape)
    cv2.imwrite('2.jpg', img)

if __name__ == '__main__':
    # vis_data_cv2()
    vis_data_mutilpy()

The visualizations above are saved as 1.jpg and 2.jpg (the original figures are not reproduced here).
The LeNet model is constructed as follows:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(x.size()[0], -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        # x = F.softmax(self.fc3(x), dim=1)
        return x

net = Net()
print(net)
for name, parameters in net.named_parameters():
    print(name, ':', parameters.size())
params = list(net.parameters())
print(len(params))
print('params=', params)

from torch import optim

criterion = nn.CrossEntropyLoss()   # cross-entropy loss
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

t.set_num_threads(8)
for epoch in range(1):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        if i < 50:
            # print(len(data))
            # print(data[0].size())
            # input data
            inputs, labels = data
            # zero the gradients
            optimizer.zero_grad()
            # forward + backward
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            # print('loss=', loss)
            loss.backward()
            # update the parameters
            optimizer.step()
            # Logging: loss is a scalar; use loss.item(), not loss[0]
            running_loss += loss.item()
            if i % 100 == 0:   # print training status every 100 batches
                print('[%d, %5d] loss: %.3f'
                      % (epoch + 1, i + 1, running_loss / 100))
                running_loss = 0.0
print('Finished Training')

correct = 0   # number of correctly predicted images
total = 0     # total number of images
# No gradients are needed at test time; disabling autograd saves time and memory
with t.no_grad():
    for i, data in enumerate(testloader):
        images, labels = data
        outputs = net(images)
        _, predicted = t.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum()
print('Accuracy on the 10000 test images: %d %%' % (100 * correct / total))

III. Counting model size and parameters with PyTorch-OpCounter
GitHub - Lyken17/pytorch-OpCounter: Count the MACs / FLOPs of your PyTorch model.
import torch
from torchvision.models import resnet50
from thop import profile

model = resnet50()
input = torch.randn(1, 3, 224, 224)
flops, params = profile(model, inputs=(input, ))
print('flops=', flops)
print('params=', params)

FLOPs computation
For an element-wise sum, adding two tensors of size (N, C, H, W) costs N x C x H x W operations. For a convolution (counting a multiply and an add as separate operations), the original post showed the formula as an image; the standard form is roughly N x OC x OH x OW x IC x KH x KW x 2.
Parameter count: OC*KH*KW*IC
Memory access: memory traffic is usually measured in bytes (or KB/MB/GB), i.e. how many bytes of data the model must read and write during computation.
For an element-wise sum of two (N, C, H, W) tensors, the memory traffic is (2 + 1) x N x C x H x W x sizeof(data_type), where 2 accounts for reading the two tensors and 1 for writing the result. For a convolution, the original formula image is missing; the standard form is roughly (N x IC x IH x IW + OC x IC x KH x KW + N x OC x OH x OW) x sizeof(data_type), covering the input read, weight read, and output write.
Memory traffic is critical for model speed.
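A small sketch applying these formulas to a single conv layer (the sizes are arbitrary); FLOPs here count a multiply and an add as two operations, matching the convention above:

# Hypothetical conv layer: IC=64, OC=128, 3x3 kernel, 56x56 input/output, fp32
N, IC, OC, KH, KW = 1, 64, 128, 3, 3
IH = IW = OH = OW = 56
dtype_bytes = 4   # float32

params = OC * KH * KW * IC
flops = N * OC * OH * OW * IC * KH * KW * 2
mem_bytes = (N * IC * IH * IW + OC * IC * KH * KW + N * OC * OH * OW) * dtype_bytes

print('params: %d' % params)                  # 73,728
print('FLOPs : %.2f GFLOPs' % (flops / 1e9))  # ~0.46 GFLOPs
print('memory: %.2f MB' % (mem_bytes / 1e6))  # ~2.70 MB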
IV. Saving and loading models
""" torch: 0.4 """ import torch import matplotlib.pyplot as plt# torch.manual_seed(1) # reproducibledef train(x,y):# save net1model = torch.nn.Sequential(torch.nn.Linear(1, 10),torch.nn.ReLU(),torch.nn.Linear(10, 1))optimizer = torch.optim.SGD(model.parameters(), lr=0.5)loss_func = torch.nn.MSELoss()for t in range(100):prediction = model(x)loss = loss_func(prediction, y)optimizer.zero_grad()loss.backward()optimizer.step()# plot resultplt.title('train')plt.scatter(x.data.numpy(), y.data.numpy())plt.plot(x.data.numpy(), prediction.data.numpy(), 'r-', lw=5)plt.show()torch.save(model.state_dict(), 'model_params.pth') # save only the parameters def inference(x,y):# restore only the parametersmodel = torch.nn.Sequential(torch.nn.Linear(1, 10),torch.nn.ReLU(),torch.nn.Linear(10, 1))# copy net1's parameters into net3model.load_state_dict(torch.load('model_params.pth'))prediction = model(x)plt.title('inference')plt.scatter(x.data.numpy(), y.data.numpy())plt.plot(x.data.numpy(), prediction.data.numpy(), 'r-', lw=5)plt.show() if __name__ == '__main__':# fake datax = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1) # x data (tensor), shape=(100, 1)y = x.pow(2) + 0.2 * torch.rand(x.size()) # noisy y data (tensor), shape=(100, 1)train(x,y)# inference(x,y)? ? ? ? ? ? ?
Character recognition code:
import os, sys, glob, shutil, json
os.environ["CUDA_VISIBLE_DEVICES"] = '0'
import cv2
from PIL import Image
import numpy as np
from tqdm import tqdm, tqdm_notebook
import torch

torch.manual_seed(0)
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.benchmark = True

import torchvision.models as models
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
from torch.utils.data.dataset import Dataset
from model import SVHN_Model1

class SVHNDataset(Dataset):
    def __init__(self, img_path, img_label, transform=None):
        self.img_path = img_path
        self.img_label = img_label
        if transform is not None:
            self.transform = transform
        else:
            self.transform = None

    def __getitem__(self, index):
        img = Image.open(self.img_path[index]).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)
        lbl = np.array(self.img_label[index], dtype=np.int)
        # In the original SVHN, class 10 means digit 0
        lbl = list(lbl) + (5 - len(lbl)) * [10]
        return img, torch.from_numpy(np.array(lbl[:5]))

    def __len__(self):
        return len(self.img_path)

def train_database():
    train_path = glob.glob('./data/mchar_train/*.png')
    train_path.sort()
    train_json = json.load(open('./data/train.json'))
    train_label = [train_json[x]['label'] for x in train_json]
    print('=len(train_path):', len(train_path), len(train_label))
    print('==train_label[:3]:', train_label[:3])
    train_loader = torch.utils.data.DataLoader(
        SVHNDataset(train_path, train_label,
                    transforms.Compose([
                        transforms.Resize((64, 128)),
                        transforms.RandomCrop((60, 120)),
                        transforms.ColorJitter(0.3, 0.3, 0.2),
                        transforms.RandomRotation(10),
                        transforms.ToTensor(),
                        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                    ])),
        batch_size=64,
        shuffle=True)
    return train_loader

def val_database():
    val_path = glob.glob('./data/mchar_val/*.png')
    val_path.sort()
    val_json = json.load(open('./data/val.json'))
    val_label = [val_json[x]['label'] for x in val_json]
    print(len(val_path), len(val_label))
    val_loader = torch.utils.data.DataLoader(
        SVHNDataset(val_path, val_label,
                    transforms.Compose([
                        transforms.Resize((60, 120)),
                        # transforms.ColorJitter(0.3, 0.3, 0.2),
                        # transforms.RandomRotation(5),
                        transforms.ToTensor(),
                        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
                    ])),
        batch_size=64,
        shuffle=False,
        num_workers=0)
    return val_loader

def train(train_loader, model, criterion, optimizer, epoch):
    # switch the model to training mode
    model.train()
    train_loss = []
    for i, (input, target) in enumerate(train_loader):
        if use_cuda:
            input = input.cuda()
            target = target.cuda()
        c0, c1, c2, c3, c4 = model(input)
        loss = criterion(c0, target[:, 0]) + \
               criterion(c1, target[:, 1]) + \
               criterion(c2, target[:, 2]) + \
               criterion(c3, target[:, 3]) + \
               criterion(c4, target[:, 4])
        # loss /= 6
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        train_loss.append(loss.item())
    return np.mean(train_loss)

def validate(val_loader, model, criterion):
    # switch the model to evaluation mode
    model.eval()
    val_loss = []
    # do not record gradient information
    with torch.no_grad():
        for i, (input, target) in enumerate(val_loader):
            if use_cuda:
                input = input.cuda()
                target = target.cuda()
            c0, c1, c2, c3, c4 = model(input)
            loss = criterion(c0, target[:, 0]) + \
                   criterion(c1, target[:, 1]) + \
                   criterion(c2, target[:, 2]) + \
                   criterion(c3, target[:, 3]) + \
                   criterion(c4, target[:, 4])
            # loss /= 6
            val_loss.append(loss.item())
    return np.mean(val_loss)

def predict(test_loader, model, tta=10):
    model.eval()
    test_pred_tta = None
    # number of TTA rounds
    for _ in range(tta):
        test_pred = []
        with torch.no_grad():
            for i, (input, target) in enumerate(test_loader):
                if use_cuda:
                    input = input.cuda()
                c0, c1, c2, c3, c4 = model(input)
                if use_cuda:
                    output = np.concatenate([
                        c0.data.cpu().numpy(),
                        c1.data.cpu().numpy(),
                        c2.data.cpu().numpy(),
                        c3.data.cpu().numpy(),
                        c4.data.cpu().numpy()], axis=1)
                else:
                    output = np.concatenate([
                        c0.data.numpy(),
                        c1.data.numpy(),
                        c2.data.numpy(),
                        c3.data.numpy(),
                        c4.data.numpy()], axis=1)
                test_pred.append(output)
        test_pred = np.vstack(test_pred)
        if test_pred_tta is None:
            test_pred_tta = test_pred
        else:
            test_pred_tta += test_pred
    return test_pred_tta

if __name__ == '__main__':
    train_loader = train_database()
    val_loader = val_database()
    model = SVHN_Model1()
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), 0.001)
    best_loss = 1000.0
    use_cuda = True
    if use_cuda:
        model = model.cuda()
    for epoch in range(100):
        print('====start train, epoch={}'.format(epoch + 1))
        train_loss = train(train_loader, model, criterion, optimizer, epoch)
        val_loss = validate(val_loader, model, criterion)
        val_label = [''.join(map(str, x)) for x in val_loader.dataset.img_label]
        val_predict_label = predict(val_loader, model, 1)
        val_predict_label = np.vstack([
            val_predict_label[:, :11].argmax(1),
            val_predict_label[:, 11:22].argmax(1),
            val_predict_label[:, 22:33].argmax(1),
            val_predict_label[:, 33:44].argmax(1),
            val_predict_label[:, 44:55].argmax(1),
        ]).T
        val_label_pred = []
        for x in val_predict_label:
            val_label_pred.append(''.join(map(str, x[x != 10])))
        val_char_acc = np.mean(np.array(val_label_pred) == np.array(val_label))
        print('Epoch: {0}, Train loss: {1} \t Val loss: {2}'.format(epoch, train_loss, val_loss))
        print('Val Acc', val_char_acc)
        # track the best validation loss
        if val_loss < best_loss:
            best_loss = val_loss
            # print('Find better model in Epoch {0}, saving model.'.format(epoch))
            torch.save(model.state_dict(), './model.pt')

References:
GitHub - chenyuntc/pytorch-book: PyTorch tutorials and fun projects including neural talk, neural style, poem writing, anime generation (《深度學習框架PyTorch:入門與實戰》)
PyTorch實戰指南 - Zhihu