cv2 and interpolate resizing: align_corners

Published: 2023/12/20

torch interpolate

torch.nn.functional.interpolate(input, size=None, scale_factor=None, mode='nearest', align_corners=None, recompute_scale_factor=None)

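input (Tensor): the input data.
size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int]): the output spatial size.
scale_factor (float or Tuple[float]): the multiplier applied to the spatial size.
mode (str): the sampling algorithm.
align_corners (bool, optional): geometrically, the input and output pixels are treated as squares rather than points. If True, the input and output tensors are aligned by the center points of their corner pixels, which preserves the values at the corner pixels. If False, they are aligned by the corner points of their corner pixels, and the interpolation uses edge-value padding for out-of-boundary values, which makes the operation independent of the input size when scale_factor is kept the same. This option only applies when mode is 'linear', 'bilinear', 'bicubic' or 'trilinear'. The default in the signature is None, which behaves like False.

Corner pixels: the pixels at the four corners of the image.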

Notes:

Only one of scale_factor and size may be set.

When scale_factor is set, the output size is rounded down. For example, with an input of size [2, 2] and scale_factor=2.1, the output size is floor([4.2, 4.2]) = [4, 4].
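The rounding rule can be checked with a few lines of plain Python. This is only a sketch of the arithmetic described above; `output_size` is a hypothetical helper name, not a PyTorch function.

```python
import math

def output_size(input_size, scale_factor):
    """Floor(input * scale_factor) for each spatial dimension."""
    return [math.floor(n * scale_factor) for n in input_size]

print(output_size([2, 2], 2.1))  # [4, 4]
```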

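When scale_factor is set and recompute_scale_factor is also set, scale_factor is recomputed from the actual output size.

The reason to use scale_factor rather than size is that scale_factor does not hard-code the output dimensions, whereas size fixes them, which causes problems when the inputs come in multiple resolutions.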

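In more detail:

input: the input Tensor.
size: the spatial size of the output Tensor after interpolation. The spatial size is what remains after removing the batch and channel dimensions; for an NCHW tensor the spatial size is HW.
scale_factor (float or Tuple[float]): the multiplier for the spatial size. If it is a tuple, its length must match the number of spatial dimensions of the input.
mode (str): the sampling mode, one of 'nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area'. The default is 'nearest'.
align_corners (bool): geometrically, the input and output pixels are treated as squares rather than points. If True, the input and output tensors are aligned by the center points of their corner pixels, preserving the values at the corner pixels. If False, they are aligned by the corner points of their corner pixels, and out-of-boundary values are handled by edge-value padding, so that the operation is independent of the input size when scale_factor is kept the same. This only has an effect when mode is 'linear' | 'bilinear' | 'bicubic' | 'trilinear'. It behaves as False by default.
recompute_scale_factor (bool): recompute the scale_factor used in the interpolation calculation. When scale_factor is passed as an argument, it is used to compute the output size. If recompute_scale_factor is False or not specified, the scale_factor that was passed in is used in the interpolation calculation. Otherwise, a new scale_factor is computed from the output and input sizes and used instead (that is, the result is the same as passing the output size explicitly). Note that when scale_factor is a float, the recomputed scale_factor may differ from the one passed in because of rounding and precision issues.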
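To make the recompute_scale_factor behavior concrete, here is a small model of it in plain Python. It is a sketch of the behavior described above, not PyTorch's code; `effective_scale` is a hypothetical name introduced for illustration.

```python
import math

def effective_scale(input_size, scale_factor, recompute_scale_factor=False):
    """Return (output_size, scale used for the coordinate mapping), 1-D case."""
    out = math.floor(input_size * scale_factor)
    if recompute_scale_factor:
        # Same as if the output size had been passed explicitly.
        return out, out / input_size
    return out, scale_factor

print(effective_scale(2, 2.1))        # (4, 2.1)
print(effective_scale(2, 2.1, True))  # (4, 2.0)
```

Both calls give an output of length 4, but the sampling positions differ because the scale used for the mapping is 2.1 in the first case and 2.0 in the second.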

Effect of opset_version on ONNX export:

In opset 9 and 10 the operator is Upsample, while in opset 11 it becomes Resize.

Support for interpolate under each opset version:

F.interpolate	nearest	bilinear, align_corners=False	bilinear, align_corners=True	bicubic
op-9	Y	Y	N	N
op-10	Y	Y	N	N
op-11	Y	Y	Y	Y

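How align_corners behaves:

If True, the input and output tensors are aligned by the center points of their corner pixels, which preserves the values at the corner pixels. If False, they are aligned by the corner points of their corner pixels, and out-of-boundary values are filled with edge values.

When align_corners = True, pixels are treated as points on a grid and the corner pixels of input and output coincide, so the sample points between them are equally spaced.
When align_corners = False, pixels are treated as square cells with their values at the cell centers. The output corners still take the values of the input corner pixels, because sample positions that fall outside the boundary are clamped to the edge, so the output values near the border are not equally spaced.

OpenCV and PIL behave like align_corners=False, MXNet behaves like align_corners=True, and torch and TensorFlow let you choose.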

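First, align_corners=False, which is the default for interpolate in PyTorch. Under this setting, each pixel value is taken to sit at the center of its pixel cell:

[Figure: 3×3 input with values at the pixel centers]

Upsampling it by a factor of two gives:

[Figure: 6×6 output]

Look first at the pixels inside the green box: they follow the definition of bilinear interpolation exactly. The four corner points keep the values of the original image. The points along the edges, whose source positions fall outside the boundary, are filled in from the edge values. Viewed as a whole, the interior and the border therefore follow rather different rules.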

    # align_corners = False
    # x_ori is the coordinate in the original image
    # x_up is the coordinate in the upsampled image
    x_ori = (x_up + 0.5) / factor - 0.5
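As a quick check of this mapping, the following pure-Python sketch applies it in 1-D with linear interpolation and clamps positions that fall outside the input. The function names are made up for illustration and the code is not taken from PyTorch.

```python
def src_coord_half_pixel(x_up, factor):
    """Map an output index to a source coordinate (align_corners=False)."""
    return (x_up + 0.5) / factor - 0.5

def upsample_1d(values, factor):
    """1-D linear upsampling by an integer factor with edge clamping."""
    n_in = len(values)
    out = []
    for x_up in range(n_in * factor):
        x = src_coord_half_pixel(x_up, factor)
        x = min(max(x, 0.0), n_in - 1)   # positions outside the input use the edge value
        x0 = int(x)                      # left neighbor
        x1 = min(x0 + 1, n_in - 1)       # right neighbor
        w = x - x0                       # weight of the right neighbor
        out.append((1 - w) * values[x0] + w * values[x1])
    return out

print(upsample_1d([1.0, 2.0], 2))  # [1.0, 1.25, 1.75, 2.0]
```

The steps between neighboring outputs are 0.25, 0.5, 0.25: the interior follows the linear rule, while the two ends are pinned to the edge values.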

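Next, consider align_corners=True, drawing the upsampling result in the same way:

[Figure: 5×5 output drawn with values at pixel centers]

The pixels here do not line up at all, which is painful to look at. In fact, under the align_corners=True view this way of drawing is wrong: in that view, pixel values sit on the grid points:

[Figure: input with values on the grid points]

Upsampling it by a factor of two then gives:

[Figure: upsampled output with values on the grid points]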
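In the align_corners=True view, the first and last samples of input and output coincide and everything in between is evenly spaced. A minimal 1-D sketch in plain Python (the function names are hypothetical, and this is not PyTorch's implementation):

```python
def src_coord_align_corners(x_up, n_in, n_out):
    """Map an output index to a source coordinate (align_corners=True)."""
    if n_out == 1:
        return 0.0
    return x_up * (n_in - 1) / (n_out - 1)

def upsample_1d_aligned(values, n_out):
    """1-D linear resize where the end points of input and output coincide."""
    n_in = len(values)
    out = []
    for x_up in range(n_out):
        x = src_coord_align_corners(x_up, n_in, n_out)
        x0 = int(x)                      # left neighbor
        x1 = min(x0 + 1, n_in - 1)       # right neighbor
        w = x - x0                       # weight of the right neighbor
        out.append((1 - w) * values[x0] + w * values[x1])
    return out

print([round(v, 4) for v in upsample_1d_aligned([1.0, 2.0], 4)])
# [1.0, 1.3333, 1.6667, 2.0]
```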

1. align_corners experiment (1×1 → 1×4)

    import torch
    import torch.nn.functional as F

    s = [4.]  # the function needs floating-point input, hence the decimal point
    s = torch.tensor(s).reshape(1, 1, 1, 1)  # shape (N, C, H, W) = (1, 1, 1, 1)
    print(s)
    # tensor([[[[4.]]]])

    x = F.interpolate(s, size=(1, 4), mode='bilinear', align_corners=False)
    print(x)
    # tensor([[[[4., 4., 4., 4.]]]])

    x = F.interpolate(s, size=(32, 32), mode='bilinear', align_corners=False)
    print(x)
    # tensor([[[[4., 4., 4., ..., 4., 4., 4.],
    #           [4., 4., 4., ..., 4., 4., 4.],
    #           [4., 4., 4., ..., 4., 4., 4.],
    #           ...,
    #           [4., 4., 4., ..., 4., 4., 4.],
    #           [4., 4., 4., ..., 4., 4., 4.],
    #           [4., 4., 4., ..., 4., 4., 4.]]]])

2. align_corners experiment (2×2 → 4×4)

    import torch
    import torch.nn.functional as F

    a = [[1., 2.], [4., 5.]]
    a = torch.tensor(a).reshape(1, 1, 2, 2)

    x = F.interpolate(a, scale_factor=2, mode='bilinear', align_corners=True)
    print(x)
    # tensor([[[[1.0000, 1.3333, 1.6667, 2.0000],
    #           [2.0000, 2.3333, 2.6667, 3.0000],
    #           [3.0000, 3.3333, 3.6667, 4.0000],
    #           [4.0000, 4.3333, 4.6667, 5.0000]]]])
    # Equally spaced: pixels are treated as points on the grid and the
    # corner pixels are aligned, so the points are equidistant.

    y = F.interpolate(a, scale_factor=2, mode='bilinear', align_corners=False)
    print(y)
    # tensor([[[[1.0000, 1.2500, 1.7500, 2.0000],
    #           [1.7500, 2.0000, 2.5000, 2.7500],
    #           [3.2500, 3.5000, 4.0000, 4.2500],
    #           [4.0000, 4.2500, 4.7500, 5.0000]]]])
    # Not equally spaced.

3. align_corners experiment (3×3 → 6×6)

    import torch
    import torch.nn.functional as F

    a = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]
    a = torch.tensor(a).reshape(1, 1, 3, 3)
    print(a)
    # tensor([[[[1., 2., 3.],
    #           [4., 5., 6.],
    #           [7., 8., 9.]]]])

    # ************* equivalent way to build the tensor *************
    r = torch.arange(1, 10, dtype=torch.float32).view(1, 1, 3, 3)
    print(r)
    # tensor([[[[1., 2., 3.],
    #           [4., 5., 6.],
    #           [7., 8., 9.]]]])
    # **************************************************************

    x = F.interpolate(a, scale_factor=2, mode='bilinear', align_corners=True)
    print(x)
    # tensor([[[[1.0000, 1.4000, 1.8000, 2.2000, 2.6000, 3.0000],
    #           [2.2000, 2.6000, 3.0000, 3.4000, 3.8000, 4.2000],
    #           [3.4000, 3.8000, 4.2000, 4.6000, 5.0000, 5.4000],
    #           [4.6000, 5.0000, 5.4000, 5.8000, 6.2000, 6.6000],
    #           [5.8000, 6.2000, 6.6000, 7.0000, 7.4000, 7.8000],
    #           [7.0000, 7.4000, 7.8000, 8.2000, 8.6000, 9.0000]]]])
    # Equally spaced.

    y = F.interpolate(a, scale_factor=2, mode='bilinear', align_corners=False)
    print(y)
    # tensor([[[[1.0000, 1.2500, 1.7500, 2.2500, 2.7500, 3.0000],
    #           [1.7500, 2.0000, 2.5000, 3.0000, 3.5000, 3.7500],
    #           [3.2500, 3.5000, 4.0000, 4.5000, 5.0000, 5.2500],
    #           [4.7500, 5.0000, 5.5000, 6.0000, 6.5000, 6.7500],
    #           [6.2500, 6.5000, 7.0000, 7.5000, 8.0000, 8.2500],
    #           [7.0000, 7.2500, 7.7500, 8.2500, 8.7500, 9.0000]]]])
    # Not equally spaced.
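To see where the numbers in the 3×3 experiment come from, here is a small reference implementation of bilinear resizing in pure Python. It is a sketch that assumes the two coordinate mappings discussed in this post, with edge clamping for align_corners=False. The names `source_index` and `bilinear_resize` are hypothetical and the code is not PyTorch's.

```python
def source_index(dst, n_in, n_out, align_corners):
    """Source coordinate for output index dst along one axis."""
    if align_corners:
        return dst * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
    x = (dst + 0.5) * n_in / n_out - 0.5
    return min(max(x, 0.0), n_in - 1.0)  # clamp to the valid range

def bilinear_resize(img, out_h, out_w, align_corners):
    """Resize a 2-D list of floats with bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        y = source_index(i, in_h, out_h, align_corners)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = source_index(j, in_w, out_w, align_corners)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = (1 - wx) * img[y0][x0] + wx * img[y0][x1]
            bottom = (1 - wx) * img[y1][x0] + wx * img[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out

a = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]
for flag in (True, False):
    print("align_corners =", flag)
    for row in bilinear_resize(a, 6, 6, flag):
        print([round(v, 4) for v in row])
```

With align_corners=True the first row comes out as 1.0, 1.4, 1.8, 2.2, 2.6, 3.0, and with align_corners=False as 1.0, 1.25, 1.75, 2.25, 2.75, 3.0, matching the first rows of the tensors printed in the experiment.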

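Additional notes:

Because bilinear interpolation of an image only uses the 4 neighboring pixels, the denominators in the bilinear interpolation formula are all 1. OpenCV's implementation applies some optimizations, such as replacing floating-point arithmetic with integer arithmetic (the weights are multiplied by 2048, which turns 11 fractional bits into an integer, and because two weights are multiplied together the result is shifted right by 22 bits), and aligning the geometric centers of the source and destination images:
- SrcX = (dstX + 0.5) * (srcWidth / dstWidth) - 0.5
- SrcY = (dstY + 0.5) * (srcHeight / dstHeight) - 0.5

This point deserves emphasis. The origin (0, 0) of both the source and the destination image is the top-left corner, and each destination pixel is computed from the interpolation formula. Suppose you need to shrink a 5x5 image to 3x3. Without center alignment, the basic formula maps the destination pixels onto the top-left part of the source image. With center alignment, they are spread symmetrically about the center of the source image.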
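The correspondence for the 5 → 3 case can be tabulated along one axis with a short sketch. Plain Python; both function names are made up for illustration.

```python
def naive_src(dst, n_src, n_dst):
    """Basic mapping without center alignment."""
    return dst * n_src / n_dst

def centered_src(dst, n_src, n_dst):
    """Mapping that aligns the geometric centers of both images."""
    return (dst + 0.5) * n_src / n_dst - 0.5

for d in range(3):
    print(d, round(naive_src(d, 5, 3), 3), round(centered_src(d, 5, 3), 3))
# 0 0.0 0.333
# 1 1.667 2.0
# 2 3.333 3.667
```

The naive positions (0, 1.667, 3.333) never reach the last source pixel at index 4, while the centered positions (0.333, 2.0, 3.667) are symmetric about the middle pixel at index 2.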

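The original interpolation formula:

(Original) srcX = dstX * (srcW / dstW), e.g. srcX = 0 * (5/3) = 0

(Center-aligned) srcX = (dstX + 0.5) * (srcW / dstW) - 0.5, e.g. srcX = (0 + 0.5) * (5/3) - 0.5 = 1/3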
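The integer-arithmetic trick mentioned in these notes (scale the weights by 2048, shift right by 22 at the end) can be sketched in plain Python as follows. This is an illustration of the idea only and is not OpenCV's source. The rounding term added before the shift and all names are assumptions.

```python
COEF_BITS = 11
COEF_SCALE = 1 << COEF_BITS  # 2048

def bilinear_fixed_point(p00, p01, p10, p11, wx, wy):
    """Blend four integer pixel values using integer weights only."""
    ax1 = int(round(wx * COEF_SCALE))  # weight of the right column
    ax0 = COEF_SCALE - ax1             # weight of the left column
    ay1 = int(round(wy * COEF_SCALE))  # weight of the bottom row
    ay0 = COEF_SCALE - ay1             # weight of the top row
    acc = ay0 * (ax0 * p00 + ax1 * p01) + ay1 * (ax0 * p10 + ax1 * p11)
    # Two 11-bit weights were multiplied, so undo 22 bits (with rounding).
    return (acc + (1 << (2 * COEF_BITS - 1))) >> (2 * COEF_BITS)

def bilinear_float(p00, p01, p10, p11, wx, wy):
    """Floating-point reference for comparison."""
    top = (1 - wx) * p00 + wx * p01
    bottom = (1 - wx) * p10 + wx * p11
    return (1 - wy) * top + wy * bottom

print(bilinear_fixed_point(10, 20, 30, 40, 0.5, 0.5))  # 25
print(bilinear_float(10, 20, 30, 40, 0.5, 0.5))        # 25.0
```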

Notes on center-aligned resizing when designing convolutional network architectures

OpenCV resizes images with center alignment.
In PyTorch, mode='bilinear', align_corners=False matches OpenCV.
In PyTorch, mode='bilinear', align_corners=True matches TensorFlow with align_corners=True.

TensorFlow's resize_bilinear is not center-aligned. Its coordinates are computed as follows.

align_corners=False:

srcX = dstX * (srcWidth / dstWidth)
srcY = dstY * (srcHeight / dstHeight)

align_corners=True:

srcX = dstX * ((srcWidth - 1) / (dstWidth - 1))
srcY = dstY * ((srcHeight - 1) / (dstHeight - 1))
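The mappings in this section can be compared side by side for a 2 → 4 upsampling along one axis. The sketch below simply evaluates the formulas quoted here in plain Python. The function names are hypothetical and no TensorFlow, PyTorch or OpenCV call is involved.

```python
def tf_legacy_src(dst, n_src, n_dst, align_corners):
    """Formulas quoted above for TensorFlow's resize_bilinear."""
    if align_corners and n_dst > 1:
        return dst * (n_src - 1) / (n_dst - 1)
    return dst * n_src / n_dst

def half_pixel_src(dst, n_src, n_dst):
    """Center-aligned mapping (OpenCV, PyTorch align_corners=False)."""
    return (dst + 0.5) * n_src / n_dst - 0.5

n_src, n_dst = 2, 4
print([tf_legacy_src(d, n_src, n_dst, False) for d in range(n_dst)])
# [0.0, 0.5, 1.0, 1.5]
print([round(tf_legacy_src(d, n_src, n_dst, True), 4) for d in range(n_dst)])
# [0.0, 0.3333, 0.6667, 1.0]
print([half_pixel_src(d, n_src, n_dst) for d in range(n_dst)])
# [-0.25, 0.25, 0.75, 1.25]
```

The first mapping is shifted toward the top-left relative to the center-aligned one, which is why feature maps resized with different conventions do not line up pixel for pixel.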
