SR Survey Paper Summary
Table of Contents
- Paper: A Deep Journey into Super-resolution: A Survey
- Paper Overview
- Background
- SISR Taxonomy
- Experimental Evaluation
- Future Directions
Paper: A Deep Journey into Super-resolution: A Survey
Authors: Saeed Anwar, Salman Khan, and Nick Barnes
Paper Overview
Overview:
The survey benchmarks nearly 30 recent super-resolution convolutional networks on 6 datasets (3 classic, 3 recently proposed) to establish baselines for SISR, and groups the networks into 9 categories. It also provides comparisons in terms of network complexity, memory footprint, model inputs and outputs, training details, types of network losses, and important architectural differences.
SISR application areas:
- large computer displays
- HD television sets
- hand-held devices (mobile phones, tablets, cameras, etc.)
- object detection in scenes (particularly small objects)
- face recognition in surveillance videos
- medical imaging
- improving interpretation of images in remote sensing
- astronomical images
- forensics
Super-resolution is a classic problem, but for several reasons it remains a challenging and open research topic in computer vision.
Reasons:
- SR is an ill-posed inverse problem (multiple solutions exist for the same low-resolution image; to constrain the solution space, reliable prior information is typically required)
- the complexity of the problem increases as the up-scaling factor increases (x2, x4, x8 become progressively harder)
- assessment of the quality of the output is not straightforward (quality metrics such as PSNR and SSIM are only loosely correlated with human perception)
Applications of DL in other AI areas:
- object classification and detection
- natural language processing
- image processing
- audio signal processing
Contributions of this paper:
- a comprehensive review of the latest super-resolution techniques
- a new taxonomy based on the architectural differences among super-resolution algorithms
- a thorough analysis in terms of parameter counts, algorithmic settings, training details, and important architectural innovations
- a systematic evaluation of the algorithms on 6 SISR datasets
- a discussion of current challenges in super-resolution and prospects for future research
Background
Degradation Process:
$$y = \Phi(x; \theta_\eta) \tag{1}$$
$x$: HR image
$y$: LR image
$\Phi$: degradation function
$\theta_\eta$: degradation parameters (scaling factor, noise). In practice, only $y$ is available; neither the degradation process nor the degradation parameters are known. Super-resolution tries to undo the degradation and obtain an image $\hat{x}$ that approximates the true HR image $x$:
$$\hat{x} = \Phi^{-1}(y, \theta_\varsigma) \tag{2}$$
$\theta_\varsigma$: parameters of $\Phi^{-1}$
The degradation process is unknown and can be very complex, affected by many factors such as noise (sensor and speckle), compression, blur (defocus and motion), and other artifacts. Therefore, most research works prefer the following degradation model over (1):
$$y = (x \otimes k) \downarrow_s + n \tag{3}$$
$k$: blurring kernel
$x \otimes k$: convolution operation
$\downarrow_s$: downsampling operation with a scaling factor $s$
$n$: additive white Gaussian noise (AWGN) with a standard deviation of $\sigma$ (noise level). The goal of image super-resolution is to minimize the data fidelity term associated with the model $y = x \otimes k + n$, as follows:
$$J(\hat{x}, \theta_\varsigma, k) = \underbrace{\|x \otimes k - y\|}_{\text{data fidelity term}} + \underbrace{\alpha \Psi(x, \theta_\varsigma)}_{\text{regularizer}}$$
$\alpha$: coefficient balancing the data fidelity term and the image prior $\Psi(\cdot)$ (a minimal sketch of the degradation model in Eq. (3) is given below)
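A minimal NumPy/SciPy sketch of the degradation model in Eq. (3), assuming a Gaussian blur kernel $k$, simple decimation for $\downarrow_s$, and AWGN for $n$; the blur width and noise level are illustrative values, not settings from the paper.

```python
# Sketch of Eq. (3): y = (x ⊗ k) ↓_s + n, for a grayscale float image in [0, 1].
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(x: np.ndarray, s: int = 4, blur_sigma: float = 1.6,
            noise_sigma: float = 0.01) -> np.ndarray:
    """Synthesize an LR image y from an HR image x."""
    blurred = gaussian_filter(x, sigma=blur_sigma)        # x ⊗ k (Gaussian kernel k)
    downsampled = blurred[::s, ::s]                       # ↓_s : keep every s-th pixel
    noise = np.random.normal(0.0, noise_sigma, downsampled.shape)  # AWGN n
    return np.clip(downsampled + noise, 0.0, 1.0)

# Usage: y = degrade(hr_image, s=4)
```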
Natural image priors
In natural image processing, many problems (e.g., image denoising, deblurring, inpainting, and reconstruction) are inverse problems, i.e., the solution is not unique. To narrow the solution space, or to better approximate the true solution, constraints need to be added. These constraints come from the characteristics of natural images themselves, i.e., natural image priors. If the prior information of natural images is exploited well, a high-quality image can be recovered from a low-quality one, so studying natural image priors is very meaningful.
Commonly used natural image priors include local smoothness, non-local self-similarity, non-Gaussianity, statistical properties, and sparsity.
(Author: showaichuan; Link: https://www.jianshu.com/p/ed8a5b05c3a4; Source: Jianshu)
Based on image priors, super-resolution methods can roughly be divided into the following categories:
- prediction methods
- edge-based methods
- statistical methods
- patch-based methods
- deep learning methods
SISR Taxonomy
Linear networks
Only a single path for signal flow, without any skip connections or multiple branches. Note: some linear networks learn to reproduce the residual image (the difference between the LR and HR images).
Based on the up-sampling operation, they can be divided into two categories:
Early upsampling
The LR input is first upsampled to match the desired HR output size, and hierarchical feature representations are then learned to generate the output.
A commonly used upsampling method is bicubic interpolation.
- SRCNN (using only convolutional layers for super-resolution)
  - Datasets:
    training set HR images: synthesized by extracting non-overlapping dense patches of size $32\times32$ from the HR images
    LR images: the LR input patches are first downsampled and then upsampled using bicubic interpolation to the same size as the high-resolution output image
  - Layers: three convolutional and two ReLU layers (a minimal sketch follows below)
    the first convolutional layer is termed patch extraction or feature extraction (creates feature maps from the input image)
    the second convolutional layer is called non-linear mapping (converts the feature maps into high-dimensional feature vectors)
    the third convolutional layer aggregates the feature maps to output the final high-resolution image
  - Loss function: Mean Squared Error (MSE)
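A minimal PyTorch sketch of the three-layer SRCNN pipeline described above; the 9-1-5 filter sizes and 64/32 channel widths are commonly cited settings and are assumptions here, not values taken from this summary.

```python
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    def __init__(self, channels: int = 1):
        super().__init__()
        self.feature_extraction = nn.Conv2d(channels, 64, kernel_size=9, padding=4)
        self.non_linear_mapping = nn.Conv2d(64, 32, kernel_size=1)
        self.reconstruction = nn.Conv2d(32, channels, kernel_size=5, padding=2)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):  # x: bicubic-upsampled LR image (early upsampling)
        x = self.relu(self.feature_extraction(x))
        x = self.relu(self.non_linear_mapping(x))
        return self.reconstruction(x)

# Training would minimize nn.MSELoss() between SRCNN(lr_upsampled) and the HR patch.
```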
- VDSR
  - Layers: a deep CNN architecture (based on VGG-net, using fixed-size $3\times3$ convolutions in all network layers)
  To avoid slow convergence in deep networks (specifically with 20 weight layers), two effective strategies are proposed (a minimal sketch follows below):
  - learn a residual mapping that generates the difference between the HR and LR image (this makes the target simpler, so the network focuses only on high-frequency information)
  - gradients are clipped within the range $[-\theta, +\theta]$ (so a high learning rate can be used to speed up training)
  - Observation: deeper networks can provide better contextualization and learn generalizable representations that can be used for multi-scale super-resolution
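A minimal PyTorch sketch of VDSR's two training strategies, residual learning and gradient clipping; the depth/width values and the clipping threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn

def make_vdsr(depth: int = 20, width: int = 64) -> nn.Sequential:
    layers = [nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True)]
    for _ in range(depth - 2):
        layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True)]
    layers += [nn.Conv2d(width, 1, 3, padding=1)]   # predicts the residual image
    return nn.Sequential(*layers)

def train_step(model, optimizer, lr_up, hr, theta: float = 0.4):
    residual_pred = model(lr_up)                     # high-frequency residual
    loss = nn.functional.mse_loss(lr_up + residual_pred, hr)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_value_(model.parameters(), theta)  # clip to [-theta, +theta]
    optimizer.step()
    return loss.item()
```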
VDSR vs. ResNet (architecture comparison figure)
- DnCNN
  - learns to predict the high-frequency residual directly instead of the latent super-resolved image
  - Layers: similar to SRCNN
  - depends heavily on the accuracy of noise estimation, without knowing the underlying structures and textures present in the image
  - computationally expensive (batch normalization operations after every convolutional layer)
- IRCNN (Image Restoration CNN)
  - proposes a set of CNN-based denoisers that can be used jointly for several low-level vision tasks, such as image denoising, deblurring, and super-resolution
  - Specifically, the Half Quadratic Splitting (HQS) technique is used to decouple the regularization and fidelity terms of the observation model; then, exploiting the strong modeling capacity and test-time efficiency of CNNs, the denoising prior is learned discriminatively
  - Layers: the CNN denoiser consists of a stack of seven dilated convolution layers interleaved with batch normalization and ReLU non-linearities. The dilation operation helps model larger context by enclosing a larger receptive field (a minimal sketch follows below)
  - residual image learning is performed in a similar manner to previous architectures (VDSR, DRCN and DRRN)
  - small-sized training samples and zero padding are used to avoid boundary artifacts caused by the convolution operations
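A minimal PyTorch sketch of the seven-layer dilated-convolution denoiser described above, predicting a residual (noise) image; the specific dilation rates (1, 2, 3, 4, 3, 2, 1) are a commonly used IRCNN setting and are an assumption here.

```python
import torch.nn as nn

def make_ircnn_denoiser(channels: int = 1, width: int = 64) -> nn.Sequential:
    dilations = [1, 2, 3, 4, 3, 2, 1]
    layers = []
    in_ch = channels
    for i, d in enumerate(dilations):
        out_ch = channels if i == len(dilations) - 1 else width
        # Dilated 3x3 convolution; padding = dilation keeps the spatial size fixed.
        layers.append(nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=d, dilation=d))
        if i < len(dilations) - 1:
            layers += [nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
        in_ch = out_ch
    return nn.Sequential(*layers)   # output is the predicted residual image
```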
Late upsampling
Late-upsampling networks learn on the low-resolution input and then upsample the features near the network output (low memory footprint).
- FSRCNN
  - improves speed and quality over SRCNN
  - Datasets: 91-image dataset; data augmentation such as rotation, flipping, and scaling is also employed to increase the number of images by 19 times
  - Layers: consists of four convolution layers (feature extraction, shrinking, non-linear mapping, and expansion layers) and one deconvolution layer (a minimal sketch follows below)
    - feature extraction step is similar to SRCNN (the difference lies in the input size and the filter size: the input to FSRCNN is the original patch without upsampling it)
    - shrinking layer: reduces the feature dimensions (number of parameters) by adopting a smaller filter size (i.e. f=1)
    - non-linear mapping (critical step): the size of the filters in the non-linear mapping layer is set to three, while the number of channels is kept the same as in the previous layer
    - expansion layer: an inverse operation of the shrinking step to increase the number of dimensions
    - upsampling and aggregating deconvolution layer: the stride acts as an upscaling factor
  - Parametric ReLU (PReLU) is used instead of the rectified linear unit (ReLU) after each convolutional layer
  - Loss function: mean-square error
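A minimal PyTorch sketch of FSRCNN's late-upsampling layout (feature extraction, shrinking, non-linear mapping, expansion, and a strided deconvolution whose stride equals the scale factor); the channel widths d=56, s=12 and the four mapping layers are common FSRCNN settings assumed here.

```python
import torch.nn as nn

def make_fsrcnn(scale: int = 4, d: int = 56, s: int = 12, m: int = 4) -> nn.Sequential:
    layers = [nn.Conv2d(1, d, kernel_size=5, padding=2), nn.PReLU(d)]    # feature extraction
    layers += [nn.Conv2d(d, s, kernel_size=1), nn.PReLU(s)]              # shrinking (f=1)
    for _ in range(m):                                                    # non-linear mapping
        layers += [nn.Conv2d(s, s, kernel_size=3, padding=1), nn.PReLU(s)]
    layers += [nn.Conv2d(s, d, kernel_size=1), nn.PReLU(d)]              # expansion
    layers += [nn.ConvTranspose2d(d, 1, kernel_size=9, stride=scale,     # upsampling deconvolution
                                  padding=4, output_padding=scale - 1)]
    return nn.Sequential(*layers)
```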
- ESPCN (Efficient Sub-Pixel Convolutional Neural Network)
  a fast SR approach that can operate in real-time, both for images and videos
  - performs feature extraction in the LR space
  - at the very end, aggregates the LR feature maps and simultaneously performs projection to a high-dimensional space to reconstruct the HR image
  - the sub-pixel convolution operation used in this work is essentially similar to a convolution transpose or deconvolution operation (a fractional kernel stride is used to increase the spatial resolution of the input feature maps); a minimal sketch follows below
  - Loss function: $l_1$ loss
  - a separate upscaling kernel is used to map each feature map
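A minimal PyTorch sketch of ESPCN's sub-pixel convolution: all convolutions run in the LR space, and the last layer produces scale² × C channels that a pixel-shuffle rearranges into the HR image; the kernel sizes and channel widths are assumptions.

```python
import torch.nn as nn

class ESPCN(nn.Module):
    def __init__(self, scale: int = 3, channels: int = 1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=5, padding=2), nn.Tanh(),
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.Tanh(),
            nn.Conv2d(32, channels * scale ** 2, kernel_size=3, padding=1),
        )
        self.shuffle = nn.PixelShuffle(scale)  # (B, C*s^2, H, W) -> (B, C, s*H, s*W)

    def forward(self, lr):
        return self.shuffle(self.body(lr))
```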
Residual Networks
use skip connections in the network design (avoiding vanishing gradients and making deeper designs more feasible)
These algorithms learn the residue, i.e. the high frequencies between the input and the ground truth.
Based on the number of stages used in such networks, they can be divided into two categories:
Single-stage Residual Nets
- EDSR (Enhanced Deep Super-Resolution)
  modifies the ResNet architecture to work with the SR task
  - Removes Batch Normalization layers (from each residual block) and ReLU activations (outside residual blocks), which yields a substantial improvement (a minimal block sketch follows after this list)
  - Similar to VDSR, the single-scale approach is also extended to work on multiple scales
  - Proposes the Multi-scale Deep SR (MDSR) architecture (reduces the number of parameters through a majority of shared parameters)
  - Scale-specific layers are applied in parallel only near the input and output blocks to learn scale-dependent representations
  - Data augmentation (rotations and flips) is used to create a 'self-ensemble' (transformed inputs are passed through the network, reverse-transformed, and averaged together to create a single output)
  - Better performance compared to SRCNN, VDSR, and SRGAN
  - Loss function: $l_1$ loss
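A minimal PyTorch sketch of the EDSR residual block described above (no batch normalization inside the block, no ReLU outside it); the residual scaling factor 0.1 is a commonly used EDSR detail assumed here.

```python
import torch.nn as nn

class EDSRBlock(nn.Module):
    def __init__(self, width: int = 64, res_scale: float = 0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(width, width, 3, padding=1),
            nn.ReLU(inplace=True),                  # ReLU only inside the block
            nn.Conv2d(width, width, 3, padding=1),
        )
        self.res_scale = res_scale

    def forward(self, x):
        return x + self.res_scale * self.body(x)    # skip connection, no BatchNorm
```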
- CARN (Cascading Residual Network)
  - differs from other models in the presence of local and global cascading modules
  - the features of intermediate layers are cascaded and gathered onto a $1\times1$ convolutional layer
  - the local cascading connections are identical to the global cascading connections, except that the blocks are simple residual blocks
  - Datasets: $64\times64$ patches from BSD, Yang et al., and the DIV2K dataset, with data augmentation
  - Loss function: $l_1$ loss
  - Adam is used for optimization with an initial learning rate of $10^{-4}$, which is halved after every $4\times10^5$ steps
Multi-stage Residual Nets
composed of multiple subnets that are generally trained in succession (the first subnet usually predicts coarse features, while the other subnets refine the initial prediction)
encoder-decoder designs (first downsample the input using an encoder and then perform upsampling via a decoder), hence two distinct stages
- FormResNet
  composed of two networks, both of which are similar to DnCNN; the difference lies in the loss layers
  Loss = Euclidean loss + perceptual loss
  Classical algorithms such as BM3D can also replace this formatting layer
  The input of the second network is taken from the first network
  DiffResNet learns the structured regions
- BTSRN (Balanced Two-Stage Residual Networks)
  composed of a low-resolution stage and a high-resolution stage
  In the LR stage, the feature maps have a smaller size, the same as the input patch
  (the feature maps are upsampled using deconvolution and nearest-neighbor upsampling)
  The upsampled feature maps are then fed into the high-resolution stage
  Each residual block consists of a $1\times1$ convolutional layer acting as a feature-map projection to decrease the input size of the $3\times3$ convolutional features
  The LR stage has six residual blocks, while the HR stage consists of four residual blocks
  During training, the images are cropped to $108\times108$ patches and augmented using flipping and rotation operations
- REDNet (Residual Encoder-Decoder Network)
  composed of convolutional and symmetric deconvolutional layers
  (ReLU is added after each convolutional and deconvolutional layer)
  The convolutional layers extract feature maps while preserving object structures and removing degradations
  The deconvolutional layers reconstruct the missing details of the images
  The feature maps of a convolutional layer are added to the output of its mirrored deconvolutional layer, followed by non-linear rectification
  (outcome) a high-resolution image
  The network is trainable end-to-end and converges by minimizing the $l_2$-norm between the system output and the ground truth
  The best performing architecture has 30 weight layers, each with 64 feature maps
  Ground truth: patches of size $50\times50$
  Input patches: obtained by downsampling the patches and then restoring them to their original size via bicubic interpolation
  The patches are normalized by their mean and variance, which are later added back to the corresponding restored final high-resolution outputs
  The kernel has a size of $5\times5$ with 128 feature channels
Recursive networks
employ recursively connected convolutional layers or recursively linked units
The main motivation behind these designs is to progressively decompose the harder SR problem into a set of simpler SR problems.
- DRCN (Deep Recursive Convolutional Network)
  One advantage of this technique is that the number of parameters remains constant regardless of the number of recursions
  composed of three smaller networks: an embedding net, an inference net, and a reconstruction net
  The inference net analyzes image regions by recursively applying a single layer (consisting of convolution and ReLU)
  The size of the receptive field is increased after each recursion
  The output of the inference net is high-resolution feature maps
- DRRN (Deep Recursive Residual Network)
  a deep CNN model, but with conservative parametric complexity
  This is achieved by combining residual image learning with local identity connections between the small-block layers in the network
  This parallel information flow enables stable training of deeper architectures
  Since parameters are shared across recursions, memory cost and computational complexity are significantly reduced (a minimal sketch follows below)
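A minimal PyTorch sketch of the DRRN idea described above: a residual unit whose weights are shared across recursions, plus a local identity connection back to the block input, so the parameter count stays constant as effective depth grows; the number of recursions and layer width are assumptions.

```python
import torch.nn as nn

class RecursiveResidualBlock(nn.Module):
    def __init__(self, width: int = 64, num_recursions: int = 9):
        super().__init__()
        # A single residual unit reused (weight sharing) for every recursion.
        self.unit = nn.Sequential(
            nn.ReLU(inplace=True), nn.Conv2d(width, width, 3, padding=1),
            nn.ReLU(inplace=True), nn.Conv2d(width, width, 3, padding=1),
        )
        self.num_recursions = num_recursions

    def forward(self, x):
        out = x
        for _ in range(self.num_recursions):
            out = x + self.unit(out)   # local identity connection to the block input
        return out
```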
- MemNet (Memory Network)
  MemNet can be broken down into three parts, similar to SRCNN
  The first part extracts features from the input image
  The second part consists of a series of memory blocks
  memory block = a recursive unit + a gate unit
  The recursive unit is composed of two convolutional layers with a pre-activation mechanism and dense connections to the gate unit
Progressive reconstruction designs
To deal with large factors, the output is predicted in multiple steps, i.e. $\times2$ followed by $\times4$
(CNN algorithms can predict the output in a single step; however, this may not be feasible for large scale factors)
- SCN (Sparse Coding-based Network)
  combines the advantages of sparse coding with the domain knowledge of deep neural networks to obtain a compact model and improve performance
  mimics a Learned Iterative Shrinkage and Thresholding Algorithm (LISTA) network to build a multi-layer neural network
  The LISTA stage consists of two linear layers and one non-linear layer, whose activation function has a threshold that is learned/updated during training
  To simplify training, the non-linear neuron is decomposed into two linear scaling layers and a unit-threshold neuron
  The two scaling layers are diagonal matrices that are inverses of each other, e.g., if there is a multiplicative scaling layer, a division follows after the threshold unit
- LapSRN (Deep Laplacian Pyramid Super-Resolution Network)
  consists of three sub-networks that progressively predict the residual images up to a factor of $\times8$
  The residual image of each sub-network is added to the input LR image to obtain the SR image
  (first sub-network) a residue of $\times2$
  (second sub-network) a residue of $\times4$
  (last sub-network) a residue of $\times8$
  These residual images are added to the upsampled images at the corresponding scales to obtain the final super-resolved images
  The addition of the bicubic-upsampled images with the residue is called the image reconstruction branch
  A loss is employed at every sub-network, resembling a multi-loss structure (a minimal sketch of one pyramid level follows below)
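A minimal PyTorch sketch of a single LapSRN pyramid level as described above: a feature branch predicts a ×2 residual that is added to a ×2-upsampled copy of the image (the image reconstruction branch); layer counts and kernel sizes are assumptions.

```python
import torch.nn as nn

class LapSRNLevel(nn.Module):
    def __init__(self, width: int = 64, depth: int = 5):
        super().__init__()
        convs = []
        for _ in range(depth):
            convs += [nn.Conv2d(width, width, 3, padding=1), nn.LeakyReLU(0.2, inplace=True)]
        self.feature_branch = nn.Sequential(*convs)
        self.feature_up = nn.ConvTranspose2d(width, width, 4, stride=2, padding=1)
        self.to_residual = nn.Conv2d(width, 1, 3, padding=1)
        self.image_up = nn.ConvTranspose2d(1, 1, 4, stride=2, padding=1)  # image reconstruction branch

    def forward(self, image, features):
        # `features` comes from an initial convolution on the LR input (not shown).
        features = self.feature_up(self.feature_branch(features))
        residual = self.to_residual(features)        # predicted residual at 2x resolution
        image = self.image_up(image) + residual      # add residue to the upsampled image
        return image, features

# Cascading three such levels yields x2, x4, and x8 outputs with a loss at each level.
```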
Densely Connected Networks
Based on the DenseNet architecture
The main motivation of this design is to combine the hierarchical cues available along the network depth to achieve higher flexibility and richer feature representations.
- SR-DenseNet
  based on DenseNet, which uses dense connections between the layers (a layer directly operates on the output from all previous layers)
  In this way, only high-level features are used to reconstruct the final SR image
  Skip connections are used to combine low-level and high-level features
  Since complementary features are encoded at multiple stages in the network, the combination of all feature maps gives the best performance
- RDN (Residual Dense Network)
  combines residual skip connections (inspired by SR-ResNet) with dense connections (inspired by SR-DenseNet)
  The main motivation is to fully exploit hierarchical feature representations to learn local patterns
  Since dense connections quickly produce high-dimensional outputs, each RDB uses local feature fusion with a $1\times1$ convolution to reduce the dimensionality
- D-DBPN (Dense Deep Back-Projection Network)
  draws inspiration from traditional SR methods (back-projection is performed iteratively to learn the feedback error signal between the LR and HR images)
  The motivation is that a purely feed-forward approach is not the best way to model the LR-to-HR mapping, while a feedback mechanism can greatly help achieve better results
  HR images from multiple depths of the network are combined to obtain the final output
  Adding the residual signal to the upsampled feature maps provides error feedback and forces the network to focus on fine details
Multi-branch designs
The goal of multi-branch networks is to obtain a diverse set of features at multiple context scales and then fuse this complementary information for a better HR reconstruction.
These designs also enable multi-path signal flow, leading to a better exchange of information in the forward and backward passes during training.
- CNF (Context-wise Network Fusion)
  fuses multiple convolutional neural networks for image super-resolution
  Each SRCNN is constructed with a different number of layers; the output of each SRCNN is then passed through a separate convolutional layer, and finally all of them are fused together using sum-pooling
  The size of each patch is $33\times33$ pixels of the luminance channel only
  (then) the fused network is trained (epochs = 10, learning rate = 1e-4)
- CMSC (Cascaded Multi-Scale Cross-network)
  composed of a feature extraction layer, cascaded subnets, and a reconstruction network
  Each MR block consists of two parallel branches, each with two convolutional layers; the residual connections of the branches are accumulated together and then added to the outputs of the two branches, respectively
  Each subnet of CMSC is composed of four MR blocks with different receptive fields of $3\times3$, $5\times5$, and $7\times7$ to capture contextual information at multiple scales
  Each convolutional layer in an MR block is followed by batch normalization and Leaky-ReLU
- IDN (Information Distillation Network)
  consists of three blocks: a feature extraction block, multiple stacked information distillation blocks, and a reconstruction block
  (feature extraction block) composed of two convolutional layers to extract features
  (distillation block) made up of two other blocks: an enhancement unit and a compression unit
  enhancement unit: six convolutional layers followed by leaky ReLU
  The output of the third convolutional layer is sliced: one half is concatenated with the input of the block, and the remaining half is used as the input to the fourth convolutional layer
  The output of the concatenated component is added to the output of the enhancement block. In total, four enhancement blocks are utilized.
  compression unit: realized using a $1\times1$ convolutional layer after each enhancement block
  (reconstruction block) a deconvolution layer with a kernel size of $17\times17$
  Loss function: the network is first trained with an absolute mean error loss and then fine-tuned with a mean square error loss
  Input: the input patch size is $26\times26$
  The initial learning rate is set to 1e-4 for a total of $10^5$ iterations
  Adam is used as the optimizer
Attention-based Networks
In the previously discussed network designs, all spatial locations and channels have uniform importance for super-resolution. In some cases, it helps to selectively attend to only a few features in a given layer.
Attention-based models allow this flexibility, taking into account that not all features are necessary for super-resolution and that their importance varies. Combined with deep networks, recent attention-based models have shown significant improvements for SR.
- SelNet
  proposes a novel selection unit for image super-resolution networks
  The selection unit is composed of an identity mapping cascaded with a ReLU, a $1\times1$ convolutional layer, and a sigmoid layer
- RCAN (Residual Channel Attention Network)
  (a) a recursive residual design in which residual connections are present within each block of the global residual network
  (b) each local residual block has a channel attention mechanism: the filter activations are collapsed from $h\times w\times c$ to a vector with $1\times1\times c$ dimensions (after passing through a bottleneck) that acts as selective attention over the channel maps
  The second contribution allows the network to focus on the selective feature maps that are more important for the final task and to effectively model the relationships between feature maps (a minimal sketch follows below)
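A minimal PyTorch sketch of the channel attention mechanism described in (b): global pooling collapses the $h\times w\times c$ activations to a $1\times1\times c$ vector, a bottleneck produces per-channel weights that rescale the feature maps; the reduction ratio 16 is an assumption.

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int = 64, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # h x w x c -> 1 x 1 x c
        self.bottleneck = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        weights = self.bottleneck(self.pool(x))      # selective attention over channels
        return x * weights
```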
- SRRAM (Residual Attention Module for SR)
  The SRRAM structure is similar to RCAN; both methods are inspired by EDSR
  The SRRAM can be divided into three parts.
  The basic unit of SRRAM is composed of residual blocks, spatial attention, and channel attention, used to learn inter-channel and intra-channel dependencies
Multiple-degradation handling networks
In reality, multiple degradations can occur simultaneously
- ZSSR (Zero-Shot Super-Resolution)
  Building on classical methods, this approach exploits internal image statistics and uses a deep neural network to super-resolve the image
  The aim here is to predict the test image from LR images generated from the test image itself
  Once the network has learned the relationship between the LR test image and the test image, the same network is used, with the test image as input, to predict the SR image
  Therefore, it does not require training images for a specific degradation and can learn an image-specific network on the fly during inference
- SRMD (Super-Resolution network for Multiple Degradations)
  takes as input the concatenation of a low-resolution image and its degradation maps
  (First) a cascade of convolutional layers with $3\times3$ filter size is applied to extract features, followed by a sequence of Conv, ReLU and Batch Normalization layers
  (Furthermore) similar to ESPCN, a convolution operation is used to extract HR sub-images
  (Finally) the HR sub-images are transformed into the final single HR output
  In the noise-free variant (SRMDNF), the connections from the first noise-level maps in the convolutional layers are removed; the rest of the architecture is similar to SRMD
  The criterion for decreasing the learning rate is based on the error change between successive epochs
  Nevertheless, its ability to jointly handle multiple degradations provides a unique capability
GAN Models
adopt a game-theoretic approach in which the model consists of two components, a generator and a discriminator. The generator produces SR images that the discriminator cannot distinguish as real HR images or artificially super-resolved outputs.
This produces HR images with better perceptual quality, while the corresponding PSNR values usually decrease (a smaller PSNR indicates greater image distortion). This highlights that the quantitative measures popular in the SR literature fail to adequately describe the perceptual quality of the generated HR images.
- SRGAN
  SRGAN proposes using an adversarial objective function to push the super-resolved output close to natural images.
  The loss combines (a minimal sketch of this combined objective follows below):
  (1) an MSE loss that encodes pixel-wise similarity
  (2) a perceptual similarity metric in the form of a distance metric defined over high-level image representations (e.g., deep network features)
  (3) an adversarial loss that balances the min-max game between the generator and the discriminator (the standard GAN objective)
  Competitors optimize direct data-dependent measures (such as pixel errors).
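A minimal PyTorch sketch of the three-part SRGAN generator objective listed above; the specific VGG feature layer and the loss weights are common choices assumed here, not values from this summary.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

# Pretrained VGG-19 as a fixed deep feature extractor (torchvision >= 0.13 API).
vgg_features = vgg19(weights="DEFAULT").features[:36].eval()
for p in vgg_features.parameters():
    p.requires_grad = False

def generator_loss(sr, hr, disc_fake_logits,
                   w_pixel=1.0, w_percep=6e-3, w_adv=1e-3):
    pixel = nn.functional.mse_loss(sr, hr)                                # (1) pixel-wise MSE
    percep = nn.functional.mse_loss(vgg_features(sr), vgg_features(hr))   # (2) perceptual distance
    adv = nn.functional.binary_cross_entropy_with_logits(                 # (3) adversarial loss
        disc_fake_logits, torch.ones_like(disc_fake_logits))
    return w_pixel * pixel + w_percep * percep + w_adv * adv
```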
- EnhanceNet
  The focus of this network design is to create faithful texture details in the high-resolution super-resolved images.
  (the perceptual loss function) defined on the intermediate feature representation of a pretrained network, in the form of an $l_1$ distance
  (the texture matching loss) used to match the textures of the low- and high-resolution images, quantified as the $l_1$ loss between Gram matrices computed from deep features
- SRFeat
  another GAN-based super-resolution algorithm with feature discrimination
  This work focuses on perceptual realism: an additional discriminator is used to help the generator produce high-frequency structural features rather than noisy artifacts (this is achieved by discriminating between the features of generated and real images)
  Fine-tuning is performed on the augmented DIV2K dataset using learning rates from $10^{-4}$ to $10^{-6}$.
- ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks)
  builds on SRGAN, removing batch normalization and incorporating dense blocks
Experimental Evaluation
Datasets:
Set5
Set14
BSD100
Urban100
DIV2K
Manga109
Quantitative Measures
PSNR (peak signal-to-noise ratio)
SSIM (structural similarity index)
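A minimal sketch of computing the two quantitative measures above with NumPy and scikit-image (assumed available); inputs are float images scaled to [0, 1].

```python
import numpy as np
from skimage.metrics import structural_similarity

def psnr(sr: np.ndarray, hr: np.ndarray, max_val: float = 1.0) -> float:
    mse = np.mean((sr - hr) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))

def ssim(sr: np.ndarray, hr: np.ndarray) -> float:
    return float(structural_similarity(sr, hr, data_range=1.0))
```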
Number of parameters
Choice of network loss
Convolutional neural networks:
- mean absolute error ($l_1$)
- mean squared error (MSE, $l_2$)
Generative adversarial networks (GANs):
- perceptual loss (adversarial loss)
- pixel-wise loss (MSE)
Network Depth
Current CNNs keep adding more convolutional layers to build deeper networks in order to improve image quality, both quantitatively and qualitatively; this trend has dominated deep SR since SRCNN.
Skip Connections
These connections can be divided into four main types: global, local, recursive, and dense connections.
Future Directions
- Incorporation of Priors
- Objective Functions and Metrics
- Need for Unified Solutions
- Unsupervised Image SR
- Higher SR rates
- Arbitrary SR rates
- Real vs Artificial Degradation
Summary