[Paper Translation] Learning Generalizable and Identity-Discriminative Representations for Face Anti-Spoofing


Abstract


Face anti-spoofing (a.k.a. presentation attack detection) has drawn growing attention due to the high security demand in face authentication systems. Existing CNN-based approaches usually recognize spoofing faces well when the training and testing spoofing samples display similar patterns, but their performance would drop drastically on testing spoofing faces of unseen scenes. In this paper, we try to boost the generalizability and applicability of these methods by designing a CNN model with two major novelties. First, we propose a simple yet effective Total Pairwise Confusion (TPC) loss for CNN training, which enhances the generalizability of the learned Presentation Attack (PA) representations. Secondly, we incorporate a Fast Domain Adaptation (FDA) component into the CNN model to alleviate negative effects brought by domain changes. Besides, our proposed model, which is named Generalizable Face Authentication CNN (GFA-CNN), works in a multi-task manner, performing face anti-spoofing and face recognition simultaneously. Experimental results show that GFA-CNN outperforms previous face anti-spoofing approaches and also well preserves the identity information of input face images.


1. Introduction


Despite the recent noticeable advances, the security of face recognition systems is still vulnerable to Presentation Attacks (PA) with printed photos or replayed videos. To counteract PA, face anti-spoofing [25, 19] is developed and serves as a pre-step prior to face recognition.

Earlier face anti-spoofing approaches mainly adopt handcrafted features, like LBP [8], HoG [16] and SURF [5], to find the differences between live and spoofing faces. In [27], a CNN was used for face anti-spoofing for the first time, with remarkable performance achieved in intra-database tests. Following their work, a number of CNN-based methods have been proposed, almost all treating face anti-spoofing as a binary (live vs. spoofing) classification problem. However, given the enormous solution space of CNNs, these methods tend to suffer from overfitting and poor generalizability to new PA patterns and environments. In this work, we attempt to enable an anti-spoofing system to be deployed in various environments, i.e., with good generalizability.


For CNN-based methods, an important clue to differentiate live vs. spoofing faces is the spoof pattern, including color distortion, moiré pattern, shape deformation, spoofing artifacts (e.g., reflection), etc. During CNN model training, strong patterns make more contributions, and the resultant model is more discriminative for them. However, if these patterns are absent in the testing data, the performance would severely drop. CNN-based methods thus tend to overfit to some strong spoof patterns and suffer poor generalizability [19]. Apart from overfitting, domain shift [18] is also an important reason for the poor generalizability of face anti-spoofing methods. A domain here refers to a certain environment where an image is acquired, consisting of various factors such as illumination, background, facial appearance, camera type, etc. Considering the huge diversity of real-world environments, it is very common that different samples come from different domains. For example, the domains of two paper attacks may be quite different even for the same face if reproduced with different pieces of paper (e.g., glossy vs. rough paper). Such domain variance may lead to distribution dissimilarity of different samples in the feature space and cause the models to fail on new domains.

Figure 1: Our CNN framework works in a multi-task manner, solving face recognition and face anti-spoofing at one shot. It leverages the Total Pairwise Confusion (TPC) loss and Fast Domain Adaptation (FDA) to enhance the generalizability of the learned Presentation Attack (PA) features and to improve face anti-spoofing performance across different scenes.


Based on the above observations, we propose a new Total Pairwise Confusion (TPC) loss to balance the contributions of all involved spoof patterns, and also employ a Fast Domain Adaptation (FDA) model [11] to narrow the distribution discrepancy of samples from different domains in the feature space. We then obtain a Generalizable Face Authentication CNN model, GFA-CNN for short. Different from prior methods that take face anti-spoofing as a pre-step of face authentication, our GFA-CNN works in a multi-task manner, performing face anti-spoofing and face recognition simultaneously, as shown in Fig. 1. Since the CNN layers of the two tasks share the same parameters, our model works with high efficiency.

Extensive experiments on five popular benchmarks for face anti-spoofing demonstrate the superiority of our method over the state-of-the-arts. Our code and trained models will be available upon acceptance. Our contributions are summarized as follows:


• We propose a Total Pairwise Confusion (TPC) loss to effectively relieve the overfitting of CNN-based face anti-spoofing models to dataset-specific spoof patterns, which improves the generalizability of face anti-spoofing methods.

• We incorporate the Fast Domain Adaptation (FDA) model to learn more robust Presentation Attack (PA) representations, which reduces domain shift in the feature space.

• We develop a multi-task CNN model for face authentication. Our GFA-CNN performs face anti-spoofing and face recognition jointly.

Figure 2: The architecture of the proposed GFA-CNN. The whole network contains two branches. The face anti-spoofing branch (top) takes as input the domain-adapted images transferred by FDA and is optimized with the TPC loss and the Anti-loss, while the face recognition branch (bottom) takes cropped face images as input and is trained by minimizing the Recg-loss. The structure settings are shown on top of each block, where "ID number" denotes the number of classes involved in training. The two branches share parameters during training.

2. Related Work


Most previous approaches for face anti-spoofing exploit texture differences between live and spoofing faces with pre-defined features such as LBP [8], HoG [16], and SURF [5], which are subsequently fed to a supervised classifier (e.g., SVM, LDA) for binary classification. However, such handcrafted features are very sensitive to different illumination conditions, camera devices, specific identities, etc. Though noticeable performance is achieved under the intra-dataset protocol, samples from different environments may fail the model. In order to obtain features with better generalizability, some approaches leverage temporal information, e.g., making use of the spontaneous motions of live faces, such as eye-blinking [20] and lip motion [15]. Though these methods are effective against photo attacks, they become vulnerable when attackers simulate these motions through a paper with the eye/mouth positions cut out.


Recently, deep learning based methods [27, 17] have been proposed to address face anti-spoofing. They use CNNs to learn highly discriminative representations by taking face anti-spoofing as a binary classification problem. However, most of them easily suffer from overfitting, as the currently available public face anti-spoofing datasets are too limited to cover the various potential spoofing types. A very recent work [19] by Liu et al. leverages the depth map and rPPG signal as auxiliary supervision to train the CNN, instead of treating face anti-spoofing as a simple binary classification problem, in order to avoid overfitting. Another critical issue for face anti-spoofing is domain shift. To bridge the gap between training and testing domains, [17] generalizes the CNN to unknown conditions by minimizing the feature distribution dissimilarity across domains, i.e., minimizing the Maximum Mean Discrepancy distance among representations.

Figure 3: Visualized comparison of the learned feature distributions w/ and w/o Ltpc. Without Ltpc, the feature distributions are diverse and subject-specific (left), while with Ltpc, they become compact and homogeneous (right). l is the classification hyperplane. Best viewed in color.


To our best knowledge, almost all previous works take face anti-spoofing as a pre-step prior to face recognition and address it as a binary classification problem. Compared with previous literature, we solve face anti-spoofing and face recognition at one shot. The work most related to ours is [23], which proposed a two-tier framework to ensure the authenticity of the user to the recognition system, namely, monitoring whether the user passes the biometric system as a live or spoofing one. It performs authentication based on fingerprint, palm vein print, face, etc., with two separate tiers: the anti-spoofing is powered by CNN-learned representations, while the recognition is based on pre-defined handcrafted features like ORB points.


Different from [23], we build our GFA-CNN in a multi-task manner: our framework can recognize the identity of a given face and meanwhile judge whether the face is live or spoofing. It is worth mentioning that for face recognition, our method achieves a single-model accuracy of up to 97.1% on the LFW database [12], which is even comparable to the state-of-the-arts. (Translator's note: 97.1% is nothing to brag about; calling it comparable to the state-of-the-arts is a bit of an overstatement.)


3. Generalizable Face Authentication CNN

3.1. Multi-Task Network Architecture


The proposed Generalizable Face Authentication CNN (GFA-CNN) is able to jointly address face recognition and face anti-spoofing in a mutually boosting way. The network has two branches: the face anti-spoofing branch and the face recognition branch. Each branch consists of 5 blocks of CNN layers and 3 fully connected (FC) layers, and each block contains 3 CNN layers. The parameters are shared between these two branches. The face anti-spoofing branch is trained by minimizing the TPC loss and the face anti-spoofing loss (Anti-loss), while the face recognition branch is trained by optimizing the face recognition loss (Recg-loss). The anti-spoofing branch takes as input raw face images with background, while the recognition branch takes cropped faces as input. Before being fed to the face anti-spoofing branch, the training images are transferred to a target domain specified by a given target-domain image. In the testing phase, each query image is transferred to the target domain and then propagated forward through the network.
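To make the two-branch, parameter-shared design concrete, below is a minimal TensorFlow/Keras sketch. It is an illustration under assumptions rather than the authors' code: the input resolution, the FC widths, and the use of tf.keras.applications.VGG16 as the shared trunk are placeholders for the VGG16-style blocks described here.

    import tensorflow as tf

    def build_gfa_cnn(num_ids):
        # Shared convolutional trunk (stands in for the paper's VGG16 blocks).
        trunk = tf.keras.applications.VGG16(include_top=False, weights=None,
                                            input_shape=(224, 224, 3))

        def fc_head(name, out_dim):
            # 3 FC layers; only the output dimension of the last layer
            # differs between the two branches.
            return tf.keras.Sequential([
                tf.keras.layers.Flatten(),
                tf.keras.layers.Dense(4096, activation='relu'),
                tf.keras.layers.Dense(4096, activation='relu'),
                tf.keras.layers.Dense(out_dim),
            ], name=name)

        anti_in = tf.keras.Input((224, 224, 3))  # FDA-transferred face with background
        recg_in = tf.keras.Input((224, 224, 3))  # cropped face
        anti_out = fc_head('anti_spoofing', 2)(trunk(anti_in))      # live vs. spoofing
        recg_out = fc_head('recognition', num_ids)(trunk(recg_in))  # identity logits
        return tf.keras.Model([anti_in, recg_in], [anti_out, recg_out])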


The CNN blocks are structured the same as the convolution part of VGG16. Before training, the CNN blocks are first trained on the VGG-face dataset to obtain fundamental weights for face recognition. The FC layers of the face anti-spoofing and face recognition branches have the same structure except for the output dimension of the last FC layer. The face anti-spoofing branch takes 2 dimensions for the last FC layer, while the dimension of the last FC layer in the face recognition branch depends on the number of subjects involved in training. The overall objective function is

L = Lanti + λ1·Lrecg + λ2·Ltpc, (1)

where Lanti and Lrecg are the cross-entropy losses for face anti-spoofing and face recognition respectively, Ltpc is the Total Pairwise Confusion (TPC) loss, and λ1 and λ2 are the weighting parameters among the different losses.
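As a sketch, the objective of Eq. (1) could be assembled as below; the tensor names and the in-batch pairing used for Ltpc (detailed in Sec. 3.2) are assumptions, while λ1 = 0.1 and λ2 = 2.5e-5 are the values reported in Sec. 4.1.

    import tensorflow as tf

    def total_loss(anti_logits, anti_labels, recg_logits, recg_labels,
                   psi_a, psi_b, lam1=0.1, lam2=2.5e-5):
        # Anti-loss: binary (live vs. spoofing) cross entropy.
        l_anti = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=anti_labels, logits=anti_logits))
        # Recg-loss: identity cross entropy.
        l_recg = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=recg_labels, logits=recg_logits))
        # TPC loss: mean squared Euclidean distance between the second-FC-layer
        # representations psi(x_i) and psi(x_j) of randomly paired samples.
        l_tpc = tf.reduce_mean(tf.reduce_sum(tf.square(psi_a - psi_b), axis=1))
        return l_anti + lam1 * l_recg + lam2 * l_tpc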


3.2. Total Pairwise Confusion Loss


In order to learn Presentation Attack (PA) representations that are adaptable to varying environment conditions, we propose a novel Total Pairwise Confusion (TPC) loss. Our inspiration comes from the Pairwise Confusion (PC) loss [10], which tackles the overfitting issue in fine-grained visual classification by intentionally introducing confusion into the feature activations. We modify their confusion implementation to make it applicable to the face anti-spoofing task. Our TPC loss is defined as

Ltpc = (1/M) Σ(i,j) ||ψ(xi) − ψ(xj)||₂², (2)

where xi and xj are two randomly selected images (a sample pair), M is the total number of sample pairs involved in training, and ψ(x) denotes the representation from the second fully connected layer of the face anti-spoofing branch (see Fig. 2). (Translator's note: for the PC loss of [10], see https://blog.csdn.net/Jadelyw/article/details/82988498; the idea resembles a Siamese network or FaceNet's triplet loss, except that there are two FC layers and the features for the Euclidean distance come from the penultimate one.)


Our Ltpc differs from the original PC loss in two aspects: 1) The TPC loss minimizes the distribution distance of random sample pairs drawn from the whole training set, rather than of sample pairs from two different categories, to force the CNN to learn slightly less discriminative features. 2) We minimize the Euclidean distance in the feature space, while the original PC loss minimizes the distance in the probability space (the output of softmax) to make samples in the same pair have a similar conditional probability distribution.
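For illustration, a minimal NumPy sketch of this feature-space version, assuming pairs are formed by randomly permuting the batch (the pairing scheme is an assumption):

    import numpy as np

    def tpc_loss(psi):
        # psi: (N, D) PA representations psi(x) for a batch of N images.
        n = psi.shape[0]
        partner = np.random.permutation(n)   # random sample pairs (x_i, x_j)
        diff = psi - psi[partner]
        # Mean squared Euclidean distance over the pairs, as in Eq. (2).
        return float(np.mean(np.sum(diff ** 2, axis=1)))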


Our modifications are based on the following considerations: 1) With face anti-spoofing taken as a binary classification problem, confusion across categories would not excessively affect the discriminability of the PA features in differentiating live vs. spoofing samples. 2) Face samples of the same subject usually cluster in the feature space, and imposing confusion on all samples can compact and homogenize the whole feature distribution (see Fig. 3), thus benefiting generalization performance. 3) For a binary classification problem of such simple structure, regularizing the model within the feature space is more useful than imposing regularization within the output probability space.


Figure 4: The contribution-balanced process of SSFs. Darker color in the FC layer indicates a higher contribution to the classification, while lighter color indicates a lower one. Each grid represents an SSF. The trade-off game between Ltpc and Lanti can balance the contributions of SSFs to the final decision.


Our Ltpc can effectively improve the generalizability of PA representations, which can be understood as follows. Suppose there are K components in the PA representations, each corresponding to one spoof pattern, called a Spoof-pattern Specific Feature (SSF) in this work. As shown in Fig. 4, different SSFs contribute differently to the final decision. We define the features of a live and a spoofing sample as Fl = (f1l, f2l, ..., fKl) and Fs = (f1s, f2s, ..., fKs), respectively, where fil is the ith SSF of the live sample and fis is the ith SSF of the spoofing sample, and the SSFs are ranked by their importance to the classification of live vs. spoofing. On one hand, Lanti aims to enlarge the distance between Fl and Fs for better discrimination. On the other hand, Ltpc attempts to narrow the difference between Fl and Fs. As f1l/s contributes the most to the differentiation of live and spoofing samples, it will be impaired the most by Ltpc. However, the contributions of less important SSFs, such as f(K−1)l/s and fKl/s, will be enhanced by Lanti to offset the impaired discriminative ability. In this trade-off game, the contributions of all SSFs tend to be equalized, meaning more spoof patterns are involved in the decision rather than just a couple of strong spoof patterns specific to the training set. This can effectively alleviate overfitting risks: if some spoof patterns disappear in testing, a fair decision can still be achieved through the other patterns, ensuring the CNN does not overfit to a few specific features.


3.3. Fast Domain Adaptation


Besides the proposed TPC loss that balances the contribution of each spoof pattern, we also apply FDA to reduce domain shift in the feature space and further improve the generalizability of our framework.

Generally, an image contains two components: content and appearance [21]. (Translator's note: for [21], see https://blog.csdn.net/z0n1l2/article/details/81677178 and https://blog.csdn.net/sunyao_123/article/details/81294724.) The appearance information (e.g., colors, localized structures) makes up the style of images from a certain domain and is mostly represented by features in the bottom layers of a CNN [13]. For face anti-spoofing, the domain variance among face samples may introduce distribution dissimilarity in the feature space and hurt anti-spoofing performance. Here, we employ FDA to alleviate the negative effects brought by domain changes. FDA consists of an image transformation network f(·), which generates a synthetic image y from a given image x: y = f(x), and a loss network φ(·), which computes a content reconstruction loss Lcontent and a domain reconstruction loss Ldomain.

Figure 5: Example results by FDA. The upper-left and bottom-right insets of the images in the middle column are the target-domain images expected to be transferred. Images in odd rows are from MSU-MFSD; images in even rows are from Replay-Attack.


Let φj(·) be the jth layer of φ(·), with output shape Cj × Hj × Wj. The content reconstruction loss penalizes the output image y when it deviates in content from the input x. We thus minimize the Euclidean distance between the feature representations of y and x:

Lcontent = (1/(Cj·Hj·Wj)) ||φj(y) − φj(x)||₂², (3)

The domain reconstruction loss enables the output image y to have the same domain as the target-domain image yd. We then minimize the squared Frobenius norm of the difference between the Gram matrices of y and yd:

Ldomain = ||Gj(y) − Gj(yd)||F², (4)

The Gram matrix is computed by reshaping φj into a matrix κ: Gj = κκT / (CjHjWj). The optimal image y* is then generated by solving the following objective function:

P* = argminP λc·Lcontent(f(x), x) + λs·Ldomain(f(x), yd), (5)

where P denotes the parameters of the network f(·), x is the content image, y = f(x), yd is the target-domain image, and λc and λs are scalars. By solving Eqn. (5), x is transferred to y*, preserving the content of x with the domain of yd.
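The Gram-matrix computation above maps directly to a few lines of NumPy (a sketch; φj here is a single feature map of shape Cj × Hj × Wj):

    import numpy as np

    def gram_matrix(phi_j):
        # phi_j: feature map of the jth layer, shape (C, H, W).
        c, h, w = phi_j.shape
        kappa = phi_j.reshape(c, h * w)        # reshape phi_j into matrix kappa
        return kappa @ kappa.T / (c * h * w)   # G_j = kappa kappa^T / (C_j H_j W_j)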

Fig. 5 shows some of our domain-transferred samples. The target-domain image is sampled from the training data. A detailed analysis of the feature divergence between domains w/ and w/o FDA is provided in Sec. 4.2.


4. Experiments

4.1. Experimental Setup

Datasets. We evaluate GFA-CNN on five face anti-spoofing benchmarks: CASIA-FASD [28], Replay-Attack [8], MSU-MFSD [26], Oulu-NPU [7] and SiW [19]. CASIA-FASD and MSU-MFSD are small datasets, containing 50 and 35 subjects, respectively. Oulu-NPU and SiW are high-resolution databases published very recently. Oulu-NPU contains 4 testing protocols: Protocol 1 evaluates environment condition variations; Protocol 2 examines the influence of different spoofing mediums; Protocol 3 estimates the effects of different input cameras; Protocol 4 considers all the challenges above. We conduct intra-database tests on MSU-MFSD and Oulu-NPU, respectively. Cross-database tests are performed between CASIA-FASD vs. Replay-Attack and MSU-MFSD vs. Replay-Attack, respectively. The face recognition performance is evaluated on SiW, which contains 165 subjects with large variations in poses, illumination, expressions (PIE), and distances from subject to camera. LFW, the most widely used benchmark for face recognition, is also used to evaluate face recognition performance.


Implementation Details. The proposed GFA-CNN is implemented with TensorFlow [1]. We use the Adam optimizer with a learning rate beginning at 0.0003 and decaying by half after every 2,000 steps. The batch size is set to 32. λ1 and λ2 in Eqn. (1) are set to 0.1 and 2.5e-5, respectively. All experiments are performed according to the protocols provided with the datasets. The CNN layers are pre-trained on the VGG-face dataset [22]. For data balance, we triple the live samples in the training sets of CASIA-FASD, MSU-MFSD and Replay-Attack with horizontal and vertical flipping, while doubling the live samples in the training set of SiW by flipping horizontally only.
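The stated optimization schedule corresponds to, e.g., the following TF1-style setup; this is a sketch of the reported hyper-parameters, not the released training code:

    import tensorflow as tf

    global_step = tf.Variable(0, trainable=False)
    # Start at 3e-4 and halve the learning rate after every 2,000 steps.
    learning_rate = tf.train.exponential_decay(0.0003, global_step,
                                               decay_steps=2000, decay_rate=0.5,
                                               staircase=True)
    optimizer = tf.train.AdamOptimizer(learning_rate)
    # train_op = optimizer.minimize(total_loss, global_step=global_step)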


Evaluation Metrics. We have two evaluation protocols, intra-test and cross-test, which test samples from and not from the domain of the training set, respectively. We report our results with the following metrics. Intra-test evaluation: Equal Error Rate (EER), Attack Presentation Classification Error Rate (APCER), Bona Fide Presentation Classification Error Rate (BPCER), and ACER = (APCER + BPCER)/2. Cross-test evaluation: Half Total Error Rate (HTER).

Table 1: Ablation study (HTER %). "+" means the corresponding component is used, while "-" indicates removing the component. The numbers in bold are the best results.


4.2. Ablation Study

(Translator's note: an ablation study is an experiment designed to check whether a proposed component of a model is actually effective: the network is retrained with the component removed and its results are compared against the full model, i.e., a model-simplification test. By Occam's razor, if a simple method achieves the same effect as a complex one, the simpler method is better and more reliable.)

Figure 6: Feature divergence comparison between MSU-MFSD and Replay-Attack. The numbers on the x-axis correspond to the CNN layers of VGG16.


We perform an ablation analysis to reveal the roles of the TPC loss and FDA in our framework, retraining the proposed network with TPC and FDA added/ablated. As shown in Tab. 1, if TPC is removed, the intra-test HTER on MFSD degrades by 2.9% (w/ FDA) and 4.1% (w/o FDA), respectively. Since Replay-Attack is usually free of severe overfitting, it is reasonable that the improvement there is not significant: 0.3% (w/ FDA) and 0.6% (w/o FDA) on HTER.

For the cross-test, if TPC is ablated, the HTER worsens dramatically, by over 10% for MFSD → Replay and over 8% for Replay → MFSD, no matter whether FDA is used or not. The best cross-test results are achieved by using both TPC and FDA, indicating that FDA can further improve the generalizability of the proposed method.

To evaluate the feature diversity between domains w/ and w/o FDA, we calculate the feature divergence via the symmetric KL divergence. Similar to [21], we denote the mean value of a channel from the feature embedding of a CNN as F. Given a Gaussian distribution of F with mean μ and variance σ², the symmetric KL divergence of this channel between domains A and B is

D(FA || FB) = KL(FA | FB) + KL(FB | FA), (6)

KL(FA | FB) = log(σB/σA) + (σA² + (μA − μB)²) / (2σB²) − 1/2, (7)

where μA, σA² and μB, σB² are the channel statistics in domains A and B.

Denote D(FiA || FiB) as the symmetric KL divergence of the ith channel. The average feature divergence of a layer is then defined as

D(LA || LB) = (1/C) Σi=1..C D(FiA || FiB), (8)

where C is the channel number of the layer. This metric measures the distance between the feature distributions of domains A and B. We calculate the feature divergence of each layer in a CNN model for comparison. In particular, we randomly select 5,000 face samples each from MSU-MFSD and Replay-Attack. Each dataset is considered as one domain. These samples are then fed to a pre-trained VGG16 [24] model to calculate the KL divergence at each layer following Eqn. (8). The comparison results are shown in Fig. 6. As can be seen, with FDA, the feature divergence between MSU-MFSD and Replay-Attack is significantly reduced.
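For illustration, Eqs. (6)-(8) could be computed as in the following NumPy sketch, assuming the per-sample channel means of each domain are summarized by their empirical mean and variance:

    import numpy as np

    def kl_gauss(mu_a, var_a, mu_b, var_b):
        # KL divergence between two 1-D Gaussians, Eq. (7).
        return (np.log(np.sqrt(var_b / var_a))
                + (var_a + (mu_a - mu_b) ** 2) / (2.0 * var_b) - 0.5)

    def layer_divergence(feat_a, feat_b):
        # feat_a, feat_b: (N, C, H, W) features of one layer for domains A and B.
        f_a = feat_a.mean(axis=(2, 3))   # F: per-sample channel means, shape (N, C)
        f_b = feat_b.mean(axis=(2, 3))
        mu_a, var_a = f_a.mean(axis=0), f_a.var(axis=0)
        mu_b, var_b = f_b.mean(axis=0), f_b.var(axis=0)
        # Symmetric KL per channel, Eq. (6), averaged over the C channels, Eq. (8).
        d = kl_gauss(mu_a, var_a, mu_b, var_b) + kl_gauss(mu_b, var_b, mu_a, var_a)
        return float(d.mean())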


4.3. Face Anti-spoofing Evaluation


Intra-Test. We perform intra-tests on MSU-MFSD and Oulu-NPU. Tab. 2 shows the comparison of our method with other state-of-the-art methods on MSU-MFSD. For Oulu-NPU, we refer to the face anti-spoofing competition results in [2] and use the best two entries for each protocol for comparison. All results are reported in Tab. 3.

As shown in Tab. 2, GFA-CNN achieves an EER of 7.5%, ranking 3rd among all the compared methods. This result is satisfactory considering that GFA-CNN is not designed blindly to pursue high performance in the intra-test setting. In our experiments, we find the proposed TPC loss may slightly decrease the intra-test performance, mainly because the TPC loss impairs the contributions of the several strongest SSFs w.r.t. the training datasets. The weakening of these dataset-specific features may in turn affect the intra-test performance (however, it may improve the performance in the cross-test). According to Tab. 3, our method achieves the lowest ACER in 3 out of 4 protocols. For the most challenging Protocol 4, we achieve an ACER of 8.9%, which is 1.1% lower than the best performer.


Cross-Test. To demonstrate the strong generalizability of GFA-CNN, we perform cross-tests on CASIA-FASD, Replay-Attack, and MSU-MFSD, comparing with other state-of-the-arts. We adopt the most widely used cross-test settings: CASIA-FASD vs. Replay-Attack and MSU-MFSD vs. Replay-Attack, and report comparison results in Tab. 4. As can be seen, GFA-CNN achieves the lowest HTERs in the cross-tests CASIA → Replay, MFSD → Replay and Replay → MFSD. Especially for Replay → MFSD, GFA-CNN reduces the cross-test HTER by 8.3% compared with the best state-of-the-art.

However, we also observe that GFA-CNN has a relatively worse HTER compared with the best method on Replay-Attack → CASIA-FASD. This is probably due to the "quality degradation" by FDA when the resolution of a source-domain image to be transferred is much higher than that of the target-domain image. During the cross-test on Replay-Attack → CASIA-FASD, the target-domain image is selected from Replay-Attack with a low resolution of 320 × 240, whereas CASIA-FASD contains quite a number of images with a high resolution of 720 × 1280. Such a "resolution gap" leads to a "quality degradation" of FDA, as shown in the rightmost image in Fig. 7.

Table 4: Cross-test results (HTER %) on CASIA-FASD, Replay-Attack, and MSU-MFSD. "-" indicates the corresponding result is unavailable. The numbers in bold are the best results.

Figure 7: Results transferred by FDA with different resolutions. The top-left image is the target-domain image. For the other images of each block, the left one is the original image and the right one is the transferred image. The green number at the top left of each image indicates the resolution.

4.4. Face Recognition Evaluation


We further evaluate the face recognition performance of our GFA-CNN on SiW and LFW. Since our method does not target face recognition specifically, we only adopt VGG-16 as the baseline. On LFW, we follow the provided protocol to perform testing. On SiW, we use 90 subjects for training and the other 75 subjects for testing, which is its default data split. This dataset also provides a frontal legacy face image for each subject. At the testing phase, we select the legacy image of each subject in the testing set as the gallery faces, and use all images in the testing set (including both live and spoofing) as the probe faces.

The ROC curves of face verification are shown in Fig. 8. As can be observed, GFA-CNN achieves results competitive with VGG16 on LFW, 97.1% vs. 97.6%. Moreover, when testing on SiW, the accuracy decline of GFA-CNN is much smaller than that of VGG16: the accuracy of GFA-CNN reduces by 4.5%, while that of VGG16 drops by 14%. The degraded performance is mainly due to face reproduction by the spoofing mediums, in which some of the finer facial details might be lost. Nevertheless, GFA-CNN still achieves satisfactory performance compared with VGG16. This is mainly because the face anti-spoofing and face recognition tasks mutually enhance each other, making the representations learned for face recognition less sensitive to spoof patterns.


4.5. Discussions on Multi-task Setting


In this subsection, we investigate how multi-task learning affects model performance for face anti-spoofing. We retrain our model without the face recognition branch, keep the hyper-parameters unchanged, and evaluate with the same protocol as GFA-CNN. From the experiments, we observe that multi-task training slightly decreases the intra-test performance of face anti-spoofing (dropping 2.5% and 0.3% on MSU-MFSD and Replay-Attack, respectively). This is reasonable, since a single model learns to perform two different tasks. However, two advantages are achieved compared with single-task training. Firstly, the training process becomes more stable, with the Anti-loss decreasing gradually rather than dropping sharply after some steps as in single-task training, suggesting the multi-task setting helps overcome overfitting. Secondly, as shown in Fig. 8, multi-task training helps learn face representations less sensitive to spoof patterns for face recognition. This mainly benefits from sharing parameters in the convolutional layers, giving more generic fused features.


5. Conclusion


This paper presents a novel CNN model to jointly address face recognition and face anti-spoofing in a mutually boosting way. In order to learn more generalizable Presentation Attack (PA) representations for face anti-spoofing, we propose a novel Total Pairwise Confusion (TPC) loss to balance the contribution of each spoof pattern, preventing the PA representations from overfitting to dataset-specific spoof patterns. Fast Domain Adaptation (FDA) is also incorporated into our framework to reduce the distribution dissimilarity of face samples from different domains, further enhancing the robustness of the PA representations. Extensive experiments on both face anti-spoofing and face recognition datasets show that our GFA-CNN achieves not only superior performance for face anti-spoofing on cross-tests, but also high accuracy for face recognition.


References

[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al. TensorFlow: a system for large-scale machine learning. In OSDI, volume 16, pages 265–283, 2016.
[2] Z. Boulkenafet, J. Komulainen, Z. Akhtar, A. Benlamoudi, D. Samai, S. E. Bekhouche, A. Ouafi, F. Dornaika, A. Taleb-Ahmed, L. Qin, et al. A competition on generalized software-based face presentation attack detection in mobile scenarios. In IJCB, pages 688–696, 2017.
[3] Z. Boulkenafet, J. Komulainen, and A. Hadid. Face anti-spoofing based on color texture analysis. In ICIP, pages 2636–2640, 2015.
[4] Z. Boulkenafet, J. Komulainen, and A. Hadid. Face spoofing detection using colour texture analysis. T-IFS, 11(8):1818–1830, 2016.
[5] Z. Boulkenafet, J. Komulainen, and A. Hadid. Face anti-spoofing using speeded-up robust features and Fisher vector encoding. IEEE Signal Processing Letters, 24(2):141–145, 2017.
[6] Z. Boulkenafet, J. Komulainen, and A. Hadid. On the generalization of color texture-based face anti-spoofing. Image and Vision Computing, 77:1–9, 2018.
[7] Z. Boulkenafet, J. Komulainen, L. Li, X. Feng, and A. Hadid. Oulu-NPU: A mobile face presentation attack database with real-world variations. In FG, pages 612–618, 2017.
[8] I. Chingovska, A. Anjos, and S. Marcel. On the effectiveness of local binary patterns in face anti-spoofing. In BIOSIG, 2012.
[9] T. de Freitas Pereira, A. Anjos, J. M. De Martino, and S. Marcel. Can face anti-spoofing countermeasures work in a real world scenario? In ICB, pages 1–8, 2013.
[10] A. Dubey, O. Gupta, P. Guo, R. Raskar, R. Farrell, and N. Naik. Pairwise confusion for fine-grained visual classification. In ECCV, pages 70–86, 2018.
[11] L. Engstrom. Fast style transfer. https://github.com/lengstrom/fast-style-transfer/, 2016.
[12] G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In ECCVW, 2008.
[13] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, pages 694–711, 2016.
[14] A. Jourabloo, Y. Liu, and X. Liu. Face de-spoofing: Anti-spoofing via noise modeling. arXiv preprint arXiv:1807.09968, 2018.
[15] K. Kollreider, H. Fronthaler, M. I. Faraj, and J. Bigun. Real-time face detection and motion analysis with application in liveness assessment. T-IFS, 2(3):548–558, 2007.
[16] J. Komulainen, A. Hadid, and M. Pietikainen. Context based face anti-spoofing. In BTAS, pages 1–8, 2013.
[17] H. Li, P. He, S. Wang, A. Rocha, X. Jiang, and A. C. Kot. Learning generalized deep feature representation for face anti-spoofing. T-IFS, 13(10):2639–2652, 2018.
[18] H. Li, W. Li, H. Cao, S. Wang, F. Huang, and A. C. Kot. Unsupervised domain adaptation for face anti-spoofing. T-IFS, 13(7):1794–1809, 2018.
[19] Y. Liu, A. Jourabloo, and X. Liu. Learning deep models for face anti-spoofing: Binary or auxiliary supervision. In CVPR, pages 389–398, 2018.
[20] G. Pan, L. Sun, Z. Wu, and S. Lao. Eyeblink-based anti-spoofing in face recognition from a generic webcamera. In ICCV, 2007.
[21] X. Pan, P. Luo, J. Shi, and X. Tang. Two at once: Enhancing learning and generalization capacities via IBN-Net. arXiv preprint arXiv:1807.09441, 2018.
[22] O. M. Parkhi, A. Vedaldi, A. Zisserman, et al. Deep face recognition. In BMVC, volume 1, page 6, 2015.
[23] M. Sajjad, S. Khan, T. Hussain, K. Muhammad, A. K. Sangaiah, A. Castiglione, C. Esposito, and S. W. Baik. CNN-based anti-spoofing two-tier multi-factor authentication system. Pattern Recognition Letters, 2018.
[24] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[25] X. Tan, Y. Li, J. Liu, and L. Jiang. Face liveness detection from a single image with sparse low rank bilinear discriminative model. In ECCV, pages 504–517, 2010.
[26] D. Wen, H. Han, and A. K. Jain. Face spoof detection with image distortion analysis. T-IFS, 10(4):746–761, 2015.
[27] J. Yang, Z. Lei, and S. Z. Li. Learn convolutional neural network for face anti-spoofing. arXiv preprint arXiv:1408.5601, 2014.
[28] Z. Zhang, J. Yan, S. Liu, Z. Lei, D. Yi, and S. Z. Li. A face antispoofing database with diverse attacks. In ICB, pages 26–31, 2012.
