
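Face3D Study Notes (5): 3DMM Example Source Code Walkthrough [Middle, Part 2]: Reconstructing a 3D Model from the Landmarks of a 2D Image with the Gold Standard Algorithm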

Published: 2023/12/20


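Preface

To keep the example project intuitive and easy to follow, the source of some functions is shown in its numpy version, even though the example program itself does not use the numpy version of the library. Wherever the Cython and numpy versions differ, the code is marked beforehand, so please watch for those notes.
A Jupyter version of the 3DMM example program will be published later, completely free; everyone is welcome to download it.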

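Source Code Walkthrough

The previous article introduced the 3DMM model and the forward reconstruction process, in which random shape and expression coefficients generate a face mesh. This article moves on to the second half of the example: obtaining new parameters from 2D image points and the corresponding 3D vertex indices, and thereby reconstructing a 3D face from a 2D image.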

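Theory

The theory in this part draws on another author's article and on several papers.
The previous article introduced the formula of the 3DMM model: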
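For reference, here is a sketch of that formula in the notation of this article. It is reconstructed from the line `X = shapeMU + shapePC.dot(sp) + expPC.dot(ep)` in fit_points; the basis symbols $s_i$, $e_i$ and the counts $m$, $n$ are notational assumptions.

```latex
S_{newmodel} = \bar{S} + \sum_{i=1}^{m} \alpha_i s_i + \sum_{i=1}^{n} \beta_i e_i
```

Under this reading, $\bar{S}$ corresponds to shapeMU, the $s_i$ to the columns of shapePC (weighted by $\alpha$, i.e. sp), and the $e_i$ to the columns of expPC (weighted by $\beta$, i.e. ep).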

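Given a single frontal face photo, a face alignment algorithm first computes the coordinates $x_i$ of the 68 landmarks of the target 2D face. The BFM model has 68 corresponding landmarks $X_i$. After projection the third dimension is discarded, so the correspondence between the landmarks is as follows:

From this information the coefficients $\alpha, \beta$ are solved for, fitting the mean face model to the face in the photo, that is:

The 3D face reconstruction problem is thus turned into finding the coefficients ( $\alpha, \beta$ ) that satisfy the following energy equation: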
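As a hedged sketch, the three relations referred to above can be written as follows under the usual weak-perspective formulation. This is consistent with the $s, R, t_{2d}$ and the regularization weights `lamb` that appear in the code later, but the symbols $P_{orth}$, $\lambda$, $\gamma_i$ and $\sigma_i$ are assumptions introduced here, not taken from the article.

```latex
% projection of the model landmarks, keeping only the first two coordinates
X_{projection} = s \cdot P_{orth} \cdot R \cdot S_{newmodel} + t_{2d},
\qquad
P_{orth} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}

% fitting objective
\arg\min_{\alpha, \beta} \sum_i \left\| X_{projection, i} - x_i \right\|^2

% energy with a regularization term on the coefficients
E = \sum_i \left\| X_{projection, i} - x_i \right\|^2
    + \lambda \sum_i \left( \frac{\gamma_i}{\sigma_i} \right)^2
```

Here $\gamma_i$ stands for a shape or expression coefficient and $\sigma_i$ for its associated deviation term; the regularizer keeps the coefficients from drifting far from the mean face.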

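The 3D points of the face model and the corresponding 2D points in the photo are related by a mapping that can be represented by a 3x4 affine matrix $P_A$, namely $X = P_A \cdot X_{3d}$.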

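The Gold Standard Algorithm

To compute the affine matrix, the code uses the Gold Standard algorithm.
The algorithm is as follows:
Objective: given n >= 4 correspondences between 3D points ( $X_i$ ) and 2D image points ( $x_i$ ), determine the maximum likelihood estimate of the affine camera projection matrix.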

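Normalization: for the 2D points ( $x_i$ ), compute a similarity transformation $T$ such that $\bar{x}_i = T x_i$; likewise, for the 3D points, compute $U$ such that $\bar{X}_i = U X_i$.
For each correspondence $x_i \leftrightarrow X_i$ there is a relation of the form $A x = b$ (here $x$ denotes the unknown vector of camera matrix entries, not an image point).
Solve the system using the pseudo-inverse of A.
Undo the normalization to obtain the affine matrix.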

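The Solution Procedure in Face3D

The procedure can be summarized as follows:
(1) Initialize $\alpha, \beta$ to 0;
(2) Use the Gold Standard algorithm to obtain an affine matrix $P_A$, and decompose it into $s, R, t_{2d}$;
(3) Substitute the $s, R, t_{2d}$ from (2) into the energy equation and solve for $\alpha$;
(4) Substitute the results of (2) and (3) into the energy equation and solve for $\beta$;
(5) Update $\alpha, \beta$ and repeat (2)-(4) iteratively.
Note that in the fit_points source shown later, each iteration estimates the expression coefficients ep first and the shape coefficients sp second.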

Code

The following walks step by step from the Face3D example script down to the library source:

The Example Script

    x = projected_vertices[bfm.kpt_ind, :2] # 2d keypoint, which can be detected from image
    X_ind = bfm.kpt_ind # index of keypoints in 3DMM. fixed.

    # fit
    fitted_sp, fitted_ep, fitted_s, fitted_angles, fitted_t = bfm.fit(x, X_ind, max_iter = 3)

    # verify fitted parameters
    fitted_vertices = bfm.generate_vertices(fitted_sp, fitted_ep)
    transformed_vertices = bfm.transform(fitted_vertices, fitted_s, fitted_angles, fitted_t)

    image_vertices = mesh.transform.to_image(transformed_vertices, h, w)
    fitted_image = mesh.render.render_colors(image_vertices, bfm.triangles, colors, h, w)

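x is the set of 2D landmarks $X$ from the formula; the example uses the 2D data exported when the 2D image was generated in the previous article.
X_ind holds the indices of the 3D landmarks in the BFM model, not their coordinates.
The script then executes
fitted_sp, fitted_ep, fitted_s, fitted_angles, fitted_t = bfm.fit(x, X_ind, max_iter = 3)
The source of bfm.fit is as follows: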

    def fit(self, x, X_ind, max_iter = 4, isShow = False):
        ''' fit 3dmm & pose parameters
        Args:
            x: (n, 2) image points
            X_ind: (n,) corresponding Model vertex indices
            max_iter: iteration
            isShow: whether to reserve middle results for show
        Returns:
            fitted_sp: (n_sp, 1). shape parameters
            fitted_ep: (n_ep, 1). exp parameters
            s, angles, t
        '''
        if isShow:
            fitted_sp, fitted_ep, s, R, t = fit.fit_points_for_show(x, X_ind, self.model, n_sp = self.n_shape_para, n_ep = self.n_exp_para, max_iter = max_iter)
            angles = np.zeros((R.shape[0], 3))
            for i in range(R.shape[0]):
                angles[i] = mesh.transform.matrix2angle(R[i])
        else:
            fitted_sp, fitted_ep, s, R, t = fit.fit_points(x, X_ind, self.model, n_sp = self.n_shape_para, n_ep = self.n_exp_para, max_iter = max_iter)
            angles = mesh.transform.matrix2angle(R)
        return fitted_sp, fitted_ep, s, angles, t

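The flag isShow is left at its default of False, so the else branch runs. It first calls the model-fitting code:
fitted_sp, fitted_ep, s, R, t = fit.fit_points(x, X_ind, self.model, n_sp = self.n_shape_para, n_ep = self.n_exp_para, max_iter = max_iter)
and then converts the resulting rotation matrix into rotation angles:
angles = mesh.transform.matrix2angle(R)
The source of fit.fit_points, the model-fitting part, is as follows: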

    def fit_points(x, X_ind, model, n_sp, n_ep, max_iter = 4):
        '''
        Args:
            x: (n, 2) image points
            X_ind: (n,) corresponding Model vertex indices
            model: 3DMM
            max_iter: iteration
        Returns:
            sp: (n_sp, 1). shape parameters
            ep: (n_ep, 1). exp parameters
            s, R, t
        '''
        x = x.copy().T

        #-- init
        sp = np.zeros((n_sp, 1), dtype = np.float32)
        ep = np.zeros((n_ep, 1), dtype = np.float32)

        #-------------------- estimate
        X_ind_all = np.tile(X_ind[np.newaxis, :], [3, 1])*3
        X_ind_all[1, :] += 1
        X_ind_all[2, :] += 2
        valid_ind = X_ind_all.flatten('F')

        shapeMU = model['shapeMU'][valid_ind, :]
        shapePC = model['shapePC'][valid_ind, :n_sp]
        expPC = model['expPC'][valid_ind, :n_ep]

        for i in range(max_iter):
            X = shapeMU + shapePC.dot(sp) + expPC.dot(ep)
            X = np.reshape(X, [int(len(X)/3), 3]).T

            #----- estimate pose
            P = mesh.transform.estimate_affine_matrix_3d22d(X.T, x.T)
            s, R, t = mesh.transform.P2sRt(P)
            rx, ry, rz = mesh.transform.matrix2angle(R)
            # print('Iter:{}; estimated pose: s {}, rx {}, ry {}, rz {}, t1 {}, t2 {}'.format(i, s, rx, ry, rz, t[0], t[1]))

            #----- estimate shape
            # expression
            shape = shapePC.dot(sp)
            shape = np.reshape(shape, [int(len(shape)/3), 3]).T
            ep = estimate_expression(x, shapeMU, expPC, model['expEV'][:n_ep,:], shape, s, R, t[:2], lamb = 0.002)

            # shape
            expression = expPC.dot(ep)
            expression = np.reshape(expression, [int(len(expression)/3), 3]).T
            sp = estimate_shape(x, shapeMU, shapePC, model['shapeEV'][:n_sp,:], expression, s, R, t[:2], lamb = 0.004)

        return sp, ep, s, R, t

Walking Through fit.fit_points Piece by Piece

(1) Initialize $\alpha, \beta$ to 0

    x = x.copy().T

    #-- init
    sp = np.zeros((n_sp, 1), dtype = np.float32)
    ep = np.zeros((n_ep, 1), dtype = np.float32)

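x is transposed, so its shape becomes (2, 68).
sp is $\alpha$ and ep is $\beta$. They are initialized as zero vectors of shape (n_sp, 1) and (n_ep, 1), which is (199, 1) and (29, 1) in this example.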

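Coordinate conversion for $X_{3d}$
The vertex coordinates in the BFM model are stored in the flat order $\{x_1, y_1, z_1, x_2, y_2, z_2, x_3, y_3, \dots\}$, whereas X_ind only gives the vertex indices of the 3D landmarks. The XYZ coordinate data of $X_{3d}$ therefore has to be looked up from X_ind.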

    X_ind_all = np.tile(X_ind[np.newaxis, :], [3, 1])*3
    X_ind_all[1, :] += 1
    X_ind_all[2, :] += 2
    valid_ind = X_ind_all.flatten('F')

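X_ind is an array of 68 vertex indices, of shape (68,).

X_ind_all = np.tile(X_ind[np.newaxis, :], [3, 1])*3
X_ind_all is expanded to shape (3, 68) and multiplied by 3, so that each entry points at the position of the X coordinate in the flat coordinate array:

X_ind_all[1, :] += 1
X_ind_all[2, :] += 2
The second row is then incremented by 1 and the third row by 2, so that they point at the Y and Z coordinates.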

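The rows are then merged:

valid_ind = X_ind_all.flatten('F')

flatten here is the numpy.ndarray.flatten method, which returns a copy of the array collapsed into one dimension. It is only available on numpy objects (array or matrix); a plain Python list does not have it.
'F' means the array is flattened in column-major order.
The merged result valid_ind therefore lists, for each landmark in turn, the positions of its X, Y and Z coordinates.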
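The index expansion can be checked on a small made-up case. The sketch below uses three hypothetical landmark indices and a random 10-vertex "model" instead of the real 68 BFM landmarks; only the indexing logic matches the code above.

```python
import numpy as np

# Hypothetical landmark vertex indices (the real X_ind has 68 entries).
X_ind = np.array([2, 5, 7])

# Row 0 -> position of x, row 1 -> position of y, row 2 -> position of z.
X_ind_all = np.tile(X_ind[np.newaxis, :], [3, 1]) * 3
X_ind_all[1, :] += 1
X_ind_all[2, :] += 2

# Column-major flattening interleaves the rows:
# x, y, z of vertex 2, then x, y, z of vertex 5, and so on.
valid_ind = X_ind_all.flatten('F')
print(valid_ind.tolist())  # [6, 7, 8, 15, 16, 17, 21, 22, 23]

# Toy stand-in for the flat BFM coordinate array: 10 vertices stored as
# x1, y1, z1, x2, y2, z2, ...
rng = np.random.default_rng(0)
vertices = rng.normal(size=(10, 3))
flat = vertices.reshape(-1, 1)

# Looking up valid_ind in the flat array recovers exactly the landmark rows.
picked = flat[valid_ind, :].reshape(-1, 3)
print(np.array_equal(picked, vertices[X_ind]))  # True
```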

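Using the merged valid_ind, the mean face shape, the shape principal components and the expression principal components are extracted for the landmarks only:
shapeMU = model['shapeMU'][valid_ind, :]
shapePC = model['shapePC'][valid_ind, :n_sp]
expPC = model['expPC'][valid_ind, :n_ep]

The mean face shape shapeMU has shape (68*3, 1).

The shape principal components shapePC have shape (68*3, 199).

The expression principal components expPC have shape (68*3, 29).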

    for i in range(max_iter):
        X = shapeMU + shapePC.dot(sp) + expPC.dot(ep)
        X = np.reshape(X, [int(len(X)/3), 3]).T

        #----- estimate pose
        P = mesh.transform.estimate_affine_matrix_3d22d(X.T, x.T)
        s, R, t = mesh.transform.P2sRt(P)
        rx, ry, rz = mesh.transform.matrix2angle(R)
        # print('Iter:{}; estimated pose: s {}, rx {}, ry {}, rz {}, t1 {}, t2 {}'.format(i, s, rx, ry, rz, t[0], t[1]))

        #----- estimate shape
        # expression
        shape = shapePC.dot(sp)
        shape = np.reshape(shape, [int(len(shape)/3), 3]).T
        ep = estimate_expression(x, shapeMU, expPC, model['expEV'][:n_ep,:], shape, s, R, t[:2], lamb = 0.002)

        # shape
        expression = expPC.dot(ep)
        expression = np.reshape(expression, [int(len(expression)/3), 3]).T
        sp = estimate_shape(x, shapeMU, shapePC, model['shapeEV'][:n_sp,:], expression, s, R, t[:2], lamb = 0.004)

    return sp, ep, s, R, t

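max_iter in the loop is the user-defined number of iterations; the function default is 4, and the example script passes max_iter = 3.
X = shapeMU + shapePC.dot(sp) + expPC.dot(ep)
X = np.reshape(X, [int(len(X)/3), 3]).T
Here $X$ is $S_{newmodel}$, computed from the 3DMM formula with the current coefficients and restricted to the landmarks; it is the new $X_{3d}$. After the reshape and transpose its shape is (3, 68).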
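To make the shapes concrete, the two lines can be run on toy data. This is only a sketch: the sizes (4 landmarks, 5 shape and 3 expression components) and the random values are assumptions standing in for the real BFM arrays.

```python
import numpy as np

rng = np.random.default_rng(0)
n_kpt, n_sp, n_ep = 4, 5, 3   # toy sizes; the article uses 68, 199 and 29

# Random stand-ins for the landmark rows of the model.
shapeMU = rng.normal(size=(n_kpt * 3, 1))
shapePC = rng.normal(size=(n_kpt * 3, n_sp))
expPC = rng.normal(size=(n_kpt * 3, n_ep))

# First iteration: both coefficient vectors are still zero.
sp = np.zeros((n_sp, 1))
ep = np.zeros((n_ep, 1))

X = shapeMU + shapePC.dot(sp) + expPC.dot(ep)   # (n_kpt*3, 1), flat x, y, z order
X = np.reshape(X, [int(len(X) / 3), 3]).T       # (3, n_kpt), one landmark per column

print(X.shape)  # (3, 4)
# With zero coefficients X is the mean shape; column j holds (x_j, y_j, z_j).
print(np.allclose(X[:, 1], shapeMU[3:6, 0]))  # True
```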

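The real key is mesh.transform.estimate_affine_matrix_3d22d(X.T, x.T), which fits the model landmarks to the image points by estimating the affine camera matrix.
Its source is as follows: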

    def estimate_affine_matrix_3d22d(X, x):
        ''' Using Golden Standard Algorithm for estimating an affine camera
            matrix P from world to image correspondences.
            See Alg.7.2. in MVGCV
            Code Ref: https://github.com/patrikhuber/eos/blob/master/include/eos/fitting/affine_camera_estimation.hpp
            x_homo = X_homo.dot(P_Affine)
        Args:
            X: [n, 3]. corresponding 3d points(fixed)
            x: [n, 2]. n>=4. 2d points(moving). x = PX
        Returns:
            P_Affine: [3, 4]. Affine camera matrix
        '''
        X = X.T; x = x.T
        assert(x.shape[1] == X.shape[1])
        n = x.shape[1]
        assert(n >= 4)

        #--- 1. normalization
        # 2d points
        mean = np.mean(x, 1) # (2,)
        x = x - np.tile(mean[:, np.newaxis], [1, n])
        average_norm = np.mean(np.sqrt(np.sum(x**2, 0)))
        scale = np.sqrt(2) / average_norm
        x = scale * x

        T = np.zeros((3,3), dtype = np.float32)
        T[0, 0] = T[1, 1] = scale
        T[:2, 2] = -mean*scale
        T[2, 2] = 1

        # 3d points
        X_homo = np.vstack((X, np.ones((1, n))))
        mean = np.mean(X, 1) # (3,)
        X = X - np.tile(mean[:, np.newaxis], [1, n])
        m = X_homo[:3,:] - X
        average_norm = np.mean(np.sqrt(np.sum(X**2, 0)))
        scale = np.sqrt(3) / average_norm
        X = scale * X

        U = np.zeros((4,4), dtype = np.float32)
        U[0, 0] = U[1, 1] = U[2, 2] = scale
        U[:3, 3] = -mean*scale
        U[3, 3] = 1

        # --- 2. equations
        A = np.zeros((n*2, 8), dtype = np.float32)
        X_homo = np.vstack((X, np.ones((1, n)))).T
        A[:n, :4] = X_homo
        A[n:, 4:] = X_homo

        b = np.reshape(x, [-1, 1])

        # --- 3. solution
        p_8 = np.linalg.pinv(A).dot(b)
        P = np.zeros((3, 4), dtype = np.float32)
        P[0, :] = p_8[:4, 0]
        P[1, :] = p_8[4:, 0]
        P[-1, -1] = 1

        # --- 4. denormalization
        P_Affine = np.linalg.inv(T).dot(P.dot(U))
        return P_Affine

    def P2sRt(P):
        ''' decompositing camera matrix P
        Args:
            P: (3, 4). Affine Camera Matrix.
        Returns:
            s: scale factor.
            R: (3, 3). rotation matrix.
            t: (3,). translation.
        '''
        t = P[:, 3]
        R1 = P[0:1, :3]
        R2 = P[1:2, :3]
        s = (np.linalg.norm(R1) + np.linalg.norm(R2))/2.0
        r1 = R1/np.linalg.norm(R1)
        r2 = R2/np.linalg.norm(R2)
        r3 = np.cross(r1, r2)

        R = np.concatenate((r1, r2, r3), 0)
        return s, R, t

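The following explains this part in detail.

(2) Use the Gold Standard algorithm to obtain an affine matrix $P_A$, and decompose it into $s, R, t_{2d}$

estimate_affine_matrix_3d22d is the concrete implementation of the Gold Standard algorithm.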

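a) Normalization

For the 2D points $X$, compute a similarity transformation $T$ such that $\bar{X} = TX$; likewise, for the 3D points $X_{3d}$, compute $U$ such that $\bar{X}_{3d} = UX_{3d}$.
The concept of normalization is described in the book Multiple View Geometry in Computer Vision.
It can be summarized in the following three steps:

Translate all points so that their centroid is at the origin.

Scale the points so that their average distance from the origin equals $\sqrt{2}$.

Apply this transformation independently to each set of points (each image, in the book's setting).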

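The code is explained below.
Input check: make sure the numbers of 2D and 3D landmarks are equal and that there are at least 4 of them.

    X = X.T; x = x.T
    assert(x.shape[1] == X.shape[1])
    n = x.shape[1]
    assert(n >= 4)

Normalizing the 2D data: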

    #--- 1. normalization
    # 2d points
    mean = np.mean(x, 1) # (2,)
    x = x - np.tile(mean[:, np.newaxis], [1, n])
    average_norm = np.mean(np.sqrt(np.sum(x**2, 0)))
    scale = np.sqrt(2) / average_norm
    x = scale * x

    T = np.zeros((3,3), dtype = np.float32)
    T[0, 0] = T[1, 1] = scale
    T[:2, 2] = -mean*scale
    T[2, 2] = 1

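Translate all points so that their centroid is at the origin.
After x = x.T, x has shape (2, 68).
mean = np.mean(x, 1) computes the mean of the X coordinates and of the Y coordinates of x; mean has shape (2,).
The step x = x - np.tile(mean[:, np.newaxis], [1, n])
subtracts this mean from every XY coordinate of x, which translates the points so that their centroid lies at the origin.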

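Then scale the points so that their average distance from the origin equals $\sqrt{2}$.
average_norm = np.mean(np.sqrt(np.sum(x**2, 0)))
computes the current average distance of all 2D points from the origin, average_norm, which is a scalar.
scale = np.sqrt(2) / average_norm
x = scale * x
computes scale and multiplies the coordinates of x by it, which is equivalent to dividing every coordinate by the current average distance and then multiplying by $\sqrt{2}$.
As a result, the average distance of the points from the origin becomes $\sqrt{2}$.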

The scale and mean computed above also give the similarity transformation T:
T = np.zeros((3,3), dtype = np.float32)
T[0, 0] = T[1, 1] = scale
T[:2, 2] = -mean*scale
T[2, 2] = 1
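The following sketch checks the three properties claimed for the 2D normalization on made-up points: the centroid ends up at the origin, the mean distance becomes $\sqrt{2}$, and multiplying the raw homogeneous points by T gives the same result as the explicit shift and scale. The point values are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
x_raw = rng.uniform(0, 256, size=(2, n))   # made-up 2D points, one per column

# Same steps as the normalization code above.
mean = np.mean(x_raw, 1)
x = x_raw - mean[:, np.newaxis]
average_norm = np.mean(np.sqrt(np.sum(x**2, 0)))
scale = np.sqrt(2) / average_norm
x = scale * x

T = np.zeros((3, 3))
T[0, 0] = T[1, 1] = scale
T[:2, 2] = -mean * scale
T[2, 2] = 1

# Apply T to the raw points in homogeneous coordinates.
x_raw_homo = np.vstack((x_raw, np.ones((1, n))))
x_from_T = T.dot(x_raw_homo)[:2]

centroid_ok = np.allclose(x.mean(axis=1), 0)
distance_ok = np.isclose(np.mean(np.linalg.norm(x, axis=0)), np.sqrt(2))
same_as_T = np.allclose(x_from_T, x)
print(centroid_ok, distance_ok, same_as_T)  # True True True
```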

    # 3d points
    X_homo = np.vstack((X, np.ones((1, n))))
    mean = np.mean(X, 1) # (3,)
    X = X - np.tile(mean[:, np.newaxis], [1, n])
    m = X_homo[:3,:] - X
    average_norm = np.mean(np.sqrt(np.sum(X**2, 0)))
    scale = np.sqrt(3) / average_norm
    X = scale * X

    U = np.zeros((4,4), dtype = np.float32)
    U[0, 0] = U[1, 1] = U[2, 2] = scale
    U[:3, 3] = -mean*scale
    U[3, 3] = 1

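Normalization of the 3D points works the same way as in 2D. The differences are that the average distance from the origin is scaled to $\sqrt{3}$, and that the resulting similarity transformation matrix $U$ has shape (4, 4). The details are not repeated here.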

b) For each correspondence $x_i \leftrightarrow X_i$ there is a relation of the form $A x = b$

    # --- 2. equations
    A = np.zeros((n*2, 8), dtype = np.float32)
    X_homo = np.vstack((X, np.ones((1, n)))).T
    A[:n, :4] = X_homo
    A[n:, 4:] = X_homo

    b = np.reshape(x, [-1, 1])

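This is best read together with the per-correspondence equation of the algorithm:

$$\begin{bmatrix} \bar{X}_i^T & 0^T \\ 0^T & \bar{X}_i^T \end{bmatrix} \begin{pmatrix} \bar{P}^{1} \\ \bar{P}^{2} \end{pmatrix} = \begin{pmatrix} \bar{x}_i \\ \bar{y}_i \end{pmatrix}$$

A corresponds to the blocks $\begin{bmatrix} \bar{X}_i^T & 0^T \\ 0^T & \bar{X}_i^T \end{bmatrix}$ stacked over all correspondences. The code places all first rows in the top half of A and all second rows in the bottom half, where $\bar{P}^{1}$ and $\bar{P}^{2}$ denote the first two rows of the normalized camera matrix.
b is x flattened to shape (68*2, 1).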
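The row ordering of A and b matters: because x has shape (2, n), the row-major reshape puts all first image coordinates before all second ones, which matches the two halves of A. The sketch below illustrates this with made-up points and a made-up pair of camera rows (P_rows is a hypothetical name, not part of Face3D).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
X = rng.normal(size=(3, n))        # made-up 3D points, one per column
P_rows = rng.normal(size=(2, 4))   # made-up first two rows of a camera matrix

X_homo = np.vstack((X, np.ones((1, n)))).T   # (n, 4)
x = P_rows.dot(X_homo.T)                     # (2, n): exact projections

# Same construction as in the code above.
A = np.zeros((n * 2, 8))
A[:n, :4] = X_homo    # equations for the first image coordinate
A[n:, 4:] = X_homo    # equations for the second image coordinate
b = np.reshape(x, [-1, 1])   # row-major: all first coordinates, then all second

# Unknowns stacked as [row 0 of P, row 1 of P].
p_8 = P_rows.reshape(-1, 1)
system_holds = np.allclose(A.dot(p_8), b)
print(A.shape, b.shape, system_holds)  # (10, 8) (10, 1) True
```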

c) Solve with the pseudo-inverse of A

    # --- 3. solution
    p_8 = np.linalg.pinv(A).dot(b)
    P = np.zeros((3, 4), dtype = np.float32)
    P[0, :] = p_8[:4, 0]
    P[1, :] = p_8[4:, 0]
    P[-1, -1] = 1

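For the concept of the pseudo-inverse of A and how to compute it, see page 590 onward of Multiple View Geometry in Computer Vision. Here the pseudo-inverse is computed directly with the numpy function np.linalg.pinv, which is very convenient. The eight solved values fill the first two rows of P, and the last row is set to (0, 0, 0, 1).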
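As a small illustration of what the pseudo-inverse does here: for an overdetermined system it yields the least-squares solution. The sketch below compares np.linalg.pinv with np.linalg.lstsq on a random, made-up system of the same size class (more equations than unknowns); it is not taken from Face3D.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(10, 8))   # made-up overdetermined system: 10 equations, 8 unknowns
b = rng.normal(size=(10, 1))

# Solution via the pseudo-inverse, as in the code above.
p_pinv = np.linalg.pinv(A).dot(b)

# Reference least-squares solution.
p_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
same_solution = np.allclose(p_pinv, p_lstsq)

# At the least-squares solution the residual is orthogonal to the columns of A.
residual_orthogonal = np.allclose(A.T.dot(A.dot(p_pinv) - b), 0)
print(same_solution, residual_orthogonal)  # True True
```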

d) Undo the normalization to obtain the affine matrix

    # --- 4. denormalization
    P_Affine = np.linalg.inv(T).dot(P.dot(U))
    return P_Affine

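This part of the code follows the formula $P_A = T^{-1} \bar{P} U$, where $\bar{P}$ is the camera matrix solved in normalized coordinates. It follows from $\bar{x} = Tx$, $\bar{X} = UX$ and $\bar{x} = \bar{P}\bar{X}$, which give $x = T^{-1} \bar{P} U X$.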

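These four steps make up the complete Gold Standard algorithm.
The resulting $P_{Affine}$ is the $P_A$ in the formula. At this point the Gold Standard algorithm has produced the $P_A$ in $X = P_A \cdot X_{3d}$.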
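The four steps can be sanity-checked end to end on synthetic data. The sketch below is a compact re-implementation written for this illustration (the helper names normalizing_transform and estimate_affine are hypothetical, not Face3D functions). It projects random 3D points with a known affine camera and checks that the estimate recovers that camera when the data is noise-free.

```python
import numpy as np

def normalizing_transform(pts, target):
    """Homogeneous similarity transform that centers pts (d, n) and scales
    their mean distance from the origin to `target`."""
    d = pts.shape[0]
    mean = pts.mean(axis=1)
    scale = target / np.mean(np.linalg.norm(pts - mean[:, None], axis=0))
    M = np.eye(d + 1)
    M[:d, :d] *= scale
    M[:d, d] = -mean * scale
    return M

def estimate_affine(X, x):
    """X: (n, 3) 3D points, x: (n, 2) 2D points. Returns a (3, 4) affine camera."""
    n = X.shape[0]
    # 1. normalization
    T = normalizing_transform(x.T, np.sqrt(2))
    U = normalizing_transform(X.T, np.sqrt(3))
    x_n = T.dot(np.vstack((x.T, np.ones((1, n)))))[:2]   # (2, n)
    X_n = U.dot(np.vstack((X.T, np.ones((1, n))))).T     # (n, 4), homogeneous
    # 2. equations
    A = np.zeros((2 * n, 8))
    A[:n, :4] = X_n
    A[n:, 4:] = X_n
    b = x_n.reshape(-1, 1)
    # 3. solution via the pseudo-inverse
    P = np.zeros((3, 4))
    P[:2] = np.linalg.pinv(A).dot(b).reshape(2, 4)
    P[2, 3] = 1
    # 4. denormalization
    return np.linalg.inv(T).dot(P).dot(U)

rng = np.random.default_rng(4)
P_true = np.vstack((rng.normal(size=(2, 4)), [0, 0, 0, 1]))   # made-up affine camera
X = rng.normal(size=(8, 3))                                   # made-up 3D points
x = np.hstack((X, np.ones((8, 1)))).dot(P_true.T)[:, :2]      # exact 2D projections

P_est = estimate_affine(X, x)
print(np.allclose(P_est, P_true))  # True for noise-free data
```

With noisy landmarks the recovered matrix would only approximate the true camera, which is why the surrounding fitting loop iterates.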

Decomposing the affine matrix $P_A$ into $s, R, t_{2d}$

    s, R, t = mesh.transform.P2sRt(P)
    rx, ry, rz = mesh.transform.matrix2angle(R)

The source of mesh.transform.P2sRt is as follows:

    def P2sRt(P):
        ''' decompositing camera matrix P
        Args:
            P: (3, 4). Affine Camera Matrix.
        Returns:
            s: scale factor.
            R: (3, 3). rotation matrix.
            t: (3,). translation.
        '''
        t = P[:, 3]
        R1 = P[0:1, :3]
        R2 = P[1:2, :3]
        s = (np.linalg.norm(R1) + np.linalg.norm(R2))/2.0
        r1 = R1/np.linalg.norm(R1)
        r2 = R2/np.linalg.norm(R2)
        r3 = np.cross(r1, r2)

        R = np.concatenate((r1, r2, r3), 0)
        return s, R, t

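This part decomposes the affine matrix $P_A$ into the scale factor s, the rotation matrix R and the translation vector t.

The code is fairly simple: t is the last column of P, s is the average of the norms of the first two rows of the left 3x3 block, and R is built from those two rows after normalization, with their cross product as the third row.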
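A quick way to convince oneself that the decomposition works is to build P from a known pose and decompose it again. The sketch below repeats the P2sRt steps from the listing above and uses a made-up pose (a 30 degree rotation about the z axis, scale 2.5); these values are illustrative only.

```python
import numpy as np

def P2sRt(P):
    # Same steps as the P2sRt listing above.
    t = P[:, 3]
    R1 = P[0:1, :3]
    R2 = P[1:2, :3]
    s = (np.linalg.norm(R1) + np.linalg.norm(R2)) / 2.0
    r1 = R1 / np.linalg.norm(R1)
    r2 = R2 / np.linalg.norm(R2)
    r3 = np.cross(r1, r2)
    R = np.concatenate((r1, r2, r3), 0)
    return s, R, t

# Made-up pose: rotation of 30 degrees about the z axis, scale 2.5.
a = np.deg2rad(30)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
s_true = 2.5
t_true = np.array([10.0, -4.0, 1.0])   # last entry is the 1 in the bottom-right of P

# Affine camera: first two rows are s * R[:2], last row is (0, 0, 0, 1).
P = np.zeros((3, 4))
P[:2, :3] = s_true * R_true[:2]
P[:, 3] = t_true

s, R, t = P2sRt(P)
print(np.isclose(s, s_true), np.allclose(R, R_true), np.allclose(t, t_true))  # True True True
```

Only the first two entries of t are used afterwards (the fitting code passes t[:2]), which is the $t_{2d}$ of the formulas.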
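For reasons of length, this article only covers the source code analysis for steps (1) and (2); solving for $\alpha, \beta$ will be covered in the next article.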
