SVM RBF Face Recognition (Yale): A Hands-On Lesson in Face Recognition with Machine Learning

Published: 2023/11/27

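A support vector machine (SVM) is a common discriminative method. In machine learning it is a supervised learning model, typically used for pattern recognition, classification, and regression analysis. An SVM looks for a separating hyperplane in n-dimensional space that divides the points into classes. In general, the distance from a point to the hyperplane indicates how confident the classification is, so the SVM maximizes the margin, that is, the distance from the hyperplane to the nearest training points. The points that lie on the margin boundaries (the dashed lines in the usual illustration) are called support vectors.

By completing this task you will:

1. Understand the idea behind the support vector machine algorithm.
2. Learn how to use SVMs in the sklearn library.
3. Become familiar with building a machine learning model and using it to predict data.
4. Understand the roles of the training set and the test set.
5. Learn how to draw figures with the matplotlib library.
6. Learn how to evaluate the quality of a model.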

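Experiment principle:

A support vector machine is a classification algorithm that improves a learner's ability to generalize by minimizing structural risk, that is, by minimizing the empirical risk and the confidence interval together, so that it can still capture good statistical regularities when few samples are available. Put simply, it is a binary classification model whose basic form is the linear classifier with the largest margin in feature space. The learning strategy is margin maximization, which can ultimately be reduced to solving a convex quadratic programming problem.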
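As a minimal sketch of margin maximization, the toy example below fits a linear SVC to four hand-picked 2-D points. The data and variable names are illustrative assumptions, not part of the original lesson. For this data the two inner points, (1, 1) and (3, 3), should come out as the support vectors, and the learned hyperplane should be close to 0.5*x1 + 0.5*x2 - 2 = 0.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (assumed for illustration): two classes along the diagonal.
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Only the points closest to the boundary end up as support vectors.
support_points = X[clf.support_]
w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

print("support vectors:", support_points.tolist())
print("w =", w, "b =", b)
print("margin width =", margin_width)
```

The margin width printed here should roughly equal the distance between (1, 1) and (3, 3), since the maximum-margin hyperplane sits halfway between the two closest points of opposite classes.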

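In detail:

1. Find a separating hyperplane in n-dimensional space that divides the points into classes. The simplest case is linear classification, where a straight line separates two classes of points in the plane.

2. In general, the distance from a point to the hyperplane indicates how confident the classification is. The SVM maximizes this margin. The points that lie on the margin boundaries (the dashed lines in the usual illustration) are called support vectors.

3. In practice the samples are often not linearly separable. The usual remedy is to map the sample features into a higher-dimensional space where they become separable.

4. Mapping linearly inseparable data into a higher-dimensional space can make the dimensionality frighteningly large (19 dimensions, or even infinitely many), which makes the computation expensive. The value of a kernel function is that, although it too corresponds to a transformation of the features from low to high dimensions, it performs the computation in the low-dimensional space while the classification effect is realized in the high-dimensional space, thereby avoiding complex computation directly in the high-dimensional space.

5. Use slack variables to handle noise in the data.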
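The kernel idea in the list above can be illustrated with a small sketch. It uses the homogeneous degree-2 polynomial kernel (u . v)^2 rather than the RBF kernel, because its feature map is finite and easy to write out. The function names `phi` and `poly2_kernel` are made up for this illustration. Both computations should give the same number, but the kernel never builds the higher-dimensional vectors.

```python
import numpy as np

def phi(x):
    """Explicit feature map from 2-D to 3-D for the kernel (u . v)^2."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def poly2_kernel(u, v):
    """Homogeneous degree-2 polynomial kernel, computed in the 2-D input space."""
    return float(np.dot(u, v)) ** 2

u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])

explicit = float(np.dot(phi(u), phi(v)))  # inner product in the 3-D feature space
implicit = poly2_kernel(u, v)             # same value without ever building phi

print(explicit, implicit)
```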

The structure of the SVM classifier in sklearn and the meaning of each parameter are as follows.

sklearn.svm.SVC:

sklearn.svm.SVC(C=1.0, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape=None, random_state=None)

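Parameters:

C: penalty parameter C of C-SVC. Default is 1.0.
A larger C penalizes the slack variables more heavily and pushes them toward 0. The penalty for misclassification grows and the model tends toward classifying every training sample correctly, so accuracy on the training set is high but generalization tends to be weak. A smaller C reduces the penalty for misclassification and tolerates errors by treating them as noise, which tends to generalize better.
kernel: the kernel function. Default is 'rbf'; can be 'linear', 'poly', 'rbf', 'sigmoid', or 'precomputed'.
    0 - linear: u'*v
    1 - polynomial: (gamma*u'*v + coef0)^degree
    2 - RBF: exp(-gamma*|u-v|^2)
    3 - sigmoid: tanh(gamma*u'*v + coef0)
degree: degree of the polynomial ('poly') kernel. Default is 3; ignored by all other kernels.
gamma: kernel coefficient for 'rbf', 'poly', and 'sigmoid'. With 'auto', 1/n_features is used. 'auto' is the default in the signature shown here; scikit-learn 0.22 and later default to 'scale', which is 1 / (n_features * X.var()).
coef0: the constant term of the kernel function. Only used by 'poly' and 'sigmoid'.
probability: whether to enable probability estimates. Default is False.
shrinking: whether to use the shrinking heuristic. Default is True.
tol: tolerance for the stopping criterion. Default is 1e-3.
cache_size: size of the kernel cache in MB. Default is 200.
class_weight: class weights, passed as a dict. Sets the parameter C of class i to weight*C (the C of C-SVC).
verbose: whether to enable verbose output.
max_iter: maximum number of iterations. -1 means no limit.
decision_function_shape: 'ovo', 'ovr', or None. Default is None in the signature shown here; scikit-learn 0.19 and later default to 'ovr'.
random_state: seed used when shuffling the data; an int.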

The parameters most often tuned are C, kernel, degree, gamma, and coef0.
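To make the RBF formula and the role of gamma concrete, the sketch below computes exp(-gamma * |u - v|^2) by hand for a few made-up points and compares the result with `sklearn.metrics.pairwise.rbf_kernel`. The data is random and purely illustrative; gamma is set to 1/n_features, the value described above for 'auto'.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.RandomState(0)
X = rng.rand(5, 3)          # 5 samples, 3 features (made-up data)
gamma = 1.0 / X.shape[1]    # the 'auto' rule: 1 / n_features

# Direct implementation of exp(-gamma * ||u - v||^2) for every pair of rows.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
K_manual = np.exp(-gamma * sq_dists)

K_sklearn = rbf_kernel(X, X, gamma=gamma)

print(np.allclose(K_manual, K_sklearn))
```

A larger gamma makes the kernel value fall off faster with distance, so each training point influences a smaller neighbourhood. That is why gamma is usually tuned together with C.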

System environment

Linux Ubuntu 16.04

Python 3.6

Task

Use the SVM algorithm to perform face recognition on the fetch_lfw_people data and visualize the predictions.

Steps

1. Create a directory and download the data needed for the experiment.

mkdir -p /home/zhangyu/scikit_learn_data/lfw_home
cd /home/zhangyu/scikit_learn_data/lfw_home
wget http://192.168.1.100:60000/allfiles/ma_learn/lfwfunneled.tgz
wget http://192.168.1.100:60000/allfiles/ma_learn/pairsDevTest.txt
wget http://192.168.1.100:60000/allfiles/ma_learn/pairsDevTrain.txt
wget http://192.168.1.100:60000/allfiles/ma_learn/pairs.txt
tar xzvf lfwfunneled.tgz

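2. Create a new Python project named python15.

In the python15 project, create a new Python file named SVM.

3. Use the SVM algorithm to perform face recognition on the fetch_lfw_people data and visualize the predictions. The complete code is as follows: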

from __future__ import print_function
from time import time
import logging
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_lfw_people
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.decomposition import PCA
from sklearn.svm import SVC
# Display progress logs on stdout
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')
###############################################################################
# Download the data, if not already on disk and load it as numpy arrays
lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
# introspect the images arrays to find the shapes (for plotting)
n_samples, h, w = lfw_people.images.shape
# for machine learning we use the flattened data directly (as relative pixel
# positions info is ignored by this model)
X = lfw_people.data
n_features = X.shape[1]
# the label to predict is the id of the person
y = lfw_people.target
target_names = lfw_people.target_names
n_classes = target_names.shape[0]
print("Total dataset size:")
print("n_samples: %d" % n_samples)
print("n_features: %d" % n_features)
print("n_classes: %d" % n_classes)
###############################################################################
# Split into a training set (75%) and a test set (25%)
# (a plain random split; train_test_split does not stratify unless asked to)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25)
###############################################################################
# Compute a PCA (eigenfaces) on the face dataset (treated as unlabeled
# dataset): unsupervised feature extraction / dimensionality reduction
n_components = 150
print("Extracting the top %d eigenfaces from %d faces"
      % (n_components, X_train.shape[0]))
t0 = time()
pca = PCA(svd_solver='randomized', n_components=n_components, whiten=True).fit(X_train)
print("done in %0.3fs" % (time() - t0))
eigenfaces = pca.components_.reshape((n_components, h, w))
print("Projecting the input data on the eigenfaces orthonormal basis")
t0 = time()
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)
print("done in %0.3fs" % (time() - t0))
###############################################################################
# Train a SVM classification model
print("Fitting the classifier to the training set")
t0 = time()
param_grid = {'C': [1e3, 5e3, 1e4, 5e4, 1e5],
              'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1], }
clf = GridSearchCV(SVC(kernel='rbf', class_weight='balanced'), param_grid)
clf = clf.fit(X_train_pca, y_train)
print("done in %0.3fs" % (time() - t0))
print("Best estimator found by grid search:")
print(clf.best_estimator_)
###############################################################################
# Quantitative evaluation of the model quality on the test set
print("Predicting people's names on the test set")
t0 = time()
y_pred = clf.predict(X_test_pca)
print("done in %0.3fs" % (time() - t0))
print(classification_report(y_test, y_pred, target_names=target_names))
print(confusion_matrix(y_test, y_pred, labels=range(n_classes)))
###############################################################################
# Qualitative evaluation of the predictions using matplotlib
def plot_gallery(images, titles, h, w, n_row=3, n_col=4):
    """Helper function to plot a gallery of portraits"""
    plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())
# plot the result of the prediction on a portion of the test set
def title(y_pred, y_test, target_names, i):
    pred_name = target_names[y_pred[i]].rsplit(' ', 1)[-1]
    true_name = target_names[y_test[i]].rsplit(' ', 1)[-1]
    return 'predicted: %s\ntrue:      %s' % (pred_name, true_name)
prediction_titles = [title(y_pred, y_test, target_names, i)
                     for i in range(y_pred.shape[0])]
plot_gallery(X_test, prediction_titles, h, w)
# plot the gallery of the most significative eigenfaces
eigenface_titles = ["eigenface %d" % i for i in range(eigenfaces.shape[0])]
plot_gallery(eigenfaces, eigenface_titles, h, w)
plt.show()

4. The remaining steps walk through the complete code section by section. First, import the packages used in the experiment.

from __future__ import print_function
from time import time
import logging
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_lfw_people
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.decomposition import PCA
from sklearn.svm import SVC

5. Load the data.

lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
# introspect the images arrays to find the shapes (for plotting)
n_samples, h, w = lfw_people.images.shape
# for machine learning we use the flattened data directly (as relative pixel
# positions info is ignored by this model)
X = lfw_people.data
n_features = X.shape[1]
# the label to predict is the id of the person
y = lfw_people.target
target_names = lfw_people.target_names
n_classes = target_names.shape[0]
print("Total dataset size:")
print("n_samples: %d" % n_samples)
print("n_features: %d" % n_features)
print("n_classes: %d" % n_classes)

Output: [figure not preserved]
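The relationship between the `images` array (one 2-D picture per sample) and the `data` array (one flattened row per sample) can be sketched without downloading anything. The random array below is a stand-in for `lfw_people.images`, and the 50 x 37 image size is an assumption chosen for illustration.

```python
import numpy as np

# Stand-in for lfw_people.images: 6 made-up grayscale images of 50 x 37 pixels.
rng = np.random.RandomState(0)
images = rng.rand(6, 50, 37)

n_samples, h, w = images.shape

# Flatten each image into one row of pixel features.
X = images.reshape(n_samples, h * w)
n_features = X.shape[1]

# A flattened row can be turned back into an image for plotting.
restored = X[0].reshape(h, w)

print(n_samples, h, w, n_features)
```

This is why h and w are kept around: the classifier only sees flat rows, but the plotting code later needs the original image shape.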

6. Feature extraction.

# Split into a training set (75%) and a test set (25%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25)
###############################################################################
# Compute a PCA (eigenfaces) on the face dataset (treated as unlabeled
# dataset): unsupervised feature extraction / dimensionality reduction
n_components = 150
print("Extracting the top %d eigenfaces from %d faces"
      % (n_components, X_train.shape[0]))
t0 = time()
pca = PCA(svd_solver='randomized', n_components=n_components, whiten=True).fit(X_train)
print("done in %0.3fs" % (time() - t0))
eigenfaces = pca.components_.reshape((n_components, h, w))
print("Projecting the input data on the eigenfaces orthonormal basis")
t0 = time()
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)
print("done in %0.3fs" % (time() - t0))

Output: [figure not preserved]
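The split-then-project pattern in this step can be tried on made-up data. The sketch below is an illustration under stated assumptions: the data is random, the sizes are arbitrary, and `svd_solver='full'` is swapped in for the lesson's 'randomized' solver so that the unit-variance check is exact. It shows that PCA is fitted on the training set only and that `whiten=True` rescales each projected training feature to unit sample variance.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Made-up data standing in for the flattened face images.
rng = np.random.RandomState(0)
X = rng.rand(200, 30)
y = rng.randint(0, 3, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit PCA on the training set only, then apply the same projection to both sets.
n_components = 5
pca = PCA(n_components=n_components, whiten=True, svd_solver="full").fit(X_train)
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)

print(X_train_pca.shape, X_test_pca.shape)
# With whiten=True each projected training feature has unit sample variance.
print(X_train_pca.var(axis=0, ddof=1))
```

Fitting the projection on the training set alone keeps information from the test set out of the model, which is what makes the later evaluation meaningful.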

7. Build the SVM classification model.

print("Fitting the classifier to the training set")
t0 = time()
param_grid = {'C': [1e3, 5e3, 1e4, 5e4, 1e5],
              'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1], }
clf = GridSearchCV(SVC(kernel='rbf', class_weight='balanced'), param_grid)
clf = clf.fit(X_train_pca, y_train)
print("done in %0.3fs" % (time() - t0))
print("Best estimator found by grid search:")
print(clf.best_estimator_)

Output: [figure not preserved]
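What the grid search does can be seen on a dataset that ships with scikit-learn. The sketch below uses `load_iris` as a stand-in for the PCA-projected faces, with a smaller, made-up parameter grid and an explicit `cv=5`. None of these choices come from the original lesson.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small bundled dataset used as a stand-in for the PCA-projected faces.
X, y = load_iris(return_X_y=True)

param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}
search = GridSearchCV(SVC(kernel="rbf", class_weight="balanced"), param_grid, cv=5)
search.fit(X, y)

# 3 values of C x 3 values of gamma = 9 candidates, each scored by 5-fold CV.
n_candidates = len(search.cv_results_["params"])

print("candidates tried:", n_candidates)
print("best parameters:", search.best_params_)
print("best mean CV accuracy: %.3f" % search.best_score_)
```

After fitting, the search object refits the best candidate on all the data it was given, so calling `predict` on it uses the best estimator directly, as the lesson's code does with `clf.predict`.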

8. Evaluate the model.

print("Predicting people's names on the test set")
t0 = time()
y_pred = clf.predict(X_test_pca)
print("done in %0.3fs" % (time() - t0))
print(classification_report(y_test, y_pred, target_names=target_names))
print(confusion_matrix(y_test, y_pred, labels=range(n_classes)))

Output: [figure not preserved]
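To read the evaluation output, it helps to work through a tiny case by hand. The labels below are invented for illustration. The sketch builds the confusion matrix and derives per-class precision and recall, which are the main columns that `classification_report` prints.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hand-made labels for three classes (illustrative only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

# Per-class precision = diagonal / column sum, recall = diagonal / row sum.
precision = precision_score(y_true, y_pred, average=None)
recall = recall_score(y_true, y_pred, average=None)

print(cm)
print("precision:", precision)
print("recall:   ", recall)
```

Here class 1 has perfect recall (both true 1s were found) but precision 2/3, because one sample of class 0 was also predicted as 1.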

9. Visualize the predictions.

def plot_gallery(images, titles, h, w, n_row=3, n_col=4):
    """Helper function to plot a gallery of portraits"""
    plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))
    plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
    for i in range(n_row * n_col):
        plt.subplot(n_row, n_col, i + 1)
        plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
        plt.title(titles[i], size=12)
        plt.xticks(())
        plt.yticks(())
# plot the result of the prediction on a portion of the test set
def title(y_pred, y_test, target_names, i):
    pred_name = target_names[y_pred[i]].rsplit(' ', 1)[-1]
    true_name = target_names[y_test[i]].rsplit(' ', 1)[-1]
    return 'predicted: %s\ntrue:      %s' % (pred_name, true_name)
prediction_titles = [title(y_pred, y_test, target_names, i)
                     for i in range(y_pred.shape[0])]
plot_gallery(X_test, prediction_titles, h, w)
# plot the gallery of the most significative eigenfaces
eigenface_titles = ["eigenface %d" % i for i in range(eigenfaces.shape[0])]
plot_gallery(eigenfaces, eigenface_titles, h, w)
plt.show()

Output: [figure not preserved]

Eigenfaces: [figure not preserved]
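The gallery logic can be tried without the face data. The sketch below is a variant, not the lesson's code: it uses `plt.subplots` instead of repeated `plt.subplot` calls, renders off-screen with the Agg backend (an assumption for machines without a display), and fills the grid with random pixels and made-up names. The helper `short_title` is a hypothetical name for the `rsplit(' ', 1)[-1]` trick that keeps only the last word of a name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, render off-screen
import matplotlib.pyplot as plt
import numpy as np

def short_title(full_name):
    """Keep only the last word of a name, e.g. the surname."""
    return full_name.rsplit(" ", 1)[-1]

# Made-up 50 x 37 "images" and names standing in for the test faces.
rng = np.random.RandomState(0)
h, w = 50, 37
images = rng.rand(12, h * w)
names = ["Person Number%d" % i for i in range(12)]

n_row, n_col = 3, 4
fig, axes = plt.subplots(n_row, n_col, figsize=(1.8 * n_col, 2.4 * n_row))
for ax, image, name in zip(axes.ravel(), images, names):
    ax.imshow(image.reshape(h, w), cmap="gray")  # un-flatten before drawing
    ax.set_title(short_title(name), size=12)
    ax.set_xticks([])
    ax.set_yticks([])

fig.canvas.draw()  # render in memory; fig.savefig(...) would write a file
print(short_title("George W Bush"))
```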
