
Recommendation System Algorithms, Part 3: FM, DNN, and DeepFM


Source: https://blog.csdn.net/qq_23269761/article/details/81366939

0. A blog I highly recommend

The past and present of FM:
https://tracholar.github.io/machine-learning/2017/03/10/factorization-machine.html#%E7%BB%BC%E8%BF%B0

1. The relationship between FM, DNN, and embeddings

先來復(fù)習(xí)一下FM?
?
?
對(duì)FM模型進(jìn)行求解后,對(duì)于每一個(gè)特征xi都能夠得到對(duì)應(yīng)的隱向量vi,那么這個(gè)vi到底是什么呢?

Think of Google's word2vec. word2vec is one kind of word embedding method. Word embedding means: given a document, which is a sequence of words such as "A B A C B F G", we want each distinct word in the document to get a corresponding (usually low-dimensional) vector representation. For the sequence "A B A C B F G", for example, we might end up with: A mapped to [0.1 0.6 -0.5] and B mapped to [-0.2 0.9 0.7].

So the conclusion is:
FM is a tool for feature combination and dimensionality reduction. It takes the sparse features originally produced by one-hot encoding, combines them pairwise, and reduces the dimensionality at the same time. Down to how many dimensions? To k, the number of latent factors in FM. (A concrete sketch follows.)
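To make this concrete, here is a minimal NumPy sketch of scoring one sample with FM (this snippet is mine, not the original post's; w0, w, V and x are made-up toy values):

```python
import numpy as np

# toy FM parameters: n = 4 features, k = 3 latent factors (made-up values)
n, k = 4, 3
rng = np.random.default_rng(0)
w0 = 0.1                     # global bias
w = rng.normal(size=n)       # first-order weights, one per feature
V = rng.normal(size=(n, k))  # latent vectors; row i is v_i

x = np.array([1.0, 0.0, 1.0, 0.5])  # one (mostly sparse) input sample

# second order via the sum-square trick:
# 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
vx = V * x[:, None]  # v_i * x_i, shape (n, k)
second_order = 0.5 * np.sum(vx.sum(axis=0) ** 2 - (vx ** 2).sum(axis=0))

y_hat = w0 + w @ x + second_order
print(y_hat)
```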

2.FNN

FNN uses FM as pre-training to produce the embeddings, and then trains a DNN on top.
A model like this captures high-order features, but at the final sigmoid output it ignores the low-order features themselves. (A rough forward-pass sketch follows.)
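As a rough illustration (my own sketch under toy assumptions, not the FNN paper's code): look up the FM-pretrained latent vector of each field's active feature, concatenate, and push through an MLP with a sigmoid output:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# assume V (feature_size x k) was pre-trained by FM; toy shapes here
feature_size, k, field_size = 100, 8, 5
rng = np.random.default_rng(1)
V = rng.normal(size=(feature_size, k))      # FM-pretrained embeddings
W1 = rng.normal(size=(field_size * k, 32))  # hidden layer weights
b1 = np.zeros(32)
w_out = rng.normal(size=32)
b_out = 0.0

feat_index = np.array([3, 17, 42, 58, 99])  # one active feature per field

h = V[feat_index].reshape(-1)      # embedding lookup + concat, shape (field_size*k,)
h = np.maximum(0.0, h @ W1 + b1)   # ReLU hidden layer
p = sigmoid(h @ w_out + b_out)     # CTR prediction
print(p)
```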

3.DeepFM

Given the above, many of the newer deep-learning CTR models consider both the wide and the deep side (i.e., low-order and high-order features) at the same time to further improve generalization; DeepFM is one of them.
Reference blog: https://blog.csdn.net/zynash2/article/details/79348540

The model splits into two main parts: FM and DNN. Briefly, the flow is: borrowing FNN's idea, FM is used for the embedding, and the wide and deep parts then share the embedded result. The DNN input is exactly the same as in FNN (except there is no pre-training here; the embedding layer is simply treated as one more NN layer), while after a certain combination the wide part exactly reproduces FM (the paper does not derive this in detail; the derivation is given later in this post). Finally the DNN and FM outputs are combined and passed through the output activation.

The part that deserves the most attention is the FM component: how exactly does the network compute the 2nd-order features?
**Key point:** to the DNN, the embedding layer is extracting features; to FM, those same embeddings are its 2nd-order latent vectors! FM and DNN simply share the embedding layer.

4. DeepFM code walkthrough

Code:
https://github.com/ChenglongChen/tensorflow-DeepFM
Data download:
https://www.kaggle.com/c/porto-seguro-safe-driver-prediction

4.0 Project layout

data: training and test data
output/fig: output results and training curves
config: parameters for data loading and feature engineering
DataReader: feature engineering; builds the feature set actually used for training
main: program entry point
metrics: defines the normalized gini metric used for evaluation (a typical implementation is sketched below)
DeepFM: model definition
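For reference, a normalized gini for a binary ranking task typically looks like the sketch below (mine; the repo's metrics.py may differ in detail):

```python
import numpy as np

def gini(actual, pred):
    # rank samples by predicted score, descending (stable sort for deterministic ties)
    order = np.argsort(-np.asarray(pred, dtype=float), kind="stable")
    a = np.asarray(actual, dtype=float)[order]
    # cumulative share of positives captured as we walk down the ranking
    cum_share = np.cumsum(a) / a.sum()
    return cum_share.sum() / len(a) - (len(a) + 1) / (2.0 * len(a))

def gini_norm(actual, pred):
    # normalize by the gini of a perfect ranking, so the best possible score is 1.0
    return gini(actual, pred) / gini(actual, actual)
```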

4.1 Overall flow

First, a recommended EDA write-up for this dataset, which gives a good overview of the data:
https://blog.csdn.net/qq_37195507/article/details/78553581

1. `_load_data()`

```python
def _load_data():

    dfTrain = pd.read_csv(config.TRAIN_FILE)
    dfTest = pd.read_csv(config.TEST_FILE)

    def preprocess(df):
        cols = [c for c in df.columns if c not in ["id", "target"]]
        df["missing_feat"] = np.sum((df[cols] == -1).values, axis=1)
        df["ps_car_13_x_ps_reg_03"] = df["ps_car_13"] * df["ps_reg_03"]
        return df

    dfTrain = preprocess(dfTrain)
    dfTest = preprocess(dfTest)

    cols = [c for c in dfTrain.columns if c not in ["id", "target"]]
    cols = [c for c in cols if (not c in config.IGNORE_COLS)]

    X_train = dfTrain[cols].values
    y_train = dfTrain["target"].values
    X_test = dfTest[cols].values
    ids_test = dfTest["id"].values
    cat_features_indices = [i for i, c in enumerate(cols) if c in config.CATEGORICAL_COLS]

    return dfTrain, dfTest, X_train, y_train, X_test, ids_test, cat_features_indices
```

It first reads the raw files TRAIN_FILE and TEST_FILE.
preprocess(df) adds two features: missing_feat (the number of missing values in the row) and ps_car_13_x_ps_reg_03 (the product of two existing features).
Returns:
- dfTrain, dfTest: DataFrames with all features present
- X_train, X_test: ndarrays with the IGNORE_COLS dropped (X_test is actually never used later)
- y_train: the labels
- ids_test: the test-set ids, as an ndarray
- cat_features_indices: the column indices of the categorical features

- X_train and y_train are then split into K stratified cross-validation folds (see the sketch below)
- the DeepFM parameters are set
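Roughly, main.py does something like the following (abridged; the parameter values are the repo's defaults as far as I recall, so double-check against main.py):

```python
from sklearn.model_selection import StratifiedKFold

# stratified K-fold split over the training set
folds = list(StratifiedKFold(n_splits=config.NUM_SPLITS, shuffle=True,
                             random_state=config.RANDOM_SEED).split(X_train, y_train))

# a subset of the hyper-parameters passed in dfm_params
dfm_params = {
    "use_fm": True, "use_deep": True,  # both branches on -> DeepFM
    "embedding_size": 8,               # k, the latent-factor dimension
    "deep_layers": [32, 32],
    "epoch": 30, "batch_size": 1024,
    "learning_rate": 0.001, "optimizer_type": "adam",
}
```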
2. `_run_base_model_dfm`
```python
def _run_base_model_dfm(dfTrain, dfTest, folds, dfm_params):
    fd = FeatureDictionary(dfTrain=dfTrain, dfTest=dfTest,
                           numeric_cols=config.NUMERIC_COLS,
                           ignore_cols=config.IGNORE_COLS)
    data_parser = DataParser(feat_dict=fd)
    Xi_train, Xv_train, y_train = data_parser.parse(df=dfTrain, has_label=True)
    Xi_test, Xv_test, ids_test = data_parser.parse(df=dfTest)

    dfm_params["feature_size"] = fd.feat_dim
    dfm_params["field_size"] = len(Xi_train[0])

    y_train_meta = np.zeros((dfTrain.shape[0], 1), dtype=float)
    y_test_meta = np.zeros((dfTest.shape[0], 1), dtype=float)
    _get = lambda x, l: [x[i] for i in l]
    gini_results_cv = np.zeros(len(folds), dtype=float)
    gini_results_epoch_train = np.zeros((len(folds), dfm_params["epoch"]), dtype=float)
    gini_results_epoch_valid = np.zeros((len(folds), dfm_params["epoch"]), dtype=float)
    for i, (train_idx, valid_idx) in enumerate(folds):
        Xi_train_, Xv_train_, y_train_ = _get(Xi_train, train_idx), _get(Xv_train, train_idx), _get(y_train, train_idx)
        Xi_valid_, Xv_valid_, y_valid_ = _get(Xi_train, valid_idx), _get(Xv_train, valid_idx), _get(y_train, valid_idx)

        dfm = DeepFM(**dfm_params)
        dfm.fit(Xi_train_, Xv_train_, y_train_, Xi_valid_, Xv_valid_, y_valid_)

        y_train_meta[valid_idx, 0] = dfm.predict(Xi_valid_, Xv_valid_)
        y_test_meta[:, 0] += dfm.predict(Xi_test, Xv_test)

        gini_results_cv[i] = gini_norm(y_valid_, y_train_meta[valid_idx])
        gini_results_epoch_train[i] = dfm.train_result
        gini_results_epoch_valid[i] = dfm.valid_result

    y_test_meta /= float(len(folds))

    # save result
    if dfm_params["use_fm"] and dfm_params["use_deep"]:
        clf_str = "DeepFM"
    elif dfm_params["use_fm"]:
        clf_str = "FM"
    elif dfm_params["use_deep"]:
        clf_str = "DNN"
    print("%s: %.5f (%.5f)" % (clf_str, gini_results_cv.mean(), gini_results_cv.std()))
    filename = "%s_Mean%.5f_Std%.5f.csv" % (clf_str, gini_results_cv.mean(), gini_results_cv.std())
    _make_submission(ids_test, y_test_meta, filename)

    _plot_fig(gini_results_epoch_train, gini_results_epoch_valid, clf_str)

    return y_train_meta, y_test_meta
```

The data passes through FeatureDictionary in DataReader.
This object has a self.feat_dict attribute that looks like this (numeric columns map to a single index; categorical columns map each value to its own index):

{'missing_feat': 0, 'ps_ind_18_bin': {0: 254, 1: 255}, 'ps_reg_01': 256, 'ps_reg_02': 257, 'ps_reg_03': 258}
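The construction logic is essentially the following (a simplified sketch of gen_feat_dict in DataReader.py, abridged; see the repo for the exact code):

```python
# simplified sketch of FeatureDictionary.gen_feat_dict
df = pd.concat([self.dfTrain, self.dfTest])
self.feat_dict = {}
tc = 0                            # running feature index
for col in df.columns:
    if col in self.ignore_cols:
        continue
    if col in self.numeric_cols:
        self.feat_dict[col] = tc  # one index for the whole numeric column
        tc += 1
    else:
        us = df[col].unique()     # one index per distinct categorical value
        self.feat_dict[col] = dict(zip(us, range(tc, len(us) + tc)))
        tc += len(us)
self.feat_dim = tc                # total number of one-hot features, e.g. 259
```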
Next, DataParser in DataReader:

```python
class DataParser(object):
    def __init__(self, feat_dict):
        self.feat_dict = feat_dict  # a FeatureDictionary instance

    def parse(self, infile=None, df=None, has_label=False):
        assert not ((infile is None) and (df is None)), "infile or df at least one is set"
        assert not ((infile is not None) and (df is not None)), "only one can be set"
        if infile is None:
            dfi = df.copy()
        else:
            dfi = pd.read_csv(infile)
        if has_label:
            y = dfi["target"].values.tolist()
            dfi.drop(["id", "target"], axis=1, inplace=True)
        else:
            ids = dfi["id"].values.tolist()
            dfi.drop(["id"], axis=1, inplace=True)
        # dfi for feature index
        # dfv for feature value which can be either binary (1/0) or float (e.g., 10.24)
        dfv = dfi.copy()
        for col in dfi.columns:
            if col in self.feat_dict.ignore_cols:
                dfi.drop(col, axis=1, inplace=True)
                dfv.drop(col, axis=1, inplace=True)
                continue
            if col in self.feat_dict.numeric_cols:
                dfi[col] = self.feat_dict.feat_dict[col]
            else:
                dfi[col] = dfi[col].map(self.feat_dict.feat_dict[col])
                dfv[col] = 1.
        # dfi.to_csv('dfi.csv')
        # dfv.to_csv('dfv.csv')

        # list of list of feature indices of each sample in the dataset
        Xi = dfi.values.tolist()
        # list of list of feature values of each sample in the dataset
        Xv = dfv.values.tolist()
        if has_label:
            return Xi, Xv, y
        else:
            return Xi, Xv, ids
```

Here Xi and Xv are both 2-D lists. You can dump dfi and dfv to CSV files to see what they look like; the format looks odd at first, but it is exactly what the model needs later.
dfi: the values are feature indices, i.e., the indices stored in the feat_dict attribute above.
dfv: numeric features keep their original value; categorical features get the value 1. (A tiny worked example follows.)
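For a concrete (hypothetical) row, using the feat_dict example above: suppose ps_reg_01 = 0.7 (numeric, fixed index 256) and ps_ind_18_bin = 1 (categorical; value 1 maps to index 255). Restricted to those two columns, the parsed output would be:

```python
Xi_row = [256, 255]  # feature indices: numeric column's fixed index, then the categorical value's index
Xv_row = [0.7, 1.0]  # feature values: the numeric value itself, 1.0 for the categorical
```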

4.2 Model architecture

```python
def _init_graph(self):
    self.graph = tf.Graph()
    with self.graph.as_default():

        tf.set_random_seed(self.random_seed)

        self.feat_index = tf.placeholder(tf.int32, shape=[None, None],
                                         name="feat_index")  # None * F
        self.feat_value = tf.placeholder(tf.float32, shape=[None, None],
                                         name="feat_value")  # None * F
        self.label = tf.placeholder(tf.float32, shape=[None, 1], name="label")  # None * 1
        self.dropout_keep_fm = tf.placeholder(tf.float32, shape=[None], name="dropout_keep_fm")
        self.dropout_keep_deep = tf.placeholder(tf.float32, shape=[None], name="dropout_keep_deep")
        self.train_phase = tf.placeholder(tf.bool, name="train_phase")

        self.weights = self._initialize_weights()

        # model
        self.embeddings = tf.nn.embedding_lookup(self.weights["feature_embeddings"],
                                                 self.feat_index)  # None * F * K

        # print(self.weights["feature_embeddings"])  shape=[259, 8]: the n*k latent-vector matrix
        # print(self.embeddings)  shape=[?, 39, 8]: f*k, one latent vector per field
        # (this is not FFM; picking one per field just selects the non-zero entries, cutting computation)
        feat_value = tf.reshape(self.feat_value, shape=[-1, self.field_size, 1])
        # print(feat_value)  shape=[?, 39, 1]: the 39 feature values of one sample
        self.embeddings = tf.multiply(self.embeddings, feat_value)
        # tf.multiply broadcasts when a dimension differs: the smaller tensor expands automatically
        # print(self.embeddings)  shape=[?, 39, 8]
        # so after this multiply the tensor holds v_i * x_i, which makes
        # <v_i, v_j> x_i x_j = <v_i x_i, v_j x_j> easy to compute later: FM reduces to a
        # "sum_square part minus square_sum part" form, for which this layout is ideal

        # ---------- first order term ----------
        self.y_first_order = tf.nn.embedding_lookup(self.weights["feature_bias"], self.feat_index)  # None * F * 1
        self.y_first_order = tf.reduce_sum(tf.multiply(self.y_first_order, feat_value), 2)  # None * F
        self.y_first_order = tf.nn.dropout(self.y_first_order, self.dropout_keep_fm[0])  # None * F

        # ---------- second order term ---------------
        # sum_square part
        self.summed_features_emb = tf.reduce_sum(self.embeddings, 1)  # None * K
        self.summed_features_emb_square = tf.square(self.summed_features_emb)  # None * K

        # square_sum part
        self.squared_features_emb = tf.square(self.embeddings)
        self.squared_sum_features_emb = tf.reduce_sum(self.squared_features_emb, 1)  # None * K

        # second order
        self.y_second_order = 0.5 * tf.subtract(self.summed_features_emb_square, self.squared_sum_features_emb)  # None * K
        self.y_second_order = tf.nn.dropout(self.y_second_order, self.dropout_keep_fm[1])  # None * K

        # ---------- Deep component ----------
        self.y_deep = tf.reshape(self.embeddings, shape=[-1, self.field_size * self.embedding_size])  # None * (F*K)
        self.y_deep = tf.nn.dropout(self.y_deep, self.dropout_keep_deep[0])
        for i in range(0, len(self.deep_layers)):
            self.y_deep = tf.add(tf.matmul(self.y_deep, self.weights["layer_%d" % i]), self.weights["bias_%d" % i])  # None * layer[i]
            if self.batch_norm:
                self.y_deep = self.batch_norm_layer(self.y_deep, train_phase=self.train_phase, scope_bn="bn_%d" % i)
            self.y_deep = self.deep_layers_activation(self.y_deep)
            self.y_deep = tf.nn.dropout(self.y_deep, self.dropout_keep_deep[1 + i])  # dropout at each Deep layer

        # ---------- DeepFM ----------
        if self.use_fm and self.use_deep:
            concat_input = tf.concat([self.y_first_order, self.y_second_order, self.y_deep], axis=1)
        elif self.use_fm:
            concat_input = tf.concat([self.y_first_order, self.y_second_order], axis=1)
        elif self.use_deep:
            concat_input = self.y_deep
        self.out = tf.add(tf.matmul(concat_input, self.weights["concat_projection"]), self.weights["concat_bias"])
```

You might wonder why this code makes FM look so complicated. The complexity has a reason: it avoids materializing the huge matrix you would get after one-hot encoding. In essence, the Deep part and FM share the latent-vector matrix [feature_size * k] at the embedding layer.

So the heart of this implementation is the embedding layer. It works off the two small matrices Xi and Xv [n * field]; note that field here is not the F of FFM, but the number of features before one-hot encoding.

From the inner-product formula we get

$$\sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f}\, x_i \right)^2 - \sum_{i=1}^{n} v_{i,f}^2\, x_i^2 \right]$$

which is exactly the "sum_square part minus square_sum part" computed in the graph above.
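A quick NumPy sanity check of this identity (my own snippet, with made-up values):

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 6, 4
V = rng.normal(size=(n, k))  # latent vectors
x = rng.normal(size=n)       # feature values

# left-hand side: explicit pairwise sum of <v_i, v_j> x_i x_j
lhs = sum(V[i] @ V[j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))

# right-hand side: 0.5 * (sum_square - square_sum), as in the graph code
vx = V * x[:, None]
rhs = 0.5 * np.sum(vx.sum(axis=0) ** 2 - (vx ** 2).sum(axis=0))

print(np.isclose(lhs, rhs))  # True
```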
