05. Sequence Models W2. Natural Language Processing & Word Embeddings (Assignments: Word Vectors + Emojify)
Table of Contents
- Assignment 1:
- 1. Cosine similarity
- 2. Word analogies
- 3. Debiasing word vectors
- 3.1 Neutralizing bias for non-gender-specific words
- 3.2 Equalization algorithm for gender-specific words
- Assignment 2: Emojify
- 1. Baseline model: Emojifier-V1
- 1.1 Dataset
- 1.2 Model overview
- 1.3 Implementing Emojifier-V1
- 1.4 Testing on the training set
- 2. Emojifier-V2: Using LSTMs in Keras
- 2.1 Model overview
- 2.2 Keras and mini-batching
- 2.3 The Embedding layer
- 2.4 Building Emojifier-V2

Quiz: see the reference blog post
Notes: W2. Natural Language Processing & Word Embeddings
Assignment 1:
- Load pre-trained word vectors and measure similarity with the cosine of the angle, $\cos(\theta)$
- Use word embeddings to solve word analogy problems
- Modify word embeddings to reduce gender bias

This assignment uses 50-dimensional GloVe vectors to represent words.
```python
import numpy as np
from w2v_utils import *   # assumed helper module shipped with the assignment (provides read_glove_vecs)

words, word_to_vec_map = read_glove_vecs('data/glove.6B.50d.txt')
```

1. Cosine similarity
$$\text{CosineSimilarity}(u, v) = \frac{u \cdot v}{||u||_2 \, ||v||_2} = \cos(\theta)$$

where $||u||_2 = \sqrt{\sum_{i=1}^{n} u_i^2}$
```python
# GRADED FUNCTION: cosine_similarity
def cosine_similarity(u, v):
    """
    Cosine similarity reflects the degree of similarity between u and v

    Arguments:
        u -- a word vector of shape (n,)
        v -- a word vector of shape (n,)

    Returns:
        cosine_similarity -- the cosine similarity between u and v defined by the formula above.
    """
    distance = 0.0

    ### START CODE HERE ###
    # Compute the dot product between u and v (≈1 line)
    dot = np.dot(u, v)
    # Compute the L2 norm of u (≈1 line)
    norm_u = np.linalg.norm(u)
    # Compute the L2 norm of v (≈1 line)
    norm_v = np.linalg.norm(v)
    # Compute the cosine similarity defined by formula (1) (≈1 line)
    cosine_similarity = dot / (norm_u * norm_v)
    ### END CODE HERE ###

    return cosine_similarity
```

2. Word analogies
For example: man : woman --> king : queen
```python
# GRADED FUNCTION: complete_analogy
def complete_analogy(word_a, word_b, word_c, word_to_vec_map):
    """
    Performs the word analogy task as explained above: a is to b as c is to ____.

    Arguments:
        word_a -- a word, string
        word_b -- a word, string
        word_c -- a word, string
        word_to_vec_map -- dictionary that maps words to their corresponding vectors.

    Returns:
        best_word -- the word such that v_b - v_a is close to v_best_word - v_c, as measured by cosine similarity
    """
    # convert words to lower case
    word_a, word_b, word_c = word_a.lower(), word_b.lower(), word_c.lower()

    ### START CODE HERE ###
    # Get the word embeddings e_a, e_b and e_c (≈1-3 lines)
    e_a, e_b, e_c = word_to_vec_map[word_a], word_to_vec_map[word_b], word_to_vec_map[word_c]
    ### END CODE HERE ###

    words = word_to_vec_map.keys()
    max_cosine_sim = -100   # Initialize max_cosine_sim to a large negative number
    best_word = None        # Initialize best_word with None, it will help keep track of the word to output

    # loop over the whole word vector set
    for w in words:
        # to avoid best_word being one of the input words, skip them
        if w in [word_a, word_b, word_c]:
            continue

        ### START CODE HERE ###
        # Compute cosine similarity between the vector (e_b - e_a) and the vector ((w's vector representation) - e_c) (≈1 line)
        cosine_sim = cosine_similarity(e_b - e_a, word_to_vec_map[w] - e_c)

        # If the cosine_sim is greater than the max_cosine_sim seen so far,
        # then: set the new max_cosine_sim to the current cosine_sim and the best_word to the current word (≈3 lines)
        if cosine_sim > max_cosine_sim:
            max_cosine_sim = cosine_sim
            best_word = w
        ### END CODE HERE ###

    return best_word
```

Test:
```python
triads_to_try = [('italy', 'italian', 'spain'), ('india', 'delhi', 'japan'), ('man', 'woman', 'boy'), ('small', 'smaller', 'large')]
for triad in triads_to_try:
    print('{} -> {} :: {} -> {}'.format(*triad, complete_analogy(*triad, word_to_vec_map)))
```

Output:
```
italy -> italian :: spain -> spanish
india -> delhi :: japan -> tokyo
man -> woman :: boy -> girl
small -> smaller :: large -> larger
```

Additional tests:
```
good -> ok :: bad -> oops
father -> dad :: mother -> mom
```

3. Debiasing word vectors
Let's examine the gender bias reflected in word embeddings and explore algorithms for reducing it.
```python
g = word_to_vec_map['woman'] - word_to_vec_map['man']
print(g)
```

Output: a 50-dimensional vector
```
[-0.087144    0.2182     -0.40986    -0.03922    -0.1032      0.94165
 -0.06042     0.32988     0.46144    -0.35962     0.31102    -0.86824
  0.96006     0.01073     0.24337     0.08193    -1.02722    -0.21122
  0.695044   -0.00222     0.29106     0.5053     -0.099454    0.40445
  0.30181     0.1355     -0.0606     -0.07131    -0.19245    -0.06115
 -0.3204      0.07165    -0.13337    -0.25068714 -0.14293    -0.224957
 -0.149       0.048882    0.12191    -0.27362    -0.165476   -0.20426
  0.54376    -0.271425   -0.10245    -0.32108     0.2516     -0.33455
 -0.04371     0.01258   ]
```

```python
print('List of names and their similarities with constructed vector:')

# girls and boys names
name_list = ['john', 'marie', 'sophie', 'ronaldo', 'priya', 'rahul', 'danielle', 'reza', 'katy', 'yasmin']

for w in name_list:
    print(w, cosine_similarity(word_to_vec_map[w], g))
```

Output:
```
List of names and their similarities with constructed vector:
john -0.23163356145973724
marie 0.315597935396073
sophie 0.31868789859418784
ronaldo -0.31244796850329437
priya 0.17632041839009402
rahul -0.16915471039231716
danielle 0.24393299216283895
reza -0.07930429672199553
katy 0.2831068659572615
yasmin 0.2331385776792876
```

We can see that:
- female first names tend to have a positive cosine similarity with the vector g,
- while male first names tend to have a negative cosine similarity. These results look reasonable.
Let's try some other words.
```python
print('Other words and their similarities:')
word_list = ['lipstick', 'guns', 'science', 'arts', 'literature', 'warrior', 'doctor', 'tree', 'receptionist',
             'technology', 'fashion', 'teacher', 'engineer', 'pilot', 'computer', 'singer']
for w in word_list:
    print(w, cosine_similarity(word_to_vec_map[w], g))
```

Output:
```
Other words and their similarities:
lipstick 0.2769191625638267
guns -0.1888485567898898
science -0.06082906540929701
arts 0.008189312385880337
literature 0.06472504433459932
warrior -0.20920164641125288
doctor 0.11895289410935041
tree -0.07089399175478091
receptionist 0.3307794175059374
technology -0.13193732447554302
fashion 0.03563894625772699
teacher 0.17920923431825664
engineer -0.0803928049452407
pilot 0.0010764498991916937
computer -0.10330358873850498
singer 0.1850051813649629
```

These results reflect certain gender stereotypes. For example, "computer" is closer to "man", while "literature" is closer to "woman".
Below we see how to reduce the bias of these vectors, using the algorithm proposed by Bolukbasi et al. (2016).
Note that some word pairs, such as "actor"/"actress" or "grandmother"/"grandfather", should remain gender-specific, while other words such as "receptionist" or "technology" should be neutral, i.e. not associated with gender. When debiasing, you must treat these two types of words differently.
3.1 Neutralizing bias for non-gender-specific words
$$e^{bias\_component} = \frac{e \cdot g}{||g||_2^2} \, g$$

$$e^{debiased} = e - e^{bias\_component}$$
```python
def neutralize(word, g, word_to_vec_map):
    """
    Removes the bias of "word" by projecting it on the space orthogonal to the bias axis.
    This function ensures that gender neutral words are zero in the gender subspace.

    Arguments:
        word -- string indicating the word to debias
        g -- numpy-array of shape (50,), corresponding to the bias axis (such as gender)
        word_to_vec_map -- dictionary mapping words to their corresponding vectors.

    Returns:
        e_debiased -- neutralized word vector representation of the input "word"
    """
    ### START CODE HERE ###
    # Select word vector representation of "word". Use word_to_vec_map. (≈ 1 line)
    e = word_to_vec_map[word]

    # Compute e_biascomponent using the formula given above. (≈ 1 line)
    e_biascomponent = np.dot(e, g) / np.linalg.norm(g)**2 * g

    # Neutralize e by subtracting e_biascomponent from it
    # e_debiased should be equal to its orthogonal projection. (≈ 1 line)
    e_debiased = e - e_biascomponent
    ### END CODE HERE ###

    return e_debiased
```

Test:
```python
e = "receptionist"
print("cosine similarity between " + e + " and g, before neutralizing: ", cosine_similarity(word_to_vec_map["receptionist"], g))

e_debiased = neutralize("receptionist", g, word_to_vec_map)
print("cosine similarity between " + e + " and g, after neutralizing: ", cosine_similarity(e_debiased, g))
```

Output:
```
cosine similarity between receptionist and g, before neutralizing:  0.3307794175059374
cosine similarity between receptionist and g, after neutralizing:  -2.099120994400013e-17
```

After neutralizing, the similarity between "receptionist" and the gender axis is essentially 0: it leans toward neither man nor woman.
3.2 Equalization algorithm for gender-specific words
How can debiasing be applied to word pairs such as "actress" and "actor"?
Equalization is applied to pairs of words that should differ only through the gender property.
As a concrete example, suppose "actress" is closer to "babysit" than "actor" is. By neutralizing "babysit" we can reduce the gender stereotype associated with babysitting, but this still does not guarantee that "actor" and "actress" are equidistant from "babysit"; the equalization algorithm handles that.
$$\mu = \frac{e_{w1} + e_{w2}}{2}$$

$$\mu_{B} = \frac{\mu \cdot \text{bias\_axis}}{||\text{bias\_axis}||_2^2} \, \text{bias\_axis}$$

$$\mu_{\perp} = \mu - \mu_{B}$$

$$e_{w1B} = \frac{e_{w1} \cdot \text{bias\_axis}}{||\text{bias\_axis}||_2^2} \, \text{bias\_axis}$$

$$e_{w2B} = \frac{e_{w2} \cdot \text{bias\_axis}}{||\text{bias\_axis}||_2^2} \, \text{bias\_axis}$$

$$e_{w1B}^{corrected} = \sqrt{\left|1 - ||\mu_{\perp}||_2^2\right|} \cdot \frac{e_{w1B} - \mu_B}{|(e_{w1} - \mu_{\perp}) - \mu_B|}$$

$$e_{w2B}^{corrected} = \sqrt{\left|1 - ||\mu_{\perp}||_2^2\right|} \cdot \frac{e_{w2B} - \mu_B}{|(e_{w2} - \mu_{\perp}) - \mu_B|}$$

$$e_1 = e_{w1B}^{corrected} + \mu_{\perp}$$

$$e_2 = e_{w2B}^{corrected} + \mu_{\perp}$$
```python
def equalize(pair, bias_axis, word_to_vec_map):
    """
    Debias gender specific words by following the equalize method described in the figure above.

    Arguments:
    pair -- pair of strings of gender specific words to debias, e.g. ("actress", "actor")
    bias_axis -- numpy-array of shape (50,), vector corresponding to the bias axis, e.g. gender
    word_to_vec_map -- dictionary mapping words to their corresponding vectors

    Returns
    e_1 -- word vector corresponding to the first word
    e_2 -- word vector corresponding to the second word
    """
    ### START CODE HERE ###
    # Step 1: Select word vector representation of "word". Use word_to_vec_map. (≈ 2 lines)
    w1, w2 = pair[0], pair[1]
    e_w1, e_w2 = word_to_vec_map[w1], word_to_vec_map[w2]

    # Step 2: Compute the mean of e_w1 and e_w2 (≈ 1 line)
    mu = (e_w1 + e_w2) / 2

    # Step 3: Compute the projections of mu over the bias axis and the orthogonal axis (≈ 2 lines)
    mu_B = np.dot(mu, bias_axis) / np.linalg.norm(bias_axis)**2 * bias_axis
    mu_orth = mu - mu_B

    # Step 4: Use equations (7) and (8) to compute e_w1B and e_w2B (≈2 lines)
    e_w1B = np.dot(e_w1, bias_axis) / np.linalg.norm(bias_axis)**2 * bias_axis
    e_w2B = np.dot(e_w2, bias_axis) / np.linalg.norm(bias_axis)**2 * bias_axis

    # Step 5: Adjust the Bias part of e_w1B and e_w2B using the formulas (9) and (10) given above (≈2 lines)
    corrected_e_w1B = np.sqrt(np.abs(1 - np.linalg.norm(mu_orth)**2)) * np.divide((e_w1B - mu_B), np.abs(e_w1 - mu_orth - mu_B))
    corrected_e_w2B = np.sqrt(np.abs(1 - np.linalg.norm(mu_orth)**2)) * np.divide((e_w2B - mu_B), np.abs(e_w2 - mu_orth - mu_B))

    # Step 6: Debias by equalizing e1 and e2 to the sum of their corrected projections (≈2 lines)
    e1 = corrected_e_w1B + mu_orth
    e2 = corrected_e_w2B + mu_orth
    ### END CODE HERE ###

    return e1, e2
```

Test:
```python
print("cosine similarities before equalizing:")
print("cosine_similarity(word_to_vec_map[\"man\"], gender) = ", cosine_similarity(word_to_vec_map["man"], g))
print("cosine_similarity(word_to_vec_map[\"woman\"], gender) = ", cosine_similarity(word_to_vec_map["woman"], g))
print()
e1, e2 = equalize(("man", "woman"), g, word_to_vec_map)
print("cosine similarities after equalizing:")
print("cosine_similarity(e1, gender) = ", cosine_similarity(e1, g))
print("cosine_similarity(e2, gender) = ", cosine_similarity(e2, g))
```

Output:
```
cosine similarities before equalizing:
cosine_similarity(word_to_vec_map["man"], gender) =  -0.11711095765336832
cosine_similarity(word_to_vec_map["woman"], gender) =  0.35666618846270376

cosine similarities after equalizing:
cosine_similarity(e1, gender) =  -0.7165727525843935
cosine_similarity(e2, gender) =  0.7396596474928909
```

After equalizing, the similarities have opposite signs and nearly equal magnitudes.
Assignment 2: Emojify
Build an Emojifier using word vector representations.
Make your messages more expressive 😁. With word vectors, even if a word never appeared alongside a given emoji in the training data, the model can still learn to use that emoji for it.
- Import some packages (a minimal sketch follows)
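A minimal sketch of the imports for this part, assuming the `emo_utils` helper module and the `emoji` package that ship with the assignment:

```python
import numpy as np
import emoji
import matplotlib.pyplot as plt
from emo_utils import *   # assumed helpers: read_csv, convert_to_one_hot, label_to_emoji, predict, ...
```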
1. Baseline model: Emojifier-V1
1.1 Dataset
X: 127 sentences (strings)
Y: integer labels from 0 to 4, the emoji associated with each sentence
- Load the dataset: training set (127 examples) and test set (56 examples), as sketched below
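A minimal sketch of loading the data. The `read_csv` helper and the CSV file names are assumptions based on the course materials:

```python
# read_csv is assumed to come from emo_utils; file names follow the course data folder
X_train, Y_train = read_csv('data/train_emoji.csv')
X_test, Y_test = read_csv('data/tesss.csv')

maxLen = len(max(X_train, key=len).split())   # number of words in the longest training sentence
print(max(X_train, key=len).split())
```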
Output:
```
['I', 'am', 'so', 'impressed', 'by', 'your', 'dedication', 'to', 'this', 'project']
```
The longest sentence has 10 words.
- Inspect the dataset
Output:
Miss you so much ??
1.2 Model overview
For convenience, convert Y from shape $(m, 1)$ to a one-hot representation of shape $(m, 5)$; a sketch of the conversion follows.
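A minimal sketch of such a conversion. The assignment ships a `convert_to_one_hot` helper; this `np.eye`-based version is my assumption, not necessarily the official implementation:

```python
import numpy as np

def convert_to_one_hot(Y, C):
    # Each integer label becomes the corresponding row of the C x C identity matrix
    return np.eye(C)[Y.reshape(-1)]

Y_oh_train = convert_to_one_hot(Y_train, C=5)
Y_oh_test = convert_to_one_hot(Y_test, C=5)

index = 50   # an arbitrary training example
print(Y_train[index], "is converted into one hot", Y_oh_train[index])
```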
Output:
3 is converted into one hot [0. 0. 0. 1. 0.]

1.3 Implementing Emojifier-V1
Use the pre-trained 50-dimensional GloVe embeddings.
```python
word_to_index, index_to_word, word_to_vec_map = read_glove_vecs('data/glove.6B.50d.txt')
```
- Check that it loaded correctly
Output:
```
the index of cucumber in the vocabulary is 113317
the 289846th word in the vocabulary is potatos
```
Implement `sentence_to_avg()`:
- Convert each sentence to lowercase and split it into words
- Represent each word with its GloVe vector, then average the vectors over the sentence (see the sketch after this list)
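A minimal sketch of `sentence_to_avg` following the two steps above (my reconstruction of the graded function, not necessarily the official solution):

```python
def sentence_to_avg(sentence, word_to_vec_map):
    # Split the sentence into a list of lowercase words
    words = sentence.lower().split()

    # Initialize the average with the same shape as a GloVe vector (50,)
    avg = np.zeros((50,))

    # Sum the word vectors, then divide by the number of words
    for w in words:
        avg += word_to_vec_map[w]
    avg = avg / len(words)

    return avg
```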
Test:
```python
avg = sentence_to_avg("Morrocan couscous is my favorite dish", word_to_vec_map)
print("avg = ", avg)
```

Output:
```
avg = [-0.008005    0.56370833 -0.50427333  0.258865    0.55131103  0.03104983
 -0.21013718  0.16893933 -0.09590267  0.141784   -0.15708967  0.18525867
  0.6495785   0.38371117  0.21102167  0.11301667  0.02613967  0.26037767
  0.05820667 -0.01578167 -0.12078833 -0.02471267  0.4128455   0.5152061
  0.38756167 -0.898661   -0.535145    0.33501167  0.68806933 -0.2156265
  1.797155    0.10476933 -0.36775333  0.750785    0.10282583  0.348925
 -0.27262833  0.66768    -0.10706167 -0.283635    0.59580117  0.28747333
 -0.3366635   0.23393817  0.34349183  0.178405    0.1166155  -0.076433
  0.1445417   0.09808667]
```

The model
After processing each sentence with `sentence_to_avg()`, run forward propagation, compute the loss, and back-propagate to update the parameters.
$$z^{(i)} = W \cdot avg^{(i)} + b$$

$$a^{(i)} = softmax(z^{(i)})$$

$$\mathcal{L}^{(i)} = - \sum_{k=0}^{n_y - 1} Yoh^{(i)}_k \, \log(a^{(i)}_k)$$
```python
# GRADED FUNCTION: model
def model(X, Y, word_to_vec_map, learning_rate=0.01, num_iterations=400):
    """
    Model to train word vector representations in numpy.

    Arguments:
    X -- input data, numpy array of sentences as strings, of shape (m, 1)
    Y -- labels, numpy array of integers between 0 and 7, numpy-array of shape (m, 1)
    word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation
    learning_rate -- learning_rate for the stochastic gradient descent algorithm
    num_iterations -- number of iterations

    Returns:
    pred -- vector of predictions, numpy-array of shape (m, 1)
    W -- weight matrix of the softmax layer, of shape (n_y, n_h)
    b -- bias of the softmax layer, of shape (n_y,)
    """
    np.random.seed(1)

    # Define number of training examples
    m = Y.shape[0]    # number of training examples
    n_y = 5           # number of classes
    n_h = 50          # dimensions of the GloVe vectors

    # Initialize parameters using Xavier initialization
    W = np.random.randn(n_y, n_h) / np.sqrt(n_h)
    b = np.zeros((n_y,))

    # Convert Y to Y_onehot with n_y classes
    Y_oh = convert_to_one_hot(Y, C=n_y)

    # Optimization loop
    for t in range(num_iterations):       # Loop over the number of iterations
        for i in range(m):                # Loop over the training examples

            ### START CODE HERE ### (≈ 4 lines of code)
            # Average the word vectors of the words from the i'th training example
            avg = sentence_to_avg(X[i], word_to_vec_map)

            # Forward propagate the avg through the softmax layer
            z = np.dot(W, avg) + b
            a = softmax(z)

            # Compute cost using the i'th training label's one hot representation and "A" (the output of the softmax)
            cost = -np.sum(Y_oh[i] * np.log(a))
            ### END CODE HERE ###

            # Compute gradients
            dz = a - Y_oh[i]
            dW = np.dot(dz.reshape(n_y, 1), avg.reshape(1, n_h))
            db = dz

            # Update parameters with Stochastic Gradient Descent
            W = W - learning_rate * dW
            b = b - learning_rate * db

        if t % 100 == 0:
            print("Epoch: " + str(t) + " --- cost = " + str(cost))
            pred = predict(X, Y, W, b, word_to_vec_map)

    return pred, W, b
```

1.4 Testing on the training set
```python
print("Training set:")
pred_train = predict(X_train, Y_train, W, b, word_to_vec_map)
print('Test set:')
pred_test = predict(X_test, Y_test, W, b, word_to_vec_map)
```

Output:
```
Training set:
Accuracy: 0.9772727272727273
Test set:
Accuracy: 0.8571428571428571
```

Random guessing would average 20% accuracy (1 in 5 classes), so the model performs quite well given only 127 training examples.
Let's test it:
- The training set contains "I love you" with the label ??
- Let's check what happens with "adore", a word that never appears in the training set (a sketch of the check follows)
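A minimal sketch of this check, assuming the `predict` and `print_predictions` helpers from `emo_utils`; the label values follow the emoji indexing used earlier:

```python
X_my_sentences = np.array(["i adore you", "i love you", "funny lol", "lets play with a ball", "food is ready", "not feeling happy"])
Y_my_labels = np.array([[0], [0], [2], [1], [4], [3]])   # assumed labels for these example sentences

pred = predict(X_my_sentences, Y_my_labels, W, b, word_to_vec_map)
print_predictions(X_my_sentences, pred)
```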
Output:
Accuracy: 0.8333333333333334 (5/6, the last one is wrong)
i adore you ?? ("adore" has an embedding similar to "love")
i love you ??
funny lol 😄
lets play with a ball ?
food is ready 🍴
not feeling happy 😄 (misclassified: averaging ignores word order, so the model cannot handle negations like "not")
Examining the mistakes:
Printing the confusion matrix helps show which examples the model gets wrong.
A confusion matrix shows how often an example whose true label is one class gets mislabeled by the algorithm as a different class.
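A minimal sketch of printing such a confusion matrix with pandas. The assignment uses a similar crosstab plus a plotting helper; the exact calls here are an assumption:

```python
import pandas as pd

# Cross-tabulate true labels against predictions on the 56 test examples
df_confusion = pd.crosstab(Y_test.reshape(56,), pred_test.reshape(56,),
                           rownames=['Actual'], colnames=['Predicted'], margins=True)
print(df_confusion)
```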
2. Emojifier-V2: Using LSTMs in Keras
Let's build an LSTM model that takes word sequences as input. This model is able to take word ordering into account.
Emojifier-V2 still uses pre-trained word embeddings to represent words; it feeds them into an LSTM whose job is to predict the most appropriate emoji.
- Import some packages
2.1 Model overview
2.2 Keras and mini-batching
To train in mini-batches, all sentences must be padded to the same length: sentences shorter than the maximum length get zero vectors appended, e.g. $(e_{i}, e_{love}, e_{you}, \vec{0}, \vec{0}, \ldots, \vec{0})$.
2.3 The Embedding layer
https://keras.io/zh/layers/embeddings/
- First map the words of every sentence to their indices, padding with zeros up to the maximum length (a sketch follows)
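A minimal sketch of `sentences_to_indices` (my reconstruction of the graded function); the zero-padding falls out of initializing the index matrix with zeros:

```python
def sentences_to_indices(X, word_to_index, max_len):
    m = X.shape[0]                        # number of sentences
    X_indices = np.zeros((m, max_len))    # zero-initialized, so shorter sentences are padded automatically

    for i in range(m):
        sentence_words = X[i].lower().split()
        for j, w in enumerate(sentence_words):
            X_indices[i, j] = word_to_index[w]

    return X_indices
```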
Implement `pretrained_embedding_layer()`:
- Initialize the embedding matrix, paying attention to its shape
- Fill the embedding matrix with the vectors extracted from word_to_vec_map
- Define the Keras Embedding layer and set trainable = False so it cannot be trained; setting it to True would allow the algorithm to modify the embedding values
- Set the layer's weights equal to the embedding matrix (see the sketch after this list)
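A minimal sketch of `pretrained_embedding_layer` following these steps (my reconstruction; the import path assumes the standalone `keras` package used in the course):

```python
from keras.layers import Embedding

def pretrained_embedding_layer(word_to_vec_map, word_to_index):
    vocab_len = len(word_to_index) + 1                  # +1 required by the Keras Embedding convention
    emb_dim = word_to_vec_map["cucumber"].shape[0]      # 50 for GloVe-50d

    # Fill the embedding matrix row by row from the GloVe vectors
    emb_matrix = np.zeros((vocab_len, emb_dim))
    for word, index in word_to_index.items():
        emb_matrix[index, :] = word_to_vec_map[word]

    # Frozen (non-trainable) embedding layer
    embedding_layer = Embedding(vocab_len, emb_dim, trainable=False)
    embedding_layer.build((None,))                      # build the layer before setting its weights
    embedding_layer.set_weights([emb_matrix])

    return embedding_layer
```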
2.4 Building Emojifier-V2
https://keras.io/zh/layers/core/#input
https://keras.io/zh/layers/embeddings/#embedding
https://keras.io/zh/layers/recurrent/#lstm
https://keras.io/zh/layers/core/#dropout
https://keras.io/zh/layers/core/#dense
https://keras.io/zh/activations/
https://keras.io/zh/models/about-keras-models/#model
- Create the model (a sketch follows)
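A minimal sketch of an `Emojify_V2` model consistent with the summary below (my reconstruction; imports assume standalone Keras):

```python
from keras.models import Model
from keras.layers import Input, Dropout, LSTM, Dense, Activation

def Emojify_V2(input_shape, word_to_vec_map, word_to_index):
    # Input: a batch of sentences encoded as word indices, shape (max_len,)
    sentence_indices = Input(shape=input_shape, dtype='int32')

    embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
    embeddings = embedding_layer(sentence_indices)

    X = LSTM(128, return_sequences=True)(embeddings)   # return the full sequence for the next LSTM
    X = Dropout(0.5)(X)
    X = LSTM(128, return_sequences=False)(X)            # keep only the last hidden state
    X = Dropout(0.5)(X)
    X = Dense(5)(X)
    X = Activation('softmax')(X)

    return Model(inputs=sentence_indices, outputs=X)

model = Emojify_V2((maxLen,), word_to_vec_map, word_to_index)
model.summary()
```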
Output:
```
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_3 (InputLayer)         (None, 10)                0
_________________________________________________________________
embedding_4 (Embedding)      (None, 10, 50)            20000050
_________________________________________________________________
lstm_3 (LSTM)                (None, 10, 128)           91648
_________________________________________________________________
dropout_1 (Dropout)          (None, 10, 128)           0
_________________________________________________________________
lstm_4 (LSTM)                (None, 128)               131584
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645
_________________________________________________________________
activation_1 (Activation)    (None, 5)                 0
=================================================================
Total params: 20,223,927
Trainable params: 223,877
Non-trainable params: 20,000,050
_________________________________________________________________
```

Note: the 20,000,050 non-trainable parameters come from the frozen embedding matrix (400,001 words × 50 embedding dimensions).

- Compile the model
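For example, with the standard Keras compile call (categorical cross-entropy and Adam, as in the assignment):

```python
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```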
- Train the model

Convert X and Y to the required formats:
```python
X_train_indices = sentences_to_indices(X_train, word_to_index, maxLen)
Y_train_oh = convert_to_one_hot(Y_train, C=5)
```

Train:
```python
model.fit(X_train_indices, Y_train_oh, epochs=50, batch_size=32, shuffle=True)
```

Output:
```
WARNING:tensorflow:From c:\program files\python37\lib\site-packages\keras\backend\tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

Epoch 1/50
132/132 [==============================] - 1s 5ms/step - loss: 1.6088 - accuracy: 0.1970
Epoch 2/50
132/132 [==============================] - 0s 582us/step - loss: 1.5221 - accuracy: 0.3636
Epoch 3/50
132/132 [==============================] - 0s 574us/step - loss: 1.4762 - accuracy: 0.3939
(omitted)
Epoch 49/50
132/132 [==============================] - 0s 597us/step - loss: 0.0115 - accuracy: 1.0000
Epoch 50/50
132/132 [==============================] - 0s 582us/step - loss: 0.0182 - accuracy: 0.9924
```

Accuracy on the training set is nearly 100%.
- Evaluate on the test set (a sketch follows)
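A minimal sketch of the evaluation, using the standard Keras `evaluate` and the helpers defined earlier:

```python
X_test_indices = sentences_to_indices(X_test, word_to_index, maxLen)
Y_test_oh = convert_to_one_hot(Y_test, C=5)

loss, acc = model.evaluate(X_test_indices, Y_test_oh)
print("Test accuracy =", acc)
```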
Output:
```
56/56 [==============================] - 0s 2ms/step
Test accuracy = 0.875
```
Accuracy on the test set is 87.5%.
- Look at the mislabeled examples (a sketch follows)
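A minimal sketch of printing the mislabeled test examples, assuming the `label_to_emoji` helper from `emo_utils`:

```python
X_test_indices = sentences_to_indices(X_test, word_to_index, maxLen)
pred = model.predict(X_test_indices)

for i in range(len(X_test)):
    num = np.argmax(pred[i])          # predicted class
    if num != Y_test[i]:              # print only the mistakes
        print('Expected emoji:' + label_to_emoji(Y_test[i]) +
              ' prediction: ' + X_test[i] + ' ' + label_to_emoji(num))
```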
Output:
Expected emoji:😞 prediction: work is hard 😄
Expected emoji:😞 prediction: This girl is messing with me ??
Expected emoji:😞 prediction: work is horrible 😄
Expected emoji:🍴 prediction: any suggestions for dinner 😄
Expected emoji:😄 prediction: you brighten my day ??
Expected emoji:😞 prediction: go away ?
Expected emoji:🍴 prediction: I did not have breakfast ??
- Test with your own examples (a sketch follows)
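For example (a minimal sketch; the test sentence is arbitrary):

```python
x_test = np.array(['not feeling happy'])
X_test_indices = sentences_to_indices(x_test, word_to_index, maxLen)
print(x_test[0] + ' ' + label_to_emoji(np.argmax(model.predict(X_test_indices))))
```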
not feeling happy 😞 (this time the LSTM handles negations like "not" correctly)
not very happy 😞
very happy 😄
i really love my wife ??
Summary:
- If you have an NLP task with a small training set, using word embeddings can significantly help your algorithm. Word embeddings allow your model to handle words in the test set that never appeared in the training set.
- Training sequence models in Keras (and most other deep learning frameworks) requires attention to a few important details: sequences in a mini-batch must be padded to the same length, and an Embedding layer can be initialized with pre-trained word vectors (and optionally frozen).
Original article: https://michael.blog.csdn.net/article/details/108902060
My CSDN blog: https://michael.blog.csdn.net/