Bi-LSTM-CRF for Sequence Labeling


After working on sequence labeling for a while, I noticed that for NER tasks many papers adopt an LSTM-CRF architecture. Adding a CRF as the final layer lets the model score the most probable label path as a whole rather than picking each tag independently, which improves the metrics.

Most deep learning frameworks do not ship with a CRF layer, so it has to be implemented by hand. I have recently been learning PyTorch, which provides a Bi-LSTM-CRF tutorial implementation. The PyTorch tutorials are remarkably generous, covering implementations of most of the popular NLP models. Here I go through the model from scratch and note a few questions I ran into while studying it.

[Figure: Bi-LSTM-CRF model architecture]

The structure of a Bi-LSTM-CRF is generally as shown above; the final layer uses a CRF to learn an optimal tag path. The output dimension of the Bi-LSTM layer is the tag set size, so the output can be read as the emission score of each word for each tag. Let the Bi-LSTM output matrix be $P$, where $P_{i,j}$ is the unnormalized score of mapping word $w_i$ to tag $t_j$. For the CRF we assume a transition matrix $A$, where $A_{i,j}$ is the score of transitioning from tag $t_i$ to tag $t_j$.

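For concreteness, here is a minimal sketch of where $P$ and $A$ come from (the class name and layer sizes are assumptions for illustration, not the tutorial's exact module): the emission matrix is the Bi-LSTM output projected to tagset_size, and the transition matrix is simply a learnable parameter.

import torch
import torch.nn as nn

class EmissionTransitionSketch(nn.Module):
    # Hypothetical module for illustration; names and sizes are assumptions.
    def __init__(self, vocab_size, embedding_dim, hidden_dim, tagset_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embedding_dim)
        # Bidirectional LSTM; each direction gets hidden_dim // 2 units
        # (assumes hidden_dim is even), so outputs have size hidden_dim.
        self.lstm = nn.LSTM(embedding_dim, hidden_dim // 2,
                            bidirectional=True, batch_first=True)
        # Projects each hidden state to a vector of per-tag emission scores (P).
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)
        # Transition scores A; following the tutorial's convention,
        # transitions[i][j] is the score of transitioning to tag i from tag j.
        self.transitions = nn.Parameter(torch.randn(tagset_size, tagset_size))

    def forward(self, sentence):              # sentence: (batch, seq_len) word ids
        embeds = self.embed(sentence)         # (batch, seq_len, embedding_dim)
        lstm_out, _ = self.lstm(embeds)       # (batch, seq_len, hidden_dim)
        return self.hidden2tag(lstm_out)      # P: (batch, seq_len, tagset_size)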

For an input sequence $X = (x_1, x_2, \dots, x_n)$ and a corresponding output tag sequence $y = (y_1, y_2, \dots, y_n)$, the score is defined as

$$s(X, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$$

where $y_0$ and $y_{n+1}$ are the special START and STOP tags.
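As a sketch of how this score can be computed in code (assuming emission scores feats of shape (seq_len, tagset_size), the transitions[to, from] indexing convention above, and integer START/STOP tag ids; the function name is made up for illustration):

def score_sentence(feats, tags, transitions, start_tag, stop_tag):
    # feats: emission matrix P, shape (seq_len, tagset_size)
    # tags:  gold tag ids y_1 .. y_n
    score = 0.0
    prev = start_tag
    for i, feat in enumerate(feats):
        # transition from the previous tag plus the emission for the gold tag
        score = score + transitions[tags[i], prev] + feat[tags[i]]
        prev = tags[i]
    # closing transition into the STOP tag
    return score + transitions[stop_tag, prev]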

Using the softmax function, we define a probability for the correct tag sequence $y$ ($Y_X$ denotes the set of all possible tag sequences, including ones that can never actually occur):

$$p(y \mid X) = \frac{e^{s(X, y)}}{\sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}}$$

During training we therefore only need to maximize the likelihood of the correct sequence; taking the log-likelihood,

$$\log p(y \mid X) = s(X, y) - \log \sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}$$

So we define the loss function as $-\log p(y \mid X)$, and the network can then be trained with gradient descent.
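The tutorial wires these pieces together in a negative-log-likelihood method along the following lines (a sketch; _get_lstm_features produces the emission matrix $P$, _score_sentence computes $s(X, y)$, and _forward_alg, shown further below, computes the logsumexp term):

def neg_log_likelihood(self, sentence, tags):
    feats = self._get_lstm_features(sentence)       # emission scores P
    forward_score = self._forward_alg(feats)        # logsumexp over all paths
    gold_score = self._score_sentence(feats, tags)  # s(X, y) for the gold path
    return forward_score - gold_score               # -log p(y|X)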

When computing the loss, the term $s(X, y)$ is easy to evaluate, but $\log \sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}$ (written as logsumexp below) is a bit more involved, because it sums over the score of every possible path. There is a convenient shortcut: for the paths reaching word $w_{i+1}$, we can first compute the logsumexp of the paths reaching word $w_i$, because

$$\alpha_{i+1}(t) = \log \sum_{t'} \exp\big(\alpha_i(t') + A_{t', t} + P_{i+1, t}\big)$$

where $\alpha_i(t)$ denotes the logsumexp of the scores of all partial paths that end at word $w_i$ with tag $t$.

So accumulating the path scores step by step gives exactly the same result as computing the global score directly, while greatly reducing the computation time (linear instead of exponential in the sentence length). Below is the code from the PyTorch tutorial:


def _forward_alg(self, feats):
    # Do the forward algorithm to compute the partition function
    init_alphas = torch.Tensor(1, self.tagset_size).fill_(-10000.)
    # START_TAG has all of the score.
    init_alphas[0][self.tag_to_ix[START_TAG]] = 0.

    # Wrap in a variable so that we will get automatic backprop
    forward_var = autograd.Variable(init_alphas)

    # Iterate through the sentence
    for feat in feats:
        alphas_t = []  # The forward variables at this timestep
        for next_tag in range(self.tagset_size):
            # broadcast the emission score: it is the same regardless of
            # the previous tag
            emit_score = feat[next_tag].view(1, -1).expand(1, self.tagset_size)
            # the ith entry of trans_score is the score of transitioning to
            # next_tag from i
            trans_score = self.transitions[next_tag].view(1, -1)
            # The ith entry of next_tag_var is the value for the
            # edge (i -> next_tag) before we do log-sum-exp
            next_tag_var = forward_var + trans_score + emit_score
            # The forward variable for this tag is log-sum-exp of all the
            # scores.
            alphas_t.append(log_sum_exp(next_tag_var))
        forward_var = torch.cat(alphas_t).view(1, -1)
    terminal_var = forward_var + self.transitions[self.tag_to_ix[STOP_TAG]]
    alpha = log_sum_exp(terminal_var)
    return alpha
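To see why the step-by-step recursion matches the brute-force definition, a small exponential-time check like the following (a hypothetical helper, only feasible for short sentences) can be compared against _forward_alg; note that in the tutorial tagset_size also includes the START/STOP tags, whose forbidden transitions carry a -10000 penalty and so contribute almost nothing to the sum.

import itertools
import torch

def brute_force_partition(feats, transitions, start_tag, stop_tag):
    # Enumerate every possible tag path and logsumexp its total score.
    seq_len, tagset_size = feats.shape
    scores = []
    for path in itertools.product(range(tagset_size), repeat=seq_len):
        score = transitions[path[0], start_tag] + feats[0, path[0]]
        for i in range(1, seq_len):
            score = score + transitions[path[i], path[i - 1]] + feats[i, path[i]]
        score = score + transitions[stop_tag, path[-1]]
        scores.append(score)
    return torch.logsumexp(torch.stack(scores), dim=0)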

At decoding time, the Viterbi algorithm is used to recover the highest-scoring tag path:

def _viterbi_decode(self, feats):
    backpointers = []

    # Initialize the viterbi variables in log space
    init_vvars = torch.Tensor(1, self.tagset_size).fill_(-10000.)
    init_vvars[0][self.tag_to_ix[START_TAG]] = 0

    # forward_var at step i holds the viterbi variables for step i-1
    forward_var = autograd.Variable(init_vvars)
    for feat in feats:
        bptrs_t = []  # holds the backpointers for this step
        viterbivars_t = []  # holds the viterbi variables for this step

        for next_tag in range(self.tagset_size):
            # next_tag_var[i] holds the viterbi variable for tag i at the
            # previous step, plus the score of transitioning
            # from tag i to next_tag.
            # We don't include the emission scores here because the max
            # does not depend on them (we add them in below)
            next_tag_var = forward_var + self.transitions[next_tag]
            best_tag_id = argmax(next_tag_var)
            bptrs_t.append(best_tag_id)
            viterbivars_t.append(next_tag_var[0][best_tag_id])
        # Now add in the emission scores, and assign forward_var to the set
        # of viterbi variables we just computed
        forward_var = (torch.cat(viterbivars_t) + feat).view(1, -1)
        backpointers.append(bptrs_t)

    # Transition to STOP_TAG
    terminal_var = forward_var + self.transitions[self.tag_to_ix[STOP_TAG]]
    best_tag_id = argmax(terminal_var)
    path_score = terminal_var[0][best_tag_id]

    # Follow the back pointers to decode the best path.
    best_path = [best_tag_id]
    for bptrs_t in reversed(backpointers):
        best_tag_id = bptrs_t[best_tag_id]
        best_path.append(best_tag_id)
    # Pop off the start tag (we dont want to return that to the caller)
    start = best_path.pop()
    assert start == self.tag_to_ix[START_TAG]  # Sanity check
    best_path.reverse()
    return path_score, best_path
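At inference time the model's forward pass simply runs the Bi-LSTM features through this decoder, roughly as follows (a sketch of the tutorial's forward method):

def forward(self, sentence):
    # Get the emission scores from the Bi-LSTM, then decode the best path.
    lstm_feats = self._get_lstm_features(sentence)
    score, tag_seq = self._viterbi_decode(lstm_feats)
    return score, tag_seq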

The full implementation can be found at Bi-LSTM-CRF.

References

Bidirectional LSTM-CRF Models for Sequence Tagging
Neural Architectures for Named Entity Recognition
Advanced: Making Dynamic Decisions and the Bi-LSTM CRF
