bert4keras: a clean and elegant deep learning package

Published: 2024/1/18

Beginner-friendly bert4keras

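During my internship at Tencent (鵝廠), I followed the blog of Su Shen (蘇神, who writes at 科學空間, "Scientific Spaces"). It sparked an idea that let me improve a model running in production. Much of that idea, and of the experimental progress behind it, I owe to bert4keras, the library Su Shen designed. It is clear, lightweight, and built on Keras, so implementing BERT takes very little code, and it ships with many easy-to-read examples, which makes it extremely friendly to NLP beginners. This post recommends several projects built on bert4keras, all by Su Shen, that are well suited to getting started with BERT.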

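Project 1: Testing BERT's MLM

Project link: basic_masked_language_model

Tokenizer: the tokenizer. Its main methods are encode and decode.
build_transformer_model: builds the BERT model. Reading its source is recommended; it can load several kinds of weights and model architectures (such as UniLM).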

import numpy as np
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
from bert4keras.snippets import to_array

config_path = '/root/kg/bert/chinese_L-12_H-768_A-12/bert_config.json'
checkpoint_path = '/root/kg/bert/chinese_L-12_H-768_A-12/bert_model.ckpt'
dict_path = '/root/kg/bert/chinese_L-12_H-768_A-12/vocab.txt'

tokenizer = Tokenizer(dict_path, do_lower_case=True)  # build the tokenizer
model = build_transformer_model(
    config_path=config_path, checkpoint_path=checkpoint_path, with_mlm=True
)  # build the model and load the weights

token_ids, segment_ids = tokenizer.encode(u'科學技術是第一生產力')

# mask out "技術"
token_ids[3] = token_ids[4] = tokenizer._token_mask_id
token_ids, segment_ids = to_array([token_ids], [segment_ids])

# use the MLM model to predict the masked positions
probas = model.predict([token_ids, segment_ids])[0]
print(tokenizer.decode(probas[3:5].argmax(axis=1)))  # the result is indeed "技術"
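Why indices 3 and 4? Here is a minimal sketch, assuming a toy character-level tokenizer (the `toy_encode` helper is hypothetical and is not part of bert4keras). Position 0 holds [CLS], so the characters of the sentence start at index 1 and "技術" lands on positions 3 and 4. The real Tokenizer returns integer ids from vocab.txt rather than strings, but for Chinese text like this it should give the same one-token-per-character layout.

```python
# Toy stand-in for a BERT tokenizer (hypothetical helper, not the bert4keras API).
CLS, SEP, MASK = "[CLS]", "[SEP]", "[MASK]"


def toy_encode(text):
    """Return [CLS] + one token per character + [SEP]."""
    return [CLS] + list(text) + [SEP]


tokens = toy_encode("科學技術是第一生產力")

# Index 0 is [CLS], so the sentence's characters occupy indices 1..10.
masked = list(tokens)
masked[3] = masked[4] = MASK

# The span the MLM head is asked to recover.
original_span = "".join(tokens[3:5])

print(tokens)
print(masked)
print(original_span)
```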

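Project 2: Sentence-pair classification

Project link: task_sentence_similarity_lcqmc
Core model code:

Sentence 1 and sentence 2 are concatenated and fed into BERT together.
BERT's pooler output goes through dropout and a dense (MLP) layer that projects it to 2 dimensions, which turns the task into a classification problem.
The final model is a standard Keras model.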

from bert4keras.backend import keras
from bert4keras.models import build_transformer_model
from bert4keras.snippets import DataGenerator, sequence_padding
from keras.layers import Dropout, Dense

# tokenizer, maxlen, config_path and checkpoint_path are defined earlier
# in the full example script.


class data_generator(DataGenerator):
    """Data generator"""
    def __iter__(self, random=False):
        batch_token_ids, batch_segment_ids, batch_labels = [], [], []
        for is_end, (text1, text2, label) in self.sample(random):
            # encode the two sentences as a single input pair
            token_ids, segment_ids = tokenizer.encode(text1, text2, maxlen=maxlen)
            batch_token_ids.append(token_ids)
            batch_segment_ids.append(segment_ids)
            batch_labels.append([label])
            if len(batch_token_ids) == self.batch_size or is_end:
                batch_token_ids = sequence_padding(batch_token_ids)
                batch_segment_ids = sequence_padding(batch_segment_ids)
                batch_labels = sequence_padding(batch_labels)
                yield [batch_token_ids, batch_segment_ids], batch_labels
                batch_token_ids, batch_segment_ids, batch_labels = [], [], []


# load the pretrained model
bert = build_transformer_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path,
    with_pool=True,
    return_keras_model=False,
)

output = Dropout(rate=0.1)(bert.model.output)
output = Dense(
    units=2, activation='softmax', kernel_initializer=bert.initializer
)(output)

model = keras.models.Model(bert.model.input, output)
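To make the input layout concrete, the following is a rough pure-Python sketch of what pair encoding and batch padding produce. `toy_encode_pair` and `pad_batch` are hypothetical helpers written for this post, not bert4keras functions, and they work on strings instead of vocabulary ids. The assumption is the usual BERT layout: [CLS] sentence 1 [SEP] sentence 2 [SEP], with segment id 0 for the first part and 1 for the second.

```python
def toy_encode_pair(text1, text2):
    """Mimic the BERT pair layout: [CLS] text1 [SEP] text2 [SEP]."""
    first = ["[CLS]"] + list(text1) + ["[SEP]"]
    second = list(text2) + ["[SEP]"]
    tokens = first + second
    # Segment ids tell the model which sentence each token belongs to.
    segment_ids = [0] * len(first) + [1] * len(second)
    return tokens, segment_ids


def pad_batch(sequences, pad_value=0):
    """Right-pad every sequence to the length of the longest one."""
    longest = max(len(seq) for seq in sequences)
    return [seq + [pad_value] * (longest - len(seq)) for seq in sequences]


tokens, segment_ids = toy_encode_pair("今天天氣好", "天氣不錯")
batch_segments = pad_batch([segment_ids, [0, 0, 0, 1, 1]])

print(tokens)
print(segment_ids)
print(batch_segments)
```

Padding to the longest sequence in each batch, rather than to a fixed global length, keeps short batches cheap.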

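Project 3: Title generation

Project link: task_seq2seq_autotitle
NLG tasks are convenient to implement with the UniLM architecture, and in bert4keras switching to UniLM takes a single argument.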

model = build_transformer_model(
    config_path,
    checkpoint_path,
    application='unilm',
    keep_tokens=keep_tokens,  # keep only the tokens in keep_tokens to shrink the vocabulary
)
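What does the UniLM switch change? As I understand it, mainly the attention mask: input tokens (segment 0) see each other in both directions, while output tokens (segment 1) see the whole input plus only the output tokens up to and including themselves. The sketch below builds such a mask from segment_ids in pure Python. The function name is made up for illustration, and the library's actual implementation may differ in detail.

```python
def unilm_attention_mask(segment_ids):
    """Return mask where mask[i][j] == 1 means position i may attend to position j."""
    # Running count of output-segment tokens up to and including each position.
    cumulative, total = [], 0
    for s in segment_ids:
        total += s
        cumulative.append(total)
    n = len(segment_ids)
    # Position i may see position j if j does not lie further into the output than i.
    return [
        [1 if cumulative[j] <= cumulative[i] else 0 for j in range(n)]
        for i in range(n)
    ]


segment_ids = [0, 0, 0, 1, 1]  # three input tokens, two output tokens
mask = unilm_attention_mask(segment_ids)
for row in mask:
    print(row)
```

The first three rows allow attention only among the input tokens, and the last two rows form the lower-triangular (causal) part over the output.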

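The loss for NLG tasks is cross-entropy, and the example implements it elegantly:

The CrossEntropy class inherits from the Loss class and overrides compute_loss.
Pass the tensors that take part in the loss through CrossEntropy; the loss is computed along the way and attached to the model, which is why compile is called without a loss argument. See the source of the Loss class for details.
The final model is a standard Keras model.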

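from bert4keras.backend import keras, K
from bert4keras.layers import Loss
from bert4keras.models import build_transformer_model
from bert4keras.optimizers import Adam
from keras.models import Model


class CrossEntropy(Loss):
    """Cross-entropy as the loss, with the input part masked out"""
    def compute_loss(self, inputs, mask=None):
        y_true, y_mask, y_pred = inputs
        y_true = y_true[:, 1:]  # target token_ids
        y_mask = y_mask[:, 1:]  # segment_ids, which mark exactly the part to predict
        y_pred = y_pred[:, :-1]  # predicted sequence, shifted by one position
        loss = K.sparse_categorical_crossentropy(y_true, y_pred)
        loss = K.sum(loss * y_mask) / K.sum(y_mask)
        return loss


model = build_transformer_model(
    config_path,
    checkpoint_path,
    application='unilm',
    keep_tokens=keep_tokens,  # keep only the tokens in keep_tokens to shrink the vocabulary
)

output = CrossEntropy(2)(model.inputs + model.outputs)

model = Model(model.inputs, output)
model.compile(optimizer=Adam(1e-5))
model.summary()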
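The shifting and masking in compute_loss are easier to follow on a single sequence with plain lists. The sketch below mirrors the same three slices and the same masked average in pure Python. The function name and the tiny 4-token vocabulary are made up for illustration.

```python
import math


def masked_shifted_cross_entropy(token_ids, segment_ids, probs):
    """probs[t] is the predicted distribution over the vocabulary at position t."""
    y_true = token_ids[1:]    # targets: position t predicts token t + 1
    y_mask = segment_ids[1:]  # 1 only where the target belongs to the output part
    y_pred = probs[:-1]       # predictions, shifted by one position
    losses = [-math.log(p[t]) for t, p in zip(y_true, y_pred)]
    # Average over the output positions only; input positions contribute nothing.
    return sum(l * m for l, m in zip(losses, y_mask)) / sum(y_mask)


token_ids = [0, 1, 2, 3]
segment_ids = [0, 0, 1, 1]  # the last two tokens are the output to be generated
probs = [
    [0.25, 0.25, 0.25, 0.25],  # predicts token 1, but that target is input, so it is ignored
    [0.1, 0.2, 0.5, 0.2],      # predicts token 2 with probability 0.5
    [0.25, 0.25, 0.25, 0.25],  # predicts token 3 with probability 0.25
    [0.7, 0.1, 0.1, 0.1],      # last position has no target and is dropped
]

loss = masked_shifted_cross_entropy(token_ids, segment_ids, probs)
print(loss)
```

Only the two output tokens count, so the result is the mean of -log(0.5) and -log(0.25).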

At prediction time decoding is autoregressive; by inheriting from the AutoRegressiveDecoder class, beam search is easy to implement.
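To show what beam search does during autoregressive decoding, here is a small pure-Python sketch. It neither uses nor reproduces the AutoRegressiveDecoder interface: the `beam_search` function, the `TABLE` of next-token probabilities, and the token names are all made up for illustration, and in a real setup the model would supply the next-token distribution.

```python
import math

# Hypothetical next-token probabilities, keyed by the last token of the sequence.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "</s>": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}


def toy_next_log_probs(seq):
    """Stand-in for a model call: log-probabilities of each possible next token."""
    return {tok: math.log(p) for tok, p in TABLE[seq[-1]].items()}


def beam_search(next_log_probs, start, end, topk=2, max_len=10):
    """Keep the topk highest-scoring partial sequences at every step."""
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        # Extend every live sequence by every possible next token.
        candidates = []
        for seq, score in beams:
            for token, logp in next_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        # Keep the best topk; those ending in the end token are set aside.
        beams = []
        for seq, score in candidates[:topk]:
            if seq[-1] == end:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    pool = finished or beams
    return max(pool, key=lambda c: c[1])[0]


greedy = beam_search(toy_next_log_probs, "<s>", "</s>", topk=1)
best = beam_search(toy_next_log_probs, "<s>", "</s>", topk=2)
print(greedy)
print(best)
```

With topk=1 the search is greedy and commits to "a" at the first step. With topk=2 it keeps "b" alive and ends up with the sequence through "z", which has the higher overall probability (0.36 versus 0.30).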

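Project 4: SimBert

Project link: SimBert
It combines UniLM with contrastive learning. The design of the data generator and the loss class is clever and worth reading carefully. Where the code is hard to follow, I suggest opening Jupyter and printing things line by line until it makes sense.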
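For readers new to contrastive learning, the following is a rough sketch of an in-batch similarity loss in pure Python. It is not SimBert's code. It assumes that rows 2k and 2k+1 of a batch are sentence vectors of a similar pair, that every other row in the batch serves as a negative, and that cosine similarities are multiplied by a scale of 30 before the softmax. The function name, the pairing convention, and the scale value are all assumptions of this sketch.

```python
import math


def in_batch_similarity_loss(vectors, scale=30.0):
    """Cross-entropy over in-batch similarities; rows 2k and 2k+1 form a positive pair."""
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    unit = [normalize(v) for v in vectors]
    n = len(unit)  # assumed to be even
    total = 0.0
    for i in range(n):
        partner = i + 1 if i % 2 == 0 else i - 1
        # Scaled cosine similarity to every other row; a row is never compared with itself.
        logits = {
            j: scale * sum(a * b for a, b in zip(unit[i], unit[j]))
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp, then softmax cross-entropy against the partner.
        peak = max(logits.values())
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        total += log_norm - logits[partner]
    return total / n


paired = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]    # neighbours are similar
shuffled = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]  # neighbours are dissimilar

loss_paired = in_batch_similarity_loss(paired)
loss_shuffled = in_batch_similarity_loss(shuffled)
print(loss_paired, loss_shuffled)
```

The loss is small when each vector is closest to its designated partner and large otherwise, which is what pushes similar sentences together during training.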

Project 5: SPACES: "extract-then-generate" long-text summarization

Project link: SPACES
A fairly comprehensive project; its core is BioCopyNet+Unilm.

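Summary

Strengths of the bert4keras project:

build_transformer_model builds a BERT model in one line of code, and a single argument switches it to the UniLM architecture.
Inheriting from the Loss class and overriding compute_loss makes computing the loss easy.
It is deeply based on Keras, so training and saving work the same way as in Keras.
Plenty of examples! Su Shen's cutting-edge algorithm research also comes with bert4keras implementations.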
