Anisotropy in BERT sentence embeddings and its connection to contrastive learning

Published: 2024/1/18

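This post explains why sentence embeddings produced by BERT perform poorly on semantic-similarity tasks, covering the relevant explanations (anisotropy, representation degeneration, the narrow-cone embedding space). It also summarizes the connection between contrastive learning and anisotropy discussed in SimCSE.

It mainly collects the relevant papers and their key points, kept here for reference.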

Contents

Problem statement:

What the related papers say:

1. REPRESENTATION DEGENERATION PROBLEM IN TRAINING NATURAL LANGUAGE GENERATION MODELS

2. bert-flow : chap2 : Understanding the Sentence Embedding Space of BERT

2.1 The Connection between Semantic Similarity and BERT Pre-training :

2.2 Anisotropic Embedding Space Induces Poor Semantic Similarity:

3. simcse : chap5 : Connection to Anisotropy

4. Alignment and Uniformity

Related papers:



Problem statement:

why do the BERT-induced sentence embeddings perform poorly to retrieve semantically similar sentences?

That is, why do sentence embeddings produced by BERT perform so poorly on semantic-similarity tasks?

Reimers and Gurevych (2019) demonstrate that such BERT sentence embeddings lag behind the state-of-the-art sentence embeddings in terms of semantic similarity. On the STS-B dataset, BERT sentence embeddings are even less competitive to averaged GloVe (Pennington et al., 2014) embeddings, which is a simple and non-contextualized baseline proposed several years ago.

What the related papers say:

1. REPRESENTATION DEGENERATION PROBLEM IN TRAINING NATURAL LANGUAGE GENERATION MODELS

This paper introduces the representation degeneration problem (anisotropy).

We observe that when training a model for natural language generation tasks through likelihood maximization with the weight tying trick, especially with big training datasets, most of the learnt word embeddings tend to degenerate and be distributed into a narrow cone, which largely limits the representation power of word embeddings.
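
One rough way to see what "a narrow cone" means is to look at the average cosine similarity between random pairs of embeddings. The sketch below is only an illustration on synthetic vectors, not the paper's experiment: `mean_pairwise_cosine` is a helper name made up here, and the "cone" is simulated by adding a large shared offset to Gaussian vectors (an assumption made purely to get the effect).

```python
import numpy as np

def mean_pairwise_cosine(emb: np.ndarray) -> float:
    """Average cosine similarity over all ordered pairs of distinct rows."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = emb.shape[0]
    # Drop the diagonal: every vector has similarity 1 with itself.
    return float((sims.sum() - np.trace(sims)) / (n * (n - 1)))

rng = np.random.default_rng(0)

# Roughly isotropic toy embeddings: directions spread over the whole space.
isotropic = rng.standard_normal((500, 64))

# Hypothetical "degenerated" embeddings: the same vectors plus a large
# common offset, so that all of them point in nearly the same direction.
cone = isotropic + 5.0 * np.ones(64)

iso_score = mean_pairwise_cosine(isotropic)
cone_score = mean_pairwise_cosine(cone)
print(f"isotropic: {iso_score:.3f}, cone: {cone_score:.3f}")
```

For the isotropic set the score stays near 0, while for the cone it is close to 1: almost any two vectors look "similar", which is why cosine similarity loses its discriminative power in such a space.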

......

2. bert-flow : chap2 : Understanding the Sentence Embedding Space of BERT

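This chapter covers the connection between BERT-style pre-training objectives and semantic similarity, and analyzes why performance on semantic similarity is poor.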

2.1 The Connection between Semantic Similarity and BERT Pre-training :

  • The similarity between BERT sentence embeddings can be reduced to the similarity between BERT context embeddings $h_c^\top h_{c'}$. However, as shown in Equation 1, the pretraining of BERT does not explicitly involve the computation of $h_c^\top h_{c'}$. Therefore, we can hardly derive a mathematical formulation of what $h_c^\top h_{c'}$ exactly represents.
  • Co-Occurrence Statistics as the Proxy for Semantic Similarity: roughly speaking, it is semantically meaningful to compute the dot product between a context embedding and a word embedding
  • Higher-Order Co-Occurrence Statistics as Context-Context Semantic Similarity: During pretraining, the semantic relationship between two contexts c and c′ could be inferred and reinforced with their connections to words.
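
For reference, the "Equation 1" mentioned in the quote is not reproduced in these notes. As far as I recall from the bert-flow paper, it is the usual softmax language-modeling factor, and the co-occurrence argument rests on the approximation that follows it (which the paper attributes to earlier work on the softmax bottleneck). Treat the exact notation below as a reconstruction from memory, not a verbatim copy: $h_c$ is the context embedding, $w_x$ the word embedding, and $\lambda_c$ a context-specific term.

```latex
% Equation 1 (reconstructed): probability of token x given context c
p(x \mid c) = \frac{\exp\left(h_c^\top w_x\right)}{\sum_{x'} \exp\left(h_c^\top w_{x'}\right)}

% Co-occurrence view of the context-word dot product
h_c^\top w_x \approx \log p^{*}(x \mid c) + \lambda_c
             = \mathrm{PMI}(x, c) + \log p(x) + \lambda_c,
\qquad
\mathrm{PMI}(x, c) = \log \frac{p(x, c)}{p(x)\, p(c)}
```

Only context-word products $h_c^\top w_x$ appear in the objective. The context-context product $h_c^\top h_{c'}$ is shaped only indirectly, through the words that both contexts co-occur with, which is the point of the third bullet.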

2.2 Anisotropic Embedding Space Induces Poor Semantic Similarity:

  • To investigate the underlying problem of the failure, we use word embeddings as a surrogate because words and contexts share the same embedding space. If the word embeddings exhibit some misleading properties, the context embeddings will also be problematic, and vice versa.
  • Gao et al. (2019) and Wang et al. (2020) have pointed out that, for language modeling, the maximum likelihood training with Equation 1 usually produces an anisotropic word embedding space. “Anisotropic” means word embeddings occupy a narrow cone in the vector space.
  • Observation 1: Word Frequency Biases the Embedding Space
  • Observation 2: Low-Frequency Words Disperse Sparsely. We observe that, in the learned anisotropic embedding space, high-frequency words concentrate densely and low-frequency words disperse sparsely.
  • Due to the sparsity, many “holes” could be formed around the low-frequency word embeddings in the embedding space, where the semantic meaning can be poorly defined. Note that BERT sentence embeddings are produced by averaging the context embeddings, which is a convexity-preserving operation. However, the holes violate the convexity of the embedding space.
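
The "averaging the context embeddings" step in the last bullet can be sketched as masked mean pooling. This is a minimal NumPy illustration with hand-written toy numbers. The function names `mean_pool` and `cosine` and the array layout (batch, sequence, dimension) are assumptions made for this example, not code from the paper.

```python
import numpy as np

def mean_pool(token_emb: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token (context) embeddings over real tokens only.

    token_emb:      (batch, seq_len, dim) context embeddings
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_emb.dtype)
    summed = (token_emb * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two toy "sentences" with 3 token slots each; the last slot of the first
# sentence is padding and must not influence the average.
token_emb = np.array([
    [[1.0, 0.0], [3.0, 0.0], [100.0, 100.0]],
    [[0.0, 2.0], [0.0, 4.0], [0.0, 6.0]],
])
attention_mask = np.array([[1, 1, 0], [1, 1, 1]])

sentence_vecs = mean_pool(token_emb, attention_mask)
similarity = cosine(sentence_vecs[0], sentence_vecs[1])
print(sentence_vecs, similarity)
```

Because the mean is a convex combination of the token vectors, each sentence vector lies inside the convex hull of its tokens. If that region contains a "hole" where meaning is poorly defined, the averaged vector can land there.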

3. simcse : chap5 : Connection to Anisotropy

This chapter covers the connection between SimCSE and anisotropy, and why SimCSE works.

we take a singular spectrum perspective—which is a common practice in analyzing word embeddings (Mu and Viswanath, 2018; Gao et al., 2019; Wang et al., 2020), and show that the contrastive objective can “flatten” the singular value distribution of sentence embeddings and make the representations more isotropic.
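
To get a feel for the "singular spectrum" view, one can compare the singular values of an embedding matrix in an isotropic and an anisotropic setting. The sketch below uses synthetic data only. `top_singular_share` is a summary statistic made up for this note (the share of the largest singular value in the sum of all singular values), not a quantity defined in SimCSE, and the anisotropic matrix is simulated with a shared offset.

```python
import numpy as np

def top_singular_share(emb: np.ndarray) -> float:
    """Fraction of the total singular-value mass carried by the largest one."""
    s = np.linalg.svd(emb, compute_uv=False)  # singular values, descending
    return float(s[0] / s.sum())

rng = np.random.default_rng(0)

# Toy sentence-embedding matrices, one row per sentence.
isotropic = rng.standard_normal((500, 64))
anisotropic = isotropic + 5.0 * np.ones(64)  # hypothetical common direction

iso_share = top_singular_share(isotropic)
aniso_share = top_singular_share(anisotropic)
print(f"isotropic: {iso_share:.3f}, anisotropic: {aniso_share:.3f}")
```

In the anisotropic case a single direction dominates the spectrum. A "flatter" spectrum, which is what the quote says the contrastive objective encourages, corresponds to a smaller share for the top singular value.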

......

4. Alignment and Uniformity

This work introduces alignment and uniformity as measures for analyzing and evaluating (and training) sentence embeddings.
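
As a reminder of what the two quantities compute, here is a small NumPy sketch following the definitions in the alignment-and-uniformity paper as I understand them: alignment is the expected distance (to the power alpha) between embeddings of positive pairs, and uniformity is the log of the average Gaussian potential over pairs of embeddings. The defaults `alpha=2` and `t=2` and the toy data are assumptions for illustration. Embeddings are assumed to be L2-normalized.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def align_loss(x: np.ndarray, y: np.ndarray, alpha: float = 2.0) -> float:
    """Mean of ||x_i - y_i||^alpha over positive pairs (x_i, y_i). Lower is better."""
    return float((np.linalg.norm(x - y, axis=1) ** alpha).mean())

def uniform_loss(x: np.ndarray, t: float = 2.0) -> float:
    """log of the mean of exp(-t * ||x_i - x_j||^2) over distinct pairs. Lower is better."""
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    upper = np.triu_indices(x.shape[0], k=1)  # each unordered pair once
    return float(np.log(np.exp(-t * sq_dist[upper]).mean()))

rng = np.random.default_rng(0)

# Toy embeddings spread over the unit sphere vs. collapsed around one direction.
spread = l2_normalize(rng.standard_normal((200, 16)))
collapsed = l2_normalize(1.0 + 0.05 * rng.standard_normal((200, 16)))

# Toy positive pairs: each embedding and a slightly perturbed copy of it.
positives = l2_normalize(spread + 0.01 * rng.standard_normal((200, 16)))

align_value = align_loss(spread, positives)
uniform_spread = uniform_loss(spread)
uniform_collapsed = uniform_loss(collapsed)
print(align_value, uniform_spread, uniform_collapsed)
```

The collapsed (anisotropic) set gets a uniformity value close to 0, while the well-spread set gets a clearly negative one. This is the sense in which uniformity can be read as a measure of how isotropic the embedding distribution is.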

......

Related papers:

  • Jun Gao, Di He, Xu Tan, Tao Qin, Liwei Wang, and Tieyan Liu. 2019. Representation degeneration problem in training natural language generation models. In International Conference on Learning Representations (ICLR).
  • Improving Neural Language Generation with Spectrum Control: https://openreview.net/pdf?id=ByxY8CNtvr
  • bert-flow: On the Sentence Embeddings from Pre-trained Language Models
  • SimCSE: Simple Contrastive Learning of Sentence Embeddings
  • Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere: http://proceedings.mlr.press/v119/wang20k/wang20k.pdf
