

Today's arXiv Picks | 4 New EMNLP 2021 Papers

Published: 2024/10/8

This article, collected and organized by 生活随笔, presents four new EMNLP 2021 papers from arXiv for readers' reference.

About #今日arXiv精选 (Today's arXiv Picks)

This is a column under「AI 学术前沿」(AI Academic Frontier): each day the editors select high-quality papers from arXiv and deliver them to readers.

Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding

Comment: Long paper at EMNLP 2021

Link: http://arxiv.org/abs/2109.01583

Abstract

Lack of training data presents a grand challenge to scaling out spoken language understanding (SLU) to low-resource languages. Although various data augmentation approaches have been proposed to synthesize training data in low-resource target languages, the augmented data sets are often noisy and thus impede the performance of SLU models. In this paper we focus on mitigating noise in augmented data. We develop a denoising training approach: multiple models are trained with data produced by various augmentation methods, and those models provide supervision signals to each other. The experimental results show that our method outperforms the existing state of the art by 3.05 and 4.24 percentage points on two benchmark datasets, respectively. The code will be made open source on GitHub.
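The abstract does not spell out how the models supervise each other; one illustrative reading is that models trained on different augmented sets vote on labels for shared examples. A minimal sketch with hypothetical intent labels, using a majority vote as a stand-in for the paper's actual supervision scheme:

```python
from collections import Counter

def co_supervise(predictions_per_model):
    """Aggregate noisy labels: each model's prediction on an example
    acts as a supervision signal; the majority vote is the cleaned label."""
    return [Counter(preds).most_common(1)[0][0]
            for preds in zip(*predictions_per_model)]

# Hypothetical intent predictions from three models, each trained on a
# different augmented data set, over the same four utterances.
model_a = ["play_music", "set_alarm", "get_weather", "play_music"]
model_b = ["play_music", "set_alarm", "set_alarm", "play_music"]
model_c = ["play_music", "get_weather", "get_weather", "play_music"]

print(co_supervise([model_a, model_b, model_c]))
# -> ['play_music', 'set_alarm', 'get_weather', 'play_music']
```

The cleaned labels could then be used to retrain each model on the denoised data; the paper's actual exchange of supervision signals may be softer than a hard vote.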

Contrastive Representation Learning for Exemplar-Guided Paraphrase Generation

Comment: Findings of EMNLP 2021

Link: http://arxiv.org/abs/2109.01484

Abstract

Exemplar-Guided Paraphrase Generation (EGPG) aims to generate a target sentence which conforms to the style of the given exemplar while encapsulating the content information of the source sentence. In this paper, we propose a new method with the goal of learning a better representation of the style and the content. This method is mainly motivated by the recent success of contrastive learning, which has demonstrated its power in unsupervised feature extraction tasks. The idea is to design two contrastive losses with respect to the content and the style by considering two problem characteristics during training: the target sentence shares the same content with the source sentence, and the target sentence shares the same style with the exemplar. These two contrastive losses are incorporated into the general encoder-decoder paradigm. Experiments on two datasets, namely QQP-Pos and ParaNMT, demonstrate the effectiveness of our proposed contrastive losses.
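The two losses are described only at a high level; a common instantiation of such contrastive objectives is the InfoNCE form, which pulls a positive pair together and pushes negatives apart in embedding space. A minimal sketch with hypothetical toy vectors (the paper's actual encoder representations and loss details may differ):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is close to its positive
    and far from the negatives."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    exps = [math.exp(s / temperature) for s in sims]
    return -math.log(exps[0] / sum(exps))

# Content loss: source and target share content, so their (hypothetical)
# content embeddings form the positive pair; other sentences are negatives.
src = [0.9, 0.1, 0.0]
tgt = [0.8, 0.2, 0.1]
others = [[0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
print(contrastive_loss(src, tgt, others))
```

The style loss would take the same form with the exemplar's style embedding as the positive.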

Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT

Comment: EMNLP 2021

Link: http://arxiv.org/abs/2109.01396

Abstract

Differently from traditional statistical MT, which decomposes the translation task into distinct, separately learned components, neural machine translation uses a single neural network to model the entire translation process. Despite neural machine translation being the de facto standard, it is still not clear how NMT models acquire different competences over the course of training, and how this mirrors the different models in traditional SMT. In this work, we look at the competences related to three core SMT components and find that during training, NMT first focuses on learning target-side language modeling, then improves translation quality, approaching word-by-word translation, and finally learns more complicated reordering patterns. We show that this behavior holds for several models and language pairs. Additionally, we explain how such an understanding of the training process can be useful in practice and, as an example, show how it can be used to improve vanilla non-autoregressive neural machine translation by guiding teacher model selection.
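One way to probe the target-side language-modeling competence described above is to compare an NMT checkpoint's next-token distribution with that of a pure target-side language model, e.g. via KL divergence: early checkpoints should sit close to the language model. A toy sketch with hypothetical distributions (not the paper's actual measurement protocol):

```python
import math

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions over a tiny 3-word vocabulary,
# from an early NMT checkpoint and from a target-side language model.
nmt_probs = [0.7, 0.2, 0.1]
lm_probs = [0.6, 0.3, 0.1]
print(kl_divergence(nmt_probs, lm_probs))
```

Tracking this divergence across checkpoints would show whether the model drifts away from the language model as translation-specific competences emerge.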

Detecting Speaker Personas from Conversational Texts

Comment: Accepted by EMNLP 2021

Link: http://arxiv.org/abs/2109.01330

Abstract

Personas are useful for dialogue response prediction. However, the personas used in current studies are pre-defined and hard to obtain before a conversation. To tackle this issue, we study a new task, named Speaker Persona Detection (SPD), which aims to detect speaker personas based on plain conversational text. In this task, a best-matched persona is searched out from candidates given the conversational text. This is a many-to-many semantic matching task because both contexts and personas in SPD are composed of multiple sentences. The long-term dependency and the dynamic redundancy among these sentences increase the difficulty of this task. We build a dataset for SPD, dubbed Persona Match on Persona-Chat (PMPC). Furthermore, we evaluate several baseline models and propose utterance-to-profile (U2P) matching networks for this task. The U2P models operate at a fine granularity, treating both contexts and personas as sets of multiple sequences. Each sequence pair is then scored, and an interpretable overall score is obtained for a context-persona pair through aggregation. Evaluation results show that the U2P models outperform their baseline counterparts significantly.
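The fine-grained scoring and aggregation can be sketched as follows, with simple token overlap as a hypothetical stand-in for the learned sequence-pair scorer: each utterance takes its best match over the profile sentences, and the overall context-persona score averages these per-utterance scores.

```python
def jaccard(a, b):
    """Stand-in similarity: token overlap between two sentences."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def u2p_score(context, persona):
    """Score a context-persona pair at fine granularity: match each
    utterance against every profile sentence, keep the best match per
    utterance, and average the per-utterance scores."""
    per_utterance = [max(jaccard(u, p) for p in persona) for u in context]
    return sum(per_utterance) / len(per_utterance)

# Hypothetical conversational context and two candidate personas.
context = ["i spent the weekend hiking", "my dog came along"]
persona_a = ["i love hiking", "i have a dog"]
persona_b = ["i play the piano", "i work in finance"]

print(u2p_score(context, persona_a) > u2p_score(context, persona_b))
# -> True
```

The per-utterance scores are what make the overall score interpretable: each one points to the profile sentence that best explains a given utterance.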

