﻿{"id":6849,"date":"2019-04-14T20:42:52","date_gmt":"2019-04-14T12:42:52","guid":{"rendered":"http:\/\/www.nlpir.org\/wordpress\/?p=6849"},"modified":"2019-04-21T21:16:31","modified_gmt":"2019-04-21T13:16:31","slug":"pay-less-attention-with-lightweight-and-dynamic-convolutions","status":"publish","type":"post","link":"http:\/\/www.nlpir.org\/wordpress\/2019\/04\/14\/pay-less-attention-with-lightweight-and-dynamic-convolutions\/","title":{"rendered":"Pay Less Attention with Lightweight and Dynamic Convolutions"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\" style=\"text-align:center\"><strong>NLPIR SEMINAR Y2019#10<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"> INTRO <\/h3>\n\n\n\n<p>        In the new semester, our Lab, Web Search Mining and Security Lab, plans to hold an academic seminar every Monday, and each time a keynote speaker will share understanding of papers on his\/her related research with you.<br><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Arrangement<br><\/h3>\n\n\n\n<p>This week&#8217;s seminar is organized as follows: <\/p>\n\n\n\n<ol><li>The seminar time is 1.pm, Mon, at Zhongguancun Technology Park ,Building 5, 1306.<\/li><li>The lecturer is <strong>Zhaoyou Liu<\/strong> , the paper&#8217;s title is <strong>Pay Less Attention with Lightweight and Dynamic Convolutions<\/strong>.<\/li><li>The seminar will be hosted by Gang Wang.<\/li><li>Attachment is the paper of this seminar, please download in advance.<\/li><\/ol>\n\n\n\n<p>Everyone interested in this topic is welcomed to join us. 
The following is the abstract of this week\u2019s paper.<\/p>\n\n\n\n<p>\n\t<div style=\"border:dashed windowtext 1.0pt;padding:1.0pt 4.0pt 1.0pt 4.0pt;\">\n\t\t<p class=\"MsoNormal\" align=\"center\" style=\"text-align:center;\">\n\t\t\t<span>Pay Less Attention with Lightweight and Dynamic Convolutions<\/span>\n\t\t<\/p>\n\t\t<p class=\"MsoNormal\" align=\"center\" style=\"text-align:center;\">\n\t\t\t<span>Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli<\/span>\n\t\t<\/p>\n\t\t<p class=\"MsoNormal\" align=\"center\" style=\"text-align:center;\">\n\t\t\t<span>Abstract<\/span>\n\t\t<\/p>\n\t\t<p class=\"MsoNormal\" style=\"text-indent:21.0pt;\">\n\t\t\t<span>Self-attention is a useful mechanism to build generative models for language and images. It determines the importance of context elements by comparing each element to the current time step. In this paper, we show that a very lightweight convolution can perform competitively to the best reported self-attention results. Next, we introduce dynamic convolutions which are simpler and more efficient than self-attention. We predict separate convolution kernels based solely on the current time-step in order to determine the importance of context elements. The number of operations required by this approach scales linearly in the input length, whereas self-attention is quadratic. Experiments on large-scale machine translation, language modeling and abstractive summarization show that dynamic convolutions improve over strong self-attention models. 
On the WMT\u201914 English-German test set dynamic convolutions achieve a new state of the art of 29.7 BLEU.<\/span>\n\t\t<\/p>\n\t<\/div>\n<\/p>\n\n\n\n<div class=\"wp-block-file aligncenter\"><a href=\"http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/04\/Pay-Less-Attention-with-Lightweight-and-Dynamic-Convolutions.pdf\">Pay Less Attention with Lightweight and Dynamic Convolutions<\/a><a href=\"http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/04\/Pay-Less-Attention-with-Lightweight-and-Dynamic-Convolutions.pdf\" class=\"wp-block-file__button\" download>Download<\/a><\/div>\n\n\n\n<!--nextpage-->\n\n\n\n<h2 class=\"wp-block-heading\" style=\"text-align:center\" id=\"mce_0\"><strong>NLPIR SEMINAR 23rd ISSUE COMPLETED<\/strong><\/h2>\n\n\n\n<p>        Last Monday, <strong>Zhaoyou Liu<\/strong> gave a presentation about the paper, <strong>Pay Less Attention with Lightweight and Dynamic Convolutions<\/strong>, and shared some opinions on it.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/04\/slige-2-1024x576.jpg\" alt=\"\" class=\"wp-image-6870\" srcset=\"http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/04\/slige-2-1024x576.jpg 1024w, http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/04\/slige-2-300x169.jpg 300w, http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/04\/slige-2-768x432.jpg 768w, http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/04\/slige-2-80x45.jpg 80w, http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/04\/slige-2.jpg 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>This paper was published as a conference paper at ICLR 2019.<br>Dynamic convolutions build on lightweight convolutions. The kernel is a function of the current time-step only, as opposed to the entire context as in self-attention. 
This approach is similar to location-based attention, which does not access the context to determine attention weights.<br>The experiments show that dynamic convolutions perform as well as or better than self-attention while requiring less time.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>NLPIR SEMINAR Y2019#10 INTRO In the new  &hellip; <a href=\"http:\/\/www.nlpir.org\/wordpress\/2019\/04\/14\/pay-less-attention-with-lightweight-and-dynamic-convolutions\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":862,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37,38],"tags":[],"_links":{"self":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/6849"}],"collection":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/users\/862"}],"replies":[{"embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/comments?post=6849"}],"version-history":[{"count":2,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/6849\/revisions"}],"predecessor-version":[{"id":6871,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/6849\/revisions\/6871"}],"wp:attachment":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/media?parent=6849"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/categories?post=6849"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/tags?post=6849"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}