{"id":6987,"date":"2019-06-30T21:10:21","date_gmt":"2019-06-30T13:10:21","guid":{"rendered":"http:\/\/www.nlpir.org\/wordpress\/?p=6987"},"modified":"2019-07-07T21:24:02","modified_gmt":"2019-07-07T13:24:02","slug":"automatic-spelling-correction-for-resource-scarce-languages-using-deep-learning","status":"publish","type":"post","link":"http:\/\/www.nlpir.org\/wordpress\/2019\/06\/30\/automatic-spelling-correction-for-resource-scarce-languages-using-deep-learning\/","title":{"rendered":"Automatic Spelling Correction for Resource-Scarce Languages using Deep Learning"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\" style=\"text-align:center\"><strong>NLPIR SEMINAR Y2019#<\/strong>21<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">INTRO<\/h3>\n\n\n\n<p>In the new semester, our lab, the Web Search Mining and Security Lab, plans to hold an academic seminar every Monday. Each time, a keynote speaker will share his\/her understanding of papers related to his\/her research.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Arrangement<\/h3>\n\n\n\n<p>This week&#8217;s seminar is organized as follows:<\/p>\n\n\n\n<ol><li>The seminar will be held at 1 p.m. on Monday at Zhongguancun Technology Park, Building 5, 1306.<\/li><li>The lecturer is <strong>Changhe Li<\/strong>; the paper&#8217;s title is <strong>Automatic Spelling Correction for Resource-Scarce Languages using Deep Learning<\/strong>.<\/li><li>Zhaoyang Wang will give a presentation of his work.<\/li><li>The seminar will be hosted by ShenLi.<\/li><li>The paper for this seminar is attached; please download it in advance.<\/li><\/ol>\n\n\n\n<p>Everyone interested in this topic is welcome to join us. 
The following is the abstract for this week\u2019s paper.<\/p>\n\n\n\n<div style=\"border:dashed windowtext 1.0pt;padding:1.0pt 4.0pt 1.0pt 4.0pt;\">\n\t<p align=\"center\" style=\"text-align:center;font-weight: bold\">\n\t\tAutomatic Spelling Correction for Resource-Scarce Languages\nusing Deep Learning\n\t<\/p>\n\t<p align=\"center\" style=\"text-align:center;font-size: 0.5em\">\n\t\tPravallika Etoori, Manoj Chinnakotla, Radhika Mamidi\n\t<\/p>\n\t<p align=\"center\" style=\"text-align:center;\">\n\t\tAbstract\n\t<\/p>\n\t<p style=\"text-indent:2em;\">\n\t\tSpelling correction is a well-known task in Natural Language Processing (NLP). Automatic spelling correction is important for many NLP applications like web search engines, text summarization, sentiment analysis etc. Most approaches use parallel data of noisy and correct word mappings from different sources as training data for automatic spelling correction. Indic languages are resource-scarce and do not have such parallel data due to low volume of queries and nonexistence of such prior implementations. In this paper, we show how to build an automatic spelling corrector for resource-scarce languages. We propose a sequence-to-sequence deep learning model which trains end-to-end. We perform experiments on synthetic datasets created for Indic languages, Hindi and Telugu, by incorporating the spelling mistakes committed at character level. 
A comparative evaluation shows that our model is competitive with the existing spell checking and correction techniques for Indic languages.\n        <\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-file aligncenter\"><a href=\"http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/06\/Automatic-Spelling-Correction-for-Resource-Scarce-Languages-using-Deep-Learning.pdf\">Automatic Spelling Correction for Resource-Scarce Languages using Deep Learning<\/a><a href=\"http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/06\/Automatic-Spelling-Correction-for-Resource-Scarce-Languages-using-Deep-Learning.pdf\" class=\"wp-block-file__button\" download>Download<\/a><\/div>\n\n\n\n<!--nextpage-->\n\n\n\n<h2 class=\"wp-block-heading\" style=\"text-align:center\"><strong>NLPIR SEMINAR 34th ISSUE COMPLETED<\/strong><\/h2>\n\n\n\n<p>Last Monday, <strong>Changhe Li<\/strong> gave a presentation on the paper, and Zhaoyang Wang introduced his research work.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/07\/IMG_20190701-1024x768.jpg\" alt=\"\" class=\"wp-image-6996\" srcset=\"http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/07\/IMG_20190701-1024x768.jpg 1024w, http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/07\/IMG_20190701-300x225.jpg 300w, http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/07\/IMG_20190701-768x576.jpg 768w, http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/07\/IMG_20190701-200x150.jpg 200w, http:\/\/www.nlpir.org\/wordpress\/wp-content\/uploads\/2019\/07\/IMG_20190701-80x60.jpg 80w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The paper proposes a character-based Sequence-to-sequence text Correction Model for Indic Languages (SCMIL). The encoder and decoder both use LSTMs. 
A dataset for Hindi and Telugu spelling errors is published on GitHub.<\/p>\n\n\n\n<p>The method is not complex, and the model&#8217;s performance is not particularly strong by today&#8217;s standards.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>NLPIR SEMINAR Y2019#21 INTRO In the new  &hellip; <a href=\"http:\/\/www.nlpir.org\/wordpress\/2019\/06\/30\/automatic-spelling-correction-for-resource-scarce-languages-using-deep-learning\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":862,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37,38],"tags":[],"_links":{"self":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/6987"}],"collection":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/users\/862"}],"replies":[{"embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/comments?post=6987"}],"version-history":[{"count":4,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/6987\/revisions"}],"predecessor-version":[{"id":6998,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/6987\/revisions\/6998"}],"wp:attachment":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/media?parent=6987"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/categories?post=6987"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/tags?post=6987"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}