{"id":74,"date":"2017-08-03T00:00:00","date_gmt":"2017-08-02T16:00:00","guid":{"rendered":""},"modified":"2018-12-14T11:00:44","modified_gmt":"2018-12-14T03:00:44","slug":"incorporating-new-words-detection-with-chinese-word-segmentation","status":"publish","type":"post","link":"http:\/\/www.nlpir.org\/wordpress\/2017\/08\/03\/incorporating-new-words-detection-with-chinese-word-segmentation\/","title":{"rendered":"Incorporating New Words Detection with Chinese Word Segmentation"},"content":{"rendered":"<p><P><SPAN style=\"FONT-FAMILY: 'Times New Roman'; FONT-SIZE: 10pt; mso-fareast-font-family: \u5b8b\u4f53; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA\" lang=EN-US><br \/>\n<TABLE style=\"WIDTH: 651pt; BORDER-COLLAPSE: collapse\" border=0 cellSpacing=0 cellPadding=0 width=868 x:str><br \/>\n<COLGROUP><br \/>\n<COL style=\"WIDTH: 651pt; mso-width-source: userset; mso-width-alt: 27776\" width=868><br \/>\n<TBODY><br \/>\n<TR style=\"HEIGHT: 27pt\" height=36><br \/>\n<TD style=\"BORDER-BOTTOM: #ffffff; BORDER-LEFT: #ffffff; BACKGROUND-COLOR: transparent; WIDTH: 651pt; HEIGHT: 27pt; BORDER-TOP: #ffffff; BORDER-RIGHT: #ffffff\" class=xl24 height=36 width=868>Hua-Ping ZHANG, Jian GAO, Qian MO, He-Yan HUANG. Incorporating New Words Detection with Chinese Word Segmentation. In Proceedings of CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP 2010). Beijing, China. 2010.8. p249-251.<\/TD><\/TR><\/TBODY><\/TABLE><\/SPAN><\/P><SPAN style=\"FONT-FAMILY: 'Times New Roman'; FONT-SIZE: 10pt; mso-fareast-font-family: \u5b8b\u4f53; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA\" lang=EN-US><br \/>\n<P style=\"MARGIN: 12pt 0cm\" class=AbstractHeading><SPAN lang=EN-US><STRONG><FONT size=3>Abstract<\/FONT><\/STRONG><\/SPAN><\/P><br \/>\n<P><SPAN style=\"FONT-FAMILY: 'Times New Roman'; FONT-SIZE: 10pt; mso-fareast-font-family: \u5b8b\u4f53; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; 
mso-bidi-language: AR-SA\" lang=EN-US>With advances in Chinese word segmentation, in-vocabulary word segmentation and named entity recognition have achieved state-of-the-art performance. However, new words have become the bottleneck for Chinese word segmentation. <\/SPAN><SPAN style=\"FONT-FAMILY: 'Times New Roman'; FONT-SIZE: 10pt; mso-fareast-font-family: 'MS Mincho'; mso-ansi-language: EN-US; mso-fareast-language: DE; mso-bidi-language: AR-SA\" lang=EN-US>This paper presents <\/SPAN><SPAN style=\"FONT-FAMILY: 'Times New Roman'; FONT-SIZE: 10pt; mso-fareast-font-family: \u5b8b\u4f53; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA\" lang=EN-US>the results from Beijing Institute of Technology (BIT) in <\/SPAN><SPAN style=\"FONT-FAMILY: 'Times New Roman'; FONT-SIZE: 10pt; mso-fareast-font-family: \u5b8b\u4f53; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA; mso-bidi-font-size: 11.0pt\" lang=EN-US>the Sixth International Chinese Word Segmentation Bakeoff in 2010. It first reviews the problems that new words cause in Chinese texts, then introduces the new word detection algorithm. The final section reports the official evaluation results from this bakeoff and gives conclusions. 
<\/SPAN><\/SPAN><\/P><br \/>\n<P><SPAN style=\"FONT-FAMILY: 'Times New Roman'; FONT-SIZE: 10pt; mso-fareast-font-family: \u5b8b\u4f53; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA\" lang=EN-US><A href=\"http:\/\/www.nlpir.org\/wordpress\/attachments\/2011\/04\/WordSegmentation-BIT0723.pdf\" target=_blank><IMG border=0 src=\"http:\/\/www.nlpir.org\/images\/base\/attachment.gif\"> WordSegmentation-BIT0723.pdf (80.5 KB)<\/A><\/SPAN><\/P><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hua-Ping ZHANG, Jian GAO, Qian MO, He-Yan H &hellip; <a href=\"http:\/\/www.nlpir.org\/wordpress\/2017\/08\/03\/incorporating-new-words-detection-with-chinese-word-segmentation\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[31],"tags":[],"_links":{"self":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/74"}],"collection":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/comments?post=74"}],"version-history":[{"count":1,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/74\/revisions"}],"predecessor-version":[{"id":1510,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/posts\/74\/revisions\/1510"}],"wp:attachment":[{"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/media?parent=74"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\/wp-json\/wp\/v2\/categories?post=74"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.nlpir.org\/wordpress\
/wp-json\/wp\/v2\/tags?post=74"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}