NLPIR SEMINAR Y2019#37
In the new semester, our lab, the Web Search Mining and Security Lab, plans to hold an academic seminar every Monday. Each time, a keynote speaker will share his or her understanding of papers related to his or her research.
Tomorrow’s seminar is organized as follows:
- The seminar starts at 1:20 pm on Monday, November 18, 2019, at Zhongguancun Technology Park, Building 5, 1306.
- Nada will give a presentation on the paper Weakly-supervised Relation Extraction by Pattern-enhanced Embedding Learning (WWW 2018, April 23-27, 2018, Lyon, France).
- XuQian will give a lecture about a review paper.
- The seminar will be hosted by Baohua Zhang.
Everyone interested in this topic is welcome to join us.
Weakly-supervised Relation Extraction by Pattern-enhanced Embedding Learning
Meng Qu, Xiang Ren, Yu Zhang, Jiawei Han
Extracting relations from text corpora is an important task with wide applications. However, it becomes particularly challenging when focusing on weakly-supervised relation extraction, that is, utilizing a few relation instances (i.e., a pair of entities and their relation) as seeds to extract from corpora more instances of the same relation. Existing distributional approaches leverage the corpus-level co-occurrence statistics of entities to predict their relations, and require a large number of labeled instances to learn effective relation classifiers. Alternatively, pattern-based approaches perform bootstrapping or apply neural networks to model the local contexts, but still rely on a large number of labeled instances to build reliable models. In this paper, we study the integration of distributional and pattern-based methods in a weakly-supervised setting such that the two kinds of methods can provide complementary supervision for each other to build an effective, unified model. We propose a novel co-training framework with a distributional module and a pattern module. During training, the distributional module helps the pattern module discriminate between the informative patterns and other patterns, and the pattern module generates some highly-confident instances to improve the distributional module. The whole framework can be effectively optimized by iterating between improving the pattern module and updating the distributional module. We conduct experiments on two tasks: knowledge base completion with text corpora and corpus-level relation extraction. Experimental results prove the effectiveness of our framework over many competitive baselines.
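For attendees who would like some intuition before the talk, the alternating loop described in the abstract can be sketched roughly as below. This is a toy illustration only, not the authors' model: the 2-d embeddings, the tiny corpus, the cosine-over-offset scoring, and the names `co_train`, `relation_vector`, and `threshold` are all assumptions made up for this example.

```python
import math

# Toy 2-d "embeddings"; every value here is invented for illustration.
EMB = {
    "Paris": (1.0, 0.0), "France": (1.0, 1.0),
    "Rome": (2.0, 0.1), "Italy": (2.0, 1.1),
    "Tokyo": (3.0, 0.0), "Japan": (3.1, 1.0),
    "Lyon": (0.2, 0.3),
    "Jobs": (1.0, 3.0), "Apple": (0.0, 3.0),
}

# Toy corpus: (head, tail) entity pair -> textual patterns seen between them.
CORPUS = {
    ("Paris", "France"): {"capital of"},
    ("Rome", "Italy"): {"capital of"},
    ("Tokyo", "Japan"): {"capital of"},
    ("Lyon", "France"): {"city in"},
    ("Jobs", "Apple"): {"founder of"},
}

def offset(pair):
    """Embedding offset tail - head, used here as a stand-in relation signal."""
    head, tail = pair
    return tuple(t - h for h, t in zip(EMB[head], EMB[tail]))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return dot / norm if norm else 0.0

def relation_vector(instances):
    """'Distributional module' (assumed form): mean offset of known instances."""
    offsets = [offset(p) for p in instances]
    return tuple(sum(c) / len(offsets) for c in zip(*offsets))

def co_train(seeds, rounds=3, threshold=0.9):
    instances = set(seeds)
    for _ in range(rounds):
        # 1. Update the distributional module from the current instances.
        rel = relation_vector(instances)
        score = {pair: cosine(offset(pair), rel) for pair in CORPUS}
        # 2. The distributional scores tell the pattern module which
        #    patterns are informative (mean score of the pairs they match).
        by_pattern = {}
        for pair, patterns in CORPUS.items():
            for pat in patterns:
                by_pattern.setdefault(pat, []).append(score[pair])
        reliable = {pat for pat, s in by_pattern.items()
                    if sum(s) / len(s) >= threshold}
        # 3. The pattern module proposes instances matched by reliable
        #    patterns, which feed the next distributional update.
        proposed = {pair for pair, pats in CORPUS.items() if pats & reliable}
        if proposed <= instances:
            break  # nothing new was found, so stop iterating
        instances |= proposed
    return instances

result = co_train({("Paris", "France")})
```

Starting from the single seed pair, the loop in this toy setup keeps the "capital of" pattern and adds the other two capital pairs, while "city in" and "founder of" stay below the assumed threshold. The paper itself learns both modules jointly; the fixed threshold and averaged offsets here are simplifications chosen only to keep the example short.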