
A Hybrid Generative/Discriminative Model for Rapid Prototyping of Domain-Specific Named Entity Recognition

EasyChair Preprint 856

16 pages. Date: March 26, 2019

Abstract

We propose PYHSCRF, a novel tagger for domain-specific named entity recognition that only requires a few seed terms, in addition to unannotated corpora, and thus permits the iterative and incremental design of named entity (NE) classes for new domains. The proposed model is a hybrid of a generative model named PYHSMM and a semi-Markov CRF-based discriminative model, which play complementary roles in generalizing seed terms and in distinguishing between NE chunks and non-NE words. It also allows a smooth transition to full-scale annotation because the discriminative model makes effective use of annotated data when available. Experiments involving two languages and three domains demonstrate that the proposed method outperforms baselines.

Keyphrases: Bayesian model, Named Entity Recognition, weakly supervised learning

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:856,
  author    = {Suzushi Tomori and Yugo Murawaki and Shinsuke Mori},
  title     = {A Hybrid Generative/Discriminative Model for Rapid Prototyping of Domain-Specific Named Entity Recognition},
  doi       = {10.29007/2zkv},
  howpublished = {EasyChair Preprint 856},
  year      = {EasyChair, 2019}}