A Hybrid Generative/Discriminative Model for Rapid Prototyping of Domain-Specific Named Entity Recognition

EasyChair Preprint 856
16 pages • Date: March 26, 2019

Abstract

We propose PYHSCRF, a novel tagger for domain-specific named entity recognition that only requires a few seed terms, in addition to unannotated corpora, and thus permits the iterative and incremental design of named entity (NE) classes for new domains. The proposed model is a hybrid of a generative model named PYHSMM and a semi-Markov CRF-based discriminative model, which play complementary roles in generalizing seed terms and in distinguishing between NE chunks and non-NE words. It also allows a smooth transition to full-scale annotation because the discriminative model makes effective use of annotated data when available. Experiments involving two languages and three domains demonstrate that the proposed method outperforms baselines.

Keyphrases: Bayesian model, Named Entity Recognition, weakly supervised learning