Research on Document-Level Person Relation Extraction in Chinese

EasyChair Preprint 15202
14 pages • Date: October 6, 2024

Abstract

This study aims to develop a joint entity-relation extraction framework that can be applied to real-world web data. Existing datasets are often derived from a single source and focus primarily on sentence-level content. To address these limitations, we use large language models (such as Gemini and GPT-3.5) to annotate article-level content and build a more general dataset from Chinese Common Crawl data. To improve the reliability of the annotations and the completeness of entity pair sampling, we employ cross-validation and entity augmentation methods. We also fine-tune pre-trained models to validate and improve entity-relation extraction performance in real-world scenarios.

Keyphrases: named entity recognition (命名實體識別), document-level relation extraction (文章級關係擷取), joint entity-relation extraction (聯合實體關係擷取)