To What Extent Do LLMs Understand a Verdict? A Case Study on Traffic Accident Information Extraction

EasyChair Preprint 15243 • 10 pages • Date: October 18, 2024

Abstract

This study explores the application of large language models (LLMs) to information extraction from traffic accident rulings in Taiwan. We compared large-parameter models such as GPT and GEMINI with smaller-parameter models such as LLAMA-8B, and designed three types of prompts (basic, advanced, and one-shot) to evaluate each model's performance under different scenarios.

The results show that prompt choice significantly affects the performance of the various models, likely because of differences in their ability to handle long texts. Specifically, GPT performed significantly better on string-based tasks with the one-shot prompt, which includes more context, than with the other prompts, achieving an average accuracy of 89.2%. For the GEMINI model, however, longer prompts reduced performance, particularly on lengthy texts, suggesting that the model has limited capacity for processing overly long prompts. This indicates that the compatibility between prompt design and model architecture plays a crucial role in performance.

The fine-tuning results demonstrate that GPT improved significantly in extracting both string and numerical data, with post-tuning accuracy reaching 97.9% and 95.3% for the "depreciation method" and "repair cost" fields, respectively. In contrast, although the pre-tuned Chinese-LLAMA initially performed well, it showed limited improvement after fine-tuning, indicating lower responsiveness to fine-tuning. On the other hand, prototype models such as instruct-LLAMA showed substantial accuracy improvements on string data after fine-tuning, rising from 63.7% to 79.8%.

In summary, prompt design and fine-tuning strategy are key factors in improving model performance. Future optimizations can be achieved through larger-scale models and more sophisticated fine-tuning techniques, further enhancing the application of LLMs in specific domains.

Keyphrases: Information Extraction, LLM fine-tuning, Legal judgment documents, data annotation, traffic accidents
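The abstract describes the one-shot prompt only at a high level. The following is a minimal sketch of what such an extraction prompt could look like, assuming an OpenAI-style chat API; the field names, the example verdict snippet, the JSON output format, and the model name are illustrative assumptions, not the authors' actual prompt.

```python
# Sketch of a one-shot extraction prompt for traffic accident rulings.
# Field names, example text, and model name are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

EXAMPLE_INPUT = (
    "Verdict excerpt: ... the repair cost was NT$85,000, depreciated by the "
    "fixed-percentage method ..."
)
EXAMPLE_OUTPUT = '{"repair_cost": 85000, "depreciation_method": "fixed-percentage"}'

def extract_fields(verdict_text: str) -> str:
    """Ask the model to return the target fields of one verdict as JSON."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the paper only says "GPT"
        messages=[
            {"role": "system",
             "content": "You extract structured fields from Taiwanese traffic "
                        "accident rulings. Answer with JSON only."},
            # The single worked example that makes this a one-shot prompt.
            {"role": "user", "content": EXAMPLE_INPUT},
            {"role": "assistant", "content": EXAMPLE_OUTPUT},
            {"role": "user", "content": f"Verdict excerpt: {verdict_text}"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```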
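Likewise, the fine-tuning setup is not detailed in the abstract. A minimal sketch of one supervised fine-tuning record, assuming chat-format JSONL built from (verdict excerpt, gold fields) pairs, is shown below; the field names and example values are hypothetical, not the paper's training data.

```python
# Sketch of one fine-tuning record in chat-format JSONL.
# Field names and values are illustrative assumptions.
import json

record = {
    "messages": [
        {"role": "system",
         "content": "Extract repair cost and depreciation method as JSON."},
        {"role": "user",
         "content": "Verdict excerpt: ... repair cost NT$62,300, depreciated "
                    "by the straight-line method ..."},
        {"role": "assistant",
         "content": '{"repair_cost": 62300, "depreciation_method": "straight-line"}'},
    ]
}

# Each annotated verdict becomes one line of the training file.
with open("finetune_train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```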