Summarize and Paste: Enhancing Neural Machine Translation via In-Context Learning with Automatic Text Summarization

EasyChair Preprint 15358

6 pages • Date: November 3, 2024

Jillaphat Jaroenkantasima, Prachya Boonkwan, Hutchatai Chanlekha and Manabu Okumura

Abstract

Machine translation, while revolutionary, struggles with coherence in long-form content, especially for low-resource languages. We introduce 'Summarize and Paste,' a novel approach combining advanced summarization with large language models (LLMs) to significantly enhance translation quality. This method provides LLMs with concise, abstractive summaries as additional context, capturing essential information often lost in traditional translation. Across English to Thai, Japanese, Chinese, and Spanish translations, we achieve remarkable improvements. For English-Thai, a low-resource pair, our method yields a 44.0% increase in BLEU score over state-of-the-art baselines. Our innovative tri-text integration, combining original text, summary, and preliminary translation, further boosts BLEU scores by 12.7% across all language pairs. This work not only enhances translation accuracy but also improves contextual understanding in document-level translation. It opens new avenues for leveraging summarization in NLP and provides crucial insights into LLMs' context-aware translation capabilities, with far-reaching implications for cross-lingual communication.
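
To make the idea in the abstract concrete, the following is a minimal sketch of how a summary and a preliminary translation could be placed into an LLM prompt as additional context. It is an illustration only: the function name `build_prompt`, its parameters, and the prompt wording are assumptions made here and are not taken from the paper, whose actual prompt templates are not given on this page. The sketch only assembles the prompt text; producing the summary, the preliminary translation, and the final LLM call are left out.

```python
from typing import Optional


def build_prompt(source: str, target_lang: str,
                 summary: Optional[str] = None,
                 draft: Optional[str] = None) -> str:
    """Assemble a translation prompt with optional extra context.

    Hypothetical template: the wording below is an assumption,
    not the prompt used by the authors.
    """
    parts = [f"Translate the following English text into {target_lang}."]
    if summary:
        # Summary of the whole document, supplied as context only.
        parts.append(f"Summary of the document (context only):\n{summary}")
    if draft:
        # Preliminary translation, the third text in the tri-text setup.
        parts.append(f"Preliminary translation (may contain errors):\n{draft}")
    parts.append(f"Text to translate:\n{source}")
    parts.append("Translation:")
    return "\n\n".join(parts)


# Summary-only prompt (source + summary).
summary_prompt = build_prompt("The bank raised rates.", "Thai",
                              summary="A report on monetary policy.")

# Tri-text prompt (source + summary + preliminary translation).
tri_text_prompt = build_prompt("The bank raised rates.", "Thai",
                               summary="A report on monetary policy.",
                               draft="<preliminary translation here>")
```

Keeping the summary and the preliminary translation as optional arguments makes it straightforward to compare a plain prompt, a summary-augmented prompt, and a tri-text prompt on the same source text.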

Keyphrases: Natural Language Processing, Summarization, large language models, machine translation

Links:

https://easychair.org/publications/preprint/tHcx

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:15358,
  author    = {Jillaphat Jaroenkantasima and Prachya Boonkwan and Hutchatai Chanlekha and Manabu Okumura},
  title     = {Summarize and Paste: Enhancing Neural Machine Translation via In-Context Learning with Automatic Text Summarization},
  howpublished = {EasyChair Preprint 15358},
  year      = {EasyChair, 2024}}
