RAMIE: retrieval-augmented multi-task information extraction with large language models on dietary supplements.

Authors: Mingchen Li, Zaifu Zhan, Rui Zhang, Shuang Zhou

Language: eng

Classification number: 331.7 Labor by industry and occupation

Publication information: England : Journal of the American Medical Informatics Association : JAMIA, 2025

Physical description:

Collection: NCBI

ID: 176653

OBJECTIVE: To develop an advanced multi-task large language model (LLM) framework for extracting diverse types of information about dietary supplements (DSs) from clinical records.

METHODS: We focused on 4 core DS information extraction tasks: named entity recognition (2 949 clinical sentences), relation extraction (4 892 sentences), triple extraction (2 949 sentences), and usage classification (2 460 sentences). To address these tasks, we introduced the retrieval-augmented multi-task information extraction (RAMIE) framework, which incorporates: (1) instruction fine-tuning with task-specific prompts; (2) multi-task training of LLMs to enhance storage efficiency and reduce training costs; and (3) retrieval-augmented generation, which retrieves similar examples from the training set to improve task performance. We compared the performance of RAMIE to LLMs with instruction fine-tuning alone and conducted an ablation study to evaluate the individual contributions of multi-task learning and retrieval-augmented generation to overall performance improvements.

RESULTS: Using the RAMIE framework, Llama2-13B achieved an F1 score of 87.39 on the named entity recognition task, reflecting a 3.51% improvement. It also excelled in the relation extraction task with an F1 score of 93.74, a 1.15% improvement. For the triple extraction task, Llama2-7B achieved an F1 score of 79.45, representing a significant 14.26% improvement. MedAlpaca-7B delivered the highest F1 score of 93.45 on the usage classification task, with a 0.94% improvement. The ablation study highlighted that while multi-task learning improved efficiency with a minor trade-off in performance, the inclusion of retrieval-augmented generation significantly enhanced overall accuracy across tasks.

CONCLUSION: The RAMIE framework demonstrates substantial improvements in multi-task information extraction for DS-related data from clinical records.
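
The retrieval step described in METHODS (fetching similar training examples and placing them in a task-specific prompt) can be illustrated with a minimal sketch. The abstract does not say which retriever, similarity measure, or prompt wording RAMIE uses, so everything below is an assumption for illustration: the bag-of-words cosine similarity, the `INSTRUCTIONS` templates, and the function names `retrieve` and `build_prompt` are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical instruction templates, one per task named in the abstract.
# The actual RAMIE prompts are not given in the source.
INSTRUCTIONS = {
    "ner": "Extract all dietary supplement entities from the sentence.",
    "re": "Identify the relation between the marked entities in the sentence.",
    "triple": "Extract (head, relation, tail) triples from the sentence.",
    "usage": "Classify the usage status of the dietary supplement in the sentence.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, train_set, k=2):
    """Return the k training examples whose input is most similar to the query.

    Assumption: a simple bag-of-words retriever stands in for whatever
    retriever the authors actually used.
    """
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        train_set,
        key=lambda ex: cosine(query_vec, Counter(tokenize(ex["input"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task, query, train_set, k=2):
    """Assemble instruction, retrieved demonstrations, and the query."""
    lines = [INSTRUCTIONS[task], ""]
    for ex in retrieve(query, train_set, k):
        lines += [f"Input: {ex['input']}", f"Output: {ex['output']}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)
```

Calling `build_prompt("ner", sentence, train_set)` would yield a single string that begins with the task instruction, lists the retrieved demonstrations, and ends with the unanswered query, ready to be passed to a fine-tuned model. Because all four tasks share one prompt format that differs only in the instruction, a single model can in principle be trained on all of them, which is the storage and training-cost motivation the abstract gives for multi-task training.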