Multimodal large language models address clinical queries in laryngeal cancer surgery: a comparative evaluation of image interpretation across different models.

Authors: Yifan Gao, Bingyu Liang, Qin Wang, Taibao Wang, Lei Zhang

Language: eng

Classification number:

Publication information: United States: International journal of surgery (London, England), 2025

Physical description:

Collection: NCBI

ID: 701517

BACKGROUND AND OBJECTIVES: Recent advances in multimodal large language models (MLLMs) have shown promise in medical image interpretation, yet their utility in surgical contexts remains unexplored. This study evaluates the performance of six MLLMs in interpreting diverse imaging modalities for laryngeal cancer surgery. METHODS: We analyzed 169 images (X-rays, CT scans, laryngoscopy, and pathology findings) from 50 patients using six state-of-the-art MLLMs. Two independent physicians assessed model performance across 1084 clinically relevant questions. RESULTS: Claude 3.5 Sonnet achieved the highest accuracy (79.43%, 95% CI: 77.02%-81.84%). Performance varied significantly across imaging modalities and between commercial and open-source models, with a 19-percentage-point gap between the best commercial and the best open-source model. CONCLUSION: Advanced MLLMs show promise as clinical decision support tools in laryngeal cancer surgery, while the performance variations point to a need for specialized model development and clinical workflow integration. Future research should focus on developing specialized MLLMs trained on large-scale, multi-center laryngeal cancer datasets.
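
As a rough sanity check on the reported interval, the sketch below computes a normal-approximation (Wald) confidence interval for a proportion. The abstract does not say how the interval was calculated, so the Wald method is an assumption here, as is the count of 861 correct answers, which is only inferred from 79.43% of 1084 questions. The function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(correct, total, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Returns (point estimate, lower bound, upper bound) as fractions.
    """
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


# Assumption: 861 of 1084 answers were correct. This count is inferred
# from the reported 79.43% accuracy and is not stated in the abstract.
accuracy, lower, upper = wald_interval(861, 1084)
print(f"accuracy={accuracy:.2%}, 95% CI=({lower:.2%}, {upper:.2%})")
```

Under these assumptions the bounds land within a few hundredths of a percentage point of the reported 77.02%-81.84%, which suggests, but does not confirm, that a similar approximation was used.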