Evaluating the Efficacy of Large Language Models in Guiding Treatment Decisions for Pediatric Refractive Error.


Authors: Jia Feng, Andrzej Grzybowski, Kai Jin, Daohuan Kang, Wenyue Shen, Wen Sun, Hongkang Wu, Lu Yuan, Jiao Zhan

Language: eng


Publication info: England: Ophthalmology and Therapy, 2025


Collection: NCBI

ID: 722169


INTRODUCTION: Effective management of pediatric myopia, which includes treatments such as corrective lenses and low-dose atropine, requires accurate clinical decisions. However, the complexity of pediatric refractive data, including variations in visual acuity, axial length, and patient-specific factors, poses challenges to determining the optimal treatment. This study aims to evaluate the performance of three large language models in analyzing these refractive data. METHODS: A dataset of 100 pediatric refractive records, including parameters such as visual acuity and axial length, was analyzed using ChatGPT-3.5, ChatGPT-4o, and Wenxin Yiyan. Each model was tasked with determining whether intervention was needed and then recommending a treatment (eyeglasses, orthokeratology lens, or low-dose atropine). The recommendations were compared with the consensus of professional optometrists, rated on a 1-5 Global Quality Score (GQS) scale, and evaluated for clinical safety using a three-tier accuracy assessment. RESULTS: ChatGPT-4o outperformed both ChatGPT-3.5 and Wenxin Yiyan in determining intervention needs, with an accuracy of 90%, significantly higher than Wenxin Yiyan (p < 0.05). It also achieved the highest GQS (4.4 ± 0.55), surpassing the other models (p < 0.001), with 85% of responses rated as "good", ahead of ChatGPT-3.5 (82%) and Wenxin Yiyan (74%). ChatGPT-4o made only eight errors in recommending interventions, fewer than ChatGPT-3.5 (12) and Wenxin Yiyan (15). Additionally, it performed better with incomplete or abnormal data, maintaining higher quality scores. CONCLUSION: ChatGPT-4o showed better accuracy and clinical safety, making it a promising tool for decision support in pediatric ophthalmology, although expert oversight is still necessary.
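
The two headline metrics in the abstract (agreement with the optometrists' consensus and the GQS summary) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the toy data, the use of the sample standard deviation, and the cutoff of 4 for a "good" response are not taken from the study, which does not state these details in the abstract.

```python
from statistics import mean, stdev

# Hypothetical cutoff: the abstract does not say which GQS values count as "good".
GOOD_THRESHOLD = 4


def accuracy(predictions, consensus):
    """Fraction of records where the model's decision matches the consensus."""
    if len(predictions) != len(consensus):
        raise ValueError("predictions and consensus must have equal length")
    matches = sum(p == c for p, c in zip(predictions, consensus))
    return matches / len(consensus)


def summarize_gqs(scores):
    """Return (mean, sample standard deviation, share of scores rated 'good')."""
    good_share = sum(s >= GOOD_THRESHOLD for s in scores) / len(scores)
    return mean(scores), stdev(scores), good_share


# Toy data, invented for this example only (not the study's records).
consensus_needs_intervention = [True, True, False, True, False]
model_needs_intervention = [True, False, False, True, False]
toy_gqs = [5, 4, 4, 3, 5]

acc = accuracy(model_needs_intervention, consensus_needs_intervention)
gqs_mean, gqs_sd, gqs_good = summarize_gqs(toy_gqs)
print(f"accuracy={acc:.2f}, GQS={gqs_mean:.2f} ± {gqs_sd:.2f}, good={gqs_good:.0%}")
```

Applied to a real dataset, the same two functions would be run once per model, with the consensus labels held fixed so that the models are compared against the same reference.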
