American Academy of Orthopedic Surgery OrthoInfo provides more readable information regarding meniscus injury than ChatGPT-4 while information accuracy is comparable.


Authors: Camden Bohn, Brian Forsythe, Miguel Girod-Hoffman, Catherine Hand, Aaron Krych, Yining Lu, Sami Saniei, Shadia Tannir, Marisa Ulrich

Language: English


Publication: England: Journal of ISAKOS: Joint Disorders & Orthopaedic Sports Medicine, 2025


Collection: NCBI

ID: 643817

INTRODUCTION: Over 61% of Americans seek health information online, often using artificial intelligence (AI) tools such as ChatGPT. However, concerns persist about the readability and accessibility of AI-generated content, especially for individuals with varying health literacy levels. This study compares the readability and accuracy of ChatGPT responses on meniscus injuries with those from the American Academy of Orthopedic Surgeons' OrthoInfo website, which is tailored for patient education. We hypothesized that while ChatGPT offers accurate information, its readability would be lower than that of OrthoInfo.

METHODS: Seven frequently asked questions about meniscus injuries were used to compare responses from ChatGPT-4 and OrthoInfo. Readability was assessed using multiple calculators (Flesch-Kincaid, Gunning Fog, Coleman-Liau, SMOG Readability Formula, FORCAST Readability Formula, Fry Graph, Raygor Readability Estimate), and accuracy was evaluated by three independent reviewers on a 4-point scale. Statistical analysis included independent t-tests to compare readability and accuracy between the two sources.
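To illustrate two of the readability metrics named above, the following is a minimal sketch of the Flesch Reading Ease Index and the Flesch-Kincaid Grade Level, using their standard published formulas. The vowel-group syllable counter is a naive assumption for illustration; the study's calculators likely use more robust syllable detection.

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count groups of consecutive vowels and drop a
    # silent trailing 'e'; every word gets at least one syllable.
    word = word.lower()
    if word.endswith("e") and not word.endswith(("le", "ee")):
        word = word[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", word)))

def readability(text: str) -> tuple[float, float]:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    wps = n_words / sentences      # average words per sentence
    spw = syllables / n_words      # average syllables per word
    # Standard Flesch formulas:
    reading_ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade_level = 0.39 * wps + 11.8 * spw - 15.59
    return reading_ease, grade_level

if __name__ == "__main__":
    sample = ("A meniscus tear is a common knee injury. "
              "Rest, ice, and physical therapy often help.")
    ease, grade = readability(sample)
    print(f"Flesch Reading Ease: {ease:.1f}, Grade Level: {grade:.1f}")
```

Higher Reading Ease scores indicate easier text, while the Grade Level approximates the years of US schooling needed to comprehend it, which is how the 13.8 vs. 9.8 comparison below should be read.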
RESULTS: ChatGPT responses required a significantly higher education level to comprehend, with an average reading grade level of 13.8 compared with 9.8 for OrthoInfo (p < 0.01). The Flesch Reading Ease Index also indicated lower readability for ChatGPT (32.0 vs. 59.9, p < 0.01). However, both ChatGPT and OrthoInfo responses were highly accurate, with all but one ChatGPT response receiving the highest accuracy rating of 4; the response on physical exam findings was rated slightly lower, though the difference was not statistically significant (3.3 vs. 3.6, p = 0.52).

CONCLUSION: While AI-generated responses were accurate, their readability made them less accessible than OrthoInfo, which is written for a broad audience. This study underscores the importance of clear, accessible information on meniscal injuries and suggests that AI tools should incorporate readability metrics to enhance patient comprehension. Despite the potential of AI, resources such as OrthoInfo remain essential for communicating health information effectively to the public.

LEVEL OF EVIDENCE: IV.
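The independent t-test comparison named in METHODS can be sketched as follows. The per-question grade-level values are hypothetical placeholders for the seven questions, not the study's actual data:

```python
from scipy import stats

# Hypothetical grade-level scores for the seven questions
# (illustrative placeholders only; not the study's measurements).
chatgpt_grades = [14.1, 13.2, 13.9, 14.5, 13.0, 14.2, 13.7]
orthoinfo_grades = [10.2, 9.5, 9.9, 10.0, 9.4, 9.8, 9.8]

# Independent two-sample t-test comparing the two sources.
t_stat, p_value = stats.ttest_ind(chatgpt_grades, orthoinfo_grades)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```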