Managing class imbalance in the training of a large language model to predict patient selection for total knee arthroplasty: Results from the Artificial intelligence to Revolutionise the patient Care pathway in Hip and knEe aRthroplastY (ARCHERY) project.

0 Người đánh giá. Xếp hạng trung bình 0

Tác giả: Lesley Anderson, Luke Farrow, Mingjun Zhong

Ngôn ngữ: eng

Ký hiệu phân loại: 334.6068 Producers cooperatives

Thông tin xuất bản: Netherlands : The Knee , 2025

Mô tả vật lý:

Bộ sưu tập: NCBI

ID: 728700

Thêm vào giỏ Liên kết toàn văn

INTRODUCTION: This study set out to test the efficacy of different techniques used to manage to class imbalance, a type of data bias, in application of a large language model (LLM) to predict patient selection for total knee arthroplasty (TKA). METHODS: This study utilised data from the Artificial Intelligence to Revolutionise the Patient Care Pathway in Hip and Knee Arthroplasty (ARCHERY) project (ISRCTN18398037). Data included the pre-operative radiology reports of patients referred to secondary care for knee-related complaints from within the North of Scotland. A clinically based LLM (GatorTron) was trained regarding prediction of selection for TKA. Three methods for managing class imbalance were assessed: a standard model, use of class weighting, and majority class undersampling. RESULTS: A total of 7707 individual knee radiology reports were included (dated from 2015 to 2022). The mean text length was 74 words (range 26-275). Only 910/7707 (11.8%) patients underwent TKA surgery (the designated 'minority class'). Class weighting technique performed better for minority class discrimination and calibration compared with the other two techniques (Recall 0.61/AUROC 0.73 for class weighting compared with 0.54/0.70 and 0.59/0.72 for the standard model and majority class undersampling, respectively. There was also significant data loss for majority class undersampling when compared with class-weighting. CONCLUSION: Use of class-weighting appears to provide the optimal method of training a an LLM to perform analytical tasks on free-text clinical information in the face of significant data bias ('class imbalance'). Such knowledge is an important consideration in the development of high-performance clinical AI models within Trauma and Orthopaedics.

Tạo bộ sưu tập với mã QR