Vertical federated learning based on data subset representation for healthcare application.

Authors: Gangyong Jia, Miaoqi Li, Yukun Shi, Meiting Xue, Qihong Yu, Yan Zeng, Jilin Zhang

Language: eng

Classification number: 622.159 Other methods of prospecting

Publication information: Ireland : Computer methods and programs in biomedicine, 2025

Physical description:

Collection: NCBI

ID: 116374

BACKGROUND AND OBJECTIVE: Artificial intelligence is increasingly essential for disease classification and clinical diagnosis in healthcare. Because healthcare data are subject to strict privacy requirements, Vertical Federated Learning (VFL) has been introduced. VFL allows multiple hospitals to collaboratively train models on vertically partitioned data, where each hospital holds only part of each patient's features, thus maintaining patient confidentiality. However, applying VFL in healthcare scenarios with few samples and labels is challenging, because existing methods depend heavily on labeled samples and do not consider the intrinsic connections among the data held by different hospitals.

METHODS: This paper proposes FedRL, a representation-based VFL method that improves downstream task performance by using aligned data for federated representation pretraining. The method splits each hospital's local data into subsets with the same feature dimension, exploits the relationships among these subsets, constructs a bespoke loss function, and collaboratively trains a representation model on these subsets across all participating hospitals. The model captures latent representations of the global data, which are then applied to downstream classification tasks.

RESULTS AND CONCLUSION: FedRL was validated through experiments on three healthcare datasets. The results show that it outperforms several existing methods on three performance metrics, with average improvements of 4.7%, 5.6%, and 4.8% in accuracy, AUC, and F1-score, respectively. FedRL is also more robust and performs more consistently when labeled samples are limited, supporting its effectiveness and potential use in healthcare data analysis.
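The subset construction described under METHODS can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the FedRL implementation: the abstract does not specify the loss function or the encoder, so `split_into_subsets`, `subset_agreement_loss`, the linear encoder, the overlap rule for leftover columns, and the hospital feature counts are all hypothetical stand-ins.

```python
import numpy as np


def split_into_subsets(features, subset_dim):
    """Split feature columns into subsets that all have `subset_dim` columns.

    Assumption for this sketch: if the column count is not a multiple of
    `subset_dim`, the last subset reuses trailing columns so that every
    subset has the same width.
    """
    n_cols = features.shape[1]
    if subset_dim > n_cols:
        raise ValueError("subset_dim cannot exceed the number of local features")
    starts = list(range(0, n_cols - subset_dim + 1, subset_dim))
    if starts[-1] + subset_dim < n_cols:
        starts.append(n_cols - subset_dim)  # overlapping final subset
    return [features[:, s:s + subset_dim] for s in starts]


def subset_agreement_loss(subsets, weights):
    """Placeholder loss (not the paper's): mean squared distance between each
    subset's embedding and the per-patient mean embedding across subsets."""
    # Shape: (n_subsets, n_patients, embedding_dim)
    embeddings = np.stack([s @ weights for s in subsets])
    center = embeddings.mean(axis=0, keepdims=True)
    return float(np.mean((embeddings - center) ** 2))


# Two hospitals hold different features for the same aligned patients.
# In a real VFL deployment the raw features would never be gathered in one
# process; this is done here purely for illustration.
rng = np.random.default_rng(0)
n_patients = 8
hospital_a = rng.normal(size=(n_patients, 5))  # hypothetical: 5 local features
hospital_b = rng.normal(size=(n_patients, 4))  # hypothetical: 4 local features

subset_dim = 2
subsets = (split_into_subsets(hospital_a, subset_dim)
           + split_into_subsets(hospital_b, subset_dim))

# One shared linear encoder applies to every subset because all subsets
# have the same feature dimension.
weights = rng.normal(size=(subset_dim, 3))
loss = subset_agreement_loss(subsets, weights)
print(len(subsets), loss)
```

Because every subset has the same width, a single representation model can be applied to subsets from any hospital, which is what makes training one shared model across differently sized feature sets possible. A practical loss would also need a term that prevents the trivial solution in which all embeddings collapse to the same point.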
