Missing-modality enabled multi-modal fusion architecture for medical data.

Authors: Hui Chen, Shiyu Fan, Yichen Li, Muyu Wang, Zhongrang Xie

Language: English

Publication information: United States : Journal of Biomedical Informatics, 2025

Collection: NCBI

ID: 643774

BACKGROUND: Fusion of multi-modal data can improve the performance of deep learning models. However, missing modalities are common in medical data because of patient-specific circumstances, and they degrade the performance of multi-modal models in practice. It is therefore critical to adapt models to missing modalities.

OBJECTIVE: This study aimed to develop an effective multi-modal fusion architecture for medical data that is robust to missing modalities and further improves performance on clinical tasks.

METHODS: Chest X-ray radiographs (image modality), radiology reports (text modality), and structured values (tabular modality) were fused in this study. Each modality pair was fused with a Transformer-based bi-modal fusion module, and the three bi-modal fusion modules were then combined into a tri-modal fusion framework. Additionally, multivariate loss functions were introduced into the training process to improve the models' robustness to missing modalities at inference time. Finally, comparison and ablation experiments were designed to validate the effectiveness of the fusion, the robustness to missing modalities, and the enhancement from each key component. Experiments were conducted on the MIMIC-IV and MIMIC-CXR datasets for two tasks: 14-label disease diagnosis and patient in-hospital mortality prediction. The area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) were used to evaluate model performance.

RESULTS: The proposed architecture showed superior predictive performance, achieving an average AUROC and AUPRC of 0.916 and 0.551 in the 14-label classification task and 0.816 and 0.392 in the mortality prediction task, while the best average AUROC and AUPRC among the comparison methods were 0.876 and 0.492 in the 14-label classification task and 0.806 and 0.366 in the mortality prediction task. Both metrics decreased only slightly when the model was tested with modal-incomplete data. Each of the three key components contributed a different level of enhancement.

CONCLUSIONS: The proposed multi-modal fusion architecture effectively fused the three modalities and showed strong robustness to missing modalities. This architecture holds promise for scaling up to more modalities to enhance the clinical practicality of the model.
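The METHODS paragraph describes pairwise Transformer-based fusion modules combined into a tri-modal framework. The following is a minimal PyTorch sketch of that idea only; the cross-attention layout, dimensions, and module names are illustrative assumptions, not the authors' published implementation.

import torch
import torch.nn as nn

class BiModalFusion(nn.Module):
    """One Transformer-based pairwise fusion block: tokens of modality `a`
    attend to tokens of modality `b` (cross-attention), followed by a
    feed-forward layer, with residual connections around both."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(query=a, key=b, value=b)
        x = self.norm1(a + attended)
        return self.norm2(x + self.ff(x))

class TriModalFusion(nn.Module):
    """Combines the three pairwise modules (image-text, image-tabular,
    text-tabular) and pools their outputs into one prediction head."""
    def __init__(self, dim: int = 256, n_labels: int = 14):
        super().__init__()
        self.img_txt = BiModalFusion(dim)
        self.img_tab = BiModalFusion(dim)
        self.txt_tab = BiModalFusion(dim)
        self.head = nn.Linear(3 * dim, n_labels)

    def forward(self, img, txt, tab):
        # Mean-pool each fused token sequence, then classify the concatenation.
        parts = [
            self.img_txt(img, txt).mean(dim=1),
            self.img_tab(img, tab).mean(dim=1),
            self.txt_tab(txt, tab).mean(dim=1),
        ]
        return self.head(torch.cat(parts, dim=-1))

One common way to train such a model for missing-modality robustness is modality dropout, i.e., randomly masking one modality's tokens during training so the network learns to predict from any modality subset; whether this corresponds to the paper's multivariate loss functions is not stated in the abstract.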
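For the two reported metrics, a macro-averaged AUROC/AUPRC computation over the 14 labels can be sketched with scikit-learn; the macro-averaging choice and the placeholder data are assumptions for illustration.

import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

# Placeholder multi-label ground truth and model probabilities
# (100 patients x 14 disease labels); real inputs would come from the model.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 14))
y_prob = rng.random(size=(100, 14))

# Macro-average: compute each metric per label, then average across labels.
auroc = roc_auc_score(y_true, y_prob, average="macro")
auprc = average_precision_score(y_true, y_prob, average="macro")
print(f"AUROC={auroc:.3f}  AUPRC={auprc:.3f}")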