Voice is a critical tool for communication, and diagnosing voice disorders poses significant challenges, particularly when using high-speed video (HSV) endoscopy. The primary difficulty with HSV lies in the need for clinical experts to manually analyze and interpret large volumes of HSV frames. To address this challenge, kymography has been introduced as an effective clinical decision-support tool. In this study, we propose a deep learning-based approach for classifying kymographic images that automates the analysis by training models to detect subtle and intricate variations in pathological vibratory patterns. We used high-speed recordings from the Benchmark for Automatic Glottis Segmentation (BAGLS) dataset to generate kymographic images, which were then used for binary and three-class classification with deep learning models. We evaluated the performance of five widely used pretrained models: AlexNet, DenseNet121, Xception, InceptionV3, and ResNet50V2. Our experimental results demonstrate that DenseNet121 classifies voice disorders automatically with the highest accuracy and the best performance across the evaluation metrics, outperforming existing methods. With further research, the deep learning classifier has the potential to become a valuable diagnostic-assistance tool for clinicians.
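To make the pipeline described above concrete, the following is a minimal sketch, not the authors' exact implementation: it builds a kymogram by stacking a fixed scan line from each HSV frame over time, and fine-tunes an ImageNet-pretrained DenseNet121 for binary (healthy vs. pathological) classification using TensorFlow/Keras. The function names (`build_kymogram`, `build_classifier`), the scan-line placement, the input resolution, and all training hyperparameters are illustrative assumptions, as the abstract does not specify the preprocessing or training settings.

```python
# Illustrative sketch (assumed API usage, not the paper's exact pipeline):
# kymogram construction from HSV frames + DenseNet121 transfer learning.
import numpy as np
import tensorflow as tf


def build_kymogram(frames: np.ndarray, line_row: int) -> np.ndarray:
    """Stack one fixed pixel row from every HSV frame over time.

    frames: array of shape (T, H, W) holding grayscale HSV frames.
    line_row: row index of the scan line (typically across the glottal midline).
    Returns a (T, W) kymographic image.
    """
    return np.stack([frame[line_row] for frame in frames], axis=0)


def build_classifier(input_shape=(224, 224, 3)) -> tf.keras.Model:
    """DenseNet121 backbone with a small binary-classification head."""
    backbone = tf.keras.applications.DenseNet121(
        include_top=False, weights="imagenet", input_shape=input_shape
    )
    backbone.trainable = False  # warm-up phase: train only the new head
    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.applications.densenet.preprocess_input(inputs)
    x = backbone(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.3)(x)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # healthy vs. pathological
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),
        loss="binary_crossentropy",
        metrics=["accuracy", tf.keras.metrics.AUC(name="auc")],
    )
    return model


if __name__ == "__main__":
    # Dummy HSV clip: 512 frames of 256x256 grayscale pixels (placeholder data).
    frames = np.random.rand(512, 256, 256).astype("float32")
    kymogram = build_kymogram(frames, line_row=128)            # shape (512, 256)
    # Replicate to 3 channels and resize to the network's input resolution.
    img = tf.image.resize(kymogram[..., None], (224, 224))
    img = tf.repeat(img, 3, axis=-1) * 255.0                   # preprocess_input expects [0, 255]
    model = build_classifier()
    prob = model.predict(img[None, ...])                       # shape (1, 1)
    print("Predicted probability of pathology:", float(prob[0, 0]))
```

For the three-class setting, the same sketch would swap the single sigmoid output for a 3-unit softmax layer trained with categorical cross-entropy.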