Key insights into recommended SMS spam detection datasets.


Authors: Irshad Ahmed Abbasi, Kang Leng Chiew, Daniel Grzonka, Abdul Razak Hosen, Mohammad Firdaus Johari, Adnan Shahid Khan, Kelvin S C Yong

Language: eng

Classification number: 628.4459 Waste technology, public toilets, street cleaning

Publication information: England : Scientific reports, 2025

Physical description:

Collection: NCBI

ID: 693554

Short Message Service (SMS) spam poses significant risks, including financial scams and phishing attempts. Although numerous datasets from online repositories have been used to address this issue, little attention has been given to evaluating their effectiveness and their impact on SMS spam detection models. This study fills that gap by assessing ten SMS spam detection datasets using Decision Tree and Multinomial Naïve Bayes models. The datasets were evaluated on accuracy and on qualitative factors: authenticity, class imbalance, feature diversity, metadata availability, and preprocessing needs. Because the datasets are multilingual, experiments were conducted with two stopword removal groups: one using English stopwords and another using stopwords in the respective non-English languages. Based on the findings, Dataset 5 is recommended for future SMS spam detection research. Two results support this: its high qualitative assessment score of 3.8 out of 5.0, which reflects high feature diversity, real-world complexity, and balanced class distribution, and its low detection rate of 86.10% with Multinomial Naïve Bayes. Recommending a dataset on which high model performance is hard to achieve fosters the development of more robust and adaptable spam detection models that can handle diverse forms of noise and ambiguity. Furthermore, selecting the dataset with the highest qualitative score enhances research quality, improves model generalizability, and mitigates risks related to bias and inconsistencies.
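
As a rough illustration of the kind of evaluation the abstract describes, the sketch below trains a Multinomial Naïve Bayes classifier after stopword removal and reports accuracy. It is a minimal pure-Python sketch, not the authors' pipeline: the messages, the `STOPWORDS` set, and the function names `tokenize`, `train_mnb`, `predict`, and `accuracy` are all hypothetical, and Laplace smoothing with `alpha=1.0` is an assumed default.

```python
import math
import re
from collections import Counter

# Hypothetical toy messages invented for illustration; not taken from any of the ten datasets.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free cash claim your prize", "spam"),
    ("urgent call now to claim cash", "spam"),
    ("are you coming to the meeting", "ham"),
    ("see you at lunch tomorrow", "ham"),
    ("call me when you are home", "ham"),
]
TEST = [
    ("claim your free prize", "spam"),
    ("see you at the meeting", "ham"),
]

# Tiny illustrative English stopword list (an assumption, not the paper's list).
STOPWORDS = {"a", "the", "to", "you", "your", "are", "at", "me", "when"}


def tokenize(text, stopwords=STOPWORDS):
    """Lowercase, split on word characters, and drop stopwords."""
    return [t for t in re.findall(r"\w+", text.lower()) if t not in stopwords]


def train_mnb(samples, alpha=1.0):
    """Fit Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""
    class_docs = Counter()   # number of messages per class
    word_counts = {}         # per-class token counts
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    total_docs = sum(class_docs.values())
    model = {}
    for label, counts in word_counts.items():
        denom = sum(counts.values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(class_docs[label] / total_docs),
            "log_lik": {w: math.log((counts[w] + alpha) / denom) for w in vocab},
        }
    return model, vocab


def predict(model, vocab, text):
    """Return the class with the highest log posterior score."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are skipped
                score += params["log_lik"][tok]
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, vocab, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(model, vocab, text) == label for text, label in samples)
    return correct / len(samples)


model, vocab = train_mnb(TRAIN)
print(accuracy(model, vocab, TEST))  # prints 1.0 on this toy split
```

Replacing `STOPWORDS` with a list for another language would loosely mimic the second stopword removal group. The perfect score here only reflects how easy the toy data is, which is the abstract's point about why a harder dataset makes a more useful benchmark.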
