Analysis of 2019 Ohio Disease Intervention Specialist Records for Syphilis Cases Using Clustering Algorithms.

Authors: Payal Chakraborty, David M Kline, Mary McNeill, William C Miller, Xia Ning, Abigail Norris Turner, Abigail B Shoben

Language: English

Classification number:

Publication information: United States: Sexually Transmitted Diseases, 2025

Physical description:

Collection: NCBI

ID: 720142

BACKGROUND: Developments in natural language processing and unsupervised machine learning methods (e.g., clustering) have given researchers new tools to analyze both structured and unstructured health data. We applied these methods to 2019 Ohio disease intervention specialist (DIS) syphilis records to determine whether they can uncover novel patterns of co-occurrence among individual characteristics, risk factors, and clinical characteristics of syphilis that have not yet been reported in the literature.

METHODS: The 2019 DIS syphilis records (n = 1996) contain both structured data (categorical and numerical variables) and unstructured notes. In the structured data, we examined case demographics, syphilis risk factors, and clinical characteristics of syphilis. For the unstructured text, we applied TF-IDF (term frequency multiplied by inverse document frequency) weights, a common way to convert text into numerical representations. We performed agglomerative clustering with cosine similarity using the CLUTO software.

RESULTS: The cluster analysis yielded 6 clusters of syphilis cases based on patterns in the structured and unstructured data. The average internal similarities were much higher than the average external similarities, indicating that the clusters were well formed. The factors underlying 3 of the clusters related to patterns of missing data. The factors underlying the other 3 clusters were sexual behaviors and partnerships. Notably, 1 of these 3 consisted of individuals who reported oral sex with male or anonymous partners while intoxicated, and another comprised mainly males who have sex with females.

CONCLUSIONS: Our analysis produced clusters that were well formed mathematically but did not reveal epidemiological information about syphilis risk factors or transmission that was not already known.
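
The pipeline named in the abstract (TF-IDF weighting, cosine similarity, agglomerative clustering, and a comparison of internal with external similarity) can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation and does not use CLUTO: the average-linkage merge rule, the whitespace tokenizer, the simplified similarity summaries, and the four toy notes are all assumptions made for illustration, and the notes are invented rather than taken from the DIS records.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Return one L2-normalised {term: weight} dict per document."""
    tokenised = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenised)
    # Document frequency: number of notes that contain each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vec = {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def average_link(c1, c2, vectors):
    """Mean cosine similarity over all cross-cluster pairs."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


def agglomerative(vectors, k):
    """Merge the two most similar clusters until only k remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_link(clusters[p[0]], clusters[p[1]], vectors),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


def internal_similarity(cluster, vectors):
    """Mean pairwise cosine similarity among the members of one cluster."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 1.0
    return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


def external_similarity(cluster, vectors):
    """Mean cosine similarity between cluster members and all other records."""
    others = [i for i in range(len(vectors)) if i not in cluster]
    return average_link(cluster, others, vectors) if others else 0.0


# Invented toy notes, for illustration only.
notes = [
    "patient reported anonymous partners met online",
    "anonymous partners met online through app",
    "record incomplete interview not completed",
    "interview not completed record incomplete",
]
vectors = tfidf_vectors(notes)
clusters = agglomerative(vectors, k=2)
for cluster in clusters:
    print(cluster, internal_similarity(cluster, vectors),
          external_similarity(cluster, vectors))
```

A cluster whose internal similarity is well above its external similarity is cohesive and well separated in this sense, which is the property the abstract calls "well formed". As the study's conclusion notes, that mathematical property does not by itself mean the grouping is epidemiologically informative.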