Publication Type Tagging using Transformer Models and Multi-Label Classification.


Authors: Halil Kilicoglu, Joe D Menke, Neil R Smalheiser

Language: eng

Classification number: 972.83051 *Central America

Publication information: United States : medRxiv : the preprint server for health sciences, 2025

Physical description:

Collection: NCBI

ID: 723436

Indexing articles by their publication type and study design is essential for efficient search and filtering of the biomedical literature, but it is understudied compared to indexing by MeSH topical terms. In this study, we leveraged the human-curated publication types and study designs in PubMed to generate a dataset of more than 1.2M articles (titles and abstracts) and used state-of-the-art Transformer-based models for automatic tagging of publication types and study designs. Specifically, we trained PubMedBERT-based models using a multi-label classification approach, and explored undersampling, feature verbalization, and contrastive learning to improve model performance. Our results show that PubMedBERT provides a strong baseline for publication type and study design indexing; undersampling, feature verbalization, and unsupervised contrastive loss have a positive impact on performance, whereas supervised contrastive learning degrades performance. We obtained the best overall performance with 80% undersampling and feature verbalization (0.632 macro-F
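
The multi-label setup named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three labels, the 0.5 decision threshold, and the function names `predict_labels` and `bce_loss` are assumptions chosen for illustration, and the logits stand in for the output of any encoder such as PubMedBERT. The point it shows is that each label gets its own sigmoid and its own binary decision, so one article can receive several tags or none.

```python
import math

# Hypothetical label subset, for illustration only; the study's label set is larger.
LABELS = ["Randomized Controlled Trial", "Review", "Case Reports"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Decide each label independently of the others.

    The 0.5 threshold is an assumption, not a value taken from the abstract.
    """
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over labels, a common multi-label training loss."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(logits)


if __name__ == "__main__":
    example_logits = [2.0, -1.0, 0.3]  # made-up scores for one article
    print(predict_labels(example_logits))
    print(bce_loss(example_logits, [1, 0, 1]))
```

Scoring labels independently, rather than with a softmax over all labels, is what lets a single article be tagged as, say, both a review and a case report.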
