Zero-Inflated Bandits

Authors: Lei Shi, Rui Song, Runzhe Wan, Haoyu Wei

Language: eng

Classification number: 332.41 Value of money

Publication information: 2023

Physical description:

Collection: Metadata

ID: 201085

Many real-world bandit applications are characterized by sparse rewards, which can significantly hinder learning efficiency. Leveraging problem-specific structures for careful distribution modeling is recognized as essential for improving estimation efficiency in statistics. However, this approach remains under-explored in the context of bandits. To address this gap, we initiate the study of zero-inflated bandits, where the reward is modeled using a classic semi-parametric distribution known as the zero-inflated distribution. We develop algorithms based on the Upper Confidence Bound and Thompson Sampling frameworks for this specific structure. The superior empirical performance of these methods is demonstrated through extensive numerical studies.
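
To make the zero-inflated reward structure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the reward is taken to be zero with some probability and otherwise a draw from a nonzero part bounded in (0, 1], and the index multiplies two Hoeffding-style upper confidence bounds, one for the probability of a nonzero reward and one for the mean of the nonzero part. All names (`draw_zero_inflated`, `ZeroInflatedUCB`, `run`) and the uniform nonzero distribution are hypothetical.

```python
import math
import random


def draw_zero_inflated(p_nonzero, nonzero_mean, rng):
    """Return 0 with probability 1 - p_nonzero, else a positive draw in (0, 1]."""
    if rng.random() >= p_nonzero:
        return 0.0
    # Assumed nonzero part: uniform on [nonzero_mean - 0.1, nonzero_mean + 0.1],
    # clamped to (0, 1]. The clamp is inactive when 0.1 < nonzero_mean <= 0.9.
    draw = rng.uniform(nonzero_mean - 0.1, nonzero_mean + 0.1)
    return min(1.0, max(1e-9, draw))


class ZeroInflatedUCB:
    """Illustrative product-form UCB for rewards bounded in [0, 1]."""

    def __init__(self, n_arms):
        self.pulls = [0] * n_arms          # total pulls per arm
        self.nonzero = [0] * n_arms        # pulls that returned a nonzero reward
        self.nonzero_sum = [0.0] * n_arms  # sum of the nonzero rewards

    def index(self, arm, t):
        n, m = self.pulls[arm], self.nonzero[arm]
        if n == 0:
            return float("inf")  # force one initial pull of every arm
        # Upper bound on the probability of a nonzero reward (uses all pulls).
        p_ucb = min(1.0, m / n + math.sqrt(2.0 * math.log(t) / n))
        # Upper bound on the nonzero mean (uses nonzero observations only).
        if m == 0:
            mu_ucb = 1.0
        else:
            bonus = math.sqrt(2.0 * math.log(t) / m)
            mu_ucb = min(1.0, self.nonzero_sum[arm] / m + bonus)
        return p_ucb * mu_ucb

    def select(self, t):
        return max(range(len(self.pulls)), key=lambda a: self.index(a, t))

    def update(self, arm, reward):
        self.pulls[arm] += 1
        if reward != 0.0:
            self.nonzero[arm] += 1
            self.nonzero_sum[arm] += reward


def run(arms, horizon, seed=0):
    """Simulate the agent; arms is a list of (p_nonzero, nonzero_mean) pairs."""
    rng = random.Random(seed)
    agent = ZeroInflatedUCB(len(arms))
    for t in range(1, horizon + 1):
        arm = agent.select(t)
        agent.update(arm, draw_zero_inflated(*arms[arm], rng))
    return agent.pulls
```

For example, `run([(0.1, 0.5), (0.3, 0.8)], horizon=2000)` returns the number of pulls per arm for two sparse-reward arms. Tracking the nonzero probability and the nonzero mean separately is one simple way to exploit the structure; the confidence widths here are generic choices rather than tuned constants.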
