Asymptotic Theory for IV-Based Reinforcement Learning with Potential Endogeneity

Authors: Jin Li, Ye Luo, Zigan Wang, Xiaowei Zhang

Language: English

Classification: 511.4 Approximations and expansions (formerly also 513.24)

Publication information: 2021

Collection: Metadata

ID: 166468

Comment: main body: 42 pages; supplemental material: 14 pages

Abstract: In the standard data analysis framework, data is collected (once and for all), and then data analysis is carried out. However, with the advancement of digital technology, decision-makers constantly analyze past data and generate new data through their decisions. We model this as a Markov decision process and show that the dynamic interaction between data generation and data analysis leads to a new type of bias -- reinforcement bias -- that exacerbates the endogeneity problem in standard data analysis. We propose a class of instrumental variable (IV)-based reinforcement learning (RL) algorithms to correct for the bias and establish their theoretical properties by incorporating them into a stochastic approximation (SA) framework. Our analysis accommodates iterate-dependent Markovian structures and can therefore be used to study RL algorithms with policy improvement. We also provide formulas for inference on the optimal policies of the IV-RL algorithms; these formulas highlight how the intertemporal dependencies of the Markovian environment affect the inference.
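The IV-corrected stochastic-approximation idea in the abstract can be illustrated with a minimal sketch. Everything below is an illustrative assumption rather than the paper's actual IV-RL algorithm: a linear outcome model y = x'theta + eps with endogenous regressors x and instruments z, a Robbins-Monro step-size schedule, and made-up variable names. The point it demonstrates is that weighting the residual by the instrument rather than the regressor removes the asymptotic bias, because E[z*eps] = 0 while E[x*eps] != 0.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 3
    theta_true = np.array([1.0, -0.5, 2.0])  # hypothetical target parameter

    def draw_sample():
        """Draw (x, z, y): x is endogenous (correlated with the noise eps),
        while the instrument z drives x but is independent of eps."""
        z = rng.normal(size=d)                        # instrument
        eps = rng.normal()                            # structural noise
        x = z + 0.8 * eps + 0.1 * rng.normal(size=d)  # endogenous regressor
        y = x @ theta_true + eps
        return x, z, y

    theta = np.zeros(d)
    for k in range(1, 200_001):
        x, z, y = draw_sample()
        alpha = 1.0 / k**0.7  # Robbins-Monro step size
        # A naive update, theta += alpha * x * (y - x @ theta), converges to
        # a biased limit because E[x*eps] != 0. Replacing x with the
        # instrument z in the correction term removes that bias, since
        # E[z*eps] = 0 while E[z x'] remains well-conditioned.
        theta += alpha * z * (y - x @ theta)

    print(theta)  # close to theta_true despite the endogenous regressors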