Discounted Inverse Reinforcement Learning for Linear Quadratic Control.

Authors: Fei Dong, Qinglei Hu, Dongyu Li, Zhenchao Ouyang, Han Wu, Jianying Zheng

Language: English

Classification:

Publication information: United States: IEEE Transactions on Cybernetics, 2025

Physical description:

Collection: NCBI

ID: 722748

Linear quadratic control with unknown value functions and dynamics is extremely challenging, and most existing studies have focused on the regulation problem and cannot handle the tracking problem. To solve both linear quadratic regulation and tracking problems for continuous-time systems with unknown value functions, this article develops a discounted inverse reinforcement learning (DIRL) method that inherits the model-independent property of reinforcement learning (RL). More specifically, we first formulate a standard paradigm for solving linear quadratic control using DIRL. To recover the value function and the target control gain, an error metric is carefully constructed, and a quasi-Newton algorithm is adopted to minimize it. Furthermore, three DIRL algorithms are proposed: model-based, model-free off-policy, and model-free on-policy. The latter two rely on the expert's demonstration data or online observed data, requiring no prior knowledge of the system dynamics or value function. The stability, convergence, and conditions for the existence of multiple solutions are thoroughly analyzed. Finally, numerical simulations demonstrate the effectiveness of the theoretical results.
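The abstract describes recovering an unknown value function and control gain by minimizing an error metric with a quasi-Newton method. The sketch below is not the authors' algorithm; it only illustrates that general idea under simple assumptions: the plant matrices A and B, the control weight R, the discount rate gamma, the diagonal parameterization of Q, and the use of a gain-matching error with SciPy's BFGS optimizer are all illustrative choices, not details taken from the paper.

```python
# Minimal inverse-LQR sketch (illustrative only): recover a diagonal state
# cost Q from an observed expert gain by quasi-Newton (BFGS) minimization
# of a gain-matching error metric. All numerical values are assumptions.
import numpy as np
from scipy.linalg import solve_continuous_are
from scipy.optimize import minimize

A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # assumed plant dynamics
B = np.array([[0.0], [1.0]])
R = np.array([[1.0]])                       # control weight, assumed known
gamma = 0.1                                 # discount rate
A_d = A - 0.5 * gamma * np.eye(2)           # discount folded into the drift term

def lqr_gain(Q):
    """Optimal gain K = R^{-1} B^T P for the (discounted) LQR problem."""
    P = solve_continuous_are(A_d, B, Q, R)
    return np.linalg.solve(R, B.T @ P)

# "Expert" demonstration: in practice the expert gain would be estimated
# from demonstration data; here it is generated from a hidden Q_true.
Q_true = np.diag([5.0, 2.0])
K_expert = lqr_gain(Q_true)

def error_metric(theta):
    """Squared Frobenius distance between candidate and expert gains.
    theta parameterizes a diagonal positive-semidefinite Q = diag(theta**2)."""
    Q = np.diag(theta ** 2) + 1e-8 * np.eye(2)
    return float(np.sum((lqr_gain(Q) - K_expert) ** 2))

# Quasi-Newton (BFGS) minimization of the error metric.
result = minimize(error_metric, x0=np.ones(2), method="BFGS")
Q_recovered = np.diag(result.x ** 2)
print("recovered Q:\n", Q_recovered)
print("gain error:", error_metric(result.x))
```

As the abstract notes, such error metrics can admit multiple minimizers; the sketch sidesteps this by restricting Q to be diagonal, which is a simplification rather than a result from the paper.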