STEER: Assessing the Economic Rationality of Large Language Models


Authors: Samuel Amouyal, Yoav Levine, Kevin Leyton-Brown, Taylor Lundy, Narun Raman, Moshe Tennenholtz

Language: eng

Classification: 149.94 Linguistic philosophies

Publication information: 2024

Physical description:

Collection: Metadata

ID: 201591

There is increasing interest in using LLMs as decision-making "agents." Doing so includes many degrees of freedom: which model should be used? How should it be prompted? Should it be asked to introspect, conduct chain-of-thought reasoning, etc.? Settling these questions -- and more broadly, determining whether an LLM agent is reliable enough to be trusted -- requires a methodology for assessing such an agent's economic rationality. In this paper, we provide one. We begin by surveying the economic literature on rational decision making, taxonomizing a large set of fine-grained "elements" that an agent should exhibit, along with dependencies between them. We then propose a benchmark distribution that quantitatively scores an LLM's performance on these elements and, combined with a user-provided rubric, produces a "STEER report card." Finally, we describe the results of a large-scale empirical experiment with 14 different LLMs, characterizing both the current state of the art and the impact of different model sizes on models' ability to exhibit rational behavior.
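The abstract describes combining per-element benchmark scores with a user-provided rubric to produce a "STEER report card." As a minimal sketch of how such an aggregation could work (the function name, score ranges, element names, and weighting scheme are illustrative assumptions, not the paper's actual implementation):

```python
def steer_report_card(scores, rubric):
    """Hypothetical aggregate: weighted average of per-element benchmark
    scores (each in [0, 1]) under a user-provided rubric of weights.

    Elements absent from the rubric get zero weight.
    """
    total_weight = sum(rubric.get(element, 0.0) for element in scores)
    if total_weight == 0:
        raise ValueError("rubric assigns no weight to any scored element")
    weighted = sum(score * rubric.get(element, 0.0)
                   for element, score in scores.items())
    return weighted / total_weight

# Example: two illustrative elements of rationality, with a rubric that
# weights expected-utility reasoning twice as heavily.
scores = {"expected_utility": 0.8, "avoiding_sunk_cost": 0.5}
rubric = {"expected_utility": 2.0, "avoiding_sunk_cost": 1.0}
print(steer_report_card(scores, rubric))  # (0.8*2 + 0.5*1) / 3 = 0.7
```

A weighted average is only one plausible design choice; the point of a user-provided rubric is that different users can emphasize different elements of rationality when summarizing the same benchmark results.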

LIBRARY - HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY (HUTECH)

Tel: (028) 36225755 | Email: tt.thuvien@hutech.edu.vn

Copyright © 2024 HUTECH LIBRARY