SheepNav

A Contextual-Bandit Oversight Game with Two-Sided Informational Asymmetry

arXiv:2607.00155v1. Announce type: new.

Abstract: We study runtime human oversight of an AI agent when private information runs in both directions: the human privately knows her reward function, while the AI privately knows the quality of the action it proposes. This is the kind of asymmetry that arises naturally when an autonomous robot or software agent has inspected a situation its human supervisor cannot directly assess. Building on Cooperative Inverse Reinforcement Learning (CIRL) and the Ove

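Further reading

  1. Building Cognitive AI Literacy: Cognitive Goals and Process Detection in Student-AI Collaborative Programming
  2. RareDxR1: An Autonomous Rare-Disease Diagnosis AI That Needs No Human Annotation, Breaking the Open-Ended Reasoning Bottleneck
  3. Explainable AI Path Planning: A Conflict-Resolution Algorithm Designed for Air Traffic Controllers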