Reinforcement learning papers from arXiv cs.LG, cs.AI, and Hugging Face Daily Papers.
World Models in Pieces: Structural Certification for General Agents
Yikai Lu, Yifei Wu, Xinyu Lu
In the big-world regime, agents cannot be universally capable and their ability is inevitably specialized across a world model in pieces. Consequently, standard uniform guarantees fail to distinguish between the understanding of critical bottlenecks and irrelevant failures. We first formalize this limitation by provin…
cs.AI
TACTFUL: Tactile-Driven Exploration For Object Localization and Identification in Confined Environments
Shivani Kamtikar, Chung Hee Kim, Camilla Tabasso
Humans effortlessly locate and identify objects by touch alone, even without vision. In contrast, robotic systems rely heavily on vision and struggle with autonomous tactile exploration and object identification. We present TACTFUL, a vision-free tactile exploration framework that enables a multi-fingered robot to aut…
cs.RO, cs.AI
LaGO: Latent Action Guidance for Online Reinforcement Learning
Kuan-Yen Liu, Ren-Jyun Huang, Ti-Rong Wu
Large language models (LLMs) have shown strong potential for planning and sequential decision-making, but prior work often relies on using them as direct controllers, which requires precise action generation and can be unreliable in practice. This paper proposes Latent Action Guidance for Online Reinforcement Learning…
cs.AI
ASALT: Adaptive State Alignment for Lateral Transfer in Multi-agent Reinforcement Learning
Anurag Akula, Satheesh K. Perepu, Abhishek Sarkar
Multi-agent reinforcement learning (MARL) addresses the problem of training multiple agents that pursue collaborative, competitive, or mixed objectives. Prior work has investigated transfer learning between source and target domains in MARL; however, the majority of existing approaches impose the constraint that the d…
cs.AI, cs.LG
RE4: Transformation-aware Imitation of Object Interactions Using Manipulation Modes
Arsh Chawla, Rahul Shome
Object interaction tasks have been a focus of advances in imitation learning. End-to-end methods, dominated by diffusion and flow-based variants, have shown leaps in performance while sacrificing interpretability. Object-centric and pose-informed variants have had a role in learning from demonstration in manipulation t…
cs.RO, cs.LG
Qwen-AgentWorld: Language World Models for General Agents
HF Daily Papers
Holistic Data Scheduler for LLM Pre-training via Multi-Objective Reinforcement Learning
HF Daily Papers
World Value Models for Robotic Manipulation
HF Daily Papers