Reinforcement Learning

RL Digest

Reinforcement learning research, tooling, and news — papers from arXiv cs.LG/cs.AI, Hugging Face Daily Papers, and the wider RL community.

Generated 2026-06-24 23:35 UTC
Sources: public RSS, APIs, arXiv, Hugging Face
8 Papers · 22 Other · 3 Sources · 14d Window

Papers & Research (academic)

Reinforcement learning papers from arXiv cs.LG, cs.AI, and Hugging Face Daily Papers.

World Models in Pieces: Structural Certification for General Agents

Yikai Lu, Yifei Wu, Xinyu Lu

In the big-world regime, agents cannot be universally capable and their ability is inevitably specialized across a world model in pieces. Consequently, standard uniform guarantees fail to distinguish between the understanding of critical bottlenecks and irrelevant failures. We first formalize this limitation by provin…

cs.AI

TACTFUL: Tactile-Driven Exploration For Object Localization and Identification in Confined Environments

Shivani Kamtikar, Chung Hee Kim, Camilla Tabasso

Humans effortlessly locate and identify objects by touch alone, even without vision. In contrast, robotic systems rely heavily on vision and struggle with autonomous tactile exploration and object identification. We present TACTFUL, a vision-free tactile exploration framework that enables a multi-fingered robot to aut…

cs.RO, cs.AI

LaGO: Latent Action Guidance for Online Reinforcement Learning

Kuan-Yen Liu, Ren-Jyun Huang, Ti-Rong Wu

Large language models (LLMs) have shown strong potential for planning and sequential decision-making, but prior work often relies on using them as direct controllers, which requires precise action generation and can be unreliable in practice. This paper proposes Latent Action Guidance for Online Reinforcement Learning…

cs.AI

ASALT: Adaptive State Alignment for Lateral Transfer in Multi-agent Reinforcement Learning

Anurag Akula, Satheesh K. Perepu, Abhishek Sarkar

Multi-agent reinforcement learning (MARL) addresses the problem of training multiple agents that pursue collaborative, competitive, or mixed objectives. Prior work has investigated transfer learning between source and target domains in MARL; however, the majority of existing approaches impose the constraint that the d…

cs.AI, cs.LG

RE4: Transformation-aware Imitation of Object Interactions Using Manipulation Modes

Arsh Chawla, Rahul Shome

Object interaction tasks have been a focus of advances in imitation learning. End-to-end methods, dominated by diffusion and flow-based variants, have shown leaps in performance while sacrificing interpretability. Object-centric and pose-informed variants have had a role in learning from demonstration in manipulation t…

cs.RO, cs.LG

Qwen-AgentWorld: Language World Models for General Agents

HF Daily Papers

Holistic Data Scheduler for LLM Pre-training via Multi-Objective Reinforcement Learning

HF Daily Papers

World Value Models for Robotic Manipulation

HF Daily Papers

Tooling & Frameworks (tools)

Libraries, training infrastructure, and RL framework updates.

LingxiDiagBench: A Multi-Agent Framework for Benchmarking LLMs in Chinese Psychiatric Consultation and Diagnosis

HF Daily Papers

Are Text-to-Image Models Inductivist Turkeys? A Counterfactual Benchmark for Causal Reasoning

HF Daily Papers

ChartWalker: Benchmarking the Cross-Chart RAG Task

HF Daily Papers

AGORA: An Archive-Grounded Benchmark for Agentic Workplace Document Reasoning

HF Daily Papers

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

HF Daily Papers

MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization

HF Daily Papers

MemGUI-Agent: An End-to-End Long-Horizon Mobile GUI Agent with Proactive Context Management

HF Daily Papers

AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

HF Daily Papers

OpenThoughts-Agent: Data Recipes for Agentic Models

HF Daily Papers

Semantic Browsing: Controllable Diversity for Image Generation

HF Daily Papers

FLAT: Feedforward Latent Triangle Splatting for Geometrically Accurate Scene Generation

HF Daily Papers

FedOT: Ownership Verification and Leakage Tracing via Watermarks for Federated LDMs

HF Daily Papers

Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning

HF Daily Papers

DiffusionBench: On Holistic Evaluation of Diffusion Transformers

HF Daily Papers

FlowR2A: Learning Reward-to-Action Distribution for Multimodal Driving Planning

HF Daily Papers