RL Digest — Multi-Topic Digest

Papers & Research

Reinforcement learning papers from arXiv cs.LG, cs.AI, and Hugging Face Daily Papers.

World Models in Pieces: Structural Certification for General Agents

Yikai Lu, Yifei Wu, Xinyu Lu

In the big-world regime, agents cannot be universally capable and their ability is inevitably specialized across a world model in pieces. Consequently, standard uniform guarantees fail to distinguish between the understanding of critical bottlenecks and irrelevant failures. We first formalize this limitation by provin…

cs.AI

TACTFUL: Tactile-Driven Exploration For Object Localization and Identification in Confined Environments

Shivani Kamtikar, Chung Hee Kim, Camilla Tabasso

Humans effortlessly locate and identify objects by touch alone, even without vision. In contrast, robotic systems rely heavily on vision and struggle with autonomous tactile exploration and object identification. We present TACTFUL, a vision-free tactile exploration framework that enables a multi-fingered robot to aut…

cs.RO, cs.AI

LaGO: Latent Action Guidance for Online Reinforcement Learning

Kuan-Yen Liu, Ren-Jyun Huang, Ti-Rong Wu

Large language models (LLMs) have shown strong potential for planning and sequential decision-making, but prior work often relies on using them as direct controllers, which requires precise action generation and can be unreliable in practice. This paper proposes Latent Action Guidance for Online Reinforcement Learning…

cs.AI

ASALT: Adaptive State Alignment for Lateral Transfer in Multi-agent Reinforcement Learning

Anurag Akula, Satheesh K. Perepu, Abhishek Sarkar

Multi-agent reinforcement learning (MARL) addresses the problem of training multiple agents that pursue collaborative, competitive, or mixed objectives. Prior work has investigated transfer learning between source and target domains in MARL; however, the majority of existing approaches impose the constraint that the d…

cs.AI, cs.LG

RE4: Transformation-aware Imitation of Object Interactions Using Manipulation Modes

Arsh Chawla, Rahul Shome

Object interaction tasks have been a focus of advances in imitation learning. End-to-end methods, dominated by diffusion and flow-based variants, have shown leaps in performance while sacrificing interpretability. Object-centric and pose-informed variants have had a role in learning from demonstration in manipulation t…

cs.RO, cs.LG

Qwen-AgentWorld: Language World Models for General Agents

HF Daily Papers

Holistic Data Scheduler for LLM Pre-training via Multi-Objective Reinforcement Learning

HF Daily Papers

World Value Models for Robotic Manipulation

HF Daily Papers

Tooling & Frameworks

Libraries, training infrastructure, and RL framework updates.

LingxiDiagBench: A Multi-Agent Framework for Benchmarking LLMs in Chinese Psychiatric Consultation and Diagnosis

HF Daily Papers

Are Text-to-Image Models Inductivist Turkeys? A Counterfactual Benchmark for Causal Reasoning

ChartWalker: Benchmarking the Cross-Chart RAG Task

AGORA: An Archive-Grounded Benchmark for Agentic Workplace Document Reasoning

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization

MemGUI-Agent: An End-to-End Long-Horizon Mobile GUI Agent with Proactive Context Management

AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

OpenThoughts-Agent: Data Recipes for Agentic Models

Semantic Browsing: Controllable Diversity for Image Generation

FLAT: Feedforward Latent Triangle Splatting for Geometrically Accurate Scene Generation

FedOT: Ownership Verification and Leakage Tracing via Watermarks for Federated LDMs

Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning

DiffusionBench: On Holistic Evaluation of Diffusion Transformers

FlowR2A: Learning Reward-to-Action Distribution for Multimodal Driving Planning