Abstract
Physics-informed regularization derived from the viscosity solution of the Hamilton-Jacobi-Bellman equation, made tractable via Feynman-Kac Monte Carlo estimation, improves value estimation in offline goal-conditioned reinforcement learning.
Offline goal-conditioned reinforcement learning (GCRL) learns goal-conditioned policies from static pre-collected datasets. However, accurate value estimation remains a challenge due to the limited coverage of the state-action space. Recent physics-informed approaches have sought to address this by imposing physical and geometric constraints on the value function through regularization defined over first-order partial differential equations (PDEs), such as the Eikonal equation. However, these formulations are often ill-posed in complex, high-dimensional environments. In this work, we propose a physics-informed regularization derived from the viscosity solution of the Hamilton-Jacobi-Bellman (HJB) equation. By providing a physics-based inductive bias, our approach grounds the learning process in optimal control theory, explicitly regularizing and bounding updates during value iteration. Furthermore, we leverage the Feynman-Kac theorem to recast the PDE solution as an expectation, enabling a tractable Monte Carlo estimation of the objective that avoids numerical instability in higher-order gradients. Experiments demonstrate that our method improves geometric consistency, making it broadly applicable to navigation and to high-dimensional, complex manipulation tasks. Open-source code is available at https://github.com/HrishikeshVish/phys-fk-value-GCRL.
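To make the abstract's central idea concrete, here is an illustrative sketch, not the paper's exact formulation. A viscosity-regularized HJB equation for a goal-conditioned value function $V(s,g)$ with running cost $c$ and dynamics $f$ might take the form

$$\min_{a}\Big\{\, c(s,a) + \nabla_s V(s,g)^{\top} f(s,a) \,\Big\} + \varepsilon\,\Delta_s V(s,g) = 0,$$

which recovers a first-order Eikonal-type PDE as $\varepsilon \to 0$. For $\varepsilon > 0$, the Feynman-Kac theorem rewrites the solution of the associated linear PDE as an expectation over diffusion paths,

$$V(s,g) = \mathbb{E}\!\left[\int_0^{T} c(X_t,a_t)\,dt \;\middle|\; X_0=s\right],\qquad dX_t = f(X_t,a_t)\,dt + \sqrt{2\varepsilon}\,dW_t,$$

so the viscosity (Laplacian) term is absorbed into Brownian noise and never has to be computed by nested automatic differentiation.

A minimal PyTorch sketch of such a Monte Carlo regularizer is below. The function name `fk_hjb_regularizer`, the `value_fn(states, goals)` signature, the unit-cost Eikonal residual, and all hyperparameters are assumptions made for illustration, not the authors' implementation:

```python
import torch

def fk_hjb_regularizer(value_fn, s, g, eps=0.1, tau=1e-2, n_samples=8):
    """Monte Carlo estimate of a viscosity-smoothed HJB/Eikonal residual.

    Rather than computing eps * Laplacian(V) with nested autodiff
    (unstable higher-order gradients), this uses the Feynman-Kac view:
    the viscosity term corresponds to Gaussian smoothing of V along
    short diffusion paths, so we average V over perturbed states.
    """
    s = s.detach().requires_grad_(True)                 # states, shape (B, d_s)
    # One Euler-Maruyama step of a Brownian path: x + sqrt(2*eps*tau) * z
    z = torch.randn(n_samples, *s.shape, device=s.device)
    s_pert = s.unsqueeze(0) + (2.0 * eps * tau) ** 0.5 * z
    g_rep = g.unsqueeze(0).expand(n_samples, *g.shape)
    # Smoothed value: average over perturbed states (assumes value_fn
    # broadcasts over leading batch dimensions), shape (B,)
    v_smooth = value_fn(s_pert, g_rep).mean(dim=0)
    # Only a FIRST-order gradient of the smoothed value is needed
    grad_v = torch.autograd.grad(v_smooth.sum(), s, create_graph=True)[0]
    # Eikonal-style residual |grad V| - 1 under a unit-cost assumption
    residual = grad_v.norm(dim=-1) - 1.0
    return residual.pow(2).mean()
```

In use, such a term would simply be added to the standard value objective, e.g. `loss = td_loss + lam * fk_hjb_regularizer(V, states, goals)`. Averaging $V$ over Gaussian perturbations plays the role of the $\varepsilon$-viscosity smoothing, so only first-order gradients of the value network are ever required.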
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Epigraph-Guided Flow Matching for Safe and Performant Offline Reinforcement Learning (2026)
- Reparameterization Flow Policy Optimization (2026)
- FLAC: Maximum Entropy RL via Kinetic Energy Regularized Bridge Matching (2026)
- Stabilizing Physics-Informed Consistency Models via Structure-Preserving Training (2026)
- Zero-Shot Off-Policy Learning (2026)
- Stabilizing the Q-Gradient Field for Policy Smoothness in Actor-Critic (2026)
- How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models? (2026)