Abstract
Physics-informed regularization derived from the viscosity solution of the Hamilton-Jacobi-Bellman equation, made tractable via Feynman-Kac Monte Carlo estimation, improves value estimation in offline goal-conditioned reinforcement learning.
Offline goal-conditioned reinforcement learning (GCRL) learns goal-conditioned policies from static pre-collected datasets. However, accurate value estimation remains a challenge due to the limited coverage of the state-action space. Recent physics-informed approaches have sought to address this by imposing physical and geometric constraints on the value function through regularization defined over first-order partial differential equations (PDEs), such as the Eikonal equation. However, these formulations are often ill-posed in complex, high-dimensional environments. In this work, we propose a physics-informed regularization derived from the viscosity solution of the Hamilton-Jacobi-Bellman (HJB) equation. By providing a physics-based inductive bias, our approach grounds the learning process in optimal control theory, explicitly regularizing and bounding updates during value iteration. Furthermore, we leverage the Feynman-Kac theorem to recast the PDE solution as an expectation, enabling a tractable Monte Carlo estimation of the objective that avoids numerical instability in higher-order gradients. Experiments demonstrate that our method improves geometric consistency, making it broadly applicable to navigation and to high-dimensional, complex manipulation tasks. Open-source code is available at https://github.com/HrishikeshVish/phys-fk-value-GCRL.
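To make the abstract's central idea concrete, here is an illustrative sketch, not the paper's exact formulation. A viscosity-regularized HJB equation for a goal-conditioned value function $V(s,g)$ with running cost $c$ and dynamics $f$ might take the form

$$\min_{a}\Big\{\, c(s,a) + \nabla_s V(s,g)^{\top} f(s,a) \,\Big\} + \varepsilon\,\Delta_s V(s,g) = 0,$$

which recovers a first-order Eikonal-type PDE as $\varepsilon \to 0$. For $\varepsilon > 0$, the Feynman-Kac theorem rewrites the solution of the associated linear PDE as an expectation over diffusion paths,

$$V(s,g) = \mathbb{E}\!\left[\int_0^{T} c(X_t,a_t)\,dt \;\middle|\; X_0=s\right],\qquad dX_t = f(X_t,a_t)\,dt + \sqrt{2\varepsilon}\,dW_t,$$

so the viscosity (Laplacian) term is absorbed into Brownian noise and never has to be computed by nested automatic differentiation.

A minimal PyTorch sketch of such a Monte Carlo regularizer is below. The function name `fk_hjb_regularizer`, the `value_fn(states, goals)` signature, the unit-cost Eikonal residual, and all hyperparameters are assumptions made for illustration, not the authors' implementation:

```python
import torch

def fk_hjb_regularizer(value_fn, s, g, eps=0.1, tau=1e-2, n_samples=8):
    """Monte Carlo estimate of a viscosity-smoothed HJB/Eikonal residual.

    Rather than computing eps * Laplacian(V) with nested autodiff
    (unstable higher-order gradients), this uses the Feynman-Kac view:
    the viscosity term corresponds to Gaussian smoothing of V along
    short diffusion paths, so we average V over perturbed states.
    """
    s = s.detach().requires_grad_(True)                 # states, shape (B, d_s)
    # One Euler-Maruyama step of a Brownian path: x + sqrt(2*eps*tau) * z
    z = torch.randn(n_samples, *s.shape, device=s.device)
    s_pert = s.unsqueeze(0) + (2.0 * eps * tau) ** 0.5 * z
    g_rep = g.unsqueeze(0).expand(n_samples, *g.shape)
    # Smoothed value: average over perturbed states (assumes value_fn
    # broadcasts over leading batch dimensions), shape (B,)
    v_smooth = value_fn(s_pert, g_rep).mean(dim=0)
    # Only a FIRST-order gradient of the smoothed value is needed
    grad_v = torch.autograd.grad(v_smooth.sum(), s, create_graph=True)[0]
    # Eikonal-style residual |grad V| - 1 under a unit-cost assumption
    residual = grad_v.norm(dim=-1) - 1.0
    return residual.pow(2).mean()
```

In use, such a term would simply be added to the standard value objective, e.g. `loss = td_loss + lam * fk_hjb_regularizer(V, states, goals)`. Averaging $V$ over Gaussian perturbations plays the role of the $\varepsilon$-viscosity smoothing, so only first-order gradients of the value network are ever required.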
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Epigraph-Guided Flow Matching for Safe and Performant Offline Reinforcement Learning (2026)
- Reparameterization Flow Policy Optimization (2026)
- FLAC: Maximum Entropy RL via Kinetic Energy Regularized Bridge Matching (2026)
- Stabilizing Physics-Informed Consistency Models via Structure-Preserving Training (2026)
- Zero-Shot Off-Policy Learning (2026)
- Stabilizing the Q-Gradient Field for Policy Smoothness in Actor-Critic (2026)
- How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models? (2026)