GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models

Wang, Mianchu; Yang, Rui; Chen, Xi; Sun, Hao; Fang, Meng; Montana, Giovanni

arXiv:2310.20025 (cs)

[Submitted on 30 Oct 2023 (v1), last revised 16 May 2024 (this version, v3)]

Abstract: Offline Goal-Conditioned RL (GCRL) offers a feasible paradigm for learning general-purpose policies from diverse and multi-task offline datasets. Despite notable recent progress, the predominant offline GCRL methods, mainly model-free, face constraints in handling limited data and generalizing to unseen goals. In this work, we propose Goal-conditioned Offline Planning (GOPlan), a novel model-based framework that contains two key phases: (1) pretraining a prior policy capable of capturing the multi-modal action distribution within the multi-goal dataset; (2) employing the reanalysis method with planning to generate imagined trajectories for fine-tuning policies. Specifically, we base the prior policy on an advantage-weighted conditioned generative adversarial network, which facilitates distinct mode separation, mitigating the pitfalls of out-of-distribution (OOD) actions. For further policy optimization, the reanalysis method generates high-quality imaginary data by planning with learned models for both intra-trajectory and inter-trajectory goals. With thorough experimental evaluations, we demonstrate that GOPlan achieves state-of-the-art performance on various offline multi-goal navigation and manipulation tasks. Moreover, our results highlight the superior ability of GOPlan to handle small data budgets and generalize to OOD goals.
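
The general idea behind the second phase, planning with a learned model to produce imagined goal-conditioned transitions, can be illustrated with a toy sketch. This is not the authors' implementation: the names toy_model, toy_prior, plan_action, and imagine_trajectory are hypothetical, the 1-D additive dynamics stand in for a learned model, the two-mode random sampler stands in for a pretrained prior policy, and the candidate count and horizon are arbitrary assumptions.

```python
import random


def toy_model(state, action):
    """Hypothetical stand-in for a learned dynamics model (1-D, additive)."""
    return state + action


def toy_prior(state, goal, rng):
    """Hypothetical stand-in for a pretrained prior policy.

    It samples from two action modes (around -1 and +1) and, in this toy,
    ignores both the state and the goal.
    """
    mode = rng.choice([-1.0, 1.0])
    return mode + rng.gauss(0.0, 0.1)


def plan_action(state, goal, rng, num_candidates=16, horizon=5):
    """Return the first action of the imagined rollout ending closest to the goal."""
    best_action, best_dist = None, float("inf")
    for _ in range(num_candidates):
        first_action = toy_prior(state, goal, rng)
        s = toy_model(state, first_action)
        # Continue the imagined rollout with further prior samples.
        for _ in range(horizon - 1):
            s = toy_model(s, toy_prior(s, goal, rng))
        dist = abs(goal - s)
        if dist < best_dist:
            best_action, best_dist = first_action, dist
    return best_action


def imagine_trajectory(start, goal, steps=20, seed=0):
    """Collect imagined (state, action, goal, next_state) tuples by planning."""
    rng = random.Random(seed)
    state, trajectory = start, []
    for _ in range(steps):
        action = plan_action(state, goal, rng)
        next_state = toy_model(state, action)
        trajectory.append((state, action, goal, next_state))
        state = next_state
    return trajectory
```

Calling imagine_trajectory(0.0, 5.0) returns a list of 20 transition tuples that could, in principle, be used as extra training data for a policy. How GOPlan actually scores plans, selects goals, and fine-tunes the policy is described in the paper itself and is not reproduced here.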

Comments:	Spotlight Presentation at Goal-conditioned Reinforcement Learning Workshop at NeurIPS 2023
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2310.20025 [cs.LG]
	(or arXiv:2310.20025v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2310.20025
Journal reference:	Transactions on Machine Learning Research (05/2024)

Submission history

From: Mianchu Wang
[v1] Mon, 30 Oct 2023 21:19:52 UTC (1,653 KB)
[v2] Sun, 28 Jan 2024 15:04:34 UTC (2,300 KB)
[v3] Thu, 16 May 2024 14:08:55 UTC (2,672 KB)
