Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks

Donghoon Kim, Minjong Yoo, Honguk Woo

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Goal-conditioned (GC) policy learning often faces a challenge arising from the sparsity of rewards, when confronting long-horizon goals. To address the challenge, we explore skill-based GC policy learning in offline settings, where skills are acquired from existing data and long-horizon goals are decomposed into sequences of near-term goals that align with these skills. Specifically, we present an 'offline GC policy learning via skill-step abstraction' framework (GLVSA) tailored for tackling long-horizon GC tasks affected by goal distribution shifts. In the framework, a GC policy is progressively learned offline in conjunction with the incremental modeling of skill-step abstractions on the data. We also devise a GC policy hierarchy that not only accelerates GC policy learning within the framework but also allows for parameter-efficient fine-tuning of the policy. Through experiments with the maze and Franka kitchen environments, we demonstrate the superiority and efficiency of our GLVSA framework in adapting GC policies to a wide range of long-horizon goals. The framework achieves competitive zero-shot and few-shot adaptation performance, outperforming existing GC policy learning and skill-based methods.

Original languageEnglish
Title of host publicationProceedings of the 33rd International Joint Conference on Artificial Intelligence, IJCAI 2024
EditorsKate Larson
PublisherInternational Joint Conferences on Artificial Intelligence
Pages4282-4290
Number of pages9
ISBN (Electronic)9781956792041
StatePublished - 2024
Event33rd International Joint Conference on Artificial Intelligence, IJCAI 2024 - Jeju, Korea, Republic of
Duration: 3 Aug 20249 Aug 2024

Publication series

NameIJCAI International Joint Conference on Artificial Intelligence
ISSN (Print)1045-0823

Conference

Conference33rd International Joint Conference on Artificial Intelligence, IJCAI 2024
Country/TerritoryKorea, Republic of
CityJeju
Period3/08/249/08/24

Fingerprint

Dive into the research topics of 'Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks'. Together they form a unique fingerprint.

Cite this