
Generalization of Temporal Logic Tasks via Future Dependent Options

EasyChair Preprint 12510, version 1

Versions: 1, 2
30 pages. Date: March 15, 2024

Abstract

Temporal logic (TL) tasks consist of complex, temporally extended subgoals and are common in many real-world applications, such as service and navigation robots. However, it is often inefficient or even infeasible to train reinforcement learning (RL) agents to solve multiple TL tasks, since rewards in these tasks are sparse and non-Markovian. A promising solution is to learn task-conditioned policies that can generalize zero-shot to new TL tasks without further training. However, because of practical issues such as lossy symbolic observations and the long time horizons of TL tasks, previous works suffer from sample inefficiency in training and sub-optimality (or even infeasibility) in task execution. To tackle these issues, this paper proposes an option-based framework for generalizing TL tasks, consisting of an option training part and a task execution part, with innovations in both. In option training, we propose a novel approach for learning options that depend on future subgoals. Additionally, we propose to train a multi-step value function that propagates the rewards of satisfying future subgoals more efficiently in long-horizon tasks. In task execution, to ensure optimality and safety, we propose a model-free model predictive control (MPC) planner for option selection, circumventing the learning of a transition model that previous MPC planners require. In experiments on three different domains, we evaluate the generalization capability of the agent trained by the proposed method and show a significant advantage over previous methods.
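
To illustrate why a multi-step target can propagate sparse rewards faster than a one-step target, here is a minimal sketch of the generic n-step return from standard RL. It is not the value function proposed in the paper; the function name `n_step_target`, its parameters, and the toy numbers are assumptions made purely for illustration.

```python
from typing import Sequence


def n_step_target(rewards: Sequence[float],
                  bootstrap_value: float,
                  gamma: float = 0.99) -> float:
    """Generic n-step return (hypothetical helper, not the paper's method).

    Computes r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * V,
    where V is a bootstrap estimate of the value after the last reward.
    """
    target = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for r in reversed(rewards):
        target = r + gamma * target
    return target


# Toy sparse-reward trajectory: the only reward arrives at the third step.
rewards = [0.0, 0.0, 1.0]

# One-step target from the first state, with an untrained (zero) value estimate.
one_step = n_step_target(rewards[:1], bootstrap_value=0.0, gamma=0.5)

# Three-step target from the same state already reflects the later reward.
three_step = n_step_target(rewards, bootstrap_value=0.0, gamma=0.5)
```

With an untrained value estimate, the one-step target carries no signal from the later reward, while the three-step target does after a single update. This is the general intuition behind multi-step targets in long-horizon, sparse-reward settings.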

Keyphrases: Option, Reinforcement Learning, Temporal logic task, generalization

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:12510,
  author    = {Duo Xu and Faramarz Fekri},
  title     = {Generalization of Temporal Logic Tasks via Future Dependent Options},
  howpublished = {EasyChair Preprint 12510},
  year      = {EasyChair, 2024}}