Investigating the Impact of Hierarchical Decomposition in Reinforcement Learning on a Toy Navigation Task
Abstract
Hierarchical Reinforcement Learning (HRL) provides promising techniques for decomposing complex tasks into subproblems, facilitating efficient learning and exploration. This paper compares a two-level hierarchical reinforcement learning agent with a flat RL baseline on a deterministic grid-world navigation task. Experimental results demonstrate that the hierarchical agent significantly outperforms the flat agent in learning speed and success rate in sparse-reward environments. This study provides empirical insights into the benefits of hierarchy in reinforcement learning through controlled toy experiments.
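To make the comparison concrete, the sketch below sets up a deterministic, sparse-reward grid world with a flat tabular Q-learning baseline and a two-level agent whose high-level policy selects subgoal cells for a simple low-level controller. The grid size, subgoal set, hyperparameters, and the hand-coded low-level controller are illustrative assumptions, not details taken from the paper.

import random

SIZE, START, GOAL = 8, (0, 0), (7, 7)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # primitive actions

def env_step(state, move):
    """Deterministic transition; reward 1 only on reaching the goal (sparse)."""
    nxt = (min(max(state[0] + move[0], 0), SIZE - 1),
           min(max(state[1] + move[1], 0), SIZE - 1))
    return nxt, float(nxt == GOAL), nxt == GOAL

def eps_greedy(q, state, n, eps):
    """Epsilon-greedy choice among n options for the given state."""
    if random.random() < eps:
        return random.randrange(n)
    return max(range(n), key=lambda i: q.get((state, i), 0.0))

def flat_agent(episodes=500, alpha=0.1, gamma=0.95, eps=0.1, horizon=200):
    """Flat baseline: a single Q-table over (state, primitive action)."""
    q = {}
    for _ in range(episodes):
        s = START
        for _ in range(horizon):
            a = eps_greedy(q, s, len(MOVES), eps)
            s2, r, done = env_step(s, MOVES[a])
            target = r + gamma * max(q.get((s2, b), 0.0) for b in range(len(MOVES)))
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (target - q.get((s, a), 0.0))
            s = s2
            if done:
                break
    return q

def hierarchical_agent(episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    """Two-level agent: the high level learns to pick subgoal cells; a simple
    hand-coded low-level controller walks to the chosen subgoal."""
    subgoals = [(3, 3), (7, 3), (3, 7), GOAL]  # assumed subgoal set
    q = {}
    for _ in range(episodes):
        s = START
        for _ in range(50):  # at most 50 high-level decisions per episode
            if s == GOAL:
                break
            g_idx = eps_greedy(q, s, len(subgoals), eps)
            g, s0, r_total, steps = subgoals[g_idx], s, 0.0, 0
            while s != g and steps < 50:  # low-level option: step toward subgoal
                dx = 1 if g[0] > s[0] else -1 if g[0] < s[0] else 0
                dy = 1 if g[1] > s[1] else -1 if g[1] < s[1] else 0
                move = (dx, 0) if dx else (0, dy)
                s, r, _ = env_step(s, move)
                r_total += r
                steps += 1
            # SMDP-style update for the high-level subgoal choice
            # (discounting over the option's duration is omitted for brevity).
            best = max(q.get((s, j), 0.0) for j in range(len(subgoals)))
            q[(s0, g_idx)] = q.get((s0, g_idx), 0.0) + alpha * (
                r_total + gamma * best - q.get((s0, g_idx), 0.0))
    return q

if __name__ == "__main__":
    random.seed(0)
    flat_q, hier_q = flat_agent(), hierarchical_agent()
    print("flat Q-table size:", len(flat_q), "| hierarchical Q-table size:", len(hier_q))

In this toy setup the flat agent must stumble onto the distant goal through random exploration before any learning signal appears, whereas the hierarchical agent only has to rank a handful of subgoals, which is the intuition behind the reported gains in learning speed under sparse rewards.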
Article Details

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/