David Martínez: Code

The REX-D library is a C++ library that implements algorithms combining reinforcement learning and active learning. It has the following features:

Implements the V-MIN [1] and REX-D [2] algorithms.
Provides an algorithm that learns models with action and exogenous effects from a set of input state transitions. The learned models can be used by standard probabilistic task planners [3].
Integrates Tobias Lang's implementation of Pasula et al.'s learner [4] and the PRADA planner [5].
Provides a wrapper for IPPC planners. It currently works with the PROST [6] and the G-PACK [7] planners, but it should be easy to integrate with other planners.
Implements teacher guidance [1] to facilitate interaction with a teacher.
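
As a rough illustration of why a common wrapper makes it easy to plug in other planners, the sketch below hides each planner behind one abstract interface. Every name in it (PlannerWrapper, FixedActionPlanner, PlannerRegistry, planNextAction) is hypothetical and is not taken from the library; the stub backend only stands in for a real call to an external planner.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical interface: these names are illustrative, not the library's API.
class PlannerWrapper {
public:
    virtual ~PlannerWrapper() = default;
    virtual std::string name() const = 0;
    // Returns the next action to execute from the given state description.
    virtual std::string planNextAction(const std::string& state) = 0;
};

// Stub backend that always proposes the same action. A real backend would
// translate the state for an external planner and parse its answer.
class FixedActionPlanner : public PlannerWrapper {
public:
    FixedActionPlanner(std::string name, std::string action)
        : name_(std::move(name)), action_(std::move(action)) {}
    std::string name() const override { return name_; }
    std::string planNextAction(const std::string& /*state*/) override {
        return action_;
    }
private:
    std::string name_;
    std::string action_;
};

// Keeps the available planners addressable by name.
class PlannerRegistry {
public:
    void add(std::unique_ptr<PlannerWrapper> planner) {
        std::string key = planner->name();
        planners_[key] = std::move(planner);
    }
    // Throws std::out_of_range if no planner was registered under that name.
    PlannerWrapper& get(const std::string& name) {
        return *planners_.at(name);
    }
private:
    std::map<std::string, std::unique_ptr<PlannerWrapper>> planners_;
};
```

Under this assumed design, supporting another planner means writing one more subclass and registering it; the code that asks for the next action does not change.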

Code

The code is available at Bitbucket.

Documentation

Documentation for the code is available here.

Considerations about the library

Most features should work in the master branch.
A few features, such as subgoals, may only work in the v1.0 tag or earlier tags.
Teacher guidance only works with the G-PACK planner.
Exogenous effects only work with the PROST planner.
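
The two planner restrictions above can be written down as a small compatibility check. This is only a sketch: the enum and function names are made up for illustration and are not part of the library's API.

```cpp
// Hypothetical names for illustration only; not the library's API.
enum class Planner { Prada, Prost, GPack };
enum class Feature { TeacherGuidance, ExogenousEffects };

// True when the feature can be used with the planner, following the
// considerations listed above.
inline bool isSupported(Feature feature, Planner planner) {
    switch (feature) {
        case Feature::TeacherGuidance:
            return planner == Planner::GPack;   // only works with G-PACK
        case Feature::ExogenousEffects:
            return planner == Planner::Prost;   // only works with PROST
    }
    return false;  // unreachable for valid enum values
}
```

One consequence of these two rules is that no single planner in this sketch supports both teacher guidance and exogenous effects at the same time.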

[1] V-MIN: Efficient reinforcement learning through demonstrations and relaxed reward demands
D. Martínez, G. Alenyà, and C. Torras
Proceedings of the AAAI Conference on Artificial Intelligence, 2015, pp. 2857–2863


[2] Relational reinforcement learning with guided demonstrations
D. Martínez, G. Alenyà, and C. Torras
Artificial Intelligence, 2017, 247, pp. 295–312


[3] Learning Relational Dynamics of Stochastic Domains for Planning
D. Martínez, G. Alenyà, C. Torras, T. Ribeiro, and K. Inoue
International Conference on Automated Planning and Scheduling, 2016, pp. 235–243


[4] Learning symbolic models of stochastic domains
H. M. Pasula, L. S. Zettlemoyer, and L. P. Kaelbling
Journal of Artificial Intelligence Research, 2007, 29(1), pp. 309–352

[5] Planning with noisy probabilistic relational rules
T. Lang and M. Toussaint
Journal of Artificial Intelligence Research, 2010, 39, pp. 1–49

[6] PROST: Probabilistic Planning Based on UCT
T. Keller and P. Eyerich
International Conference on Automated Planning and Scheduling, 2012, pp. 119–127

[7] LRTDP Versus UCT for Online Probabilistic Planning
A. Kolobov, Mausam, and D. S. Weld
Proceedings of the AAAI Conference on Artificial Intelligence, 2012, pp. 1786–1792
