Estimating Q(s,s') with Deep Deterministic Dynamics Gradients

Ashley D. Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski

TL;DR

A new Q function for off-policy reinforcement learning, Q(s,s'), that is defined over state transitions and does not rely on actions.

Venue

In Proceedings of the 37th International Conference on Machine Learning (ICML 2020).

BibTeX

@article{edwards2020estimating,
title={Estimating Q(s,s') with Deep Deterministic Dynamics Gradients},
author={Ashley D. Edwards and Himanshu Sahni and Rosanne Liu and Jane Hung and Ankit Jain and Rui Wang and Adrien Ecoffet and Thomas Miconi and Charles Isbell and Jason Yosinski},
year={2020},
eprint={2002.09505},
archivePrefix={arXiv},
primaryClass={cs.LG}
}

Date

February 2020

Links

ICML 2020 arXiv Video Code