Abstract
In this paper, we study the variance minimization problem of Markov decision processes (MDPs) in which the policy is parameterized by action selection......