## On Polynomial Sized MDP Succinct Policies

**Paolo Liberatore**
*Journal of Artificial Intelligence Research*

Policies of Markov Decision Processes (MDPs) determine the next
action to execute from the current state and, possibly, the
history (the past states). When the number of states is large,
succinct representations are often used to compactly represent
both the MDPs and the policies in a reduced amount of space. In
this paper, some problems related to the size of succinctly
represented policies are analyzed. Namely, it is shown that some
MDPs have policies that can only be represented in space
super-polynomial in the size of the MDP, unless the polynomial
hierarchy collapses. This fact motivates the study of the
problem of deciding whether a given MDP has a policy of a given
size and reward. Since some algorithms for MDPs work by finding
a succinct representation of the value function, the problem of
deciding the existence of a succinct representation of a value
function of a given size and reward is also considered.

@article{libe-04,
title = {On Polynomial Sized MDP Succinct Policies},
year = {2004},
author = {Liberatore, Paolo},
journal = {Journal of Artificial Intelligence Research},
pages = {551--577},
volume = {21},
}

HTTP download.

doi: doi:10.1613/jair.1134