TY - GEN
T1 - Approximability of constant-horizon constrained POMDP
AU - Khonji, Majid
AU - Jasour, Ashkan
AU - Williams, Brian
N1 - Publisher Copyright:
© 2019 International Joint Conferences on Artificial Intelligence. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Partially Observable Markov Decision Process (POMDP) is a fundamental framework for planning and decision making under uncertainty. POMDP is known to be intractable to solve or even approximate when the planning horizon is long (i.e., within a polynomial number of time steps). Constrained POMDP (C-POMDP) allows constraints to be specified on some aspects of the policy in addition to the objective function. When the constraints involve bounding the probability of failure, the problem is called Chance-Constrained POMDP (CC-POMDP). Our first contribution is a reduction from CC-POMDP to C-POMDP and a novel Integer Linear Programming (ILP) formulation. Thus, any algorithm for the latter problem can be utilized to solve any instance of the former. Second, we show that unlike POMDP, when the length of the planning horizon is constant, (C)C-POMDP is NP-Hard. Third, we present the first Fully Polynomial Time Approximation Scheme (FPTAS) that computes (near) optimal deterministic policies for constant-horizon (C)C-POMDP in polynomial time.
AB - Partially Observable Markov Decision Process (POMDP) is a fundamental framework for planning and decision making under uncertainty. POMDP is known to be intractable to solve or even approximate when the planning horizon is long (i.e., within a polynomial number of time steps). Constrained POMDP (C-POMDP) allows constraints to be specified on some aspects of the policy in addition to the objective function. When the constraints involve bounding the probability of failure, the problem is called Chance-Constrained POMDP (CC-POMDP). Our first contribution is a reduction from CC-POMDP to C-POMDP and a novel Integer Linear Programming (ILP) formulation. Thus, any algorithm for the latter problem can be utilized to solve any instance of the former. Second, we show that unlike POMDP, when the length of the planning horizon is constant, (C)C-POMDP is NP-Hard. Third, we present the first Fully Polynomial Time Approximation Scheme (FPTAS) that computes (near) optimal deterministic policies for constant-horizon (C)C-POMDP in polynomial time.
UR - http://www.scopus.com/inward/record.url?scp=85074910941&partnerID=8YFLogxK
U2 - 10.24963/ijcai.2019/775
DO - 10.24963/ijcai.2019/775
M3 - Conference contribution
AN - SCOPUS:85074910941
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 5583
EP - 5590
BT - Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019
A2 - Kraus, Sarit
T2 - 28th International Joint Conference on Artificial Intelligence, IJCAI 2019
Y2 - 10 August 2019 through 16 August 2019
ER -