Learning and planning in stochastic interactive environments with the presence of sparse dependence structures