I know that v10 supports both Machine Learning and Finite Markov Processes, and the Q-learning algorithm draws on both. Specifically, Q-learning finds an optimal action-selection policy for any given finite Markov decision process (MDP).
Are there any specific combinations of functions and options that I should use as a starting point, or should I implement this from scratch?
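
For concreteness, here is a minimal sketch of the tabular Q-learning update in question. It is written in Python purely as a language-neutral illustration and does not use any v10 built-in. The function name `q_learning`, the toy three-state chain, and all parameter values (`alpha`, `gamma`, `epsilon`, episode counts) are assumptions made up for this example.

```python
import random

def q_learning(transitions, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=50, seed=0):
    """Tabular Q-learning on a deterministic finite MDP.

    transitions[s][a] is a tuple (next_state, reward, done).
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0  # assumption: every episode starts in state 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection, with ties broken at random
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(q[s])
                a = rng.choice([i for i in range(n_actions) if q[s][i] == best])
            s_next, r, done = transitions[s][a]
            # Q-learning target: reward plus discounted best next-state value
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            if done:
                break
            s = s_next
    return q

# Hypothetical 3-state chain: moving RIGHT from the last state ends the
# episode with reward 1; every other move gives reward 0.
LEFT, RIGHT = 0, 1
transitions = [
    [(0, 0.0, False), (1, 0.0, False)],  # state 0: LEFT stays put
    [(0, 0.0, False), (2, 0.0, False)],  # state 1
    [(1, 0.0, False), (2, 1.0, True)],   # state 2: RIGHT reaches the goal
]

q = q_learning(transitions, n_states=3, n_actions=2)
# Greedy policy read off the learned table (one action index per state)
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(3)]
print(policy)
```

On this toy chain the greedy policy should end up choosing `RIGHT` in every state. The core of the method is the single update line for `q[s][a]`; everything else is bookkeeping that a built-in MDP representation might replace.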

