Showing posts from June, 2017


For an introduction to Reinforcement Learning, its basic terminology, concepts and types, read Reinforcement Learning - Part 1 by following this link:
Q-learning is a reinforcement learning algorithm. Its Q function originates from the Bellman equation used in model-based reinforcement learning and can be thought of as a different kind of value function. Its values are called Q values and are denoted by Q(s,a): the Q value of being in state 's' and taking action 'a'.
$$Q(s,a) = R(s) + \gamma \sum_{s'} P(s,a,s') \max_{a'} Q(s',a')$$

It can be defined as the value of arriving in state 's', leaving via action 'a', and proceeding optimally thereafter.
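To make the equation concrete, here is a minimal sketch that repeatedly applies it to a tiny problem until the Q values settle. The two-state MDP, its rewards, the discount factor of 0.5, and the names `P`, `R`, and `q_iteration` are all assumptions invented for illustration; they are not from the original post. Note also that this is not the model-free Q-learning update: it simply iterates the equation above, assuming the transition probabilities P and rewards R are known.

```python
# Hypothetical two-state MDP, invented for illustration only.
STATES = [0, 1]
ACTIONS = [0, 1]          # 0 = stay in the current state, 1 = move to the other state
GAMMA = 0.5               # discount factor (gamma in the equation)
R = {0: 0.0, 1: 1.0}      # R(s): reward for being in state s


def P(s, a, s_next):
    """Transition probability P(s, a, s'); deterministic in this toy example."""
    target = s if a == 0 else 1 - s
    return 1.0 if s_next == target else 0.0


def q_iteration(sweeps=100):
    """Repeatedly apply Q(s,a) = R(s) + gamma * sum_s' P(s,a,s') * max_a' Q(s',a')."""
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(sweeps):
        # Build the new table from the previous one (synchronous update).
        Q = {
            (s, a): R[s] + GAMMA * sum(
                P(s, a, s2) * max(Q[(s2, a2)] for a2 in ACTIONS)
                for s2 in STATES
            )
            for s in STATES
            for a in ACTIONS
        }
    return Q


Q = q_iteration()
for (s, a), value in sorted(Q.items()):
    print(f"Q({s},{a}) = {value:.3f}")
```

For this toy problem the values settle at Q(0,0) = 0.5, Q(0,1) = 1.0, Q(1,0) = 2.0 and Q(1,1) = 1.5. In state 0 it is better to move to the rewarding state, and in state 1 it is better to stay there.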
Also,
$$V(s) = \max_{a} Q(s,a) \qquad \pi(s) = \arg\max_{a} Q(s,a)$$
V(s) is a value, i.e. it returns a number (a scalar), whereas π(s) returns an action. Hence…