Multi-actions vanilla policy Gradient

As a small extension to the previous policy gradient implementation we discussed, we are now going to study how to support multiple actions (i.e. num_actions > 2) in the policy network.
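To make the idea concrete, here is a minimal sketch of what a multi-action policy output could look like: a softmax over one logit per action, from which an action is sampled. This is only an illustration in plain Python and not the code from the post; the function names (`softmax`, `sample_action`, `log_prob_gradient`) and the example logits are assumptions made for this sketch.

```python
import math
import random


def softmax(logits):
    """Turn one raw score per action into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def log_prob_gradient(logits, action):
    """Gradient of log pi(action) w.r.t. the logits: one_hot(action) - probs."""
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


# Hypothetical logits for num_actions = 4 (values chosen arbitrarily).
example_logits = [0.5, -1.0, 2.0, 0.0]
example_probs = softmax(example_logits)
example_action = sample_action(example_logits, random.Random(0))
```

With two actions a single sigmoid output is enough, but with more actions the network needs one output per action, and the softmax keeps those outputs normalized into a valid distribution.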


2019/03/09 08:54

Full policy Gradient agent for Reinforcement Learning

This time we are going to build a full policy gradient implementation and train it on the OpenAI CartPole environment. As opposed to the previous simple policy gradient implementation, this time we will need to handle the previous states to decide which actions to take, and the training network will become slightly more complex.


2019/03/08 08:58

Simple Policy gradient Training on armed bandit

In this post, we are going to build a simple policy gradient experiment on an “n-armed bandit” problem.
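As a rough illustration of this kind of experiment, the sketch below trains softmax action preferences on a small bandit with a policy gradient update. It is not the implementation from the post: the function name `train_bandit`, the Gaussian reward model, the running-average baseline, and all parameter values are assumptions chosen for this example.

```python
import math
import random


def train_bandit(arm_means, episodes=2000, lr=0.1, noise=0.5, seed=0):
    """Learn one preference per arm with a softmax policy gradient update."""
    rng = random.Random(seed)
    n = len(arm_means)
    prefs = [0.0] * n   # one preference (logit) per arm
    baseline = 0.0      # running average of all rewards seen so far

    for t in range(1, episodes + 1):
        # Softmax over the preferences gives the action probabilities.
        m = max(prefs)
        exps = [math.exp(p - m) for p in prefs]
        total = sum(exps)
        probs = [e / total for e in exps]

        # Pull one arm and observe a noisy reward (assumed Gaussian here).
        arm = rng.choices(range(n), weights=probs, k=1)[0]
        reward = rng.gauss(arm_means[arm], noise)

        # Update the baseline, then compare the reward against it.
        baseline += (reward - baseline) / t
        advantage = reward - baseline

        # Gradient of log pi(arm) w.r.t. the preferences is one_hot(arm) - probs.
        for i in range(n):
            grad = (1.0 if i == arm else 0.0) - probs[i]
            prefs[i] += lr * advantage * grad

    return prefs


# Hypothetical 4-armed bandit; arm 2 has the highest mean reward.
learned_prefs = train_bandit([0.1, -0.4, 1.5, 0.3])
best_arm = learned_prefs.index(max(learned_prefs))
```

There is no state in a bandit problem, so the "policy" is just this vector of preferences; the update raises the preference of an arm when its reward beats the baseline and lowers it otherwise.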


2019/03/07 13:27

QNetwork learning

Continuing on my current “Reinforcement Learning” path, we are now going to try the Q-network implementation, which we will again train on the FrozenLake environment.


2019/03/05 21:19

Probabilistic QTable learning

As mentioned in my previous post on this subject, I feel it could be worth investigating this “Q table learning” algorithm a bit further. More specifically, I want to try to introduce some kind of probabilistic action management in the system… even if I'm not sure yet what kind of results this will give me. Let's just try and see ;-)


2019/03/05 08:42
