Deep Q Learning is Simple with PyTorch | Full Tutorial 2020





The PyTorch deep learning framework makes coding a deep Q-learning agent in Python easier than ever. We're going to code up the simplest possible deep Q-learning agent and show that we only need a replay memory to get serious results in the Lunar Lander environment from the OpenAI Gym. We don't really need the target network, though it is known to help the deep Q-learning algorithm converge.
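
For reference, here is a minimal sketch of that kind of agent: one Q-network plus a NumPy replay buffer and an epsilon-greedy policy. The layer sizes and hyperparameters below are illustrative defaults, not necessarily the exact values used in the video.

    import numpy as np
    import torch as T
    import torch.nn as nn
    import torch.optim as optim

    class DeepQNetwork(nn.Module):
        def __init__(self, lr, input_dims, n_actions):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(input_dims, 256), nn.ReLU(),
                nn.Linear(256, 256), nn.ReLU(),
                nn.Linear(256, n_actions),
            )
            self.optimizer = optim.Adam(self.parameters(), lr=lr)
            self.loss = nn.MSELoss()

        def forward(self, state):
            return self.net(state)

    class Agent:
        def __init__(self, input_dims, n_actions, lr=1e-3, gamma=0.99,
                     epsilon=1.0, eps_min=0.01, eps_dec=5e-4,
                     mem_size=100_000, batch_size=64):
            self.gamma, self.epsilon = gamma, epsilon
            self.eps_min, self.eps_dec = eps_min, eps_dec
            self.n_actions, self.batch_size = n_actions, batch_size
            self.mem_size, self.mem_cntr = mem_size, 0
            self.Q = DeepQNetwork(lr, input_dims, n_actions)
            # Replay memory as plain NumPy ring buffers
            self.state_mem = np.zeros((mem_size, input_dims), dtype=np.float32)
            self.new_state_mem = np.zeros((mem_size, input_dims), dtype=np.float32)
            self.action_mem = np.zeros(mem_size, dtype=np.int64)
            self.reward_mem = np.zeros(mem_size, dtype=np.float32)
            self.done_mem = np.zeros(mem_size, dtype=np.bool_)

        def store_transition(self, s, a, r, s_, done):
            i = self.mem_cntr % self.mem_size  # wrap around, overwrite oldest
            self.state_mem[i], self.new_state_mem[i] = s, s_
            self.action_mem[i], self.reward_mem[i], self.done_mem[i] = a, r, done
            self.mem_cntr += 1

        def choose_action(self, obs):
            if np.random.random() > self.epsilon:
                state = T.tensor(np.array([obs]), dtype=T.float32)
                return T.argmax(self.Q(state)).item()
            return np.random.randint(self.n_actions)

        def learn(self):
            if self.mem_cntr < self.batch_size:
                return  # wait until we can fill one minibatch
            max_mem = min(self.mem_cntr, self.mem_size)
            batch = np.random.choice(max_mem, self.batch_size, replace=False)
            states = T.tensor(self.state_mem[batch])
            actions = T.tensor(self.action_mem[batch])
            rewards = T.tensor(self.reward_mem[batch])
            new_states = T.tensor(self.new_state_mem[batch])
            dones = T.tensor(self.done_mem[batch])

            q_pred = self.Q(states).gather(1, actions.unsqueeze(1)).squeeze(1)
            q_next = self.Q(new_states).max(dim=1)[0].detach()
            q_next[dones] = 0.0  # no bootstrap value past a terminal state
            q_target = rewards + self.gamma * q_next

            self.Q.optimizer.zero_grad()
            loss = self.Q.loss(q_pred, q_target)
            loss.backward()
            self.Q.optimizer.step()
            self.epsilon = max(self.eps_min, self.epsilon - self.eps_dec)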

Learn how to turn deep reinforcement learning papers into code:

Deep Q Learning:
https://www.udemy.com/course/deep-q-learning-from-paper-to-code/?couponCode=DQN-DEC-2020

Actor Critic Methods:
https://www.udemy.com/course/actor-critic-methods-from-paper-to-code-with-pytorch/?couponCode=AC-DEC-2020

Reinforcement Learning Fundamentals:
https://www.manning.com/livevideo/reinforcement-learning-in-motion

Come hang out on Discord here:
https://discord.gg/Zr4VCdv

Website: https://www.neuralnet.ai
Github: https://github.com/philtabor
Twitter: https://twitter.com/MLWithPhil


Comment List

  • Machine Learning with Phil
    December 26, 2020

    This content is sponsored by my Udemy courses. Level up your skills by learning to turn papers into code. See the links in the description.

  • Machine Learning with Phil
    December 26, 2020

Why did you set the epsilon to 1?

  • Machine Learning with Phil
    December 26, 2020

    Next time use font size 22 at least.

  • Machine Learning with Phil
    December 26, 2020

Thank you, Dr. Phil, for an amazing video. When I try to run this on Colab, I get this error: "expected scalar type Float but found Double" at either the 18th or 23rd line of main**.py. I am trying it on the CartPole environment, and I have also tried changing the observation (line 16) to float32, but it didn't work.
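
    A likely fix, assuming the error comes from Gym returning float64 observations while the network's weights are float32: cast the observation when building the tensor (variable names here are illustrative).

        import numpy as np
        import torch as T

        observation = np.random.rand(4)                 # CartPole obs arrives as float64
        state = T.tensor(observation, dtype=T.float32)  # cast to match float32 weights
        # Alternatively, convert the whole network instead: q_eval.double()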

  • Machine Learning with Phil
    December 26, 2020

    How did you know to use [8] as input dims?
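
    For context: the input dimensions come straight from the environment's observation space, and LunarLander-v2 observations are 8-dimensional vectors. It is presumably written as the list [8] so it can be unpacked into the first Linear layer.

        import gym

        env = gym.make('LunarLander-v2')
        print(env.observation_space.shape)  # (8,): position, velocity, angle,
                                            # angular velocity, and leg-contact flags
        print(env.action_space.n)           # 4 discrete actions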

  • Machine Learning with Phil
    December 26, 2020

Hi Phil, is it correct that epsilon already reaches the eps_min of 0.01 after only 11 episodes? Does that mean we have almost no exploration anymore after 11 episodes?
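
    The arithmetic behind that observation, assuming a linear decay of 5e-4 per learning step (the exact value in the video may differ):

        epsilon, eps_min, eps_dec = 1.0, 0.01, 5e-4
        steps_to_min = (epsilon - eps_min) / eps_dec  # 1980 learning steps
        print(steps_to_min)
        # Early LunarLander episodes often run 150-200 steps, so epsilon can
        # plausibly hit eps_min after roughly 10-13 episodes.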

  • Machine Learning with Phil
    December 26, 2020

Thanks for the help.

  • Machine Learning with Phil
    December 26, 2020

Hi Phil, is there any reason you are using the forward() method on your neural net instead of calling it directly as Q_eval(), i.e. using __call__()? I believe calling forward() directly is generally unsafe, since there's potentially some necessary magic involving hooks going on under the surface that you might miss.
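
    The distinction in question, in a self-contained form (the nn.Linear stand-in below is illustrative):

        import torch as T
        import torch.nn as nn

        Q_eval = nn.Linear(8, 4)          # stand-in for the Q-network
        state = T.rand(1, 8)
        q = Q_eval(state)                 # __call__: runs registered hooks, then forward()
        q_direct = Q_eval.forward(state)  # same result here, but silently skips any hooks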

  • Machine Learning with Phil
    December 26, 2020

    I really appreciate this simple agent walkthrough. I find it easy to digest compared to other courses I've seen, and doesn't try to explain the math behind it TOO much, which for novices is pretty nice.

    My concern, though, is that because our agent is learning every step of every episode, it is also decaying epsilon every step. This leads to a much more rapid and unpredictable descent of epsilon (since each episode has a varying number of steps) over the lifetime of the agent compared to other agents I have seen (full decay by episode 15-25).

    Is this intentional? If so, could you elaborate on why we would want epsilon to be fully decayed within 5% of the agent's training time?
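
    One common alternative, sketched with illustrative numbers: decay once per episode instead of per step, sized so exploration spans a chosen fraction of training.

        eps, eps_min, n_games = 1.0, 0.01, 500
        # Multiplicative decay that reaches eps_min halfway through training
        decay = (eps_min / eps) ** (1.0 / (0.5 * n_games))
        for episode in range(n_games):
            eps = max(eps_min, eps * decay)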

  • Machine Learning with Phil
    December 26, 2020

    Thanks A LOT for making this tutorial!
    Coming from a non-CS background, coding is always a bottleneck for me but this video helped pass that phase with ease.

  • Machine Learning with Phil
    December 26, 2020

    Great content

  • Machine Learning with Phil
    December 26, 2020

Hi, Dr. Phil. Great work on the deep Q network implementation and demo. I have been following your tutorials for a while. I am currently building a DQN for a "multi-agent" collective: there is more than one agent in the system, but we treat them all as a single collection. Correspondingly, the state (agent1, agent2, …) and action (action1, action2, …) are collections describing the whole group. The trick is that we don't know the number of agents in advance, which makes it hard to define n_actions (if one agent has 8 actions, two would have 64). Does the DQN framework still apply here? If it does, could you give me some suggestions on how to modify it? Thanks in advance!

  • Machine Learning with Phil
    December 26, 2020

    Hi Dr. Phil,
    I'm a Ph.D. student, and I'm grateful beyond measure for your great work; it has been a tremendous support to my Ph.D.!
    I'm really looking for a multi-agent DQN; if possible, please offer one on YouTube.

  • Machine Learning with Phil
    December 26, 2020

Please don't tell people "you don't need any exposure to deep learning, etc." That's why people jump from project to project without understanding: they get excited.

  • Machine Learning with Phil
    December 26, 2020

At 33:54: "is our children learn… is our agent learning". Funny.

  • Machine Learning with Phil
    December 26, 2020

How can we save the deep Q model after the full run of training episodes? Thank you.
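
    One standard approach, as a sketch (the Sequential stand-in and file name are illustrative): save the network's state_dict after training and load it back into an identically shaped network.

        import torch as T
        import torch.nn as nn

        net = nn.Sequential(nn.Linear(8, 256), nn.ReLU(), nn.Linear(256, 4))  # stand-in DQN
        T.save(net.state_dict(), 'dqn_lunar_lander.pt')     # save weights after training
        net.load_state_dict(T.load('dqn_lunar_lander.pt'))  # restore into same architecture
        net.eval()                                          # switch to inference mode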

  • Machine Learning with Phil
    December 26, 2020

    nice

  • Machine Learning with Phil
    December 26, 2020

    Great tutorial!
    Can you make a video that builds a DQN from scratch using NumPy?

  • Machine Learning with Phil
    December 26, 2020

Can you give us the source code? Your GitHub is a total mess.

  • Machine Learning with Phil
    December 26, 2020

Hi Phil. Just found this channel, nice 🙂 I may be wrong, but I think there may be a problem in the learn process: mem_counter is never reset, so once it's hit batch_size, it will learn every time the learn function is called.
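
    For what it's worth, learning on every call once the counter passes the batch size appears to be intentional; the counter also serves as a ring-buffer write index via the modulo, so resetting it would break the overwrite-oldest behavior. A toy illustration:

        import numpy as np

        mem_size, mem_cntr = 5, 0
        memory = np.zeros(mem_size)
        for transition in range(12):                  # more transitions than slots
            memory[mem_cntr % mem_size] = transition  # wrap around, overwrite oldest
            mem_cntr += 1
        print(memory)  # [10. 11.  7.  8.  9.]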

  • Machine Learning with Phil
    December 26, 2020

    Nice video. Well explained.

  • Machine Learning with Phil
    December 26, 2020

I thought Q and Q* used different NNs, but that seems not to be the case here. Am I wrong?
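
    As the video notes, this implementation intentionally uses a single network for both the prediction and the bootstrap target; the original DQN paper keeps a separate, periodically synced target network. A minimal sketch of how one could be added (sizes and sync interval are illustrative):

        import copy
        import torch.nn as nn

        q_eval = nn.Sequential(nn.Linear(8, 256), nn.ReLU(), nn.Linear(256, 4))
        q_target = copy.deepcopy(q_eval)  # frozen copy used only for bootstrap targets

        def sync_target(step, every=1000):
            # Periodically copy the online weights into the target network
            if step % every == 0:
                q_target.load_state_dict(q_eval.state_dict())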

  • Machine Learning with Phil
    December 26, 2020

Hey Phil! You are a class apart from others in explaining these topics. I have a request: since reinforcement learning takes a lot of time when applied to real-world problems, wouldn't it be good to move your videos toward newer techniques like imitation learning, GANs, etc.?

  • Machine Learning with Phil
    December 26, 2020

Wondering how you got PyTorch to recognize np.bool for self.terminal_memory; it raised an error for me. I had to change the dtype to np.uint8.
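
    A likely explanation: the bare np.bool alias was deprecated in NumPy 1.20 and removed in 1.24; np.bool_ (or plain bool) still works, and uint8 is a fine workaround.

        import numpy as np
        import torch as T

        dones = np.zeros(64, dtype=np.bool_)     # np.bool_ works; bare np.bool was removed
        t_bool = T.tensor(dones)                 # torch.bool tensor, fine for masking
        t_u8 = T.tensor(dones.astype(np.uint8))  # the uint8 workaround also works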
