Deep Q Learning w/ DQN – Reinforcement Learning p.5





Hello and welcome to the first video about Deep Q-Learning and Deep Q Networks, or DQNs. Deep Q Networks are the deep learning/neural network versions of Q-Learning.

With DQNs, instead of a Q Table to look up values, you have a model that you inference (make predictions from), and rather than updating the Q table, you fit (train) your model.
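
For a rough picture of that difference, here is a minimal sketch contrasting a Q-table lookup/update with a DQN's predict/fit cycle. The network shape, state sizes, and variable names below are illustrative placeholders, not the tutorial's exact code:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Q-table: one row per discrete state, one column per action.
q_table = np.zeros((500, 4))
q_values = q_table[42]                 # "inference" is just a table lookup
q_table[42, 1] = 0.9                   # "training" is an in-place update of one cell

# DQN: a network maps a state observation to Q-values for every action.
model = Sequential([Dense(64, activation="relu", input_shape=(8,)),
                    Dense(4, activation="linear")])
model.compile(optimizer="adam", loss="mse")

state = np.random.random((1, 8))       # a batch containing a single observation
q_values = model.predict(state)        # inference replaces the lookup
targets = q_values.copy()
targets[0, 1] = 0.9                    # adjust only the chosen action's Q-value
model.fit(state, targets, verbose=0)   # fitting replaces the table update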

Text-based tutorial and sample code: https://pythonprogramming.net/deep-q-learning-dqn-reinforcement-learning-python-tutorial/

Channel membership: https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ/join
Discord: https://discord.gg/sentdex
Support the content: https://pythonprogramming.net/support-donate/
Twitter: https://twitter.com/sentdex
Instagram: https://instagram.com/sentdex
Facebook: https://www.facebook.com/pythonprogramming.net/
Twitch: https://www.twitch.tv/sentdex

#reinforcementlearning #machinelearning #python




Comment List

  • sentdex
    January 18, 2021

    Is there an explanation of how to choose the parameters for the neural network? I've watched the tutorial on that and multiple other tutorials over the net, but it's always the same: "We just use a conv net and set that to 64… hmmm, maybe 128 nodes, I dunno, whatever. Now copy all of those layers, I guess."
    I understand that you can't explain all of that in depth all the time, but I have yet to find a video that explains what layers to use and why, and how many nodes you need.

  • sentdex
    January 18, 2021

    Don't bother with people complaining about the drinking… thanks for your efforts. You are totally cool…

  • sentdex
    January 18, 2021

    cool!

  • sentdex
    January 18, 2021

    So, Professor Sentdex mentions in this video that (paraphrasing) none of the other deep Q-learning videos/tutorials really show some important steps ("…all the tutorials suck…"). Well, perhaps they don't suck, but they do gloss over or hide certain steps. This tutorial here (p.5 and p.6) truly does explain and show each step. It delivers. I've taken 2 courses on Udemy and trawled several other AI channels. This is the best DQN tutorial out there. It's a little long and each step is highly detailed. My advice is follow along… code as you watch. Thanks for this tutorial! (Would love to see a Paperspace tutorial!)

  • sentdex
    January 18, 2021

    How do you use metrics="accuracy" if you are using MSE? Isn't this a regression problem? I am a little confused.

  • sentdex
    January 18, 2021

    Sentdex, you and your YouTube channel are awesome! Great content and videos.

  • sentdex
    January 18, 2021

    I'm new to this. Can you tell me which line exactly makes the magic happen?

  • sentdex
    January 18, 2021

    excellent tutorials

  • sentdex
    January 18, 2021

    Thanks for an awesome video. I am wondering how you can train a sequential DQN, meaning the agent will be able to predict, let's say, the actions for the next 5 timesteps? Thanks a lot!

  • sentdex
    January 18, 2021

    Best channel, Thank you for this tutorial

  • sentdex
    January 18, 2021

    This guy has a tutorial on every fucking machine learning topic….😂😂😂🔥🔥🔥

  • sentdex
    January 18, 2021

    And I am the one who is wondering what drink he drinks. He got wings. The best tutor ever.

  • sentdex
    January 18, 2021

    A deep neural network can interpolate. The network scales much better because it can map the state space to a fixed number of nodes instead of needing a node for every state. The reasons it can scale and interpolate are related because the mapping is a kind of convolution of the state onto the network.

  • sentdex
    January 18, 2021

    Can we convert these actions with to_categorical and use categorical_crossentropy?

  • sentdex
    January 18, 2021

    Hi @sentdex,
    Thanks for this awesome content. Just wanted to ask one thing, is this by any chance Double Deep Q learning that you are implementing?

  • sentdex
    January 18, 2021

    Hundredth

  • sentdex
    January 18, 2021

    One can use np.newaxis to expand one dimension of a NumPy array.
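
A quick illustration of what this commenter describes; the observation shape is a toy example, just for demonstration:

import numpy as np

state = np.zeros((10, 10, 3))        # a single observation
batch = state[np.newaxis, ...]       # add a leading batch dimension
print(state.shape, batch.shape)      # (10, 10, 3) (1, 10, 10, 3)

# Equivalent ways of doing the same thing:
batch2 = np.expand_dims(state, axis=0)
batch3 = state.reshape(-1, *state.shape)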

  • sentdex
    January 18, 2021

    22:30 50K steps for one episode or 50K steps for all the episodes?

  • sentdex
    January 18, 2021

    I plan on doing Minecraft deep Q-learning.

  • sentdex
    January 18, 2021

    Dude, this is amazing. As a systems developer, I am using this for my master thesis on streaming algorithms and congestion control. You literally saved me a ton of hours since I don't have much background in RL. Thanks for the amazing work.

  • sentdex
    January 18, 2021

    I rarely comment on YouTube videos, but I gotta say that I love your energy and the way you explain things in a fun and clear way!
    You got yourself a new subscriber. Keep up the good work! 🙂

  • sentdex
    January 18, 2021

    Nice tutorial 🙂
    I can't run the ModifiedTensorBoard class…
    The update_stats function gives this message:

    "AttributeError: 'ModifiedTensorBoard' object has no attribute '_write_logs'"

    Does anyone have a solution for this error?

  • sentdex
    January 18, 2021

    Hi there,

    Do you have a video or a blog post where you give an example of how to migrate code from TensorFlow 1.x to TensorFlow 2.x?
    Especially in terms of placeholders and sessions?

  • sentdex
    January 18, 2021

    Amazing, thank you !!!

  • sentdex
    January 18, 2021

    Please make a video where we use a DQN model in chatbots (something like sentence classification + a DQN model to choose the response, so it can handle entire conversations instead of only giving a response to what is classified).

  • sentdex
    January 18, 2021

    I love your tutorials sentdex 🙂
    I have just one doubt though:
    if the target_model and model are initially the same (in weights and architecture), won't the model be training against itself the whole time, since the weights of the target_model are always updated to the actual model's?
    Thanks so much!!
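
For readers with the same doubt, here is a rough sketch of the usual target-network pattern: the target network's weights are copied over only every few episodes, so the prediction targets stay fixed in between. The tiny dense network and the constant names below are placeholders, not the tutorial's exact code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def create_model():
    # Tiny stand-in network; the tutorial's conv net would go here.
    m = Sequential([Dense(64, activation="relu", input_shape=(4,)),
                    Dense(2, activation="linear")])
    m.compile(optimizer="adam", loss="mse")
    return m

model = create_model()          # trained on every step from replay-memory batches
target_model = create_model()   # used only to predict the "future" Q-values
target_model.set_weights(model.get_weights())

UPDATE_TARGET_EVERY = 5         # episodes between weight copies (placeholder value)
target_update_counter = 0

# ...inside the training loop, after each episode that ends:
target_update_counter += 1
if target_update_counter > UPDATE_TARGET_EVERY:
    # Copying only every few episodes keeps the fitting targets stable in
    # between, so the model is not chasing its own moving output each step.
    target_model.set_weights(model.get_weights())
    target_update_counter = 0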

  • sentdex
    January 18, 2021

    For people who did this: is the previous video on the creation of environments needed for this and the next video?

  • sentdex
    January 18, 2021

    27:43, you say "observation space" twice, but in the subsequent lecture it becomes evident that what you meant to say is "state." This confused me, and I had to run it to ground to figure out why.

  • sentdex
    January 18, 2021

    Hi! Can anybody tell me why we use the fit() method instead of train_on_batch() if we want the optimizer to keep track of the number of iterations we made?
    Thank you

  • sentdex
    January 18, 2021

    Can someone tell me the difference between double deep Q-learning and dueling deep Q-learning, and which one @sentdex is using? He has the best tutorial for reinforcement learning I could find, but I want to know which is the best algorithm and whether he is using the best algorithm here.

  • sentdex
    January 18, 2021

    Please! Can you do the Atari Breakout tutorial? Every tutorial on the internet sucks veeeery bad.

  • sentdex
    January 18, 2021

    You can also do 5e4 instead of 50,000.
    It returns a float, so int(5e4) works. Useful for when numbers get large, like 1e9.

    'deque' is also pronounced like 'deck', I think.
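
A quick demonstration of the point about scientific-notation literals; the constant name is just an example:

REPLAY_MEMORY_SIZE = int(5e4)        # 5e4 is the float 50000.0, so cast to int
print(5e4, type(5e4))                # 50000.0 <class 'float'>
print(REPLAY_MEMORY_SIZE, int(1e9))  # 50000 1000000000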

  • sentdex
    January 18, 2021

    This is by far the best tutorial on DQNs on the entire internet.

  • sentdex
    January 18, 2021

    Great video, thanks very much.

  • sentdex
    January 18, 2021

    Hi sentdex.

    I used your provided code as a template for one of my own little projects, and when I set the discount factor higher than 0.5, my Q-values diverge and tend to infinity. I am using double Q-learning, have played with hyperparameters, target-network update rates, etc., but I absolutely cannot get the Q-values to converge if the factor is > 0.5. Any idea why this is happening?

  • sentdex
    January 18, 2021

    Thanks Sentdex, this indeed is a great tutorial. How about extending this to some of the Atari games!

  • sentdex
    January 18, 2021

    How are you so confident about building this specific type of model? Why not some other variant?

  • sentdex
    January 18, 2021

    Best tutorial for RL seen so far! Love yah!

  • sentdex
    January 18, 2021

    Is this video still up to date with Keras?

  • sentdex
    January 18, 2021

    @sentdex
    Deque is pronounced the same as deck. Great tutorial, thank you!

  • sentdex
    January 18, 2021

    This is actually the one good tutorial I could find. Thank you, sentdex!

  • sentdex
    January 18, 2021

    Wait, should we really use max pooling here? I think the network also has to know where in the frame an object is, and that information is lost with pooling. Correct me if I'm wrong.

  • sentdex
    January 18, 2021

    Really looking forward to the following videos in this series. You're totally right, there is a serious lack of RL tutorials. Thank you for producing this series <3

  • sentdex
    January 18, 2021

    Why can't we use train_on_batch instead of .fit? I guess that would not recreate a log file…

  • sentdex
    January 18, 2021

    self.model.predict(np.array(state).reshape(-1, *state.shape)/255)[0] — could someone please explain the purpose of "*state.shape" and the "/255"?
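
For anyone else wondering, a small sketch of what that line is doing, assuming state is a uint8 image-like NumPy array (the shape below is made up for the demo):

import numpy as np

state = np.random.randint(0, 256, size=(10, 10, 3), dtype=np.uint8)

# *state.shape unpacks (10, 10, 3), so reshape(-1, *state.shape) turns one
# observation into a batch of one -- the shape model.predict() expects.
batch = np.array(state).reshape(-1, *state.shape)
print(batch.shape)            # (1, 10, 10, 3)

# Dividing by 255 scales uint8 pixel values from [0, 255] down to [0, 1],
# which generally makes neural-network training better behaved.
scaled = batch / 255
print(scaled.min(), scaled.max())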

  • sentdex
    January 18, 2021

    I don't quite get the purpose of the target model and replay part.

  • sentdex
    January 18, 2021

    Awesome dude. Just got the "perfect" tutorial on the internet for RL!
