Deep Q Learning w/ DQN – Reinforcement Learning p.5
Hello and welcome to the first video about Deep Q-Learning and Deep Q Networks, or DQNs. Deep Q Networks are the deep learning/neural network versions of Q-Learning.
With DQNs, instead of a Q Table to look up values, you have a model that you inference (make predictions from), and rather than updating the Q table, you fit (train) your model.
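For concreteness, here is a minimal sketch of that idea (layer sizes, state size, and action count are illustrative, not the exact model from the video): a small Keras network maps a state to one Q-value per action, with predict() replacing the table lookup and fit() replacing the table update.

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    STATE_SIZE = 4      # illustrative: length of the observation vector
    NUM_ACTIONS = 3     # illustrative: number of discrete actions

    model = Sequential([
        Dense(64, activation="relu", input_shape=(STATE_SIZE,)),
        Dense(64, activation="relu"),
        Dense(NUM_ACTIONS, activation="linear"),  # one Q-value per action
    ])
    model.compile(optimizer="adam", loss="mse")

    state = np.random.rand(STATE_SIZE)
    q_values = model.predict(state.reshape(1, STATE_SIZE))  # inference instead of a table lookup
    action = int(np.argmax(q_values[0]))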
Text-based tutorial and sample code: https://pythonprogramming.net/deep-q-learning-dqn-reinforcement-learning-python-tutorial/
Channel membership: https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ/join
Discord: https://discord.gg/sentdex
Support the content: https://pythonprogramming.net/support-donate/
Twitter: https://twitter.com/sentdex
Instagram: https://instagram.com/sentdex
Facebook: https://www.facebook.com/pythonprogramming.net/
Twitch: https://www.twitch.tv/sentdex
#reinforcementlearning #machinelearning #python
Is there an explanation of how to choose the parameters for the neural network? I've watched the tutorial on that and multiple other tutorials across the net, but it's always the same: "We just use a conv net and set that to 64… hmmm maybe 128 nodes, I dunno, whatever. Now copy all of those layers I guess."
I understand that you can't explain all of that in depth all the time, but I have yet to find a video that explains what layers to use and why, and how many nodes you need.
Don't bother with people complaining about the drinking… thanks for your efforts. You are totally cool…
cool!
So, Professor Sentdex mentions in this video (paraphrasing) that none of the other deep Q-learning videos/tutorials really show some important steps ("…all the tutorials suck…"). Well, perhaps they don't suck, but they do gloss over or hide certain steps. This tutorial here (p.5 and p.6) truly does explain and show each step. It delivers. I've taken 2 courses on Udemy and trawled several other AI channels. This is the best DQN tutorial out there. It's a little long and each step is highly detailed. My advice is to follow along… code as you watch. Thanks for this tutorial! (Would love to see a Paperspace tutorial!)
How do you use metrics = "accuracy" if you are using MSE? Isn't this a regression problem? I am a little confused.
Sentdex, you and your YouTube channel are awesome! Great content and videos.
I'm new to this. Can you tell me. Which line exactly makes the magic happen?
excellent tutorials
Thanks for an awesome video. I am wondering how we can train a sequential DQN, meaning that the agent would be able to predict, say, the actions for the next 5 timesteps? Thanks a lot!
Best channel, Thank you for this tutorial
This guy has a tutorial on every fucking machine learning topic….😂😂😂🔥🔥🔥
And I am the one who is wondering what drink he drinks. He's got wings. The best tutor ever.
A deep neural network can interpolate. The network scales much better because it can map the state space to a fixed number of nodes instead of needing a node for every state. The reasons it can scale and interpolate are related because the mapping is a kind of convolution of the state onto the network.
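As a rough illustration of that scaling point (all numbers below are made up): a Q-table needs one entry per state-action pair, so it grows with the state space, while the network's parameter count is fixed by its layer sizes.

    num_states = 10**6          # e.g. a large discretised state space
    num_actions = 4
    q_table_entries = num_states * num_actions          # grows with the state space

    state_dim = 8                                       # network input size stays fixed
    hidden = 64
    dqn_params = state_dim * hidden + hidden + hidden * num_actions + num_actions
    print(q_table_entries, dqn_params)                  # 4,000,000 table entries vs ~800 weights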
Can we convert these actions with to_categorical and use categorical_crossentropy??
Hi @sentdex,
Thanks for this awesome content. Just wanted to ask one thing, is this by any chance Double Deep Q learning that you are implementing?
Hundredth
One can use np.newaxis to add a dimension to a NumPy array.
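For example (a tiny sketch; the shape is illustrative):

    import numpy as np

    state = np.random.rand(4)            # shape (4,)
    batched = state[np.newaxis, :]       # shape (1, 4) -- same effect as reshape(-1, *state.shape)
    print(batched.shape)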
22:30 50K steps for one episode or 50K steps for all the episodes?
I plan on doing Minecraft deep Q-learning.
Dude, this is amazing. As a systems developer, I am using this for my master thesis on streaming algorithms and congestion control. You literally saved me a ton of hours since I don't have much background in RL. Thanks for the amazing work.
I rarely comment on youtube videos, but I gotta say that I love your energy and the way you explain things in a fun and clear way!
You got yourself a new subscriber, Keep up the good work! 🙂
Nice tutorial 🙂
I can't run the ModifiedTensorBoard class…
The function update_stats writes this message:
"AttributeError: 'ModifiedTensorBoard' object has no attribute '_write_logs'"
does anyone have a solution for this error?
Hi there,
Do you have a video or a blog post where you give an example of how to migrate code from TensorFlow 1.x to TensorFlow 2.x?
Especially in terms of placeholders and sessions?
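The video doesn't cover this, but as a rough sketch of the general pattern (shapes and names below are illustrative): TF 1.x placeholder/session code becomes ordinary eager Python in TF 2.x, optionally wrapped in tf.function.

    import tensorflow as tf

    # TF 1.x style (placeholders + sessions), for comparison:
    # x = tf.placeholder(tf.float32, shape=(None, 4))
    # y = some_layer(x)
    # with tf.Session() as sess:
    #     out = sess.run(y, feed_dict={x: batch})

    # TF 2.x style: eager execution, optionally compiled with tf.function.
    @tf.function
    def forward(x, w, b):
        return tf.matmul(x, w) + b

    w = tf.Variable(tf.random.normal((4, 2)))
    b = tf.Variable(tf.zeros(2))
    batch = tf.random.normal((3, 4))
    out = forward(batch, w, b)   # no session, no feed_dict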
Amazing, thank you !!!
Please make a video where we use a DQN model in chatbots (something like sentence classification + a DQN model to choose the response, so it can handle entire conversations instead of only giving a response to whatever was classified).
I love your tutorials sentdex 🙂
I have just one doubt though:
if the target_model and model are initially the same (same weights and architecture), won't the model be training against itself the whole time, since the weights of the target_model are always updated to match the actual model?
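For what it's worth, a minimal sketch of the usual pattern (names, sizes, and the update interval are illustrative, not necessarily exactly what the video uses): the target model is only synced to the online model every few episodes, so between syncs the targets it produces stay fixed rather than chasing the online model on every training step.

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    def build_model():
        m = Sequential([Dense(16, activation="relu", input_shape=(4,)),
                        Dense(3, activation="linear")])
        m.compile(optimizer="adam", loss="mse")
        return m

    model = build_model()              # trained every step
    target_model = build_model()       # only used to compute future Q targets
    target_model.set_weights(model.get_weights())

    UPDATE_TARGET_EVERY = 5            # illustrative value
    target_update_counter = 0

    for episode in range(20):
        # ... train `model` on minibatches sampled from replay memory here ...
        target_update_counter += 1
        if target_update_counter >= UPDATE_TARGET_EVERY:
            target_model.set_weights(model.get_weights())   # periodic sync
            target_update_counter = 0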
thnx so much!!
For people who did this: is the previous video on creating the environment needed for this and the next video?
At 27:43 you say "observation space" twice, but in the subsequent lecture it becomes evident that what you meant to say is "state." This confused me, and I had to run it to ground to figure out why.
Hi! Can anybody tell me why we use the fit() method instead of train_on_batch(), if we want the optimizer to keep track of the number of iterations we made?
Thank you
Can someone tell me the difference between double deep Q-learning and dueling deep Q-learning, and which one @sentdex is using? He has the best tutorial for reinforcement learning I could find, but I want to know which algorithm is best and whether he is using it here.
Plz! Can you do the Atari Breakout tutorial? Every tutorial on the Internet sucks veeeery bad.
You can also write 5e4 instead of 50,000.
It returns a float, so int(5e4) works. Useful when numbers get large, like 1e9.
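For example:

    steps = int(5e4)        # 50,000 -- 5e4 is a float, so wrap it in int()
    big = int(1e9)          # easier to read than 1000000000
    print(steps, big)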
'deque' is also pronounced like 'deck' I think
This is by far the best tutorial on DQNs on the entire internet.
Great video, thanks very much.
Hi sentdex.
I used your provided code as a template for one of my own little projects, and when I set the discount factor higher than 0.5, my Q-values diverge and tend to infinity. I am using double Q-learning and have played with hyper-parameters, target-network update rates, etc., but I absolutely cannot get the Q-values to converge if the factor is > 0.5. Any idea why this is happening?
Thanks Sentdex, this indeed is a great tutorial. How about extending this to some of the Atari games?!
How are you so confidently building a specific type of model? Why not some other variant?
Best tutorial for RL seen so far! Love yah!
Is this video still up to date with keras ?
@sentdex
Deque is pronounced the same as deck. Great tutorial, thank you!
This is actually one good tutorial I could find. Thank you sentdex!
Wait, should we really use max pooling here? I think the network also has to know where in the frame an object is, and that information is lost with pooling. Correct me if I'm wrong.
Really looking forward to the following videos in this series. You're totally right, there is a serious lack of RL tutorials. Thank you for producing this series <3
Why can't we use train_on_batch instead of .fit? I guess that would not recreate a log file…
self.model.predict(np.array(state).reshape(-1, *state.shape)/255)[0] — could someone please explain the purpose of "*state.shape" and the "/255"?
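A small sketch of what those two pieces do (the observation shape below is made up): *state.shape unpacks the shape tuple, so reshape(-1, *state.shape) just prepends a batch axis, and /255 scales 8-bit pixel values into the 0–1 range.

    import numpy as np

    state = np.random.randint(0, 256, size=(10, 10, 3), dtype=np.uint8)  # illustrative RGB observation
    print(*state.shape)                                  # 10 10 3 -- the * unpacks the shape tuple
    batch = np.array(state).reshape(-1, *state.shape)    # shape (1, 10, 10, 3): adds a batch axis
    batch = batch / 255                                  # normalises pixel values to 0-1 for the network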
I don't quite get the purpose of the target model and replay part.
Awesome, dude. Just found the "perfect" tutorial on RL on the internet!