Actor-critic methods are a family of reinforcement learning algorithms. They are especially useful in robotics because they can output continuous, rather than discrete, actions. This makes it possible to control the electric motors that actuate movement in robotic systems, at the cost of increased computational complexity.
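To make the continuous-action point concrete, here is a minimal PyTorch sketch — not the course's code; the class name, layer sizes, and dimensions are invented for illustration — of an actor-critic network whose actor parameterizes a Gaussian over real-valued actions while the critic estimates the state value:

    import torch
    import torch.nn as nn

    class ActorCritic(nn.Module):
        # Actor head: mean (plus a learned log-std) of a Gaussian over actions.
        # Critic head: scalar state-value estimate V(s).
        def __init__(self, obs_dim, action_dim, hidden=256):
            super().__init__()
            self.shared = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            self.mu = nn.Linear(hidden, action_dim)
            self.log_std = nn.Parameter(torch.zeros(action_dim))
            self.value = nn.Linear(hidden, 1)

        def forward(self, obs):
            h = self.shared(obs)
            dist = torch.distributions.Normal(self.mu(h), self.log_std.exp())
            return dist, self.value(h)

    # Sampling a real-valued action vector (e.g. motor torques),
    # rather than picking one index from a fixed set of discrete actions:
    net = ActorCritic(obs_dim=8, action_dim=2)
    dist, value = net(torch.randn(1, 8))
    action = dist.sample()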
💻 Code for the algorithms covered:
🔗 Actor Critic:
🔗 Deep Deterministic Policy Gradients (DDPG):
🔗 Twin Delayed Deep Deterministic Policy Gradients (TD3):
🔗 Proximal Policy Optimization (PPO):
🔗 Soft Actor Critic (SAC):
🔗 Asynchronous Advantage Actor Critic (A3C):
✏️ Course from Phil Tabor. For more deep reinforcement learning tutorials, check out his channel at:
⭐️ Course Contents ⭐️
⌨️ (0:00:00) Intro
⌨️ (0:04:03) Actor Critic (TF2)
⌨️ (0:44:50) DDPG (TF2)
⌨️ (1:52:36) TD3 (TF2)
⌨️ (3:08:29) PPO (PyTorch)
⌨️ (4:03:16) SAC (TF2)
⌨️ (5:09:28) A3C (PyTorch)
⭐️ Software requirements ⭐️ (install sketch after the list)
Python 3.x
box2d-py 2.3.8
gym 0.15.4
matplotlib 3.1.1
numpy 1.18.1
pybullet 2.8.5
torch 1.4.0
tensorflow-gpu 2.3.1
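For reference, a minimal sketch of installing these pinned versions with pip. The command is an assumption on my part (the course lists versions, not an install command), and some of these older pins may require an older Python 3 environment:

    pip install box2d-py==2.3.8 gym==0.15.4 matplotlib==3.1.1 numpy==1.18.1 pybullet==2.8.5 torch==1.4.0 tensorflow-gpu==2.3.1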
🎉 Thanks to our Champion and Sponsor supporters:
👾 Wong Voon jinq
👾 hexploitation
👾 Katia Moran
👾 BlckPhantom
👾 Nick Raker
👾 Otis Morgan
👾 DeezMaster
👾 Treehouse
—
Learn to code for free and get a developer job:
Read hundreds of articles on programming:
And subscribe for new videos on technology every day:
great
As a first-time viewer and a young coder, I love Code Camp. My age is 12, and the person in the picture is my dad, so don't be confused.
Nobody cares if you are 12.
@Geeky Programmer bro chill, sort your own problems out before hating on others.
@Harry Wijnschenk No hate, but revealing your age doesn't pertain to the content.
@Geeky Programmer Alright, but he's young and excited by the video. Just move on, the kid's 12, no need to reply.
Man, this channel is a goldmine!
Nothing new though, as this is not the first course I've seen here. This course is going to be very helpful for me. Thank you for the work you guys are putting into teaching people like me.
P.S. Double thanks for the Next.js course as well. It was very helpful.
Woow, thank you!
We need a new Vue.js course!
Hey, I know that guy! Any questions, please leave them down below!
It was very helpful. Thanks for sharing your knowledge and pointers.
Thank you, Phil!
Tell him we are grateful for such a great video
Personally, I like the style: few slides, no BS, straight to the coding. Strong work.
This was exactly what I wanted to learn. Thank you
Wow, I just turned in my project with an actor critic algorithm THIS WEEK.
-__-
*cries
Good job… kinda gives me hope too. I know it's stupid…
Thanks!
ok sub bot
@Muhammad Aariz Marzuq I am not a bot.
Guys, we should literally donate to this channel once hired; it's more useful than most universities.
agree
Very true
Agree!!
Why is there no translation accompanying this video?
This channel is awesome. Its content and support are beyond words. Thank you so much for all the quality content, team.
This is why the computer science and software engineering field is so successful and growing so quickly. We keep everything open source and freely available to anyone willing to learn. That's so rare these days. There are so many other fields that lock up their knowledge in university courses and paywalls.
Thanks!
Oohhh sweet! Machine Learning with Phil is awesome!
I thought prob_ratio must equal one if we replay the same action, since the actor is only updated after the replay. Am I right?
Thank you so much for putting in the effort to do the whole implementation; it's much easier to grasp than the paper. I am very new to RL and have a rather naive question (ignore it if I'm being stupid): when you call the learn function for the first time after 20 steps, wouldn't new_probs equal old_probs, because the neural network hasn't learned anything yet? Wouldn't both values be essentially random for several iterations, and if so, how is the agent learning?
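A hedged note on the two questions above, since they hinge on the same point. On the very first pass of the learn function the new and old policies are still identical, so the probability ratio is exactly 1 and the clipped objective contributes plain advantage-weighted gradients; the ratio only moves away from 1 after the first optimizer step inside the update epochs. And even though a freshly initialized network's probabilities are arbitrary, they are not noise — the reward signal and the critic's advantage estimates give the gradient a direction, which is what drives learning. A tiny illustration with made-up numbers (not the course's code):

    import torch

    # PPO ratio r = exp(log_pi_new(a|s) - log_pi_old(a|s)).
    old_log_prob = torch.tensor(-1.20)  # stored when the action was taken
    new_log_prob = torch.tensor(-1.20)  # first pass: same weights, same value
    print(torch.exp(new_log_prob - old_log_prob))  # tensor(1.) -> clipping inactive

    # After one optimizer step the policy has shifted, e.g.:
    new_log_prob = torch.tensor(-1.05)
    print(torch.exp(new_log_prob - old_log_prob))  # tensor(1.1618) -> ratio > 1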
Thank you for the awesome video. Can you please characterize all the DRL models, if possible?