Reinforcement Learning Course: Intro to Advanced Actor Critic Methods


Actor critic methods are a family of reinforcement learning algorithms. They are especially useful in robotics because they can output continuous, rather than discrete, actions. This enables direct control of the electric motors that actuate movement in robotic systems, at the expense of increased computational complexity.
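To make "continuous actions" concrete, here is a minimal sketch of a Gaussian actor in PyTorch (the course's PyTorch sections use this idea): the network outputs the mean of each action dimension and a learned spread, and real-valued actions (e.g. motor torques) are sampled from that distribution. All names and sizes below are illustrative, not taken from the course code.

```python
import torch
import torch.nn as nn

class ContinuousActor(nn.Module):
    """Minimal Gaussian policy: maps a state to a distribution over
    continuous actions (e.g. motor torques) instead of discrete choices."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden, action_dim)               # mean of each action
        self.log_std = nn.Parameter(torch.zeros(action_dim))  # learned spread

    def forward(self, state):
        h = self.body(state)
        # A Normal distribution per action dimension; sampling gives
        # real-valued actions rather than an index into a discrete set.
        return torch.distributions.Normal(self.mu(h), self.log_std.exp())

# Hypothetical sizes: an 8-dimensional state, 2 continuous actions.
actor = ContinuousActor(state_dim=8, action_dim=2)
dist = actor(torch.randn(1, 8))
action = dist.sample()   # shape (1, 2): one real value per motor
```

DDPG and TD3 instead output the action deterministically, while PPO and SAC keep a stochastic policy like the one above; the distribution's `log_prob` is what the policy-gradient losses are built from.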

πŸ’» Code for the algorithms covered:
πŸ”— Actor Critic:
πŸ”— Deep Deterministic Policy Gradients (DDPG):
πŸ”— Twin Delayed Deep Deterministic Policy Gradients (TD3):
πŸ”— Proximal Policy Optimization (PPO):
πŸ”— Soft Actor Critic (SAC):
πŸ”— Asynchronous Advantage Actor Critic (A3C):

✏️ Course from Phil Tabor. For more deep reinforcement learning tutorials check out his channel at:

⭐️ Course Contents ⭐️
⌨️ (0:00:00) Intro
⌨️ (0:04:03) Actor Critic (TF2)
⌨️ (0:44:50) DDPG (TF2)
⌨️ (1:52:36) TD3 (TF2)
⌨️ (3:08:29) PPO (PyTorch)
⌨️ (4:03:16) SAC (TF2)
⌨️ (5:09:28) A3C (PyTorch)

⭐️ Software requirements ⭐️
Python 3.x
box2d-py 2.3.8
gym 0.15.4
matplotlib 3.1.1
numpy 1.18.1
pybullet 2.8.5
torch 1.4.0
tensorflow-gpu 2.3.1

πŸŽ‰ Thanks to our Champion and Sponsor supporters:
πŸ‘Ύ Wong Voon jinq
πŸ‘Ύ hexploitation
πŸ‘Ύ Katia Moran
πŸ‘Ύ BlckPhantom
πŸ‘Ύ Nick Raker
πŸ‘Ύ Otis Morgan
πŸ‘Ύ DeezMaster
πŸ‘Ύ Treehouse

Learn to code for free and get a developer job:

Read hundreds of articles on programming:

And subscribe for new videos on technology every day:

33 thoughts on “Reinforcement Learning Course: Intro to Advanced Actor Critic Methods”

  1. As a first-time viewer and a young coder, I love code camp. I'm 12, and the person in my picture is my dad, so don't be confused.

    1. Harry Wijnschenk

      @Geeky Programmer bro chill, sort your own problems out before hating on others.

    2. Geeky Programmer

      @Harry Wijnschenk No hate but revealing age doesn’t pertain to the content.

    3. Harry Wijnschenk

      @Geeky Programmer alright, but he's young and excited by the video. Just move on; the kid's 12, no need to reply.

  2. Man this channel is a goldmine πŸ˜‚

    Nothing new though, as this is not the first course I've seen here. This course is going to be very helpful for me. Thank you for the work you guys are putting into teaching people like me.

    P.S. Double thanks for the nextjs course as well. It was very helpful.

    1. It was very helpful. Thanks for sharing your knowledge and pointers. πŸ‘πŸ‘πŸ‘ŒπŸ‘ŒπŸ‘πŸ‘

  3. Personally like the style of few slides, no BS, no nothing Sir, straight to the coding. Strong work.

  4. Wow, I just turned in my project with an actor critic algorithm THIS WEEK.
    -__-
    *cries

  5. Rachad El Moutaouaffiq

    Guys, we should donate to this channel once we're hired; it's more useful than most universities.

  6. Mohamed Nasr El-din Azouz Mohamadin

    Why are there no subtitles accompanying this video?

  7. This channel is awesome. Its content and support are beyond words. Thank you so much for all the quality content, Team.

  8. This is why the computer science and software engineering field is so successful and growing so quickly. We keep everything open source and freely available to anyone willing to learn. That's so rare these days. There are so many other fields that lock up their knowledge in university courses and behind paywalls.

  9. I thought prob_ratio must equal one if we replay the same action, since the actor is only updated after the replay. Am I right?

  10. Thank you so much for putting in the effort to do the whole implementation, which is much easier to grasp than the paper. I am very new to RL and have a rather naive question (ignore it if I'm being stupid, since no one else has raised it): the first time you call the learn function after doing 20 steps, wouldn't new_probs be equal to old_probs, because the neural network hasn't learned anything yet? Would both values be essentially random until several iterations in? And if they are random, how is the agent learning?

  11. Thank you for the awesome video. Could you please characterize all the DRL models, if possible?
