Reinforcement Learning with Stable Baselines 3 - Introduction (P.1)

sentdex

3 years ago

100,543 views

Comments:

@ebrahimpichka - 05.02.2022 21:34

Looking forward to the next episodes. BTW, at the end you were still using random actions after training the model.

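A couple of commenters note this; as a minimal sketch (assuming the LunarLander-v2 setup and the pre-0.26 gym API used in the video; any SB3 algorithm works the same way), the post-training loop would use the trained policy rather than sampling randomly:

import gym
from stable_baselines3 import PPO

env = gym.make("LunarLander-v2")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

obs = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs)  # trained policy, not env.action_space.sample()
    obs, reward, done, info = env.step(action)
    env.render()
    if done:
        obs = env.reset()
env.close()
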
@Djellowman - 05.02.2022 21:45

Looking forward to the next one!

@vaibhavkumar642 - 05.02.2022 22:19

💥💥💥

@pythonocean7879 - 05.02.2022 22:51

❤️ for ❤️

@KennTollens - 05.02.2022 23:21

Thank you for this tutorial. I am just getting into AI. It is over my head immediately, but your overview of the parts such as observation and agent were helpful for the bigger picture.

@rgel3762 - 06.02.2022 00:22

Have you considered Unity + ML-Agents? Why not go that way?

@pfrivik - 06.02.2022 00:26

LETS GOOOOOO THIS IS EXACTLY WHAT I WANTED THANK YOU SO MUCH

@pfrivik - 06.02.2022 00:26

How often will these videos be released?? I'm so excited to start watching and keep watching the series!!

@Mutual_Information - 06.02.2022 01:56

This is very useful. I'm working on an RL video series myself (the theory side, so no overlap here) and I was just looking for prebuilt RL algos. Stable Baselines 3 is by far the most complete/well-tested suite I've come across. This really makes a big difference - thanks!

Also, it's nice to see that super technical coverage like this can yield 1M+ followers. Awesome.

@arthurflores4585 - 06.02.2022 03:27

Thank you, these video tutorials will be a big help for my thesis. I'm going to support you.
I have many doubts; I hope this series can resolve them.

@yashwanth9549 - 06.02.2022 07:18

Please add more videos about reinforcement learning

@shreeshaaithal- - 06.02.2022 08:25

Then can you say how I can make gym play Valorant? 😅 Can we do this with gym, or can it play Call of Duty: Cold War?

@VaibhavSingh-lf6ps - 06.02.2022 13:31

Thanks for introducing Stable Baselines 3,
and yeah, sometimes we forget to use the model!

@PerfectNight123 - 06.02.2022 17:56

Does anybody know how to train the model using GPU? I tried changing the model parameter to device='cude', but it's still using cpu device when learning.

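The string in the question above is likely the whole problem: the device name must be "cuda", not "cude". A minimal sketch, assuming a CUDA-enabled PyTorch build is installed:

import torch
from stable_baselines3 import A2C

# If this prints False, PyTorch cannot see the GPU and SB3 silently falls back to the CPU.
print(torch.cuda.is_available())

model = A2C("MlpPolicy", "LunarLander-v2", device="cuda", verbose=1)
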
@piyushjaininventor - 06.02.2022 18:00

You are still taking random actions.

@oguzhanoguz8890 - 06.02.2022 18:03

Little heads-up for the next video, if you can explore it: the behavior of a saved and reloaded SB3 model depends on the "deterministic" flag. Sometimes, when using the eval procedure provided in SB3, you can get unstable results even if you saved the model in a deterministic manner. Can you explore that too? Thanks, great video.

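A sketch of an evaluation call with the flag made explicit; "a2c_lunar" is a hypothetical save name. With deterministic=True the policy takes the argmax action; with False it samples from the action distribution, which can make evaluation results look unstable:

import gym
from stable_baselines3 import A2C
from stable_baselines3.common.evaluation import evaluate_policy

env = gym.make("LunarLander-v2")
model = A2C.load("a2c_lunar", env=env)  # hypothetical file name
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10, deterministic=True)
print(mean_reward, std_reward)
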
@EnglishRain - 06.02.2022 20:43

What does one use this for IRL?

@davidcristobal7152 - 06.02.2022 21:37

Don't you have to define a neural model? I mean, what if you have an image as input? Does Stable Baselines automagically assume a neural network to pass the observation values through?

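For the built-in policies you don't define the network yourself: SB3 picks a default architecture from the policy name and the observation space. A sketch (the Atari env id is just an example and assumes the atari extras are installed):

from stable_baselines3 import PPO

# "MlpPolicy" builds a fully connected net for flat vector observations;
# "CnnPolicy" builds a convolutional feature extractor for image observations.
model = PPO("CnnPolicy", "BreakoutNoFrameskip-v4", verbose=1)
print(model.policy)  # inspect the network SB3 constructed
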
@criscanto7040 - 07.02.2022 00:49

Awesome

@tytobieola2766 - 07.02.2022 06:54

Happy New Year sentdex, I was learning machine learning during the lockdown & I had no idea about the field. You teach so well.

@amogh3275 - 07.02.2022 10:35

Honestly loving this series, I hope you make an in-depth tutorial series on this. Thanks.

@DasJonski - 07.02.2022 12:54

Am I the only one who tried to wipe the dust off the screen, looking like a fool, when the term explanations appeared? Anyway, great video Harrison, really enjoy your videos!

@connorvaughan356 - 08.02.2022 07:25

Very excited for this series. I'm following along and when the lunar lander game displays, it plays incredibly quickly. Probably 4-5 times faster than in the video. Does anyone know how to adjust the speed at which the game plays?

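Classic gym render loops run as fast as the CPU allows, so speed varies by machine. One way to throttle it, as a sketch under the same pre-0.26 gym API as the video (the 60 steps/second target is an arbitrary choice):

import time
import gym

env = gym.make("LunarLander-v2")
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())
    env.render()
    time.sleep(1 / 60)  # cap the loop at roughly 60 steps per second
env.close()
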
@Shaunmcdonogh-shaunsurfing - 10.02.2022 02:00

Awesome. Can’t wait for the next one

@ddos87 - 10.02.2022 07:22

You're such a beauty, man.

@markd964 - 17.02.2022 17:18

Great series as always...needs the next step, developing asynchronous (multiprocessing) models, eg: PPO into Asynchronous-PPO (APPO) on custom environments...Thx

@sanjaydokula5140 - 20.02.2022 20:47

I see that yours is using the cuda device; how do I make mine use cuda instead of the cpu?

@OhHeyItsAnthony - 01.04.2022 00:39

If you're following along using a Conda environment and the Lunar Lander environment gives you an error (namely "module 'gym.envs.box2d' has no attribute 'LunarLander'"), then I found that you also need to install two other packages, swig and box2d-py:

conda install -c conda-forge swig box2d-py

@noorwertheim2515 - 26.04.2022 15:29

Could this algorithm also be used for multi-agent multi-objective environments?

@Veptis - 26.05.2022 03:35

I have watched a bunch of videos about what reinforcement learning can do, but I gave up on the Steve Brunton series. Perhaps I'll watch this series instead and understand how the learning is done; everything I've done so far has been just gradient-based learning. And I don't know if reinforcement learning applies to language. Maybe in a conversational setting.

I have a game from my childhood: Mirror's Edge mobile edition, which you can no longer buy, as EA removed it from the store instead of updating it. Since it essentially has just 6 discrete inputs, I could see how it could be learned. But the levels are limited, so it might overfit easily. And rewards can't just be time, as that requires success in the first place.

@karthikbharadhwaj9488 - 06.07.2022 16:00

Hey Sentdex, actually in the env.step() method you passed env.action_space.sample() instead of model.predict()! @sentdex

@vernonvkayhypetuttz - 31.10.2022 21:23

SentDex you're a legend, brother. The thought of implementing these using deep learning libraries alone: instant grey hair! Thank you.

@ahmarhussain8720 - 21.11.2022 04:14

awesome video, learned a lot, keep up the good work

@poomchantarapornrat5685 - 25.12.2022 02:13

What operating system do you use to run these on?

@rverm1000 - 30.01.2023 04:14

Coding along, it doesn't work, at least not in Google Colab.

@andreamaiellaro6581 - 13.02.2023 16:48

I followed all the instructions, but when I try to run the notebook I get an error on the step function; it raises NotImplementedError... >.< What should I do?

@bluedade2100 - 10.04.2023 21:19

Guys, is anyone having problems installing/running Stable Baselines on a MacBook? I can't run it on either a MacBook or Linux.

@bluedade2100 - 28.04.2023 22:16

What does the variable episodes represent here?

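In the tutorial's loop, episodes is simply the number of complete games to play, each running from reset() until done. Roughly, under the same pre-0.26 gym API as the video:

import gym

env = gym.make("LunarLander-v2")
episodes = 10  # number of full episodes (reset -> ... -> done) to play
for episode in range(episodes):
    obs = env.reset()
    done = False
    while not done:
        # random actions, as in this early part of the tutorial
        obs, reward, done, info = env.step(env.action_space.sample())
        env.render()
env.close()
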
@walterwang5996 - 24.05.2023 15:04

I have a small question: why does A2C use only one "MlpPolicy" in stable_baselines3? It actually has two networks, am I right? Thanks.

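Right: A2C is an actor-critic method, so there are two heads, but in SB3 the single "MlpPolicy" object bundles both the policy (actor) and value (critic) networks, optionally sharing lower layers. You can see this by printing the policy:

from stable_baselines3 import A2C

model = A2C("MlpPolicy", "CartPole-v1")
# Shows the shared mlp_extractor plus the separate action_net and value_net heads.
print(model.policy)
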
@ReOp14 - 30.10.2023 06:17

I'm at the start of the tutorial, just after adding env.render()... why is it not rendering anything when I run the code? I'm running python=3.9 on a Windows machine w/ conda.

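If you're on gym >= 0.26 (or gymnasium), env.render() no longer opens a window by itself; the render mode has to be chosen when the env is created. A sketch under that assumption:

import gym

env = gym.make("LunarLander-v2", render_mode="human")  # the window comes from this flag
obs, info = env.reset()  # newer API: reset() returns (obs, info)
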
@Bvic3 - 09.11.2023 01:10

RL in a nutshell: IT NEVER WORKS.

It's really the part of deep learning where you implement a paper and get zero results.

@michpo1445 - 15.11.2023 00:01

"Your environment must inherit from the gymnasium.Env class cf." can you address this error?

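That error suggests a newer SB3 (2.x), which expects gymnasium-style environments rather than old gym ones. A minimal custom-env skeleton under that assumption:

import gymnasium as gym
import numpy as np
from gymnasium import spaces

class MyEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        return np.zeros(4, dtype=np.float32), {}  # (observation, info)

    def step(self, action):
        obs = np.zeros(4, dtype=np.float32)
        # (observation, reward, terminated, truncated, info)
        return obs, 0.0, True, False, {}
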
@luisbarba9532 - 16.01.2024 05:31

Can SB3 be extended to PettingZoo and used for MARL?

@GusIncognito - 07.04.2024 14:16

I am getting "ValueError: too many values to unpack (expected 2)" when executing model.learn(). Some other people have encountered the same issue, but I haven't found a solution.

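One common cause (an assumption, not a guaranteed fix) is a version mismatch between gym/gymnasium and SB3: old gym's reset() returns a bare observation and step() returns 4 values, while gymnasium, required by SB3 >= 2.0, returns (obs, info) and 5 values. A sketch of what the matching API looks like:

import gymnasium as gym
import stable_baselines3

print(gym.__version__, stable_baselines3.__version__)

env = gym.make("LunarLander-v2")
obs, info = env.reset()  # two values in gymnasium
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())  # five values
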
@Akhtar_AI_Azim - 13.04.2024 22:48

Can you please talk about how we can use RL to model and optimize satellite networks and HAPs (high-altitude platforms)?
How do we control the direction and angle of a projector embedded in a HAP or UAV so that it directs its light beams toward a specific area of interest on the Earth?

@geepytee - 29.11.2024 02:14

This guy never misses, best tutorials in the game.

@thomasschmidt73 - 16.03.2025 22:18

Keeping the env's sample action and your humor made my day :)
