Comments:
I feel like I'm watching a cartoon from when I was a kid. :)
I was stunned when you started the video with a catchy jingle, man. Cheers :D
Hello, thank you for the video, but I'm confused: some terms introduced in the original 'Attention Is All You Need' paper weren't mentioned in the video, for example keys, values, and queries. Furthermore, the paper's authors don't talk about cosine similarity or an LSTM. Could you please clarify this a bit more?
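For anyone bridging the gap between the video and the paper: in 'Attention Is All You Need', the similarity score is a scaled dot product between queries and keys rather than a cosine similarity, and the output is a weighted sum of values. Here's a minimal NumPy sketch of that idea (the variable names and toy sizes are mine, not from the video or the paper's code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    # Project the token embeddings X into queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    # Similarity scores: dot products of queries with keys,
    # scaled by sqrt(d_k) so the softmax doesn't saturate.
    scores = Q @ K.T / np.sqrt(d_k)
    # Each output row is a softmax-weighted sum of the value vectors.
    return softmax(scores) @ V

# Toy example: 3 tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
print(scaled_dot_product_attention(X, Wq, Wk, Wv).shape)  # (3, 4)
```

Cosine similarity is just the dot product divided by the two vectors' lengths, so the video's framing and the paper's differ mainly in how the raw scores are normalized.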
I can relate to Squatch so much 😅. If he were a real person, he would have been a great friend of mine 😁
Thanks
The level of explainability in this video is top-notch. I always watch your videos first to grasp the concept, then do the implementation on my own. Thank you so much for this work!
Great!
Thank you for the awesome video. I have a question: what does the similarity score entail in reality? I assume the weights and biases are optimized by backpropagation to give large positive values to synonyms, values close to 0 to unrelated words, and large negative values to antonyms. Is that a correct assumption?
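If that assumption roughly holds, dot-product scores on trained embeddings would behave like this toy example (the 2-D vectors are invented purely for illustration; real embeddings are learned and much higher-dimensional, and this structure is a tendency, not a guarantee):

```python
import numpy as np

# Hypothetical 2-D embeddings, made up for illustration only.
great    = np.array([ 2.0,  1.0])
awesome  = np.array([ 1.8,  1.1])  # synonym: points the same way
banana   = np.array([-0.9,  1.9])  # unrelated: roughly orthogonal
terrible = np.array([-2.1, -0.9])  # antonym: points the opposite way

print(great @ awesome)   # large positive: 4.7
print(great @ banana)    # near zero: 0.1
print(great @ terrible)  # large negative: -5.1
```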
If we are comparing the scores, shouldn't we divide by a denominator then?
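For what it's worth, the original paper does divide by a denominator before the softmax, using the key dimension d_k:

Attention(Q, K, V) = softmax(Q Kᵀ / √d_k) V

For example, with d_k = 64 every raw score is divided by √64 = 8. The cosine-similarity framing in the video normalizes differently, by the lengths of the two vectors being compared.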
Frankly, Josh, if you take the view of Transformer self-attention, this video seems less meaningful, because self-attention can do much better than what you mentioned. If so, why do we need this lesson?
Bad content. Focus on the content and clear theory, not on sarcasm. It isn't helping.
Thanks, Professor Josh, for such a great tutorial! It was very informative!
I had a little confusion about the final fully connected layer. It takes in separate attention values for each input word. But doesn't that mean the dimension of the input depends on how many input words there are (making it difficult to generalize to arbitrarily long sentences)? Did I misunderstand something?
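One possible resolution, sketched under the assumption that the fully connected layer is applied position-wise (the same weights reused at every word, as in the Transformer): the weight shapes then depend only on the embedding size, not on the sentence length. All names and sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model = 4
# One weight matrix and bias vector, shared across all positions.
W = rng.normal(size=(d_model, d_model))
b = rng.normal(size=(d_model,))

def position_wise_fc(attn_values):
    # attn_values: (num_tokens, d_model); num_tokens can be anything.
    return attn_values @ W + b

print(position_wise_fc(rng.normal(size=(3, d_model))).shape)  # (3, 4)
print(position_wise_fc(rng.normal(size=(7, d_model))).shape)  # (7, 4)
```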
I am always amazed by your tutorials! Thanks. And when can we expect the transformer tutorial to be uploaded?
I have been waiting for this for a long time.
Do you have any courses with start-to-finish projects for people who are only just getting interested in machine learning?
Your explanations of the mathematical concepts have been great, and I'd be more than happy to pay for a course that implements some of these concepts in real-world examples.
Thank you for this explanation. But my question is: how are the weights and biases adjusted with backpropagation in a model like this? If you could explain that, I would deeply appreciate it.
Would it not be possible to translate the other, older videos? You explain very well. ❤
You have a talent for explaining these things in a straightforward way. Love your videos. You have no video about Transformers yet, right?
You're amazing Josh, thank you so much for all this content <3.
A video on Siamese networks would be cool, esp. Siamese BERT-Networks.
Hey there Josh @statquest, your videos are really awesome and super helpful, so I was wondering: when will your video on the Transformer model come out?
Could you do a video about BERT? Architectures like these can be very helpful in NLP, and I think a lot of folks would benefit from that :)
Will there be a video about transformers?
Great work, Josh! Listening to my deep learning lectures and reading papers have become way easier after watching your videos, because you explain the big picture and the context so well!! Eagerly waiting for the transformers video!
I'm excited for the video about transformers. Thank you, Josh, your videos are extremely helpful.
Great videos! After watching the technical videos, I think complicating the math has no effect on removing bias from the model. In the future someone may come up with a model with a self-encoder-soft-attention-direct-decoder, you name it, but it's still garbage in, garbage out. Do you think there is a way to plug a fairness/bias filter into the layers, so that instead of trying to filter the model's output you just don't produce unfair output in the first place? It's like preventing a disease instead of looking for a cure. Obviously I'm not an expert and am just trying to get a direction for my personal ethics research out of this naive question. Thanks!
Since you asked for video suggestions in another video: a video about the EM and Mean Shift algorithms would be great!
You made me really excited for transformers 😅
One thing that eludes me, after watching the video once again, is on what basis we can compare the hidden states of the encoder and decoder. Why are they comparable at all? I understand we can compare word embeddings, but hidden states?
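One way to think about it: nothing makes them comparable a priori. The encoder and decoder hidden states have the same dimensionality by construction, the dot product is just a score in that shared space, and training adjusts both networks so that useful alignments end up scoring high. A toy sketch of the scoring step (all names and sizes are illustrative):

```python
import numpy as np

hidden = 4
rng = np.random.default_rng(2)
# Encoder hidden states for a 5-token source sentence, plus one
# decoder hidden state: the same dimensionality by construction.
enc_states = rng.normal(size=(5, hidden))
dec_state = rng.normal(size=(hidden,))

# Alignment scores: one dot product per source position.
scores = enc_states @ dec_state
weights = np.exp(scores) / np.exp(scores).sum()  # softmax
context = weights @ enc_states  # weighted sum = the context vector
print(weights.round(2), context.shape)  # probabilities, (4,)
```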
“StatQuest is all you need.” I really needed this video for my NLP course, but I'm glad it's out now. I got an A+ in the course; your precious videos helped a lot!
Hey! Great video, this is really helping me with neural networks at university. Do we have a date for when the transformer video comes out?
Thank you very much!
This channel is pure gold. I'm a machine learning and deep learning student.
Can't wait for the next StatQuest!
Really looking forward to your explanation of Transformers!!!
The best explanation of attention that I have come across so far...
Thanks a bunch❤
And now in Spanish? I can hardly believe it, this channel is incredible 😭 thank you so much for your videos!!!
Hello! Can we get a video on Gaussian Processes? Many thanks!!!
Eagerly waiting for the transformer video.
OMG! Where's the transformers video? My test is tomorrow ahaha
Oops 🙊 What is "Seq2Seq"? I think I will have to check out that Quest first, and then I will be happy to come back and learn with Josh 😅 I can't wait to learn Attention for Neural Networks, Clearly Explained.
Amazing video, Josh! Waiting for the transformer video. Hopefully it'll come out soon. Thanks for everything!
Fantastic video, indeed! Is the attention described in the video the same as in the attention paper? I didn't see QKV mentioned in the video and would like to know whether it was omitted to simplify things or by mistake.