Coding Stable Diffusion from scratch in PyTorch

Umar Jamil

7 months ago

76,935 views


Comments:

@harishankar2572 - 02.01.2024 11:48

As per my understanding, the loss function of an autoencoder is the KL divergence loss. I don't know if I missed it in the video, but I can't figure out where we added the loss function.

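For readers with the same question: the loss never appears because the code only runs inference with pretrained weights (see the training questions further down). If you did want to train the VAE part, a minimal sketch of the usual objective, assuming the encoder returns `mean` and `log_var` tensors of shape (batch, channels, height, width) and the names are hypothetical, could look like this. The `kl_weight` value is illustrative; Stable Diffusion's autoencoder additionally uses perceptual and adversarial terms.

```python
import torch
import torch.nn.functional as F

def vae_loss(recon, target, mean, log_var, kl_weight=1e-6):
    # Reconstruction term: how closely the decoded image matches the input.
    recon_loss = F.mse_loss(recon, target, reduction="mean")
    # Closed-form KL(N(mean, var) || N(0, I)), summed over the latent
    # dimensions of each sample, then averaged over the batch.
    kl = -0.5 * torch.sum(1 + log_var - mean.pow(2) - log_var.exp(),
                          dim=[1, 2, 3]).mean()
    return recon_loss + kl_weight * kl
```
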
@runqipang9127 - 02.01.2024 03:24

At first, going by the accent, I thought you were Indian.

@user-iv6ux4uy8b - 31.12.2023 21:16

What nobody seems to explain is why the CLIP model is chosen to produce the embeddings. Everyone mentions how it was trained to match images to text, but how is that relevant at all if we are using our own VAE encoder, which has nothing to do with the image encoder they used in CLIP?

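One way to see why this works (a sketch, not the exact code from the video): only CLIP's text encoder is used, and its output never passes through the VAE at all. It conditions the U-Net via cross-attention, where the noisy latents provide the queries and the CLIP token embeddings provide the keys and values; CLIP is a good choice precisely because its text embeddings were trained to align with visual content. The tensor shapes below are the usual SD v1.x ones and are meant purely as an illustration.

```python
import torch
import torch.nn as nn

# Illustrative shapes: a flattened 64x64 latent feature map with 320 channels,
# and 77 CLIP token embeddings of size 768 for the prompt.
latent_tokens = torch.randn(1, 64 * 64, 320)   # queries: U-Net latent features
text_context = torch.randn(1, 77, 768)         # keys/values: CLIP text-encoder output

cross_attn = nn.MultiheadAttention(embed_dim=320, num_heads=8,
                                   kdim=768, vdim=768, batch_first=True)
out, _ = cross_attn(query=latent_tokens, key=text_context, value=text_context)
print(out.shape)  # torch.Size([1, 4096, 320]) -- latent tokens now carry prompt information
```
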
@Johnx69 - 31.12.2023 04:58

Really, really good video. Can you create a video about building a tokenizer from scratch? Many thanks!

@CallBlofD - 27.12.2023 13:58

Thank you so much! the best stable diffusion video I found!!!

@romanbogachev6147 - 24.12.2023 17:00

The most powerful deep learning videos in the world are on this channel.

@gautamVashishtha23 - 23.12.2023 23:32

This is extremely helpful. Could you please also make content on score-based diffusion models and other, more complex scheduling algorithms like Runge-Kutta samplers? Thanks a lot for your efforts.

@ryankao1983 - 18.12.2023 20:02

Could you please make a video on how to train a stable diffusion model? e.g. how many images do we need to train it? what types of images should we collect?

@AInseven - 18.12.2023 18:13

Thank you, I finally understand the relationship between the sampler and the U-Net.

@radads - 18.12.2023 05:15

Thanks!

@user-ep4fk2jm6q - 17.12.2023 09:05

Really great video for understanding stable diffusion in detail. Thanks a lot for your contribution

@user-co6gq1mz1x - 16.12.2023 20:44

Amazing job, my friend! I just got a job in Shenzhen, China by learning from it! Thank you so much, mate. I hope you and your family are living a great life in China :)

@kajalgupta3168 - 11.12.2023 05:34

The pre-trained weights are not working with the code you have provided.

@The_One_Who_Moves_the_Stars - 09.12.2023 18:40

I wish the code font size were larger to make it easier to read.

@arjunreddy8358 - 09.12.2023 09:31

In the original Stable Diffusion training process, are the encoder and decoder components trained independently of the noise-prediction U-Net and then used as pre-trained models, so the architecture looks like pre-trained encoder + noise-prediction U-Net + pre-trained decoder (i.e., the U-Net training does not update the pre-trained encoder/decoder)? Or are the encoder, noise predictor, and decoder trained together as a unified system, where they collectively learn patterns from the training images?

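Short answer, per the latent diffusion paper: the autoencoder is trained first on images alone, then frozen; the U-Net is trained afterwards to predict noise in that frozen latent space (the CLIP text encoder is frozen too). A hedged sketch of that second stage, with hypothetical `unet`, `vae_encoder`, `text_encoder`, and `scheduler` callables (not the video's code), could look like this.

```python
import torch
import torch.nn.functional as F

def train_step(unet, vae_encoder, text_encoder, scheduler, images, prompts, optimizer):
    # First-stage components are pretrained and frozen; only the U-Net learns.
    with torch.no_grad():
        latents = vae_encoder(images)            # compress images to latents
        context = text_encoder(prompts)          # CLIP text embeddings

    noise = torch.randn_like(latents)
    t = torch.randint(0, scheduler.num_train_timesteps, (latents.shape[0],),
                      device=latents.device)
    noisy_latents = scheduler.add_noise(latents, noise, t)  # forward diffusion (hypothetical helper)

    pred = unet(noisy_latents, context, t)       # U-Net predicts the added noise
    loss = F.mse_loss(pred, noise)               # simple epsilon-prediction objective

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```
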
@zeweichu550 - 09.12.2023 08:29

I just discovered a great, wonderful, amazing, fantastic, gem channel 🎉🎉🎉

@user-or5dl7hd7o - 09.12.2023 05:42

It's the best explanation ever!!!! Thank you!

@user-uz6jc9ix1e - 08.12.2023 13:50

ModuleNotFoundError: No module named 'pytorch_lightning'

@unscripted-adventures - 07.12.2023 21:37

Awesome, This is the best explanation!!!

@loganli5609 - 06.12.2023 15:18

Thank you!

@lucao9059 - 05.12.2023 22:44

Jesus, I only have basic knowledge of AI and statistics, but you made me understand quite a lot of things thanks to your vid.

@icejust9195 - 05.12.2023 17:09

This is an amazing video!! Great job!!!

@user-kg9zs1xh3u - 03.12.2023 06:44

Very well explained! ❤

@user-kg9zs1xh3u - 03.12.2023 06:39

guoqing jie laojia 😂 Chinese? Very good video, keep going, thank you!

@satirthapaulshyam7769 - 02.12.2023 23:30

So you didn't train the U-Net?

@neocrz - 30.11.2023 03:23

Does this cover LoRA? If not, can you make a video on it?

@mmaxpo9852 - 28.11.2023 17:13

Thanks for helping us, your videos are very helpful.

@beyzakaya8620 - 28.11.2023 12:10

Is this code compatible with the diffusers library? I fine-tuned the Stable Diffusion model on my dataset, but I need a plain PyTorch model for further changes, and I couldn't work out the full dependencies from the diffusers code.

@user-lm8dm6bu4q - 28.11.2023 04:44

Great Work! Could you make a tutorial for ControlNet?

@gokayaydogan2473 - 28.11.2023 01:19

Almost Karpathy-level explanations, thank you!

@birendrakathariya3517 - 26.11.2023 21:28

excellent video, full of information

@kotcraftchannelukraine6118 - 26.11.2023 19:37

Is it possible to create a Stable Diffusion alternative using Brian2 instead of PyTorch?

@leonwong3369 - 26.11.2023 04:12

Thanks again for the video. This is my second time watching it. I can't help but notice that in the original latent diffusion paper they used a VQGAN to compress the image into the latent space. Is the choice of a VAE just for convenience?

@hason4234 - 25.11.2023 19:40

An extremely detailed video about diffusion. I have learned a lot. Thank you ❤❤❤

@hakancevikalp3653 - 23.11.2023 18:15

What about training? I could not find a training file in your GitHub repo either.

@lianhongw - 23.11.2023 07:06

Really appreciated, very informative.

@parmarsuraj99 - 23.11.2023 07:05

By far the best explanation ❤

@lostpenguin3682 - 21.11.2023 06:30

Your code is so detailed, and it runs in my environment just fine. Great job!!! 👏

@nakjoonim - 19.11.2023 16:27

Thank you so much for this amazing work!

@kastrogerrard2905 - 19.11.2023 13:40

Great work. I love this so much. Which auto-completion tool are you using in VS Code, by the way?

@coreyhu6787 - 19.11.2023 10:33

Why does the VAE encoder not use an activation between the two Convolution layers? Don't we need a nonlinearity?

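A likely answer, based on the standard Stable Diffusion autoencoder rather than the exact lines in the video: the nonlinearities live inside the residual blocks (GroupNorm + SiLU before each convolution), and the final 1x1 convolution is deliberately a linear projection onto the mean/log-variance channels, so no activation is wanted there. A rough sketch of the encoder tail, with illustrative channel sizes:

```python
import torch
import torch.nn as nn

# Rough sketch of the tail of an SD-style VAE encoder (not the video's exact code).
encoder_tail = nn.Sequential(
    nn.GroupNorm(32, 512),
    nn.SiLU(),                                    # the nonlinearity sits here...
    nn.Conv2d(512, 8, kernel_size=3, padding=1),  # ...before the 3x3 conv down to 8 channels
    nn.Conv2d(8, 8, kernel_size=1),               # linear 1x1 projection: 4 mean + 4 log-variance channels
)

x = torch.randn(1, 512, 64, 64)
mean, log_var = encoder_tail(x).chunk(2, dim=1)   # split into mean and log-variance
print(mean.shape, log_var.shape)                  # (1, 4, 64, 64) each
```
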
@dummy9422 - 19.11.2023 07:25

Subscribed ❤❤

@rabeyatussadia6729 - 16.11.2023 20:34

Thanks for this informative explanation. I was wondering: in the demo file you load a pretrained model, but you already built the model yourself, so why don't you use that one, and if I want to, how do I do that? Also, would you tell me how it works for image conditioning without using the CLIP text prompt? Thank you.

@hientq3824 - 16.11.2023 19:08

fabulous! thank you very much!

@ActualCode0 - 16.11.2023 18:11

This is the best explanation of latent diffusion models I've seen

@venkateshr6127 - 15.11.2023 10:28

Sir, if we want to pretrain a distilled version of Stable Diffusion, how do we do that?

@filipkilibarda8952 - 14.11.2023 02:34

Great video! Really well made and informative.
On a side note:
Is anybody familiar with the correct VAE loss function? As I understand it, it consists of a reconstruction loss and a KL term. How should the KL term be summed over channels and batches? Also, is there any normalization over the batch or the training set?

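A hedged sketch of the reduction most implementations use (a common convention, not a fixed rule): the closed-form KL against a standard normal is summed over all latent dimensions of each sample and then averaged over the batch, which is the only batch normalization involved; there is no extra normalization over the training set beyond iterating the dataloader. The tensor names here are assumptions.

```python
import torch

def kl_term(mean, log_var):
    # mean, log_var: (batch, latent_channels, h, w) from the encoder.
    # Per-element closed-form KL(N(mean, sigma^2) || N(0, I)).
    kl_per_element = -0.5 * (1 + log_var - mean.pow(2) - log_var.exp())
    # Sum over channels and spatial positions -> one value per sample...
    kl_per_sample = kl_per_element.sum(dim=[1, 2, 3])
    # ...then average over the batch so the scale is batch-size independent.
    return kl_per_sample.mean()
```
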
@lchunleo - 13.11.2023 10:16

Thanks for your video. I learned a lot. May I ask how I can add LoRA to it? Thanks.

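For the LoRA questions in this thread: LoRA is not covered in the video, but the idea is small enough to sketch. A trainable low-rank update B·A is added on top of a frozen weight matrix, typically on the attention projections of the U-Net and text encoder, and only A and B are trained. The wrapper below is a minimal, hypothetical illustration, not a library API.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update (a sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)              # freeze the original layer
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at start
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen path plus the scaled low-rank update.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 77, 768))             # same shape as the base layer's output
print(out.shape)                                 # torch.Size([2, 77, 768])
```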