Comments:
In the backward function of the Dense class you're returning a matrix that uses the class's weight parameter after updating it. Surely you'd calculate this dE/dX value (and thus use the pre-update dY/dX) before updating the weights?
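For anyone with the same question: one way to avoid the ordering issue is to compute the input gradient before mutating the weights. A minimal sketch, with the class layout assumed (modeled on the video's Dense layer, not copied from it):

```python
import numpy as np

class Dense:
    def __init__(self, input_size, output_size):
        # Random init; shapes follow the column-vector convention
        # (inputs and outputs are (n, 1) column vectors).
        self.weights = np.random.randn(output_size, input_size)
        self.bias = np.random.randn(output_size, 1)

    def forward(self, input):
        self.input = input
        return np.dot(self.weights, input) + self.bias

    def backward(self, output_gradient, learning_rate):
        # Compute dE/dX with the *pre-update* weights first...
        input_gradient = np.dot(self.weights.T, output_gradient)
        # ...and only then apply the parameter updates.
        self.weights -= learning_rate * np.dot(output_gradient, self.input.T)
        self.bias -= learning_rate * output_gradient
        return input_gradient
```

Ordering the two statements this way guarantees the gradient passed to the previous layer reflects the weights that actually produced the forward output.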
How did you make this video?
Thank you so much, my assignment was so unclear, this definitely helps!
I followed the code exactly, and I still get NumPy shape errors.
Thank you very much for your videos explaining how to build ANNs and CNNs from scratch in Python. Your explanations of the detailed calculations for forward and backward propagation, and for the calculations in the kernel layers of the CNN, are very clear, and seeing how you have managed to implement them in only a few lines of code is very helpful in (1) understanding the calculations and processes, and (2) demystifying what is otherwise a black box in TensorFlow / Keras.
Thank you, very helpful! I am just a beginner.
Jesus Christ, this is a good video, and it shows clear understanding. No "I've been using neural networks for ten years, so pay attention as I ramble aimlessly for an hour" involved.
What code editor do you use?
The music is super annoying and sad. It doesn't even fit. Plus, it's competing with the voice.
This video really saved me. From matrix representation to the chain rule and visualisation, everything is clear now.
I developed my first neural network in one night yesterday. It couldn't learn because of backward propagation; it was only going through std::vectors of std::vectors to get the output. I was setting weights to random values and tried to guess how to apply backward propagation from what I had heard about it.
But it failed to do anything: it kept guessing just as I did, giving wrong answers anyway.
This video has a clean, comprehensive explanation of the flow and architecture. I am really excited by how simple and clean it is.
I am gonna try again.
Thank you.
Yeah, this is awesome
Would you mind sharing the Manim project for this video?
In your code you compute the gradient step for each sample and update immediately; I think this is called stochastic gradient descent.
To implement full gradient descent, where I update after all samples, I added a counter in the Dense layer class to count the samples.
When the counter reached the training-set size I would average all the stored nudges for the bias and the weights.
Unfortunately, when I plot the error over epochs as a graph there are still a lot of spikes (fewer spikes than with your method, but still some).
My training data has (x, y) and tries to find (x + y).
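An alternative to a per-layer counter is to accumulate the per-sample gradients outside the layer and apply their average once per pass. A sketch under the assumption that the parameters are plain NumPy arrays and that a `grad_fn(x, y)` returning per-sample gradients is available (both names are hypothetical):

```python
import numpy as np

def full_batch_step(weights, bias, samples, grad_fn, learning_rate):
    """Average per-sample gradients over the whole set, then update once.

    grad_fn(x, y) is assumed to return (dE/dW, dE/db) for one sample.
    """
    grad_w = np.zeros_like(weights)
    grad_b = np.zeros_like(bias)
    for x, y in samples:
        gw, gb = grad_fn(x, y)
        grad_w += gw          # accumulate instead of updating immediately
        grad_b += gb
    n = len(samples)
    weights -= learning_rate * grad_w / n   # one averaged update per pass
    bias -= learning_rate * grad_b / n
    return weights, bias
```

Some residual spikes are still expected with full-batch descent if the learning rate is large relative to the curvature of the loss.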
Not only was the math presentation very clear, but the Python class abstraction was elegant.
Best tutorial video about neural networks I've ever watched. You are doing such a great job 👏
Can anyone please explain why we need mse_prime (∂E/∂Y) for the last layer? If it is the last layer, its output should be the final one, right? Why do we need to calculate mse_prime then?
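For the question above: the last layer's output is indeed the prediction, but backpropagation still needs a starting gradient, and mse_prime is exactly that seed, the derivative of the loss with respect to the final output. A sketch of the usual definitions (assumed to match the video's):

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: the scalar loss we report during training.
    return np.mean(np.power(y_true - y_pred, 2))

def mse_prime(y_true, y_pred):
    # dE/dY: the gradient fed into the last layer's backward() to
    # start the chain rule; without it there is nothing to propagate.
    return 2 * (y_pred - y_true) / np.size(y_true)
```

So mse_prime is not producing another output; it converts "how wrong is the prediction" into "in which direction should the last layer's output move", which is what every backward() call downstream consumes.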
Very educational.
Amazing explanation!!
Why do we use the dot product function for matrix multiplication? I thought those did different things.
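On the dot-product question: for 2-D arrays, np.dot performs matrix multiplication (the same result as the @ operator); the name only means the classic inner product when the inputs are 1-D. A quick check:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# For 2-D inputs, np.dot is matrix multiplication...
assert np.array_equal(np.dot(A, B), A @ B)

# ...while for 1-D inputs it is the classic inner (dot) product.
v = np.array([1, 2, 3])
assert np.dot(v, v) == 14
```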
When I checked the output of the dense layer I was getting an array of shape (output_size, output_size) instead of (output_size, 1); later I read it's due to broadcasting, but I don't know what that is. When I changed the bias shape from (output_size, 1) to (output_size,) I got a result with shape (output_size, 1).
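The surprise (output_size, output_size) shape is what NumPy broadcasting produces when an (n, 1) column meets an (n,) 1-D array; a small illustration of the rule:

```python
import numpy as np

col = np.zeros((3, 1))   # column vector, e.g. np.dot(weights, input)
flat = np.zeros(3)       # 1-D array, e.g. a bias stored without the trailing 1

# (3, 1) + (3,) broadcasts to (3, 3): the column is repeated across
# columns and the 1-D array is repeated across rows.
assert (col + flat).shape == (3, 3)

# Keeping both operands as (n, 1) columns avoids the surprise.
assert (col + flat.reshape(3, 1)).shape == (3, 1)
```

So the usual fix is to make sure the input, the bias, and the layer output all stay (n, 1) column vectors throughout.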
Best tutorial 💯💯💯💯
When looking at the error and its derivative w.r.t. some y[i], intuitively I would expect that if I increased y[i] by 1 the error would increase by dE/dy[i], but if I do the calculations, the change in the error is 1/n off from the derivative. Does this make sense?
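The 1/n offset noted above is expected: the derivative only predicts the change for an *infinitesimal* step, and a finite step of 1 on a quadratic loss picks up an exact second-order term of 1/n. A numerical sketch, with mse/mse_prime assumed to be the video's usual definitions:

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mse_prime(y_true, y_pred):
    return 2 * (y_pred - y_true) / np.size(y_true)

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 1.0, 2.0])

# A *small* step matches the derivative (up to O(eps))...
eps = 1e-6
bumped = y_pred.copy()
bumped[0] += eps
numeric = (mse(y_true, bumped) - mse(y_true, y_pred)) / eps
assert abs(numeric - mse_prime(y_true, y_pred)[0]) < 1e-5

# ...while a step of exactly 1 adds the second-order term 1/n:
# ((d + 1)^2 - d^2) / n = 2d/n + 1/n = dE/dy[0] + 1/n.
bumped1 = y_pred.copy()
bumped1[0] += 1.0
delta = mse(y_true, bumped1) - mse(y_true, y_pred)
assert np.isclose(delta, mse_prime(y_true, y_pred)[0] + 1.0 / y_true.size)
```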
Junk
Thank you very, very much for this video.
That is so satisfying.
Shouldn't the gradient of the input vector involve some sort of sum along one axis of the weights matrix?
For example, if your input vector has shape (2, 1) and your weights matrix is (4, 2), you do np.matmul(weights, input) and create a vector of shape (4, 1); the gradient of the input vector with respect to the output vector would then be the weights matrix summed along axis 0.
Very clean code! But I am so confused about the flow.
I made a from-scratch neural network, and my flow is: my training data X and Y is simply a two-dimensional array, so without diving into difficult math, each time I adjust the weights the update comes from all the training data.
Your code breaks the training data into single samples, running each X, Y forward, then backward, and adjusting the weights.
What is the difference, and why isn't it a big problem?
Man, I love you. So many times I tried to build a multilayer NN on my own, but I always faced a thousand problems. This video explained everything. Thank you.
Wonderful, informative, and excellent work. Thanks a zillion!!
I noticed that you are using a batch size of one. For batch sizes > 1, make a separate gradient variable and an apply-gradients function.
Note 1: also change "+ bias" to "np.add(stuff, bias)" or "+ bias[:, None]".
Note 2: in backpropagation, sum up the bias gradients on axis 0 (I'm pretty sure the axis is 0) and divide both the weight and bias gradients by the batch size.
What is the use of the Layer class? And a great video; hope you keep posting stuff on your channel.
Tanh is one of those things that sounds great in principle, but in practice people use linear. Frankly, I just clamp it these days. Curves kill the spread and clump it all up, and lines work fine. They're close enough to curves.
Amazing!!
Awesome, man!!
Hi, I love this video. Only one question: in DenseLayer.backward(), why do you use -= instead of = for the bias and the weights? Why do we subtract that value?
The rest is all clear :) Ty
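On the -= question: the gradient points in the direction of *steepest increase* of the error, so gradient descent subtracts it to move downhill. A toy sketch minimizing E(w) = w², whose gradient is 2w:

```python
# Gradient descent on E(w) = w**2; the gradient is dE/dw = 2*w.
w = 5.0
learning_rate = 0.1
for _ in range(100):
    grad = 2 * w
    w -= learning_rate * grad   # subtract: step *against* the gradient

# w has been driven toward the minimum at w = 0.
assert abs(w) < 1e-6
```

Using = instead of -= would replace the parameters with the gradient itself rather than nudging them downhill, and using += would climb the error surface instead of descending it.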
Hi, I'm trying to print the weights after every epoch, but I'm not able to do so. Can you help with what's going wrong with this approach? I simply tried to use the forward method during training:

def predict(network, input, train=True):
    output = input
    for layer in network:
        if layer.__class__.__name__ == 'Dense':
            output = layer.forward(output)
            list_.append(layer.weights)
        else:
            output = layer.forward(output)
    return output

However, I get the same corresponding weights every time.
How can we update this to include mini-batch gradient descent? Especially, how will the equations change?
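On the mini-batch question: one common vectorization stacks a batch of B column samples into an (input_size, B) matrix. The forward pass is unchanged thanks to broadcasting, and in backward the bias gradient is summed (here also averaged) across the batch axis. A sketch, with shapes assumed rather than taken from the video:

```python
import numpy as np

class DenseBatch:
    def __init__(self, input_size, output_size):
        self.weights = np.random.randn(output_size, input_size)
        self.bias = np.random.randn(output_size, 1)

    def forward(self, X):
        # X: (input_size, B). The (output_size, 1) bias broadcasts over B.
        self.X = X
        return self.weights @ X + self.bias

    def backward(self, dY, lr):
        # dY: (output_size, B). Gradients are averaged over the batch axis.
        B = dY.shape[1]
        dX = self.weights.T @ dY                      # (input_size, B)
        self.weights -= lr * (dY @ self.X.T) / B      # sums over batch via @
        self.bias -= lr * dY.sum(axis=1, keepdims=True) / B
        return dX
```

Compared with the per-sample equations, dE/dW becomes dY · Xᵀ (which sums the per-sample outer products automatically) and dE/db becomes the per-column sum of dY, with both divided by the batch size.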
I would like it a lot if you continue your channel, bro.
The best tutorial on neural networks I've ever seen! Thanks, you have my subscription!
Amazing video. One thing we could do is have layers calculate their input sizes automatically when possible: if I give Dense(2, 8), then for the next layer I don't need to give 8 as the input size, since it's obvious that it will be 8. Similar to how Keras does this.
Actually, you saved my life. Thanks for doing these.
This is a very good approach to building neural nets from scratch.
That was incredibly well explained and illustrated. Thanks.
Are you the guy from 3blue1brown?