A sample from every note set, free to download. No signup required for most samples, no credit card ever. If the format works for you, the rest will too.
The chapter that most people cite as the one that made it click. From MLE to the gradient update rule, every step shown — including the sigmoid derivative worked out four lines at a time.
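For a taste of the pace (a quick sketch of our own here, not a page lifted from the notes), that derivative in the same four steps:

```latex
\begin{align}
\sigma'(x) &= \frac{d}{dx}\,\bigl(1 + e^{-x}\bigr)^{-1} \\
           &= -\bigl(1 + e^{-x}\bigr)^{-2}\cdot\bigl(-e^{-x}\bigr) \\
           &= \frac{e^{-x}}{\bigl(1 + e^{-x}\bigr)^{2}} \\
           &= \sigma(x)\,\bigl(1 - \sigma(x)\bigr)
\end{align}
```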
Backprop derived from the chain rule with one worked example on a small network. No “it can be shown that” anywhere.
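If you'd rather see the idea in code first, here is the chain rule applied by hand on a tiny network (a sketch with our own toy numbers, not the worked example from the notes):

```python
import numpy as np

# Chain rule by hand on a one-hidden-unit network (illustrative numbers,
# not the notes' example): y_hat = w2 * sigmoid(w1 * x), loss = (y_hat - y)^2.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = 2.0, 1.0          # one training example
w1, w2 = 0.5, -0.3       # weights

# Forward pass, keeping every intermediate value.
z = w1 * x               # pre-activation
h = sigmoid(z)           # hidden activation
y_hat = w2 * h           # prediction
loss = (y_hat - y) ** 2

# Backward pass: each gradient is a product of local derivatives.
dloss_dyhat = 2 * (y_hat - y)
dloss_dw2 = dloss_dyhat * h            # dy_hat/dw2 = h
dloss_dh = dloss_dyhat * w2            # dy_hat/dh  = w2
dloss_dz = dloss_dh * h * (1 - h)      # sigmoid'(z) = h * (1 - h)
dloss_dw1 = dloss_dz * x               # dz/dw1 = x

print(dloss_dw1, dloss_dw2)
```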
Why the Bellman equation is just “the value of now equals now plus the value of next.” Works out the gridworld case.
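In symbols (our compressed version; the notes spell out every term before working the gridworld): the reward now, plus the discounted value of next.

```latex
V^{\pi}(s) \;=\; \mathbb{E}\bigl[\, r_t + \gamma\, V^{\pi}(s_{t+1}) \;\big|\; s_t = s \,\bigr]
```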
The definition of a derivative, the idea of going downhill, and the update rule. For anyone who's memorized gradient descent without understanding it.
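All three ideas fit in a few lines of runnable code (a toy function of our own, not the one in the notes):

```python
# The whole of gradient descent, on f(x) = (x - 3)^2: step downhill,
# i.e. against the derivative, until you stop moving.
def f_prime(x):
    return 2 * (x - 3)   # derivative of (x - 3)^2

x = 0.0                  # starting guess
lr = 0.1                 # learning rate: how big each downhill step is
for _ in range(50):
    x = x - lr * f_prime(x)   # the update rule

print(x)  # converges toward 3, the minimum
```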
What a convolution actually does to a pixel grid, why it's called that, and how it was built into the first CNN.
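The sliding-window arithmetic, as a loop you can run (a sketch with a made-up edge kernel; the notes draw it out on an actual grid):

```python
import numpy as np

# What a convolution does to a pixel grid, in plain loops.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Slide the kernel, multiply element-wise, sum: that's the whole trick.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[-1., 0., 1.]] * 3)   # crude vertical-edge detector
print(conv2d(image, kernel))
```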
A single attention head, worked out with small matrices you can follow by hand. No dot products hidden in the notation.
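Here's the same computation as runnable code, with toy numbers of our own rather than the notes' worked matrices:

```python
import numpy as np

# One attention head with matrices small enough to check by hand.
np.random.seed(0)
d = 4                                    # key/query dimension
X = np.random.randn(3, d)               # 3 tokens, d features each

Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv        # queries, keys, values

scores = Q @ K.T / np.sqrt(d)           # every dot product, out in the open
weights = np.exp(scores)
weights /= weights.sum(axis=1, keepdims=True)   # softmax over each row

output = weights @ V                    # weighted mix of value vectors
print(output.shape)                     # (3, 4): one vector per token
```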
Sixty handwritten pages across every subject, bundled into one PDF. Drop your email and it lands in your inbox within a minute.
One email. Unsubscribe anytime. We never sell your address.