
gradient descent

It's hard to find a globally optimal solution directly, so we instead make local progress.
constituents
parameters \theta
step size \alpha
cost function J (and its derivative J')
requirements
let \theta^{(0)} = 0 (or a random point), and then:

\begin{equation} \theta^{(t+1)} = \theta^{(t)} - \alpha J'\left(\theta^{(t)}\right) \end{equation}

"Update the weights by taking a step in the opposite direction of the gradient." We stop when the result is "good enough": the training data is noisy enough that a slightly non-converged optimum is fine.
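The one-dimensional update rule can be sketched in a few lines of Python. This is only an illustration: the function name `gradient_descent_1d`, the tolerance `tol`, and the example cost J(\theta) = (\theta - 3)^2 are assumptions made for the sketch, and the stopping rule is one possible reading of "good enough".

```python
def gradient_descent_1d(dJ, theta0=0.0, alpha=0.1, tol=1e-6, max_steps=10_000):
    """Iterate theta <- theta - alpha * J'(theta) from the starting point theta0."""
    theta = theta0
    for _ in range(max_steps):
        step = alpha * dJ(theta)
        theta -= step  # move against the derivative
        # Assumed "good enough" rule: stop once the update is negligible.
        if abs(step) < tol:
            break
    return theta


# Hypothetical cost J(theta) = (theta - 3)^2, whose derivative is 2 * (theta - 3).
theta_hat = gradient_descent_1d(lambda th: 2.0 * (th - 3.0))
print(theta_hat)  # close to the minimizer theta = 3
```

A larger `tol` stops earlier with a rougher answer, which is the trade-off the note describes.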
additional information
multi-dimensional case

\begin{equation} \theta^{(t+1)} = \theta^{(t)} - \alpha \nabla J\left(\theta^{(t)}\right) \end{equation}
where:

\begin{equation} \nabla J(\theta) = \mqty(\dv \theta_{1} J(\theta) \\ \dots \\ \dv \theta_{d} J(\theta)) \end{equation}
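In the multi-dimensional case the only change is that the derivative becomes a vector of partial derivatives, one per parameter. A minimal sketch, assuming a hypothetical two-parameter cost J(\theta) = (\theta_{1} - 1)^2 + 2(\theta_{2} + 2)^2 and a fixed number of steps instead of a stopping rule:

```python
def gradient_descent(grad_J, theta0, alpha=0.1, steps=200):
    """Iterate theta <- theta - alpha * grad J(theta), component by component."""
    theta = list(theta0)
    for _ in range(steps):
        g = grad_J(theta)  # one partial derivative per parameter
        theta = [th - alpha * gj for th, gj in zip(theta, g)]
    return theta


def grad_J_example(theta):
    # Gradient of the assumed cost (theta_1 - 1)^2 + 2 * (theta_2 + 2)^2.
    return [2.0 * (theta[0] - 1.0), 4.0 * (theta[1] + 2.0)]


theta_min = gradient_descent(grad_J_example, [0.0, 0.0])
print(theta_min)  # close to [1, -2]
```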

gradient descent for least-squares error
We have:

\begin{equation} J\left(\theta\right) = \frac{1}{2} \sum_{i=1}^{n} \left(h_{\theta }\left(x^{(i)}\right) - y^{(i)}\right)^{2} \end{equation}

We want the derivative of this with respect to each \theta_{j}, which is straightforward: by the chain rule, the exponent cancels the factor of \frac{1}{2}:

\begin{equation} \dv \theta_{j }J(\theta) = \sum_{i=1}^{n}\left(h_{\theta } \left(x^{(i)}\right) - y^{(i)}\right) \dv \theta_{j} h_{\theta} \left(x^{(i)}\right) \end{equation}

Recall that h_{\theta}(x) = \theta_{0} x_{0} + \ldots is linear in \theta,
and so: \dv \theta_{j} h_{\theta}(x) = x_{j}, since every other term does not depend on \theta_{j} and its derivative goes to 0.
So, our update rule is:

\begin{align} \theta_{j}^{(t+1)} &= \theta_{j}^{(t)} - \alpha \dv \theta_{j} J\left(\theta^{(t)}\right) \\ &= \theta_{j}^{(t)} -\alpha \sum_{i=1}^{n} \left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right)x_{j}^{(i)} \end{align}

Meaning, in vector notation: \theta^{(t+1)} = \theta^{(t)}-\alpha \sum_{i=1}^{n} \left(h_{\theta }\left(x^{(i)}\right) - y^{(i)}\right)x^{(i)}
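The least-squares update can be sketched directly from the vector form. The toy data (points on the line y = 1 + 2x, with x_{0} = 1 as an intercept feature), the step size, and the step count are assumptions chosen so that the iteration is stable; note that the gradient is a sum over examples, so \alpha has to shrink as n grows.

```python
def predict(theta, x):
    """Linear hypothesis h_theta(x) = sum_j theta_j * x_j."""
    return sum(t * xj for t, xj in zip(theta, x))


def least_squares_gd(xs, ys, alpha=0.02, steps=2000):
    """Batch gradient descent on J(theta) = 1/2 * sum_i (h_theta(x_i) - y_i)^2."""
    d = len(xs[0])
    theta = [0.0] * d  # theta^(0) = 0
    for _ in range(steps):
        residuals = [predict(theta, x) - y for x, y in zip(xs, ys)]
        # j-th partial derivative: sum_i (h_theta(x_i) - y_i) * x_ij
        grad = [sum(r * x[j] for r, x in zip(residuals, xs)) for j in range(d)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Assumed toy data: x_0 = 1 (intercept), x_1 in {0, ..., 4}, y = 1 + 2 * x_1.
xs = [[1.0, float(v)] for v in range(5)]
ys = [1.0 + 2.0 * x[1] for x in xs]
theta_fit = least_squares_gd(xs, ys)
print(theta_fit)  # close to [1, 2]
```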
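when does gradient descent provably work?
… on convex functions: there every local minimum is a global minimum, so for a suitably small step size \alpha, local progress leads to the global optimum.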
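To see why convexity matters, it can help to run the same update on a function that is not convex. The sketch below assumes a hypothetical cost J(\theta) = \theta^4 - 3\theta^2 + \theta, which has two separate local minima; which one gradient descent reaches depends only on the starting point.

```python
def J_nonconvex(theta):
    # Assumed non-convex example cost with two local minima.
    return theta**4 - 3.0 * theta**2 + theta


def dJ_nonconvex(theta):
    return 4.0 * theta**3 - 6.0 * theta + 1.0


def descend(theta, alpha=0.01, steps=1000):
    """Plain 1-D gradient descent with a fixed number of steps."""
    for _ in range(steps):
        theta -= alpha * dJ_nonconvex(theta)
    return theta


left = descend(-2.0)   # settles in the minimum with negative theta
right = descend(2.0)   # settles in the minimum with positive theta
print(left, J_nonconvex(left))
print(right, J_nonconvex(right))
```

The two runs end at different points with different costs, so only one of them found the global minimum. On a convex cost this cannot happen.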
stochastic gradient descent
see stochastic gradient descent
