
xavier initialization

A neural network initialization scheme that tries to avoid Vanishing Gradients.
Consider the Wx step in a neural network:

\begin{equation} o_{i} = \sum_{j=1}^{n_{\text{in}}} w_{ij} x_{j} \end{equation}

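If the weights $w_{ij}$ and inputs $x_{j}$ are independent with zero mean, with weight variance $\sigma^{2}$ and input variance $v^{2}$, the variance of this is: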

\begin{equation} \text{Var}\left[o_{i}\right] = n_{\text{in}} \sigma^{2} v^{2} \end{equation}
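
The variance formula can be checked numerically. The following is a minimal sketch, not a reference implementation: it assumes zero-mean Gaussian weights and inputs that are independent of each other, and the helper names (`sample_output`, `empirical_variance`) and the chosen values of `n_in`, `sigma`, and `v` are purely illustrative.

```python
import random
import statistics


def sample_output(n_in, sigma, v, rng):
    """Draw one o_i = sum_j w_ij * x_j.

    Assumption: w_ij ~ N(0, sigma^2) and x_j ~ N(0, v^2), all independent.
    """
    return sum(rng.gauss(0.0, sigma) * rng.gauss(0.0, v) for _ in range(n_in))


def empirical_variance(n_in, sigma, v, trials=5000, seed=0):
    """Estimate Var[o_i] from repeated independent draws."""
    rng = random.Random(seed)
    outputs = [sample_output(n_in, sigma, v, rng) for _ in range(trials)]
    return statistics.pvariance(outputs)


# Illustrative values only (not taken from the text above).
n_in, sigma, v = 20, 0.1, 2.0

predicted = n_in * sigma**2 * v**2  # n_in * sigma^2 * v^2
measured = empirical_variance(n_in, sigma, v)

print(f"predicted: {predicted:.4f}  measured: {measured:.4f}")
```

Under these assumptions the measured value should land close to the predicted one, with the gap shrinking as `trials` grows. The formula also shows why the weight scale matters: choosing $\sigma^{2} = 1 / n_{\text{in}}$ makes the output variance equal to the input variance $v^{2}$, so the signal neither grows nor shrinks through this step.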