exponential family

The exponential family is a class of probability distributions whose densities all share one exponential form, indexed by a natural parameter \eta.

constituents
y: the data
\eta: the natural parameter (vector or scalar)
T\left(y\right): the “sufficient statistic” (this is usually just y; vector or scalar)
b\left(y\right): the base measure (scalar)
a\left(\eta\right): the log-partition function (scalar)

requirements
A class of distributions is in the Exponential Family if it can be written as:

\begin{align} P\left(y \mid \eta\right) &= b\left(y\right) \exp \left(\eta^{\top}T\left(y\right)-a\left(\eta\right)\right) \\ &= \frac{b\left(y\right) \exp \left(\eta^{\top} T\left(y\right)\right)}{e^{a\left(\eta\right)}} \end{align}

To show that a particular family of distributions is an Exponential Family, we fix a choice of b, T, a and show that varying \eta recovers every member of that family.
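As a minimal sketch of this recipe, assuming scalar \eta and scalar T\left(y\right), the Python below evaluates b\left(y\right) \exp\left(\eta T\left(y\right) - a\left(\eta\right)\right) and compares it with the unit-variance Gaussian density, using the b, T, a choices from the Gaussian example at the end of these notes. The function names are illustrative and do not come from any library.

```python
import math


def exp_family_density(y, eta, b, T, a):
    """Evaluate b(y) * exp(eta * T(y) - a(eta)) for scalar eta and scalar T(y)."""
    return b(y) * math.exp(eta * T(y) - a(eta))


# Fixed choices of b, T, a for the Gaussian with sigma = 1 (so eta = mu).
def gaussian_b(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def gaussian_T(y):
    return y


def gaussian_a(eta):
    return 0.5 * eta * eta


def normal_pdf(y, mu):
    """Textbook density of N(mu, 1), used only as a reference."""
    return math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2.0 * math.pi)


def max_abs_gap(mus, ys):
    """Largest absolute difference between the two forms over a small grid."""
    return max(
        abs(
            exp_family_density(y, mu, gaussian_b, gaussian_T, gaussian_a)
            - normal_pdf(y, mu)
        )
        for mu in mus
        for y in ys
    )


if __name__ == "__main__":
    # Varying eta (= mu) with b, T, a held fixed sweeps out the Gaussian family.
    print(max_abs_gap([-2.0, 0.0, 1.5], [-1.0, 0.3, 2.0]))
```

The gap should be at floating-point rounding level, which is the numerical counterpart of "fix b, T, a and vary \eta".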
additional information
properties of exponential family
the log-likelihood with respect to \eta is concave, so any local maximum is the global maximum; equivalently, the negative log-likelihood is convex
\mathbb{E}[y \mid \eta] = \pdv{\eta} a\left(\eta\right) (when T\left(y\right) = y)
\text{Var}[y \mid \eta] = \pdv[2]{\eta} a\left(\eta\right) (when T\left(y\right) = y)

motivation
“family”
What is a family of distributions? We can write a set

\begin{equation} S = \left\{\text{Bern}\left(j\right) \mid j \in [0.0, 1.0]\right\} \end{equation}

which is a family of Bernoulli distributions, one for each value of j. Similarly, for some fixed variance \sigma^{2}, the Gaussians with varying mean form a family:

\begin{equation} S = \left\{\mathcal{N}\left(i, \sigma^{2}\right) \mid i \in \mathbb{R}\right\} \end{equation}

example
Bernoulli distribution is in the exponential family
Claim: the Bernoulli distribution

\begin{equation} p\left(y\mid \phi\right) = \phi^{y} \left(1-\phi\right)^{1-y} \end{equation}

is in the exponential family.

\begin{align} \phi^{y}\left(1-\phi\right)^{1-y} &= \exp \log \left(\phi^{y} \left(1-\phi\right)^{1-y}\right) \\ &= \exp \left(y \log \phi + \left(1-y\right) \log\left(1-\phi\right)\right) \\ &= \exp\left( \left(\left(\log \frac{\phi}{1-\phi}\right)\right)y + \log \left(1-\phi\right)\right) \end{align}

Comparing with the general form, the natural parameter is the log-odds, which inverts to the sigmoid:

\begin{equation} \eta = \log \frac{\phi}{1-\phi} \end{equation}

\begin{equation} \phi = \frac{1}{1+e^{-\eta}} \end{equation}

Reading off the remaining terms:

\begin{equation} \begin{cases} a\left(\eta\right) = -\log \left(1-\phi\right) = \log \left(1+e^{\eta}\right) \\ T\left(y\right) = y\\ b\left(y\right) = 1 \end{cases} \end{equation}

Hence, we can conclude that \text{ExpFam}\left(\eta\right) = \text{Bern}\left(\phi\right).
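The Bernoulli result can be sanity-checked numerically. The sketch below assumes T\left(y\right) = y, b\left(y\right) = 1 and a\left(\eta\right) = \log\left(1+e^{\eta}\right); it compares the two forms of the pmf, then uses finite differences (an approximation chosen for illustration, not part of the derivation) to check that the first and second derivatives of a match the Bernoulli mean \phi and variance \phi\left(1-\phi\right). All function names are illustrative.

```python
import math


def sigmoid(eta):
    """Map the natural parameter eta back to phi."""
    return 1.0 / (1.0 + math.exp(-eta))


def log_partition(eta):
    """a(eta) = log(1 + e^eta) for the Bernoulli distribution."""
    return math.log1p(math.exp(eta))


def bernoulli_pmf(y, phi):
    """Standard form: phi^y * (1 - phi)^(1 - y), with y in {0, 1}."""
    return phi ** y * (1.0 - phi) ** (1 - y)


def exp_family_pmf(y, eta):
    """Exponential family form with T(y) = y and b(y) = 1."""
    return math.exp(eta * y - log_partition(eta))


def first_derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


if __name__ == "__main__":
    phi = 0.3
    eta = math.log(phi / (1.0 - phi))  # log-odds
    for y in (0, 1):
        print(y, bernoulli_pmf(y, phi), exp_family_pmf(y, eta))
    print("mean estimate:", first_derivative(log_partition, eta))
    print("variance estimate:", second_derivative(log_partition, eta))
```

The finite-difference step sizes are a pragmatic choice; an automatic differentiation library would give the same derivatives without the approximation error.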
Gaussian distribution
You can try this yourself for fixed \sigma=1. Just expand the quadratic \left(y-\mu\right)^{2} and pattern match:

\begin{equation} \begin{cases} b\left(y\right) = \frac{1}{\sqrt{2\pi}} \exp \left(-\frac{1}{2}y^{2}\right) \\ \eta = \mu \\ T\left(y\right) = y \\ a\left(\eta\right) = \frac{1}{2} \eta^{2} \end{cases} \end{equation}
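
For reference, here is a worked sketch of that pattern match, assuming the standard density of \mathcal{N}\left(\mu, 1\right):

\begin{align} \frac{1}{\sqrt{2\pi}} \exp \left(-\frac{1}{2}\left(y-\mu\right)^{2}\right) &= \frac{1}{\sqrt{2\pi}} \exp \left(-\frac{1}{2}y^{2} + \mu y - \frac{1}{2}\mu^{2}\right) \\ &= \frac{1}{\sqrt{2\pi}} \exp \left(-\frac{1}{2}y^{2}\right) \exp \left(\mu y - \frac{1}{2}\mu^{2}\right) \end{align}

The first factor depends only on y, so it plays the role of b\left(y\right). Inside the second exponential, \mu y matches \eta^{\top} T\left(y\right) with \eta = \mu and T\left(y\right) = y, and the leftover \frac{1}{2}\mu^{2} is the log-partition function a\left(\eta\right).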