Exponential Family

A family of univariate PDFs/PMFs is said to be an exponential family if it can be expressed as:

$$f (x; θ) = c (θ) h (x) \exp (t (x) q (θ))$$

or equivalently,

$$f (x; θ) = h (x) \exp (t (x) q (θ) - b (θ)),$$

where all functions are scalar-valued, $h (x) \geq 0$, and $c (θ) = e^{- b (θ)} > 0$ is the normalizing constant.

We can extend this to cover multivariate random variables and multi-dimensional parameters. Specifically, consider real vectors $x \in R^{d}$ , $θ \in R^{s}$ and $t, q \in R^{k}$ where $k \geq s$ . A joint PDF/PMF is said to be exponential if

$$f (x; θ) = c (θ) h (x) \exp (⟨ t (x), q (θ) ⟩),$$

where the inner product may be a matrix inner product (such as the Frobenius inner product) when $t$ and $q$ are matrix-valued. Common distributions have $s = k$, and such a distribution is said to be in a ==$k$-parameter exponential family==. When $s < k$, we say it is in a curved exponential family. An exponential family has the following components:

- $c (θ)$ is the normalizing constant;
- $h (x)$ is the base measure;
- $q (θ)$ is the natural parameter; it can be thought of as a reparameterization of $θ$, and thus we require the dimension of $q$ to be no less than that of $θ$;
- $t (x)$ is a ==Sufficient Statistic== for $θ$ ranging over the natural parameter space, i.e., the set of parameters for which the density can be normalized:

  $$Θ = \{θ : \int h (x) e^{⟨ q (θ), t (x)⟩} d x < \infty\};$$

- $\exp (q (θ)^{T} t (x))$ is the exponential tilt that up-weights or down-weights the base measure.
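
As a quick illustration of the natural parameter space, take the Exponential Distribution, whose components are $h (x) = 𝟙_{x \geq 0}$, $t (x) = x$, and $q (λ) = - λ$. The following worked check is a sketch added for illustration:

$$
\int h (x) e^{q (λ) t (x)} d x = \int_{0}^{\infty} e^{- λ x} d x = \frac{1}{λ} < \infty \iff λ > 0 .
$$

So the natural parameter space is $Θ = (0, \infty)$, and the normalizing constant is $c (λ) = λ$.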

If $q = θ$ (perhaps after some reparameterization $θ \leftarrow q (θ)$ ), we say the exponential family is in canonical form. Further, if the sufficient statistic is the r.v. itself, i.e., $t (x) = x$ , we say the exponential family is in natural form. In between, we have the dispersion form:

$$f (x; θ) = h (ϕ, x) \exp \left(\frac{x^{T} θ - b (θ)}{ϕ}\right),$$

where $ϕ$ is called the dispersion parameter. If $ϕ$ is known, then $θ$ is the only canonical parameter, but the sufficient statistic is the dispersed $x$: $t (x) = x / ϕ$. If $ϕ$ is unknown, the model may correspond to a multi-parameter exponential family.

📗 For example, a Normal Distribution with known variance $σ^{2}$ has a dispersion form with $ϕ = σ^{2}$ , $t (x) = x$ , $q (μ) = μ$ , and $b (μ) = μ^{2} /2$ .
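
To see where these components come from, one can expand the square in the exponent of the Normal density. The following is a worked sketch using only the definitions above, with $h (ϕ, x)$ collecting every factor that does not involve $μ$:

$$
\begin{aligned}
f (x; μ) &= (2 π σ^{2})^{- 1/2} \exp \left(- \frac{(x - μ)^{2}}{2 σ^{2}}\right) \\
&= \underbrace{(2 π σ^{2})^{- 1/2} \exp \left(- \frac{x^{2}}{2 σ^{2}}\right)}_{h (ϕ, x)} \exp \left(\frac{x μ - μ^{2} /2}{σ^{2}}\right) .
\end{aligned}
$$

Matching the last factor against the dispersion form gives $θ = μ$, $b (θ) = θ^{2} /2$, and $ϕ = σ^{2}$.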

Examples

To verify that a family of distributions is an exponential family, we must be able to identify the functions $c (θ)$ (or $b (θ)$), $h (x)$, $t (x)$ and $q (θ)$. The table below uses the convention $f (x; θ) = h (x) \exp (⟨ t (x), q (θ) ⟩ - b (θ))$.

| Distribution | PDF/PMF | $q (θ)$ | $t (x)$ | $b (θ)$ | $h (x)$ |
| --- | --- | --- | --- | --- | --- |
| Univariate Normal Distribution | $(2 π)^{- 1/2} σ^{- 1} \exp (- (x - μ)^{2} / (2 σ^{2}))$ | $(μ / σ^{2}, - 1/ (2 σ^{2}))$ | $(x, x^{2})$ | $μ^{2} / (2 σ^{2}) + \ln σ$ | $(2 π)^{- 1/2}$ |
| Multivariate Normal Distribution | $(2 π)^{- d /2} \lvert Σ \rvert^{- 1/2} \exp (- (x - μ)^{T} Σ^{- 1} (x - μ) /2)$ | $(Σ^{- 1} μ, - Σ^{- 1} /2) \in R^{d} \times R^{d \times d}$ | $(x, x x^{T})$ | $(μ^{T} Σ^{- 1} μ + \ln \lvert Σ \rvert) /2$ | $(2 π)^{- d /2}$ |
| Exponential Distribution | $λ \exp (- λ x)$ | $- λ$ | $x$ | $- \ln λ$ | $𝟙_{x \geq 0}$ |
| Bernoulli Distribution | $p^{x} (1 - p)^{1 - x}$ | $\ln (p / (1 - p))$ | $x$ | $- \ln (1 - p)$ | $𝟙_{x \in \{0, 1\}}$ |
| Binomial Distribution with known $n$ | $\binom{n}{x} p^{x} (1 - p)^{n - x}$ | $\ln (p / (1 - p))$ | $x$ | $- n \ln (1 - p)$ | $\binom{n}{x} 𝟙_{x \in \{0, \dots, n\}}$ |
| Poisson Distribution | $λ^{x} e^{- λ} / x!$ | $\ln λ$ | $x$ | $λ$ | $(1/ x!) 𝟙_{x \in N}$ |
| Chi-Square Distribution | $e^{- x /2} (x /2)^{ν /2 - 1} / (2 Γ (ν /2))$ | $ν /2 - 1$ | $\ln x$ | $\ln Γ (ν /2) + (ν /2) \ln 2$ | $e^{- x /2} 𝟙_{x > 0}$ |
| Gamma Distribution | $β e^{- β x} (β x)^{α - 1} /Γ (α)$ | $(α - 1, - β)$ | $(\ln x, x)$ | $\ln Γ (α) - α \ln β$ | $𝟙_{x > 0}$ |
| Beta Distribution | $x^{α - 1} (1 - x)^{β - 1} Γ (α + β) / (Γ (α) Γ (β))$ | $(α, β)$ | $(\ln x, \ln (1 - x))$ | $\ln (Γ (α) Γ (β) /Γ (α + β))$ | $𝟙_{x \in (0, 1)} / (x (1 - x))$ |
| Categorical Distribution with known $K$ | $\prod_{i = 1}^{K} p_{i}^{x_{i}}$ with $p_{K} = 1 - \sum_{i = 1}^{K - 1} p_{i}$ | $(\ln (p_{1} / p_{K}), \dots, \ln (p_{K - 1} / p_{K})) \in R^{K - 1}$ | $(x_{1}, \dots, x_{K - 1})$ | $- \ln p_{K}$ | $𝟙_{x \in \{e_{1}, \dots, e_{K}\}}$ |
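
A decomposition can be sanity-checked numerically by rebuilding the density from its components and comparing it with the usual closed form. The sketch below does this for the Poisson, Exponential, and Gamma rows using only the Python standard library. The helper name `exp_family_density` and the test points are illustrative assumptions, not part of the original note.

```python
import math

def exp_family_density(x, q, t, b, h):
    """Evaluate h(x) * exp(<t(x), q> - b) for tuple-valued t(x) and q."""
    inner = sum(ti * qi for ti, qi in zip(t(x), q))
    return h(x) * math.exp(inner - b)

def poisson_pmf(x, lam):
    # Poisson row: q = ln(lam), t(x) = x, b = lam, h(x) = 1/x!
    return exp_family_density(
        x, q=(math.log(lam),), t=lambda v: (v,), b=lam,
        h=lambda v: 1.0 / math.factorial(v))

def exponential_pdf(x, lam):
    # Exponential row: q = -lam, t(x) = x, b = -ln(lam), h(x) = 1{x >= 0}
    return exp_family_density(
        x, q=(-lam,), t=lambda v: (v,), b=-math.log(lam),
        h=lambda v: 1.0 if v >= 0 else 0.0)

def gamma_pdf(x, alpha, beta):
    # Gamma row: q = (alpha - 1, -beta), t(x) = (ln x, x),
    # b = ln Gamma(alpha) - alpha * ln(beta), h(x) = 1{x > 0}
    if x <= 0:
        return 0.0  # h(x) = 0 off the support; also avoids log of a non-positive number
    return exp_family_density(
        x, q=(alpha - 1.0, -beta), t=lambda v: (math.log(v), v),
        b=math.lgamma(alpha) - alpha * math.log(beta), h=lambda v: 1.0)

# Each line prints the reconstructed value next to the closed-form value.
print(poisson_pmf(3, 2.5), 2.5**3 * math.exp(-2.5) / math.factorial(3))
print(exponential_pdf(0.7, 1.8), 1.8 * math.exp(-1.8 * 0.7))
print(gamma_pdf(1.2, 3.0, 2.0),
      2.0**3.0 * 1.2**2.0 * math.exp(-2.0 * 1.2) / math.gamma(3.0))
```

The two numbers on each printed line should agree up to floating-point rounding. The same helper can be reused for other rows by swapping in their $q$, $t$, $b$, and $h$.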

MLE for Exponential Family

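Due to the special form of the exponential family's likelihood, the MLE of $θ$ coincides with the moment estimator based on the sufficient statistic $t (x)$. WLOG, suppose $q (θ) = θ \in R^{k}$. We first note that the inverse normalizing constant $1/ c (θ) = \int h (x) e^{θ^{T} t (x)} d x$ is infinitely differentiable in the interior of the natural parameter space, and that differentiation can be carried out under the integral sign: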

$$\frac{\partial^{p}}{\partial θ_{1}^{j_{1}} \dots \partial θ_{k}^{j_{k}}} \left(\frac{1}{c (θ)}\right) = \int h (x) t_{1}^{j_{1}} (x) \dots t_{k}^{j_{k}} (x) e^{θ^{T} t (x)} d x,$$

where $j_{i} \in N$ and $p = \sum j_{i}$ . Therefore, the derivative of the log-likelihood $LL (θ ∣ x)$ is

$$
\begin{aligned}
LL^{'} (θ) &= \frac{d}{d θ} \left(\log \left(c (θ) h (x) e^{θ^{T} t (x)}\right)\right) \\
&= \frac{c^{'} (θ)}{c (θ)} + t (x) \\
&= - c (θ) \frac{d}{d θ} \left(\frac{1}{c (θ)}\right) + t (x) \\
&= t (x) - \int c (θ) h (x) t (x) e^{θ^{T} t (x)} d x \\
&= t (x) - E_{θ} [t (X)] .
\end{aligned}
$$

Then the MLE of $θ$, which is the zero of the derivative of the log-likelihood summed over the sample, satisfies $E_{\hat{θ}_{MLE}} [t (X)] = \hat{E}_{n} [t (x)]$, where $\hat{E}_{n}$ denotes the sample average. In other words, the MLE is a General MM Estimator w.r.t. $t (x)$. As a result, the asymptotic normality results for both MLE (an M-estimator) and MM (a Z-estimator) apply.
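
The moment-matching property can be checked on a small example. The sketch below fits the canonical Poisson family ($θ = \ln λ$, $t (x) = x$, $E_{θ} [t (X)] = e^{θ}$) by Newton's method on the average log-likelihood and compares the result with the sample mean. The data values, function names, and stopping rule are illustrative assumptions, and the sample mean is assumed to be positive.

```python
import math

def average_score(theta, data):
    """Average score of the canonical Poisson family: mean(t(x)) - E_theta[t(X)]."""
    return sum(data) / len(data) - math.exp(theta)

def fit_canonical_poisson(data, theta0=0.0, tol=1e-12, max_iter=100):
    """Newton's method on the average log-likelihood in theta = ln(lambda).

    The second derivative of the average log-likelihood is -exp(theta),
    so the Newton step is score / exp(theta).
    """
    theta = theta0
    for _ in range(max_iter):
        step = average_score(theta, data) / math.exp(theta)
        theta += step
        if abs(step) < tol:
            break
    return theta

# Made-up counts for illustration; their sample mean is 2.
counts = [2, 0, 3, 1, 4, 2, 1, 3]
theta_hat = fit_canonical_poisson(counts)

# At the MLE, the model mean E_theta[t(X)] = exp(theta_hat) matches the sample mean of t(x).
print(math.exp(theta_hat), sum(counts) / len(counts))
```

The optimizer never uses the sample mean as a target explicitly; the match falls out of setting the score $t (x) - E_{θ} [t (X)]$, averaged over the sample, to zero.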

Moments of Dispersion Exponential Family

The above calculation also gives a convenient way to compute the mean (first moment) and the variance (second central moment) of $X$ when $X$ follows a dispersion exponential family. Specifically, notice that $t (x) = x / ϕ$ and $c (θ) = \exp (- b (θ) / ϕ)$. Therefore, we have

$$LL^{'} (θ) = \frac{c^{'} (θ)}{c (θ)} + t (x) = \frac{x - b^{'} (θ)}{ϕ} .$$

Under the regularity conditions for Maximum Likelihood Estimation, the score has zero mean, $E [LL^{'} (θ)] = 0$, and thus

$$E [X] = b^{'} (θ) .$$

Similarly, under the same conditions, we have

$$0 = E [LL^{''} (θ) + (LL^{'} (θ))^{2}] = \frac{- b^{''} (θ)}{ϕ} + \frac{E [(X - E [X])^{2}]}{ϕ^{2}},$$

which gives

$$Var (X) = ϕ b^{''} (θ) .$$

📗 For example, for a Poisson Distribution, $θ = ln λ$ , $b (θ) = e^{θ}$ , and $ϕ = 1$ . Thus, $E [X] = Var (X) = b (θ) = b^{'} (θ) = b^{''} (θ) = λ$ .
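
The identities $E [X] = b^{'} (θ)$ and $Var (X) = ϕ b^{''} (θ)$ can be checked numerically without any symbolic work. The sketch below differentiates $b$ by central finite differences for the Poisson case above and for the Normal case with known variance ($b (θ) = θ^{2} /2$, $ϕ = σ^{2}$). The helper names, step size, and parameter values are illustrative assumptions.

```python
import math

def first_derivative(f, x, eps=1e-4):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

def second_derivative(f, x, eps=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / eps**2

def dispersion_moments(b, theta, phi=1.0):
    """Return (mean, variance) = (b'(theta), phi * b''(theta))."""
    return first_derivative(b, theta), phi * second_derivative(b, theta)

# Poisson: theta = ln(lam), b(theta) = exp(theta), phi = 1.
lam = 3.0
mean_p, var_p = dispersion_moments(math.exp, math.log(lam))

# Normal with known variance: theta = mu, b(theta) = theta^2 / 2, phi = sigma^2.
mu, sigma2 = 1.5, 4.0
mean_n, var_n = dispersion_moments(lambda th: th * th / 2.0, mu, phi=sigma2)

print(mean_p, var_p)  # both approximately lam
print(mean_n, var_n)  # approximately mu and sigma2
```

Finite differences are used only to keep the example dependency-free; the results are approximations whose accuracy depends on the step size `eps`.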
