Maximum a Posteriori

As in Bayesian inference, we obtain a posterior distribution over the weights from a prior distribution and the observations. However, instead of taking the expectation of the posterior, we can take its mode, in the same way that maximum likelihood estimation maximizes the likelihood. This gives the maximum a posteriori (MAP) estimate.

$$
\begin{aligned}
w_{MAP} &= \arg\max_{w} \ln p(w \mid y, X) \\
&= \arg\max_{w} \ln p(y \mid w, X) + \ln p(w) - \ln p(y \mid X) \\
&= \arg\max_{w} \ln p(y \mid w, X) + \ln p(w)
\end{aligned}
$$

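Note that we are no longer maximizing the likelihood of the observations alone; we are maximizing the posterior density of the weights, that is, the log-likelihood plus the log-prior. The evidence term $\ln p(y ∣ X)$ does not depend on $w$, so it can be dropped without changing the maximizer.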
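To make the objective concrete, here is a minimal sketch that finds the MAP estimate of a scalar weight by brute-force search over a grid. It assumes a Gaussian likelihood and a zero-mean Gaussian prior; the toy data and the values of `true_w`, `sigma`, and `lam` are made up for illustration and are not part of the note.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem (all values are illustrative assumptions).
true_w, sigma, lam = 2.0, 1.0, 4.0
x = rng.normal(size=20)
y = true_w * x + sigma * rng.normal(size=20)

# Candidate weights, spaced 0.001 apart.
w_grid = np.linspace(-5.0, 5.0, 10001)

# ln p(y | w, x) up to an additive constant, evaluated for every candidate w.
residuals = y[None, :] - w_grid[:, None] * x[None, :]
log_lik = -0.5 / sigma**2 * np.sum(residuals**2, axis=1)

# ln p(w) up to an additive constant, for the assumed prior N(0, 1 / lam).
log_prior = -0.5 * lam * w_grid**2

# MLE maximizes the log-likelihood; MAP maximizes log-likelihood + log-prior.
w_mle = w_grid[np.argmax(log_lik)]
w_map = w_grid[np.argmax(log_lik + log_prior)]

print("MLE:", w_mle)
print("MAP:", w_map)
```

Under these assumptions the prior pulls the MAP estimate toward zero relative to the MLE. Constants such as $\ln p(y ∣ X)$ are omitted because they shift every grid value equally and leave the argmax unchanged.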

Relationship with Ridge Regression

Just as MLE is the probabilistic interpretation of least squares estimation, MAP is the probabilistic interpretation of ridge regression.

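Assume the likelihood is Gaussian, $y ∣ w, X ∼ N(Xw, σ^{2} I)$, and the prior distribution of $w$ is $N(0, λ^{-1} I)$. Dropping terms that do not depend on $w$, we get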

$$
w_{MAP} = \arg\max_{w} \; -\frac{1}{2σ^{2}} (y - Xw)^{T} (y - Xw) - \frac{λ}{2} w^{T} w
$$

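Setting the gradient of this objective with respect to $w$ to zero, $\frac{1}{σ^{2}} X^{T} (y - Xw) - λ w = 0$, gives $w_{MAP} = (λ σ^{2} I + X^{T} X)^{-1} X^{T} y$, which equals the ridge regression solution $w_{RR}$ with regularization parameter $λ σ^{2}$.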
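The equivalence can be checked numerically. The sketch below computes the closed-form MAP estimate and compares it with a ridge solution obtained independently, by solving an augmented least squares problem with penalty `alpha = lam * sigma**2`. The data, dimensions, and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic regression problem (illustrative values only).
n, d = 50, 3
sigma, lam = 0.5, 2.0
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + sigma * rng.normal(size=n)

# Closed-form MAP estimate: (lam * sigma^2 * I + X^T X)^{-1} X^T y.
w_map = np.linalg.solve(lam * sigma**2 * np.eye(d) + X.T @ X, X.T @ y)

# Ridge regression: minimize ||y - Xw||^2 + alpha * ||w||^2.
# This is ordinary least squares on data augmented with sqrt(alpha) * I rows.
alpha = lam * sigma**2
X_aug = np.vstack([X, np.sqrt(alpha) * np.eye(d)])
y_aug = np.concatenate([y, np.zeros(d)])
w_rr, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

# Gradient of the log-posterior at w_map; it should vanish at the maximizer.
grad = X.T @ (y - X @ w_map) / sigma**2 - lam * w_map

print("w_MAP:", w_map)
print("w_RR: ", w_rr)
print("match:", np.allclose(w_map, w_rr))
```

Using `np.linalg.solve` rather than forming the matrix inverse explicitly is the usual choice for numerical stability; the result is the same estimate.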
