Estimation
Quick Reference
- Types of estimation
- Metric
- Evaluating an Estimator
- Probabilistic properties
- Statistical properties
- Bayes Optimal Estimator
- Minimax Optimal Estimator
- Point estimation methods
```mermaid
flowchart
    subgraph BB[Prediction]
        direction TB
        E["Empirical Risk Minimization"]
        E --"generalizes"--> F["Regression"]
        F --"contains"--> FF@{shape: processes, label: "... ...", w: 100px}
    end
    subgraph AA[Estimation]
        direction TB
        A["M-Estimator"] --"generalizes"--> B["Maximum Likelihood Estimator"]
        C["Z-Estimator"] --"generalizes"--> A
        C --"generalizes"--> D["Moment Estimator"]
        B <--"same for exponential family"--> D
        B --"add a prior"--> M["Maximum a Posteriori"]
        D --"contains"--> D1["Sample Mean"]
        D --"contains"--> D2["Sample Variance"]
    end
    E <--"same form"--> A
```
Point Estimation
A point estimator/statistic recovers a quantity of interest from data samples. Formally, it is any algorithm/measurable function that returns a point in the parameter space given the sample:

$$\hat{\theta} : \mathcal{X}^n \to \Theta, \qquad (X_1, \dots, X_n) \mapsto \hat{\theta}(X_1, \dots, X_n).$$

The parameter space $\Theta$ can be one-dimensional, multi-dimensional, or even a function space. When the sample has a sample size/dimension of $n$, we also conventionally write $\hat{\theta}_n$ to denote the point estimator.
In contrast to point estimation, a Confidence Interval/region returns a subset of the parameter space $\Theta$, and Bayesian Inference returns a distribution over the parameter space $\Theta$.
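As a minimal sketch of this viewpoint (assuming NumPy; the function names and the simulated Gaussian sample are illustrative, not from the text), an estimator is nothing more than a function that maps a sample to a single point in the parameter space:

```python
import numpy as np

def sample_mean(x: np.ndarray) -> float:
    """Point estimator for a location parameter: maps a sample to one point."""
    return float(np.mean(x))

def sample_variance(x: np.ndarray) -> float:
    """Point estimator for the variance (unbiased version, ddof=1)."""
    return float(np.var(x, ddof=1))

# A sample of size n = 500 drawn from N(mu = 2, sigma = 1.5)
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=500)

theta_hat = sample_mean(x)       # estimate of mu
sigma2_hat = sample_variance(x)  # estimate of sigma^2
print(theta_hat, sigma2_hat)
```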
Comparison of Estimation Methods
MLE vs MoM
- Under quadratic risk, the MLE is generally more accurate (under regularity conditions it is asymptotically efficient).
- The MLE still gives good results even for misspecified models, while the Method of Moments is more sensitive to model misspecification.
- The MLE can sometimes be computationally intractable, whereas the Method of Moments only requires solving polynomial equations in the moments (see the sketch after this list).
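As a hedged illustration of the tractability point, the sketch below (assuming NumPy/SciPy; the Gamma model and simulated data are my own example) contrasts the closed-form method-of-moments estimates for a Gamma(shape, scale) model with the numerically fitted MLE:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.gamma(shape=3.0, scale=2.0, size=2000)

# Method of Moments: match sample mean and variance to the Gamma moments.
# E[X] = k * theta, Var[X] = k * theta^2  =>  k = mean^2 / var, theta = var / mean
m, v = data.mean(), data.var(ddof=1)
k_mom, theta_mom = m**2 / v, v / m

# MLE: no closed form for the shape parameter; SciPy solves it numerically.
k_mle, loc, theta_mle = stats.gamma.fit(data, floc=0)  # location fixed at 0

print(f"MoM: shape={k_mom:.3f}, scale={theta_mom:.3f}")
print(f"MLE: shape={k_mle:.3f}, scale={theta_mle:.3f}")
```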
Bayesian Estimation
- Maximum a Posteriori, which returns the mode of the posterior distribution.
- Bayes Optimal Estimator, which returns the
  - mean of the posterior distribution for Mean Squared Error, or any Bowl-Shaped Loss with a Gaussian posterior;
  - median of the posterior distribution for the absolute error loss $L(\theta, \hat{\theta}) = |\hat{\theta} - \theta|$;
  - mode of the posterior distribution for the zero-one loss $L(\theta, \hat{\theta}) = \mathbb{1}\{\hat{\theta} \neq \theta\}$ (see the Beta-Bernoulli sketch after this list).
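As a hedged, concrete sketch (the Beta(2, 2) prior and the coin-flip data are made up for illustration; SciPy is assumed), the conjugate Beta-Bernoulli model lets us read off all three Bayesian point estimates from the posterior and compare them with the MLE:

```python
import numpy as np
from scipy import stats

# Beta(a0, b0) prior on a Bernoulli success probability p
a0, b0 = 2.0, 2.0

# Observed coin flips (made-up data): k successes out of n trials
x = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])
n, k = len(x), int(x.sum())

# Conjugacy: the posterior is Beta(a0 + k, b0 + n - k)
a, b = a0 + k, b0 + (n - k)
posterior = stats.beta(a, b)

post_mean = posterior.mean()        # Bayes estimator under squared error
post_median = posterior.median()    # Bayes estimator under absolute error
post_mode = (a - 1) / (a + b - 2)   # MAP / posterior mode (valid for a, b > 1)
mle = k / n                         # frequentist MLE, for comparison

print(f"mean={post_mean:.3f} median={post_median:.3f} "
      f"mode={post_mode:.3f} mle={mle:.3f}")
```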
Bayes vs Frequentist
- The Bayesian approach has been criticized for its over-reliance on convenient priors and lack of robustness.
- The frequentist approach, such as MLE, has been criticized for its inflexibility (failure to incorporate prior information) and incoherence (failure to process information systematically).
- For large sample sizes ($n \to \infty$), or when the prior is uniform, the Bayesian method tends to yield results similar to those of the classical likelihood approach.
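A one-line worked example makes this agreement concrete. In the Beta-Bernoulli sketch above (with prior parameters $a_0, b_0$ and $k$ successes in $n$ trials), the posterior mean is a weighted average of the prior mean and the MLE:

$$
\mathbb{E}[p \mid x_{1:n}]
= \frac{a_0 + k}{a_0 + b_0 + n}
= \underbrace{\frac{a_0 + b_0}{a_0 + b_0 + n}}_{\text{prior weight}} \cdot \frac{a_0}{a_0 + b_0}
+ \underbrace{\frac{n}{a_0 + b_0 + n}}_{\text{data weight}} \cdot \underbrace{\frac{k}{n}}_{\text{MLE}},
$$

and the prior's weight vanishes as $n \to \infty$, so the posterior mean approaches the MLE $k/n$.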