Best Estimator for Uniform Distribution Parameter

We want to estimate the parameter $\theta$ of a Uniform Distribution $\operatorname{Unif}(0, \theta)$ given i.i.d. samples $X_1, \dots, X_n$.

  • ❓ So, what is the best estimator?

First, we need to define the evaluation metric. We use Mean Squared Error. We know for an estimator $\hat\theta$ of $\theta$, its MSE is

$$\operatorname{MSE}(\hat\theta) = \mathbb{E}\big[(\hat\theta - \theta)^2\big] = \operatorname{Bias}(\hat\theta)^2 + \operatorname{se}(\hat\theta)^2,$$

where $\operatorname{Bias}(\hat\theta) = \mathbb{E}[\hat\theta] - \theta$ is the bias of the estimator, and $\operatorname{se}(\hat\theta) = \sqrt{\operatorname{Var}(\hat\theta)}$ is the standard error of the estimator.

We will also touch on Asymptotic Normality for certain estimators.

First Attempts

Note that the Expectation of $X \sim \operatorname{Unif}(0, \theta)$ is $\mathbb{E}[X] = \theta/2$, which gives $\theta = 2\,\mathbb{E}[X]$. Therefore, the very first estimator we can think of is to replace the expectation with a sample value:

$$\hat\theta_1 = 2X_1.$$

We know this estimator is unbiased and thus its MSE is

$$\operatorname{MSE}(\hat\theta_1) = \operatorname{Var}(2X_1) = 4\operatorname{Var}(X_1) = \frac{\theta^2}{3},$$

which is a constant regardless of the sample size $n$.

Obviously, $\hat\theta_1$ is not satisfactory as it only uses the information from one sample. More generally, we can consider an estimator that uses $k$ samples:

$$\hat\theta_k = \frac{2}{k}\sum_{i=1}^{k} X_i,$$

which is still unbiased but has a reduced standard error:

$$\operatorname{Var}(\hat\theta_k) = \frac{4}{k}\operatorname{Var}(X_1) = \frac{\theta^2}{3k}.$$

The variance reduces because $\hat\theta_k$ aggregates the information from $k$ i.i.d. samples. Again, its MSE is a constant w.r.t. the total sample size $n$.

We plot the histogram of $\hat\theta_k$ for different $k$ values to see how the distribution changes with sample size.
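A minimal simulation sketch of this plot, assuming a true value $\theta = 1$, $10{,}000$ replications, and $k \in \{1, 5, 20, 100\}$ (these choices and the variable names are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
theta = 1.0        # assumed true parameter
n_rep = 10_000     # Monte Carlo replications

fig, axes = plt.subplots(1, 4, figsize=(16, 3), sharex=True)
for ax, k in zip(axes, [1, 5, 20, 100]):
    X = rng.uniform(0, theta, size=(n_rep, k))
    theta_hat_k = 2 * X.mean(axis=1)          # \hat{theta}_k = (2/k) sum_i X_i
    ax.hist(theta_hat_k, bins=50, density=True)
    ax.axvline(theta, color="red", linestyle="--")
    ax.set_title(f"k = {k}, MSE ≈ {np.mean((theta_hat_k - theta) ** 2):.4f}")
plt.tight_layout()
plt.show()
```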

Method of Moments

A natural extension is to use all $n$ samples:

$$\hat\theta_{\mathrm{MM}} = \frac{2}{n}\sum_{i=1}^{n} X_i = 2\bar{X}.$$

Since the above is equivalent to solving the estimating equation

$$\bar{X} = \mathbb{E}_{\hat\theta}[X] = \frac{\hat\theta}{2},$$

the resultant estimator is a Moment Estimator. In other words, we plug in the sample mean as the true mean (first moment) to get the estimate. The mean squared error is the same as the previous case with $k = n$ samples:

$$\operatorname{MSE}(\hat\theta_{\mathrm{MM}}) = \frac{\theta^2}{3n}.$$

We compare the MSE and histogram of the moment estimator with $\hat\theta_k$:
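A small sketch of this comparison (assuming $\theta = 1$, $n = 50$, and a sub-sample size of $k = 10$; all values here are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, n_rep = 1.0, 50, 100_000

X = rng.uniform(0, theta, size=(n_rep, n))
theta_mm = 2 * X.mean(axis=1)            # method-of-moments estimator, all n samples
theta_k = 2 * X[:, :10].mean(axis=1)     # estimator using only k = 10 samples

for name, est in [("k = 10 samples", theta_k), ("method of moments (k = n)", theta_mm)]:
    print(f"{name}: empirical MSE = {np.mean((est - theta) ** 2):.5f}")
# theoretical values theta^2 / (3k)
print(f"theory: k = 10 -> {theta**2 / 30:.5f},  k = n = 50 -> {theta**2 / 150:.5f}")
```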

Maximum Likelihood Estimation

We derive the maximum likelihood estimator for $\theta$. The likelihood is $L(\theta) = \theta^{-n}\,\mathbf{1}\{X_{(n)} \le \theta\}$, which is decreasing in $\theta$ on $[X_{(n)}, \infty)$, so it is maximized at

$$\hat\theta_{\mathrm{MLE}} = X_{(n)} = \max_{1 \le i \le n} X_i.$$

To get $\operatorname{Bias}(\hat\theta_{\mathrm{MLE}})$ and $\operatorname{Var}(\hat\theta_{\mathrm{MLE}})$, we first need to calculate the distribution of $X_{(n)}$. Its CDF is

$$F_{X_{(n)}}(x) = \mathbb{P}(X_{(n)} \le x) = \prod_{i=1}^{n}\mathbb{P}(X_i \le x) = \left(\frac{x}{\theta}\right)^{n}, \quad 0 \le x \le \theta.$$

Therefore, the PDF of $X_{(n)}$ is

$$f_{X_{(n)}}(x) = \frac{n\,x^{n-1}}{\theta^{n}}, \quad 0 \le x \le \theta.$$

Then, we have

$$\mathbb{E}[X_{(n)}] = \frac{n}{n+1}\theta, \qquad \mathbb{E}[X_{(n)}^2] = \frac{n}{n+2}\theta^2, \qquad \operatorname{Bias}(\hat\theta_{\mathrm{MLE}}) = -\frac{\theta}{n+1}, \qquad \operatorname{Var}(\hat\theta_{\mathrm{MLE}}) = \frac{n\theta^2}{(n+1)^2(n+2)}.$$

Therefore

$$\operatorname{MSE}(\hat\theta_{\mathrm{MLE}}) = \frac{\theta^2}{(n+1)^2} + \frac{n\theta^2}{(n+1)^2(n+2)} = \frac{2\theta^2}{(n+1)(n+2)}.$$

Thus, we can say that $\hat\theta_{\mathrm{MLE}}$ is a better estimator than $\hat\theta_{\mathrm{MM}}$: its MSE decays at rate $O(1/n^2)$ rather than $O(1/n)$.

We compare the MLE with previous estimators.
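A minimal sketch of this comparison under the same assumed setup ($\theta = 1$, $n = 50$):

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, n_rep = 1.0, 50, 100_000

X = rng.uniform(0, theta, size=(n_rep, n))
estimators = {
    "method of moments": 2 * X.mean(axis=1),
    "MLE (sample max)": X.max(axis=1),
}
for name, est in estimators.items():
    bias = est.mean() - theta
    mse = np.mean((est - theta) ** 2)
    print(f"{name:>18}: bias ≈ {bias:+.4f}, MSE ≈ {mse:.6f}")

# theoretical MSEs: theta^2/(3n) for MoM, 2*theta^2/((n+1)(n+2)) for MLE
print("theory:", theta**2 / (3 * n), 2 * theta**2 / ((n + 1) * (n + 2)))
```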

Uniformly Minimum-Variance Unbiased Estimator

Can we do better? The expectation $\mathbb{E}[X_{(n)}] = \frac{n}{n+1}\theta$ derived above suggests that $\frac{n+1}{n}X_{(n)}$ is an unbiased estimator, which may further reduce the error.

We have the following fact:

Proposition ^prop

If $T$ is a complete and Sufficient Statistic for a parameter $\theta$, and $\hat\theta = g(T)$ is an unbiased estimator dependent only on $T$, then $\hat\theta$ is the unique uniformly minimum-variance unbiased estimator (UMVUE) of $\theta$.

The uniformity refers to the fact that the minimum variance is achieved for all values of $\theta$.

We note that $X_{(n)}$ is a complete and sufficient statistic for $\theta$. Thus, the estimator

$$\hat\theta_{\mathrm{UMVUE}} = \frac{n+1}{n}X_{(n)}$$

has the minimum variance among all unbiased estimators of $\theta$.

We calculate its variance, i.e., MSE:

$$\operatorname{MSE}(\hat\theta_{\mathrm{UMVUE}}) = \operatorname{Var}\!\left(\frac{n+1}{n}X_{(n)}\right) = \frac{(n+1)^2}{n^2}\cdot\frac{n\theta^2}{(n+1)^2(n+2)} = \frac{\theta^2}{n(n+2)}.$$
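A quick Monte Carlo check of the bias and MSE formula, under the same assumed setup ($\theta = 1$, $n = 50$):

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, n_rep = 1.0, 50, 200_000

x_max = rng.uniform(0, theta, size=(n_rep, n)).max(axis=1)
umvue = (n + 1) / n * x_max

print("empirical bias :", umvue.mean() - theta)          # should be close to 0
print("empirical MSE  :", np.mean((umvue - theta) ** 2))
print("theoretical MSE:", theta**2 / (n * (n + 2)))      # theta^2 / (n(n+2))
```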

Jackknife

In our problem, the bias of the MLE can be explicitly calculated, and thus we can directly correct it using the UMVUE. For more general biased estimators, we can use Jackknife resampling to estimate the bias, and then correct the estimator.

The first step of this procedure is to produce a series of leave-one-out estimates:

$$\hat\theta_{(i)} = \hat\theta(X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_n), \quad i = 1, \dots, n.$$

That is, we remove one sample and construct the estimator using the remaining $n-1$ samples. Then, the Jackknife bias estimate is

$$\widehat{\operatorname{Bias}} = (n-1)\left(\bar\theta_{(\cdot)} - \hat\theta\right), \qquad \bar\theta_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n}\hat\theta_{(i)},$$

where $\hat\theta$ is the original estimator (with $n$ samples) to be corrected. Thus, the corrected Jackknife estimator is

$$\hat\theta_{\mathrm{JK}} = \hat\theta - \widehat{\operatorname{Bias}} = n\hat\theta - (n-1)\bar\theta_{(\cdot)}.$$

Generally, the MSE of a Jackknife estimator is difficult to calculate as the $\hat\theta_{(i)}$ are correlated. However, if we are to correct the MLE estimator $X_{(n)}$ for $\theta$, the Jackknife estimator has a simple form:

$$\hat\theta_{\mathrm{JK}} = \frac{(2n-1)X_{(n)} - (n-1)X_{(n-1)}}{n},$$

where $X_{(i)}$ is the $i$-th order statistic of the sample $X_1, \dots, X_n$, and thus $X_{(n-1)}$ is the second-largest sample. Moreover, we can calculate its MSE, which slightly improves that of the MLE estimator:

$$\operatorname{MSE}(\hat\theta_{\mathrm{JK}}) = \frac{2(n^2 - n + 1)\theta^2}{n^2(n+1)(n+2)} < \frac{2\theta^2}{(n+1)(n+2)} = \operatorname{MSE}(\hat\theta_{\mathrm{MLE}}).$$

See Appendix for details of the calculation.
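A small sketch checking that the generic leave-one-out correction reproduces the closed form above; $\theta = 1$ and $n = 20$ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
theta, n = 1.0, 20
x = rng.uniform(0, theta, size=n)

# generic jackknife bias correction of the MLE (the sample maximum)
theta_mle = x.max()
loo = np.array([np.delete(x, i).max() for i in range(n)])   # leave-one-out estimates
bias_jk = (n - 1) * (loo.mean() - theta_mle)
theta_jk = theta_mle - bias_jk

# closed form: ((2n-1) X_(n) - (n-1) X_(n-1)) / n
x_sorted = np.sort(x)
theta_jk_closed = ((2 * n - 1) * x_sorted[-1] - (n - 1) * x_sorted[-2]) / n

print(theta_jk, theta_jk_closed)   # the two values should agree
```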

Minimal MSE

When we look at the MSE, for MLE, bias dominates, while for UMVUE and Jackknife, variance dominates. A natural next step is to find the estimator that achieves the optimal balance between bias and variance. Actually, such an estimator is indeed the best estimator for $\theta$ in terms of MSE (see Appendix).

Consider a general form of the MLE and UMVUE using the complete and sufficient statistic $X_{(n)}$:

$$\hat\theta_c = c\,X_{(n)}, \quad c > 0.$$

$c = 1$ recovers the MLE and $c = \frac{n+1}{n}$ recovers the UMVUE. For a general $c$, we have

$$\operatorname{Bias}(\hat\theta_c) = \left(\frac{cn}{n+1} - 1\right)\theta, \qquad \operatorname{Var}(\hat\theta_c) = \frac{c^2 n\theta^2}{(n+1)^2(n+2)}.$$

Thus,

$$\operatorname{MSE}(\hat\theta_c) = \frac{c^2 n}{n+2}\theta^2 - \frac{2cn}{n+1}\theta^2 + \theta^2.$$

Setting

$$\frac{\partial}{\partial c}\operatorname{MSE}(\hat\theta_c) = \left(\frac{2cn}{n+2} - \frac{2n}{n+1}\right)\theta^2 = 0$$

gives $c^\ast = \frac{n+2}{n+1}$. Therefore, we get

$$\hat\theta_{\mathrm{MMSE}} = \frac{n+2}{n+1}X_{(n)}$$

with

$$\operatorname{MSE}(\hat\theta_{\mathrm{MMSE}}) = \frac{\theta^2}{(n+1)^2}.$$
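A short symbolic check of this minimization (assuming SymPy is available; the moment expressions are taken from the MLE section above):

```python
import sympy as sp

n, c, theta = sp.symbols("n c theta", positive=True)

# MSE(c) using E[X_(n)] = n*theta/(n+1) and E[X_(n)^2] = n*theta^2/(n+2)
mse = c**2 * n * theta**2 / (n + 2) - 2 * c * n * theta**2 / (n + 1) + theta**2

c_star = sp.solve(sp.diff(mse, c), c)[0]
print(sp.simplify(c_star))                  # should give (n + 2)/(n + 1)
print(sp.simplify(mse.subs(c, c_star)))     # should simplify to theta**2/(n + 1)**2
```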

Summary

The following table summarizes the estimators we have discussed.

| Estimator | Expression | Bias | Variance | MSE |
| --- | --- | --- | --- | --- |
| $k$ Samples | $\frac{2}{k}\sum_{i=1}^{k} X_i$ | $0$ | $\frac{\theta^2}{3k}$ | $\frac{\theta^2}{3k}$ |
| Method of Moments | $\frac{2}{n}\sum_{i=1}^{n} X_i$ | $0$ | $\frac{\theta^2}{3n}$ | $\frac{\theta^2}{3n}$ |
| MLE | $X_{(n)}$ | $-\frac{\theta}{n+1}$ | $\frac{n\theta^2}{(n+1)^2(n+2)}$ | $\frac{2\theta^2}{(n+1)(n+2)}$ |
| UMVUE | $\frac{n+1}{n}X_{(n)}$ | $0$ | $\frac{\theta^2}{n(n+2)}$ | $\frac{\theta^2}{n(n+2)}$ |
| Jackknife | $\frac{(2n-1)X_{(n)} - (n-1)X_{(n-1)}}{n}$ | $-\frac{\theta}{n(n+1)}$ | $\frac{(2n^2-1)\theta^2}{n(n+1)^2(n+2)}$ | $\frac{2(n^2-n+1)\theta^2}{n^2(n+1)(n+2)}$ |
| MMSE | $\frac{n+2}{n+1}X_{(n)}$ | $-\frac{\theta}{(n+1)^2}$ | $\frac{n(n+2)\theta^2}{(n+1)^4}$ | $\frac{\theta^2}{(n+1)^2}$ |

Finally, we plot the histograms of all estimators separately to compare their distributions, and calculate their empirical mean squared errors.
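A minimal simulation sketch producing these histograms and empirical MSEs, assuming $\theta = 1$ and $n = 50$ (the dictionary keys mirror the table rows; other names are ours):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
theta, n, n_rep = 1.0, 50, 100_000

X = rng.uniform(0, theta, size=(n_rep, n))
X_sorted = np.sort(X, axis=1)
x_max, x_2nd = X_sorted[:, -1], X_sorted[:, -2]

estimators = {
    "Method of Moments": 2 * X.mean(axis=1),
    "MLE": x_max,
    "UMVUE": (n + 1) / n * x_max,
    "Jackknife": ((2 * n - 1) * x_max - (n - 1) * x_2nd) / n,
    "MMSE": (n + 2) / (n + 1) * x_max,
}

fig, axes = plt.subplots(1, len(estimators), figsize=(4 * len(estimators), 3))
for ax, (name, est) in zip(axes, estimators.items()):
    ax.hist(est, bins=60, density=True)
    ax.axvline(theta, color="red", linestyle="--")
    ax.set_title(f"{name}\nMSE ≈ {np.mean((est - theta) ** 2):.2e}")
plt.tight_layout()
plt.show()
```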

Beyond MSE

So far, we have focused on the MSE as the metric for evaluating estimators. In this section, we first explore a new risk, and then discuss the statistical properties of MLE (and thus other estimators built on $X_{(n)}$) for the uniform distribution.

Zero-One Loss

Recall that MSE is the risk associated with the squared loss $L(\hat\theta, \theta) = (\hat\theta - \theta)^2$.

Consider a new loss function:

$$L_\epsilon(\hat\theta, \theta) = \mathbf{1}\{|\hat\theta - \theta| > \epsilon\}.$$

This zero-one loss function is often used in binary decision-making problems. In the context of parameter estimation, it finds applications in catastrophic risk assessment, where any estimation error beyond a certain threshold $\epsilon$ is considered catastrophic. Then, the corresponding risk

$$R_\epsilon(\hat\theta) = \mathbb{E}\big[L_\epsilon(\hat\theta, \theta)\big] = \mathbb{P}\big(|\hat\theta - \theta| > \epsilon\big)$$

is the probability of catastrophe.

Again, let's consider an estimator of the general form $\hat\theta_c = c\,X_{(n)}$, and try to minimize the zero-one risk. We have

$$R_\epsilon(\hat\theta_c) = \mathbb{P}\big(cX_{(n)} < \theta - \epsilon\big) + \mathbb{P}\big(cX_{(n)} > \theta + \epsilon\big).$$

We have already derived the CDF of $X_{(n)}$ in Maximum Likelihood Estimation, which gives

$$R_\epsilon(\hat\theta_c) = \left(\frac{\theta - \epsilon}{c\,\theta}\right)^{n} + 1 - \min\left\{\left(\frac{\theta + \epsilon}{c\,\theta}\right)^{n},\, 1\right\}.$$

For a fixed $n$, we usually consider a small threshold $\epsilon$. In such scenarios, for $c \le 1 + \epsilon/\theta$ the second term vanishes and

$$R_\epsilon(\hat\theta_c) = \left(\frac{\theta - \epsilon}{c\,\theta}\right)^{n},$$

which is decreasing in $c$, while for $c > 1 + \epsilon/\theta$ the risk is increasing in $c$. Using the above, one can easily verify that $c^\ast = 1 + \epsilon/\theta$ minimizes the zero-one risk.

Note that the estimator should not depend on the true parameter $\theta$, while $c^\ast = 1 + \epsilon/\theta$ does. Thus, we consider a zero-one loss based on the relative distance:

$$L^{\mathrm{rel}}_\epsilon(\hat\theta, \theta) = \mathbf{1}\left\{\left|\frac{\hat\theta}{\theta} - 1\right| > \epsilon\right\},$$

whose corresponding optimal estimator is thus

$$\hat\theta_{0\text{-}1} = (1 + \epsilon)\,X_{(n)}.$$

We plot the histogram of the zero-one estimator for different $\epsilon$ values, and compare it with the MMSE estimator on both the MSE and the zero-one risk.
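A minimal sketch of this comparison for a single illustrative choice $\epsilon = 0.05$ (again assuming $\theta = 1$, $n = 50$):

```python
import numpy as np

rng = np.random.default_rng(6)
theta, n, n_rep, eps = 1.0, 50, 200_000, 0.05

x_max = rng.uniform(0, theta, size=(n_rep, n)).max(axis=1)
est_01 = (1 + eps) * x_max               # zero-one (relative-loss) estimator
est_mmse = (n + 2) / (n + 1) * x_max     # MMSE estimator

for name, est in [("zero-one", est_01), ("MMSE", est_mmse)]:
    mse = np.mean((est - theta) ** 2)
    risk = np.mean(np.abs(est / theta - 1) > eps)   # empirical relative zero-one risk
    print(f"{name:>8}: MSE ≈ {mse:.2e}, zero-one risk ≈ {risk:.4f}")
```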

Statistical Properties of MLE for Uniform Distribution

MLE is known to be the best estimator in terms of statistical properties, such as consistency and asymptotic normality, under mild conditions. However, in this note, we have shown that the MLE for the uniform distribution is biased and has a larger MSE than some other estimators. Do our findings contradict the properties of MLE?

We first verify the consistency. Using the CDF derived earlier, for $\hat\theta_{\mathrm{MLE}} = X_{(n)}$ and any $\epsilon > 0$,

$$\mathbb{P}\big(|X_{(n)} - \theta| > \epsilon\big) = \mathbb{P}\big(X_{(n)} < \theta - \epsilon\big) = \left(\frac{\theta - \epsilon}{\theta}\right)^{n} \longrightarrow 0 \quad \text{as } n \to \infty.$$

Thus, $\hat\theta_{\mathrm{MLE}}$ is consistent.

For asymptotic normality, we notice that

$$\mathbb{P}\big(\sqrt{n}\,(X_{(n)} - \theta) > 0\big) = 0 \quad \text{for every } n,$$

since $X_{(n)} \le \theta$ almost surely. Thus, $\sqrt{n}\,(X_{(n)} - \theta)$ cannot be asymptotically normal. Specifically, the uniform distribution fails to meet the regularity condition that the support of the distribution should not depend on the parameter $\theta$.

Moreover, we actually have

$$\mathbb{P}\big(n(\theta - X_{(n)}) > t\big) = \left(1 - \frac{t}{n\theta}\right)^{n} \longrightarrow e^{-t/\theta},$$

indicating that

$$n\,(\theta - X_{(n)}) \xrightarrow{d} \operatorname{Exp}(1/\theta).$$

Note that the exponential tail bound is significantly heavier than the Gaussian tail bound ($e^{-t}$ vs. $e^{-t^2}$). We verify this by simulation.

Finally, we remark that MSE and asymptotic normality are each just one of many criteria for evaluating estimators. An estimator with a smaller MSE may perform worse under other risks, and an asymptotically normal estimator may have a larger MSE than one that is not. At the end of the day, we should choose the estimator that best fits our specific problem and risk criteria.

Appendix

Calculation of Jackknife MSE

For correcting the MLE $\hat\theta_{\mathrm{MLE}} = X_{(n)}$, note that each leave-one-out estimate equals $X_{(n)}$ unless the maximum itself is removed, in which case it equals $X_{(n-1)}$. Hence $\bar\theta_{(\cdot)} = \frac{(n-1)X_{(n)} + X_{(n-1)}}{n}$, and the Jackknife estimator has a simple form:

$$\hat\theta_{\mathrm{JK}} = nX_{(n)} - (n-1)\,\bar\theta_{(\cdot)} = \frac{(2n-1)X_{(n)} - (n-1)X_{(n-1)}}{n},$$

where $X_{(i)}$ is the $i$-th order statistic of the sample $X_1, \dots, X_n$, and thus $X_{(n-1)}$ is the second-largest sample.

Similar to the calculation in Maximum Likelihood Estimation (see also Distribution of i-th Order Statistic), the PDF of $X_{(n-1)}$ is

$$f_{X_{(n-1)}}(x) = \frac{n(n-1)\,x^{n-2}(\theta - x)}{\theta^{n}}, \quad 0 \le x \le \theta.$$

Then, we can calculate the bias:

$$\mathbb{E}[\hat\theta_{\mathrm{JK}}] = \frac{2n-1}{n}\cdot\frac{n\theta}{n+1} - \frac{n-1}{n}\cdot\frac{(n-1)\theta}{n+1} = \frac{n^2 + n - 1}{n(n+1)}\theta, \qquad \operatorname{Bias}(\hat\theta_{\mathrm{JK}}) = -\frac{\theta}{n(n+1)}.$$

We can see that compared to the Maximum Likelihood Estimation ($\operatorname{Bias} = -\frac{\theta}{n+1}$), the bias is significantly reduced.

To calculate the variance of $\hat\theta_{\mathrm{JK}}$, we need to know the joint distribution of $X_{(n-1)}$ and $X_{(n)}$. Note that the joint PDF of the entire order statistic $(X_{(1)}, \dots, X_{(n)})$ is given by

$$f(x_1, \dots, x_n) = \frac{n!}{\theta^{n}}, \quad 0 \le x_1 \le \dots \le x_n \le \theta.$$

Integrating out $x_1, \dots, x_{n-2}$ gives

$$f_{X_{(n-1)}, X_{(n)}}(x, y) = \frac{n(n-1)\,x^{n-2}}{\theta^{n}}, \quad 0 \le x \le y \le \theta.$$

Thus, their covariance is

$$\operatorname{Cov}\big(X_{(n-1)}, X_{(n)}\big) = \frac{(n-1)\theta^2}{n+2} - \frac{(n-1)\theta}{n+1}\cdot\frac{n\theta}{n+1} = \frac{(n-1)\theta^2}{(n+1)^2(n+2)}.$$

Then, we can calculate the variance, using $\operatorname{Var}(X_{(n-1)}) = \frac{2(n-1)\theta^2}{(n+1)^2(n+2)}$:

$$\operatorname{Var}(\hat\theta_{\mathrm{JK}}) = \frac{(2n-1)^2\operatorname{Var}(X_{(n)}) + (n-1)^2\operatorname{Var}(X_{(n-1)}) - 2(2n-1)(n-1)\operatorname{Cov}\big(X_{(n-1)}, X_{(n)}\big)}{n^2} = \frac{(2n^2-1)\theta^2}{n(n+1)^2(n+2)}.$$

Finally, we get

$$\operatorname{MSE}(\hat\theta_{\mathrm{JK}}) = \operatorname{Bias}(\hat\theta_{\mathrm{JK}})^2 + \operatorname{Var}(\hat\theta_{\mathrm{JK}}) = \frac{2(n^2 - n + 1)\theta^2}{n^2(n+1)(n+2)} = \frac{n^2 - n + 1}{n^2}\operatorname{MSE}(\hat\theta_{\mathrm{MLE}}).$$

Optimality of MMSE

For any estimator $\hat\theta$, we write $g(\theta) = \mathbb{E}_\theta[\hat\theta]$. Suppose $g$ is linear, i.e., $g(\theta) = a\theta + b$; then by the linearity of expectation, we have

$$\mathbb{E}_\theta\!\left[a\,\hat\theta_{\mathrm{UMVUE}} + b\right] = a\theta + b = g(\theta).$$

Let $\tilde\theta = a\,\hat\theta_{\mathrm{UMVUE}} + b$. By Proposition ^prop, we know $\tilde\theta$ has the same bias as $\hat\theta$ but smaller variance than $\hat\theta$ if $\hat\theta \neq \tilde\theta$ (uniqueness). Further, among the class of estimators consisting of linear functions of $X_{(n)}$, it is easy to see $\hat\theta_{\mathrm{MMSE}}$ has the smallest MSE.

Now suppose $g$ is not linear. By Taylor expansion,

$$g(\theta) = g(\theta_0) + g'(\theta_0)(\theta - \theta_0) + \sum_{k \ge 2}\frac{g^{(k)}(\theta_0)}{k!}(\theta - \theta_0)^{k}.$$

For $\hat\theta$ to be uniformly optimal for any $\theta$, it must satisfy

$$\operatorname{MSE}(\hat\theta; \theta) \le \operatorname{MSE}(\hat\theta_{\mathrm{MMSE}}; \theta) = \frac{\theta^2}{(n+1)^2} \quad \text{for all } \theta.$$

Thus, the higher-order terms must vanish, i.e., $g^{(k)}(\theta_0) = 0$ for $k \ge 2$ and any $\theta_0$, indicating that $g$ must be linear in $\theta$.