Confidence Interval

A (asymptotic) level $1-\alpha$ confidence set $C_n$ of a parameter $\theta$ is such that $\mathbb{P}(\theta \in C_n) \ge 1 - \alpha$ (for the asymptotic version, $\liminf_{n \to \infty} \mathbb{P}(\theta \in C_n) \ge 1 - \alpha$).

  • ❗️ The confidence set is random. The confidence means that $1-\alpha$ of all the set instances will cover the unknown value of $\theta$. The wrong statement is that $\theta$ lies in the confidence set with probability at least $1-\alpha$. Note that $\theta$ is not a Random Variable.
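
To see this frequency interpretation numerically, here is a minimal simulation sketch (a Gaussian model with known variance; the constants and names are my own choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, alpha = 1.0, 2.0, 50, 0.05
z = stats.norm.ppf(1 - alpha / 2)  # critical value z_{1-alpha/2}

covered, trials = 0, 10_000
for _ in range(trials):
    x = rng.normal(mu, sigma, size=n)
    half = z * sigma / np.sqrt(n)              # half-width with known sigma
    lo, hi = x.mean() - half, x.mean() + half  # the CI is random ...
    covered += (lo <= mu <= hi)                # ... while mu is fixed

print(covered / trials)  # ~0.95: fraction of the random CIs covering mu
```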

Set relations are useful in constructing different CIs:

If $[a, b]$ is a CI for $\theta$ and $f$ is a monotonically increasing function, then $[f(a), f(b)]$ is a CI for $f(\theta)$.
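
For instance, $\exp$ is monotonically increasing, so exponentiating the endpoints of a CI for $\mu$ gives a CI for $e^\mu$. A minimal sketch (synthetic data, my setup):

```python
import numpy as np
from scipy import stats

x = np.random.default_rng(1).normal(loc=0.5, scale=1.0, size=100)

z = stats.norm.ppf(0.975)
half = z * x.std(ddof=1) / np.sqrt(len(x))
lo, hi = x.mean() - half, x.mean() + half  # 95% CI for mu

# exp is monotone increasing, so this is a 95% CI for exp(mu)
print(np.exp(lo), np.exp(hi))
```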

Test Statistic and Critical Values

Recall that a Statistic is a function of the observed data, e.g., mean and variance. If a test involves some parameters, a test statistic is often a function $T(X_{1:n}, \theta)$ of both the sample and the parameter, such that the distribution of $T$ is known and does not depend on the unknown parameter $\theta$.

Such a test statistic is also called a pivot (quantity).

Then, we can first construct a confidence interval for the test statistic $T(X_{1:n}, \theta)$. Using the knowledge of its distribution (or quantiles), the confidence interval can be given by:

$$\mathbb{P}\left(q_{\alpha/2} \le T(X_{1:n}, \theta) \le q_{1-\alpha/2}\right) = 1 - \alpha,$$

where $q_\beta$ is the $\beta$-th quantile of the distribution of $T$, and $q_{\alpha/2}$ and $q_{1-\alpha/2}$ are called the critical values.

First Example–Gaussian Mean

Let’s first consider a test statistic with a known distribution. Suppose a statistical model $X_1, \dots, X_n \overset{iid}{\sim} \mathcal{N}(\mu, \sigma^2)$ with known variance $\sigma^2$. Then we know that $T = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \sim \mathcal{N}(0, 1)$, whose distribution is known and independent of $\mu$. Using the quantile function (inverse CDF) of the Normal Distribution,

$$\mathbb{P}\left(-z_{1-\alpha/2} \le \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \le z_{1-\alpha/2}\right) = 1 - \alpha,$$

where $z_{1-\alpha/2} = \Phi^{-1}(1-\alpha/2)$, which gives the CI

$$\left[\bar{X}_n - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X}_n + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right].$$
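
A concrete sketch with synthetic data (the true mean and $\sigma$ here are my assumptions):

```python
import numpy as np
from scipy import stats

sigma, alpha = 3.0, 0.05
x = np.random.default_rng(2).normal(10.0, sigma, size=40)

# equivalent to mean +- z_{1-alpha/2} * sigma / sqrt(n)
lo, hi = stats.norm.interval(1 - alpha, loc=x.mean(), scale=sigma / np.sqrt(len(x)))
print(lo, hi)  # finite-sample valid 95% CI for the mean
```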

Standard Error Based CI

Usually the test statistic is of the form $T = \frac{\hat{\theta} - \theta}{\widehat{se}}$, where $\hat{\theta}$ is the estimator of $\theta$, and $\widehat{se}$ is the standard error, i.e., the (estimated) standard deviation of the estimator $\hat{\theta}$. Then, if we know the quantile function of $T$, i.e., critical values, the confidence interval is of the form:

$$\left[\hat{\theta} - q_{1-\alpha/2} \cdot \widehat{se},\ \hat{\theta} - q_{\alpha/2} \cdot \widehat{se}\right],$$

which is $\hat{\theta} \pm q_{1-\alpha/2} \cdot \widehat{se}$ when the distribution of $T$ is symmetric.

Confidence Interval Width

The width of the confidence interval, that is, its precision, depends on (see the sketch after this list):

  • The sample size $n$: the larger the sample size, the narrower the CI.
  • The confidence level: the higher the confidence, the wider the CI will be!
  • The standard deviation of the population or SE: the larger the SE, the wider the CI will be.
  • The method used to construct the CI.
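
A quick numerical sketch of the first two effects (a normal-quantile CI is assumed):

```python
import numpy as np
from scipy import stats

sigma = 1.0
for n in (25, 100, 400):
    for conf in (0.90, 0.95, 0.99):
        z = stats.norm.ppf(0.5 + conf / 2)   # z_{1-alpha/2} with alpha = 1-conf
        width = 2 * z * sigma / np.sqrt(n)   # width = 2 * z * SE
        print(f"n={n:4d} conf={conf:.2f} width={width:.3f}")
# width shrinks like 1/sqrt(n) and grows with the confidence level
```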

CLT CI

The CI in First Example–Gaussian Mean is finite-sample valid since we use exact quantiles. When estimating the mean of an unknown distribution with unknown variance, we can leverage the asymptotic normality given by the CLT:

$$\frac{\sqrt{n}(\bar{X}_n - \mu)}{\hat{\sigma}_n} \xrightarrow{d} \mathcal{N}(0, 1), \quad \text{giving the CI} \quad \bar{X}_n \pm z_{1-\alpha/2}\frac{\hat{\sigma}_n}{\sqrt{n}},$$

where $\hat{\sigma}_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X}_n)^2$ is the sample variance; then $\hat{\sigma}_n/\sqrt{n}$ is an estimated SE.
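
A minimal sketch of the CLT CI on non-Gaussian data (exponential is my choice here; its true mean is 2):

```python
import numpy as np
from scipy import stats

x = np.random.default_rng(3).exponential(scale=2.0, size=500)

z = stats.norm.ppf(0.975)
se_hat = x.std(ddof=1) / np.sqrt(len(x))  # estimated SE via sample std
print(x.mean() - z * se_hat, x.mean() + z * se_hat)  # asymptotic 95% CI
```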

Asymptotic Validity

We prove that the CLT CI is asymptotically valid. Denote $z = z_{1-\alpha/2}$. We have

$$\mathbb{P}\left(\mu \in \bar{X}_n \pm z\frac{\hat{\sigma}_n}{\sqrt{n}}\right) = \mathbb{P}\left(-z \le \frac{\sqrt{n}(\bar{X}_n - \mu)}{\hat{\sigma}_n} \le z\right).$$

By CLT, $\sqrt{n}(\bar{X}_n - \mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2)$; by LLN, $\hat{\sigma}_n \xrightarrow{P} \sigma$. Then, by Slutsky’s Theorem,

$$\frac{\sqrt{n}(\bar{X}_n - \mu)}{\hat{\sigma}_n} \xrightarrow{d} \mathcal{N}(0, 1),$$

so the probability above converges to $\Phi(z) - \Phi(-z) = 1 - \alpha$.

Hoeffding CI

For bounded r.v.s $X_i \in [a, b]$, Hoeffding’s Inequality gives

$$\mathbb{P}\left(\left|\bar{X}_n - \mu\right| \ge t\right) \le 2\exp\left(-\frac{2nt^2}{(b-a)^2}\right).$$

Letting the RHS be $\alpha$ gives $t = (b-a)\sqrt{\frac{\log(2/\alpha)}{2n}}$, which further gives a level $1-\alpha$ CI:

$$\bar{X}_n \pm (b-a)\sqrt{\frac{\log(2/\alpha)}{2n}}.$$

  • ❗️ Note that the Hoeffding CI is finite-sample valid, in contrast to the CLT CI, which is only asymptotically valid. However, Hoeffding is very conservative and not typically used in practice (see the comparison sketch below).
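
A sketch comparing the two half-widths on Bernoulli data, so $[a, b] = [0, 1]$ (data and constants are my choices):

```python
import numpy as np
from scipy import stats

x = np.random.default_rng(4).binomial(1, 0.3, size=200)  # X_i in [0, 1]
n, alpha = len(x), 0.05

t_hoeff = np.sqrt(np.log(2 / alpha) / (2 * n))  # (b - a) = 1
t_clt = stats.norm.ppf(1 - alpha / 2) * x.std(ddof=1) / np.sqrt(n)

print(f"Hoeffding half-width: {t_hoeff:.4f}")  # valid for every n, but wide
print(f"CLT half-width:       {t_clt:.4f}")    # narrower, asymptotic only
```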

Wald CI

When the SE of the estimator depends on the true parameter $\theta$, we can simply plug the estimator $\hat{\theta}$ into the SE to get an estimated $\widehat{se} = se(\hat{\theta})$. This gives the Wald statistic:

$$T = \frac{\hat{\theta} - \theta}{se(\hat{\theta})}.$$

Under some conditions, $T \xrightarrow{d} \mathcal{N}(0, 1)$. Thus, the Wald CI is $\hat{\theta} \pm z_{1-\alpha/2} \, se(\hat{\theta})$. The Wald CI is also called the plug-in CI.

  • 👎 Unlike the CLT CI, where we use the sample variance to approximate the SE, the Wald CI requires knowledge of how the SE depends on the parameter.

  • 👍 However, the Wald CI is usually easier to compute, and needs only one statistic.

    • 📗 For example, when estimating the mean, the Wald CI only needs $\bar{X}_n$, while computing the sample variance additionally requires a second statistic (e.g., a running sum of squares) or the entire dataset (see the sketch below).
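
A sketch for the Bernoulli mean, where $se(p) = \sqrt{p(1-p)/n}$ depends on the parameter, so only $\hat{p} = \bar{X}_n$ is needed (function name is mine):

```python
import numpy as np
from scipy import stats

def wald_ci(p_hat: float, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Plug-in (Wald) CI for a Bernoulli mean: needs only p_hat and n."""
    z = stats.norm.ppf(1 - alpha / 2)
    se_hat = np.sqrt(p_hat * (1 - p_hat) / n)  # SE with p_hat plugged in
    return p_hat - z * se_hat, p_hat + z * se_hat

print(wald_ci(p_hat=0.3, n=200))
```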

Wilson Score CI

Instead of constructing the CI using the Standard Error Based CI with an estimated SE, we can also use the exact SE and solve the inequality with the unknown parameter on both sides. However, only in a few cases does this lead to a closed form.

For example, for the Binomial Distribution, using the exact SE $se = \sqrt{p(1-p)/n}$, we have

$$\mathbb{P}\left(-z_{1-\alpha/2} \le \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \le z_{1-\alpha/2}\right) \approx 1 - \alpha.$$

The involved inequalities are quadratic in $p$. Solving them gives the Wilson score CI:

$$\frac{\hat{p} + \frac{z^2}{2n}}{1 + \frac{z^2}{n}} \pm \frac{z}{1 + \frac{z^2}{n}}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}, \quad z = z_{1-\alpha/2}.$$
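
A sketch of the Wilson CI next to the Wald CI; with zero successes the Wald interval degenerates while Wilson stays sensible (function name is mine):

```python
import numpy as np
from scipy import stats

def wilson_ci(p_hat: float, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Wilson score CI: solve the quadratic inequality in p exactly."""
    z = stats.norm.ppf(1 - alpha / 2)
    center = (p_hat + z**2 / (2 * n)) / (1 + z**2 / n)
    half = (z / (1 + z**2 / n)) * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half, center + half

# extreme case: zero successes out of 20 trials
print(wilson_ci(0.0, 20))  # still a non-degenerate interval, ~[0, 0.16]
# the Wald CI here would collapse to [0, 0] since se(0) = 0
```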