p-value

Introduction

p-value is the probability, computed under the null hypothesis, of obtaining a real-valued test statistic at least as extreme as the one actually observed.

In other words, the (asymptotic) p-value of a test is the smallest (asymptotic) level $\alpha$ at which the test rejects $H_0$.

Consider an observed test statistic $t$ from an unknown distribution $T$. Then the p-value $p$ is what the prior probability would be of observing a test-statistic value at least as “extreme” as $t$ if the null hypothesis $H_0$ were true. That is:

  • $p = \Pr(T \ge t \mid H_0)$ for a one-sided right-tail test,
  • $p = \Pr(T \le t \mid H_0)$ for a one-sided left-tail test,
  • $p = 2\min\{\Pr(T \ge t \mid H_0),\ \Pr(T \le t \mid H_0)\}$ for a two-sided test.
    • If the distribution of $T$ is symmetric about zero, then $p = \Pr(|T| \ge |t| \mid H_0)$,

where $t$ is the observed test statistic.

  • ❗️ Since the test statistic is random, p-value is also random.
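To make the three tail formulas concrete, here is a minimal sketch in Python, assuming (hypothetically) that the null distribution of $T$ is standard normal:

```python
from statistics import NormalDist

# Hypothetical null distribution of the test statistic T: standard normal.
null = NormalDist(mu=0.0, sigma=1.0)

def p_value(t: float, tail: str = "right") -> float:
    """Probability under H0 of a statistic at least as extreme as t."""
    if tail == "right":                      # P(T >= t | H0)
        return 1.0 - null.cdf(t)
    if tail == "left":                       # P(T <= t | H0)
        return null.cdf(t)
    # Two-sided: 2 * min of the two tails (equals P(|T| >= |t|) by symmetry).
    return 2.0 * min(null.cdf(t), 1.0 - null.cdf(t))

print(p_value(1.96, "right"))  # ≈ 0.025
print(p_value(1.96, "two"))    # ≈ 0.05
```

For $t = 1.96$ the right-tail p-value is about $0.025$ and the two-sided one about $0.05$, the familiar normal-test thresholds.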

Fundamental rule of statistics

Reject $H_0$ at level $\alpha$ if and only if $p \le \alpha$. In other words, when $p \le \alpha$, an almost impossible event (one of probability at most $\alpha$ under $H_0$) has happened, thus $H_0$ is rejected.

  • 💡 The smaller the p-value, the more confidently one can reject $H_0$, because the observed event is too unlikely to happen under the null.

Formal Definitions

The above note provides an intuitive understanding of the p-value. This section discusses the p-value from a more formal perspective.

We first formalize the idea in ^reject using rejection regions. Suppose we are given a test statistic $T$ and its rejection regions $R_\alpha$ for any level $\alpha \in (0, 1)$. Then, we have

$$p = \inf\{\alpha : T \in R_\alpha\}.$$
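This “smallest rejecting level” can be sketched numerically. Assuming (for illustration) a right-tailed z-test with $T \sim \mathcal{N}(0,1)$ under the null, a bisection over levels recovers the usual tail probability:

```python
from statistics import NormalDist

null = NormalDist()  # hypothetical null: T ~ N(0, 1), right-tail test

def rejects(t: float, alpha: float) -> bool:
    """Rejection region R_alpha = {T >= c_alpha}, c_alpha the upper alpha-quantile."""
    return t >= null.inv_cdf(1.0 - alpha)

def p_value_inf(t: float, tol: float = 1e-9) -> float:
    """Smallest level at which the test rejects, found by bisection over alpha."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if rejects(t, mid):
            hi = mid   # rejects at level mid: try smaller levels
        else:
            lo = mid
    return hi

print(p_value_inf(1.96))  # matches the tail probability 1 - Phi(1.96) ≈ 0.025
```

The bisection is valid because the regions $R_\alpha$ are nested: the critical value $c_\alpha$ decreases as $\alpha$ grows, so the set of rejecting levels is an interval.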

We then formalize the idea in ^extreme. For a test statistic $T$, whose randomness depends on the underlying true parameter, we denote by $T_0$ the same statistic but whose distribution is determined by the null parameter and is usually known. Then, suppose the rejection region is of the form $R_\alpha = \{T \ge c_\alpha\}$, i.e., consider a right-tail test. Then, we have

$$p = \Pr(T_0 \ge T).$$

Specifically, upon realization of the test statistic $T = t$, the p-value is realized as $p = \Pr(T_0 \ge t)$.

Finally, we provide the most general definition of p-value, which does not depend on the rejection region: a p-value is a Statistic $p$ such that

$$\Pr_{H_0}(p \le \alpha) \le \alpha \quad \text{for all } \alpha \in [0, 1].$$

Recall that the CDF of a uniform distribution on $[0, 1]$ is $F(u) = u$ for $u \in [0, 1]$. Thus, a p-value is also said to have a super-uniform distribution. Common p-values are exactly uniform.

Understanding the super-uniformality of p-value

  • p-value is a tail probability. Under the null, we do not expect the p-value to be very small, because the probability of landing in a small tail is itself small.
    Illustration of p-value as a tail bound.
  • p-value, as a random variable, puts the evidence for the null on a standardized $[0, 1]$-scale. In other words, if the null hypothesis is true, we should observe a p-value uniformly distributed on $[0, 1]$.
    Illustration of p-value's super-uniformality.
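This uniformity can be checked by simulation. The sketch below (assuming, hypothetically, a right-tail test with a standard normal null) draws many test statistics under $H_0$ and checks that $\Pr(p \le \alpha) \approx \alpha$:

```python
import random
from statistics import NormalDist

# Draw test statistics under H0 (hypothetically T ~ N(0, 1)) and form
# right-tail p-values p = 1 - Phi(T).
random.seed(0)
null = NormalDist()
pvals = [1.0 - null.cdf(random.gauss(0.0, 1.0)) for _ in range(10_000)]

# Under H0, P(p <= alpha) should be close to alpha (exactly uniform here,
# since the null distribution is continuous).
for alpha in (0.01, 0.05, 0.5):
    frac = sum(p <= alpha for p in pvals) / len(pvals)
    print(f"P(p <= {alpha}) ≈ {frac:.3f}")
```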

To verify the validity of the above definitions, one just needs to check that the resultant test has size at most $\alpha$.

CI and HT duality

  • Denoting by $\phi_\alpha$ the test with rejection region $R_\alpha$, we see that a p-value summarizes the collection of tests $\{\phi_\alpha\}_\alpha$ for different levels and a fixed null.
  • Due to the Confidence Interval and Hypothesis Test Duality, we see that a confidence interval summarizes the collection of tests for different nulls and a fixed level.
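A quick sketch of the duality for a (hypothetical) two-sided z-test: a null value $\mu_0$ is rejected at level $\alpha$ exactly when it falls outside the $(1-\alpha)$ confidence interval:

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical data; two-sided z-test of H0: mu = mu0 at level alpha.
random.seed(2)
x = [random.gauss(5.0, 2.0) for _ in range(100)]
alpha = 0.05
n, xbar, s = len(x), mean(x), stdev(x)
z = NormalDist().inv_cdf(1.0 - alpha / 2.0)

def rejected(mu0: float) -> bool:
    """Level-alpha two-sided z-test of H0: mu = mu0."""
    return abs(sqrt(n) * (xbar - mu0) / s) > z

# The set of nulls mu0 that are NOT rejected is exactly the (1 - alpha) CI.
ci = (xbar - z * s / sqrt(n), xbar + z * s / sqrt(n))
print(ci)
print(rejected(ci[0] + 0.01))  # just inside the CI: not rejected -> False
print(rejected(ci[1] + 0.05))  # outside the CI: rejected -> True
```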

Examples

CLT Test Statistic

Given a CLT Test Statistic $T_n$, an asymptotic right-tail p-value is $p = 1 - \Phi(T_n)$, i.e., the Gaussian tail bound. The construction is based on the definition $p = \Pr(T_0 \ge T)$, with $T_0 \sim \mathcal{N}(0, 1)$.
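A minimal numerical sketch, with hypothetical data and $H_0: \mu = 0$ against a right-sided alternative:

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample; H0: mu = 0, H1: mu > 0. The data are actually
# drawn with mean 0.3, so we expect a small p-value.
random.seed(1)
mu0, n = 0.0, 200
x = [random.gauss(0.3, 1.0) for _ in range(n)]

t = sqrt(n) * (mean(x) - mu0) / stdev(x)   # CLT test statistic
p = 1.0 - NormalDist().cdf(t)              # asymptotic right-tail p-value
print(f"t = {t:.2f}, p = {p:.4f}")
```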

Likelihood Ratio Test

For a simple-simple HT, given its likelihood ratio $\Lambda = \frac{p_1(X)}{p_0(X)}$, $1/\Lambda$ is a p-value. We verify that $1/\Lambda$ satisfies the super-uniform definition:

$$\Pr_{H_0}\!\left(\frac{1}{\Lambda} \le \alpha\right) = \Pr_{H_0}\!\left(\Lambda \ge \frac{1}{\alpha}\right) \le \alpha\, \mathbb{E}_{H_0}[\Lambda] = \alpha,$$

where the inequality uses the Markov Inequality and the last equality uses $\mathbb{E}_{H_0}[\Lambda] = \int \frac{p_1(x)}{p_0(x)}\, p_0(x)\, dx = 1$. See also Likelihood Ratio Test for more general results.
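The Markov-inequality argument can be checked by simulation. Assuming (hypothetically) $H_0: X \sim \mathcal{N}(0,1)$ vs. $H_1: X \sim \mathcal{N}(1,1)$, the likelihood ratio is $\Lambda(x) = e^{x - 1/2}$, and the empirical CDF of $1/\Lambda$ under $H_0$ stays below the diagonal:

```python
import random
from math import exp

# Hypothetical simple-vs-simple test: H0: X ~ N(0, 1) vs H1: X ~ N(1, 1),
# so the likelihood ratio is L(x) = p1(x) / p0(x) = exp(x - 1/2).
def p_lr(x: float) -> float:
    """The p-value 1 / L(x), capped at 1."""
    return min(1.0, exp(0.5 - x))

random.seed(0)
pvals = [p_lr(random.gauss(0.0, 1.0)) for _ in range(100_000)]  # sample under H0

# Super-uniformity: P_{H0}(1/L <= alpha) <= alpha, by Markov's inequality.
for alpha in (0.01, 0.05, 0.1):
    frac = sum(p <= alpha for p in pvals) / len(pvals)
    print(f"P(p <= {alpha}) = {frac:.4f}  <= {alpha}")
```

Note how conservative this p-value is: the empirical fractions fall far below the bound $\alpha$, consistent with super-uniformity rather than exact uniformity.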