p-value

Introduction

p-value is the probability, computed under the null hypothesis, of obtaining a real-valued test statistic at least as extreme as the one actually observed.

In other words, the (asymptotic) p-value of a test is the smallest (asymptotic) level $\alpha$ at which the test rejects $H_0$.

Consider an observed test statistic $t$ from an unknown distribution $T$. Then the p-value $p$ is what the prior probability would be of observing a test-statistic value at least as “extreme” as $t$ if the null hypothesis $H_0$ were true. That is:

  • $p = \Pr(T \ge t \mid H_0)$ for a one-sided right-tail test,
  • $p = \Pr(T \le t \mid H_0)$ for a one-sided left-tail test,
  • $p = 2\min\{\Pr(T \ge t \mid H_0),\ \Pr(T \le t \mid H_0)\}$ for a two-sided test.
    • If the distribution of $T$ is symmetric about zero, then $p = \Pr(|T| \ge |t| \mid H_0)$,

where $t$ is the observed test statistic.

  • ❗️ Since the test statistic is random, p-value is also random.
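To make the three tail formulas concrete, here is a minimal sketch in Python, assuming (hypothetically) that the null distribution of $T$ is standard normal:

```python
from statistics import NormalDist

# Hypothetical null distribution of the test statistic T: standard normal.
null = NormalDist(mu=0.0, sigma=1.0)

def p_value(t: float, tail: str = "right") -> float:
    """Probability under H0 of a statistic at least as extreme as t."""
    if tail == "right":                      # P(T >= t | H0)
        return 1.0 - null.cdf(t)
    if tail == "left":                       # P(T <= t | H0)
        return null.cdf(t)
    # Two-sided: 2 * min of the two tails (equals P(|T| >= |t|) by symmetry).
    return 2.0 * min(null.cdf(t), 1.0 - null.cdf(t))

print(p_value(1.96, "right"))  # ≈ 0.025
print(p_value(1.96, "two"))    # ≈ 0.05
```

For $t = 1.96$ the right-tail p-value is about $0.025$ and the two-sided one about $0.05$, the familiar normal-test thresholds.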

Fundamental rule of statistics

Reject $H_0$ at level $\alpha$ if and only if $p \le \alpha$. In other words, when $p \le \alpha$, an almost impossible event (one of probability at most $\alpha$ under $H_0$) has happened, thus $H_0$ is rejected.

  • 💡 The smaller the p-value, the more confidently one can reject $H_0$, because the observed event is too unlikely to happen under the null.

Formal Definitions

The above note provides an intuitive understanding of the p-value. This section discusses the p-value from a more formal perspective.

We first formalize the idea in ^reject using rejection regions. Suppose we are given a test statistic $T$ and its rejection regions $R_\alpha$ for any level $\alpha \in (0, 1)$. Then, we have

$$p = \inf\{\alpha : T \in R_\alpha\}.$$
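This “smallest rejecting level” can be sketched numerically. Assuming (for illustration) a right-tailed z-test with $T \sim \mathcal{N}(0,1)$ under the null, a bisection over levels recovers the usual tail probability:

```python
from statistics import NormalDist

null = NormalDist()  # hypothetical null: T ~ N(0, 1), right-tail test

def rejects(t: float, alpha: float) -> bool:
    """Rejection region R_alpha = {T >= c_alpha}, c_alpha the upper alpha-quantile."""
    return t >= null.inv_cdf(1.0 - alpha)

def p_value_inf(t: float, tol: float = 1e-9) -> float:
    """Smallest level at which the test rejects, found by bisection over alpha."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if rejects(t, mid):
            hi = mid   # rejects at level mid: try smaller levels
        else:
            lo = mid
    return hi

print(p_value_inf(1.96))  # matches the tail probability 1 - Phi(1.96) ≈ 0.025
```

The bisection is valid because the regions $R_\alpha$ are nested: the critical value $c_\alpha$ decreases as $\alpha$ grows, so the set of rejecting levels is an interval.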

We then formalize the idea in ^extreme. For a test statistic $T$, whose randomness depends on the underlying true parameter, we denote by $T_0$ the same statistic but whose distribution is determined by the null parameter and is usually known. Then, suppose the rejection region is of the form $R_\alpha = \{T \ge c_\alpha\}$, i.e., consider a right-tail test. Then, we have

$$p = \Pr(T_0 \ge T).$$

Specifically, upon realization of the test statistic $T = t$, the p-value is realized as $p = \Pr(T_0 \ge t)$.

Finally, we provide the most general definition of p-value, which does not depend on the rejection region: a p-value is a Statistic $p$ such that

$$\Pr_{H_0}(p \le \alpha) \le \alpha \quad \text{for all } \alpha \in [0, 1].$$

Recall that the CDF of a uniform distribution on $[0, 1]$ is $F(u) = u$ for $u \in [0, 1]$. Thus, a p-value is also said to have a super-uniform distribution. Common p-values are exactly uniform.

Understanding the super-uniformality of p-value

  • p-value is a tail probability. Under the null, we do not expect the p-value to be very small, because the probability of landing in a small tail is itself small.
    Illustration of p-value as a tail bound.
  • p-value, as a random variable, puts the evidence for the null on a standardized $[0, 1]$-scale. In other words, if the null hypothesis is true, we should observe a p-value uniformly distributed on $[0, 1]$.
    Illustration of p-value's super-uniformality.
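This uniformity can be checked by simulation. The sketch below (assuming, hypothetically, a right-tail test with a standard normal null) draws many test statistics under $H_0$ and checks that $\Pr(p \le \alpha) \approx \alpha$:

```python
import random
from statistics import NormalDist

# Draw test statistics under H0 (hypothetically T ~ N(0, 1)) and form
# right-tail p-values p = 1 - Phi(T).
random.seed(0)
null = NormalDist()
pvals = [1.0 - null.cdf(random.gauss(0.0, 1.0)) for _ in range(10_000)]

# Under H0, P(p <= alpha) should be close to alpha (exactly uniform here,
# since the null distribution is continuous).
for alpha in (0.01, 0.05, 0.5):
    frac = sum(p <= alpha for p in pvals) / len(pvals)
    print(f"P(p <= {alpha}) ≈ {frac:.3f}")
```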

To verify the validity of the above definitions, one just needs to check that the resultant test has size at most $\alpha$.

CI and HT duality

  • Denoting by $\phi_\alpha$ the test with rejection region $R_\alpha$, we see that a p-value summarizes the collection of tests $\{\phi_\alpha\}_\alpha$ for different levels and a fixed null.
  • Due to the Confidence Interval and Hypothesis Test Duality, we see that a confidence interval summarizes the collection of tests for different nulls and a fixed level.
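A quick sketch of the duality for a (hypothetical) two-sided z-test: a null value $\mu_0$ is rejected at level $\alpha$ exactly when it falls outside the $(1-\alpha)$ confidence interval:

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical data; two-sided z-test of H0: mu = mu0 at level alpha.
random.seed(2)
x = [random.gauss(5.0, 2.0) for _ in range(100)]
alpha = 0.05
n, xbar, s = len(x), mean(x), stdev(x)
z = NormalDist().inv_cdf(1.0 - alpha / 2.0)

def rejected(mu0: float) -> bool:
    """Level-alpha two-sided z-test of H0: mu = mu0."""
    return abs(sqrt(n) * (xbar - mu0) / s) > z

# The set of nulls mu0 that are NOT rejected is exactly the (1 - alpha) CI.
ci = (xbar - z * s / sqrt(n), xbar + z * s / sqrt(n))
print(ci)
print(rejected(ci[0] + 0.01))  # just inside the CI: not rejected -> False
print(rejected(ci[1] + 0.05))  # outside the CI: rejected -> True
```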

Examples

CLT Test Statistic

Given a CLT Test Statistic $T_n$, an asymptotic right-tail p-value is $p = 1 - \Phi(T_n)$, i.e., the Gaussian tail bound. The construction is based on the definition $p = \Pr(T_0 \ge T)$, with $T_0 \sim \mathcal{N}(0, 1)$.
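A minimal numerical sketch, with hypothetical data and $H_0: \mu = 0$ against a right-sided alternative:

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample; H0: mu = 0, H1: mu > 0. The data are actually
# drawn with mean 0.3, so we expect a small p-value.
random.seed(1)
mu0, n = 0.0, 200
x = [random.gauss(0.3, 1.0) for _ in range(n)]

t = sqrt(n) * (mean(x) - mu0) / stdev(x)   # CLT test statistic
p = 1.0 - NormalDist().cdf(t)              # asymptotic right-tail p-value
print(f"t = {t:.2f}, p = {p:.4f}")
```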

Likelihood Ratio Test

For a simple-simple HT, given its likelihood ratio $\Lambda = \frac{p_1(X)}{p_0(X)}$, $1/\Lambda$ is a p-value. We verify that $1/\Lambda$ satisfies the super-uniform definition:

$$\Pr_{H_0}\!\left(\frac{1}{\Lambda} \le \alpha\right) = \Pr_{H_0}\!\left(\Lambda \ge \frac{1}{\alpha}\right) \le \alpha\, \mathbb{E}_{H_0}[\Lambda] = \alpha,$$

where the inequality uses the Markov Inequality and the last equality uses $\mathbb{E}_{H_0}[\Lambda] = \int \frac{p_1(x)}{p_0(x)}\, p_0(x)\, dx = 1$. See also Likelihood Ratio Test for more general results.
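The Markov-inequality argument can be checked by simulation. Assuming (hypothetically) $H_0: X \sim \mathcal{N}(0,1)$ vs. $H_1: X \sim \mathcal{N}(1,1)$, the likelihood ratio is $\Lambda(x) = e^{x - 1/2}$, and the empirical CDF of $1/\Lambda$ under $H_0$ stays below the diagonal:

```python
import random
from math import exp

# Hypothetical simple-vs-simple test: H0: X ~ N(0, 1) vs H1: X ~ N(1, 1),
# so the likelihood ratio is L(x) = p1(x) / p0(x) = exp(x - 1/2).
def p_lr(x: float) -> float:
    """The p-value 1 / L(x), capped at 1."""
    return min(1.0, exp(0.5 - x))

random.seed(0)
pvals = [p_lr(random.gauss(0.0, 1.0)) for _ in range(100_000)]  # sample under H0

# Super-uniformity: P_{H0}(1/L <= alpha) <= alpha, by Markov's inequality.
for alpha in (0.01, 0.05, 0.1):
    frac = sum(p <= alpha for p in pvals) / len(pvals)
    print(f"P(p <= {alpha}) = {frac:.4f}  <= {alpha}")
```

Note how conservative this p-value is: the empirical fractions fall far below the bound $\alpha$, consistent with super-uniformity rather than exact uniformity.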