Convergence of Random Variables

Relation between Convergence Modes

$$
\begin{aligned}
& X_n \xrightarrow{a.s.} X \;\Longrightarrow\; X_n \xrightarrow{P} X \;\Longrightarrow\; X_n \xrightarrow{d} X \\
& X_n \xrightarrow{L^s} X \;\Longrightarrow\; X_n \xrightarrow{L^r} X \;\Longrightarrow\; X_n \xrightarrow{P} X \qquad (s > r \geq 1) \\
& f_n \to f \;\text{ or }\; p_n \to p \;\Longrightarrow\; X_n \xrightarrow{d} X \;\Longleftrightarrow\; \varphi_n \to \varphi
\end{aligned}
$$

where $f$, $p$, and $\varphi$ are the PDF (for continuous r.v.s), PMF (for discrete r.v.s), and characteristic function, respectively, and their convergences are in the sense of convergence of functions: e.g., $f_n(x) \to f(x)$ for all $x$.

  • 📎 If $X$ follows a point-mass/Dirac Distribution $\delta_c$, then $X_n \xrightarrow{d} X \iff X_n \xrightarrow{P} X$.

Convergence under transformations

| Mode \ Operation | CMT ($g$)¹ | Addition ($+$) | Multiplication ($\times$) | Division ($\div$)² | Joint Distribution $(\cdot,\cdot)$ |
| --- | --- | --- | --- | --- | --- |
| $\xrightarrow{a.s.}$ | ✅ | ✅ | ✅ | ✅ | ✅ |
| $\xrightarrow{P}$ | ✅ | ✅ | ✅ | ✅ | ✅ |
| $\xrightarrow{d}$ | ✅ | ❌ (✅ if one limit is a constant, by Slutsky) | ❌ (✅ by Slutsky) | ❌ (✅ by Slutsky) | ❌ |
| $\xrightarrow{L^r}$ | ❌ | ✅ | ❌ | ❌ | ✅ |

As a Random Variable can be characterized by several different mathematical objects, we can define various modes of convergence for a sequence of random variables. Some definitions view random variables as Measurable functions, others as probability measures, and some through their associated functions, such as the CDF and the Characteristic Function.

In this note, we denote a sequence of random variables as $\{X_n\}_{n=1}^{\infty}$ and its limit as $X$.

Almost Sure/ Strong Convergence

  • Definition: We say that $X_n$ converges almost surely/ almost everywhere/ with probability 1/ strongly to $X$ if
$$P\left(\lim_{n\to\infty} X_n = X\right) = 1.$$
  • Notation: $X_n \xrightarrow{a.s.} X$.
  • Alternative interpretation: Viewing a random variable as a measurable function of events, the above convergence is equivalent to
$$P\left(\left\{\omega \in \Omega : \lim_{n\to\infty} X_n(\omega) = X(\omega)\right\}\right) = 1,$$

where $\Omega$ is the event space.

  • ❗️ Remark: $\{X_n\}$ and $X$ are generally highly dependent for this convergence to hold.

    • 📗 Partial sum $\to$ infinite sum: $S_n = \sum_{i=1}^{n} Y_i \xrightarrow{a.s.} \sum_{i=1}^{\infty} Y_i$ whenever the series converges almost surely (see the sketch after this list).

    • 📗 R.v.s defined as converging functions of a single underlying r.v.: e.g., $X_n = f_n(U)$ with $f_n \to f$ pointwise, so $X_n \xrightarrow{a.s.} f(U)$.
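
A minimal numpy sketch of the first example (the series $Y_i = U_i / i^2$ with $U_i \sim \text{Uniform}(-1,1)$ is an illustrative choice, not from this note): every simulated path settles to its own finite limit, which is exactly the path-by-path nature of almost sure convergence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Partial sums S_n of the absolutely summable random series Y_i = U_i / i^2,
# with U_i ~ Uniform(-1, 1): every sample path converges, so S_n -> S a.s.
Y = rng.uniform(-1, 1, size=(5, 10_000)) / np.arange(1, 10_001) ** 2
S = np.cumsum(Y, axis=1)  # S[k, n-1] = n-th partial sum along path k

# Path by path (not just on average), the tail fluctuation vanishes.
for k in range(5):
    tail = np.abs(S[k, 1_000:] - S[k, -1]).max()
    print(f"path {k}: max |S_n - S_final| over n >= 1000 is {tail:.2e}")
```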

Convergence in Probability

  • Definition: We say that $X_n$ converges in probability to $X$ if for any $\epsilon > 0$,
$$\lim_{n\to\infty} P\left(|X_n - X| > \epsilon\right) = 0.$$
  • Notation: $X_n \xrightarrow{P} X$.

  • ❗️ Remark: Convergence in probability has a similar interpretation as almost sure convergence, and it also generally requires $\{X_n\}$ and $X$ to be dependent. However, it's weaker than almost sure convergence because it's not uniform: the exceptional events $\{\omega : |X_n(\omega) - X(\omega)| > \epsilon\}$ may differ for each $n$, so no single outcome $\omega$ is required to satisfy $X_n(\omega) \to X(\omega)$. A counterexample is sketched below.
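
A minimal numpy sketch of the classic counterexample (independent $X_n \sim \text{Bernoulli}(1/n)$, chosen here for illustration): the sequence converges to $0$ in probability, but since $\sum_n 1/n = \infty$, the second Borel–Cantelli lemma says $X_n = 1$ infinitely often on almost every path, so it does not converge almost surely.

```python
import numpy as np

rng = np.random.default_rng(1)

# Independent X_n ~ Bernoulli(1/n): P(|X_n - 0| > eps) = 1/n -> 0, so X_n -> 0
# in probability.  But sum_n 1/n diverges, so by the second Borel-Cantelli
# lemma X_n = 1 infinitely often on almost every path: no a.s. convergence.
N = 200_000
n = np.arange(1, N + 1)
path = rng.random(N) < 1.0 / n  # one sample path of the whole sequence

# In probability: the chance of seeing a "1" at a fixed late index is tiny...
print("P(X_n = 1) at n = N:", 1.0 / N)
# ...yet the path itself never settles: ones keep appearing arbitrarily late.
print("indices of the last few 1s:", np.flatnonzero(path)[-5:])
```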

Convergence in Distribution/ Weak Convergence

  • Definition: We say that $X_n$ converges in distribution/ in law/ weakly to $X$ if

$$\lim_{n\to\infty} F_{X_n}(x) = F_X(x)$$

for all $x$ at which $F_X$ is continuous, where $F$ denotes a Cumulative Distribution Function.

  • Notation: $X_n \xrightarrow{d} X$.

  • Alternative interpretation: Weak convergence inspects the convergence of the associated CDFs of a random variable sequence, and thus it's weaker and generally requires no dependency between $X_n$ and $X$. It's equivalent to the convergence of the Characteristic Function.

  • ❗️ Remark: Weak convergence is consistent with the convergence of real numbers. If $x_n \to x$ and $X_n \equiv x_n$, $X \equiv x$ are degenerate r.v.s, then $X_n \xrightarrow{d} X$.

    • ❗️ This consistency does not hold if we require $F_{X_n}(x) \to F_X(x)$ for all $x$: e.g., with $x_n = 1/n$ and $x = 0$, we have $F_{X_n}(0) = 0$ for every $n$ but $F_X(0) = 1$, which is why the definition excludes discontinuity points.

Portmanteau Lemma

Several important statements equivalent to convergence in distribution are given by the Portmanteau Lemma:

  1. $E[g(X_n)] \to E[g(X)]$ for any bounded, continuous/Lipschitz function $g$ (checked numerically below).
  2. $\liminf_{n\to\infty} E[g(X_n)] \geq E[g(X)]$ for any nonnegative and continuous function $g$.
  3. $\liminf_{n\to\infty} P(X_n \in U) \geq P(X \in U)$ for any open set $U$.
  4. $\limsup_{n\to\infty} P(X_n \in C) \leq P(X \in C)$ for any closed set $C$.
  5. $P(X_n \in A) \to P(X \in A)$ for any continuity set³ $A$.
    • which is further equivalent to $E[\mathbf{1}_A(X_n)] \to E[\mathbf{1}_A(X)]$.
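
A minimal numpy check of statement 1, with the bounded continuous test function $g(x) = e^{-x^2}$ and $X_n$ a standardized Binomial (both illustrative choices): by the CLT, $X_n \xrightarrow{d} N(0,1)$, and for this $g$ the limiting expectation is known exactly.

```python
import numpy as np

rng = np.random.default_rng(2)

def g(x):  # a bounded, continuous test function
    return np.exp(-x**2)

# X_n = standardized Binomial(n, 1/2) converges in distribution to N(0,1) by
# the CLT, so Portmanteau statement 1 gives E[g(X_n)] -> E[g(Z)], Z ~ N(0,1).
# For this g the limit is exact: E[exp(-Z^2)] = 1/sqrt(3) ≈ 0.5774.
for n in [10, 100, 10_000]:
    B = rng.binomial(n, 0.5, size=500_000)
    Xn = (B - n * 0.5) / np.sqrt(n * 0.25)
    print(f"n={n:6d}: E[g(X_n)] ≈ {g(Xn).mean():.4f}")
print("limit  : E[g(Z)] = 1/sqrt(3) ≈", round(1 / np.sqrt(3), 4))
```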

Convergence of PDF/PMF

Suppose $X_n \xrightarrow{d} X$. The convergence of the associated PDF/PMF does not follow in general:

  • It is possible for $X_n$ to be discrete and $X$ to be continuous.

    • 📗 $X_n \sim \text{Uniform}\left\{\frac{1}{n}, \frac{2}{n}, \ldots, \frac{n}{n}\right\} \xrightarrow{d} \text{Uniform}[0, 1]$.
  • It is possible for $X_n$ to be continuous and $X$ to be discrete.

    • 📗 $X_n \sim \text{Uniform}\left[0, \frac{1}{n}\right] \xrightarrow{d} \delta_0$.
  • ❗️ If $X_n$ and $X$ are continuous, it is possible that the PDF does not converge.

    • 📗 $f_{X_n}(x) = \left(1 - \cos(2\pi n x)\right)\mathbf{1}_{(0,1)}(x)$ gives $X_n \xrightarrow{d} \text{Uniform}(0, 1)$, but $f_{X_n}$ does not converge pointwise (see the numerical check at the end of this subsection).

For the other direction, we have:

  • If $X_n$ and $X$ are continuous, pointwise convergence of the PDFs implies convergence in distribution (by Scheffé's lemma).
  • If $X_n$ and $X$ are discrete, convergence in distribution is equivalent to pointwise convergence of the PMFs.
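
A minimal numpy check of the density counterexample above, using the closed-form CDF $F_n(x) = x - \sin(2\pi n x)/(2\pi n)$ (derived by integrating the stated $f_{X_n}$): the CDFs converge uniformly to $F(x) = x$, while the density at a fixed point keeps oscillating.

```python
import numpy as np

# X_n has density f_n(x) = 1 - cos(2*pi*n*x) on (0,1), hence CDF
# F_n(x) = x - sin(2*pi*n*x) / (2*pi*n), so sup|F_n(x) - x| <= 1/(2*pi*n)
# and X_n -> Uniform(0,1) in distribution; yet f_n oscillates between 0 and 2.
for n in [10, 100, 1000]:
    print(f"n={n:5d}: sup|F_n - F| <= {1 / (2 * np.pi * n):.1e}")
for n in [1001, 1002, 1003, 1004]:  # f_n at a fixed point keeps oscillating
    print(f"n={n}: f_n(0.25) = {1 - np.cos(2 * np.pi * n * 0.25):.1f}")
```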

Convergence of Characteristic Functions

Convergence in distribution is equivalent to the pointwise convergence of Characteristic Functions (Lévy's continuity theorem):

$$X_n \xrightarrow{d} X \iff \varphi_{X_n}(t) \to \varphi_X(t) \ \text{ for all } t \in \mathbb{R}.$$
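
A minimal numpy illustration (standardized sums of Rademacher variables, chosen because their characteristic function has the closed form $\varphi_n(t) = \cos(t/\sqrt{n})^n$): the characteristic functions converge pointwise to $e^{-t^2/2}$, the characteristic function of $N(0,1)$, so the standardized sums converge in distribution to $N(0,1)$.

```python
import numpy as np

# For S_n = (Y_1 + ... + Y_n)/sqrt(n) with iid Rademacher Y_i (+-1, each w.p.
# 1/2), the characteristic function is exactly phi_n(t) = cos(t/sqrt(n))^n.
# It converges pointwise to exp(-t^2/2), the characteristic function of
# N(0,1), hence S_n -> N(0,1) in distribution by Levy continuity.
t = np.array([0.5, 1.0, 2.0, 4.0])
for n in [10, 100, 10_000]:
    phi_n = np.cos(t / np.sqrt(n)) ** n
    print(f"n={n:6d}: phi_n(t) = {np.round(phi_n, 4)}")
print("limit  : exp(-t^2/2) =", np.round(np.exp(-t**2 / 2), 4))
```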

Convergence in Norm

  • Definition: We say that $X_n$ converges ==in norm/ in $r$th mean== to $X$ if
$$\lim_{n\to\infty} E\left[|X_n - X|^r\right] = 0.$$
  • Notation: $X_n \xrightarrow{L^r} X$.

  • ❗️ Remark: For $s > r \geq 1$, we have $X_n \xrightarrow{L^s} X \implies X_n \xrightarrow{L^r} X$, but not the other way around.
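
A minimal numpy sketch of a counterexample for the converse direction ($X_n = n$ with probability $1/n^2$, else $0$, an illustrative choice): the sequence converges in $L^1$ but not in $L^2$, because rare, huge values are killed by the first moment yet preserved by the second.

```python
import numpy as np

rng = np.random.default_rng(3)

# X_n = n with probability 1/n^2 and 0 otherwise: E|X_n| = 1/n -> 0, so
# X_n -> 0 in L^1, yet E|X_n|^2 = 1 for every n, so X_n does not converge
# to 0 in L^2: L^r convergence does not imply L^s convergence for s > r.
for n in [10, 100, 1000]:
    X = np.where(rng.random(10_000_000) < 1 / n**2, n, 0).astype(float)
    print(f"n={n:5d}: E|X_n| ≈ {X.mean():.5f}   E|X_n|^2 ≈ {(X**2).mean():.3f}")
```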

Convergence under Transformations

Continuous Mapping Theorem

Let $\{X_n\}$ be a sequence of random variables that converges almost surely/ in probability/ in distribution to $X$. Let $g$ be a continuous function¹. Then $g(X_n)$ converges almost surely/ in probability/ in distribution to $g(X)$.

  • ❗️ The continuous mapping theorem does not apply to convergence in norm.

Further, if $g$ is continuous at a constant $c$ and $X_n \xrightarrow{P} c$, then $g(X_n) \xrightarrow{P} g(c)$. $g$ need not be continuous everywhere in this case.
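
A minimal numpy sketch of the CMT for convergence in distribution, with $g(x) = x^2$ and a standardized Binomial (both illustrative choices): since $Z_n \xrightarrow{d} N(0,1)$ and $g$ is continuous, $Z_n^2 \xrightarrow{d} \chi^2_1$.

```python
import numpy as np

rng = np.random.default_rng(4)

# Continuous mapping: Z_n (a standardized Binomial) -> N(0,1) in distribution,
# and g(x) = x^2 is continuous, so g(Z_n) -> chi^2_1 in distribution.
# Check P(Z_n^2 <= 1), whose limit is P(chi^2_1 <= 1) = P(|Z| <= 1) ≈ 0.6827.
for n in [10, 100, 10_000]:
    B = rng.binomial(n, 0.5, size=500_000)
    Zn = (B - n / 2) / np.sqrt(n / 4)
    print(f"n={n:6d}: P(Z_n^2 <= 1) ≈ {(Zn**2 <= 1).mean():.4f}")
print("limit  : P(chi^2_1 <= 1) ≈ 0.6827")
```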

Slutsky’s Theorem

Let $\{X_n\}$ and $\{Y_n\}$ be sequences of random variables that converge in distribution to $X$ and $c$ respectively, where $c$ is a constant. Then

$$X_n \oplus Y_n \xrightarrow{d} X \oplus c,$$

where $\oplus \in \{+, \times, \div\}$ is any of the binary operations. For $X_n \div Y_n$ to be defined, $c$ must be nonzero².

  • ❗️ The theorem also holds for convergence in probability.

For almost sure convergence, convergence in probability, and convergence in norm, we have stronger results:

Suppose $X_n \to X$ and $Y_n \to Y$, both almost surely or both in probability. Then, in the same mode,

$$X_n \oplus Y_n \to X \oplus Y, \qquad \oplus \in \{+, \times, \div\}.$$

Suppose $X_n \xrightarrow{L^r} X$ and $Y_n \xrightarrow{L^r} Y$. Then

$$X_n + Y_n \xrightarrow{L^r} X + Y.$$

Note that we no longer restrict $Y_n$ to converge to a constant.
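
A minimal numpy sketch of the canonical Slutsky application, the $t$-statistic (the Exponential population is an illustrative choice): $\sqrt{n}(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$ by the CLT, the sample s.d. $S_n \xrightarrow{P} \sigma$, so their ratio converges in distribution to $N(0, 1)$ even though the denominator is random.

```python
import numpy as np

rng = np.random.default_rng(5)

# Slutsky: sqrt(n)*(Xbar - mu) -> N(0, sigma^2) by the CLT, and the sample
# s.d. S_n -> sigma in probability, so the self-normalized statistic
# T_n = sqrt(n)*(Xbar - mu)/S_n -> N(0,1) in distribution.
mu, n, reps = 1.0, 500, 20_000
X = rng.exponential(mu, size=(reps, n))  # for Exponential(mu), sigma = mu
T = np.sqrt(n) * (X.mean(axis=1) - mu) / X.std(axis=1, ddof=1)
print("P(T <= 1.96) ≈", round((T <= 1.96).mean(), 4), "  (N(0,1) gives 0.975)")
```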

Sum of IID Random Variables

A good example to illustrate different modes of convergence is the sum, or average, of iid random variables. Suppose $X_1, X_2, \ldots$ are iid with finite mean $\mu$ and variance $\sigma^2$. Let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. Then, the following theorems claim that $\bar{X}_n$ converges to $\mu$:

  • (Weak Law of Large Numbers) $\bar{X}_n \xrightarrow{P} \mu$.
  • (Strong Law of Large Numbers) $\bar{X}_n \xrightarrow{a.s.} \mu$.
  • (Convergence in mean square) $\bar{X}_n \xrightarrow{L^2} \mu$, since $E\left[(\bar{X}_n - \mu)^2\right] = \sigma^2 / n \to 0$.
  • (Central Limit Theorem) $\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$, characterizing the fluctuation of $\bar{X}_n$ around $\mu$.
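
A minimal numpy sketch of both phenomena for Uniform$(0,1)$ samples (an illustrative choice, with $\mu = 1/2$ and $\sigma^2 = 1/12$): the mean absolute error of $\bar{X}_n$ shrinks (LLN), while the rescaled error $\sqrt{n}(\bar{X}_n - \mu)$ keeps a fixed spread close to $\sigma$ (CLT).

```python
import numpy as np

rng = np.random.default_rng(6)

# Sample means of iid Uniform(0,1) draws (mu = 1/2, sigma^2 = 1/12):
# the LLNs say Xbar_n concentrates at mu, while the CLT says the rescaled
# error sqrt(n)*(Xbar_n - mu) keeps a fixed spread, approaching N(0, sigma^2).
sigma = np.sqrt(1 / 12)
for n in [10, 100, 10_000]:
    Xbar = rng.random((2_000, n)).mean(axis=1)
    print(f"n={n:6d}: E|Xbar_n - 0.5| ≈ {np.abs(Xbar - 0.5).mean():.5f}, "
          f"std of sqrt(n)*(Xbar_n - 0.5) ≈ {(np.sqrt(n) * (Xbar - 0.5)).std():.4f}"
          f"  (sigma ≈ {sigma:.4f})")
```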

Footnotes

  1. The function $g$ is applied on $X_n$ and $X$, and is continuous on the range of $X$ (more precisely, on a set that $X$ lies in with probability 1).

  2. For the division operation to be well-defined, the denominator sequence $\{Y_n\}$ and its limit must be non-zero (almost surely).

  3. A continuity set $A$ of $X$ has a zero-measure boundary: $P(X \in \partial A) = 0$.