High-Dimensional Probability
High-dimensional probability theory studies a high-dimensional random vector $X = (X_1, \dots, X_n) \in \mathbb{R}^n$ and its transformations (high-dimensional functions). Without any additional structure, there is nothing special about this random vector. In high-dimensional probability, we always assume $X$ has independent (or weakly dependent) coordinates, i.e., $X = (X_1, \dots, X_n)$ where $X_1, \dots, X_n$ are independent random variables. Suppose $X_1, \dots, X_n$ are i.i.d.; the very first result we have concerns the variance of their sum:
$$\operatorname{Var}\Big(\sum_{i=1}^n X_i\Big) = \sum_{i=1}^n \operatorname{Var}(X_i) = n \operatorname{Var}(X_1).$$
Such a factorization into the dimension $n$ and a one-dimensional property is called tensorization, a key technique in high-dimensional probability.
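A quick numerical sanity check of this identity (a minimal sketch; the Uniform(0, 1) coordinate distribution and numpy are my choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 100, 200_000

# n i.i.d. Uniform(0, 1) coordinates per trial; Var(X_1) = 1/12.
X = rng.uniform(0.0, 1.0, size=(trials, n))
S = X.sum(axis=1)

print(np.var(S))   # empirical Var(sum), close to 100/12 ≈ 8.33
print(n / 12)      # tensorized prediction: n * Var(X_1)
```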
For the transformations of such high-dimensional random vectors, high-dimensional probability theory studies high-dimensional functions of the form
$$f : \mathbb{R}^n \to \mathbb{R}, \qquad f(X) = f(X_1, \dots, X_n).$$
Provided that $f$ is “smooth” enough, we expect Concentration of Measure, i.e., $f(X) \approx \mathbb{E} f(X)$ with high probability. Provided that $f$ is not too “complex”, we can express it as a Suprema of Stochastic Processes, and bound its expectation by its “complexity”.
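To make the first phenomenon concrete (a minimal sketch, not from the notes: I take $f = \|\cdot\|_2$, which is 1-Lipschitz, and $X \sim N(0, I_n)$):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 1000, 10_000

# f(X) = ||X||_2 is 1-Lipschitz. For X ~ N(0, I_n), f(X) concentrates
# around its mean ≈ sqrt(n), with fluctuations of constant order.
X = rng.standard_normal((trials, n))
norms = np.linalg.norm(X, axis=1)

print(norms.mean())   # ≈ sqrt(1000) ≈ 31.6
print(norms.std())    # ≈ 1/sqrt(2) ≈ 0.71, does not grow with n
```

So although $X$ itself spreads over a region of diameter of order $\sqrt{n}$, the smooth function $f(X)$ is nearly deterministic.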
Applications
- Statistical learning
- Compressed sensing
- Random matrices
- Covariance matrices
- Random graphs
- Sampling
- Optimal transport
- Gaussian approximation

Concentration of Measure
Consider the simplest form: $f(X) = \frac{1}{n} \sum_{i=1}^n X_i$, the sample mean. If $X_1, \dots, X_n$ are i.i.d. with a finite mean, then by the strong Law of Large Numbers, $\frac{1}{n} \sum_{i=1}^n X_i \to \mathbb{E} X_1$ almost surely. We ask:
Qn
- What about general functions $f$?
- How fast is this convergence? (non-asymptotic)
 
Informal Principle
If $X = (X_1, \dots, X_n)$ has i.i.d. coordinates, then $f(X) \approx \mathbb{E} f(X)$, provided that $f$ is “smooth” enough, i.e., $f$ does not depend too heavily on any single coordinate.
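Before turning to the tools below, a quick empirical look at the rate (a minimal sketch; the Bernoulli(1/2) coordinates are my choice, not from the notes): the fluctuations of the sample mean shrink like $1/\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(2)
trials = 20_000

# Std of the sample mean of n Bernoulli(1/2) variables is 0.5 / sqrt(n):
# quadrupling n halves the fluctuations.
for n in (100, 400, 1600):
    means = rng.integers(0, 2, size=(trials, n), dtype=np.uint8).mean(axis=1)
    print(n, means.std(), 0.5 / np.sqrt(n))
```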
- Chebyshev Inequality
- Hoeffding’s Inequality
- Poincaré Inequality
- Chernoff Bound
- Sobolev Inequality
- Transportation Inequality
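These tools differ dramatically in how fast their tail bounds decay. A sketch comparing only two of them (the i.i.d. Uniform(0, 1) coordinates, so that $\operatorname{Var}(X_1) = 1/12$ and Hoeffding’s Inequality applies on $[0, 1]$, are my assumptions):

```python
import numpy as np

# Bounds on P(|mean of n i.i.d. Uniform(0,1) - 1/2| >= t):
# Chebyshev decays polynomially in n, Hoeffding exponentially.
t = 0.1
for n in (10, 100, 1000, 10_000):
    chebyshev = (1 / 12) / (n * t**2)       # Var(X_1) / (n t^2)
    hoeffding = 2 * np.exp(-2 * n * t**2)   # valid for coordinates in [0, 1]
    print(n, chebyshev, hoeffding)
```

For small $n$, Chebyshev’s polynomial bound can be the sharper one, but Hoeffding’s exponential decay dominates as $n$ grows.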
 
Suprema of Stochastic Processes
Qn
How large is $\mathbb{E} \sup_{t \in T} X_t$ for a stochastic process $(X_t)_{t \in T}$?
Ex
- $L^2$ error: for an estimator $\hat{\theta}$ of $\theta$, $\|\hat{\theta} - \theta\|_2 = \sup_{\|u\|_2 \le 1} \langle u, \hat{\theta} - \theta \rangle$.
- Convex conjugate: $f^*(y) = \sup_{x} \big[ \langle x, y \rangle - f(x) \big]$ (both examples are checked numerically in the sketch below).
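A minimal numerical sketch of both suprema (the symbols $z$, $u$ and the quadratic $f(x) = x^2/2$ are illustrative choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(4)

# (1) The L2 norm as a supremum of a linear process over the unit ball:
#     ||z||_2 = sup_{||u|| <= 1} <u, z>, attained at u = z / ||z||.
z = rng.standard_normal(5)
u_star = z / np.linalg.norm(z)
print(np.linalg.norm(z), u_star @ z)     # the two values agree

# (2) Convex conjugate f*(y) = sup_x [<x, y> - f(x)] for f(x) = x^2 / 2:
#     the supremum is attained at x = y, giving f*(y) = y^2 / 2.
y = 1.7
xs = np.linspace(-10.0, 10.0, 100_001)
print(np.max(xs * y - xs**2 / 2), y**2 / 2)
```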
 
The prevalence of suprema is related to the variational principle, which transforms the original problem into an Optimization problem, with the original solution corresponding to a supremum.
Informal Principle
If the process $(X_t)_{t \in T}$ is “smooth”, then $\mathbb{E} \sup_{t \in T} X_t$ can be controlled by the complexity of the index set $T$.
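A canonical instance of this principle (a sketch with details assumed by me: take $T = \{1, \dots, n\}$ and $X_t = g_t$ with i.i.d. standard Gaussians $g_t$; then $\mathbb{E} \max_{t \le n} g_t \le \sqrt{2 \log n}$):

```python
import numpy as np

rng = np.random.default_rng(5)
trials = 10_000

# E max of n i.i.d. N(0, 1) variables vs. the sqrt(2 log n) complexity bound.
for n in (10, 100, 1000):
    g = rng.standard_normal((trials, n))
    print(n, g.max(axis=1).mean(), np.sqrt(2 * np.log(n)))
```

The expected supremum grows only logarithmically in the number of index points: the simplest manifestation of “complexity controls the supremum”.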