High-Dimensional Probability
High-dimensional probability theory studies a high-dim random vector $X = (X_1, \dots, X_n) \in \mathbb{R}^n$ and its transformations (high-dim functions $f(X)$). Without any additional structure, there is nothing special about this random vector. In high-dim probability, we always assume $X$ has independent (or weakly dependent) coordinates, i.e., $X = (X_1, \dots, X_n)$ where $X_1, \dots, X_n$ are independent random variables. Suppose $X_1, \dots, X_n$ are i.i.d.; the very first result we have concerns the variance of their sum:
$$\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = n \operatorname{Var}(X_1).$$
Such a factorization into the dimension and a one-dimensional property is called tensorization, a key technique in high-dimensional probability.
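As a quick sanity check, here is a minimal simulation (Python/NumPy, an added sketch not in the original notes) verifying the factorization for i.i.d. uniform coordinates, where $\operatorname{Var}(X_1) = 1/12$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 100, 100_000

# i.i.d. coordinates X_1, ..., X_n ~ Uniform(0, 1), so Var(X_1) = 1/12
X = rng.uniform(0.0, 1.0, size=(trials, n))
sums = X.sum(axis=1)

print(sums.var())  # empirical Var(X_1 + ... + X_n)
print(n / 12)      # tensorized prediction: n * Var(X_1) ≈ 8.33
```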
For the transformation of such high-dim random vectors, high-dimensional probability theory studies high-dim functions of the form $f(X) = f(X_1, \dots, X_n)$.
Provided that $f$ is “smooth” enough, we expect Concentration of Measure, i.e., $f(X) \approx \mathbb{E}\, f(X)$ with high probability. Provided that $f$ is not too “complex”, we can express it as a supremum of a stochastic process and bound its expectation by its “complexity”.
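As a concrete illustration (an added sketch, not part of the original notes): the Euclidean norm $f(X) = \|X\|_2$ is $1$-Lipschitz, and for a standard Gaussian vector $X \in \mathbb{R}^n$ it concentrates around $\sqrt{n}$ with fluctuations of constant order, regardless of $n$:

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (100, 1_000, 10_000):
    X = rng.standard_normal(size=(1_000, n))
    norms = np.linalg.norm(X, axis=1)
    # the mean grows like sqrt(n), while the std stays O(1)
    print(n, norms.mean().round(2), norms.std().round(2), np.sqrt(n).round(2))
```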
Applications
- Statistical learning
- Compressed sensing
- Random matrices
- Covariance matrix estimation
- Random graphs
- Sampling
- Optimal transport
- Gaussian approximation
Concentration of Measure
Consider the simplest form: $f(X) = \frac{1}{n}\sum_{i=1}^{n} X_i$. If $X_1, \dots, X_n$ are i.i.d. with a finite mean, then by the strong Law of Large Numbers, $\frac{1}{n}\sum_{i=1}^{n} X_i \to \mathbb{E}[X_1]$ almost surely. We ask:
Qn
- How about general functions?
- How fast is this convergence? (non-asymptotic)
Informal Principle
If $X_1, \dots, X_n$ are i.i.d., then $f(X_1, \dots, X_n) \approx \mathbb{E}\, f(X_1, \dots, X_n)$, provided that $f$ is “smooth” enough, i.e., $f$ does not depend too heavily on any of its coordinates. The classical tools below make this quantitative; a small simulation follows the list.
- Chebyshev's Inequality
- Hoeffding's Inequality
- Poincaré Inequality
- Chernoff Bound
- Log-Sobolev Inequality
- Transportation Inequality
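For a taste of the non-asymptotic flavor, here is a minimal sketch (Python/NumPy, not part of the original notes) comparing the empirical tail of the sample mean of i.i.d. $[0,1]$-valued coordinates with Hoeffding's bound $\mathbb{P}\left(\left|\frac{1}{n}\sum_{i} X_i - \mathbb{E} X_1\right| \ge t\right) \le 2e^{-2nt^2}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, t = 200, 50_000, 0.1

# i.i.d. coordinates bounded in [0, 1]; the mean of Uniform(0, 1) is 0.5
means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
empirical_tail = np.mean(np.abs(means - 0.5) >= t)

# Hoeffding: P(|mean - E X_1| >= t) <= 2 exp(-2 n t^2)
hoeffding_bound = 2 * np.exp(-2 * n * t**2)
print(empirical_tail, hoeffding_bound)  # empirical tail is far below the bound
```

The bound holds for every finite $n$, which is exactly the non-asymptotic guarantee asked for above; it is far from tight here, since Hoeffding's inequality uses only the boundedness of the coordinates.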
Suprema of Stochastic Processes
Qn
How large is $\mathbb{E}\, \sup_{t \in T} X_t$ for a stochastic process $(X_t)_{t \in T}$?
Ex
- $L^2$ error: $\|\hat{\theta} - \theta\|_2 = \sup_{\|u\|_2 \le 1} \langle u, \hat{\theta} - \theta \rangle$ for an estimator $\hat{\theta}$.
- Convex conjugate: $f^*(y) = \sup_{x} \left\{ \langle x, y \rangle - f(x) \right\}$.
The prevalence of suprema is related to the variational principle, which transforms the original problem into an Optimization problem, with the original solution corresponding to a supremum.
Informal Principle
If the process $(X_t)_{t \in T}$ is “smooth”, then $\mathbb{E}\, \sup_{t \in T} X_t$ can be controlled by the complexity of the index set $T$.
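As the simplest instance (an added sketch, not in the original notes), take a finite index set $T$ with independent standard Gaussian values $X_t$: then $\mathbb{E}\, \sup_{t \in T} X_t \le \sqrt{2 \log |T|}$, with the ratio tending to $1$ as $|T|$ grows, so the “complexity” here is the log-cardinality of $T$:

```python
import numpy as np

rng = np.random.default_rng(0)
for size in (10, 1_000, 100_000):
    # process values X_t: independent standard Gaussians indexed by a finite T
    sups = rng.standard_normal(size=(200, size)).max(axis=1)
    # E sup_t X_t is bounded by sqrt(2 log |T|) and approaches it as |T| grows
    print(size, sups.mean().round(2), np.sqrt(2 * np.log(size)).round(2))
```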