An important property of a sketching matrix is that it preserves the geometry of a subspace.
This is quantified by the following definition:
Let V V V be a subspace of R n \R^n R n . We say a matrix S \vec{S} S is a subspace embedding for V V V with distortion ε ∈ ( 0 , 1 ) \varepsilon\in(0,1) ε ∈ ( 0 , 1 ) if
∀ x ∈ V : ( 1 − ε ) ∥ x ∥ ≤ ∥ S x ∥ ≤ ( 1 + ε ) ∥ x ∥ . \forall \vec{x}\in V: (1-\varepsilon)\|\vec{x}\| \leq \|\vec{S}\vec{x}\| \leq (1+\varepsilon)\|\vec{x}\|. ∀ x ∈ V : ( 1 − ε ) ∥ x ∥ ≤ ∥ Sx ∥ ≤ ( 1 + ε ) ∥ x ∥. When V = range ( A ) V = \range(\vec{A}) V = range ( A ) for some matrix A \vec{A} A , we say that S \vec{S} S is a subspace embedding for A \vec{A} A .
In particular, a subspace embedding ensures that the lengths of all vectors within a subspace are preserved by the sketch.
Equivalent Characterizations ¶ It will sometimes be useful to work with the following equivalent characterizations of subspace embeddings:
Proof
Suppose S \vec{S} S is a subspace embedding for V V V .
Any x ∈ V \vec{x} \in V x ∈ V can be written as x = V c \vec{x} = \vec{V}\vec{c} x = Vc for some c ∈ R d \vec{c} \in \R^d c ∈ R d .
Thus, the subspace embedding condition can be rewritten as (this is true even if V \vec{V} V isn’t orthonormal)
∀ c ∈ R d : ( 1 − ε ) ∥ V c ∥ ≤ ∥ S V c ∥ ≤ ( 1 + ε ) ∥ V c ∥ . \forall \vec{c}\in \R^d: (1-\varepsilon)\|\vec{V}\vec{c}\| \leq \|\vec{S}\vec{V}\vec{c}\| \leq (1+\varepsilon)\|\vec{V}\vec{c}\|. ∀ c ∈ R d : ( 1 − ε ) ∥ Vc ∥ ≤ ∥ SVc ∥ ≤ ( 1 + ε ) ∥ Vc ∥. Since V \vec{V} V is orthonormal, ∥ V c ∥ = ∥ c ∥ \|\vec{V}\vec{c}\| = \|\vec{c}\| ∥ Vc ∥ = ∥ c ∥ and hence we have the equivalent condition
∀ c ∈ R d : ( 1 − ε ) ∥ c ∥ ≤ ∥ S V c ∥ ≤ ( 1 + ε ) ∥ c ∥ . \forall \vec{c}\in \R^d: (1-\varepsilon)\|\vec{c}\| \leq \|\vec{S}\vec{V}\vec{c}\| \leq (1+\varepsilon)\|\vec{c}\|. ∀ c ∈ R d : ( 1 − ε ) ∥ c ∥ ≤ ∥ SVc ∥ ≤ ( 1 + ε ) ∥ c ∥. We can rewrite this as
σ max ( S V ) = max c ∈ R d ∥ S V c ∥ ∥ c ∥ ≤ 1 + ε , σ min ( S V ) = min c ∈ R d ∥ S V c ∥ ∥ c ∥ ≥ 1 − ε . \smax(\vec{S}\vec{V}) = \max_{\vec{c}\in \R^d} \frac{\|\vec{S}\vec{V}\vec{c}\|}{\|\vec{c}\|} \leq 1+\varepsilon
,\qquad
\smin(\vec{S}\vec{V}) = \min_{\vec{c}\in \R^d} \frac{\|\vec{S}\vec{V}\vec{c}\|}{\|\vec{c}\|} \geq 1-\varepsilon. σ max ( SV ) = c ∈ R d max ∥ c ∥ ∥ SVc ∥ ≤ 1 + ε , σ min ( SV ) = c ∈ R d min ∥ c ∥ ∥ SVc ∥ ≥ 1 − ε . Finally, note that
∥ V T S T S V − I ∥ 2 = max i ∣ λ i ( V T S T S V ) − 1 ∣ ∣ = max i ∣ σ max ( S V ) 2 − 1 ∣ = max { ( 1 + ε ) 2 − 1 , 1 − ( 1 − ε ) 2 } = ε ( 2 + ε ) . \begin{aligned}
\| \vec{V}^\T \vec{S}^\T \vec{S} \vec{V} - \vec{I} \|_2
&= \max_{i} |\lambda_i(\vec{V}^\T \vec{S}^\T \vec{S} \vec{V}) - 1||
\\&= \max_{i} |\smax(\vec{S}\vec{V})^2 - 1|
\\&= \max\{ (1+\varepsilon)^2 - 1, 1 - (1-\varepsilon)^2 \}
\\&= \varepsilon(2+\varepsilon).
\end{aligned} ∥ V T S T SV − I ∥ 2 = i max ∣ λ i ( V T S T SV ) − 1∣∣ = i max ∣ σ max ( SV ) 2 − 1∣ = max {( 1 + ε ) 2 − 1 , 1 − ( 1 − ε ) 2 } = ε ( 2 + ε ) . In TCS, where the rates as ε → 0 \varepsilon\to0 ε → 0 are important, the set of “equivalent” definitions is larger.
For instance, the condition on the spectral norm is often written as ∥ V T S T S V − I ∥ 2 ≤ ε \| \vec{V}^\T \vec{S}^\T \vec{S} \vec{V} - \vec{I} \|_2 \leq \varepsilon ∥ V T S T SV − I ∥ 2 ≤ ε .
Similarly, in some cases, a subspace embedding is defined with respect to the squared norms, since 1 + ε = 1 + ε / 2 + O ( ε 2 ) \sqrt{1+\varepsilon} = 1 + \varepsilon/2 + O(\varepsilon^2) 1 + ε = 1 + ε /2 + O ( ε 2 ) as ε → 0 \varepsilon \to 0 ε → 0 .