An important property of a sketching matrix is that it preserves the geometry of a subspace.
This is quantified by the following definition:
Let $V$ be a subspace of $\R^n$. We say a matrix $\vec{S}$ is a subspace embedding for $V$ with distortion parameters $(\alpha, \beta)$, where $0 < \alpha \leq 1 \leq \beta$, if
$$
\forall \vec{x}\in V: \quad \alpha\|\vec{x}\| \leq \|\vec{S}\vec{x}\| \leq \beta\|\vec{x}\|.
$$
When $V = \range(\vec{A})$ for some matrix $\vec{A}$, we say that $\vec{S}$ is a subspace embedding for $\vec{A}$.
In particular, a subspace embedding ensures that the lengths of all vectors within a subspace are approximately preserved by the sketch.
We note that it is common to use the parameterization
$$
\alpha = 1-\varepsilon, \qquad \beta = 1+\varepsilon.
$$
In this case, it is common to say that $\vec{S}$ is an $\varepsilon$-subspace embedding.
Equivalent Characterizations

It will sometimes be useful to work with the following equivalent characterizations of subspace embeddings:
Proof
Suppose $\vec{S}$ is a subspace embedding for $V$, and let $\vec{V} \in \R^{n\times d}$ be a matrix whose columns form an orthonormal basis for $V$.
Any $\vec{x} \in V$ can be written as $\vec{x} = \vec{V}\vec{c}$ for some $\vec{c} \in \R^d$.
Thus, the subspace embedding condition can be rewritten as
$$
\forall \vec{c}\in \R^d: \quad \alpha\|\vec{V}\vec{c}\| \leq \|\vec{S}\vec{V}\vec{c}\| \leq \beta\|\vec{V}\vec{c}\|.
$$
(This step holds even if $\vec{V}$ is not orthonormal.) Since $\vec{V}$ is orthonormal, $\|\vec{V}\vec{c}\| = \|\vec{c}\|$, and hence we have the equivalent condition
$$
\forall \vec{c}\in \R^d: \quad \alpha\|\vec{c}\| \leq \|\vec{S}\vec{V}\vec{c}\| \leq \beta\|\vec{c}\|.
$$
We can rewrite this as
$$
\smax(\vec{S}\vec{V}) = \max_{\vec{c}\neq\vec{0}} \frac{\|\vec{S}\vec{V}\vec{c}\|}{\|\vec{c}\|} \leq \beta,
\qquad
\smin(\vec{S}\vec{V}) = \min_{\vec{c}\neq\vec{0}} \frac{\|\vec{S}\vec{V}\vec{c}\|}{\|\vec{c}\|} \geq \alpha.
$$
Finally, note that
$$
\begin{aligned}
\| \vec{V}^\T \vec{S}^\T \vec{S} \vec{V} - \vec{I} \|_2
&= \max_{i} |\lambda_i(\vec{V}^\T \vec{S}^\T \vec{S} \vec{V}) - 1|
\\&= \max_{i} |\sigma_i(\vec{S}\vec{V})^2 - 1|
\\&\leq \max\{ \beta^2 - 1, 1 - \alpha^2 \},
\end{aligned}
$$
where the last inequality uses the fact that every singular value of $\vec{S}\vec{V}$ lies in $[\alpha, \beta]$.

In TCS, the rates as $\varepsilon\to0$ (with $\alpha=1-\varepsilon$, $\beta=1+\varepsilon$) are important.
In this case, the set of “equivalent” definitions is larger.
For instance, the condition on the spectral norm simplifies to $\| \vec{V}^\T \vec{S}^\T \vec{S} \vec{V} - \vec{I} \|_2 \leq \varepsilon(2+\varepsilon)$, which, since $\varepsilon(2+\varepsilon) = O(\varepsilon)$, is often written simply as $\| \vec{V}^\T \vec{S}^\T \vec{S} \vec{V} - \vec{I} \|_2 \leq \varepsilon$.
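To see where the $\varepsilon(2+\varepsilon)$ comes from, substitute $\alpha = 1-\varepsilon$ and $\beta = 1+\varepsilon$ into the bound $\max\{\beta^2 - 1, 1 - \alpha^2\}$:
$$
\max\{(1+\varepsilon)^2 - 1,\; 1 - (1-\varepsilon)^2\}
= \max\{2\varepsilon + \varepsilon^2,\; 2\varepsilon - \varepsilon^2\}
= \varepsilon(2 + \varepsilon).
$$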
Similarly, in some cases, a subspace embedding is defined with respect to the squared norms, since $\sqrt{1+\varepsilon} = 1 + \varepsilon/2 + O(\varepsilon^2)$ as $\varepsilon \to 0$.
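The characterizations above can also be sanity-checked numerically. The snippet below is a sketch under the same assumptions as before (a Gaussian $\vec{S}$ and a random orthonormal $\vec{V}$, neither prescribed by the text): it reads off the tightest $(\alpha, \beta)$ from the extreme singular values of $\vec{S}\vec{V}$ and confirms that the spectral-norm quantity $\| \vec{V}^\T \vec{S}^\T \vec{S} \vec{V} - \vec{I} \|_2$ obeys the bound $\max\{\beta^2-1, 1-\alpha^2\}$ from the proof.

```python
import numpy as np

rng = np.random.default_rng(1)

n, d, m = 500, 4, 100
V, _ = np.linalg.qr(rng.standard_normal((n, d)))  # orthonormal basis
S = rng.standard_normal((m, n)) / np.sqrt(m)      # Gaussian sketch

# The extreme singular values of S V give the tightest (alpha, beta)
# for which S is an (alpha, beta)-subspace embedding of range(V).
sv = np.linalg.svd(S @ V, compute_uv=False)
alpha, beta = sv.min(), sv.max()

# Spectral-norm characterization: the eigenvalues of V^T S^T S V are
# exactly sigma_i(S V)^2, so the norm equals max_i |sigma_i^2 - 1|.
G = V.T @ S.T @ S @ V
lhs = np.linalg.norm(G - np.eye(d), ord=2)
rhs = max(beta**2 - 1, 1 - alpha**2)
assert lhs <= rhs + 1e-12
```

With $\alpha$ and $\beta$ taken to be the extreme singular values themselves, the inequality holds with equality (up to floating-point error), since the maximum of $|\sigma_i^2 - 1|$ is attained at $\smin(\vec{S}\vec{V})$ or $\smax(\vec{S}\vec{V})$.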