
Here $\Call{leverage-dist}(\vec{A})$ is the distribution on $\{1,\ldots,n\}$ that samples each index $i$ with probability proportional to the leverage score $\ell_i$ of the $i$th row of $\vec{A}$.

Note that applying the leverage score sampling matrix amounts to extracting rows, and therefore $\vec{S}$ can be applied to matrices and vectors without reading the whole input.
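To make this concrete, here is a minimal NumPy sketch of leverage score sampling: the scores are computed from an orthonormal basis for the range of $\vec{A}$, and applying $\vec{S}$ just extracts the sampled rows (with the standard $1/\sqrt{k p_i}$ rescaling, which is assumed here rather than stated in the text above). All names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.standard_normal((n, d))

# Leverage score of row i: squared norm of row i of an orthonormal
# basis V for the column span of A (computed here via a reduced QR).
V, _ = np.linalg.qr(A)
lev = np.sum(V**2, axis=1)        # leverage scores; they sum to rank(A) = d
p = lev / lev.sum()               # leverage-dist(A)

# Applying S amounts to extracting k sampled rows (rescaled by
# 1/sqrt(k p_i)), so S is never formed explicitly.
k = 200
idx = rng.choice(n, size=k, p=p)
SA = A[idx] / np.sqrt(k * p[idx])[:, None]
```

Note that `S @ A` never touches the unsampled rows of `A`, which is exactly why the sketch can be applied without reading the whole input.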

Subspace Embedding

Leverage score sampling gives a subspace embedding:

A nice proof of this can be found on Raphael’s wiki.
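The subspace embedding property can also be observed numerically: for an orthonormal basis $\vec{V}$ of $\range(\vec{A})$, all singular values of $\vec{S}\vec{V}$ should concentrate near one, so that $\|\vec{S}\vec{A}\vec{x}\| \approx \|\vec{A}\vec{x}\|$ simultaneously for every $\vec{x}$. A minimal sketch, again assuming the $1/\sqrt{k p_i}$ rescaling; the sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 2000, 5, 1000
A = rng.standard_normal((n, d))

# Orthonormal basis V for range(A) and the leverage score distribution.
V, _ = np.linalg.qr(A)
p = np.sum(V**2, axis=1) / d      # probabilities sum to 1 since sum(lev) = d

# Sample k rows i.i.d. from leverage-dist(A) and rescale.
idx = rng.choice(n, size=k, p=p)
SV = V[idx] / np.sqrt(k * p[idx])[:, None]

# S is a subspace embedding iff every singular value of S V is close
# to 1; the worst deviation measures the embedding distortion.
svals = np.linalg.svd(SV, compute_uv=False)
distortion = max(abs(svals[0] - 1.0), abs(svals[-1] - 1.0))
```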

Approximate Leverage Scores

Many of the facts about leverage-score sketching (e.g. Theorem 2.9) can be extended to arbitrary probability distributions. In this case, what matters is how close the sampling distribution is to the leverage score distribution.
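For instance, if every sampling probability is within a constant factor $\beta$ of the corresponding leverage score probability, one can still obtain a subspace embedding by oversampling by roughly $1/\beta$. The sketch below illustrates this with a mixture of the leverage score distribution and the uniform distribution; the mixture construction and the constants are illustrative choices, not from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 2000, 5
A = rng.standard_normal((n, d))

V, _ = np.linalg.qr(A)
p_lev = np.sum(V**2, axis=1) / d           # exact leverage score distribution

# A distribution that only approximates leverage-dist(A): mix it with
# uniform. Every q_i >= (1/2) p_lev_i, so sampling from q needs at most
# about a factor 2 more samples than exact leverage score sampling.
q = 0.5 * p_lev + 0.5 / n

k = 2000                                    # oversample to pay for the factor 2
idx = rng.choice(n, size=k, p=q)
SV = V[idx] / np.sqrt(k * q[idx])[:, None]  # importance-weight by the actual q
svals = np.linalg.svd(SV, compute_uv=False)
distortion = max(abs(svals[0] - 1.0), abs(svals[-1] - 1.0))
```

The key point is that the rescaling uses the probabilities `q` actually sampled from, not the exact leverage score probabilities.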

Row-norm sampling

As an example, consider row-norm sampling, where we sample proportional to $r_i := \| \vec{e}_i^\T \vec{A} \|^2$ for some matrix $\vec{A}$. Row-norms are a common way of measuring the “importance” of rows of a matrix.

Let $\vec{V}\vec{R} = \vec{A}$ be a QR decomposition of $\vec{A}$. Then

$$r_i = \| \vec{e}_i^\T \vec{A} \|^2 = \| \vec{e}_i^\T \vec{V} \vec{R} \|^2.$$

Then, since $\ell_i = \| \vec{e}_i^\T \vec{V} \|^2$, and since $\|\vec{R}\| = \smax(\vec{A})$ and $\|\vec{R}^{-1}\| = \smin(\vec{A})^{-1}$, we find that

$$\ell_i \, \smin(\vec{A})^2 \leq r_i \leq \ell_i \, \smax(\vec{A})^2.$$

This demonstrates that we can use row-norm sampling in place of leverage score sampling, at the cost of paying for the conditioning of the matrix $\vec{A}$.
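The two-sided bound above can be verified numerically. The following sketch (matrix and column scaling chosen for illustration, to make $\vec{A}$ moderately ill-conditioned) checks the inequality row by row:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 500, 8
# An ill-conditioned A: scale the columns of a Gaussian matrix.
A = rng.standard_normal((n, d)) * np.linspace(1.0, 10.0, d)

r = np.sum(A**2, axis=1)                  # row norms  r_i = ||e_i^T A||^2
V, _ = np.linalg.qr(A)
lev = np.sum(V**2, axis=1)                # leverage scores  ell_i
s = np.linalg.svd(A, compute_uv=False)
smax, smin = s[0], s[-1]

# Check  ell_i * smin(A)^2  <=  r_i  <=  ell_i * smax(A)^2  for each row.
ok = bool(np.all(lev * smin**2 <= r * (1 + 1e-9))
          and np.all(r <= lev * smax**2 * (1 + 1e-9)))
```

As the bound predicts, the gap between `lev * smin**2` and `lev * smax**2` (and hence the potential mismatch between row-norm and leverage score sampling) widens as the condition number of `A` grows.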

This highlights an important fact: row-norm sampling is basis dependent. On the other hand, many tasks where we might use row-norm sampling are basis independent, and instead depend only on the relevant subspace. In such cases, one should think twice about using row-norm sampling.