
Here $\Call{leverage-dist}(\vec{A})$ is the distribution on $\{1,\ldots,n\}$ that samples each index $i$ with probability proportional to the leverage score $\ell_i$ of the $i$th row of $\vec{A}$.

Note that applying the leverage score sampling matrix amounts to extracting rows, and therefore $\vec{S}$ can be applied to matrices and vectors without reading the whole input.
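To make this concrete, here is a minimal NumPy sketch of leverage score sampling: the scores are computed from an orthonormal basis for the range of $\vec{A}$, and applying $\vec{S}$ just extracts the sampled rows (with the standard $1/\sqrt{k p_i}$ rescaling, which is assumed here rather than stated in the text above). All names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.standard_normal((n, d))

# Leverage score of row i: squared norm of row i of an orthonormal
# basis V for the column span of A (computed here via a reduced QR).
V, _ = np.linalg.qr(A)
lev = np.sum(V**2, axis=1)        # leverage scores; they sum to rank(A) = d
p = lev / lev.sum()               # leverage-dist(A)

# Applying S amounts to extracting k sampled rows (rescaled by
# 1/sqrt(k p_i)), so S is never formed explicitly.
k = 200
idx = rng.choice(n, size=k, p=p)
SA = A[idx] / np.sqrt(k * p[idx])[:, None]
```

Note that `S @ A` never touches the unsampled rows of `A`, which is exactly why the sketch can be applied without reading the whole input.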

Subspace Embedding

Leverage score sampling gives a subspace embedding:

A nice proof of this can be found on Raphael’s wiki.
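The subspace embedding property can also be observed numerically: for an orthonormal basis $\vec{V}$ of $\range(\vec{A})$, all singular values of $\vec{S}\vec{V}$ should concentrate near one, so that $\|\vec{S}\vec{A}\vec{x}\| \approx \|\vec{A}\vec{x}\|$ simultaneously for every $\vec{x}$. A minimal sketch, again assuming the $1/\sqrt{k p_i}$ rescaling; the sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 2000, 5, 1000
A = rng.standard_normal((n, d))

# Orthonormal basis V for range(A) and the leverage score distribution.
V, _ = np.linalg.qr(A)
p = np.sum(V**2, axis=1) / d      # probabilities sum to 1 since sum(lev) = d

# Sample k rows i.i.d. from leverage-dist(A) and rescale.
idx = rng.choice(n, size=k, p=p)
SV = V[idx] / np.sqrt(k * p[idx])[:, None]

# S is a subspace embedding iff every singular value of S V is close
# to 1; the worst deviation measures the embedding distortion.
svals = np.linalg.svd(SV, compute_uv=False)
distortion = max(abs(svals[0] - 1.0), abs(svals[-1] - 1.0))
```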

Approximate Leverage Scores

Many of the facts about leverage-score sketching (e.g. Theorem 2.9) can be extended to arbitrary probability distributions. In this case, what matters is how close the sampling distribution is to the leverage score distribution.
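For instance, if every sampling probability is within a constant factor $\beta$ of the corresponding leverage score probability, one can still obtain a subspace embedding by oversampling by roughly $1/\beta$. The sketch below illustrates this with a mixture of the leverage score distribution and the uniform distribution; the mixture construction and the constants are illustrative choices, not from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 2000, 5
A = rng.standard_normal((n, d))

V, _ = np.linalg.qr(A)
p_lev = np.sum(V**2, axis=1) / d           # exact leverage score distribution

# A distribution that only approximates leverage-dist(A): mix it with
# uniform. Every q_i >= (1/2) p_lev_i, so sampling from q needs at most
# about a factor 2 more samples than exact leverage score sampling.
q = 0.5 * p_lev + 0.5 / n

k = 2000                                    # oversample to pay for the factor 2
idx = rng.choice(n, size=k, p=q)
SV = V[idx] / np.sqrt(k * q[idx])[:, None]  # importance-weight by the actual q
svals = np.linalg.svd(SV, compute_uv=False)
distortion = max(abs(svals[0] - 1.0), abs(svals[-1] - 1.0))
```

The key point is that the rescaling uses the probabilities `q` actually sampled from, not the exact leverage score probabilities.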

Row-norm sampling

As an example, consider row-norm sampling, where we sample proportional to $r_i := \| \vec{e}_i^\T \vec{A} \|^2$ for some matrix $\vec{A}$. Row-norms are a common way of measuring the “importance” of rows of a matrix.

Let $\vec{V}\vec{R} = \vec{A}$ be a QR decomposition of $\vec{A}$. Then

$$r_i = \| \vec{e}_i^\T \vec{A} \|^2 = \| \vec{e}_i^\T \vec{V} \vec{R} \|^2.$$

Then, since $\ell_i = \| \vec{e}_i^\T \vec{V} \|^2$, and since $\|\vec{R}\| = \smax(\vec{A})$ and $\|\vec{R}^{-1}\| = \smin(\vec{A})^{-1}$, we find that

$$\ell_i \, \smin(\vec{A})^2 \leq r_i \leq \ell_i \, \smax(\vec{A})^2.$$

This demonstrates that we can use row-norm sampling in place of leverage score sampling, at the cost of paying for the conditioning of the matrix $\vec{A}$.
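The two-sided bound above can be verified numerically. The following sketch (matrix and column scaling chosen for illustration, to make $\vec{A}$ moderately ill-conditioned) checks the inequality row by row:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 500, 8
# An ill-conditioned A: scale the columns of a Gaussian matrix.
A = rng.standard_normal((n, d)) * np.linspace(1.0, 10.0, d)

r = np.sum(A**2, axis=1)                  # row norms  r_i = ||e_i^T A||^2
V, _ = np.linalg.qr(A)
lev = np.sum(V**2, axis=1)                # leverage scores  ell_i
s = np.linalg.svd(A, compute_uv=False)
smax, smin = s[0], s[-1]

# Check  ell_i * smin(A)^2  <=  r_i  <=  ell_i * smax(A)^2  for each row.
ok = bool(np.all(lev * smin**2 <= r * (1 + 1e-9))
          and np.all(r <= lev * smax**2 * (1 + 1e-9)))
```

As the bound predicts, the gap between `lev * smin**2` and `lev * smax**2` (and hence the potential mismatch between row-norm and leverage score sampling) widens as the condition number of `A` grows.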

This highlights an important fact: row-norm sampling is basis dependent. On the other hand, many tasks where we might use row-norm sampling are basis independent, and instead depend only on the relevant subspace. In such cases, one should think twice about using row-norm sampling.