Here leverage-dist(A) is the distribution on {1,…,n} that samples each index i with probability proportional to the leverage score ℓ_i of the ith row of A.
Note that applying the leverage-score sampling matrix S amounts to extracting rows, so S can be applied to matrices and vectors without reading the whole input.
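As a rough illustration, the following NumPy sketch draws row indices from the leverage-score distribution and applies the resulting sampling matrix by indexing alone. The function names, the use of a thin QR factorization to compute the scores, and the 1/sqrt(k p_i) rescaling of sampled rows are assumptions made for this example (a common convention, not necessarily the exact definition used here). Note also that computing exact leverage scores does read all of A; it is only the application of S, once the indices are known, that touches just the sampled rows.

```python
import numpy as np

def leverage_scores(A):
    # Thin QR: the columns of Q form an orthonormal basis for range(A)
    # (assumes A has full column rank). The i-th leverage score is the
    # squared norm of the i-th row of Q.
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def draw_sample(probs, k, rng):
    # Draw k indices i.i.d. from probs. The 1/sqrt(k * p_i) weights are an
    # assumed convention that makes (SA)^T (SA) an unbiased estimate of A^T A.
    idx = rng.choice(len(probs), size=k, replace=True, p=probs)
    weights = 1.0 / np.sqrt(k * probs[idx])
    return idx, weights

def apply_sketch(idx, weights, X):
    # Applying S only reads the sampled rows (or entries) of X.
    X = np.asarray(X)
    if X.ndim == 1:
        return weights * X[idx]
    return weights[:, None] * X[idx]

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 5))
b = rng.standard_normal(1000)

ell = leverage_scores(A)
probs = ell / ell.sum()          # the leverage-score distribution
idx, w = draw_sample(probs, 200, rng)

SA = apply_sketch(idx, w, A)     # 200 x 5 sketch of A
Sb = apply_sketch(idx, w, b)     # length-200 sketch of b
```

The same `idx` and `w` are reused for both A and b, which is what one would want when sketching a least-squares problem.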
Many of the facts about leverage-score sketching (e.g. Theorem 2.9) can be extended to arbitrary probability distributions.
In this case, what matters is how close the sampling distribution is to the leverage score distribution.
As an example, consider row-norm sampling, where we sample proportional to r_i := ‖e_i^T A‖^2 for some matrix A.
Row-norms are a common way of measuring the “importance” of rows of a matrix.
This demonstrates that we can use row-norm sampling in place of leverage score sampling, at the cost of paying for the conditioning of the matrix A.
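One way to see where the conditioning enters, taking r_i to be the squared norm of the ith row (an assumption of this sketch): if A = QR with Q having orthonormal columns, then r_i = ‖e_i^T Q R‖^2 lies between σ_min(A)^2 ℓ_i and σ_max(A)^2 ℓ_i. After normalizing, the row-norm probabilities are therefore within a factor cond(A)^2 of the leverage-score probabilities. The snippet below checks these bounds numerically on a hypothetical ill-conditioned test matrix; it is an illustration, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 4
# Hypothetical test matrix: Gaussian entries with badly scaled columns.
A = rng.standard_normal((n, d)) * np.array([1.0, 1.0, 10.0, 100.0])

Q, _ = np.linalg.qr(A)
lev = np.sum(Q**2, axis=1)       # leverage scores (sum to d)
r = np.sum(A**2, axis=1)         # squared row norms (assumed definition of r_i)

sigma = np.linalg.svd(A, compute_uv=False)
smax, smin = sigma[0], sigma[-1]
kappa = smax / smin              # condition number of A

p_lev = lev / lev.sum()          # leverage-score distribution
p_row = r / r.sum()              # row-norm distribution
ratio = p_lev / p_row

print("cond(A)^2          :", kappa**2)
print("max p_lev / p_row  :", ratio.max())
print("min p_lev / p_row  :", ratio.min())
```

The printed ratios stay inside the interval [1/cond(A)^2, cond(A)^2]. For a well-conditioned A this interval is narrow and the two distributions nearly coincide.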
This highlights an important fact: row-norm sampling is basis dependent.
On the other hand, many tasks where we might use row-norm sampling are basis independent, and instead depend only on the relevant subspace.
In such cases, one should think twice about using row-norm sampling.
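To make the basis dependence concrete, the following sketch compares A with B = AM for an invertible M, so that A and B have the same column space. The matrix M, the helper names, and the use of squared row norms are choices made for this example. The leverage-score distribution is unchanged up to rounding, while the row-norm distribution moves substantially.

```python
import numpy as np

def leverage_probs(A):
    # Leverage scores depend only on range(A): they are the squared row
    # norms of any orthonormal basis Q for the column space.
    Q, _ = np.linalg.qr(A)
    lev = np.sum(Q**2, axis=1)
    return lev / lev.sum()

def row_norm_probs(A):
    # Squared row norms depend on the particular basis (the columns of A).
    r = np.sum(A**2, axis=1)
    return r / r.sum()

rng = np.random.default_rng(2)
A = rng.standard_normal((300, 3))
M = np.diag([1.0, 10.0, 1000.0])   # invertible change of basis (illustrative)
B = A @ M                          # same column space as A

lev_change = np.max(np.abs(leverage_probs(A) - leverage_probs(B)))
row_change = np.max(np.abs(row_norm_probs(A) - row_norm_probs(B)))

print("max change in leverage-score probabilities:", lev_change)
print("max change in row-norm probabilities      :", row_change)
```

Here B is far worse conditioned than A, so by the previous discussion row-norm sampling on B can be a much poorer substitute for leverage score sampling than row-norm sampling on A, even though any subspace-dependent task is identical for the two matrices.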