
The Randomized SVD (RSVD) and its refinements require multiple passes over $\vec{A}$. In some cases, it may be advantageous to use a method that requires only a single pass over $\vec{A}$.

## Nyström for positive definite matrices

When $\vec{A}$ is symmetric positive semi-definite, we can use the Nyström approximation generated by a sketching matrix $\vec{\Omega}\in\R^{n\times k}$:

$$\vec{A} \langle \vec{\Omega}\rangle := \vec{A}\vec{\Omega} ( \vec{\Omega}^\T\vec{A}\vec{\Omega})^+ \vec{\Omega}^\T\vec{A}.$$

When $\vec{\Omega}$ is a Gaussian sketching matrix, this method satisfies similar theoretical guarantees to the Randomized SVD (with respect to $k$), despite requiring half the number of matrix-vector products and only one pass over the data; see e.g. Corollary 8.2 in Tropp & Webber, 2023.
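The following is a minimal NumPy sketch of this approximation, assuming $\vec{A}$ is stored as a dense array; the function name is made up for illustration, and the pseudoinverse is applied directly for brevity (a more careful implementation would handle the core inverse more stably).

```python
import numpy as np

def nystrom_psd(A, Omega):
    """Nystrom approximation A<Omega> = (A Omega)(Omega^T A Omega)^+ (A Omega)^T
    of a symmetric positive semidefinite matrix A."""
    Y = A @ Omega                       # the only product with A (single pass)
    core = Omega.T @ Y                  # Omega^T A Omega, a small k x k matrix
    return Y @ np.linalg.pinv(core) @ Y.T

# usage with a Gaussian sketching matrix
rng = np.random.default_rng(0)
n, k = 200, 10
G = rng.standard_normal((n, n))
A = G @ G.T                             # random PSD test matrix
A_hat = nystrom_psd(A, rng.standard_normal((n, k)))
```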

## Block Krylov methods

In fact, at the cost of more passes over the data, we can replace $\vec{\Omega}$ with a basis for a Krylov subspace. As noted in Lemma 5.2 of Tropp & Webber, 2023, the Nyström method always produces a low-rank approximation that is at least as good as that of one-sided projection-based methods such as Randomized Block Krylov Iteration.
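As a rough illustration of this variant, the sketch below builds a block Krylov basis and reuses the `nystrom_psd` helper from above; the depth parameter `q`, the single QR on the stacked blocks, and the lack of reorthogonalization between powers are simplifying assumptions, not the implementation analyzed by Tropp & Webber, 2023.

```python
import numpy as np

def krylov_nystrom_psd(A, k, q, rng=None):
    """Nystrom approximation in which the sketch is replaced by an orthonormal
    basis Q of the block Krylov space span[X, A X, ..., A^q X]."""
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((A.shape[0], k))
    blocks = [X]
    for _ in range(q):
        X = A @ X                       # each power costs one more pass over A
        blocks.append(X)
    Q, _ = np.linalg.qr(np.hstack(blocks))
    return nystrom_psd(A, Q)            # reuse the single-pass routine above
```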

## Generalized Nyström for arbitrary matrices

A similar approach can be used for arbitrary $\vec{A}$. Given sketching matrices $\vec{\Omega}\in\R^{d\times k_1}$ and $\vec{\Psi}\in\R^{n\times k_2}$, the Generalized Nyström approximation is

$$\vec{A}\langle \vec{\Omega},\vec{\Psi}\rangle := \vec{A}\vec{\Omega} (\vec{\Psi}^\T\vec{A}\vec{\Omega})^+ \vec{\Psi}^\T\vec{A}.$$

We can understand the Generalized Nyström method as approximating the adaptive step in the Randomized SVD. Indeed, let $\vec{Q}$ be the orthonormal basis for the range of $\vec{A}\vec{\Omega}$ computed by the Randomized SVD, and note that the matrix $\vec{X} = \vec{A}^\T\vec{Q}$ it computes is the matrix of coefficients for the linear combination of the columns of $\vec{Q}$ that best approximates the columns of $\vec{A}$. That is,

$$\vec{A}^\T\vec{Q} = \argmin_{\vec{X}\in\R^{d\times b}} \| \vec{A} - \vec{Q}\vec{X}^\T \|_\F.$$

Consider instead the sketched regression problem

$$\argmin_{\vec{X}\in\R^{d\times b}} \| \vec{\Psi}^\T \vec{A} - \vec{\Psi}^\T \vec{Q}\vec{X}^\T \|_\F.$$

This results in an approximation

$$\vec{Q}\vec{X}^\T = \vec{Q}(\vec{\Psi}^\T\vec{Q})^+ \vec{\Psi}^\T \vec{A} = \vec{A}\langle \vec{\Omega},\vec{\Psi}\rangle.$$
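To make this identification concrete, the small NumPy check below (with arbitrary test dimensions chosen for illustration) verifies numerically that the sketched regression solution $\vec{Q}(\vec{\Psi}^\T\vec{Q})^+\vec{\Psi}^\T\vec{A}$ coincides with $\vec{A}\langle\vec{\Omega},\vec{\Psi}\rangle$ when $\vec{Q}$ is an orthonormal basis for the range of $\vec{A}\vec{\Omega}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k1, k2 = 100, 80, 10, 20
A = rng.standard_normal((n, d))
Omega = rng.standard_normal((d, k1))
Psi = rng.standard_normal((n, k2))

Q, _ = np.linalg.qr(A @ Omega)     # orthonormal basis for range(A Omega), as in the RSVD

# sketched regression solution Q X^T, with X^T = (Psi^T Q)^+ Psi^T A
sketched = Q @ np.linalg.pinv(Psi.T @ Q) @ (Psi.T @ A)

# generalized Nystrom approximation A<Omega, Psi>
gn = (A @ Omega) @ np.linalg.pinv(Psi.T @ A @ Omega) @ (Psi.T @ A)

print(np.linalg.norm(sketched - gn))   # agrees up to rounding error
```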

The above procedure is summarized by the following algorithm: draw sketching matrices $\vec{\Omega}$ and $\vec{\Psi}$, compute the products $\vec{A}\vec{\Omega}$, $\vec{\Psi}^\T\vec{A}$, and $\vec{\Psi}^\T\vec{A}\vec{\Omega}$, and combine them as in the formula above.
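A minimal NumPy sketch of this procedure is shown below; the function name, the choice of Gaussian sketches, and the use of a least-squares solve in place of an explicit pseudoinverse are assumptions made for illustration, not the reference implementation of Tropp & Webber, 2023.

```python
import numpy as np

def generalized_nystrom(A, k1, k2, rng=None):
    """Generalized Nystrom approximation A<Omega, Psi>, returned as a pair of
    factors (A Omega, W) with W = (Psi^T A Omega)^+ Psi^T A."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    Omega = rng.standard_normal((d, k1))    # right sketch
    Psi = rng.standard_normal((n, k2))      # left sketch, typically oversampled (k2 > k1)
    AO = A @ Omega                          # both products can be formed in one pass over A
    PA = Psi.T @ A
    core = Psi.T @ AO                       # Psi^T A Omega
    W, *_ = np.linalg.lstsq(core, PA, rcond=None)   # min-norm solution = pinv(core) @ PA
    return AO, W                            # A<Omega, Psi> = AO @ W

# usage
rng = np.random.default_rng(0)
A = rng.standard_normal((300, 200))
AO, W = generalized_nystrom(A, k1=20, k2=40)
A_hat = AO @ W
```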

## Sketching dimension

To obtain a $(1+\varepsilon)$ approximation in the Frobenius norm (similar to Theorem 5.1 for the RSVD), it is not too hard to show that we must solve the sketched regression problem (5.14) to relative accuracy $(1+\varepsilon)$. Based on the analysis of sketch-and-solve, this requires that the sketching matrix $\vec{\Psi}$ have roughly $1/\varepsilon$ times as many columns as $\vec{\Psi}^\T \vec{Q}$.
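For instance, taking this scaling at face value with a basis $\vec{Q}$ of $b = 50$ columns and a target accuracy $\varepsilon = 0.1$, the sketching matrix $\vec{\Psi}$ would need on the order of $b/\varepsilon = 500$ columns.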

This is in contrast to the approximation (5.11) for the positive semi-definite case, which works with the same sketching dimension as the Randomized SVD.

## References
  1. Tropp, J. A., & Webber, R. J. (2023). Randomized algorithms for low-rank matrix approximation: Design, analysis, and applications. https://arxiv.org/abs/2306.12418