
The Randomized SVD (RSVD) and its refinements require multiple passes over $\vec{A}$. In some cases, it may be advantageous to use a method that requires only a single pass over $\vec{A}$.

## Nyström for positive definite matrices

When $\vec{A}$ is symmetric positive semi-definite, we can use the Nyström approximation generated by a sketching matrix $\vec{\Omega}\in\R^{n\times k}$:

$$\vec{A} \langle \vec{\Omega}\rangle := \vec{A}\vec{\Omega} ( \vec{\Omega}^\T\vec{A}\vec{\Omega})^+ \vec{\Omega}^\T\vec{A}.$$

When $\vec{\Omega}$ is a Gaussian sketching matrix, this method satisfies similar theoretical guarantees to the Randomized SVD (with respect to $k$), despite requiring half the number of matrix-vector products and only one pass over the data; see e.g. Corollary 8.2 in Tropp & Webber, 2023.
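The following is a minimal NumPy sketch of this approximation, assuming $\vec{A}$ is stored as a dense array; the function name is made up for illustration, and the pseudoinverse is applied directly for brevity (a more careful implementation would handle the core inverse more stably).

```python
import numpy as np

def nystrom_psd(A, Omega):
    """Nystrom approximation A<Omega> = (A Omega)(Omega^T A Omega)^+ (A Omega)^T
    of a symmetric positive semidefinite matrix A."""
    Y = A @ Omega                       # the only product with A (single pass)
    core = Omega.T @ Y                  # Omega^T A Omega, a small k x k matrix
    return Y @ np.linalg.pinv(core) @ Y.T

# usage with a Gaussian sketching matrix
rng = np.random.default_rng(0)
n, k = 200, 10
G = rng.standard_normal((n, n))
A = G @ G.T                             # random PSD test matrix
A_hat = nystrom_psd(A, rng.standard_normal((n, k)))
```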

## Block Krylov methods

In fact, at the cost of more passes over the data, we can replace $\vec{\Omega}$ with a basis for a Krylov subspace. As noted in Lemma 5.2 of Tropp & Webber, 2023, the Nyström method always produces a low-rank approximation that is at least as good as that of one-sided projection-based methods such as Randomized Block Krylov Iteration.
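As a rough illustration of this variant, the sketch below builds a block Krylov basis and reuses the `nystrom_psd` helper from above; the depth parameter `q`, the single QR on the stacked blocks, and the lack of reorthogonalization between powers are simplifying assumptions, not the implementation analyzed by Tropp & Webber, 2023.

```python
import numpy as np

def krylov_nystrom_psd(A, k, q, rng=None):
    """Nystrom approximation in which the sketch is replaced by an orthonormal
    basis Q of the block Krylov space span[X, A X, ..., A^q X]."""
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((A.shape[0], k))
    blocks = [X]
    for _ in range(q):
        X = A @ X                       # each power costs one more pass over A
        blocks.append(X)
    Q, _ = np.linalg.qr(np.hstack(blocks))
    return nystrom_psd(A, Q)            # reuse the single-pass routine above
```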

## Generalized Nyström for arbitrary matrices

A similar approach can be used for arbitrary $\vec{A}$. Given sketching matrices $\vec{\Omega}\in\R^{d\times k_1}$ and $\vec{\Psi}\in\R^{n\times k_2}$, the Generalized Nyström approximation is

$$\vec{A}\langle \vec{\Omega},\vec{\Psi}\rangle := \vec{A}\vec{\Omega} (\vec{\Psi}^\T\vec{A}\vec{\Omega})^+ \vec{\Psi}^\T\vec{A}.$$

We can understand the Generalized Nyström method as approximating the adaptive step in the Randomized SVD. Indeed, let $\vec{Q}$ be the orthonormal basis for the range of $\vec{A}\vec{\Omega}$ computed by the Randomized SVD, and note that the matrix $\vec{X} = \vec{A}^\T\vec{Q}$ it computes is the matrix of coefficients for the linear combination of the columns of $\vec{Q}$ that best approximates the columns of $\vec{A}$. That is,

$$\vec{A}^\T\vec{Q} = \argmin_{\vec{X}\in\R^{d\times b}} \| \vec{A} - \vec{Q}\vec{X}^\T \|_\F.$$

Consider instead the sketched regression problem

$$\argmin_{\vec{X}\in\R^{d\times b}} \| \vec{\Psi}^\T \vec{A} - \vec{\Psi}^\T \vec{Q}\vec{X}^\T \|_\F.$$

This results in an approximation

$$\vec{Q}\vec{X}^\T = \vec{Q}(\vec{\Psi}^\T\vec{Q})^+ \vec{\Psi}^\T \vec{A} = \vec{A}\langle \vec{\Omega},\vec{\Psi}\rangle.$$
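To make this identification concrete, the small NumPy check below (with arbitrary test dimensions chosen for illustration) verifies numerically that the sketched regression solution $\vec{Q}(\vec{\Psi}^\T\vec{Q})^+\vec{\Psi}^\T\vec{A}$ coincides with $\vec{A}\langle\vec{\Omega},\vec{\Psi}\rangle$ when $\vec{Q}$ is an orthonormal basis for the range of $\vec{A}\vec{\Omega}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k1, k2 = 100, 80, 10, 20
A = rng.standard_normal((n, d))
Omega = rng.standard_normal((d, k1))
Psi = rng.standard_normal((n, k2))

Q, _ = np.linalg.qr(A @ Omega)     # orthonormal basis for range(A Omega), as in the RSVD

# sketched regression solution Q X^T, with X^T = (Psi^T Q)^+ Psi^T A
sketched = Q @ np.linalg.pinv(Psi.T @ Q) @ (Psi.T @ A)

# generalized Nystrom approximation A<Omega, Psi>
gn = (A @ Omega) @ np.linalg.pinv(Psi.T @ A @ Omega) @ (Psi.T @ A)

print(np.linalg.norm(sketched - gn))   # agrees up to rounding error
```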

The above procedure is summarized by the following algorithm: draw sketching matrices $\vec{\Omega}$ and $\vec{\Psi}$, compute the products $\vec{A}\vec{\Omega}$, $\vec{\Psi}^\T\vec{A}$, and $\vec{\Psi}^\T\vec{A}\vec{\Omega}$, and combine them as in the formula above.
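A minimal NumPy sketch of this procedure is shown below; the function name, the choice of Gaussian sketches, and the use of a least-squares solve in place of an explicit pseudoinverse are assumptions made for illustration, not the reference implementation of Tropp & Webber, 2023.

```python
import numpy as np

def generalized_nystrom(A, k1, k2, rng=None):
    """Generalized Nystrom approximation A<Omega, Psi>, returned as a pair of
    factors (A Omega, W) with W = (Psi^T A Omega)^+ Psi^T A."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    Omega = rng.standard_normal((d, k1))    # right sketch
    Psi = rng.standard_normal((n, k2))      # left sketch, typically oversampled (k2 > k1)
    AO = A @ Omega                          # both products can be formed in one pass over A
    PA = Psi.T @ A
    core = Psi.T @ AO                       # Psi^T A Omega
    W, *_ = np.linalg.lstsq(core, PA, rcond=None)   # min-norm solution = pinv(core) @ PA
    return AO, W                            # A<Omega, Psi> = AO @ W

# usage
rng = np.random.default_rng(0)
A = rng.standard_normal((300, 200))
AO, W = generalized_nystrom(A, k1=20, k2=40)
A_hat = AO @ W
```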

## Sketching dimension

To obtain a $(1+\varepsilon)$ approximation in the Frobenius norm (similar to Theorem 5.1 for the RSVD), it is not too hard to show that we must solve the sketched regression problem (5.14) to relative accuracy $(1+\varepsilon)$. Based on the analysis of sketch-and-solve, this requires that the sketching matrix $\vec{\Psi}$ have roughly $1/\varepsilon$ times as many columns as $\vec{\Psi}^\T \vec{Q}$.
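For instance, taking this scaling at face value with a basis $\vec{Q}$ of $b = 50$ columns and a target accuracy $\varepsilon = 0.1$, the sketching matrix $\vec{\Psi}$ would need on the order of $b/\varepsilon = 500$ columns.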

This is in contrast to the approximation (5.11) for the positive semi-definite case, which works with the same sketching dimension as the Randomized SVD.

## References
  1. Tropp, J. A., & Webber, R. J. (2023). Randomized algorithms for low-rank matrix approximation: Design, analysis, and applications. https://arxiv.org/abs/2306.12418