
It’s natural to combine Krylov subspace methods with variance reduction techniques for trace estimation. This can lead to improvements over Stochastic Lanczos Quadrature, which is based on the Girard-Hutchinson estimator. Krylov-aware methods take this one step further, exploiting the interaction between Krylov subspace methods and randomized low-rank approximation techniques to more efficiently produce low-rank approximations to matrix functions (Chen & Hallman, 2023; Persson et al., 2025).

Let’s consider a randomized-SVD-type approach to obtaining a low-rank approximation to $f(\vec{A})$, where products with $f(\vec{A})$ are approximated via Krylov subspace methods (KSMs): draw a random test matrix $\vec{\Omega}$, use a KSM to form $\vec{Y} \approx f(\vec{A})\vec{\Omega}$, and orthonormalize $\vec{Y}$ to obtain $\vec{Q}$, yielding the approximation $\vec{Q}\vec{Q}^\T f(\vec{A})$.

The first key observation is that $\vec{Y}$ is contained in the block Krylov subspace

$$\mathcal{K}_s(\vec{A},\vec{\Omega}) = \text{span}\{\vec{\Omega}, \vec{A}\vec{\Omega}, \vec{A}^2\vec{\Omega}, \ldots, \vec{A}^{s-1}\vec{\Omega}\},$$

where $s$ is the number of steps taken by the KSM.

Let $\vec{Q}_s$ be an orthonormal basis for $\mathcal{K}_s(\vec{A},\vec{\Omega})$. We can compute $\vec{Q}_s$ using the same number of matrix-vector products with $\vec{A}$ as were used to compute $\vec{Y}$. Moreover, $\range(\vec{Q}) \subseteq \range(\vec{Q}_s)$, so we may expect $\vec{Q}_s\vec{Q}_s^\T f(\vec{A})$ to be a better approximation to $f(\vec{A})$ than $\vec{Q}\vec{Q}^\T f(\vec{A})$.
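To make this concrete, here is a minimal NumPy sketch of forming an orthonormal basis for $\mathcal{K}_s(\vec{A},\vec{\Omega})$. The helper name `block_krylov_basis` and the single QR at the end are my own choices, not from the papers; a practical implementation would use block Lanczos with reorthogonalization, since the monomial basis formed below is ill-conditioned for large $s$:

```python
import numpy as np

def block_krylov_basis(A, Omega, s):
    """Orthonormal basis for K_s(A, Omega) = span{Omega, A Omega, ..., A^{s-1} Omega}.

    A     : (n, n) array (symmetric in our setting)
    Omega : (n, b) block of starting vectors
    s     : number of Krylov steps
    """
    blocks = [Omega]
    for _ in range(s - 1):
        blocks.append(A @ blocks[-1])   # one block of b matvecs per step
    K = np.hstack(blocks)               # n x (s*b) Krylov matrix
    Q, _ = np.linalg.qr(K)              # orthonormalize the monomial basis
    return Q
```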

However, since $\vec{Q}_s$ has many more columns than $\vec{Q}$, at first glance computing an approximation to $f(\vec{A})\vec{Q}_s$ seems to require many more products than computing an approximation to $f(\vec{A})\vec{Q}$.

The second key observation is that because $\vec{Q}_s$ is a basis for a Krylov subspace, we can efficiently compute an approximation to $f(\vec{A})\vec{Q}_s$. In particular, for any $r \geq 0$,

$$\mathcal{K}_{r+1}(\vec{A},\vec{Q}_s) = \mathcal{K}_{s+r}(\vec{A},\vec{\Omega}).$$

Proof
Since $\range(\vec{Q}_s) = \range\big(\big[\,\vec{\Omega} \;\; \vec{A}\vec{\Omega} \;\; \cdots \;\; \vec{A}^{s-1}\vec{\Omega}\,\big]\big)$, we have $\range(\vec{A}^i\vec{Q}_s) = \range\big(\big[\,\vec{A}^i\vec{\Omega} \;\; \cdots \;\; \vec{A}^{s+i-1}\vec{\Omega}\,\big]\big)$ for each $i = 0, \ldots, r$. Hence,

$$\begin{aligned}
\mathcal{K}_{r+1}(\vec{A},\vec{Q}_s)
&= \range\big(\big[\,\vec{Q}_s \;\; \vec{A}\vec{Q}_s \;\; \cdots \;\; \vec{A}^{r}\vec{Q}_s\,\big]\big)\\
&= \range\big(\big[\,\vec{\Omega} \;\; \vec{A}\vec{\Omega} \;\; \cdots \;\; \vec{A}^{s+r-1}\vec{\Omega}\,\big]\big)\\
&= \mathcal{K}_{s+r}(\vec{A},\vec{\Omega}).
\end{aligned}$$
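This identity is easy to sanity-check numerically. The sketch below (reusing the hypothetical `block_krylov_basis` helper from above) compares the two subspaces via principal angles:

```python
from scipy.linalg import subspace_angles

rng = np.random.default_rng(0)
n, b, s, r = 50, 3, 4, 2
G = rng.standard_normal((n, n))
A = (G + G.T) / np.sqrt(2 * n)               # symmetric test matrix, norm O(1)
Omega = rng.standard_normal((n, b))

Qs  = block_krylov_basis(A, Omega, s)        # basis for K_s(A, Omega)
lhs = block_krylov_basis(A, Qs, r + 1)       # spans K_{r+1}(A, Q_s)
rhs = block_krylov_basis(A, Omega, s + r)    # spans K_{s+r}(A, Omega)

# Max principal angle ~ 0 (up to roundoff) confirms K_{s+r}(A, Omega) is
# contained in K_{r+1}(A, Q_s); the reverse containment is immediate.
print(np.max(subspace_angles(lhs, rhs)))
```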

This means that quantities such as $\vec{Q}_s^\T f(\vec{A})\vec{Q}_s$ can be approximated from $\mathcal{K}_{s+r}(\vec{A},\vec{\Omega})$. This requires just $nr$ additional matrix-vector products with $\vec{A}$, rather than $n(q+1)$.
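As a hedged sketch of how this is used: once a basis $\vec{Q}$ for $\mathcal{K}_{s+r}(\vec{A},\vec{\Omega})$ is available, $f(\vec{A})\vec{Q}_s \approx \vec{Q}\, f(\vec{H})\, \vec{Q}^\T \vec{Q}_s$ with $\vec{H} = \vec{Q}^\T \vec{A} \vec{Q}$, so the quadratic form follows from one small eigendecomposition. The function name below is my own, and $\vec{H}$ is formed explicitly for clarity; a block Lanczos recurrence would produce it as a by-product:

```python
def krylov_aware_quadratic_form(A, Omega, f, s, r):
    """Approximate Qs^T f(A) Qs from the extended space K_{s+r}(A, Omega)."""
    Qs = block_krylov_basis(A, Omega, s)       # basis for K_s(A, Omega)
    Q  = block_krylov_basis(A, Omega, s + r)   # basis for K_{s+r} (contains K_s)

    H = Q.T @ A @ Q                            # projected matrix (explicit here;
                                               # block Lanczos gives it for free)
    lam, V = np.linalg.eigh((H + H.T) / 2)     # f(H) via eigendecomposition
    fH = V @ np.diag(f(lam)) @ V.T

    C = Q.T @ Qs                               # Qs = Q @ C since K_s is in K_{s+r}
    return C.T @ fH @ C                        # ~ Qs^T f(A) Qs
```

In a Hutch++-style trace estimator, a quadratic form like this supplies the low-rank part of $\operatorname{tr}(f(\vec{A}))$, with a Girard-Hutchinson estimate of the remainder.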

While this idea is simple, the resulting algorithm can often perform much better than if matrix-vector products with $f(\vec{A})$ are treated as a black box. This is particularly true for functions like $f(x) = \sqrt{x}$, where we may require $q$ to be quite large to obtain a good approximation to $f(\vec{A})\vec{\Omega}$, yet even the basis obtained with $q=1$ (which is essentially just sketching $\vec{A}$) works well.
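As a toy illustration of this point (my own synthetic experiment, not one from the papers), the snippet below prints the projection errors for $\sqrt{\vec{A}}$ from the sketching-$\vec{A}$ basis and a deeper block Krylov basis, alongside the best rank-$b$ error for reference:

```python
rng = np.random.default_rng(1)
n, b, s = 100, 10, 3
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.logspace(0, -8, n)                  # eigenvalues of A (PSD, decaying)
A = U @ np.diag(lam) @ U.T
sqrtA = U @ np.diag(np.sqrt(lam)) @ U.T      # exact f(A) for reference

Omega = rng.standard_normal((n, b))
for Y, label in [(A @ Omega, "sketching A"),
                 (block_krylov_basis(A, Omega, s), f"block Krylov, s = {s}")]:
    Q, _ = np.linalg.qr(Y)
    err = np.linalg.norm(sqrtA - Q @ Q.T @ sqrtA, 2)
    print(f"{label}: error {err:.2e}")
print(f"best rank-{b} error: {np.sqrt(lam[b]):.2e}")
```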

Similar ideas are explored in Persson & Kressner, 2023 and Persson et al., 2025, where it is shown that sketching $\vec{A}$ instead of $f(\vec{A})$ is a good idea for operator monotone functions.

References
  1. Chen, T., & Hallman, E. (2023). Krylov-Aware Stochastic Trace Estimation. SIAM Journal on Matrix Analysis and Applications, 44(3), 1218–1244. https://doi.org/10.1137/22m1494257
  2. Persson, D., Chen, T., & Musco, C. (2025). Randomized block-Krylov subspace methods for low-rank approximation of matrix functions. https://arxiv.org/abs/2502.01888
  3. Persson, D., & Kressner, D. (2023). Randomized Low-Rank Approximation of Monotone Matrix Functions. SIAM Journal on Matrix Analysis and Applications, 44(2), 894–918. https://doi.org/10.1137/22m1523923
  4. Persson, D., Meyer, R. A., & Musco, C. (2025). Algorithm-Agnostic Low-Rank Approximation of Operator Monotone Matrix Functions. SIAM Journal on Matrix Analysis and Applications, 46(1), 1–21. https://doi.org/10.1137/23m1619435