This is a companion piece to the paper: https://arxiv.org/abs/2205.01736
Code to replicate the figures in this document can be found on Github.
Familiarity with Lanczos-FA or other Krylov subspace methods will be useful.
An understanding of Hutch++ is important for understanding the broader context, but not strictly necessary for this introduction.
Let be a symmetric matrix. An important primative in a number of state of the art trace estimation algorithms like Hutch++ [1] is finding an orthonormal matrix so that That is, finding the approximate span of the dominant eigenspace of .
A standard technique is by sketching [2]. In particular, we can use the following algorithm:
Algorithm. Randomized range finder
It can be shown that the produced by this algorithm is competitive with the best possible rank .
Often, for some matrix function . In this case, it is standard to approximate products with using a Krylov subspace methods, for instance the (block) Lanczos method for matrix function approximation. Such methods approximate terms like by constructing the Krylov subspace
If we are using such a method with the randomized range finder, the first step is to approximate using . Then, we obtain an orthonormal basis for this space. However, in doing so, we implcitly construct the space , so we can obtain a better by using , an orthonormal basis for .
The ostensible downside is that to compute now requires matrix-vector products with . However, this is not actually the case. The key observation made in this paper is the following: If is a basis for , then
This means that quantities such as can be approximated from . This requires just additional matrix vector products with , rather than .
While this idea is simple, our algorithm can often perform much better than if matrix-vector products with are treated as a black box. This is particularly true for functions like , where we may require to be quite large to obtain a good approximation to , but even the basis obtained with (which is basically like sketching ) works well.
A similar idea was explored in [3] where it is shown that sketching instead of is a good idea for operator monotone functions.
1. Meyer, R.A.; Musco, C.; Musco, C.; Woodruff, D.P. Hutch++: Optimal Stochastic Trace Estimation. In Proceedings of the Symposium on simplicity in algorithms (sosa); SIAM, 2021; pp. 142–155.
2. Halko, N.; Martinsson, P.G.; Tropp, J.A. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions. SIAM Review 2011, 53, 217–288, doi:10.1137/090771806.
3. Persson, D.; Kressner, D. Randomized Low-Rank Approximation of Monotone Matrix Functions. arXiv preprint: 2209.11023 2022.