Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Introduction

Randomized algorithms for approximating the trace of a n×nn\times n matrix A\vec{A} using only matrix-vector products with A\vec{A} are increasingly used as a computational primitive in a number of areas in the computational sciences.

Many commonly used stochastic trace estimation algorithms make critical use of the fact that φTAφ\bm{\varphi}^\T \vec{A} \bm{\varphi} is an unbiased estimator for tr(A)\tr(\vec{A}), at least provided that E[φφT]=I\EE[\bm{\varphi} \bm{\varphi}^\T] = \vec{I}. Indeed,

E[φTAφ]=E[tr(φTAφ)]=E[tr(AφφT)]=tr(AE[φφT])=tr(A).\EE[ \bm{\varphi}^\T\vec{A}\bm{\varphi} ] = \EE[ \tr(\bm{\varphi}^\T\vec{A}\bm{\varphi}) ] = \EE[ \tr(\vec{A}\bm{\varphi}\bm{\varphi}^\T) ] = \tr(\vec{A}\EE[\bm{\varphi}\bm{\varphi}^\T]) = \tr(\vec{A}).

Common choice of distribution for φ\bm{\varphi} include sampling the entries independently from Unif({1,+1})\operatorname{Unif}(\{-1,+1\}) or N(0,1)\mathcal{N}(0,1) or sampling φ\bm{\varphi} from nUnif(Sn1)\sqrt{n}\operatorname{Unif}(\mathbb{S}^{n-1}), the uniform distribution on the hypersphere of radius n\sqrt{n}. For all of these distributions, tail bounds of the form

P[φTAφtr(A)>ϵ]δ=δ(A,ϵ)\PP\Big[\big|\bm{\varphi}^\T \vec{A} \bm{\varphi} - \tr(\vec{A})\big| > \epsilon\Big] \leq \delta = \delta(\vec{A}, \epsilon)

have been studied Reimann, 2007Popescu et al., 2006Avron & Toledo, 2011Roosta-Khorasani & Ascher, 2014Meyer et al., 2021Cortinovis & Kressner, 2021Chen et al., 2022. Current state of the art bounds offer refined tail bounds for δ\delta depending on A\vec{A} through AF2\|\vec{A}\|_\F^2 and A2\|\vec{A}\|_2 with the precise bounds depending on the choice of distribution for v\vec{v}.

In this note, we provide a brief historical overview of these algorithms and their corresponding analyses. While our exposition is far from comprehensive, we believe it provides some insight into the history of typicality-based algorithms. We hope that some of the connections highlighted in this letter motivate further the interaction between physics, numerical analysis, and theoretical computer science.

Quantum mechanics and linear algebra

In the now standard mathematical formalism of quantum mechanics, pure states of a quantum system are represented as (unit length) elements of a Hilbert space H\mathcal{H}, and physical observables (such as energy, translational momentum, orbital angular momentum, and spin) are self-adjoint operators on H\mathcal{H}. If H\mathcal{H} has dimension n<n<\infty (which we shall assume throughout this paper), the spectral theorem ensures an observable A\vec{A} can be decomposed as

A=i=1nλiψiψiT,\vec{A} = \sum_{i=1}^{n} \lambda_i \bm{\psi}_i \bm{\psi}_i^\T,

where {λi}\{\lambda_i\} are the eigenvalues of A\vec{A} and {ψi}\{ \bm{\psi}_i \} are the (orthonormal) eigenstates.

When a measurement is made of A\vec{A} while the system is in a state ψi\bm{\psi}_i, the value m(A;ψi)m(\vec{A};\bm{\psi}_i) of the measurement observed will be λi\lambda_i. More generally, if the measurement is made while the system is in an arbitrary state φ\bm{\varphi}, the value of the measurement observed may be any of λ1,,λn\lambda_1, \ldots, \lambda_n. In particular, the values observed will be λ\lambda with probability proportional to the norm of the projection of φ\bm{\varphi} onto the the eigenstates associated with λ\lambda. That is,

Pqm[m(A;φ)=λ]=i:λi=λψiTφ2.\PP_{\textup{qm}}\big[ m(\vec{A};\bm{\varphi}) = \lambda \big] = \sum_{i:\lambda_i = \lambda} |\bm{\psi}_i^\T\bm{\varphi}|^2.

Here the subscript ``qm’’ indicates that the probability is with respect to the inherent randomness in the formalism of quantum mechanics.

That the results of a measurement are probabilistic is distinctly quantum; in the classical setting, repeated measurements of a system in a given state result in exactly the same measurement value. However, repeated measurements of an observable in a given state will eventually reveal an average value

Aφ:=Eqm[m(A;φ)]=i=1nλiψiTφ2=φTAφ,\langle \vec{A} \rangle_{\bm{\varphi}} := \EE_{\textup{qm}}[m(\vec{A};\bm{\varphi})] = \sum_{i=1}^{n} \lambda_i |\bm{\psi}_i^\T\bm{\varphi}|^2 = \bm{\varphi}^\T \vec{A} \bm{\varphi},

referred to as \emph{quantum expectation value} (QEV) of A\vec{A} in state φ\bm{\varphi}.

In fact, as noted by von Neumann Neumann, 1929 , the probability distribution \cref{eqn:Aphi_dist} can be recovered by knowledge of the moments Aφ,A2φ,A3φ,\langle \vec{A} \rangle_{\bm{\varphi}}, \langle \vec{A}^2 \rangle_{\bm{\varphi}}, \langle \vec{A}^3 \rangle_{\bm{\varphi}}, \ldots

The simplest description of a state by means of a wave function φ\varphi is obtained in this way: the expectation value of the quantity AA in the state φ\varphi is equal to (Aφ,φ)(A\varphi, \varphi). The specification of all expectation values provides, as it includes the expectation values of all powers (i.e., the so-called higher moments of a probability distribution), knowledge of the entire probability distribution of every quantity—and thus a complete statistical characterization of the system

More generally, a quantum system may be be in a mixture of pure states {φi}\{\bm{\varphi}_i\} each occurring with probability {pi}\{p_i\}. Such an ensemble is represented mathematically by the density matrix

ρ=i=1npiφiφiT.\vec{\rho} = \sum_{i=1}^{n} p_i \bm{\varphi}_i \bm{\varphi}_i^\T.

When the observable A\vec{A} is measured in such a state, the corresponding QEV is given by

Aρ=iφiTAφipi=tr(ρA).\langle \vec{A} \rangle_{\vec{\rho}} %:= \EE[m(\vec{A},\vec{\rho})] = \sum_{i} \bm{\varphi}_i^\T \vec{A} \bm{\varphi}_i p_i = \tr(\vec{\rho} \vec{A}).

Note that in the case of a single pure state, so that ρ=φφT\vec{\rho} = \bm{\varphi}\bm{\varphi}^\T, we recover the original formula for the QEV by the identity tr(φφTA)=φTAφ\tr(\bm{\varphi}\bm{\varphi}^\T\vec{A}) = \bm{\varphi}^\T \vec{A} \bm{\varphi}.

References
  1. Reimann, P. (2007). Typicality for Generalized Microcanonical Ensembles. Physical Review Letters, 99(16). 10.1103/physrevlett.99.160404
  2. Popescu, S., Short, A. J., & Winter, A. (2006). Entanglement and the foundations of statistical mechanics. Nature Physics, 2(11), 754–758. 10.1038/nphys444
  3. Avron, H., & Toledo, S. (2011). Randomized algorithms for estimating the trace of an implicit symmetric positive semi-definite matrix. Journal of the ACM, 58(2), 1–34. 10.1145/1944345.1944349
  4. Roosta-Khorasani, F., & Ascher, U. (2014). Improved Bounds on Sample Size for Implicit Matrix Trace Estimators. Foundations of Computational Mathematics, 15(5), 1187–1212. 10.1007/s10208-014-9220-1
  5. Meyer, R. A., Musco, C., Musco, C., & Woodruff, D. P. (2021). Hutch++: Optimal Stochastic Trace Estimation. In Symposium on Simplicity in Algorithms (SOSA) (pp. 142–155). Society for Industrial. 10.1137/1.9781611976496.16
  6. Cortinovis, A., & Kressner, D. (2021). On Randomized Trace Estimates for Indefinite Matrices with an Application to Determinants. Foundations of Computational Mathematics. 10.1007/s10208-021-09525-9
  7. Chen, T., Trogdon, T., & Ubaru, S. (2022). Randomized matrix-free quadrature for spectrum and spectral sum approximation.
  8. von Neumann, J. (1929). Beweis des Ergodensatzes und desH-Theorems in der neuen Mechanik. Zeitschrift Für Physik, 57(1–2), 30–70. 10.1007/bf01339852