Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Error bounds for Lanczos-based matrix function approximation

This is a companion piece to Chen et al. (2022). This approach has prompted a number of follow-up works that generalize it to non-symmetric problems, and block and rational algorithms Xu & Chen (2024), Simunec (2024), Adler et al. (2025).

Code to replicate the figures in the corresponding paper can be found on Github.

Introduction

The Lanczos method for matrix function approximation (Lanczos-FA) can be used to approximate f(A)bf(\vec{A})\vec{b}, and in the case that f(x)=1/xf(x) = 1/x and A\vec{A} is positive definite, this approximation is optimal over Krylov subspace. This case is very well studied and a range of error bounds and estimates exist. However, for other functions, the standard bounds are often too pessimistic as they do not take into account fine grained information about the spectrum of A\vec{A} such as outlaying or clustered eigenvalues. This makes it difficult to know when Lanczos-FA has reached a suitable accuracy for a given problem.

In this paper we show how to reduce the error of approximating f(A)bf(\vec{A})\vec{b} with Lanczos-FA to the error of solving a certain linear system with the Lanczos-FA. This allows us to leverage the range of existing bounds for the convergence of Lanczos-FA on linear systems to easily obtain a priori and a posteriori bounds for other matrix functions including piecewise analytic functions such as the sign function. Our a posteriori bounds are highly accurate and can be used as practical stopping criteria.

The basic idea

The kk-th Lanczos-FA approximation to f(A)bf(\vec{A}) \vec{b} is defined as

lank(f):=Qf(T)QTb,\lan_k(f) := \vec{Q} f(\vec{T}) \vec{Q}^{\T} \vec{b},

where Q\vec{Q} and T\vec{T} are produced by the Lanczos method run for kk steps on (A,b)(\vec{A},\vec{b}).

Suppose A\vec{A} is a Hermitian matrix and If ff is analytic in a neighborhood of the eigenvalues of A\vec{A} and Γ\Gamma is a contour in this neighborhood containing these eigenvalues,

f(A)b=12πiΓf(z)(AzI)1bdz.f(\vec{A})\vec{b} = - \frac{1}{2 \pi i} \oint_{\Gamma} f(z) (\vec{A} - z \vec{I} )^{-1} \vec{b} \, \d{z}.

If the eigenvalues of T\vec{T} are also contained in Γ\Gamma, we similarly have

Qf(T)QTb=12πiΓf(z)Q(TzI)1QTbdz.\vec{Q} f(\vec{T}) \vec{Q}^\T \vec{b} = - \frac{1}{2 \pi i} \oint_{\Gamma} f(z) \vec{Q} (\vec{T} - z \vec{I} )^{-1} \vec{Q}^\T \vec{b} \, \d{z} .

Combining, these, we see that the Lanczos-FA error can be written as

f(A)bQf(T)QTb=12πiΓf(z)errk(z)dz.f(\vec{A}) \vec{b} - \vec{Q} f( \vec{T} ) \vec{Q}^{\T} \vec{b} = - \frac{1}{2 \pi i} \oint_{\Gamma} f(z) \, \err_k(z) \, \d{z}.

For zCz \in \mathbb{C}, define the kk-th Lanczos-FA error and residual for the linear system (AzI)x=b(\vec{A}-z\vec{I}) \vec{x} = \vec{b} as,

errk(z,A,b):=(AzI)1bQ(TzI)1QTb,resk(z,A,b):=b(AzI)Q(TzI)1QTb.\begin{align*} \err_k(z,\vec{A},\vec{b}) &:= (\vec{A} - z \vec{I})^{-1} \vec{b} - \vec{Q}(\vec{T}-z\vec{I})^{-1}\vec{Q}^\T \vec{b}%\lan_k ( h_z ) ,\\ \Res_k(z,\vec{A},\vec{b}) &:= \vec{b} - (\vec{A} - z \vec{I}) \vec{Q}(\vec{T}-z\vec{I})^{-1}\vec{Q}^\T \vec{b}.%\,\lan_k ( h_z ). \end{align*}

It is a well-known fact that

resk(z)=((1)kdet(TzI)j=1kβj)b2qk+1.\Res_k(z) = \left( \frac{(-1)^{k}}{\det(\vec{T} -z \vec{I}) }\prod_{j=1}^{k} \beta_j \right) \| \vec{b} \|_2\: \vec{q}_{k+1}.

Consequently, for all z,wCz , w \in \mathbb{C}, where AzI\vec{A} - z \vec{I} and AwI\vec{A} - w \vec{I} are both invertible,

errk(z)=det(hw,z(T))hw,z(A)errk(w)resk(z)=det(hw,z(T))resk(w),\begin{align*} \err_k(z) &= \det(h_{w,z}(\vec{T})) h_{w,z}(\vec{A}) \,\err_k(w) \\ \Res_k(z) &= \det(h_{w,z}(\vec{T})) \,\Res_k(w), \end{align*}

where

hw,z(x)=xwxz.h_{w,z}(x) = \frac{x-w}{x-z}.

Thus, if Γ\Gamma is a simple closed curve or union of simple closed curves inside this neighborhood and enclosing the eigenvalues of A\vec{A} and T\vec{T} and ww a point not in Λ(T)Λ(A)\Lambda(\vec{T})\cup\Lambda(\vec{A}),

f(A)blank(f)=(12πiΓf(z)det(hw,z(T))hw,z(A)dz)errk(w).f(\vec{A}) \vec{b} - \lan_k(f) = \left( - \frac{1}{2\pi i} \oint_{\Gamma} f(z) \det(h_{w,z}(\vec{T})) h_{w,z}(\vec{A}) \d{z} \right) \, \err_k(w).

Applying the triangle inequality and the submultiplicitivity of matrix-norms, we can then obtain a bound

f(A)blank(f)(12πΓf(z)(i=1khw,zSi)hw,zS0dz)integral termerrk(w),linear system error\| f(\vec{A})\vec{b} - \lan_k(f) \| \leq \underbrace{\vphantom{ \bigg| }\left( \frac{1}{2\pi}\oint_{\Gamma} |f(z)| \left(\prod_{i=1}^{k} \| h_{w,z}\|_{S_i}\right) \|h_{w,z}\|_{S_0} | \d{z} | \right)}_{\text{integral term}} \hspace{-.5 em}\underbrace{\vphantom{ \Bigg| } \| \err_k(w) \| , \hspace{-.4em} }_{\text{linear system error}} \hspace{-.5em}

where SiS_i are some suitably chosen sets and gS:=maxxSg(x)\|g\|_S:= \max_{x\in S}|g(x)|.

Note that the integral term and linear system error term in the theorem are entirely decoupled! Thus, once the integral term is computed, bounding the error of Lanczos-FA for f(A)bf(\vec{A})\vec{b} is reduced to bounding errk(w)\| \err_k(w) \|, and if the integral term can be bounded independently of kk, This implies that, up to a constant factor, the Lanczos-FA approximation to f(A)bf(\vec{A})\vec{b} converges at least as fast as errk(w)\| \err_k(w) \|.

What else is in the paper?

Much of the paper is focused on practical aspects regarding the use of such a bound. In particular:

It’s worth noting that similar ideas have been previously used to get error bounds for Stieltjes functions Ilic et al., 2009Frommer et al., 2014Frommer & Schweitzer, 2015. As discussed in Section 2.2 of the paper, our bounds differ in a number of key ways. One key difference is that we reduce error bounds for Lanczos-FA a general function to error bounds for Lanczos-FA on a fixed linear system. This allows intuition about the convergence of Lanczos-FA on linear systems to be transferred to other functions. In addition, our analysis allows bounds for piecewise continuous functions like the sign function.

References
  1. Chen, T., Greenbaum, A., Musco, C., & Musco, C. (2022). Error Bounds for Lanczos-Based Matrix Function Approximation. SIAM Journal on Matrix Analysis and Applications, 43(2), 787–811. 10.1137/21m1427784
  2. Xu, Q., & Chen, T. (2024). A posteriori error bounds for the block-Lanczos method for matrix function approximation. Numerical Algorithms. 10.1007/s11075-024-01819-7
  3. Simunec, I. (2024). Error bounds for the approximation of matrix functions with rational Krylov methods. Numerical Linear Algebra with Applications, 31(5), e2571. https://doi.org/10.1002/nla.2571
  4. Adler, J. H., Hu, X., Pan, W., & Xue, Z. (2025). Error Estimates for the Arnoldi Approximation of a Matrix Square Root. https://arxiv.org/abs/2506.22615
  5. Ilic, M. D., Turner, I. W., & Simpson, D. P. (2009). A restarted Lanczos approximation to functions of a symmetric matrix. IMA Journal of Numerical Analysis, 30(4), 1044–1061. 10.1093/imanum/drp003
  6. Frommer, A., Güttel, S., & Schweitzer, M. (2014). Efficient and Stable Arnoldi Restarts for Matrix Functions Based on Quadrature. SIAM Journal on Matrix Analysis and Applications, 35(2), 661–683. 10.1137/13093491x
  7. Frommer, A., & Schweitzer, M. (2015). Error bounds and estimates for Krylov subspace approximations of Stieltjes matrix functions. BIT Numerical Mathematics, 56(3), 865–892. 10.1007/s10543-015-0596-3