Bounds for Lanczos-FA on linear systems

Tyler Chen

Introduction

Krylov subspace methosd are among the most widely used class of algorithms for solving Ax=b\mathbf{A}\mathbf{x} = \mathbf{b} when A\mathbf{A} is a very large-sparse matrix. These methods work by constructing the Krylov subspace Kk(A,b):=span{b,Ab,,Ak1b}. \mathcal{K}_{k}(\mathbf{A},\mathbf{b}) := \operatorname{span}\{\mathbf{b}, \mathbf{A}\mathbf{b}, \ldots, \mathbf{A}^{k-1}\mathbf{b} \}. This page focuses on the Lanczos method for matrix function approximation (Lanczos-FA) used to approximate A1b\mathbf{A}^{-1}\mathbf{b}. The Lanczos-FA iterate is defined as lank(1/x):=b2Qf(T)e1, \mathsf{lan}_{k}(1/x) := \| \mathbf{b} \|_2 \mathbf{Q} f(\mathbf{T}) \mathbf{e}_1, where Q=[q1,,qk]\mathbf{Q} = [\mathbf{q}_1, \ldots, \mathbf{q}_k] is an orthonormal basis for Kk(A,b)\mathcal{K}_k(\mathbf{A},\mathbf{b}) such that span{q1,,qj}=Kj(A,b)\operatorname{span}\{\mathbf{q}_1, \ldots, \mathbf{q}_j\} = \mathcal{K}_{j}(\mathbf{A},\mathbf{b}) for all jkj\leq k, and T:=QTAQ\mathbf{T}:=\mathbf{Q}^\mathsf{T}\mathbf{A} \mathbf{Q}.

Positive definite systems

In the case that A\mathbf{A} is positive definite, then the Lanczos-FA iterate is equivalent to the well-known conjugate gradient algorithm. The simplest proof of this amounts to showing the Lanczos-FA iterate satisfies the same optimality property as CG.

Theorem. If A\mathbf{A} is positive definite, then A1blank(1/x)A=minxKk(A,b)A1bxA. \| \mathbf{A}^{-1} \mathbf{b} - \mathsf{lan}_{k}(1/x) \|_{\mathbf{A}} = \min_{\mathbf{x}\in\mathcal{K}_k(\mathbf{A},\mathbf{b})} \| \mathbf{A}^{-1} \mathbf{b} - \mathbf{x} \|_{\mathbf{A}}.

Proof. An arbitrary vector in Kk(A,b)\mathcal{K}_k(\mathbf{A},\mathbf{b}) can be written Qc\mathbf{Q}\mathbf{c} for some xRk\mathbf{x} \in \mathbb{R}^k. Thus, minxKk(A,b)A1bxA=mincRkA1bQcA.=mincRkA1/2(A1bQc)2.=mincRkA1/2bA1/2Qc2. \begin{align*} \min_{\mathbf{x}\in\mathcal{K}_k(\mathbf{A},\mathbf{b})} \| \mathbf{A}^{-1} \mathbf{b} - \mathbf{x} \|_{\mathbf{A}} &= \min_{\mathbf{c}\in\mathbb{R}^k} \| \mathbf{A}^{-1} \mathbf{b} - \mathbf{Q}\mathbf{c} \|_{\mathbf{A}}. \\&= \min_{\mathbf{c}\in\mathbb{R}^k} \| \mathbf{A}^{1/2} (\mathbf{A}^{-1} \mathbf{b} - \mathbf{Q}\mathbf{c} ) \|_{2}. \\&= \min_{\mathbf{c}\in\mathbb{R}^k} \| \mathbf{A}^{-1/2} \mathbf{b} - \mathbf{A}^{1/2} \mathbf{Q}\mathbf{c} \|_{2}. \end{align*} Writing the solution to the normal equations, we find that the above equations are minimized for c=((A1/2Q)T(A1/2Q))1(A1/2Q)T(A1/2b)=(QTAQ)1QTb=b2T1e1. \mathbf{c} = ((\mathbf{A}^{1/2}\mathbf{Q} )^\mathsf{T}(\mathbf{A}^{1/2} \mathbf{Q}) )^{-1} (\mathbf{A}^{1/2} \mathbf{Q})^\mathsf{T}(\mathbf{A}^{-1/2} \mathbf{b}) = (\mathbf{Q}^\mathsf{T}\mathbf{A} \mathbf{Q})^{-1} \mathbf{Q}^\mathsf{T}\mathbf{b} = \| \mathbf{b} \|_2 \mathbf{T}^{-1} \mathbf{e}_1. Thus, we have solution x=Qc=b2QT1e1=lank(1/x). \mathbf{x} = \mathbf{Q} \mathbf{c} = \| \mathbf{b}\|_2 \mathbf{Q} \mathbf{T}^{-1} \mathbf{e}_1 = \mathsf{lan}_{k}(1/x). This proves the theorem.   ~~~\square

This optimality property allows us to derive a number of prior bounds for the convergence of Lanczos-FA (and equivalently CG).

Indefinite systems

If A\mathbf{A} is not positive definite, T\mathbf{T} may have an eigenvalue at or near to zero and the error of the Lanczos-FA approximation to A1b\mathbf{A}^{-1}\mathbf{b} can be arbitrarily large.

The MINRES iterates are defined as y^k:=argminyKk(A,b)bAy2=argminyKk(A,b)A1byA2. \hat{\mathbf{y}}_k := \operatornamewithlimits{argmin}_{\mathbf{y}\in\mathcal{K}_k(\mathbf{A},\mathbf{b})} \| \mathbf{b} - \mathbf{A} \mathbf{y} \|_{2} = \operatornamewithlimits{argmin}_{\mathbf{y}\in\mathcal{K}_k(\mathbf{A},\mathbf{b})} \| \mathbf{A}^{-1}\mathbf{b} - \mathbf{y} \|_{\mathbf{A}^2}. Define the residual vectors rk:=bAlank(1/x),rkM:=bAy^k, \mathbf{r}_k := \mathbf{b} - \mathbf{A} \mathsf{lan}_{k}(1/x) ,\qquad \mathbf{r}_k^{\mathrm{M}} := \mathbf{b} - \mathbf{A} \hat{\mathbf{y}}_k, and note that the MINRES residual norms are non-increasing due to the optimality of the MINRES iterates. In [1], it is shown that the CG residual norms are near the MINRES residual norms at iterations where MINRES makes good progress. More precisely, the algorithms are related by rk2=rkM21(rkM2/rk1M2)2. \| \mathbf{r}_k \|_2 = \frac{\| \mathbf{r}_{k}^{\mathrm{M}} \|_2}{\sqrt{1- \left( \| \mathbf{r}_{k}^{\mathrm{M}} \|_2 / \| \mathbf{r}_{k-1}^{\mathrm{M}} \|_2 \right)^2}}. The following theorem represents a strengthening of the bounds in Appendix A of [2]. To the best of our knowledge, the result is new.

Theorem. For every kk, there exists kkk^* \leq k such that bAlank(1/x)2(ek+1k)mindeg(p)<kbAp(A)b2. {\| \mathbf{b} - \mathbf{A} \mathsf{lan}_{k^*}(1/x) \|_2} \leq \left( \mathrm{e}\sqrt{k} + \frac{1}{\sqrt{k}} \right) \min_{\deg(p)<k} {\| \mathbf{b} - \mathbf{A}p(\mathbf{A})\mathbf{b} \|_2}.

Proof. If k=0k=0, the theorem is trivially true. Fix k>0k>0. If erkM2>b2\sqrt{\mathrm{e}} \|\mathbf{r}_k^{\mathrm{M}}\|_2 > \| \mathbf{b} \|_2, then the theorem holds as e(ek+1/k)\sqrt{\mathrm{e}} \leq (\mathrm{e}\sqrt{k}+1/\sqrt{k}). Assume erkM2r0M2=b2\sqrt{\mathrm{e}} \| \mathbf{r}_k^{\mathrm{M}} \|_2 \leq \| \mathbf{r}_0^{\mathrm{M}} \|_2 = \| \mathbf{b} \|_2 so that there exists an integer k(0,k]k' \in (0,k] such that rkM2erkM2rk+1M2. \| \mathbf{r}_{k'}^{\mathrm{M}} \|_2 \geq \sqrt{\mathrm{e}} \| \mathbf{r}_k^{\mathrm{M}} \|_2 \geq \| \mathbf{r}_{k'+1}^{\mathrm{M}} \|_2. We then have that rk+1M2rkM2rk+2M2rk+1M2rkM2rk1M2=rkM2rkM21e. \frac{\|\mathbf{r}_{k'+1}^{\mathrm{M}} \|_2}{\| \mathbf{r}_{k'}^{\mathrm{M}}\|_2} \cdot \frac{\|\mathbf{r}_{k'+2}^{\mathrm{M}} \|_2}{\| \mathbf{r}_{k'+1}^{\mathrm{M}}\|_2} \:\cdots \: \frac{\|\mathbf{r}_{k}^{\mathrm{M}} \|_2}{\| \mathbf{r}_{k-1}^{\mathrm{M}}\|_2} = \frac{\|\mathbf{r}_{k}^{\mathrm{M}} \|_2}{\| \mathbf{r}_{k'}^{\mathrm{M}}\|_2} \leq \frac{1}{\sqrt{\mathrm{e}}}. Thus, there exists an integer k(k,k]k^* \in(k',k] such that rkM2rk1M2(1e)1/(kk)(1e)1/k. \frac{\|\mathbf{r}_{k^*}^{\mathrm{M}} \|_2}{\| \mathbf{r}_{k^*-1}^{\mathrm{M}}\|_2} \leq \left( \frac{1}{\sqrt{\mathrm{e}}} \right)^{1/(k-k')} \leq \left( \frac{1}{\sqrt{\mathrm{e}}} \right)^{1/k}. Since the MINRES residual norms are non-increasing, rkM2rk+1M2erkM2\| \mathbf{r}_{k^*}^{\mathrm{M}} \|_2 \leq \| \mathbf{r}_{k'+1}^{\mathrm{M}} \|_2 \leq \sqrt{\mathrm{e}} \| \mathbf{r}_{k}^{\mathrm{M}} \|_2. Thus, rk2=rkM21(rkM2/rk1M2)2erkM21(1/e)2/k. \| \mathbf{r}_{k^*} \|_2 = \frac{\| \mathbf{r}_{k^*}^{\mathrm{M}} \|_2 }{\sqrt{1- \left( \| \mathbf{r}_{k^*}^{\mathrm{M}} \|_2 / \| \mathbf{r}_{k^*-1}^{\mathrm{M}} \|_2 \right)^2}} % \leq \frac{1}{\sqrt{1-(1/\sqrt{\mathrm{e}})^{2/k}}} \| \vec{r}_{k^*}^{\mathrm{M}} \|_2 \leq \frac{\sqrt{\mathrm{e}}\| \mathbf{r}_{k}^{\mathrm{M}} \|_2}{\sqrt{1-(1/\sqrt{\mathrm{e}})^{2/k}}} . It can be verified that eke1(1/e)1/kek+1k. \mathrm{e}\sqrt{k} \leq \frac{\mathrm{e}}{\sqrt{1-(1/\mathrm{e})^{1/k}}} \leq \mathrm{e}\sqrt{k} + \frac{1}{\sqrt{k}}. Thus, we find that bAlank(1/x)2=rk2(ek+1k)rkM2. \| \mathbf{b} - \mathbf{A} \mathsf{lan}_{k^*}(1/x) \|_{2} = \| \mathbf{r}_{k^*} \|_2 \leq \left( \mathrm{e}\sqrt{k} + \frac{1}{\sqrt{k}} \right) \|\mathbf{r}_k^{\textup{M}}\|_2. The result follows from the optimality of the MINRES iterates.    ~~~\square

References

1. Cullum, J.; Greenbaum, A. Relations Between Galerkin and Norm-Minimizing Iterative Methods for Solving Linear Systems. SIAM Journal on Matrix Analysis and Applications 1996, 17, 223–247, doi:10.1137/S0895479893246765.

2. Chen, T.; Greenbaum, A.; Musco, C.; Musco, C. Error Bounds for Lanczos-Based Matrix Function Approximation. SIAM Journal on Matrix Analysis and Applications 2022, 43, 787–811, doi:10.1137/21m1427784.