By the triangle inequality and since ( x + y ) 2 ≤ 2 ( x 2 + y 2 ) (x+y)^2\leq 2(x^2+y^2) ( x + y ) 2 ≤ 2 ( x 2 + y 2 ) , we have
∣ tr ( f ( A ) ) − SLQ k , m ( f ; A ) ∣ 2 ≤ 2 ∣ tr ( f ( A ) ) − 1 m ∑ i = 1 m x i T f ( A ) x i ∣ 2 + 2 ∣ 1 m ∑ i = 1 m x i T f ( A ) x i − Lan-QF k ( f ; A , x i ) ∣ 2 . \begin{aligned}
\hspace{1em}&\hspace{-1em}| \tr(f(\vec{A})) - \Call{SLQ}_{k,m}(f;\vec{A}) |^2
\\&\leq 2\left| \tr(f(\vec{A})) - \frac{1}{m}\sum_{i=1}^{m} \vec{x}_i^\T f(\vec{A})\vec{x}_i \right|^2 + 2\left| \frac{1}{m}\sum_{i=1}^{m} \vec{x}_i^\T f(\vec{A})\vec{x}_i - \Call{Lan-QF}_k(f;\vec{A},\vec{x}_i) \right|^2.
\end{aligned} ∣ tr ( f ( A )) − SLQ k , m ( f ; A ) ∣ 2 ≤ 2 ∣ ∣ tr ( f ( A )) − m 1 i = 1 ∑ m x i T f ( A ) x i ∣ ∣ 2 + 2 ∣ ∣ m 1 i = 1 ∑ m x i T f ( A ) x i − Lan-QF k ( f ; A , x i ) ∣ ∣ 2 . Note that
E [ ∣ tr ( f ( A ) ) − 1 m ∑ i = 1 m x i T f ( A ) x i ∣ 2 ] = V [ tr ^ m ( f ( A ) ) ] = 2 ∥ f ( A ) ∥ F 2 m , \EE\left[ \left| \tr(f(\vec{A})) - \frac{1}{m}\sum_{i=1}^{m} \vec{x}_i^\T f(\vec{A})\vec{x}_i \right|^2 \right]
=
\VV\left[ \widehat{\tr}_m(f(\vec{A})) \right]
= \frac{2 \| f(\vec{A}) \|_\F^2}{m}, E ⎣ ⎡ ∣ ∣ tr ( f ( A )) − m 1 i = 1 ∑ m x i T f ( A ) x i ∣ ∣ 2 ⎦ ⎤ = V [ tr m ( f ( A )) ] = m 2∥ f ( A ) ∥ F 2 , where tr ^ m ( ⋅ ) \widehat{\tr}_m(\cdot) tr m ( ⋅ ) is the Girard--Hutchinson estimator .
Next, by the triangle inequality and Theorem 6.3 ,
∣ 1 m ∑ i = 1 m x i T f ( A ) x i − Lan-QF k ( f ; A , x i ) ∣ ≤ 1 m ∑ i = 1 m ∣ x i T f ( A ) x i − Lan-QF k ( f ; A , x i ) ∣ ≤ 1 m ∑ i = 1 m ∥ x i ∥ 2 min deg ( p ) < 2 k − 1 max x ∈ [ λ n , λ 1 ] ∣ f ( x ) − p ( x ) ∣ . \begin{aligned}
\hspace{2em}&\hspace{-2em}
\left| \frac{1}{m}\sum_{i=1}^{m} \vec{x}_i^\T f(\vec{A})\vec{x}_i - \Call{Lan-QF}_k(f;\vec{A},\vec{x}_i) \right|
\\&\leq \frac{1}{m}\sum_{i=1}^{m} \left| \vec{x}_i^\T f(\vec{A})\vec{x}_i - \Call{Lan-QF}_k(f;\vec{A},\vec{x}_i) \right|
\\&\leq \frac{1}{m} \sum_{i=1}^{m} \|\vec{x}_i\|^2 \min_{\deg(p)<2k-1} \max_{x\in[\lambda_n,\lambda_1]} | f(x) - p(x) |.
\end{aligned} ∣ ∣ m 1 i = 1 ∑ m x i T f ( A ) x i − Lan-QF k ( f ; A , x i ) ∣ ∣ ≤ m 1 i = 1 ∑ m ∣ ∣ x i T f ( A ) x i − Lan-QF k ( f ; A , x i ) ∣ ∣ ≤ m 1 i = 1 ∑ m ∥ x i ∥ 2 d e g ( p ) < 2 k − 1 min x ∈ [ λ n , λ 1 ] max ∣ f ( x ) − p ( x ) ∣. Hence,
E [ ∣ 1 m ∑ i = 1 m x i T f ( A ) x i − Lan-QF k ( f ; A , x i ) ∣ 2 ] ≤ 1 m 2 E [ ( ∑ i = 1 m ∥ x i ∥ 2 ) 2 ] min deg ( p ) < 2 k − 1 max x ∈ [ λ n , λ 1 ] ∣ f ( x ) − p ( x ) ∣ \begin{aligned}
\hspace{2em}&\hspace{-2em}
\EE\left[ \left| \frac{1}{m}\sum_{i=1}^{m} \vec{x}_i^\T f(\vec{A})\vec{x}_i - \Call{Lan-QF}_k(f;\vec{A},\vec{x}_i) \right|^2 \right]
\\&\leq \frac{1}{m^2}\EE\left[ \left(\sum_{i=1}^{m} \|\vec{x}_i\|^2 \right)^2 \right] \min_{\deg(p)<2k-1} \max_{x\in[\lambda_n,\lambda_1]} | f(x) - p(x) |
\end{aligned} E ⎣ ⎡ ∣ ∣ m 1 i = 1 ∑ m x i T f ( A ) x i − Lan-QF k ( f ; A , x i ) ∣ ∣ 2 ⎦ ⎤ ≤ m 2 1 E ⎣ ⎡ ( i = 1 ∑ m ∥ x i ∥ 2 ) 2 ⎦ ⎤ d e g ( p ) < 2 k − 1 min x ∈ [ λ n , λ 1 ] max ∣ f ( x ) − p ( x ) ∣ Now, note that ∑ i = 1 m ∥ x i ∥ 2 \sum_{i=1}^{m} \|\vec{x}_i\|^2 ∑ i = 1 m ∥ x i ∥ 2 is a Chi-squared random variable with m n mn mn degrees of freedom.
By looking on Wikipedia , we see that
E [ ( ∑ i = 1 m ∥ x i ∥ 2 ) 2 ] = 2 m n + ( m n ) 2 ≤ 3 ( m n ) 2 . \EE\left[ \left(\sum_{i=1}^{m} \|\vec{x}_i\|^2 \right)^2 \right] = 2mn + (mn)^2 \leq 3(mn)^2. E ⎣ ⎡ ( i = 1 ∑ m ∥ x i ∥ 2 ) 2 ⎦ ⎤ = 2 mn + ( mn ) 2 ≤ 3 ( mn ) 2 . Combining all of the above gives the result.