The main downside of classical QR algorithms is that they are sequential. As we have seen, matrix-matrix multiplication achieves a much higher flop rate than matrix factorization. We can use this to our advantage by computing a QR factorization via the Cholesky factorization of the Gram matrix $A^\top A$: if $A^\top A = R^\top R$ with $R$ upper triangular, then $Q = AR^{-1}$ gives a QR factorization $A = QR$. Such an approach is known as Cholesky QR.
This algorithm is easily implemented in NumPy and SciPy.
import numpy as np
import scipy as sp
import scipy.linalg  # ensure sp.linalg is available on older SciPy versions

def cholesky_QR(A):
    X = A.T @ A                    # Gram matrix X = A^T A
    R = np.linalg.cholesky(X).T    # upper-triangular factor with X = R^T R
    Q = sp.linalg.solve_triangular(R.T, A.T, lower=True).T  # Q = A R^{-1}
    return Q, R
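As a quick sanity check (a minimal sketch; the small matrix `B` here is purely illustrative), we can factor a well-conditioned matrix and confirm that we recover the pieces of a QR factorization:

# Sanity check on a small, well-conditioned matrix (illustrative only)
B = np.random.rand(100, 10)
Q, R = cholesky_QR(B)
print(np.allclose(B, Q @ R))             # reconstruction: B = QR
print(np.allclose(Q.T @ Q, np.eye(10)))  # orthonormal columns: Q^T Q = I
print(np.allclose(R, np.triu(R)))        # R is upper triangular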
The cost of Algorithm 3.1 is dominated by the matrix-matrix product in the first line, which costs $O(nd^2)$ operations for an $n \times d$ matrix $A$. Thus, we might hope that this algorithm runs faster in practice than standard QR factorization algorithms, since matrix-matrix multiplication has a very high flop rate. In addition, Algorithm 3.1 is mathematically exact; i.e., in exact arithmetic, it will produce a true QR factorization.
Proof
By construction, $R$ is upper triangular and $A = QR$. Since $R^\top R$ is the Cholesky factorization of $A^\top A$, we have that $A^\top A = R^\top R$. This means that
$$Q^\top Q = R^{-\top} A^\top A R^{-1} = R^{-\top} R^\top R R^{-1} = I,$$
so $Q$ is orthogonal.
Numerical Experiment
Let’s try to understand the performance of Cholesky QR relative to the default QR factorization method in NumPy. In addition to the runtime, we will compute the orthogonality and reconstruction errors
$$\|Q^\top Q - I\|_F \quad\text{and}\quad \|A - QR\|_F,$$
which are both zero for a perfect QR factorization.
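For concreteness, these two metrics can be packaged in a small helper (the name `qr_errors` is ours, purely for illustration; the experiment below computes the same quantities inline):

def qr_errors(A, Q, R):
    # Frobenius-norm orthogonality and reconstruction errors
    orth = np.linalg.norm(Q.T @ Q - np.eye(Q.shape[1]))
    recon = np.linalg.norm(A - Q @ R)
    return orth, recon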
# Generate a random matrix with controlled condition number
n = 5000
d = 300
U,s,Vt = np.linalg.svd(np.random.rand(n,d),full_matrices=False)
s = np.geomspace(1e-4, 1, d)  # prescribe singular values, so cond(A) = 1e4
A = U@np.diag(s)@Vt
# Define QR factorization methods
qr_methods = [
    {'name': 'Householder QR',
     'func': lambda: np.linalg.qr(A, mode='reduced')},
    {'name': 'Cholesky QR',
     'func': lambda: cholesky_QR(A)}
]
import time
import pandas as pd

# Time the QR factorization methods
n_repeat = 10  # number of repetitions for averaging

results = []
for qr_method in qr_methods:
    method_name = qr_method['name']

    # Time the method
    start = time.time()
    for _ in range(n_repeat):
        Q, R = qr_method['func']()
    end = time.time()
    avg_time = (end - start) / n_repeat

    # Compute accuracy metrics
    results.append({
        'method': method_name,
        'time (s)': avg_time,
        'orthogonality': np.linalg.norm(Q.T @ Q - np.eye(d)),
        'reconstruction': np.linalg.norm(A - Q @ R)
    })

# Create DataFrame and compute relative performance
results_df = pd.DataFrame(results)
results_df['speedup'] = results_df['time (s)'].max() / results_df['time (s)']

# Display results with formatting
results_df.reindex(columns=['method', 'time (s)', 'speedup', 'orthogonality', 'reconstruction']).style.format({
    'time (s)': '{:.4f}',
    'speedup': '{:.1f}x',
    'orthogonality': '{:1.1e}',
    'reconstruction': '{:1.1e}',
})
As expected, Cholesky QR is much faster than the standard Householder QR factorization.
Too good to be true?
While Cholesky QR is faster than a standard QR method, our numerical experiment reveals that the matrix $Q$ produced by Cholesky QR is much less orthogonal! The numerical instability can be traced back to the presence of the Gram matrix $A^\top A$. Indeed, $\kappa(A^\top A) = \kappa(A)^2$, so by forming the Gram matrix we end up squaring the condition number, making the conditioning way worse 💔.
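We can check this squaring numerically on the test matrix from the experiment above (an illustrative check, not part of the original experiment):

# Forming the Gram matrix squares the condition number
print(f"cond(A)     = {np.linalg.cond(A):.1e}")        # ~1e4 by construction
print(f"cond(A^T A) = {np.linalg.cond(A.T @ A):.1e}")  # ~1e8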
In the next section, we will explore how RandNLA can be used to produce a more accurate QR factorization while maintaining the efficiency of Cholesky QR.