As noted in the introduction to this chapter, the classical direct approach to solving the Linear Regression problem is to first compute a factorization (e.g. QR, SVD) of $A$ in $O(nd^2)$ operations, and then use the factorization to solve the problem in $O(nd)$ operations. The factorization step is typically the computational bottleneck, but the randomized factorization algorithms discussed in the previous chapter can be applied in a black-box manner to speed it up.
Recall, however, that the factor $R$ produced by the randomized Cholesky QR algorithm is computed as $R = CR_1$, where $C$ is the Cholesky factor of $Q_1^\mathsf{T}Q_1$ and $Q_1$ is the approximately orthogonal basis produced by the sketched QR algorithm. Since we only need to compute the solution $x = R^{-1}Q^\mathsf{T}b$, but not $Q$ or $R$ themselves, we can avoid forming these products explicitly, which results in the implementation below.[1]
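To see where this comes from, recall that the randomized Cholesky QR factors are $Q = Q_1C^{-1}$ and $R = CR_1$, with $C$ upper triangular and $Q_1^\mathsf{T}Q_1 = C^\mathsf{T}C$. Substituting into the usual formula for the least squares solution gives

$$
x = R^{-1}Q^\mathsf{T}b
= (CR_1)^{-1}(Q_1C^{-1})^\mathsf{T}b
= R_1^{-1}C^{-1}C^{-\mathsf{T}}Q_1^\mathsf{T}b
= R_1^{-1}(Q_1^\mathsf{T}Q_1)^{-1}Q_1^\mathsf{T}b.
$$

So $x$ can be obtained from $Q_1$ and $R_1$ alone: first solve the positive definite system $(Q_1^\mathsf{T}Q_1)y = Q_1^\mathsf{T}b$, then the triangular system $R_1x = y$. Since $Q_1$ is approximately orthonormal, the Gram matrix $Q_1^\mathsf{T}Q_1$ is well-conditioned, so this is numerically safe even though it resembles the normal equations.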
We can easily implement these algorithms using NumPy and SciPy:
def randomized_cholesky_QR_regression_naive(A,b,k,zeta,rng):
    # form Q and R explicitly, then solve R x = Q^T b
    Q,R = randomized_cholesky_QR(A,k,zeta,rng)
    x = sp.linalg.solve_triangular(R,Q.T@b,lower=False)
    return x

def randomized_cholesky_QR_regression(A,b,k,zeta,rng):
    # only the sketched QR factors Q1, R1 are needed
    Q1,R1 = sketched_qr(A,k,zeta,rng)
    # solve the well-conditioned positive definite system (Q1^T Q1) y = Q1^T b
    y = sp.linalg.solve(Q1.T@Q1,Q1.T@b,assume_a='pos')
    # back-substitute with the sketched triangular factor: R1 x = y
    x = sp.linalg.solve_triangular(R1,y,lower=False)
    return x
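The helper routines `sketched_qr` and `randomized_cholesky_QR` are the ones developed in the previous chapter. For context only, here is a rough, self-contained sketch of what they might look like, assuming a sparse sign embedding of sketch size `k` with `zeta` nonzeros per column (the previous chapter's actual implementations, and the exact meaning of `zeta` there, may differ; the sketch matrix is built densely here purely for clarity):

import numpy as np
import scipy as sp
import scipy.linalg  # ensure sp.linalg is available

def sketched_qr(A, k, zeta, rng):
    # Hypothetical stand-in: sketch A with a k x n sparse sign embedding
    # (zeta nonzeros per column), QR-factor the sketch, and return the
    # sketch-orthogonal basis Q1 = A R1^{-1} together with R1.
    n, d = A.shape
    S = np.zeros((k, n))
    for j in range(n):
        rows = rng.choice(k, size=zeta, replace=False)
        S[rows, j] = rng.choice([-1.0, 1.0], size=zeta) / np.sqrt(zeta)
    _, R1 = np.linalg.qr(S @ A)
    Q1 = sp.linalg.solve_triangular(R1, A.T, lower=False, trans='T').T  # A R1^{-1}
    return Q1, R1

def randomized_cholesky_QR(A, k, zeta, rng):
    # Hypothetical stand-in: re-orthonormalize Q1 using the Cholesky factor C
    # of its Gram matrix, giving Q = Q1 C^{-1} and R = C R1.
    Q1, R1 = sketched_qr(A, k, zeta, rng)
    C = np.linalg.cholesky(Q1.T @ Q1).T  # upper triangular, Q1^T Q1 = C^T C
    Q = sp.linalg.solve_triangular(C, Q1.T, lower=False, trans='T').T  # Q1 C^{-1}
    R = C @ R1
    return Q, R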
Numerical Example
Let’s compare the performance of the three approaches described above on a least squares problem where $\|b\|$ and the residual norm $\|b - Ax_{\text{true}}\|$ are prescribed.
# Generate a random matrix with controlled condition number
n = 5000
d = 300
U,s,Vt = np.linalg.svd(np.random.rand(n,d),full_matrices=False)
s = np.geomspace(1e-4,1,d) # define singular values
A = U@np.diag(s)@Vt
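# Optional sanity check (not part of the original example): A was built from a
# thin SVD with prescribed singular values, so its 2-norm condition number
# should be close to s.max()/s.min() = 1e4.
print(np.linalg.cond(A))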
# define ||b|| and ||b-Ax*||
b_norm = 1
residual_norm = .1
# construct right hand side
v = np.random.randn(n)
v_span = U@(U.T@v)
v_perp = v-v_span
v_span /= np.linalg.norm(v_span)
v_perp /= np.linalg.norm(v_perp)
b = v_span*np.sqrt(b_norm**2-residual_norm**2)+v_perp*residual_norm
# construct true solution
x_true = Vt.T@np.diag(1/s)@U.T@b
k = int(1.5*d)
zeta = 4
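# Optional sanity checks (not part of the original example): by construction b
# has unit norm, the optimal residual has norm 0.1, and x_true satisfies the
# normal equations A^T (b - A x_true) = 0 (up to rounding).
print(np.linalg.norm(b))                   # approximately b_norm = 1
print(np.linalg.norm(b - A@x_true))        # approximately residual_norm = 0.1
print(np.linalg.norm(A.T@(b - A@x_true)))  # approximately 0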
# Define regression factorization methods
regression_methods = [
    {'name': 'Numpy',
     'func': lambda: np.linalg.lstsq(A, b, rcond=None)[0]},
    {'name': 'Rand. Cholesky QR (naive)',
     'func': lambda: randomized_cholesky_QR_regression_naive(A,b,k,zeta,rng)},
    {'name': 'Rand. Cholesky QR',
     'func': lambda: randomized_cholesky_QR_regression(A,b,k,zeta,rng)},
]
rng = np.random.default_rng(0)
# Time the regression methods
n_repeat = 20 # Number of repetitions for averaging
results = []
for regression_method in regression_methods:
    method_name = regression_method['name']
    # Time the method
    start = time.time()
    for _ in range(n_repeat):
        x = regression_method['func']()
    end = time.time()
    avg_time = (end - start) / n_repeat
    # Compute accuracy metrics
    results.append({
        'method': method_name,
        'time (s)': avg_time,
        'error': np.linalg.norm(x_true - x)/np.linalg.norm(x_true),
    })
# Create DataFrame and compute relative performance
results_df = pd.DataFrame(results)
results_df['speedup'] = results_df['time (s)'].max() / results_df['time (s)']
# Display results with formatting
results_df.reindex(columns=['method','time (s)','speedup','error']).style.format({
    'time (s)': '{:.4f}',
    'speedup': '{:.1f}x',
    'error': '{:1.1e}',
})
The results are as expected. All methods achieve similar accuracy, but the randomized methods are faster because they offload most of the computation to matrix-matrix multiplication. The optimized variant is slightly faster than the naive randomized Cholesky QR-based method.
[1] This implementation was communicated to me by Ethan Epperly. A similar implementation appears in Ipsen (2025).
- Ipsen, I. C. F. (2025). Solution of Least Squares Problems with Randomized Preconditioned Normal Equations. https://arxiv.org/abs/2507.18466