
This chapter focuses on the (tall) linear regression problem.
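For reference, the problem we have in mind is the standard tall least-squares formulation (a minimal statement, assuming the notation $\vec{A} \in \mathbb{R}^{n \times d}$ and $\vec{b} \in \mathbb{R}^{n}$ used elsewhere in these notes):

$$
\min_{\vec{x} \in \mathbb{R}^{d}} \|\vec{A}\vec{x} - \vec{b}\|_2,
\qquad \vec{A} \in \mathbb{R}^{n \times d}, \quad n \gg d.
$$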

Direct Methods

Classical direct solvers for Linear Regression, such as the LAPACK routine underlying `np.linalg.lstsq`, work in two stages. First, $\vec{A}$ is factorized (e.g., via a QR factorization). Second, the factorization is used to solve the linear system. When implemented properly, such algorithms are numerically stable and require $O(nd^2)$ operations. However, the dominant cost of this approach is computing a matrix factorization which, as we have seen in our discussion of the cost of linear algebra, does not have a particularly high flop-rate. Moreover, the total number of flops is $O(nd^2)$, which might be too expensive when $n \gg d \gg 1$.
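As a rough illustration of the two-stage approach, the sketch below solves a tall least-squares problem with a reduced QR factorization and compares the result to `np.linalg.lstsq`. The problem sizes and random data are made up for illustration, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 50                      # tall problem: n >> d
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Stage 1: factorize A (reduced QR), costing O(n d^2) flops.
Q, R = np.linalg.qr(A, mode="reduced")

# Stage 2: use the factorization to solve the small d-by-d system
#   R x = Q^T b, which yields the least-squares solution.
x_qr = np.linalg.solve(R, Q.T @ b)

# Reference solution from LAPACK's least-squares driver.
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]

print(np.linalg.norm(x_qr - x_ref) / np.linalg.norm(x_ref))
```

The factorization in the first stage dominates the runtime; the second stage touches only the small $d \times d$ triangular factor.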

Iterative Methods

Iterative methods such as LSQR begin with an initial guess $\vec{x}_0$ and iteratively produce a sequence of approximate solutions $\vec{x}_1, \ldots, \vec{x}_t$, where ideally $\vec{x}_k$ is close to the solution of Linear Regression. At each iteration, such methods perform a matrix-vector product with $\vec{A}$ and $\vec{A}^\T$, in addition to some vector operations. Thus, the matrix-vector products are typically the dominant cost, requiring $O(\nnz(\vec{A})) \leq O(nd)$ operations per iteration. While iterative methods are able to take advantage of sparsity in $\vec{A}$, they may require many iterations to converge when $\vec{A}$ is ill-conditioned.
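To make the per-iteration cost concrete, here is a small sketch using SciPy's LSQR on a sparse tall matrix. The matrix sizes, density, and tolerances are illustrative choices, not values from the text.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
n, d = 100_000, 200
A = sp.random(n, d, density=0.01, format="csr", random_state=rng)  # sparse, tall
b = rng.standard_normal(n)

# Each LSQR iteration applies A and A^T once, so the per-iteration cost
# is O(nnz(A)) plus some O(n + d) vector work.
result = lsqr(A, b, atol=1e-10, btol=1e-10, iter_lim=1000)
x, istop, itn = result[0], result[1], result[2]

print(f"stopped with flag {istop} after {itn} iterations")
print("residual norm:", np.linalg.norm(A @ x - b))
```

The iteration count (and hence the total cost) depends on the conditioning of $\vec{A}$: well-conditioned problems converge in few iterations, while ill-conditioned ones may need many.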

Further Reading
