Introduction¶
The “median trick” (sometimes called “boosting”) is a clever way of turning a (scaler valued) randomized algorithm that succeeds with some constant probability (e.g. 2/3) into one that succeeds with arbitrarily high probability with only logarithmic overhead.
The idea is simple: suppose we have a randomized algorithm that produces an approximation to satisfying . We can boost the success probability by running the algorithm multiple times and taking the median of the results.
Run the algorithm times to get estimates
Let
Notably, the dependence on the target failure probability is only logarithmic.
The proof is simple and standard, so we will just provide a high level overview.
Proof
By definition of the median, half of the estimates are below and half are above. Thus, the only way for to be outside of is if at least half of the estimates are outside of this interval. However, since each live within this interval with probability at least , it is (exponentially) unlikely that half of them live outside of the interval.
High dimensions¶
What about in high dimensions? I.e. if we have a randomized algorithm that produces an approximation to satisfying ? While the median trick in 1d appears in almost all course notes for a class on randomized algorithms, finding a statement of a high-dimensional analog is surprisingly much more difficult.
The simplest approach is to apply the one-dimensional estimator along each dimension. The main issue with this is that being small is not necessarily the same thing as each coordinate being close (i.e. being small). So we will lose something in the accuracy from converting between norms. For instance, if is the standard Euclidan norm, then we will lose a factor of in the accuracy.
A better estimator¶
Here is a better approach that only loses a factor of 3 on the accuracy. I was introduced to this approach by Chris Musco, but I’d expect its a folklore type result.
Make independent copies of
For all , define
For all , define
Proof
Let . Since , it follows from a standard Chernoff Bound that for . Therefore, it suffices to show that if , then .
Suppose . By the triangle inequality, for each ,
Therefore, for each
and hence, [ c(i) = \operatorname{median}\big(d(i,1),\ldots, d(i,t)\big) \leq 2\varepsilon. ]
It follows that , and since , we have
But also . So, by the pigeonhole principle, there is at lest one index for which
Therefore, by the triangle inequality
as desired.