We would like the algorithms we use to produce exactly the right answer all of the time. However, in many cases, this is not practically possible, as an algorithm would have to brute-force through all possible solutions to find the right one. Randomized algorithms are a class of algorithms that use randomness as a tool to help find good (but not necessarily exact) solutions quickly.
Here’s a toy example: Suppose we want to find the best restaurant in NYC. To do this, we’d have to try every single restaurant in NYC, an impossible task. However, if instead we’re just asked to find a restaurant in the top 10% of all restaurants, we could simply try a random selection of a small number of restaurants (e.g. 20), and report the best one we found.
Let’s try to analyze this problem mathematically. Suppose there are $n$ restaurants (in real life, $n$ is in the tens of thousands). Each restaurant $i$ has a quality score $q_i$ which is unknown until we visit it (for simplicity, we’ll assume that these scores are unique and deterministic).
How can we find a restaurant in the top 10% of all restaurants?
Here’s a natural algorithm: We’ll commit to visiting $k$ restaurants that we will pick completely randomly (i.e. by putting the names of all $n$ restaurants in a giant hat and drawing $k$ of them). Then, we will visit each of these restaurants and return the one with the highest score. The big question is: how large does $k$ have to be to guarantee that we will find a restaurant in the top 10% of all restaurants?
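To make the procedure concrete, here is a minimal Python sketch (the quality scores below are synthetic stand-ins, since the real scores $q_i$ are unknown until we visit):

```python
import random

def best_of_random_sample(scores, k):
    """Draw k restaurants uniformly at random (the 'giant hat'),
    visit each one, and return the index of the best one found."""
    sampled = random.sample(range(len(scores)), k)   # draw k names from the hat
    return max(sampled, key=lambda i: scores[i])     # visit them; keep the best

# Toy run with n = 1000 restaurants and k = 20 visits.
n, k = 1000, 20
scores = [random.random() for _ in range(n)]   # hidden quality scores
best = best_of_random_sample(scores, k)
print(f"best sampled score: {scores[best]:.3f}")
```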
Unless $k$ is more than 90% of $n$, there is a chance that we will not find a restaurant in the top 10%: we might get exceptionally unlucky and draw only restaurants in the bottom 90%. However, because we’ve chosen the restaurants randomly, we know this is very unlikely!
Let’s call the set of restaurants in the bottom 90% of all restaurants the “bad restaurants”. And to simplify the analysis, let’s assume that when we draw a restaurant from the hat, we return it to the hat before drawing the next one (so we might draw the same restaurant multiple times). Each draw then independently lands on a bad restaurant with probability $0.9$, so the probability that all $k$ draws are bad (and the algorithm fails) is exactly $0.9^k$.
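A quick Monte Carlo simulation confirms this (a sanity-check sketch; the choice of $n$ and the trial count are arbitrary): the empirical failure rate closely tracks $0.9^k$.

```python
import random

def failure_rate(n, k, trials=100_000):
    """Estimate the probability that all k draws (with replacement)
    land among the bottom 90% of n restaurants."""
    bad = int(0.9 * n)   # indices 0..bad-1 are the 'bad' restaurants
    fails = sum(
        all(random.randrange(n) < bad for _ in range(k))
        for _ in range(trials)
    )
    return fails / trials

n = 1000
for k in (5, 10, 20):
    print(k, failure_rate(n, k), 0.9 ** k)   # empirical vs. exact
```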
We can now choose $k$ to get the desired probability of success. For instance, suppose we want this procedure to succeed with probability at least 95%. Then we can set $0.9^k \le 0.05$ (or equivalently $(10/9)^k \ge 20$). Solving this equation (by taking the logarithm of both sides) we see that we need $k \ge \log(20)/\log(10/9) \approx 28.4$, so $k = 29$ suffices to guarantee that we will find a restaurant in the top 10% of all restaurants with probability at least 95%.
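As a quick arithmetic check (a short Python snippet; nothing here beyond the formula above):

```python
import math

# Smallest k with 0.9**k <= 0.05, i.e. k >= log(20) / log(10/9).
k = math.ceil(math.log(20) / math.log(10 / 9))
print(k, 0.9 ** k)   # 29, ~0.047 <= 0.05, so 29 draws suffice
```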
What if we want to find a restaurant in the top $\epsilon$ fraction of all restaurants with probability at least $1 - \delta$? Then we can set $(1 - \epsilon)^k \le \delta$. Solving for $k$ we see that we need $k \ge \frac{\ln(1/\delta)}{-\ln(1 - \epsilon)} \approx \frac{\ln(1/\delta)}{\epsilon}$, where we have used that $-\ln(1 - \epsilon) \approx \epsilon$ for small $\epsilon$.
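This bound is easy to package as a small helper (a sketch; the name `sample_size` is introduced here for illustration, and it uses the exact logarithm rather than the small-$\epsilon$ approximation):

```python
import math

def sample_size(eps, delta):
    """Smallest k with (1 - eps)**k <= delta: enough random draws to
    land in the top-eps fraction with probability at least 1 - delta."""
    return math.ceil(math.log(1 / delta) / -math.log(1 - eps))

print(sample_size(0.10, 0.05))    # 29  -- top 10%, 95% success
print(sample_size(0.10, 0.001))   # 66  -- top 10%, 99.9% success
print(sample_size(0.01, 0.05))    # 299 -- top 1%,  95% success
```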
So we see how the cost of the algorithm depends on the desired accuracy $\epsilon$ and the desired failure probability $\delta$. Interestingly, the dependence on $\delta$ is very mild (only logarithmic). For instance, if in our example we wanted to succeed at least 99.9% of the time, we would only need to increase $k$ from 29 to 66. On the other hand, the dependence on the accuracy $\epsilon$ is much worse: finding a restaurant in the top 1% of all restaurants requires visiting roughly 10 times as many restaurants as finding one in the top 10%.
A deterministic algorithm is one that does not rely on randomness at all. For instance, we might order all of the restaurants alphabetically, and then visit the first $k$ of these restaurants. However, it is very hard to guarantee that such an approach will work. For instance, what if all of the good restaurants happen to be later in the alphabet? While our intuition tells us this is unlikely, formalizing this intuition is very difficult even in this simple example, and may not even be possible in other examples.
On the other hand, the use of randomness allowed us to provide strong guarantees about the performance of our very simple algorithm.