Reservoir Sampling
Algorithm to randomly choose k samples from an INFINITE list or STREAM of data, or a list of size N, where N is UNKNOWN or large enough that the list DOESN'T FIT into main memory.

Customer Support: admin@thealgorists.com

Feedback:
We are listening to your every feedback,
and taking action to constantly improve your learning experience.
If you have any feedback, please use this form:
https://thealgorists.com/Feedback.
 If you are a student, email to admin@thealgorists.com to get special student discount.
Reservoir sampling is a family of randomized algorithms for choosing a simple random sample, without replacement, of k items from a population of unknown size n in a single pass over the items. The size of the population n is not known to the algorithm and is typically too large for all n items to fit into main memory.
Algorithm to randomly select k samples from n elements:
 Take an array reservoir[] of length k and initialize it with the first k elements from the given array input[] of size n.

for index i := k to (n  1) repeat:

Generate a random number. Let's say the random number is m.
If m > k : do nothing
else: swap reservoir[m] with input[i].

Generate a random number. Let's say the random number is m.
 return reservoir array.
This is a Premium content. Please subscribe to access the code.
After subscribing please come back and refresh this page.
We could also implement the algorithm recursively as shown below.
Suppose we have an algorithm that can pull a random set of k elements from an array of size (n  1). How can you use this algorithm to pull a random set of k elements from an array of size n ?
We can first pull a random set of size k, say reservoir[0...(k  1)], from the first (n  1) elements. Then, we just need to decide if array[n] should be inserted into our subset (which would require pulling out a random element from it, to keep the size k). An easy way to do this is to pick a random number m from 0 through n. If m < k, then replace reservoir[m] with array[n].
This is a Premium content. Please subscribe to access the code.
After subscribing please come back and refresh this page.
Related Chapter:
The above content is written by:
Abhishek Dey
A Visionary Software Engineer With A Mission To Empower Every Person & Every Organization On The Planet To Achieve More
Microsoft  University of Florida
If you have any feedback, please use this form: https://thealgorists.com/Feedback.
Follow Us On LinkedIn