Median of distribution with memory constraint


I want to approximate the median of a given distribution $ D$ that I can sample from.

A simple algorithm for this, using $ n$ samples, is:

samples = [D.sample() for i in range(n)] # generate n samples from D sort(samples) return samples[n/2] 

However, I am looking for an algorithm that requires less than $ O(n)$ space.


I have looked into these algorithms:

  • Median of medians: Needs $ O(n)$ space, so it does not work for me.
  • Randomized median: It seems like this could be easily generalized to an algorithm that uses $ O(n^{3/4})$ space.

Are there any other algorithms that use less then $ O(n)$ space that could solve my problem? In particular, I was thinking there may be an algorithm that uses $ O(m)$ space by generating batches of samples from $ D$ of size $ m$


  • Ideally, I am looking for a reference to an algorithm that also includes analysis (success probability, expected runtime, etc).
  • Actually, I need an algorithm to estimate $ D$ ‘s $ p$ -th percentile for a given $ p$ , but I am hoping most median-finding algorithms can be generalized to that.