Problem Statement in English
You’re supposed to find the median of a data stream in constant time. Your data structure should support the following operations:
addNum(int num): Add an integernumfrom the data stream to the data structure.findMedian(): Return the median of all elements so far.
Approach
We can solve this problem using 2 heaps.
The idea is that we maintain a sorted view, except that we can only view the greatest element from the left half and the smallest element from the right half at any time.
One heap will be a max heap and the other heap will be a min heap.
The max heap will store the left half of the data stream and the min heap will store the right half of the data stream.
From the max heap, we can get the maximum element of the left half of the data stream. From the min heap, we can get the minimum element of the right half of the data stream.
The median will be the average of the maximum element of the left half of the data stream and the minimum element of the right half of the data stream, in case of even number of elements.
In case of odd number of elements, the median will be the maximum element of the left half of the data stream. Since we ensure that the left half will always have 1 more element than the right half in case of odd number of elements.
Solution in Python
from heapq import heappush, heappop
class MedianFinder:
def __init__(self):
self.maxheap, self.minheap = [], []
def addNum(self, num: int) -> None:
# Add to max heap
heappush(self.maxheap, -num)
# Balance max heap to min heap
# this ensures that when we have an odd number of elements,
# the min heap has 1 more element than the max heap.
#
# In the next step we balance out the state in case this was a bad addition
heappush(self.minheap, -heappop(self.maxheap))
# if the min heap has more than 1 + half the elements,
# return one element to max heap
if len(self.minheap) > len(self.maxheap) + 1:
heappush(self.maxheap, -heappop(self.minheap))
def findMedian(self) -> float:
# If both heaps are of equal length,
# return the average of the two middle elements
if len(self.maxheap) == len(self.minheap):
return (-self.maxheap[0] + self.minheap[0]) / 2
# else return the middle element from the min heap
# since we ensure that the min heap has 1 more element
# than the max heap in case of odd number of elements
return self.minheap[0]
Complexity
Time: $O(1)$
We have constant time complexity for thefindMedianoperation, and $O(log(n))$ for theaddNumoperation.Space: $O(n)$
Since we are storing all the elements in the heaps, the space complexity is $O(n)$.
Mistakes I Made
I had to look this one up :(
And we are done.