C++: Find Median from Data Stream

Thought Process

To efficiently find the median of a stream of numbers, we need a dynamic data structure approach. We use two heaps: a max-heap for the lower half and a min-heap for the upper half. The key is to maintain balanced heaps where their sizes differ by at most 1. When adding a number, we first push it to one heap and then rebalance by moving the top element to the other heap. The median is either the top of the larger heap or the average of the top elements when sizes are equal.

class MedianFinder {
public:
    priority_queue<int> maxHeap;                            // First Heap
    priority_queue<int, vector<int>, greater<int>> minHeap; // Second Heap
    MedianFinder() {}

    void addNum(int num) {

        // DiffInSize can be 0 or 1
        int diffInSize = maxHeap.size() - minHeap.size();

        // Push element in first Heap i.e. maxHeap
        if (diffInSize == 0) {
            // In order to put right element, we first push it in second heap
            // & whatever minimum comes, will push it to first heap.
            minHeap.push(num);
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
        // Push element in second Heap i.e. minHeap
        else {
            // In order to put right element, we first push it in first heap
            // & whatever maximum comes, will push it to second heap.
            maxHeap.push(num);
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        }
    }

    double findMedian() {

        if (maxHeap.size() == minHeap.size()) {

            int a = maxHeap.top();
            int b = minHeap.top();

            return (a + b) / (double)2;
        } else
            return maxHeap.top();
    }
};

/**
 * Your MedianFinder object will be instantiated and called as such:
 * MedianFinder* obj = new MedianFinder();
 * obj->addNum(num);
 * double param_2 = obj->findMedian();
 */

Code Complexity

Time Complexity: O(log n) per insertion

Each insertion involves heap operations which take logarithmic time relative to the number of elements.

Space Complexity: O(n)

We store all elements across both heaps, requiring linear space proportional to the input size.

Code copied!