Problem Statement in English
You’re given a 0-indexed integer array nums of length n and an integer target.
You must return the number of subarrays of nums that have target as a majority element.
A majority element in a subarray is an element that appears more than half the time in that subarray.
Approach
The first order of business is how to represent the given array so that it becomes useful for us when it comes to counting the number of subarrays with the majority element.
The way to go here is bookkeeping. Everytime we see the target element, we represent it as +1 and every other element as -1. This way, we can keep track of the number of times the target element has appeared in the subarray and the number of times other elements have appeared.
This means that if we maintain a running sum of the array, we can check if the target element is a majority element in the subarray by checking if the running sum is greater than 0. Stop and think about that for a moment.
In a standard prefix sum array, the sum of the subarray nums[i...j] is given by:
$$\text{SubarraySum} = \text{pref}[j] - \text{pref}[i-1]$$
For target to be the majority element in nums[i...j], the net sum of our +1/-1 values for that subarray must be strictly greater than 0:
$$\text{pref}[j] - \text{pref}[i-1] > 0$$
If we move $\text{pref}[i-1]$ to the other side of the inequality, it becomes: $$\text{pref}[j] > \text{pref}[i-1]$$
This means that every single time we find a previous prefix sum ($\text{pref}[i-1]$) that is less than our current prefix sum ($\text{pref}[j]$), it guarantees that the subarray starting right after it is a valid majority subarray.
This means that if we keep track of the number of times a prefix sum that is less than our current prefix sum has appeared, we can add that to our answer!
Thus we need to find a way to keep track of the number of times a prefix sum has appeared, and the number of times it is less than our current prefix sum. This can be done using a frequency array.
Notice that if we have a subarray in which target never appears, the running sum will be negative. This means that we need to keep track of negative prefix sums as well. The way to do this is to offset the prefix sum by n (the length of the array). This way, we can use a frequency array of size 2*n + 1 to keep track of all possible prefix sums.
So when we start out, we can safely make the first prefix sum equal to n (the offset) and increment the frequency of that prefix sum by 1. This is because the empty subarray is a valid subarray.
As we iterate through the bookkeeping array 2 things can happen:
The current element is the target element. In this case, we increment the running sum by 1 and add the number of times the current prefix sum has appeared to our answer. We then increment the frequency of our current prefix sum by 1.
This is because when we increment the current prefix sum it guarantees that the previous prefix sum was lesser than it.
The current element is not the target element. In this case, we decrement the running sum by 1 and subtract the number of times the current prefix sum has appeared from our answer. We then increment the frequency of our current prefix sum by 1.
This is because the current prefix sum is no longer lesser than the previous prefix sum, so we need to remove it from our running sum.
At the end of the iteration, we will have the total number of subarrays in which the target element is a majority element - less.
We also need to update the frequency of the current prefix sum by 1, so that it can be used in future iterations.
And we’re done!
Solution in Python
class Solution:
def countMajoritySubarrays(self, nums: List[int], target: int) -> int:
n = len(nums)
currSum = n
freq = [0] * (2 * n + 1)
freq[n] = 1
less = 0
ans = 0
for num in nums:
if num == target:
less += freq[currSum]
currSum += 1
else:
currSum -= 1
less -= freq[currSum]
freq[currSum] += 1
ans += less
return ans
Complexity
Time: $O(n)$
Since we iterate over all elements in the array once.Space: $O(n)$
Since we use a frequency array of size2*n + 1to keep track of all possible prefix sums.
Mistakes I Made
I had to look this up :(
And we are done.