Problem Statement in English
You’re given a binary string s consisting of only ‘0’s and ‘1’s. You can perform at most one swap operation on any substring, where you can choose two indices i and j (0-indexed) and swap the characters at those indices.
Return the length of the longest balanced substring that can be obtained after performing at most one swap operation. A balanced substring is a substring that contains an equal number of ‘0’s and ‘1’s.
Approach
There’s a little observation we need to make here. When we make a change, we’re going to either increase the number of ‘1’s by 1 and decrease the number of ‘0’s by 1, or vice versa.
That’s actually 2 changes per swap.
Thus, we can use prefix sums to keep track of the difference between the number of ‘1’s and ‘0’s at each index. But how do we do that?
Bookkeeping. We count each ‘1’ as $+1$ and each ‘0’ as $-1$. This way, the prefix sum at any index will give us the difference between the number of ‘1’s and ‘0’s up to that index.
Now to calculate the number of ‘1’s and ‘0’s in a substring, we can use the prefix sums we computed. The length of the substring is length = i - l, where i is the current index and l is the index of the first occurrence of the target prefix sum. The difference in the number of ‘1’s and ‘0’s in that substring is diff = current_sum - target. We’ll get to what the target prefix sum is in a moment.
To calculate the number of ‘1’s in the substring, we can use the formula ones = (length + diff) // 2, since the number of ‘1’s is equal to the number of ‘0’s plus the difference. The number of ‘0’s in the substring can be calculated as zeros = length - ones. Here’s how we derive that:
- Let
onesbe the number of ‘1’s in the substring andzerosbe the number of ‘0’s in the substring. - We know that
ones + zeros = length(the total length of the substring). - We also know that
ones - zeros = diff(the difference in the number of ‘1’s and ‘0’s in the substring). - Adding these two equations gives us
2 * ones = (length + diff). - Therefore,
ones = (length + diff) // 2, andzeros = length - ones.
What is the target prefix sum? It can be the following:
current_sum: This means we are trying to see if the current substring can be such that it is already balanced without any swaps.current_sum - 2: This means we are trying to see if the current substring can be such that it has 2 more ‘1’s than ‘0’s, and we can swap a ‘1’ with a ‘0’ to balance it - because removing a $1$ would subtract 1 from the sum, and adding a $0$ would add 1 to the sum.current_sum + 2: This means we are trying to see if the current substring can be such that it has 2 more ‘0’s than ‘1’s, and we can swap a ‘0’ with a ‘1’ to balance it - because removing a $0$ would subtract 1 from the sum, and adding a $1$ would add 1 to the sum.
Now we need to check if the substring can be balanced with at most one swap. This means that either the substring is already balanced (diff == 0), or we can swap a ‘1’ with a ‘0’ to balance it (diff == 2 and there is an extra ‘0’ to swap for a ‘1’ outside the current substring), or diff == -2 (and there is an extra ‘1’ to swap for a ‘0’ outside the current substring).
To check if there are enough ‘0’s or ‘1’s to swap, we can keep track of the total number of ‘0’s and ‘1’s in the string and compare it with the number of ‘0’s and ‘1’s in the current substring. If we have more ‘0’s in the string than in the substring, we can swap a ‘0’ with a ‘1’ to balance it. Similarly, if we have more ‘1’s in the string than in the substring, we can swap a ‘1’ with a ‘0’ to balance it.
There’s one last thing that we need to take care of. Keeping track of target prefix sums. We only need to store the first two occurrences of each prefix sum in the first dictionary. This is because we are only interested in the longest substring that can be formed with at most one swap, and if we have already found a substring with the same prefix sum, we don’t need to consider it again. Why 2 occurrences? Because the first occurence gives us the longest and is ideal, but it may not have enough ‘0’s or ‘1’s to swap outside the current substring, so we also consider the second occurence which makes our resultant substring a little shorter but may have enough ‘0’s or ‘1’s to swap, since we have some extra characters outside the current substring, both on the left and right of the current substring.
And we are done!
Solution in Python
class Solution:
def longestBalanced(self, s: str) -> int:
n = len(s)
total0 = s.count('0')
total1 = n - total0
first = {0: [-1]}
curr_sum = ans = 0
for i in range(n):
curr_sum += 1 if s[i] == '1' else -1
for target in (curr_sum, curr_sum-2, curr_sum+2):
if target not in first:
continue
for l in first[target]:
length = i - l
diff = curr_sum - target
ones = (length + diff) // 2
zeros = length - ones
if (diff == 0 or
(diff == 2 and total0 > zeros) or
(diff == -2 and total1 > ones)):
ans = max(ans, length)
break
if curr_sum not in first:
first[curr_sum] = []
if len(first[curr_sum]) < 2:
first[curr_sum].append(i)
return ans
Complexity
Time: $O(n)$
Since we are iterating through the string once and performing constant time operations for each character, the time complexity is $O(n)$.Space: $O(n)$
Since we are storing the first two occurrences of each prefix sum in thefirstdictionary, the space complexity is $O(n)$ in the worst case when all prefix sums are unique.
Mistakes I Made
I had to look this one up :(
And we are done.