Find The Shortest Substring Whose Replacement Makes The String Contain Equal Number Of Each Character
Solution 1:
The minimal substring can be found in O(N)
time and O(N)
space.
First count a frequency fr[i]
of each character from the input of length n
.
Now, the most important thing to realise is that the necessary and sufficient condition for a substring to be considered minimal, it must contain each excessive character with a frequency of at least fr[i] - n/4
. Otherwise, it won't be possible to replace the missing characters. So, our task is to go through each such substring and pick the one with the minimal length.
But how to find all of them efficiently?
At start, minLength
is n
. We introduce 2
pointer indices - left
and right
(initially 0
) that define a substring from left
to right
in the original string str
. Then, we increment right
until the frequency of each excessive character in str[left:right]
is at least fr[i] - n/4
. But it's not all yet since str[left : right]
may contain unnecessary chars to the left (for example, they're not excessive and so can be removed). So, we increment left
as long as str[left : right]
still contains enough excessive elements. When we're finished we update minLength
if it's larger than right - left
. We repeat the procedure until right >= n
.
Let's consider an example. Let GAAAAAAA
be the input string. Then, the algorithm steps are as below:
1.Count frequencies of each character:
['G'] = 1, ['A'] = 6, ['T'] = 0, ['C'] = 0 ('A'is excessive here)
2.Now iterate through the original string:
Step#1: |G|AAAAAAAsubstr = 'G' - no excessive chars (left = 0, right = 0)
Step#2: |GA|AAAAAAsubstr = 'GA' - 1 excessive char, we need 5 (left = 0, right = 1)
Step#3: |GAA|AAAAAsubstr = 'GAA' - 2 excessive chars, we need 5 (left = 0, right = 2)
Step#4: |GAAA|AAAAsubstr = 'GAAA' - 3 excessive chars, we need 5 (left = 0, right = 3)
Step#5: |GAAAA|AAAsubstr = 'GAAAA' - 4 excessive chars, we need 5 (left = 0, right = 4)
Step#6: |GAAAAA|AAsubstr = 'GAAAAA' - 5 excessive chars, nice but can we remove something from left? 'G' is not excessive anyways. (left = 0, right = 5)
Step#7: G|AAAAA|AAsubstr = 'AAAAA' - 5 excessive chars, wow, it's smaller now. minLength = 5 (left = 1, right = 5)
Step#8: G|AAAAAA|A
substr = 'AAAAAA' - 6 excessive chars, nice, but can we reduce the substr? There's a redundant 'A'(left = 1, right = 6)
Step#9: GA|AAAAA|Asubstr = 'AAAAA' - 5 excessive chars, nice, minLen = 5 (left = 2, right = 6)
Step#10: GA|AAAAAA|substr = 'AAAAAA' - 6 excessive chars, nice, but can we reduce the substr? There's a redundant 'A'(left = 2, right = 7)
Step#11: GAA|AAAAA|
substr = 'AAAAA' - 5 excessive chars, nice, minLen = 5 (left = 3, right = 7)
Step#12: That's it as right >= 8
Or the full code below:
from collections import Counter
n =int(input())
gene = raw_input()
char_counts = Counter()
for i inrange(n):
char_counts[gene[i]] +=1
n_by_4 = n /4
min_length = n
left=0right=0
substring_counts = Counter()
while right< n:
substring_counts[gene[right]] +=1right+=1
has_enough_excessive_chars =Truefor ch in "ACTG":
diff = char_counts[ch] - n_by_4
# the char cannot be used to replace other items
if (diff >0) and (substring_counts[ch] < diff):
has_enough_excessive_chars =False
break
if has_enough_excessive_chars:
while left<rightand substring_counts[gene[left]] > (char_counts[gene[left]] - n_by_4):
substring_counts[gene[left]] -=1left+=1
min_length =min(min_length, right-left)
print (min_length)
Solution 2:
Here's one solution with limited testing done. This should give you some ideas on how to improve your code.
from collections import Counter
import sys
import math
n =int(input())
s1 = input()
s = Counter(s1)
if all(e <= n/4for e in s.values()):
print(0)
sys.exit(0)
result= math.inf
out=0for mnum inrange(n):
s[s1[mnum]] -=1
while all(e <= n/4for e in s.values()) andout<= mnum:
result=min(result, mnum -out+1)
s[s1[out]] +=1out+=1
print(result)
Post a Comment for "Find The Shortest Substring Whose Replacement Makes The String Contain Equal Number Of Each Character"