Counting The Number Of CGG In Microsatellites
My task is to find the number of repeats of CGG in a sequence that is stored as a value in a dictionary (named 'dict' below as an example). The number of repeats in a row should b
Solution 1:
This can be done by testing whether n*"CGG" is present in the string with .index() and decreasing the integer n step by step. For example, in a string of length 20, you first test whether 6*"CGG" is present. If it is, you record it and build the substring without that 6*"CGG"; then you try again with 5*"CGG", and so on.
The function below follows this logic and can detect more than one tandem of the same length in the string:
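def tandem_search(pattern, string):
    st = string
    result = []
    # Start from the longest run that could fit and stop at 6 repeats.
    for i in range(len(string) // len(pattern), 5, -1):
        while True:
            try:
                j = st.index(i * pattern)
            except ValueError:
                # No run of i repeats is left, so try a shorter one.
                break
            result.append(i)
            # Cut the run out so it is not counted again as a shorter run.
            st = st[:j] + st[j + i * len(pattern):]
    return result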
With it, I get the following results:
tandem_search("CGG",dic['ind_1']) = [47]
tandem_search("CGG",dic['ind_10']) = [70]
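If you want to cross-check those numbers, a regular expression is one alternative. This is only a sketch and not the function above: the name tandem_runs, the min_repeats parameter and the toy sequences are made up for illustration. It reports the runs in the order they appear in the sequence, and it never joins the two sides of a removed run.

```python
import re

def tandem_runs(pattern, sequence, min_repeats=6):
    """Return the repeat count of every run of at least min_repeats copies."""
    # Builds e.g. (?:CGG){6,} : six or more consecutive copies of the pattern.
    run = re.compile(r"(?:%s){%d,}" % (re.escape(pattern), min_repeats))
    return [len(m.group(0)) // len(pattern) for m in run.finditer(sequence)]

# Made-up sequences, not the data from the question.
toy = {
    "toy_1": "AT" + "CGG" * 7 + "TTA" + "CGG" * 6 + "G",
    "toy_2": "CGG" * 5 + "A",
}
counts = {name: tandem_runs("CGG", seq) for name, seq in toy.items()}
print(counts)  # {'toy_1': [7, 6], 'toy_2': []}
```

toy_2 gives an empty list because its single run has only 5 repeats, which is below the minimum of 6 used in the function above.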