Find The Start Position Of The Longest Sequence Of 1's

May 31, 2023

I want to find the start position of the longest sequence of 1's in my array:

a1 = [0,0,1,1,1,1,0,0,1,1]
# 2

I am following this answer to find the length of the longest sequence. Ho

Solution 1:

Inspired by this solution, here's a vectorized approach to solve it -

import numpy as np

# a1 must be a NumPy array, e.g. a1 = np.array([0,0,1,1,1,1,0,0,1,1])
# Get start, stop index pairs for islands/seq. of 1s
idx_pairs = np.where(np.diff(np.hstack(([False],a1==1,[False]))))[0].reshape(-1,2)

# Get the island lengths, whose argmax would give us the ID of longest island.
# Start index of that island would be the desired output
start_longest_seq = idx_pairs[np.diff(idx_pairs,axis=1).argmax(),0]

Sample run -

In [89]: a1 # Input array
Out[89]: array([0, 0, 1, 1, 1, 1, 0, 0, 1, 1])

In [90]: idx_pairs # Start, stop+1 index pairs
Out[90]:
array([[ 2,  6],
       [ 8, 10]])

In [91]: np.diff(idx_pairs,axis=1) # Island lengths
Out[91]:
array([[4],
       [2]])

In [92]: np.diff(idx_pairs,axis=1).argmax() # Longest island ID
Out[92]: 0

In [93]: idx_pairs[np.diff(idx_pairs,axis=1).argmax(),0] # Longest island start
Out[93]: 2
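For reuse, the same steps can be wrapped in a small function. This is only a minimal sketch: the name longest_run_start and the choice to return -1 when the array contains no 1s are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def longest_run_start(a):
    """Return the start index of the longest run of 1s, or -1 if there is none."""
    a = np.asarray(a)
    # Pad with False on both sides so runs touching either edge still
    # produce both a rising and a falling edge.
    padded = np.hstack(([False], a == 1, [False]))
    # Indices where the padded mask changes value, paired as (start, stop).
    idx_pairs = np.where(np.diff(padded))[0].reshape(-1, 2)
    if idx_pairs.shape[0] == 0:
        return -1  # assumption: sentinel for "no 1s present"
    lengths = idx_pairs[:, 1] - idx_pairs[:, 0]
    # argmax returns the first maximum, so ties go to the earliest run.
    return int(idx_pairs[lengths.argmax(), 0])

print(longest_run_start([0, 0, 1, 1, 1, 1, 0, 0, 1, 1]))  # 2
```

Without the guard, argmax would raise on an empty set of islands, which is why the all-zero case is handled separately here.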

Solution 2:

This seems to work. It uses groupby from itertools and goes through the list only once:

from itertools import groupby

pos, max_len, cum_pos = 0, 0, 0
for k, g in groupby(a1):
    if k == 1:
        pat_size = len(list(g))
        pos, max_len = (pos, max_len) if pat_size < max_len else (cum_pos, pat_size)
        cum_pos += pat_size
    else:
        cum_pos += len(list(g))

pos
# 2
max_len
# 4
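The same single pass can be packaged as a function. The sketch below is one possible arrangement, not the answer's own code: the name longest_run, the target parameter, and the (-1, 0) result for input without any matching run are assumptions. It also uses a strict comparison, so on a tie it keeps the earliest run, whereas the loop above moves to the later one.

```python
from itertools import groupby

def longest_run(seq, target=1):
    """Return (start, length) of the longest run of `target`; (-1, 0) if none."""
    best_start, best_len = -1, 0
    position = 0  # index of the first element of the current group
    for value, group in groupby(seq):
        size = sum(1 for _ in group)  # count without building a list
        # Strict comparison keeps the earliest run when lengths tie.
        if value == target and size > best_len:
            best_start, best_len = position, size
        position += size
    return best_start, best_len

print(longest_run([0, 0, 1, 1, 1, 1, 0, 0, 1, 1]))  # (2, 4)
```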

Solution 3:

A more compact one-liner using groupby(). It uses enumerate() on the raw data to keep the starting positions through the analysis pipeline, eventually ending up with the list of tuples [(2, 4), (8, 2)], each tuple containing the starting position and length of a non-zero run:

from itertools import groupby

L = [0,0,1,1,1,1,0,0,1,1]

print(max(((lambda y: (y[0][0], len(y)))(list(g)) for k, g in groupby(enumerate(L), lambda x: x[1]) if k), key=lambda z: z[1])[0])

lambda x: is the key function for groupby() since we enumerated L

lambda y: packages up the results we need, since we can only evaluate g once without saving it

lambda z: is the key function for max() to pull out the lengths

Prints '2' as expected.
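Unrolling the one-liner makes the intermediate list of (start, length) tuples visible. This is a sketch of the same idea with named steps; the variable names and the default=(-1, 0) fallback for input with no non-zero runs are assumptions added for illustration.

```python
from itertools import groupby

L = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1]

runs = []
# Group (index, value) pairs by value so each group keeps its original indices.
for is_nonzero, group in groupby(enumerate(L), key=lambda pair: pair[1]):
    if is_nonzero:
        items = list(group)  # a groupby group can only be consumed once
        runs.append((items[0][0], len(items)))

print(runs)  # [(2, 4), (8, 2)]

# Pick the tuple with the largest length; default is an assumed fallback.
start, length = max(runs, key=lambda run: run[1], default=(-1, 0))
print(start)  # 2
```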

Solution 4:

You could use a for loop and check whether the next m items (where m is the maximum run length) are all 1s:

# Using your list and the answer from the post you referred
from itertools import groupby
L = [0,0,1,1,1,1,0,0,1,1]
# Count only runs of 1s, so a longer run of 0s cannot inflate m
m = max(sum(1 for i in g) for k, g in groupby(L) if k == 1)
# Here is the for loop
for i, s in enumerate(L):
    # Stop once fewer than m items remain
    if i + m > len(L):
        break
    if s == 1 and 0 not in L[i:i+m]:
        print(i)
        break

This will give:

2
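One detail worth checking in this approach is that m measures runs of 1s only, since a longer run of 0s would otherwise set the window length. The following sketch illustrates the difference on a hypothetical input; the names data, m_any, and m_ones, and the -1 fallback, are assumptions for the example.

```python
from itertools import groupby

data = [0, 0, 0, 1, 1]  # hypothetical input where the zeros form the longest run

# Longest run of any value, zeros included.
m_any = max(sum(1 for _ in g) for k, g in groupby(data))
# Longest run of 1s only; default=0 covers input with no 1s.
m_ones = max((sum(1 for _ in g) for k, g in groupby(data) if k == 1), default=0)

print(m_any, m_ones)  # 3 2

# With the correct window length, the first all-ones window marks the start.
start = next(
    (i for i in range(len(data) - m_ones + 1)
     if m_ones and all(v == 1 for v in data[i:i + m_ones])),
    -1,  # assumed fallback when there are no 1s
)
print(start)  # 3
```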

Solution 5:

Another way of doing it in a single loop, but without resorting to itertools' groupby.

max_start = 0
max_reps = 0
start = 0
reps = 0
for (pos, val) in enumerate(a1):
    start = pos if reps == 0 else start
    reps = reps + 1 if val == 1 else 0
    max_reps = max(reps, max_reps)
    max_start = start if reps == max_reps else max_start

This could also be done in a one-liner fashion using reduce:

max_start = reduce(lambda (max_start, max_reps, start, reps), (pos, val): (start if reps == max(reps, max_reps) else max_start, max(reps, max_reps), pos if reps == 0 else start, reps + 1 if val == 1 else 0), enumerate(a1), (0, 0, 0, 0))[0]

In Python 3, you cannot unpack tuples inside the lambda arguments definition, so it's preferable to define the function using def first:

from functools import reduce  # reduce is not a builtin in Python 3

def func(acc, x):
    max_start, max_reps, start, reps = acc
    pos, val = x
    return (start if reps == max(reps, max_reps) else max_start,
            max(reps, max_reps),
            pos if reps == 0 else start,
            reps + 1 if val == 1 else 0)

max_start = reduce(func, enumerate(a1), (0, 0, 0, 0))[0]

In any of the three cases, max_start gives your answer (i.e. 2).
