Skip to content Skip to sidebar Skip to footer

Pandas Number Of Consecutive Occurrences In Previous Rows

I have OHLC data. The candle can be either 'green' (if the close is above open) or 'red' (if the open is above the close). The format is: open close candletype 0 542 543 GREEN 1

Solution 1:

You can apply lambda functions to rolling windows. See Applying lambda function to a pandas rolling window series

You can either categorize them or map them on your own to some numbers:

df = pd.read_clipboard()
df['code'] = df.candletype.astype('category').cat.codes

This results in following DataFrame:

    open    close   candletype  code
0   542 543 GREEN   0
1   543 544 GREEN   0
2   544 545 GREEN   0
3   545 546 GREEN   0
4   546 547 GREEN   0
5   547 542 RED 1
6   542 543 GREEN   0

Now just apply df['code'].rolling(3).apply(lambda x: all(x==0)).shift(), resulting in 0

     NaN
1    NaN
2    NaN
3    1.0
4    1.0
5    1.0
6    0.0

and fill your nans and zeros as expected/wanted.

This neither is a oneliner, but maybe more pretty than the string comparison. Hope it helps you!


Solution 2:

Rolling window works over numbers rather than strings, so factorize and apply and use set to check of equality i.e

df['new'] = pd.Series(df['candletype'].factorize()[0]).rolling(window=4).apply(lambda x : set(x[:-1]) == {0})

df['new'].replace({1:'Consective 3 Green',0:'No Pattern'})

0                   NaN
1                   NaN
2                   NaN
3    Consective 3 Green
4    Consective 3 Green
5    Consective 3 Green
6            No Pattern
Name: new, dtype: object

Along side rolling apply you can play with zip too for this i.e

def get_list(x,m) : 
    x = zip(*(x[i:] for i in range(m)))
    return ['3 Greens' if set(i[:-1]) == {'GREEN'} else 'no pattern' for i in x]

df['new'] = pd.Series(get_list(df['candletype'],4), index=df.index[4 - 1:])

   open  close candletype         new
0   542    543      GREEN         NaN
1   543    544      GREEN         NaN
2   544    545      GREEN         NaN
3   545    546      GREEN    3 Greens
4   546    547      GREEN    3 Greens
5   547    542        RED    3 Greens
6   542    543      GREEN  no pattern

Solution 3:

This one-liner can count the number of consecutive occurences in your serie. However it is kind of tricky and therefore not so easy to read for other users or future you! It is very well explained in this post.

df = pd.read_clipboard()
df['pattern'] = df.groupby((df.candletype != df.candletype.shift()).cumsum()).cumcount()
df
>>>    open  close candletype  pattern
0   542    543      GREEN        0
1   543    544      GREEN        1
2   544    545      GREEN        2
3   545    546      GREEN        3
4   546    547      GREEN        4
5   547    542        RED        0
6   542    543      GREEN        0

This is not exactly the output you provided but here you have the exact count of consecutive values. You can then apply any cosmetic details to this series (replace values below your threshold by Toofewrows, etc.)


Post a Comment for "Pandas Number Of Consecutive Occurrences In Previous Rows"