Multi Level Pandas Groupby

I need to maintain the position of 'each scrip per team per account', so I think pandas groupby might be used. I have tried to state my problem in the comments of the dataframe as well. The posi

Solution 1:

Your problem can be solved easily in two steps:

First Step:

import math
df['some_stuff'] = df.apply(lambda x: -x.qty if math.isnan(x.buy_price) else x.qty,axis=1)

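This line creates a new column, some_stuff, that holds a signed quantity: rows where buy_price is NaN get -qty, and all other rows keep qty as it is. I did this to introduce some logic of gain and loss into your data.

If you don't want a new column but like the idea, overwrite the qty column instead (and then use 'qty' in place of 'some_stuff' in the groupby that follows):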

df['qty'] = df.apply(lambda x: -x.qty if math.isnan(x.buy_price) else x.qty,axis=1)
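A row-wise apply works, but on larger frames a vectorized version is usually faster. Here is a minimal sketch of the same sign flip using Series.where; the three-row frame is made-up sample data, not the asker's, and signed is just an illustrative name.

```python
import math

import numpy as np
import pandas as pd

# Made-up sample data: a NaN buy_price marks the rows whose qty should go negative.
df = pd.DataFrame({
    "buy_price": [50.0, np.nan, 10.0],
    "sell_price": [np.nan, 50.0, np.nan],
    "qty": [2, 1, 3],
})

# Keep qty where buy_price is present, otherwise use -qty.
signed = df["qty"].where(df["buy_price"].notna(), -df["qty"])

# Same result as the row-wise apply from the answer.
via_apply = df.apply(lambda x: -x.qty if math.isnan(x.buy_price) else x.qty, axis=1)

print(signed.tolist())
```

Both versions give the same signed quantities; the where form just avoids calling a Python function once per row.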

Next, I use this new column to build your position column as a running total within each team/account/scrip group:

df['position'] = df.groupby(['team','account','scrip'])['some_stuff'].cumsum()

which generates this column:

position
       2
       2
       2
       0
       1
      -1
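To try the first step end to end, here is a small self-contained sketch. The question's dataframe is not shown here, so the six trades below are invented for illustration and will not reproduce the exact column above; only the column names and the two lines of logic come from the answer.

```python
import math

import numpy as np
import pandas as pd

# Invented trades (assumption): NaN buy_price means the row reduces the position.
df = pd.DataFrame({
    "team": ["team1", "team1", "team1", "team1", "team2", "team2"],
    "account": ["A1", "A1", "A2", "A2", "A3", "A3"],
    "scrip": ["FUT1"] * 6,
    "buy_price": [50.0, np.nan, 100.0, 100.0, 10.0, np.nan],
    "sell_price": [np.nan, 50.0, np.nan, np.nan, np.nan, 10.0],
    "qty": [2, 1, 1, 1, 1, 2],
})

# Signed quantity: negative when buy_price is missing, positive otherwise.
df["some_stuff"] = df.apply(
    lambda x: -x.qty if math.isnan(x.buy_price) else x.qty, axis=1
)

# Running total of the signed quantity within each (team, account, scrip) group.
df["position"] = df.groupby(["team", "account", "scrip"])["some_stuff"].cumsum()

print(df[["team", "account", "scrip", "some_stuff", "position"]])
```

The cumulative sum restarts for every distinct team/account/scrip combination, so each row shows the position of its own group up to that trade.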

Bonus:

If you want to delete the extra column some_stuff once position has been computed, just use:

del df['some_stuff']

Second Step:

This is the step where you get your final grouped table. Group by the same three keys and take the minimum of each column within every group:

print(df.groupby(['team', 'account', 'scrip']).min())

Final output:

                               time  buy_price  sell_price  qty  position
team  account scrip                                                      
team1 A1      FUT1   06/07/17 09:36       50.0        50.0    1         1
      A2      FUT1   06/07/17 09:46      100.0         NaN    2         2
team2 A3      FUT1   06/07/17 09:56       10.0        10.0    1        -1

I believe this answers your questions.
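One thing to keep in mind: .min() returns the smallest value of each column in a group, which is not necessarily the most recent position. If the latest position per group is what you need, a possible alternative (my suggestion, not part of the answer above) is .last(), assuming the rows are already in chronological order. The data below is made up to show a case where the two differ.

```python
import pandas as pd

# Made-up signed quantities (assumption): rows are in chronological order.
df = pd.DataFrame({
    "team": ["team1", "team1", "team2", "team2"],
    "account": ["A1", "A1", "A3", "A3"],
    "scrip": ["FUT1"] * 4,
    "some_stuff": [-1, 2, 1, -2],
})

keys = ["team", "account", "scrip"]
df["position"] = df.groupby(keys)["some_stuff"].cumsum()

# Smallest position ever reached in each group.
lowest = df.groupby(keys)["position"].min()

# Position after the last row of each group.
latest = df.groupby(keys)["position"].last()

print(lowest)
print(latest)
```

For team1/A1 the running position goes from -1 to 1, so the minimum is -1 while the latest value is 1.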

Documentation:

pandas.DataFrame.apply

pandas.DataFrame.groupby

pandas.DataFrame.cumsum

pandas.DataFrame.min


Solution 2:

Is this what you're looking for?

df.groupby(['team', 'account', 'scrip']).min()

It gives me:

                      time  buy_price  sell_price  qty  position
team  account scrip                                             
team1 A1      FUT1   09:36       50.0        50.0    1         1
      A2      FUT1   09:46      100.0         NaN    2         2
team2 A3      FUT1   09:56       10.0        10.0    1        -1

That's a few more columns than you wanted, but you can select just the ones you're looking for.

(By default, groupby moves the grouping columns into a multilevel index. If that isn't what you want, pass as_index=False to .groupby() to keep them as regular columns.)
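To make the difference concrete, here is a short sketch on a made-up frame (the values are illustrative only) comparing the default MultiIndex result with as_index=False, followed by a column subset.

```python
import pandas as pd

# Made-up data with the same column names as in the question.
df = pd.DataFrame({
    "team": ["team1", "team1", "team2"],
    "account": ["A1", "A1", "A3"],
    "scrip": ["FUT1", "FUT1", "FUT1"],
    "qty": [1, 2, 1],
    "position": [1, 3, -1],
})

keys = ["team", "account", "scrip"]

# Default: the grouping keys become a three-level index.
indexed = df.groupby(keys).min()

# as_index=False: the grouping keys stay as ordinary columns.
flat = df.groupby(keys, as_index=False).min()

# Keep only the columns of interest.
subset = flat[["team", "account", "scrip", "position"]]

print(indexed)
print(subset)
```

Calling indexed.reset_index() would also turn the index levels back into columns if you have already grouped with the default setting.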

