
Splitting Pandas Dataframe Into Multiple Dataframes Based On Condition In Column

To prep my data correctly for an ML task, I need to be able to split my original dataframe into multiple smaller dataframes. I want to get all the rows above and including the row w

Solution 1:

You can use np.split, which accepts an array of indices at which to split:

np.split(df, *np.where(df.BOOL == 1))

If you want to include the rows with BOOL == 1 in the previous data frame, add 1 to each index:

np.split(df, np.where(df.BOOL == 1)[0] + 1)
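
For reference, the same "cut after each BOOL == 1 row" idea can be sketched with plain positional slicing, which avoids passing a DataFrame to NumPy (recent pandas versions may warn when np.split is applied to a DataFrame; check this against your version). The sample frame below is hypothetical, modeled on the output shown in the other answers. Unlike np.split, this sketch drops an empty trailing chunk.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame (an assumption, not the asker's real data).
df = pd.DataFrame({
    "BOOL": [0, 1, 0, 1, 0],
    "USER_ID": ["001"] * 5,
    "VALUE": [1, 2, 3, 4, 5],
})

# Positions just past each BOOL == 1 row, so the flagged row closes its chunk.
cuts = np.flatnonzero(df["BOOL"].to_numpy() == 1) + 1

# Slice by position between consecutive boundaries; skip empty chunks.
bounds = [0, *cuts, len(df)]
chunks = [df.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:]) if b > a]
```

Here chunks holds disjoint pieces of df, each ending at a BOOL == 1 row, plus whatever rows are left over after the last one.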

Solution 2:

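I think a loop is better here, written as a dict comprehension:

# Series.nonzero() was removed in pandas 1.0, so go through the NumPy array.
idx = df.BOOL.to_numpy().nonzero()[0]

# Map each match number to all rows up to and including that match.
d = {x: df.iloc[:y + 1, :] for x, y in enumerate(idx)}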
d[0]
   BOOL USER_ID  VALUE
0     0     001      1
1     1     001      2

Solution 3:

Why not a list comprehension? For example:

>>> l=[df.iloc[:i+1] for i in df.index[df['BOOL']==1]]
>>> l[0]
   BOOL USER_ID  VALUE
0     0     001      1
1     1     001      2
>>> l[1]
   BOOL USER_ID  VALUE
0     0     001      1
1     1     001      2
2     0     001      3
3     1     001      4
>>> 
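
One caveat with the comprehension above: it feeds index labels to iloc, which only lines up when the frame has the default 0..n-1 index. A minimal sketch that works from integer positions instead might look like this; the sample frame and its index values are hypothetical, chosen only to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with a non-default index (an assumption for illustration).
df = pd.DataFrame(
    {"BOOL": [0, 1, 0, 1], "USER_ID": ["001"] * 4, "VALUE": [1, 2, 3, 4]},
    index=[10, 20, 30, 40],
)

# Integer positions (not index labels) of the rows where BOOL == 1.
positions = np.flatnonzero(df["BOOL"].to_numpy() == 1)

# One cumulative prefix per flagged row: everything up to and including it.
prefixes = [df.iloc[: p + 1] for p in positions]
```

With a default index this gives the same frames as the list comprehension above; with any other index it still slices by row position.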
