How To Split A Dataframe Based On A Column-Names Row
I have an Excel file that loads into a dataframe with 20 rows. After a few rows the column-names row appears again, and I want to split the dataframe at each column-names row. Here is an example: x 0 1 2 3 4
Solution 1:
Assuming your column is named col and the header text x also appears as a value in the data (including as the very first row, as it would if the file were read without a header), you can group the dataframe on a cumulative sum of where col equals x, using df['col'].eq('x').cumsum(). Then, for each group, create a dataframe that takes its columns from the first row of the group and its values from the second row onward using df.iloc[], and save the results in a dictionary:
d={f'df{i}':pd.DataFrame(g.iloc[1:].values,columns=g.iloc[0].values)
for i,g in df.groupby(df['col'].eq('x').cumsum())}
print(d['df1'])
x
0 0
1 1
2 2
3 3
4 4
print(d['df2'])
x
0 23
1 34
2 5
3 6
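For reference, here is a self-contained sketch of the same approach that can be run as-is. The sample frame is an assumption made for illustration: a single column named col in which the header text x shows up both as the first row and again part-way down.

```python
import pandas as pd

# Hypothetical sample data: the header text 'x' is an ordinary value here,
# appearing at the top and again in the middle of the column.
df = pd.DataFrame({'col': ['x', 0, 1, 2, 3, 4, 'x', 23, 34, 5, 6]})

# Every occurrence of 'x' starts a new block, so the running count of
# matches labels the blocks 1, 2, ...
block_id = df['col'].eq('x').cumsum()

# For each block: the first row supplies the column name, the rest the values.
d = {
    f'df{i}': pd.DataFrame(g.iloc[1:].values, columns=g.iloc[0].values)
    for i, g in df.groupby(block_id)
}

print(list(d))
print(d['df2'])
```

Because the keys are built from the cumulative sum, this works unchanged for any number of repeated header rows. Note that the resulting columns keep the object dtype of the original mixed column.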
Solution 2:
Use df.index[df['x'] == 'x'] to find the row index where the column name appears again. Then split the dataframe in two at that index:
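import pandas as pd
df = pd.DataFrame(columns=['x'], data=[[0], [1], [2], [3], [4], ['x'], [23], [34], [5], [6]])
idx = df.index[df['x'] == 'x'][0]  # row of the repeated header (assumes the default integer index)
df1 = df.iloc[:idx]       # rows before the repeated header
df2 = df.iloc[idx + 1:]   # rows after the repeated header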
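If the column name can repeat more than once, the two-way split can be generalized. The following is only a sketch under that assumption: the names is_header and parts are made up for illustration, and the call to infer_objects() is an optional extra that tries to recover a numeric dtype once the header strings are gone.

```python
import pandas as pd

df = pd.DataFrame(columns=['x'],
                  data=[[0], [1], [2], [3], [4], ['x'], [23], [34], [5], [6]])

# True on rows that merely repeat the column name.
is_header = df['x'] == 'x'

# Rows between two header rows share the same running count of headers seen,
# so grouping the non-header rows by that count yields one frame per block.
parts = [
    g.reset_index(drop=True).infer_objects()
    for _, g in df[~is_header].groupby(is_header.cumsum()[~is_header])
]

for part in parts:
    print(part)
```

With the sample data above, parts holds two dataframes, matching df1 and df2 from the two-way split.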