
Pivot Duplicates Rows Into New Columns Pandas

I have a data frame like this, and I'm trying to reshape it with pivot from Pandas so that I keep some values from the original rows while making the duplicates r

Solution 1:

Use cumcount to number the rows within each group, build a MultiIndex with set_index, reshape it with unstack, and finally flatten the column names:

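# Number the repeated rows within each (ID, Agent, OV) group: 1, 2, 3, ...
g = df.groupby(["ID", "Agent", "OV"]).cumcount().add(1)
# Add the counter as an index level and pivot it into columns, filling gaps with 0
df = df.set_index(["ID", "Agent", "OV", g]).unstack(fill_value=0).sort_index(axis=1, level=1)
# Flatten the (name, counter) column pairs into names such as Zone1
df.columns = ["{}{}".format(a, b) for a, b in df.columns]

df = df.reset_index()
print(df)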
   ID  Agent    OV Zone1  Value1  PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
0   1   10.0  26.0    M1      10   100     0       0     0     0       0     0
1   2   26.5   8.0    M2      50    95    M1       6     5     0       0     0
2   3    4.5   6.0    M3       4    40    M4       6    60     0       0     0
3   4    1.2   0.8    M1       8   100     0       0     0     0       0     0
4   5    2.0   0.4    M1       6    10    M2      41    86    M4       2     4
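The question's input data frame is not shown in this post, so the following is only a minimal sketch of the same cumcount/set_index/unstack steps on a hypothetical input reconstructed from the printed result above. The names keys, wide and ordered are illustrative, and the columns are ordered explicitly here rather than through sort_index, so the layout does not depend on how a given pandas version sorts the column levels.

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed result above;
# the original question's data frame is not part of this post.
df = pd.DataFrame({
    "ID":    [1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Agent": [10.0, 26.5, 26.5, 4.5, 4.5, 1.2, 2.0, 2.0, 2.0],
    "OV":    [26.0, 8.0, 8.0, 6.0, 6.0, 0.8, 0.4, 0.4, 0.4],
    # object dtype so the 0 placeholder can sit next to the strings
    "Zone":  pd.Series(["M1", "M2", "M1", "M3", "M4", "M1", "M1", "M2", "M4"],
                       dtype=object),
    "Value": [10, 50, 6, 4, 6, 8, 6, 41, 2],
    "PTC":   [100, 95, 5, 40, 60, 100, 10, 86, 4],
})

keys = ["ID", "Agent", "OV"]

# Number the duplicates inside each key group: 1, 2, 3, ...
g = df.groupby(keys).cumcount().add(1)

# Append the counter as an extra index level, then pivot it into columns.
wide = df.set_index(keys + [g]).unstack(fill_value=0)

# Flatten the (name, counter) column pairs into names such as "Zone1".
wide.columns = [f"{name}{n}" for name, n in wide.columns]

# Order the columns explicitly: Zone1, Value1, PTC1, Zone2, ...
ordered = [f"{name}{n}"
           for n in range(1, int(g.max()) + 1)
           for name in ["Zone", "Value", "PTC"]]
wide = wide[ordered].reset_index()
print(wide)
```

The counter g is what makes each (ID, Agent, OV, counter) combination unique, which unstack needs in order to reshape without raising a duplicate-entries error.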

If you want to replace missing values with 0 only in the numeric columns:

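import pandas as pd

g = df.groupby(["ID", "Agent"]).cumcount().add(1)
# No fill_value here, so missing cells stay NaN after unstack
df = df.set_index(["ID", "Agent", "OV", g]).unstack().sort_index(axis=1, level=1)

# Fill and cast only the numeric top-level columns
idx = pd.IndexSlice
df.loc[:, idx[['Value', 'PTC']]] = df.loc[:, idx[['Value', 'PTC']]].fillna(0).astype(int)
df.columns = ["{}{}".format(a, b) for a, b in df.columns]

# The remaining gaps are in the Zone columns; show them as empty strings
df = df.fillna('').reset_index()
print(df)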
   ID  Agent    OV Zone1  Value1  PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
0   1   10.0  26.0    M1      10   100             0     0             0     0
1   2   26.5   8.0    M2      50    95    M1       6     5             0     0
2   3    4.5   6.0    M3       4    40    M4       6    60             0     0
3   4    1.2   0.8    M1       8   100             0     0             0     0
4   5    2.0   0.4    M1       6    10    M2      41    86    M4       2     4
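Assigning the cast values back through .loc can leave the columns as floats on some pandas versions, so here is a hedged alternative sketch of the same idea: fill the numeric and the text columns separately and concatenate the pieces. The input is again a hypothetical frame reconstructed from the printed output, and the names numbers and zones are illustrative only.

```python
import pandas as pd

# Hypothetical input, reconstructed from the printed result above.
df = pd.DataFrame({
    "ID":    [1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Agent": [10.0, 26.5, 26.5, 4.5, 4.5, 1.2, 2.0, 2.0, 2.0],
    "OV":    [26.0, 8.0, 8.0, 6.0, 6.0, 0.8, 0.4, 0.4, 0.4],
    "Zone":  ["M1", "M2", "M1", "M3", "M4", "M1", "M1", "M2", "M4"],
    "Value": [10, 50, 6, 4, 6, 8, 6, 41, 2],
    "PTC":   [100, 95, 5, 40, 60, 100, 10, 86, 4],
})

keys = ["ID", "Agent", "OV"]
g = df.groupby(keys).cumcount().add(1)

# Without fill_value, the missing cells come back as missing values.
wide = df.set_index(keys + [g]).unstack()

# Treat each kind of column separately, then glue the pieces back together.
numbers = wide[["Value", "PTC"]].fillna(0).astype(int)  # 0 for missing numbers
zones = wide[["Zone"]].fillna("")                       # blank for missing zones
wide = pd.concat([zones, numbers], axis=1)

# Flatten the (name, counter) pairs; columns come out grouped by name here.
wide.columns = [f"{name}{n}" for name, n in wide.columns]
wide = wide.reset_index()
print(wide.dtypes)
```

Building new frames instead of assigning in place makes the integer dtype of the Value and PTC columns explicit, at the cost of a different column order (all Zone columns first), which can be rearranged afterwards if needed.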

Solution 2:

You can use cumcount to create a helper key, then unstack on it and flatten the resulting MultiIndex columns. (PS: you can add fillna(0) at the end; I did not, because I do not think 0 is a correct value for Zone.)

df['New'] = df.groupby(['ID', 'Agent', 'OV']).cumcount() + 1
new_df = df.set_index(['ID', 'Agent', 'OV', 'New']).unstack('New').sort_index(axis=1, level=1)
new_df.columns = new_df.columns.map('{0[0]}{0[1]}'.format)
new_df
Out[40]: 
              Zone1  Value1   PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
ID Agent OV                                                               
1  10.0  26.0    M1    10.0  100.0  None     NaN   NaN  None     NaN   NaN
2  26.5  8.0     M2    50.0   95.0    M1     6.0   5.0  None     NaN   NaN
3  4.5   6.0     M3     4.0   40.0    M4     6.0  60.0  None     NaN   NaN
4  1.2   0.8     M1     8.0  100.0  None     NaN   NaN  None     NaN   NaN
5  2.0   0.4     M1     6.0   10.0    M2    41.0  86.0    M4     2.0   4.0
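The flattening step in this solution relies on MultiIndex.map calling the given function once per column tuple. A small sketch on a hypothetical two-level column index (cols and flat are illustrative names) shows what the format string does in isolation:

```python
import pandas as pd

# Hypothetical two-level columns, similar in shape to what unstack produces.
cols = pd.MultiIndex.from_product([["Zone", "Value"], [1, 2]])

# Each element passed to the function is a tuple such as ("Zone", 1);
# "{0[0]}{0[1]}" joins its first and second items into "Zone1".
flat = cols.map("{0[0]}{0[1]}".format)
print(list(flat))
```

Because the mapped values are plain strings rather than tuples, the result is an ordinary single-level Index, which is why the flattened names can be assigned straight back to new_df.columns.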
