
Pandas Dataframe Merge

I have a concatenated pandas dataframe from 4 dataframes like this:

In [121]: all
Out[121]:
        E    H    N    S
102P    Y  NaN  NaN  NaN
103R    Y  NaN  NaN  NaN
102P  NaN

Solution 1:

Do a groupby on the index (I presume from the data you posted that the values 102P... are in the index) and count the values. count() tallies the non-NaN entries per label, so as long as no label carries a value twice in the same column, it returns a DataFrame of zeros and ones. Just replace them with the appropriate values.

>>> ndf = df.groupby(level=0).count()
>>> ndf[ndf == 1] = 'Y'
>>> ndf[ndf == 0] = np.nan
>>> ndf
         E    H  N  S
label                
102P     Y  NaN  Y  Y
103R     Y    Y  Y  Y
109F   NaN    Y  Y  Y

If a label can carry a value more than once in the same column, the counts can exceed 1, so just change the condition from ndf[ndf == 1] to ndf[ndf > 0].
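
For reference, here is a minimal self-contained sketch of the groupby approach. The question's data is truncated, so the input frames are reconstructed from the outputs shown in this answer: df3 and df4 are never displayed, so their contents are assumptions, and flag_frame is a hypothetical helper used only to build the sample data. Rather than assigning 'Y' into the integer count frame, the sketch masks a frame of 'Y' values with counts > 0, which also covers the repeated-label case.

```python
import numpy as np
import pandas as pd

COLUMNS = ["E", "H", "N", "S"]


def flag_frame(column, labels):
    """Hypothetical helper: 'Y' in one column, NaN everywhere else."""
    frame = pd.DataFrame(np.nan, index=labels, columns=COLUMNS, dtype=object)
    frame[column] = "Y"
    return frame


df1 = flag_frame("E", ["102P", "103R"])
df2 = flag_frame("N", ["102P", "103R", "109F"])
df3 = flag_frame("H", ["103R", "109F"])          # assumed, not shown in the post
df4 = flag_frame("S", ["102P", "103R", "109F"])  # assumed, not shown in the post

# The concatenated frame from the question: labels repeat in the index.
stacked = pd.concat([df1, df2, df3, df4])

# Count the non-NaN entries per label and column.
counts = stacked.groupby(level=0).count()

# Keep 'Y' wherever at least one value was present; everything else becomes NaN.
result = pd.DataFrame("Y", index=counts.index, columns=counts.columns).where(counts > 0)
print(result)
```

Building the result from a fresh frame avoids mixing strings into an integer frame, which some pandas versions may warn about.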

But why are you concatenating the data frames instead of combining them? combine_first fills the NaN entries of one frame with values from another, aligning on the index. Example:

>>> df1
      E   H   N   S
0                  
102P  Y NaN NaN NaN
103R  Y NaN NaN NaN
>>> df2
       E   H  N   S
0                  
102P NaN NaN  Y NaN
103R NaN NaN  Y NaN
109F NaN NaN  Y NaN

...
>>> from functools import reduce
>>> reduce(lambda first, second: first.combine_first(second),
           [df1, df2, df3, df4], pd.DataFrame())
        E    H  N  S
0                   
102P    Y  NaN  Y  Y
103R    Y    Y  Y  Y
109F  NaN    Y  Y  Y
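
The combine_first route can be sketched as a standalone script as well. As before, df3 and df4 are not shown in the post, so their contents are assumed, and flag_frame is a hypothetical helper for building the sample data. This sketch omits the empty pd.DataFrame() initializer and lets reduce start from the first frame; that is a simplification of the call above, not a required change.

```python
from functools import reduce

import numpy as np
import pandas as pd

COLUMNS = ["E", "H", "N", "S"]


def flag_frame(column, labels):
    """Hypothetical helper: 'Y' in one column, NaN everywhere else."""
    frame = pd.DataFrame(np.nan, index=labels, columns=COLUMNS, dtype=object)
    frame[column] = "Y"
    return frame


df1 = flag_frame("E", ["102P", "103R"])
df2 = flag_frame("N", ["102P", "103R", "109F"])
df3 = flag_frame("H", ["103R", "109F"])          # assumed, not shown in the post
df4 = flag_frame("S", ["102P", "103R", "109F"])  # assumed, not shown in the post

# Fold the frames together: each step fills the NaN cells of the running
# result with values from the next frame and takes the union of the labels.
combined = reduce(
    lambda first, second: first.combine_first(second),
    [df1, df2, df3, df4],
)
print(combined)
```

Because combine_first aligns on the index, labels that appear in only some frames (109F here) still end up in the result.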
