
Pandas Dataframe Merging Rows To Remove Nan

I have a dataframe with some NaNs:

hostname  period   Teff
51 Peg    4.2293   5773
51 Peg    4.231    NaN
51 Peg    4.23077  NaN
55 Cnc    44.3787  NaN
55 Cnc    44.373   NaN
55 Cnc    44.4175  NaN
55

Solution 1:

Use groupby.first; it takes the first non-NA value of each column within each group:

df.groupby('hostname')[['period', 'Teff']].first().reset_index()
#   hostname   period  Teff
# 0      Cnc  44.3787  5234
# 1      Peg   4.2293  5773
# 2      Vir  38.0210  5577
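As a self-contained sketch of the same call, the snippet below builds a small frame and collapses it. The rows are hypothetical toy data loosely modeled on the question, not the asker's full table. Note that first() works column by column, so the values in one merged row may come from different original rows.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (assumption): only a few rows, for illustration
df = pd.DataFrame({
    "hostname": ["Peg", "Peg", "Peg", "Cnc", "Cnc"],
    "period": [4.2293, 4.231, 4.23077, 44.3787, 44.373],
    "Teff": [5773, np.nan, np.nan, np.nan, 5234],
})

# first() skips NaN independently in each column of each group
merged = df.groupby("hostname")[["period", "Teff"]].first().reset_index()
print(merged)
```

Here the Cnc row ends up with the period from its first row and the Teff from its second row, because each column is scanned separately.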

Or manually do this with a custom aggregation function:

df.groupby('hostname')[['period', 'Teff']].agg(lambda x: x.dropna().iat[0]).reset_index()

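This requires each group to have at least one non-NA value in every selected column; on an all-NA column, dropna() returns an empty result and .iat[0] fails.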

Write your own function to handle the edge case:

import numpy as np

def first_(g):
    # First non-NA value in the group, or NaN if the group has none
    non_na = g.dropna()
    return non_na.iat[0] if len(non_na) > 0 else np.nan

df.groupby('hostname')[['period', 'Teff']].agg(first_).reset_index()

#   hostname   period  Teff
# 0      Cnc  44.3787  5234
# 1      Peg   4.2293  5773
# 2      Vir  38.0210  5577
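To illustrate the edge case, here is a minimal sketch with hypothetical data in which one group ("Vir") has no Teff value at all. It redefines first_ so the snippet runs on its own, compares it with the built-in first(), and shows the unguarded .iat[0] lookup failing on the empty result.

```python
import numpy as np
import pandas as pd

def first_(g):
    # First non-NA value in the group, or NaN if the group has none
    non_na = g.dropna()
    return non_na.iat[0] if len(non_na) > 0 else np.nan

# Hypothetical data (assumption): "Vir" has no Teff at all
df = pd.DataFrame({
    "hostname": ["Peg", "Peg", "Vir", "Vir"],
    "period": [4.2293, 4.231, np.nan, 38.021],
    "Teff": [np.nan, 5773, np.nan, np.nan],
})

grouped = df.groupby("hostname")[["period", "Teff"]]
safe = grouped.agg(first_).reset_index()   # guarded custom function
builtin = grouped.first().reset_index()    # built-in equivalent

# The unguarded lookup fails when every value in the column is NaN
vir_teff = df.loc[df["hostname"] == "Vir", "Teff"]
try:
    vir_teff.dropna().iat[0]
    unguarded_failed = False
except IndexError:
    unguarded_failed = True

print(safe)
print("unguarded lookup failed:", unguarded_failed)
```

Both safe and builtin leave NaN in the Teff cell for Vir instead of raising.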

Solution 2:

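Is this what you need? Within each group, sort every column so that NaNs come last, then drop the rows that are still incomplete: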

pd.concat([ df1.apply(lambda x: sorted(x, key=pd.isnull)) for _, df1 in df.groupby('hostname')]).dropna()
Out[343]: 
   hostname   period    Teff
55      Cnc  44.3787  5234.0
51      Peg   4.2293  5773.0
61      Vir  38.0210  5577.0
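The sorting trick can be seen in isolation in the sketch below, which uses a single hypothetical group. Because Python's sorted is stable and pd.isnull returns False for real values and True for NaN, the non-null values keep their order and move to the front. The per-column dict comprehension is a swapped-in stand-in for the apply call above, chosen only to keep the illustration simple.

```python
import numpy as np
import pandas as pd

# Hypothetical single group (assumption): one Teff value among NaNs
group = pd.DataFrame({
    "hostname": ["Peg", "Peg", "Peg"],
    "period": [4.2293, 4.231, 4.23077],
    "Teff": [np.nan, 5773.0, np.nan],
})

# Stable sort on the null flag: values first, NaNs last
pushed = sorted(group["Teff"], key=pd.isnull)

# Apply the same idea to every column, then keep only complete rows
compacted = pd.DataFrame(
    {col: sorted(group[col], key=pd.isnull) for col in group.columns}
)
complete = compacted.dropna()
print(complete)
```

One consequence of this approach: dropna() keeps every fully populated row, so a group with two non-null values in each column would produce two rows rather than one.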
