Sort Top N And Group 'others' In Pandas Df

June 09, 2024

Suppose I have the df

import pandas as pd
dic = {'001': [14], '002': [3], '003': [2], '004': [6], '005': [7], '006': [1], '007': [2]}
df

Solution 1:

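You can use pd.concat to add a row at the bottom (df.append, the older way to do this, was removed in pandas 2.0).

import pandas as pd

sorted_df = df.sort_values("count", ascending=False)
out = sorted_df.iloc[:3]
# one-row frame holding the sum of everything outside the top three
others = pd.DataFrame({"id": ["others"], "count": [sorted_df["count"].iloc[3:].sum()]})
out = pd.concat([out, others], ignore_index=True)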

       id  count
0     001     14
1     005      7
2     004      6
3  others      8
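
For a version that runs end to end, here is a minimal sketch of the same idea. It assumes the frame has one row per key of dic with columns id and count; the question's snippet is cut off, so that construction is a guess. The function name top_n_with_others and its parameters are hypothetical, and pd.concat is used because DataFrame.append is not available in pandas 2.0 and later.

```python
import pandas as pd

# Assumed construction of the question's frame: one row per key of `dic`,
# with columns "id" and "count" (the original snippet is truncated).
dic = {'001': [14], '002': [3], '003': [2], '004': [6],
       '005': [7], '006': [1], '007': [2]}
df = pd.DataFrame(
    {"id": list(dic), "count": [values[0] for values in dic.values()]}
)


def top_n_with_others(frame, n=3, label="others"):
    """Keep the n rows with the largest count and sum the rest into one row."""
    ordered = frame.sort_values("count", ascending=False)
    # One-row frame holding the total of everything below the top n.
    rest = pd.DataFrame(
        {"id": [label], "count": [ordered["count"].iloc[n:].sum()]}
    )
    # pd.concat stands in for the removed DataFrame.append.
    return pd.concat([ordered.iloc[:n], rest], ignore_index=True)


result = top_n_with_others(df, n=3)
print(result)
```

Changing n or label gives a different cut-off or a different name for the catch-all row without touching the rest of the code.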

Solution 2:

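You could create a new id column in which every row whose count is not among the three largest is mapped to 'other', then group by that column and sum to get the new dataframe. Because the match is on count values, any row tied with the third-largest count also keeps its own id: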

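import numpy as np

(df
 # sort first so the groups come out in descending order of count
 .sort_values('count', ascending=False)
 # keep the id for the three largest counts, label everything else 'other'
 .assign(id=lambda d: np.where(d['count'].isin(d['count'].nlargest(3)),
                               d['id'],
                               'other'))
 # sort=False keeps the groups in order of first appearance
 .groupby('id', as_index=False, sort=False)
 .sum()
)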

      id  count
0    001     14
1    005      7
2    004      6
3  other      8
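
One thing worth checking with the isin approach is what happens when counts tie at the cut-off. The sketch below uses made-up data (the frame tied is hypothetical, not from the question) to compare the value-based mask with a label-based variant that matches on the index returned by nlargest. It assumes a unique index; the label-based mask is a swapped-in alternative, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a tie for third place (not from the question).
tied = pd.DataFrame({"id": ["a", "b", "c", "d", "e"],
                     "count": [10, 8, 5, 5, 1]})

# Value-based mask: every row whose count equals one of the top three values.
by_value = tied["count"].isin(tied["count"].nlargest(3))

# Label-based mask: nlargest(3) returns exactly three rows, so match on
# their index labels instead of their values.
by_label = tied.index.isin(tied["count"].nlargest(3).index)

grouped = (
    tied.assign(id=np.where(by_label, tied["id"], "other"))
    .groupby("id", as_index=False, sort=False)
    .sum()
)
print(by_value.sum(), by_label.sum())
print(grouped)
```

With this data the value-based mask keeps four rows and the label-based mask keeps three, so the choice depends on whether tied rows should all stay visible or whether the result must have exactly N named rows.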
