Sort Top N And Group 'others' In Pandas Df
Suppose I have the df:
import pandas as pd
dic = {'001': [14], '002': [3], '003': [2], '004': [6], '005': [7], '006': [1], '007': [2]}
df
Solution 1:
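You can use df.append to add a row at the bottom. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this snippet only runs on older pandas releases.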
sorted_df = df.sort_values("count", ascending=False)
out = sorted_df.iloc[:3]
out.append(
    {"id": "others", "count": sorted_df["count"].iloc[3:].sum()},
    ignore_index=True,
)
id count
0 001 14
1 005 7
2 004 6
3 others 8
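On pandas 2.0 and later, where DataFrame.append is no longer available, the same result can be sketched with pd.concat. The sample frame below is an assumption: the question's construction of df is cut off, so the column names id and count are inferred from the code in this answer.

```python
import pandas as pd

# Assumed sample frame: the question's own construction is truncated, so the
# column names "id" and "count" are inferred from the answers.
dic = {'001': [14], '002': [3], '003': [2], '004': [6],
       '005': [7], '006': [1], '007': [2]}
df = pd.DataFrame({"id": list(dic), "count": [v[0] for v in dic.values()]})

# Sort descending and keep the first three rows.
sorted_df = df.sort_values("count", ascending=False)
top = sorted_df.iloc[:3]

# Build the "others" row as a one-row frame, then concatenate it at the bottom.
others = pd.DataFrame(
    {"id": ["others"], "count": [sorted_df["count"].iloc[3:].sum()]}
)
out = pd.concat([top, others], ignore_index=True)
print(out)
```

Here ignore_index=True plays the same role as in the append call: it renumbers the rows 0 to 3 instead of keeping the original index labels.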
Solution 2:
You could create a new id column in which every row whose count is not among the three largest values is mapped to 'other', then aggregate to get the new dataframe:
import numpy as np

(df
 .assign(id = np.where(df['count'].isin(df['count'].nlargest(3)),
                       df['id'],
                       'other'))
 .groupby('id',
          as_index = False,
          sort = False)
 .sum()
)
id count
0 001 14
1 005 7
2 004 6
3 other 8
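One thing to watch with the isin approach is that it matches on values, so if another row ties with the third-largest count, more than three ids are kept. A minimal sketch that selects rows by position instead is below. The helper name top_n_with_others and its parameters are hypothetical, and the sample frame is assumed to have id and count columns with a unique index.

```python
import pandas as pd

def top_n_with_others(df, n=3, key="id", value="count", label="others"):
    """Hypothetical helper: keep the n largest rows, sum the rest under `label`."""
    # nlargest returns exactly n rows, sorted by `value` in descending order;
    # ties are broken by order of appearance (keep="first" is the default).
    top = df.nlargest(n, value)
    # Everything not selected above; assumes the index labels are unique.
    rest = df.drop(top.index)
    if rest.empty:
        return top[[key, value]].reset_index(drop=True)
    others = pd.DataFrame({key: [label], value: [rest[value].sum()]})
    return pd.concat([top[[key, value]], others], ignore_index=True)

# Assumed sample frame with the counts from the question.
df = pd.DataFrame({
    "id": ["001", "002", "003", "004", "005", "006", "007"],
    "count": [14, 3, 2, 6, 7, 1, 2],
})
result = top_n_with_others(df, n=3)
print(result)
```

Because the rows come from nlargest, the output is already in descending order, so no separate sort_values step is needed.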