How To Apply Different Functions To A Groupby Object?
Solution 1:
Here's a slightly tongue-in-cheek solution:
>>> df.groupby(['id', 'min_max'])['value'].apply(lambda g: getattr(g, g.name[1][:3])()).unstack()
min_max  max_val  min_val
id
1              3       10
2             20      -10
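This applies a function that grabs the name of the real function to apply from the group key: g.name is the (id, min_max) tuple for each group, so g.name[1][:3] turns "max_val" into "max" and "min_val" into "min", and getattr looks up the Series method of that name.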
Obviously this wouldn't work so simply if there weren't such a simple relationship between the string "max_val" and the function name "max". It could be generalized by having a dict mapping column values to functions to apply, something like this:
func_map = {'min_val': min, 'max_val': max}
df.groupby(['id', 'min_max'])['value'].apply(lambda g: func_map[g.name[1]](g)).unstack()
Note that this is slightly less efficient than the version above, since it calls the plain Python max/min rather than the optimized pandas versions. But if you want a more generalizable solution, that's what you have to do, because there aren't optimized pandas versions of everything. (This is also more or less why there's no built-in way to do this: for most data, you can't assume a priori that your values can be mapped to meaningful functions, so it doesn't make sense to try to determine the function to apply based on the values themselves.)
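For what it's worth, here is a self-contained sketch of the dict-mapping idea. The sample df is hypothetical (the original data isn't shown here); I picked values that reproduce the output above. As a variation, the dict maps to the names of pandas aggregations rather than to the Python builtins, since Series.agg accepts such strings, so you keep the pandas implementations where they exist. You can still put a plain callable in the dict for anything pandas doesn't provide.

```python
import pandas as pd

# Hypothetical sample data, chosen to reproduce the output shown above
df = pd.DataFrame({
    'id':      [1, 1, 1, 1, 2, 2, 2, 2],
    'min_max': ['max_val', 'max_val', 'min_val', 'min_val',
                'max_val', 'max_val', 'min_val', 'min_val'],
    'value':   [1, 3, 10, 15, 5, 20, -10, 0],
})

# Map each marker in the min_max column to the name of a pandas aggregation
func_map = {'min_val': 'min', 'max_val': 'max'}

# g.name is the (id, min_max) group key, so g.name[1] selects the marker
result = (df.groupby(['id', 'min_max'])['value']
            .apply(lambda g: g.agg(func_map[g.name[1]]))
            .unstack())
print(result)
```

With this data, id 1 gets max_val 3 and min_val 10, and id 2 gets max_val 20 and min_val -10.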
Solution 2:
One option is to do the customized aggregation with groupby.apply, since it doesn't fit the built-in aggregation scenarios well:
(df.groupby('id')
.apply(lambda g: pd.Series({'max': g.value[g.min_max == "max_val"].max(),
'min': g.value[g.min_max == "min_val"].min()})))
#     max  min
# id
# 1     3   10
# 2    20  -10
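If calling a Python lambda per group turns out to be slow on a large frame, the same table can probably be built without apply: filter the rows for each marker, run an ordinary groupby aggregation on each subset, and put the two results side by side. This is only a sketch, and the sample df is hypothetical (made up to match the output above).

```python
import pandas as pd

# Hypothetical sample data, chosen to reproduce the output shown above
df = pd.DataFrame({
    'id':      [1, 1, 1, 1, 2, 2, 2, 2],
    'min_max': ['max_val', 'max_val', 'min_val', 'min_val',
                'max_val', 'max_val', 'min_val', 'min_val'],
    'value':   [1, 3, 10, 15, 5, 20, -10, 0],
})

# Max of the rows marked "max_val", per id
mx = df.loc[df['min_max'] == 'max_val'].groupby('id')['value'].max()
# Min of the rows marked "min_val", per id
mn = df.loc[df['min_max'] == 'min_val'].groupby('id')['value'].min()

# Align the two Series on id and name the columns
out = pd.concat([mx.rename('max'), mn.rename('min')], axis=1)
print(out)
```

The trade-off is one filter and one groupby per marker, so this gets verbose if there are many different markers.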
Solution 3:
Solution with pivot_table:
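# String names give stable column labels; the 'amin'/'amax' labels from np.min/np.max depend on the NumPy version
df1 = df.pivot_table(index='id', columns='min_max', values='value', aggfunc=['min', 'max'])
# Keep only the min of the "min_val" rows and the max of the "max_val" rows
df1 = df1.loc[:, [('min', 'min_val'), ('max', 'max_val')]]
df1.columns = df1.columns.droplevel(1)
print(df1)
    min  max
id
1    10    3
2   -10   20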