Equivalent Of R Function 'ave' In Python Pandas
I have a dataframe in R. Example: d1 <- structure(list(A = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L), B = 1:9), .Names = c('A', 'B'), class = 'data.frame', row.names = c(NA, -9L))
Solution 1:
The R ave function (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/ave.html) applies a function (the mean by default) to each group of observations that share the same combination of factor levels.
In pandas, there is no such function out of the box, but you can do this with a groupby operation.
Starting from your dataframe:
In [85]: import pandas as pd
In [86]: df = pd.DataFrame({'A': [1, 1, 1, 2, 2, 2, 2, 3, 3], 'B': range(1, 10)})
In [87]: df
Out[87]:
A B
0 1 1
1 1 2
2 1 3
3 2 4
4 2 5
5 2 6
6 2 7
7 3 8
8 3 9
You can add a column C by grouping on A and calculating the max of B for each group:
In [88]: df['C'] = df.groupby('A')['B'].transform('max')
In [89]: df
Out[89]:
A B C
0 1 1 3
1 1 2 3
2 1 3 3
3 2 4 7
4 2 5 7
5 2 6 7
6 2 7 7
7 3 8 9
8 3 9 9
Note: I use the transform method here because I want to end up with the same index as the original dataframe.
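To illustrate that point, here is a minimal sketch comparing transform with agg on the same dataframe. It uses 'mean' instead of 'max' because the mean is ave's default in R. The names per_row_mean, per_group_mean and B_mean are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2, 2, 2, 3, 3], 'B': range(1, 10)})

# transform broadcasts each group's result back to every row of that group,
# so the result has the same index (and length) as df.
per_row_mean = df.groupby('A')['B'].transform('mean')

# agg collapses each group to a single value, indexed by the group key,
# so the result has one entry per distinct value of A.
per_group_mean = df.groupby('A')['B'].agg('mean')

# Because per_row_mean is aligned with df.index, it can be assigned directly.
df['B_mean'] = per_row_mean

print(per_row_mean.tolist())    # [2.0, 2.0, 2.0, 5.5, 5.5, 5.5, 5.5, 8.5, 8.5]
print(per_group_mean.tolist())  # [2.0, 5.5, 8.5]
```

The transform result is the one that behaves like ave: it has one value per original row, so it fits as a new column without any merge or reindexing.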
For more information on the groupby functionalities in pandas, see http://pandas.pydata.org/pandas-docs/stable/groupby.html