Equivalent Of R Function 'ave' In Python Pandas

October 04, 2023

I have a dataframe in R. Example:

d1 <- structure(list(A = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L), B = 1:9), .Names = c('A', 'B'), class = 'data.frame', row.names = c(NA, -9L))

Solution 1:

The R ave function (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/ave.html) applies a function (the mean by default) to each group of observations that share the same combination of factor levels, and returns a vector of the same length as its input.

In pandas, there is no such function out of the box, but you can do this with a groupby operation.
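As a rough sketch of that idea, the groupby call can be wrapped in a small helper that mirrors the call shape of R's ave. The helper name `ave` and its `func` parameter are hypothetical and are not part of pandas; the sketch assumes the values and the grouping keys are pandas Series that share the same index.

```python
import pandas as pd


def ave(values, *groups, func="mean"):
    """Hypothetical helper mimicking R's ave(); not a pandas function.

    Applies `func` within each combination of the grouping keys and
    returns a Series aligned with the index of `values`.
    """
    return values.groupby(list(groups)).transform(func)


df = pd.DataFrame({"A": [1, 1, 1, 2, 2, 2, 2, 3, 3], "B": range(1, 10)})

# Default behaviour: group means, one value per original row
group_means = ave(df["B"], df["A"])

# Any aggregation name accepted by transform can be passed instead
group_maxes = ave(df["B"], df["A"], func="max")
```

Like R's ave, the result has one value per input row rather than one per group, so it can be assigned straight back as a new column.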

Starting from your dataframe:

In [85]: import pandas as pd

In [86]: df = pd.DataFrame({'A': [1, 1, 1, 2, 2, 2, 2, 3, 3], 'B': range(1, 10)})

In [87]: df
Out[87]: 
   A  B
0  1  1
1  1  2
2  1  3
3  2  4
4  2  5
5  2  6
6  2  7
7  3  8
8  3  9

You can add a column C by grouping on A and computing the maximum of B within each group:

In [88]: df['C'] = df.groupby('A')['B'].transform('max')

In [89]: df
Out[89]: 
   A  B  C
0  1  1  3
1  1  2  3
2  1  3  3
3  2  4  7
4  2  5  7
5  2  6  7
6  2  7  7
7  3  8  9
8  3  9  9

Note: I use the transform method here because I want to end up with the same index as the original dataframe.
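To illustrate the difference the note describes, here is a minimal sketch comparing a plain aggregation with transform on the same data. The variable names `per_group`, `per_row`, and `mapped` are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 1, 2, 2, 2, 2, 3, 3], "B": range(1, 10)})

# Aggregation: one value per group, indexed by the group key A
per_group = df.groupby("A")["B"].max()

# Transform: one value per original row, indexed like df
per_row = df.groupby("A")["B"].transform("max")

# The same per-row result, built by mapping each key to its group maximum
mapped = df["A"].map(per_group)
```

The aggregated result has three entries (one per value of A), so it cannot be assigned directly as a column of the nine-row dataframe; the transformed result can.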

For more information on the groupby functionalities in pandas, see http://pandas.pydata.org/pandas-docs/stable/groupby.html
