
Replace Values Within A Groupby Based On Multiple Conditions

My question is related to this one but I'm still not seeing how I can apply the answer to my problem. I have a DataFrame like so: df = pd.DataFrame({ 'date': ['2001-01-01', '20

Solution 1:

  1. Compute the maximum of val within each cohort group
  2. Find the date on which val reaches its maximum
  3. Use np.where for a vectorised comparison and replacement: rows dated on or before that date take their cohort's maximum, and the rest keep their original val

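import numpy as np

# Step 1: per-cohort maximum of val, broadcast back to every row
v = df.groupby('cohort').val.transform('max')
# Steps 2-3: idxmax() on the date-indexed frame gives one date for the whole frame
df['val'] = np.where(
    df.date <= df.set_index('date').val.idxmax(), v, df.val
)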

df

         date      cohort  val
0  2001-01-01  2001-01-01  102
1  2001-02-01  2001-01-01  102
2  2001-03-01  2001-01-01  102
3  2001-04-01  2001-01-01  101
4  2001-02-01  2001-02-01  201
5  2001-03-01  2001-02-01  201
6  2001-04-01  2001-02-01  201
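
Note that the snippet in this answer compares every row against a single cutoff taken from the whole frame. If each cohort should instead be cut off at the date of its own maximum, something like the following sketch might work. It is not the answer's original method. The sample frame is an assumption: the dates and cohorts follow the result shown above, but the starting val column is hypothetical (the question's data is truncated here), and the names grouped, peak_rows, peak_date and result are made up for illustration. Converting date to datetime is also an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: dates and cohorts mirror the result table,
# the starting 'val' values are assumed for illustration only.
df = pd.DataFrame({
    'date': ['2001-01-01', '2001-02-01', '2001-03-01', '2001-04-01',
             '2001-02-01', '2001-03-01', '2001-04-01'],
    'cohort': ['2001-01-01'] * 4 + ['2001-02-01'] * 3,
    'val': [100, 101, 102, 101, 200, 201, 201],
})
df['date'] = pd.to_datetime(df['date'])  # assumption: compare real dates, not strings

grouped = df.groupby('cohort')['val']

# Per-cohort maximum, broadcast back to every row of that cohort
group_max = grouped.transform('max')

# One row per cohort: the row holding the first occurrence of the cohort maximum
peak_rows = df.loc[grouped.idxmax()]

# Date on which each row's own cohort reaches its maximum
peak_date = df['cohort'].map(peak_rows.set_index('cohort')['date'])

# Rows dated on or before their cohort's peak take the cohort maximum;
# later rows keep their original value
result = df.assign(val=np.where(df['date'] <= peak_date, group_max, df['val']))
print(result)
```

On this sample both approaches give the same val column, because both cohorts happen to peak on the same date. They would diverge as soon as cohorts peak on different dates.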
