
Pandas Calculate CAGR With Slicing (Missing Values)

As a follow-up to this question, I'd like to calculate the CAGR from a pandas data frame such as this, where there are some missing data values: df = pd.DataFrame({'A' : ['1','2',

Solution 1:

When calculating returns from a level, it is acceptable to use the most recent available value. For example, when calculating the CAGR for row 1, we want (5/7) ^ (1/3) - 1, and for row 3, (9/7) ^ (1/3) - 1. The assumption here is that we annualize across all of the years in the frame, including those with missing values.

With these assumptions:

df = df.bfill(axis=1).ffill(axis=1)

Then apply the solution from the linked question:

df['CAGR'] = df.T.pct_change().add(1).prod().pow(1./(len(df.columns) - 1)).sub(1)
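
To make the fill-then-annualize steps concrete, here is a minimal sketch on made-up data. The frame, its year labels, and the row labels `a`, `b`, `c` are assumptions for illustration, not the data from the question. Row `a` is chosen so that it reproduces the (5/7) ^ (1/3) - 1 case mentioned above.

```python
import numpy as np
import pandas as pd

# Hypothetical data (not from the question): one row per series,
# one column per consecutive year, with some gaps.
df = pd.DataFrame(
    {
        "2015": [7.0, np.nan, 4.0],
        "2016": [6.0, 8.0, np.nan],
        "2017": [np.nan, 9.0, 5.0],
        "2018": [5.0, 10.0, np.nan],
    },
    index=["a", "b", "c"],
)

# Fill gaps along each row: backfill first, then forward-fill
# whatever is still missing at the right-hand edge.
filled = df.bfill(axis=1).ffill(axis=1)

# Annualize over every column in the frame, observed or not.
n_periods = len(filled.columns) - 1
cagr_fixed = filled.T.pct_change().add(1).prod().pow(1.0 / n_periods).sub(1)

print(cagr_fixed)
```

Because the period-over-period growth factors telescope, each row's product is simply its last filled value divided by its first filled value, so row `a` ends up as (5/7) ^ (1/3) - 1.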

Without this assumption, the only other reasonable choice is to annualize by the number of non-NaN observations. That count has to be taken before the gaps are filled, so I track it with:

notnull = df.notnull().sum(axis=1)
df = df.bfill(axis=1).ffill(axis=1)
df['CAGR'] = df.T.pct_change().add(1).prod().pow(1./(notnull.sub(1))).sub(1)

In fact, this becomes the more general solution, as it also works when there are no nulls: the count of non-NaN observations then equals the number of columns.
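
The observation-count variant can be sketched the same way. The helper name `cagr_by_observations` and both sample frames are hypothetical, introduced only for this illustration. The second frame has no gaps, to show that the result then matches annualizing over `len(df.columns) - 1` periods.

```python
import numpy as np
import pandas as pd


def cagr_by_observations(frame):
    """Hypothetical helper: annualize by (non-NaN observations - 1) per row."""
    # Count the observations before filling, otherwise the gaps are lost.
    notnull = frame.notnull().sum(axis=1)
    filled = frame.bfill(axis=1).ffill(axis=1)
    growth = filled.T.pct_change().add(1).prod()
    # growth and notnull share the row labels, so pow aligns them row by row.
    return growth.pow(1.0 / notnull.sub(1)).sub(1)


# Made-up data with gaps: rows a and b have 3 observations, row c has 2.
gappy = pd.DataFrame(
    {
        "2015": [7.0, np.nan, 4.0],
        "2016": [6.0, 8.0, np.nan],
        "2017": [np.nan, 9.0, 5.0],
        "2018": [5.0, 10.0, np.nan],
    },
    index=["a", "b", "c"],
)

# Made-up data without gaps: 4 observations, so 3 periods.
complete = pd.DataFrame(
    {"2015": [7.0], "2016": [6.0], "2017": [5.5], "2018": [5.0]},
    index=["a"],
)

gappy_cagr = cagr_by_observations(gappy)
complete_cagr = cagr_by_observations(complete)
print(gappy_cagr)
print(complete_cagr)
```

One caveat with this sketch: a row with only a single observed value makes the exponent a division by zero, so such rows probably need separate handling.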
