Pandas Rolling Vs Scipy Kurtosis - Serious Numerical Inaccuracy
First and foremost, I'm sorry for the clearly not minimal examples that I listed below. I am fully aware this doesn't meet SO's minimally reproducible constraint, however, having b
Solution 1:
This looks like a bug in older versions of Pandas. I could reproduce it on an old installation (Python 3.6.2 64-bit on win32, Pandas 1.0.3, numpy 1.15.4):
>>> s3.rolling(20,min_periods=3).kurt().tail(10)
890         9.591071
891         9.591071
892         9.591071
893         9.591071
894        19.663685
895        15.248361
896        40.444894
897      1368.233241
898    251407.375343
899    902540.031652
dtype: float64
It seems to be fixed in my newer installation (Python 3.8.4 64-bit, Pandas 1.2.2, numpy 1.20.1):
>>> s3.rolling(20,min_periods=3).kurt().tail(10)
890     9.591067
891     9.591067
892     9.591067
893     9.591067
894    19.663666
895    14.872262
896    14.147158
897    16.716989
898     7.037037
899    20.000000
dtype: float64
Both installations are on the same Windows 10 machine.
I cannot say which component (Pandas or numpy) is the cause. As your tests using scipy.stats.kurtosis give correct results, I would suspect Pandas, but without further analysis by Pandas experts (and I am not one) I cannot be certain.
IMHO, the most reasonable solution is either to upgrade your system or to add a fresh, independent Python installation with the latest possible Pandas version.
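To check whether a given installation is affected, one option is to compare the built-in rolling kurtosis against a per-window recomputation with scipy.stats.kurtosis. The sketch below is only an illustration: the helper name rolling_kurt_check and the random demo series are hypothetical stand-ins (the original s3 data is not available here), and min_periods=4 is used instead of 3 because the unbiased estimator needs at least four observations.

```python
import numpy as np
import pandas as pd
from scipy.stats import kurtosis


def rolling_kurt_check(s, window=20, min_periods=4):
    """Compare Rolling.kurt() with scipy's kurtosis recomputed on every window.

    rolling_kurt_check is a hypothetical helper written for this answer.
    """
    roll = s.rolling(window, min_periods=min_periods)
    fast = roll.kurt()
    # fisher=True, bias=False selects the unbiased excess kurtosis,
    # which is the definition Rolling.kurt() uses.
    ref = roll.apply(lambda w: kurtosis(w, fisher=True, bias=False), raw=True)
    return pd.DataFrame(
        {"rolling_kurt": fast, "scipy_ref": ref, "abs_diff": (fast - ref).abs()}
    )


# Stand-in data; replace `demo` with the series that shows the problem.
rng = np.random.default_rng(0)
demo = pd.Series(rng.normal(size=200))

report = rolling_kurt_check(demo)
print("pandas", pd.__version__, "numpy", np.__version__)
print(report.tail(10))
```

On well-behaved data like this the two columns should agree closely on any version. Running the same check on the problematic series should show large abs_diff values on an affected installation and small ones after upgrading.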