
Pandas Timeseries Diff() Reverts To Series

I am working with some TimeSeries data in this format: 1984-12-12 14:08:00 1984-12-12 14:25:00 1984-12-12 14:47:00 1984-12-12 16:37:00 1984-12-12 16:37:00 1984-12-12 16:

Solution 1:

Environment: pandas 0.12.0, NumPy 1.7.1, Python 2.7.5, Linux Mint

import pandas as pd
import StringIO

data = '''time
1984-12-12 14:08:00
1984-12-12 14:25:00
1984-12-12 14:47:00
1984-12-12 16:37:00
1984-12-12 16:37:00
1984-12-12 16:37:00
1984-12-12 17:52:00
1984-12-12 17:52:00
1984-12-12 19:29:00'''

df = pd.read_csv(StringIO.StringIO(data))

df['time'] = pd.DatetimeIndex(df['time'])

df['delta'] = df['time'].diff()

# df['delta'] = pd.TimeSeries(df['delta'])  # sorry, not needed
# df['delta'][0] = 0  # to remove NaT
# better method to remove NaT - thanks to Jeff
df['delta'] = df['delta'].fillna(0)

df['cumsum'] = df['delta'].cumsum()

print df

Result:

                 time    delta   cumsum
0 1984-12-12 14:08:00 00:00:00 00:00:00
1 1984-12-12 14:25:00 00:17:00 00:17:00
2 1984-12-12 14:47:00 00:22:00 00:39:00
3 1984-12-12 16:37:00 01:50:00 02:29:00
4 1984-12-12 16:37:00 00:00:00 02:29:00
5 1984-12-12 16:37:00 00:00:00 02:29:00
6 1984-12-12 17:52:00 01:15:00 03:44:00
7 1984-12-12 17:52:00 00:00:00 03:44:00
8 1984-12-12 19:29:00 01:37:00 05:21:00
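The code above targets Python 2 and pandas 0.12. On Python 3 with a recent pandas, the `StringIO` module has moved to `io`, `print` is a function, and `pd.TimeSeries` no longer exists. Below is a rough sketch of the same steps for that setup. Using `pd.to_datetime` and `pd.Timedelta(0)` as the fill value are my own substitutions, not part of the original answer, and the printed formatting will likely differ from the result shown above.

```python
import io

import pandas as pd

data = """time
1984-12-12 14:08:00
1984-12-12 14:25:00
1984-12-12 14:47:00
1984-12-12 16:37:00
1984-12-12 16:37:00
1984-12-12 16:37:00
1984-12-12 17:52:00
1984-12-12 17:52:00
1984-12-12 19:29:00"""

df = pd.read_csv(io.StringIO(data))

# Parse the strings into datetimes (assumed replacement for pd.DatetimeIndex(...)).
df["time"] = pd.to_datetime(df["time"])

# diff() leaves NaT in the first row; fill it with a zero-length timedelta
# so the column stays a timedelta column.
df["delta"] = df["time"].diff().fillna(pd.Timedelta(0))

# Running total of the gaps between consecutive timestamps.
df["cumsum"] = df["delta"].cumsum()

print(df)
```

Filling with `pd.Timedelta(0)` rather than the plain integer `0` keeps the fill value the same type as the rest of the column, which seems the safer choice across pandas versions.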
