Pandas Timeseries Diff() Reverts To Series
I am working with some TimeSeries data in this format: 1984-12-12 14:08:00 1984-12-12 14:25:00 1984-12-12 14:47:00 1984-12-12 16:37:00 1984-12-12 16:37:00 1984-12-12 16:
Solution 1:
Pandas 0.12.0, Numpy 1.7.1, Python 2.7.5, Linux Mint
import pandas as pd
import StringIO
data = '''time
1984-12-12 14:08:00
1984-12-12 14:25:00
1984-12-12 14:47:00
1984-12-12 16:37:00
1984-12-12 16:37:00
1984-12-12 16:37:00
1984-12-12 17:52:00
1984-12-12 17:52:00
1984-12-12 19:29:00'''
df = pd.read_csv(StringIO.StringIO(data))
df['time'] = pd.DatetimeIndex(df['time'])
df['delta'] = df['time'].diff()
# df['delta'] = pd.TimeSeries(df['delta'])  # sorry, not needed
# df['delta'][0] = 0  # to remove NaT
# better method to remove NaT - thanks to Jeff
df['delta'] = df['delta'].fillna(0)
df['cumsum'] = df['delta'].cumsum()
print df
result
                 time    delta   cumsum
0 1984-12-12 14:08:00 00:00:00 00:00:00
1 1984-12-12 14:25:00 00:17:00 00:17:00
2 1984-12-12 14:47:00 00:22:00 00:39:00
3 1984-12-12 16:37:00 01:50:00 02:29:00
4 1984-12-12 16:37:00 00:00:00 02:29:00
5 1984-12-12 16:37:00 00:00:00 02:29:00
6 1984-12-12 17:52:00 01:15:00 03:44:00
7 1984-12-12 17:52:00 00:00:00 03:44:00
8 1984-12-12 19:29:00 01:37:00 05:21:00
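The code above targets Python 2.7 and pandas 0.12. For anyone on Python 3 with a recent pandas, the same steps might look roughly like the sketch below. It assumes that `io.StringIO` stands in for the old `StringIO` module, that `parse_dates` does the datetime conversion, and that `pd.Timedelta(0)` is used as the fill value for the leading NaT; none of that is from the original answer, and it has not been checked against every pandas release.

```python
import io

import pandas as pd

data = """time
1984-12-12 14:08:00
1984-12-12 14:25:00
1984-12-12 14:47:00
1984-12-12 16:37:00
1984-12-12 16:37:00
1984-12-12 16:37:00
1984-12-12 17:52:00
1984-12-12 17:52:00
1984-12-12 19:29:00"""

# Parse the 'time' column as datetimes while reading (assumed replacement
# for the explicit pd.DatetimeIndex conversion in the original answer).
df = pd.read_csv(io.StringIO(data), parse_dates=["time"])

# diff() on a datetime column yields timedeltas; the first row is NaT,
# so fill it with a zero-length Timedelta instead of the integer 0.
df["delta"] = df["time"].diff().fillna(pd.Timedelta(0))

# Running total of the gaps between consecutive timestamps.
df["cumsum"] = df["delta"].cumsum()

print(df)
```

Filling with `pd.Timedelta(0)` rather than a bare `0` keeps the column's dtype as timedelta, which is what lets `cumsum()` return elapsed time instead of raw numbers.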