How To Resample / Downsample An Irregular Timestamp List?

Simple question, but I haven't been able to find a simple answer. I have a list of data which records the time in seconds at which events occur: [200.0 420.0 560.0 1100.0 1900.0 2700.0 3

Solution 1:

all_events = [
    200.0, 420.0, 560.0, 1100.0, 1900.0, 2700.0, 3400.0, 3900.0, 4234.2, 4800.0]

def get_events_by_hour(all_events):
    # For each hour of the day, count the timestamps (in seconds) that fall in it.
    return [
        len([x for x in all_events if int(x / 3600.0) == hour])
        for hour in range(24)
    ]

print(get_events_by_hour(all_events))

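Note that all_events should contain the events of a single day only: the function counts hours 0 through 23, so timestamps at or beyond 86400 seconds are not counted.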
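The list comprehension above scans all_events once per hour. As a minimal sketch of the same idea in a single pass, assuming timestamps in seconds counted from the start of the day, something like the following should work. The name count_events_by_hour and the hours parameter are made up for illustration and are not part of the original answer.

```python
from collections import Counter

def count_events_by_hour(all_events, hours=24):
    # Hypothetical helper: tally the hour index of every timestamp in one pass.
    counts = Counter(int(t // 3600) for t in all_events)
    # Hours without events get 0; events at or beyond `hours` are ignored.
    return [counts.get(hour, 0) for hour in range(hours)]

all_events = [
    200.0, 420.0, 560.0, 1100.0, 1900.0, 2700.0, 3400.0, 3900.0, 4234.2, 4800.0]

by_hour = count_events_by_hour(all_events)
print(by_hour[:3])  # first three hours
```

For the ten timestamps above, the first two hours hold 7 and 3 events and every later hour holds 0.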


Solution 2:

The act of sampling means taking data f_i (samples) at certain discrete times t_i. The number of samples per time unit gives the sampling rate. Downsampling is a special case of resampling, which means mapping the sampled data onto a different set of sampling points t_i', here onto one with a smaller sampling rate, making the sample more coarse.

Your list contains the sample points t_i (in seconds) and, implicitly, the number of events n_i, which follows from the index i, for example n_i = i + 1.

If you reduce the list periodically, after every interval T (in seconds), you are resampling to a new set n_i' at times t_i' = i * T. I did not write downsampling, because it is possible that nothing happens within the time T; in that case you are upsampling, because you now take more data points.

For the calculation, check whether the input list is empty; in that case n' = 0 should go into your output list. Otherwise you have m entries in your input list, measured over the time T, and you can use the equation below:

n' = m * 3600 / T

This n' goes into your output list; it is scaled to events per hour.
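A rough sketch of that calculation in Python, assuming the windows are consecutive, start at t = 0 and all have the same length T. The function name events_per_hour and the parameter window_seconds are invented for this example.

```python
def events_per_hour(timestamps, window_seconds):
    # Hypothetical helper: count the events m in each window of length T
    # and scale each count to events per hour with n' = m * 3600 / T.
    if not timestamps:
        return []
    n_windows = int(max(timestamps) // window_seconds) + 1
    counts = [0] * n_windows  # windows without events keep m = 0, so n' = 0
    for t in timestamps:
        counts[int(t // window_seconds)] += 1
    return [m * 3600.0 / window_seconds for m in counts]

events = [200.0, 420.0, 560.0, 1100.0, 1900.0, 2700.0, 3400.0, 3900.0, 4234.2, 4800.0]
rates = events_per_hour(events, 1800.0)  # T = 30 minutes
print(rates)
```

With T = 1800 the three windows hold 4, 3 and 3 events, so the rates are 8, 6 and 6 events per hour.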


Solution 3:

The question has the scipy tag, and scipy depends on numpy, so I assume an answer using numpy is acceptable.

To get the hour associated with a timestamp t, you can take the integer part of t/3600. Then, to get the number of events in each hour, you can count the number of occurrences of these integers. The numpy function bincount can do that for you.

Here's a numpy one-liner for the calculation. I put the timestamps in a numpy array t:

In [49]: t = numpy.array([200.0, 420.0, 560.0, 1100.0, 1900.0, 2700.0, 3400.0, 3900.0, 4234.2, 4800.0, 8300.0, 8400.0, 9500.0, 10000.0, 14321.0, 15999.0, 16789.0, 17000.0])

In [50]: t
Out[50]: 
array([   200. ,    420. ,    560. ,   1100. ,   1900. ,   2700. ,
         3400. ,   3900. ,   4234.2,   4800. ,   8300. ,   8400. ,
         9500. ,  10000. ,  14321. ,  15999. ,  16789. ,  17000. ])

Here's your calculation:

In [51]: numpy.bincount((t/3600).astype(int))
Out[51]: array([7, 3, 4, 1, 3])
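Note that the length of the bincount result depends on the largest hour index in the data. If you want one entry per hour of a full day, a possible variant (a sketch, assuming the timestamps are non-negative and cover at most 24 hours) is to pass minlength, or to use numpy.histogram with explicit bin edges:

```python
import numpy as np

t = np.array([200.0, 420.0, 560.0, 1100.0, 1900.0, 2700.0, 3400.0, 3900.0,
              4234.2, 4800.0, 8300.0, 8400.0, 9500.0, 10000.0, 14321.0,
              15999.0, 16789.0, 17000.0])

# Hour index of each timestamp (floor of t / 3600).
hours = (t // 3600).astype(int)

# minlength pads the result with zeros so there is one entry per hour of the day.
counts = np.bincount(hours, minlength=24)

# The same counts from explicit bin edges at 0, 3600, ..., 86400 seconds.
edges = np.arange(0, 24 * 3600 + 1, 3600)
hist_counts, _ = np.histogram(t, bins=edges)

print(counts)
```

Both arrays have 24 entries; the first five match the output above and the rest are zero.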
