
Numpy: Fast Regularly-spaced Average For Large Numbers Of Line Segments / Points

I have many (~ 1 million) irregularly spaced points, P, along a 1D line. These mark segments of the line, such that if the points are {0, x_a, x_b, x_c, x_d, ...}, the segments go

Solution 1:

As you would expect, there is a purely numpy solution to this problem. The trick is to cleverly mix np.searchsorted, which will place your regular grid on the nearest bin of the original, and np.add.reduceat to compute the sums of the bins:

import numpy as np

def distribute(x, y, n):
    """
    Down-samples/interpolates the y-values of each segment across a
    domain with `n` points. `x` represents segment endpoints, so should
    have one more element than `y`. 
    """
    y = np.asanyarray(y)
    x = np.asanyarray(x)

    new_x = np.linspace(x[0], x[-1], n + 1)
    # Find the insertion indices
    locs = np.searchsorted(x, new_x)[1:]
    # Create the flat array of index pairs for np.add.reduceat
    indices = np.zeros(2 * n, dtype=np.intp)  # np.int is deprecated/removed; use the index dtype
    # Fill it in
    dloc = locs[:-1] - 1
    indices[2::2] = dloc
    indices[1::2] = locs

    # This is the sum of every original segment a new segment touches
    weighted = np.append(y * np.diff(x), 0)
    sums = np.add.reduceat(weighted, indices)[::2]

    # Now subtract the adjusted portions from the right end of the sums
    sums[:-1] -= (x[dloc + 1] - new_x[1:-1]) * y[dloc]
    # Now do the same for the left of each interval
    sums[1:] -= (new_x[1:-1] - x[dloc]) * y[dloc]

    return new_x, sums / np.diff(new_x)


seg = [0, 1.1, 2.2, 2.3, 2.8, 4]
color = [0, 0.88, 0.55, 0.11, 0.44]

seg, color = distribute(seg, color, 4)
print(seg, color)

The result is

[0. 1. 2. 3. 4.] [0.    0.792 0.374 0.44 ]

This is exactly the result you expected from your manual computation.
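As a quick sanity check (my addition, not part of the original answer), the second value can be reproduced by hand: the pixel spanning [1, 2] overlaps the segment [0, 1.1] (value 0) over a length of 0.1 and the segment [1.1, 2.2] (value 0.88) over a length of 0.9, so its average is

# Hand computation of the pixel covering [1, 2]
manual = (0.1 * 0 + 0.9 * 0.88) / 1.0
print(manual)  # 0.792, matching the second output value above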

Benchmarks

I ran the following set of benchmarks to make sure that both EE_'s solution and mine agreed on the answer, and to check out the timings. I modified the other solution slightly to have the same interface as mine:

from scipy.interpolate import interp1d

def EE_(x, y, n):
    I = np.zeros_like(x)
    I[1:] = np.cumsum(np.diff(x) * y)
    f = interp1d(x, I, bounds_error=False, fill_value=(0, I[-1]))
    pix_x = np.linspace(x[0], x[-1], n + 1)
    pix_y = (f(pix_x[1:]) - f(pix_x[:-1])) / (pix_x[1:] - pix_x[:-1])
    return pix_x, pix_y

And here is the testbench (MadPhysicist is just the distribute function from above, renamed). The input always has 1001 elements for x and 1000 elements for y; the output sizes are 5, 10, 100, 1000, and 10000:

np.random.seed(0x1234ABCD)

x = np.cumsum(np.random.gamma(3.0, 0.2, size=1001))
y = np.random.uniform(0.0, 1.0, size=1000)

tests = (
    MadPhysicist,
    EE_,
)

for n in (5, 10, 100, 1000, 10000):
    print(f'N = {n}')
    results = {test.__name__: test(x, y, n) for test in tests}

    for name, (x_out, y_out) in results.items():
        print(f'{name}:\n\tx = {x_out}\n\ty = {y_out}')

    allsame = np.array([[np.allclose(x1, x2) and np.allclose(y1, y2)
                         for x2, y2 in results.values()]
                        for x1, y1 in results.values()])
    print()
    print(f'Result Match:\n{allsame}')

    from IPython import get_ipython
    magic = get_ipython().magic

    for test in tests:
        print(f'{test.__name__}({n}):\n\t', end='')
        magic(f'timeit {test.__name__}(x, y, n)')

I will skip the data and agreement printouts (the results are identical), and show the timings:

N = 5
MadPhysicist: 50.6 µs ± 349 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
EE_:          110 µs ± 568 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

N = 10
MadPhysicist: 50.5 µs ± 732 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
EE_:          111 µs ± 635 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

N = 100
MadPhysicist: 54.5 µs ± 284 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
EE_:          114 µs ± 215 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

N = 1000
MadPhysicist: 107 µs ± 5.73 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
EE_:          148 µs ± 5.11 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

N = 10000
MadPhysicist: 458 µs ± 2.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
EE_:          301 µs ± 4.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

You can see that at smaller output sizes the pure-numpy solution is much faster, probably because fixed overhead dominates the scipy version. At larger numbers of break points, however, the scipy solution becomes the faster one. To get a real idea of how the timing works out, you would also have to compare different input sizes, not just different output sizes.
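A sketch of such a scan (my addition, reusing the test functions and the IPython magic helper from the benchmark above) might look like this:

# Sketch only: compare both implementations across input and output sizes
for in_size in (1000, 10000, 100000, 1000000):
    x = np.cumsum(np.random.gamma(3.0, 0.2, size=in_size + 1))
    y = np.random.uniform(0.0, 1.0, size=in_size)
    for n in (10, 1000):
        for test in (MadPhysicist, EE_):
            print(f'{test.__name__}(in={in_size}, out={n}):\n\t', end='')
            magic(f'timeit {test.__name__}(x, y, {n})')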

Solution 2:

You can still accomplish this with linear interpolation. Although your function is piecewise constant, you want its average over a whole lot of small intervals. The average of some function f(x) over an interval from a to b is just its integral over that range divided by the difference between a and b. The integral of a piecewise constant function will be a piecewise linear function. So, supposing you have your data:

x = [0, 1.1, 2.2, 2.3, 2.8, 4]
y = [0, 0.88, 0.55, 0.11, 0.44]

Make a function which will give its integral at any value of x. Here the array I will contain the value of the integral at each of your given values of x, and the function f is its linear interpolant which will give the exact value at any point:

import numpy
import scipy.interpolate

I = numpy.zeros_like(x)
I[1:] = numpy.cumsum(numpy.diff(x) * y)
f = scipy.interpolate.interp1d(x, I)

Now evaluating the average over each of your pixels is easy:

pix_x = numpy.linspace(0, 4, 5)
pix_y = (f(pix_x[1:]) - f(pix_x[:-1])) / (pix_x[1:] - pix_x[:-1])

We can check what is in these arrays:

>>> pix_x
array([0., 1., 2., 3., 4.])
>>> pix_y
array([0.   , 0.792, 0.374, 0.44 ])

The shading values for your pixels are now in pix_y. These should match exactly the values you gave above in your example.

This should be quite fast even for many, many points:

def test(x, y):
    I = numpy.zeros_like(x)
    I[1:] = numpy.cumsum(numpy.diff(x) * y)
    f = scipy.interpolate.interp1d(x, I,
        bounds_error=False, fill_value=(0, I[-1]))
    pix_x = numpy.linspace(0, 1, 1001)
    pix_y = (f(pix_x[1:]) - f(pix_x[:-1])) / (pix_x[1:] - pix_x[:-1])
    return pix_y

timeit reports:

225 ms ± 37.6 ms per loop

on my system when x is of size 1000000 (and y of size 999999). Note that bounds_error=False and fill_value=(0, I[-1]) are passed to interp1d. This has the effect of assuming that your shading function is zero outside of the range of x-values. Also, interp1d does not need the input values to be sorted; in the above test I gave both x and y as arrays of uniform random numbers between 0 and 1. However, if you know for sure that they are sorted, you can pass assume_sorted=True and you should get a speed increase:

20.2 ms ± 377 µs per loop
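For reference, the sorted-input variant differs only in that one keyword argument (a sketch of mine; x must genuinely be in increasing order for the result to be correct):

f = scipy.interpolate.interp1d(x, I,
    bounds_error=False, fill_value=(0, I[-1]),
    assume_sorted=True)  # skip the internal argsort; x must already be sorted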

Solution 3:

Given your key requirement that you do not want linear interpolation, you should take a look at using scipy.signal.resample.

What this does is convert your signal into a spectrum and then back into a new time series that is regularly spaced along the x-axis.

Also see this question: How to uniformly resample a non uniform signal using scipy.
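As a rough illustration (my own sketch, not from any of the answers above), resample expects its input to already be uniformly sampled, so the piecewise-constant signal would first be evaluated on a fine uniform grid, for example with searchsorted, and then resampled down to the desired number of regularly spaced points:

import numpy as np
from scipy.signal import resample

seg_x = np.array([0, 1.1, 2.2, 2.3, 2.8, 4])   # segment endpoints from the example
seg_y = np.array([0, 0.88, 0.55, 0.11, 0.44])  # value on each segment

# Evaluate the step function on a fine uniform grid
t = np.linspace(seg_x[0], seg_x[-1], 4000)
idx = np.clip(np.searchsorted(seg_x, t, side='right') - 1, 0, len(seg_y) - 1)
sig = seg_y[idx]

# FFT-based resampling down to 100 regularly spaced samples
# (note: the Fourier method treats the signal as periodic, so expect some ringing)
sig_new, t_new = resample(sig, 100, t=t)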
