Vectorizing Euclidean Distance Computation - Numpy
Solution 1:
Approach #1
We could slice out the first and second columns altogether for indexing into coords
instead of iterating for each element along them and perform the euclidean distance computations that involves element-wise squaring and summing along each row and then getting the element-wise square-root. Finally, we need to sum all those values for one scalar as shown in the original code.
Thus, one vectorized implementation would be -
np.sqrt(((coords[edges[:, 0]] - coords[edges[:, 1]])**2).sum(1)).sum()
There's a built-in in NumPy to do those distance computing operations as np.linalg.norm
. In terms of performance, I would think it would be comparable to what we have just listed earlier. For the sake of completeness, the implementation would be -
np.linalg.norm(coords[edges[:, 0]] - coords[edges[:, 1]],axis=1).sum()
Approach #2
Tweaking the earlier approach, we could use np.einsum
that in one step would perform both squaring
and summing along each row
and as such would be a bit more efficient.
The implementation would look something like this -
s = coords[edges[:, 0]] - coords[edges[:, 1]]
out = np.sqrt(np.einsum('ij,ij->i',s,s)).sum()
Runtime test
Function definitions -
defnetworkLength(coords, edges): # Original code from question
distancesNetwork = np.array([])
for i inrange(edges.shape[0]):
distancesNetwork = np.append(distancesNetwork, \
distance.euclidean(coords[edges[i, 0]], coords[edges[i, 1]]))
returnsum(distancesNetwork)
defvectorized_app1(coords, edges):
return np.sqrt(((coords[edges[:, 0]] - coords[edges[:, 1]])**2).sum(1)).sum()
defvectorized_app2(coords, edges):
s = coords[edges[:, 0]] - coords[edges[:, 1]]
return np.sqrt(np.einsum('ij,ij->i',s,s)).sum()
Verification and Timings -
In [114]:# Setup bigger inputs...:coords=np.random.rand(100,3)...:edges=np.random.randint(0,100,(10000,2))# Verify results across all approachesIn [115]:networkLength(coords,edges)Out[115]:6607.8829431403547In [116]:vectorized_app1(coords,edges)Out[116]:6607.8829431403337In [117]:vectorized_app2(coords,edges)Out[117]:6607.8829431403337In [118]:%timeitnetworkLength(coords,edges)...:%timeitvectorized_app1(coords,edges)...:%timeitvectorized_app2(coords,edges)...:1loops,best of 3:519msperloop1000 loops,best of 3:822µsperloop1000 loops,best of 3:668µsperloop
Post a Comment for "Vectorizing Euclidean Distance Computation - Numpy"