
Numpy Random Choice, Replacement Only Along One Axis

I need to sample a bunch of pairs of points from an array. Each pair must consist of two DISTINCT points, but the same point may appear in several different pairs. e.g.,

Solution 1:

To sample a single pair without replacement, you can use np.random.choice:

np.random.choice(X, size=2, replace=False)
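To draw several such pairs, one option is to repeat the call once per pair, so that replacement is disabled within a pair while values can still recur across pairs. A minimal sketch, assuming X is a 1-D array; the example data and the name `n_pairs` are hypothetical:

```python
import numpy as np

X = np.arange(10)   # example data (assumption)
n_pairs = 5         # hypothetical: number of pairs to draw

# One independent draw per pair: no repeats inside a pair,
# but the same element may show up in several pairs.
pairs = np.array([np.random.choice(X, size=2, replace=False)
                  for _ in range(n_pairs)])
```

The Python-level loop makes this slow when many pairs are needed, which is what the vectorized approaches below address.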

Alternatively, to sample several pairs at once, note that every possible pair corresponds to one element of range(len(X)*(len(X)-1)//2), so you can enumerate the pairs and sample indices into that enumeration with np.random.randint:

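import itertools
import numpy as np
# Enumerate every unordered pair once, then draw 10 of them (repeats allowed).
combs = np.array(list(itertools.combinations(X, 2)))
sample = np.random.randint(len(combs), size=10)
combs[sample]  # one sampled pair per row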
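If X holds points as rows, enumerating index pairs rather than value pairs keeps the same idea without copying the data. A sketch using np.triu_indices, which still needs memory for all len(X)*(len(X)-1)//2 index pairs; the example array is an assumption:

```python
import numpy as np

X = np.random.rand(6, 3)                    # example: 6 points in 3-D (assumption)
i, j = np.triu_indices(len(X), k=1)         # all index pairs with i < j
k = np.random.randint(i.size, size=10)      # pick 10 pairs, repeats allowed
pairs = X[np.stack((i[k], j[k]), axis=1)]   # shape (10, 2, 3)
```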

Following up on @user2357112's comment: the OP's own answer suggests that the sample size does not need to be deterministic, and sampling with the Mersenne Twister is slower than basic arithmetic operations, so it is cheaper to draw one random integer per pair and split it with integer division and modulo. If X is so large that generating all the combinations is not feasible, a different solution is therefore to draw index pairs directly and discard those whose two indices coincide:

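sample = np.random.randint(len(X)**2, size=N)  # N candidate pairs, one integer each
i1 = sample // len(X)                          # first index of each pair
i2 = sample % len(X)                           # second index of each pair
X[np.vstack((i1, i2)).T[i1 != i2]]             # keep only pairs with distinct indices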

This produces a sample whose average size is N * (1 - 1/len(X)).
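If exactly N pairs are required, a different technique (not taken from the answer above) is to draw the second index from one fewer value and shift it past the first, so that no candidate ever has to be rejected. A minimal sketch, assuming len(X) >= 2; the function name `sample_distinct_pairs` is hypothetical:

```python
import numpy as np

def sample_distinct_pairs(X, N):
    """Return exactly N pairs of distinct elements of X (hypothetical helper)."""
    n = len(X)
    i1 = np.random.randint(n, size=N)       # first index: any of the n values
    i2 = np.random.randint(n - 1, size=N)   # second index: one of the other n - 1
    i2 += i2 >= i1                          # shift past i1 so that i2 != i1
    return X[np.stack((i1, i2), axis=1)]

pairs = sample_distinct_pairs(np.arange(100), N=1000)
```

Each ordered pair of distinct indices is equally likely under this scheme, and the result always has N rows.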

Solution 2:

Here is @user2357112's solution:

def sample_indices(X, n=4):
    pair_indices = np.random.randint(X.shape[0]**2, size=n)
    pair_indices = np.hstack(((pair_indices // X.shape[0]).reshape((-1,1)), (pair_indices % X.shape[0]).reshape((-1,1))))
    good_indices = pair_indices[:,0] != pair_indices[:,1]
    return X[pair_indices[good_indices]]
