Compound assignment operators in Python's NumPy library
the "vectorizing" of fancy indexing python's numpy library gives unexpected results. example:
    import numpy

    a = numpy.zeros((1000, 4), dtype='uint32')
    b = numpy.zeros((1000, 4), dtype='uint32')
    i = numpy.random.random_integers(0, 999, 1000)
    j = numpy.random.random_integers(0, 3, 1000)

    a[i, j] += 1
    for k in xrange(1000):
        b[i[k], j[k]] += 1
This gives different results in the arrays 'a' and 'b': each index pair (i, j) is counted only once in 'a', regardless of repeats, whereas repeats are counted in 'b'. (The fancy-indexed a[i, j] += 1 is a buffered read-modify-write, so duplicate indices increment only once.) This is verified as follows:
    >>> numpy.sum(a)
    883
    >>> numpy.sum(b)
    1000
It is notable that the fancy indexing version is two orders of magnitude faster than the loop. The question is: "Is there an efficient way for NumPy to compute the repeat counts as implemented using the loop in the provided example?"
This should do what you want:
    np.bincount(np.ravel_multi_index((i, j), (1000, 4)),
                minlength=4000).reshape(1000, 4)
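For instance, assuming the setup from the question above (and NumPy imported as np), a quick check confirms that this reproduces the loop-computed counts in b; the name counts is mine:

    import numpy as np

    counts = np.bincount(np.ravel_multi_index((i, j), (1000, 4)),
                         minlength=4000).reshape(1000, 4)
    print(np.array_equal(counts, b))  # True: repeats are counted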
As a breakdown, ravel_multi_index converts the index pairs given by i and j into integer indices into the C-flattened array; bincount counts the number of times each value from 0 through 3999 appears in that list of indices; and reshape converts the flat array of counts back into a (1000, 4) 2D array.
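To make the breakdown concrete, here is a minimal sketch on a hypothetical 2x2 array (shapes and index values are mine, chosen so every intermediate can be inspected by hand):

    import numpy as np

    i = np.array([0, 1, 0])  # row indices; the pair (0, 1) occurs twice
    j = np.array([1, 0, 1])  # column indices

    flat = np.ravel_multi_index((i, j), (2, 2))
    # flat == [1, 2, 1]: each (row, col) pair maps to row*2 + col

    counts = np.bincount(flat, minlength=4)
    # counts == [0, 2, 1, 0]: flat index 1 appears twice, index 2 once

    print(counts.reshape(2, 2))
    # [[0 2]
    #  [1 0]]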
In terms of performance, I measure it at about 200 times faster than computing "b", and about 5 times faster than computing "a"; your mileage may vary.
Since you need to write the counts into the existing array a, try this:
    u, inv = np.unique(np.ravel_multi_index((i, j), (1000, 4)),
                       return_inverse=True)
    a.flat[u] += np.bincount(inv)
I measure this second method to be a little slower (about 2x) than computing "a", which isn't surprising since the unique stage is going to be slow.
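To see how the pieces fit together: u holds each distinct flat index once, and np.bincount(inv) holds the corresponding repeat counts. A minimal sanity check, reusing the question's i, j, and b (the name a2 is my stand-in for an existing target array):

    a2 = np.zeros((1000, 4), dtype='uint32')
    u, inv = np.unique(np.ravel_multi_index((i, j), (1000, 4)),
                       return_inverse=True)
    a2.flat[u] += np.bincount(inv)
    print(np.array_equal(a2, b))  # True: matches the loop-computed counts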