Compound assignment operators in Python's Numpy library


The "vectorizing" of fancy indexing by Python's numpy library sometimes gives unexpected results. For example:

import numpy
a = numpy.zeros((1000,4), dtype='uint32')
b = numpy.zeros((1000,4), dtype='uint32')
i = numpy.random.randint(0, 1000, 1000)  # row indices in 0..999
j = numpy.random.randint(0, 4, 1000)     # column indices in 0..3

a[i,j] += 1  # fancy-indexing version
for k in range(1000):  # explicit loop version
    b[i[k],j[k]] += 1

This gives different results in the arrays 'a' and 'b' (i.e. each tuple (i,j) that appears is counted as 1 in 'a' regardless of repeats, whereas repeats are counted in 'b'). This is easily verified as follows:

numpy.sum(a)
883
numpy.sum(b)
1000
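To make the difference easier to see, here is a minimal sketch with a handful of hand-picked index pairs instead of random ones. The small shape (3, 4) and the specific index values are assumptions chosen for illustration only. The fancy-indexed form behaves like a[i, j] = a[i, j] + 1, so a repeated pair is written once with the same value, while the loop increments once per occurrence.

```python
import numpy as np

# Small, deterministic index pairs; the pair (0, 1) occurs three times.
i = np.array([0, 0, 0, 2])
j = np.array([1, 1, 1, 3])

a = np.zeros((3, 4), dtype='uint32')
b = np.zeros((3, 4), dtype='uint32')

# Fancy-indexed compound assignment: reads a[i, j], adds 1, writes back,
# so each repeated pair ends up incremented only once.
a[i, j] += 1

# Explicit loop: every occurrence increments the element.
for k in range(len(i)):
    b[i[k], j[k]] += 1

print(a[0, 1], b[0, 1])  # 1 3
print(a.sum(), b.sum())  # 2 4
```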

It is also notable that the fancy indexing version is almost two orders of magnitude faster than the loop. My question is: "Is there an efficient way for numpy to compute the repeat counts, as implemented using the loop in the provided example?"

This should do what you want:

np.bincount(np.ravel_multi_index((i, j), (1000, 4)), minlength=4000).reshape(1000, 4) 

As a breakdown, ravel_multi_index converts the index pairs specified by i and j to integer indices into a C-flattened array; bincount counts the number of times each value 0..3999 appears in the list of indices; and reshape converts the C-flattened array back to a 2d array.
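The three steps can be followed on a tiny input. This is only a sketch: the (3, 4) shape and the index values are made up for illustration, and the intermediate names flat, counts_flat and counts are hypothetical.

```python
import numpy as np

shape = (3, 4)
i = np.array([0, 0, 2, 0])
j = np.array([1, 1, 3, 1])

# Step 1: each (row, col) pair becomes row * 4 + col in the flattened array,
# so (0, 1) maps to 1 and (2, 3) maps to 11.
flat = np.ravel_multi_index((i, j), shape)

# Step 2: count occurrences of each flat index; minlength pads the result
# so that it covers every cell, including cells that were never hit.
counts_flat = np.bincount(flat, minlength=shape[0] * shape[1])

# Step 3: fold the flat counts back into the 2d layout.
counts = counts_flat.reshape(shape)

print(flat.tolist())  # [1, 1, 11, 1]
print(counts[0, 1], counts[2, 3])  # 3 1
```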

In terms of performance, I measure it at 200 times faster than "b", and 5 times faster than "a"; your mileage may vary.

Since you need to write the counts to an existing array a, try this:

u, inv = np.unique(np.ravel_multi_index((i, j), (1000, 4)), return_inverse=True)
a.flat[u] += np.bincount(inv).astype(a.dtype)  # cast so the in-place add works when a is unsigned

I make this second method a little slower (2x) than "a", which isn't surprising as the unique stage is going to be slow.
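A quick way to convince yourself that the in-place version matches the loop is to run both on the same indices and compare. The sketch below makes a few assumptions: the seed and the starting array of ones are arbitrary, and the explicit cast to a's dtype is added here because in-place addition of bincount's signed result into an unsigned array is expected to be rejected by NumPy's casting rules. The last check uses np.add.at, which is not part of the answer above; it is included only as an independent cross-check.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
shape = (1000, 4)
i = rng.integers(0, shape[0], 1000)
j = rng.integers(0, shape[1], 1000)

# Start from a non-zero array to show the counts are added, not overwritten.
a = np.ones(shape, dtype='uint32')

# Reference result from the explicit loop.
expected = a.copy()
for k in range(len(i)):
    expected[i[k], j[k]] += 1

# unique/bincount method: u holds the distinct flat indices and
# bincount(inv) holds how many times each of them occurred.
u, inv = np.unique(np.ravel_multi_index((i, j), shape), return_inverse=True)
a.flat[u] += np.bincount(inv).astype(a.dtype)

# Cross-check with unbuffered in-place addition.
c = np.ones(shape, dtype='uint32')
np.add.at(c, (i, j), 1)

print(np.array_equal(a, expected), np.array_equal(c, expected))
```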

