Issue
I have a large 3D array (~2000 x 1000 x 1000). I want to replace each value in the array with a random integer between 1 and the current max, such that all elements equal to a given value x are updated to the same random integer. Zeros must stay unchanged. There also can't be any repeats, i.e. two different values in the original array can't be updated to the same random int. The values currently span a continuous range between 0 and 9000, so there are quite a lot of values in the array:
np.amax(arr) #output = 9000
So I tried the method below...
max_v = np.amax(arr)
vlist = []
for l in range(1,max_v):
    vlist.append(l)
for l in tqdm(range(1,max_v)):
    m = random.randint(1,len(vlist))
    n = vlist[m]
    arr = np.where(arr == l, n, arr)
    vlist.remove(n)
My current code takes about 13 s per iteration (for the first few iterations at least), and with 9000 iterations that is too slow. I've thought about parallelisation with concurrent.futures, but I've likely missed something obvious here XD
Solution
If your current values are in a continuous range, and you want another continuous range, you're in luck! At that point, you aren't really generating 2 billion random numbers: you're just permuting 9000 or so integers. For example:
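import numpy as np
arr = np.random.randint(9001, size=(10, 20, 20))  # example data, values 0..9000
p = np.arange(arr.max() + 1)  # lookup table: p[old] -> new
np.random.shuffle(p)  # random permutation of 0..max, shuffled in place
arr = p[arr]  # integer indexing relabels every element in one pass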
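The question also asks that zeros stay unchanged, which the snippet above does not guarantee, since 0 is shuffled along with every other value. A minimal sketch of one way to handle that, assuming the labels are non-negative integers: shuffle only the slice p[1:], which is a view into p, so p[0] stays 0. The seed, the array size, and the name relabeled are illustrative choices, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# Small stand-in for the real array; labels lie in 0..9000.
arr = rng.integers(0, 9001, size=(10, 20, 20))

# Lookup table: p[old_label] gives the new label.
p = np.arange(int(arr.max()) + 1)
rng.shuffle(p[1:])  # p[1:] is a view, so this shuffles p in place and leaves p[0] == 0

# One vectorised pass over the array instead of one pass per label.
relabeled = p[arr]
```

Because p is a permutation of 0..max with 0 fixed, equal inputs map to equal outputs, distinct inputs map to distinct outputs, and zeros are preserved.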
The replacement values do not have to start with zero, but if you plan on doing this iteratively, you will have to subtract off the offset before using arr as an index into p.
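To illustrate the offset remark, here is a rough sketch of two successive relabelings where the replacement values start at a non-zero value. The name offset, its value of 100, and the small array are hypothetical and only there to make the example concrete.

```python
import numpy as np

rng = np.random.default_rng(1)  # seed chosen only for reproducibility
offset = 100                    # hypothetical start of the replacement range

original = rng.integers(0, 50, size=(4, 5, 5))  # labels in 0..49

# First pass: labels are 0-based, so they can index the table directly.
p = rng.permutation(int(original.max()) + 1) + offset
arr = p[original]

# Second pass: labels now start at `offset`, so shift back to 0-based
# before using them as indices into the new table.
p2 = rng.permutation(int(arr.max()) - offset + 1) + offset
arr = p2[arr - offset]
```

Without the subtraction in the second pass, the indices would run past the end of p2 and raise an IndexError.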
Answered By - Mad Physicist