Issue
I refer to this question, which already has a good answer. Some unnecessary operations were identified there (see the discussion in that post), and I was curious whether I could eliminate them.
In the meantime, I found a method which avoids unnecessary multiplications (using masks for indexing) and gives the same result. The code is below.
Variant 1 is the original.
In Variant 2 I tried to use Python slicing combined with masking, not only to write the two loops in a more compact way, but mainly in the hope that it would be faster. It turned out to be about 30% slower. To be honest, the original code is more readable, but I was hoping for a significant improvement over the double loop.
Why is this not the case?
Or, asked the other way around: in which situations are slice operations faster than element-wise operations? Are they just syntactic sugar with significant internal overhead? I thought they were implemented in C/C++ under the hood and therefore had to be faster than manually looping over i, j in Python.
The output:
D:\python\animation>python test.py
used time for variant 1: 1.0377624034881592
used time for variant 2: 1.30381441116333
D:\python\animation>python test.py
used time for variant 1: 0.8954949378967285
used time for variant 2: 1.251044750213623
D:\python\animation>python test.py
used time for variant 1: 0.9750621318817139
used time for variant 2: 1.3896379470825195
The code:
import numpy as np
import numpy.ma as ma
import time

def test():
    f = np.array([
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 3, 6, 4, 2, 0],
        [0, 2, 4, 7, 6, 4, 0],
        [0, 0, 0, 0, 0, 0, 0]
    ])
    u = np.array([
        [0, 0, 0, 0, 0, 0, 0],
        [0, 0.5, 1, 0, -1, -0.5, 0],
        [0, 0.7, 1.1, 0, -1, -0.4, 0],
        [0, 0, 0, 0, 0, 0, 0],
    ])

    # calculate : variant 1
    # note: f holds integers, so zeros_like(f) is an integer array and the
    # float products are truncated when they are assigned into x (and y below)
    x = np.zeros_like(f)
    maxcount = 100000
    start = time.time()
    for count in range(maxcount):
        for i in range(1, u.shape[0]-1):
            for j in range(1, u.shape[1]-1):
                if u[i,j] > 0:
                    x[i,j] = u[i,j]*(f[i,j]-f[i,j-1])
                else:
                    x[i,j] = u[i,j]*(f[i,j+1]-f[i,j])
    end = time.time()
    print("used time for variant 1:", end-start)

    # calculate : variant 2
    y = np.zeros_like(f)
    start = time.time()
    for count in range(maxcount):
        maskl = (u[1:-1, 1:-1] > 0)
        maskr = ~maskl
        diff = f[1:-1, 1:] - f[1:-1, 0:-1]
        (y[1:-1, 1:-1])[maskl] = (u[1:-1, 1:-1])[maskl] * (diff[:, :-1])[maskl]
        (y[1:-1, 1:-1])[maskr] = (u[1:-1, 1:-1])[maskr] * (diff[:, 1:])[maskr]
    end = time.time()
    print("used time for variant 2:", end-start)

    np.testing.assert_array_equal(x, y)

test()
"Pre-fetching" slices for u and y makes it a bit better, but not significantly:
for count in range(maxcount):
    maskl = (u[1:-1, 1:-1] > 0)
    maskr = ~maskl
    diff = f[1:-1, 1:] - f[1:-1, 0:-1]
    yy = (y[1:-1, 1:-1])   # <<--
    uu = (u[1:-1, 1:-1])   # <<--
    yy[maskl] = uu[maskl] * (diff[:, :-1])[maskl]
    yy[maskr] = uu[maskr] * (diff[:, 1:])[maskr]
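One way to probe the question is to time both variants on grids of different sizes. Every NumPy slicing, masking, or arithmetic call carries a fixed overhead, and on a 4x7 grid there are only ten interior elements to spread that cost over. The sketch below is not part of the original post: the function names `loop_variant`, `masked_variant`, and `time_variants`, the random test data, and the float result arrays are all assumptions made for illustration. It prints timings without claiming any particular outcome.

```python
import timeit
import numpy as np

def loop_variant(f, u):
    """Double loop from variant 1, writing into a float array."""
    x = np.zeros(f.shape, dtype=float)
    for i in range(1, u.shape[0] - 1):
        for j in range(1, u.shape[1] - 1):
            if u[i, j] > 0:
                x[i, j] = u[i, j] * (f[i, j] - f[i, j - 1])
            else:
                x[i, j] = u[i, j] * (f[i, j + 1] - f[i, j])
    return x

def masked_variant(f, u):
    """Slicing plus boolean masks from variant 2, writing into a float array."""
    y = np.zeros(f.shape, dtype=float)
    yy = y[1:-1, 1:-1]            # view: writes go through to y
    uu = u[1:-1, 1:-1]
    maskl = uu > 0
    maskr = ~maskl
    diff = f[1:-1, 1:] - f[1:-1, :-1]
    yy[maskl] = uu[maskl] * diff[:, :-1][maskl]
    yy[maskr] = uu[maskr] * diff[:, 1:][maskr]
    return y

def time_variants(shape, number):
    """Return (loop_seconds, masked_seconds) for random data of the given shape."""
    rng = np.random.default_rng(0)      # fixed seed, hypothetical test data
    f = rng.random(shape)
    u = rng.random(shape) - 0.5         # mix of positive and negative values
    np.testing.assert_allclose(loop_variant(f, u), masked_variant(f, u))
    t_loop = timeit.timeit(lambda: loop_variant(f, u), number=number)
    t_mask = timeit.timeit(lambda: masked_variant(f, u), number=number)
    return t_loop, t_mask

if __name__ == "__main__":
    for shape, number in [((4, 7), 2000), ((100, 100), 5)]:
        t_loop, t_mask = time_variants(shape, number)
        print(shape, "loop:", t_loop, "masked:", t_mask)
```

Using `timeit` instead of `time.time()` is a design choice of this sketch; it repeats the call a fixed number of times and uses a clock intended for benchmarking.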
Solution
I'm not getting quite the same answer as you, probably because I'm using floating-point arrays instead of integer arrays (or I have a bug in my program), but you might find something like this a little simpler:
temp = np.zeros_like(f)
temp[:, 1:] = f[:, :-1]   # temp[a, b] = f[a, b - 1]
x1 = u * (f - temp)
temp[:, :-1] = f[:, 1:]   # temp[a, b] = f[a, b + 1]; the last column of temp keeps its old values, which is harmless here because u is 0 on the border
x2 = u * (temp - f)
result = np.where(u > 0, x1, x2)
I think this is a little bit clearer about your intent, and doesn't involve lots of masking.
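The difference in results mentioned above is consistent with the integer dtype in the question: `np.zeros_like(f)` inherits the integer dtype of `f`, so float products are truncated when assigned. The sketch below is a variation on the idea, not the answerer's exact code. It assumes float input arrays, uses slices of the interior instead of a temporary array, and checks the result against the double loop. The names `loop_reference` and `where_version` are made up for this example.

```python
import numpy as np

# Same values as in the question, stored as floats (an assumption of this sketch).
f = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 3, 6, 4, 2, 0],
    [0, 2, 4, 7, 6, 4, 0],
    [0, 0, 0, 0, 0, 0, 0],
], dtype=float)
u = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0.5, 1, 0, -1, -0.5, 0],
    [0, 0.7, 1.1, 0, -1, -0.4, 0],
    [0, 0, 0, 0, 0, 0, 0],
])

def loop_reference(f, u):
    """Double loop from the question, but with a float result array."""
    x = np.zeros(f.shape, dtype=float)
    for i in range(1, u.shape[0] - 1):
        for j in range(1, u.shape[1] - 1):
            if u[i, j] > 0:
                x[i, j] = u[i, j] * (f[i, j] - f[i, j - 1])
            else:
                x[i, j] = u[i, j] * (f[i, j + 1] - f[i, j])
    return x

def where_version(f, u):
    """np.where on interior slices; the border of the result stays 0."""
    out = np.zeros(f.shape, dtype=float)
    ui = u[1:-1, 1:-1]
    backward = f[1:-1, 1:-1] - f[1:-1, :-2]   # f[i, j] - f[i, j-1]
    forward = f[1:-1, 2:] - f[1:-1, 1:-1]     # f[i, j+1] - f[i, j]
    out[1:-1, 1:-1] = ui * np.where(ui > 0, backward, forward)
    return out

expected = loop_reference(f, u)
result = where_version(f, u)
np.testing.assert_allclose(result, expected)

# With an integer result array, as in the question, the fraction is dropped.
x_int = np.zeros_like(f.astype(int))
x_int[1, 1] = u[1, 1] * (f[1, 1] - f[1, 0])   # 0.5 is stored as 0
```

Note that `np.where` evaluates both candidate differences for every interior element and then selects one, so it trades some extra arithmetic for fewer indexing operations.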
Answered By - Frank Yellin