Issue
I was trying to understand how matrix multiplication works over more than 2 dimensions in deep-learning frameworks and I stumbled upon an article here. The author uses Keras to explain it and it works for him. But when I try to reproduce the same code in PyTorch, it fails with the error shown in the output of the following code:
Pytorch Code:
import torch

a = torch.ones((2,3,4))
b = torch.ones((7,4,5))
c = torch.matmul(a,b)
print(c.shape)
Output: RuntimeError: The size of tensor a (2) must match the size of tensor b (7) at non-singleton dimension 0
Keras Code:
from keras import backend as K

a = K.ones((2,3,4))
b = K.ones((7,4,5))
c = K.dot(a,b)
print(c.shape)
Output: (2, 3, 7, 5)
Can somebody explain what it is that I'm doing wrong?
Solution
Matrix multiplication (aka the matrix dot product) is a well-defined algebraic operation that takes two 2D matrices.
Deep-learning frameworks (e.g., TensorFlow, Keras, PyTorch) are tuned to operate on batches of matrices, hence they usually implement batched matrix multiplication, that is, applying the matrix dot product to a whole batch of 2D matrices at once.
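To make "batched" concrete, here is a minimal sketch (my illustration, not from the linked article) showing that batched matmul is just the plain 2D matrix product applied independently to each pair of matrices along the leading dimension:

import torch

a = torch.randn(2, 3, 4)   # a batch of 2 matrices, each 3x4
b = torch.randn(2, 4, 5)   # a batch of 2 matrices, each 4x5

batched = torch.matmul(a, b)   # shape (2, 3, 5)
# the same thing, computed one 2D product at a time
looped = torch.stack([a[i] @ b[i] for i in range(a.shape[0])])

print(torch.allclose(batched, looped))  # True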
The examples you linked to show how matmul processes a batch of matrices:
import tensorflow as tf

a = tf.ones((9, 8, 7, 4, 2))
b = tf.ones((9, 8, 7, 2, 5))
c = tf.matmul(a, b)
Note how all but the last two dimensions are identical ((9, 8, 7)).
This is NOT the case in your example - the leading ("batch") dimensions are different, hence the error.
Using identical leading dimensions in pytorch:
a = torch.ones((2,3,4))
b = torch.ones((2,4,5))
c = torch.matmul(a,b)
print(c.shape)
results in
torch.Size([2, 3, 5])
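As a side note (my addition, not part of the original answer): for strictly 3D tensors like these, torch.bmm computes the same batched product; unlike torch.matmul it does not broadcast, and it requires both inputs to be 3D with equal batch sizes.

# reusing a, b, c from the snippet above
c_bmm = torch.bmm(a, b)          # strictly batched 2D x 2D products
print(torch.allclose(c, c_bmm))  # True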
If you insist on dot products with different batch dimensions, you will have to explicitly define how to multiply the two tensors. You can do that using the very flexible torch.einsum:
a = torch.ones((2,3,4))
b = torch.ones((7,4,5))
c = torch.einsum('ijk,lkm->ijlm', a, b)
print(c.shape)
Resulting in:
torch.Size([2, 3, 7, 5])
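As a sanity check (my addition), every slice of c is just the ordinary 2D product of one matrix from a with one matrix from b, which is exactly what the 'ijk,lkm->ijlm' subscripts spell out:

# c[i, :, l, :] should equal a[i] @ b[l] for every pair (i, l)
ok = all(torch.allclose(c[i, :, l, :], a[i] @ b[l])
         for i in range(a.shape[0])
         for l in range(b.shape[0]))
print(ok)  # True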
Answered By - Shai