Referenced code: nlp-tutorial/4-3.Bi-LSTM(Attention)/Bi-LSTM(Attention)-Torch.py, line 50 in cb4881e
Hi, this repo is awesome, but there might be something wrong in the code above. According to the comment above it, this snippet intends to change a tensor from shape [num_layers(=1) * num_directions(=2), batch_size, n_hidden] to shape [batch_size, n_hidden * num_directions(=2), 1(=n_layer)], i.e. to concatenate the two hidden vectors from the two directions for every data example in a batch (by "data example" I mean one of the batch_size examples in a batch). But I think the code above will mix up the data examples in a batch and lead to unexpected results.

For example, we can use IPython to check the effect of the snippet above.
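Here is a minimal sketch of that check. The values in `final_state` are made up for illustration, and I'm assuming the referenced line reads `hidden = final_state.view(-1, n_hidden * 2, 1)`, which matches the shapes described in the comment:

```python
import torch

batch_size, n_hidden = 3, 5

# final_state : [num_layers(=1) * num_directions(=2), batch_size, n_hidden]
final_state = torch.arange(2 * batch_size * n_hidden).view(2, batch_size, n_hidden)

# The reshape from the referenced line: view() just reinterprets the flat
# memory, so consecutive rows get glued together regardless of which
# data example they belong to
hidden = final_state.view(-1, n_hidden * 2, 1)
print(hidden.squeeze(-1))
# tensor([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
#         [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]])
```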
As you can see, we create a tensor with batch_size=3 and n_hidden=5. Here [ 0, 1, 2, 3, 4] and [15, 16, 17, 18, 19] belong to the same data example in the batch but come from different directions, so what we want is to concatenate those two in the resulting tensor. What the code actually concatenates, however, is [ 0, 1, 2, 3, 4] and [ 5, 6, 7, 8, 9], which come from different data examples in the batch.

I think this can be fixed by changing that line of code to
hidden = torch.cat([final_state[0], final_state[1]], 1).view(-1, 10, 1)
The effect of the new code can be shown as follows:
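Continuing the sketch above with the same `final_state`:

```python
# The corrected line: concatenate the two direction slices along the hidden
# dimension, so each row pairs the two hidden vectors of the same example
hidden = torch.cat([final_state[0], final_state[1]], 1).view(-1, 10, 1)
print(hidden.squeeze(-1))
# tensor([[ 0,  1,  2,  3,  4, 15, 16, 17, 18, 19],
#         [ 5,  6,  7,  8,  9, 20, 21, 22, 23, 24],
#         [10, 11, 12, 13, 14, 25, 26, 27, 28, 29]])
```

Now [ 0, 1, 2, 3, 4] is paired with [15, 16, 17, 18, 19], as intended.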