Bi-LSTM attention calc may be wrong #68

liuxiaoqun · 2021-06-03T14:44:03Z

lstm_output : [batch_size, n_step, n_hidden * num_directions(=2)], F matrix

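def attention_net(self, lstm_output, final_state):
    batch_size = len(lstm_output)
    # final_state: [num_directions(=2), batch_size, n_hidden]
    hidden_forward = final_state[0]   # [batch_size, n_hidden]
    hidden_backward = final_state[1]  # [batch_size, n_hidden]
    hidden_f_b = torch.cat((hidden_forward, hidden_backward), 1)  # [batch_size, n_hidden * 2]
    hidden = hidden_f_b.view(batch_size, -1, 1)  # [batch_size, n_hidden * 2, 1]
    # hidden = final_state.view(batch_size, -1, 1)  # original line in the source code, which is wrong

The Bi-LSTM's final hidden state has shape [2, batch_size, n_hidden], so the forward and backward hidden states need to be concatenated per sample. final_state.view(batch_size, -1, 1) only reinterprets the tensor's memory in its existing order, so the hidden vector for sample 0 is not the concatenation of final_state[0][0] and final_state[1][0]. For batch_size > 1 it starts with final_state[0][0] followed by final_state[0][1], which is the forward state of a different sample.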
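A small sketch of the mixing, using toy sizes and hand-picked values that are assumptions for illustration only (they are not from the repository). Each value encodes its origin: the forward state of sample b is 10 + b and the backward state is 20 + b.

```python
import torch

# Hypothetical toy sizes: 2 directions, a batch of 3, n_hidden of 2.
num_directions, batch_size, n_hidden = 2, 3, 2

# Fill the tensor so every value reveals its direction and sample index.
final_state = torch.empty(num_directions, batch_size, n_hidden)
for b in range(batch_size):
    final_state[0, b] = 10 + b  # forward state of sample b
    final_state[1, b] = 20 + b  # backward state of sample b

# The line from the source code: reinterprets memory in its existing order.
wrong = final_state.view(batch_size, -1, 1)

# The concatenation proposed in this issue.
right = torch.cat((final_state[0], final_state[1]), 1).view(batch_size, -1, 1)

print(wrong[0].flatten().tolist())  # [10.0, 10.0, 11.0, 11.0]
print(right[0].flatten().tolist())  # [10.0, 10.0, 20.0, 20.0]
```

With view alone, sample 0 receives the forward states of samples 0 and 1. With the concatenation, it receives its own forward and backward states.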


liuxiaoqun · 2021-06-04T06:42:47Z

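The line hidden = final_state.view(batch_size, -1, 1) should be hidden = final_state.transpose(0, 1).reshape(batch_size, -1, 1). The transpose puts the batch dimension first, so each sample's forward and backward states end up adjacent before flattening.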
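One way to check this suggestion is to compare it against an explicit concatenation on the output of a real bidirectional nn.LSTM. This is a sketch, not the repository's code: the sizes and the random input are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for this check.
batch_size, n_step, embedding_dim, n_hidden = 4, 5, 3, 6

lstm = nn.LSTM(embedding_dim, n_hidden, bidirectional=True)
x = torch.randn(n_step, batch_size, embedding_dim)  # [n_step, batch_size, embedding_dim]
_, (final_state, _) = lstm(x)                        # final_state: [2, batch_size, n_hidden]

# Explicit per-sample concatenation of forward and backward states.
by_cat = torch.cat((final_state[0], final_state[1]), 1).view(batch_size, -1, 1)

# transpose returns a non-contiguous tensor, so reshape is used instead of view.
by_transpose = final_state.transpose(0, 1).reshape(batch_size, -1, 1)

# The line from the source code.
by_view = final_state.view(batch_size, -1, 1)

print(torch.equal(by_cat, by_transpose))  # True
print(torch.equal(by_cat, by_view))       # False
```

Both the transpose-and-reshape form and the concatenation produce the same [batch_size, n_hidden * 2, 1] tensor. The plain view has the same shape but different contents.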

randydkx · 2021-09-26T12:18:09Z

I think so too.
