Bug when visualizing T5 models with generate #133

Open
salokr opened this issue May 21, 2024 · 1 comment

salokr commented May 21, 2024

I tried to visualize the attention maps for the T5 model but ran into issues while generating the plots.

I would like to emphasize a few points:

  • I used model.generate because my real project has no labels, and generate can also output attentions.
  • I also tried passing the input directly to the model, but that doesn't work either.

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("google-t5/t5-small")
model = T5ForConditionalGeneration.from_pretrained("google-t5/t5-small")

input_ids = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").input_ids

# Ask generate to return the attentions alongside the generated sequence
outputs = model.generate(input_ids, output_attentions=True, return_dict_in_generate=True)

encoder_text = tokenizer.convert_ids_to_tokens(input_ids[0])
decoder_text = tokenizer.convert_ids_to_tokens(outputs.sequences[0])

from bertviz import model_view

# This is the call that fails
model_view(
    encoder_attention=outputs.encoder_attentions,
    decoder_attention=outputs.decoder_attentions,
    cross_attention=outputs.cross_attentions,
    encoder_tokens=encoder_text,
    decoder_tokens=decoder_text
)
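
For reference, the structure that generate actually returns can be inspected like this (a quick sketch; the shape comment assumes transformers' default use_cache=True behavior during generation):

# decoder_attentions is a tuple with one entry per generated token;
# each entry is itself a tuple with one tensor per decoder layer.
print(type(outputs.decoder_attentions))        # tuple, one element per generation step
print(type(outputs.decoder_attentions[0]))     # tuple, one element per decoder layer
print(outputs.decoder_attentions[0][0].shape)  # e.g. (1, num_heads, 1, 1) at the first step

# format_attention in bertviz/util.py iterates over the outer tuple and
# calls .shape on each element, which fails because the elements are tuples.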

The error I am getting is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[56], line 6
      2 decoder_text = tokenizer.convert_ids_to_tokens(outputs.sequences[0])#tokenizer.convert_ids_to_tokens(decoder_input_ids[0])
      5 from bertviz import model_view
----> 6 model_view(
      7     encoder_attention=outputs.encoder_attentions,
      8     decoder_attention=outputs.decoder_attentions,
      9     cross_attention=outputs.cross_attentions,
     10     encoder_tokens= encoder_text,
     11     decoder_tokens = decoder_text
     12 )

File /Volumes/x/envs/my_env/lib/python3.8/site-packages/bertviz/model_view.py:147, in model_view(attention, tokens, sentence_b_start, prettify_tokens, display_mode, encoder_attention, decoder_attention, cross_attention, encoder_tokens, decoder_tokens, include_layers, include_heads, html_action)
    145     if include_heads is None:
    146         include_heads = list(range(n_heads))
--> 147     decoder_attention = format_attention(decoder_attention, include_layers, include_heads)
    148     attn_data.append(
    149         {
    150             'name': 'Decoder',
   (...)
    154         }
    155     )
    156 if cross_attention is not None:

File /Volumes/x/envs/my_env/lib/python3.8/site-packages/bertviz/util.py:10, in format_attention(attention, layers, heads)
      7 squeezed = []
      8 for layer_attention in attention:
      9     # 1 x num_heads x seq_len x seq_len
---> 10     if len(layer_attention.shape) != 4:
     11         raise ValueError("The attention tensor does not have the correct number of dimensions. Make sure you set "
     12                          "output_attentions=True when initializing your model.")
     13     layer_attention = layer_attention.squeeze(0)

AttributeError: 'tuple' object has no attribute 'shape'

@nikhilajoshy

Looks like outputs.decoder_attentions is not a sequence of tensors but a sequence of tuples in your case: generate returns one tuple of per-layer attentions for each generated token. Please check the data types of the attention variables.
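
One way to get attention tensors in the per-layer layout model_view expects is to re-run a single forward pass with the generated ids as decoder_input_ids (a minimal sketch, not an official bertviz recipe; it reuses the variables from the snippet above, and forward_outputs is just an illustrative name):

import torch

# Re-run a plain forward pass over the full generated sequence. Unlike
# generate, a forward call returns one attention tensor per layer, each
# spanning the whole target sequence.
with torch.no_grad():
    forward_outputs = model(
        input_ids=input_ids,
        decoder_input_ids=outputs.sequences,  # starts with the decoder start token
        output_attentions=True,
        return_dict=True,
    )

model_view(
    encoder_attention=forward_outputs.encoder_attentions,  # per layer: (1, heads, src, src)
    decoder_attention=forward_outputs.decoder_attentions,  # per layer: (1, heads, tgt, tgt)
    cross_attention=forward_outputs.cross_attentions,      # per layer: (1, heads, tgt, src)
    encoder_tokens=encoder_text,
    decoder_tokens=decoder_text,
)

Since decoder_text was built from outputs.sequences[0], its length matches the target dimension of these tensors.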
