Getting the following error when using Llama 3 with Unsloth and calling the `generate` function, passing embeddings as well as token IDs:
```python
self.llama_model, self.tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=self.max_seq_length,
    dtype=dtype,
    load_in_4bit=True,
    trust_remote_code=True,
)

# Configure LoRA
self.llama_model = FastLanguageModel.get_peft_model(
    self.llama_model,
    r=64,  # Choose any number > 0; suggested values: 8, 16, 32, 64, 128
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=128,
    lora_dropout=0,  # Optimized value
    bias="none",  # Optimized value
    use_gradient_checkpointing=True,
    random_state=3407,
)

generation_params = {
    "input_ids": input_ids,
    "inputs_embeds": inputs_embeds,
    "attention_mask": new_attention_mask,
    "max_length": max_length,
    "num_beams": num_beams,
    "do_sample": do_sample,
    "temperature": temperature,
}
if top_k is not None:
    generation_params["top_k"] = top_k
if top_p is not None:
    generation_params["top_p"] = top_p

output_ids = self.llama_model.generate(**generation_params)
```
Error:
```
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py in _prepare_model_inputs(self, inputs, bos_token_id, model_kwargs)
    413             inspect.signature(self.prepare_inputs_for_generation).parameters.keys()
    414         )
--> 415         if not has_inputs_embeds_forwarding:
    416             raise ValueError(
    417                 f"You passed `inputs_embeds` to `.generate()`, but the model class {self.__class__.__name__} "

ValueError: You passed `inputs_embeds` to `.generate()`, but the model class LlamaForCausalLM doesn't have its forwarding implemented. See the GPT2 implementation for an example (https://github.com/huggingface/transformers/pull/21405), and feel free to open a PR with it!
```
I can see that in `modeling_llama.py` this is implemented. I think the issue may simply be that the code below, which raises the error, is referencing the `prepare_inputs_for_generation` function from PEFT (`peft.peft_model.PeftModelForCausalLM.prepare_inputs_for_generation`) rather than the one on the underlying base model. That's what appears to be resolved when I access the function via `llama_model.prepare_inputs_for_generation`.
```python
has_inputs_embeds_forwarding = "inputs_embeds" in set(
    inspect.signature(self.prepare_inputs_for_generation).parameters.keys()
)
if not has_inputs_embeds_forwarding:
    raise ValueError(
        f"You passed `inputs_embeds` to `.generate()`, but the model class {self.__class__.__name__} "
        "doesn't have its forwarding implemented. See the GPT2 implementation for an example "
        "(https://github.com/huggingface/transformers/pull/21405), and feel free to open a PR with it!"
    )
```
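To illustrate the hypothesis, here is a minimal, library-free sketch of the mechanism (the `BaseModel` and `PeftWrapper` classes are hypothetical stand-ins, not the real transformers/PEFT classes): a wrapper that forwards everything via `*args, **kwargs` hides `inputs_embeds` from the signature check above, even though the wrapped base method supports it.

```python
import inspect

# Hypothetical stand-ins for LlamaForCausalLM and PeftModelForCausalLM.
class BaseModel:
    # Names inputs_embeds explicitly, like modeling_llama.py does.
    def prepare_inputs_for_generation(self, input_ids=None, inputs_embeds=None, **kwargs):
        return {"input_ids": input_ids, "inputs_embeds": inputs_embeds}

class PeftWrapper:
    """Forwards blindly, so inputs_embeds never appears in its own signature."""
    def __init__(self, base):
        self.base_model = base

    def prepare_inputs_for_generation(self, *args, **kwargs):
        return self.base_model.prepare_inputs_for_generation(*args, **kwargs)

def has_inputs_embeds_forwarding(model):
    # The same check that transformers' _prepare_model_inputs performs.
    return "inputs_embeds" in set(
        inspect.signature(model.prepare_inputs_for_generation).parameters.keys()
    )

print(has_inputs_embeds_forwarding(BaseModel()))               # True
print(has_inputs_embeds_forwarding(PeftWrapper(BaseModel())))  # False
```

So the check can fail on the wrapper even though the base model would happily accept `inputs_embeds`.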
I'll debug further later, and if this is confirmed I can flesh this issue out more and raise a PR.
Expected behavior
It should reference the function from the base model, and passing `inputs_embeds` should work.
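If the shadowed signature is indeed the cause, one possible interim workaround (a self-contained sketch with hypothetical mock classes, not tested against real PEFT) is to rebind `prepare_inputs_for_generation` to a thin wrapper that names `inputs_embeds` explicitly, so the signature check passes while still delegating to the wrapped method:

```python
import inspect

# Hypothetical stand-ins; assumption: the PEFT wrapper's
# prepare_inputs_for_generation signature is just *args/**kwargs.
class BaseModel:
    def prepare_inputs_for_generation(self, input_ids=None, inputs_embeds=None, **kwargs):
        return {"input_ids": input_ids, "inputs_embeds": inputs_embeds, **kwargs}

class PeftWrapper:
    def __init__(self, base):
        self.base_model = base

    def prepare_inputs_for_generation(self, *args, **kwargs):
        return self.base_model.prepare_inputs_for_generation(*args, **kwargs)

model = PeftWrapper(BaseModel())

# Before the patch, the check generate() performs would fail.
assert "inputs_embeds" not in inspect.signature(
    model.prepare_inputs_for_generation).parameters

# Rebind to a wrapper that names inputs_embeds explicitly but still
# delegates to the original (wrapper) method.
_orig = model.prepare_inputs_for_generation
def patched(*args, inputs_embeds=None, **kwargs):
    if inputs_embeds is not None:
        kwargs["inputs_embeds"] = inputs_embeds
    return _orig(*args, **kwargs)
model.prepare_inputs_for_generation = patched

assert "inputs_embeds" in inspect.signature(
    model.prepare_inputs_for_generation).parameters
print(model.prepare_inputs_for_generation(inputs_embeds="embs"))
# {'input_ids': None, 'inputs_embeds': 'embs'}
```

Note this only spoofs the signature check; whether the real PEFT method correctly forwards `inputs_embeds` to the base model still needs verifying.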
System Info

Using Google Colab with a T4, `transformers` version 4.42.3.

Who can help?

@ArthurZucker