Some Bugs in JetMoE #31791
Comments
Thanks @Phoenix-Shen! Let me cc @yikangshen, who has contributed the model.
Hi @Phoenix-Shen, thanks for bringing up the issue! Your fix looks good to me. Would you like to submit a PR?
Ok, I've fixed all the bugs and am ready to submit a PR.
Thanks, reviewed!
System Info
transformers version: 4.43.0.dev0 (installed from source)
Who can help?
@ArthurZucker
Reproduction
Outline:
Several bugs prevent JetMoE from returning the gating (router) logits and from calculating aux_loss.
Requesting the gating logits in a forward call raises the following error:
Traceback (most recent call last):
  File "/home/ubuntu/ssk/test_jetmoe.py", line 18, in <module>
    output = model.forward(
  File "/home/ubuntu/miniconda3/envs/pytorch/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/pytorch/lib/python3.9/site-packages/transformers/models/jetmoe/modeling_jetmoe.py", line 1365, in forward
    self.num_experts,
  File "/home/ubuntu/miniconda3/envs/pytorch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1709, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'JetMoeForCausalLM' object has no attribute 'num_experts'
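The failure mode in the traceback can be shown in isolation: forward reads an attribute that __init__ never assigned. The following is only a minimal sketch of that pattern. StubConfig and BrokenCausalLM are hypothetical stand-ins written for illustration, not code from transformers, and the script needs neither torch nor the real model.

```python
class StubConfig:
    """Hypothetical stand-in for a model config (not the real JetMoeConfig)."""

    def __init__(self, num_local_experts=8, num_experts_per_tok=2):
        self.num_local_experts = num_local_experts
        self.num_experts_per_tok = num_experts_per_tok


class BrokenCausalLM:
    """Hypothetical wrapper that mirrors the reported bug pattern."""

    def __init__(self, config):
        self.config = config
        # Bug pattern: num_experts and num_experts_per_tok are never assigned here.

    def forward(self, output_router_logits=False):
        if output_router_logits:
            # Reading an attribute that __init__ never set raises AttributeError.
            return self.num_experts, self.num_experts_per_tok
        return None


def reproduce():
    """Return the error message raised when router logits are requested."""
    model = BrokenCausalLM(StubConfig())
    try:
        model.forward(output_router_logits=True)
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

Note that the error only appears on the code path that requests router logits, which is why a default forward call does not expose it.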
Analysis
After examining the code (https://github.com/huggingface/transformers/blob/main/src/transformers/models/jetmoe/modeling_jetmoe.py), I found several mistakes:
1. self.num_experts and self.num_experts_per_tok are not defined in the JetMoeForCausalLM class.
2. The output_router_logits argument is not passed to the forward function of self.model in the JetMoeForCausalLM class (see lines 1310 and 1341 of modeling_jetmoe.py).
3. The JetMoeForSequenceClassification class does not calculate aux_loss and does not pass the output_router_logits argument to self.model.forward.
For the JetMoeForCausalLM class:
1. Add self.num_experts = config.num_local_experts and self.num_experts_per_tok = config.num_experts_per_tok to the __init__ function of JetMoeForCausalLM.
2. Pass output_router_logits to self.model.forward (line 1331).
Expected behavior
The solution has been described in the previous section.
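The two changes from the Analysis section can be sketched with stub classes. This is an illustration under stated assumptions, not the actual patch: StubConfig, StubBackbone, FixedCausalLM and load_balancing_loss are hypothetical names, the router logits are made up, and the auxiliary loss is a plain-Python Switch-style load-balancing loss that I am assuming is close in spirit to what aux_loss computes, not a copy of the function in modeling_jetmoe.py.

```python
import math


def load_balancing_loss(router_logits, num_experts, top_k):
    """Switch-style auxiliary loss over a list of per-token logit rows."""
    num_tokens = len(router_logits)
    routed_fraction = [0.0] * num_experts  # share of tokens routed to each expert
    mean_prob = [0.0] * num_experts        # mean router probability per expert
    for row in router_logits:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]  # numerically stable softmax
        total = sum(exps)
        probs = [e / total for e in exps]
        # Indices of the top_k most probable experts for this token.
        chosen = sorted(range(num_experts), key=lambda i: probs[i], reverse=True)[:top_k]
        for i in chosen:
            routed_fraction[i] += 1.0 / num_tokens
        for i, p in enumerate(probs):
            mean_prob[i] += p / num_tokens
    return num_experts * sum(f * p for f, p in zip(routed_fraction, mean_prob))


class StubConfig:
    """Hypothetical config; only the two fields named in the issue."""

    def __init__(self, num_local_experts=4, num_experts_per_tok=2):
        self.num_local_experts = num_local_experts
        self.num_experts_per_tok = num_experts_per_tok


class StubBackbone:
    """Hypothetical inner model: returns router logits only when asked."""

    def __init__(self, config):
        self.num_experts = config.num_local_experts

    def forward(self, token_ids, output_router_logits=False):
        hidden = [float(t) for t in token_ids]
        if not output_router_logits:
            return hidden, None
        # Made-up logits, one row per token; a real router would compute these.
        logits = [
            [float((t + e) % self.num_experts) for e in range(self.num_experts)]
            for t in token_ids
        ]
        return hidden, logits


class FixedCausalLM:
    """Hypothetical wrapper with both fixes applied."""

    def __init__(self, config):
        self.model = StubBackbone(config)
        # Fix 1: define the attributes that forward reads.
        self.num_experts = config.num_local_experts
        self.num_experts_per_tok = config.num_experts_per_tok

    def forward(self, token_ids, output_router_logits=False):
        # Fix 2: forward the flag to the inner model.
        hidden, router_logits = self.model.forward(
            token_ids, output_router_logits=output_router_logits
        )
        aux_loss = None
        if output_router_logits:
            aux_loss = load_balancing_loss(
                router_logits, self.num_experts, self.num_experts_per_tok
            )
        return {"hidden": hidden, "router_logits": router_logits, "aux_loss": aux_loss}


if __name__ == "__main__":
    out = FixedCausalLM(StubConfig()).forward([0, 1, 2, 3], output_router_logits=True)
    print(out["aux_loss"])
```

The same two steps (forward the flag, then compute the auxiliary loss from the returned router logits) would apply to a sequence-classification wrapper, which the issue reports as missing both.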