fix unimo bug #8653
base: develop
Conversation
Thanks for your contribution!
@@ -313,21 +313,21 @@ def __init__(self, hidden_size, vocab_size, activation, embedding_weights=None):
         self.transform = nn.Linear(hidden_size, hidden_size)
         self.activation = getattr(nn.functional, activation)
         self.layer_norm = nn.LayerNorm(hidden_size)
-        self.decoder_weight = (
+        self.weight = (
The parameter name should not be changed here; parameters are loaded by name when loading a checkpoint, so renaming it will break loading.
The implementation of the tie_weights method requires the parameter to be named weight.
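The naming constraint the reviewers are discussing can be sketched in framework-agnostic Python. This is an illustrative mock, not PaddleNLP's actual implementation: the point is that a typical tie_weights rebinds the output head's `weight` attribute to the input embedding's `weight`, so the attribute must keep that exact name (renaming it to `decoder_weight` would leave the tying logic pointing at nothing).

```python
# Illustrative sketch of name-based weight tying (class and method names
# are hypothetical stand-ins, not PaddleNLP's real classes).

class Embedding:
    def __init__(self, vocab, hidden):
        # the embedding table that both input and output layers will share
        self.weight = [[0.0] * hidden for _ in range(vocab)]

class LMHead:
    def __init__(self, vocab, hidden):
        # output projection; tie_weights below rebinds this attribute
        self.weight = [[1.0] * hidden for _ in range(vocab)]

class Model:
    def __init__(self):
        self.embeddings = Embedding(4, 8)
        self.lm_head = LMHead(4, 8)

    def get_input_embeddings(self):
        return self.embeddings

    def get_output_embeddings(self):
        return self.lm_head

    def tie_weights(self):
        # the tying logic looks up the attribute literally named `weight`;
        # this is why the PR must not rename it
        output = self.get_output_embeddings()
        if output is not None:
            output.weight = self.get_input_embeddings().weight

m = Model()
m.tie_weights()
assert m.lm_head.weight is m.embeddings.weight  # same object after tying
```

Note that tie_weights also depends on get_output_embeddings being implemented, which is why the method added later in this PR is needed.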
     def forward(self, hidden_states: Tensor, masked_positions: Optional[Tensor] = None):
         if masked_positions is not None:
             hidden_states = paddle.reshape(hidden_states, [-1, hidden_states.shape[-1]])
             hidden_states = paddle.tensor.gather(hidden_states, masked_positions)
         hidden_states = self.transform(hidden_states)
         hidden_states = self.activation(hidden_states)
         hidden_states = self.layer_norm(hidden_states)
-        logits = paddle.tensor.matmul(hidden_states, self.decoder_weight, transpose_y=True) + self.decoder_bias
+        logits = paddle.tensor.matmul(hidden_states, self.weight, transpose_y=True)
The bias term is missing here.
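The reviewer's point is that the rewritten line drops `+ self.decoder_bias`, so the logits lose the per-token bias. A pure-Python stand-in for `matmul(hidden, weight, transpose_y=True) + bias` makes the difference concrete (function and variable names here are illustrative, not from the PR):

```python
def lm_logits(hidden, weight, bias):
    """Compute logits[i][v] = sum_k hidden[i][k] * weight[v][k] + bias[v].

    hidden: [n, h] activations; weight: [vocab, h] (multiplied transposed,
    as with transpose_y=True); bias: [vocab]. Dropping `+ bias[v]` is the
    bug the reviewer flags.
    """
    return [
        [sum(h * w for h, w in zip(row, wrow)) + b
         for wrow, b in zip(weight, bias)]
        for row in hidden
    ]

hidden = [[1.0, 2.0]]                  # n=1, h=2
weight = [[1.0, 0.0], [0.0, 1.0]]      # vocab=2, h=2
bias = [0.5, -0.5]
print(lm_logits(hidden, weight, bias))  # → [[1.5, 1.5]]
```

Without the bias the same call would return [[1.0, 2.0]], a different result, so the bias term cannot simply be dropped.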
@@ -349,6 +349,10 @@ def __init__(self, config: UNIMOConfig):
             config.hidden_act,
             self.unimo.embeddings.word_embeddings.weight,
         )
+        self.tie_weights()
+
+    def get_output_embeddings(self):
This method is only one line and returns a class attribute; there is no need to implement it as a separate method.
Calling the tie_weights method requires this method to be implemented; please also refer to how other models implement get_output_embeddings.
PR types
Bug fixes
PR changes
model
Description
Fix a bug in UNIMO.