Issues: NVIDIA/Megatron-LM
How to train multiple binary files at the same time, or merge them? #927 (opened Jul 11, 2024 by Liangyz2019)
[QUESTION] How Do NCCL_ALGO and Flash Attention Affect Deterministic Training in Megatron? #925 (opened Jul 11, 2024 by jinzhuer)
[BUG] Getting distributed rank in save_checkpoint when torch.distributed is not initialized. #920 (opened Jul 10, 2024 by haolin-nju)
[ENHANCEMENT] Enable non-gelu activations for BERT LM Head #918 (opened Jul 9, 2024 by skothenhill-nv)
[BUG] Missing init_process_group call when converting model to HF format. #911 (opened Jul 8, 2024 by benoriol)
[QUESTION] Does Megatron-LM support Flash Attention for BERT and T5 Pretraining? #899 (opened Jul 2, 2024 by Leo-T-Zang)
Batch_input and elapsed time per iteration slow down during model training #897 (opened Jun 29, 2024 by Yuhanleeee)
[REGRESSION] MoEs are obtaining higher loss than they should during training #894 (opened Jun 27, 2024 by kiddyboots216)
[QUESTION] Getting tools/preprocess_data.py to work is painful #892 (opened Jun 26, 2024 by sambar1729)