Running llama2_7b with the Gemini plugin and LowLevelZero. In the Gemini plugin, the policy is set to static and shard_param_frac, offload_optim_frac, and offload_param_frac are all set to 0.0, making Gemini equivalent to ZeRO-2; in LowLevelZero, stage is set to 2. Training with bf16 and comparing the two plugins, we found that GPU memory usage with Gemini is higher than with LowLevelZero. Why is this? In principle, Gemini should save more GPU memory #5830

Open
JJGSBGQ opened this issue Jun 18, 2024 · 2 comments

Comments


JJGSBGQ commented Jun 18, 2024

No description provided.

@Issues-translate-bot

Bot detected that the issue body's language is not English and translated it automatically.

Title: Running llama2_7b with the Gemini plugin and LowLevelZero. In the Gemini plugin, the policy is set to static and shard_param_frac, offload_optim_frac, and offload_param_frac are all set to 0.0, making Gemini equivalent to ZeRO-2; in LowLevelZero, stage is set to 2. Training with bf16 and comparing the two plugins, we found that GPU memory usage with Gemini is higher than with LowLevelZero. Why is this? In principle, Gemini should save more GPU memory

@JJGSBGQ changed the title from the original Chinese wording to its English translation on Jun 18, 2024

JJGSBGQ commented Jun 18, 2024

When running stable-diffusion in the same way, I find that Gemini has lower GPU memory usage than LowLevelZero.
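
For anyone trying to reproduce the comparison, the two configurations described in the issue title can be sketched roughly as below. This is a minimal sketch, not the author's actual script: the issue only says "policy" and "bf16", so the keyword names `placement_policy` and `precision` are assumptions about how those settings map onto the ColossalAI plugin constructors, and `build_plugin` is a hypothetical helper. Constructing the plugins also assumes ColossalAI is installed and the distributed environment has already been launched.

```python
# Settings taken from the issue title. The key names "placement_policy" and
# "precision" are assumptions; the issue only says "policy" and "bf16".
GEMINI_KWARGS = {
    "placement_policy": "static",
    "shard_param_frac": 0.0,    # 0.0 as stated in the issue
    "offload_optim_frac": 0.0,  # 0.0 as stated in the issue
    "offload_param_frac": 0.0,  # 0.0 as stated in the issue
    "precision": "bf16",
}

LOW_LEVEL_ZERO_KWARGS = {
    "stage": 2,  # stage 2, as stated in the issue
    "precision": "bf16",
}


def build_plugin(name):
    """Hypothetical helper: build one of the two plugins being compared."""
    if name not in ("gemini", "low_level_zero"):
        raise ValueError(f"unknown plugin name: {name!r}")
    # Imported lazily so the settings above can be inspected without ColossalAI.
    from colossalai.booster.plugin import GeminiPlugin, LowLevelZeroPlugin

    if name == "gemini":
        return GeminiPlugin(**GEMINI_KWARGS)
    return LowLevelZeroPlugin(**LOW_LEVEL_ZERO_KWARGS)
```

Keeping the two sets of keyword arguments side by side makes it easier to confirm that only the plugin differs between runs when comparing peak GPU memory.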
