microsoft / DeepSpeed Public

Pull requests: microsoft/DeepSpeed

143 Open 2,682 Closed

Pull requests list

Update the list of supported models in the Chinese README of fastgen

#5773 opened Jul 16, 2024 by beep-bebop • Queued

Use accelerator to replace cuda in setup and runner

#5769 opened Jul 15, 2024 by Andy666G

Add fp8-fused gemm kernel

#5764 opened Jul 11, 2024 by sfc-gh-reyazda

Add DataStates-LLM: Asynchronous Checkpointing Engine Support

#5763 opened Jul 10, 2024 by mauryaavinash95 • Draft

move is_checkpointable call reducing torch.compile Graph breaks

#5759 opened Jul 9, 2024 by NirSonnenschein

Launcher mode with SSH bypass

#5728 opened Jul 5, 2024 by dogacancolak-kensho • Queued

Misplaced global variable warned

#5725 opened Jul 4, 2024 by anferico

Find ROCm on Fedora

#5705 opened Jun 28, 2024 by trixirt

sequence parallel with communication overlap

#5691 opened Jun 21, 2024 by inkcherry

Switch from torch.cuda.amp.custom_fwd to torch.amp.custom_fwd(device=...)

#5684 opened Jun 18, 2024 by loadams • Draft

Switch what versions of python are supported

#5676 opened Jun 17, 2024 by loadams • Draft

Update xpu-max1100.yml with new config and add some tests

#5668 opened Jun 17, 2024 by Liangliang-Ma

Add and Remove ZeRO 3 Hooks

#5658 opened Jun 13, 2024 by jomayeri

Unpin transformers version

#5650 opened Jun 12, 2024 by loadams

reduce all-to-all communication volume when both expert and non-expert are tensor-parallel

#5626 opened Jun 7, 2024 by taozhiwei

Hybrid Offloading for ZeRO3

#5625 opened Jun 7, 2024 by tohtana • Draft

fix: quantization with DeepSpeed HE

#5624 opened Jun 6, 2024 by Atry

Add support for Phi-3 small to FastGen

#5614 opened Jun 4, 2024 by adk9 • Draft

Upgrade HPU image to v1.16.2.

#5610 opened Jun 4, 2024 by vshekhawat-hlab

state_dict_factory: llama checkpoint - support SWIGLU

#5601 opened Jun 2, 2024 by nelyahu

FastGen H100 MoE support: Add PyTorch multi-gemm MOE implementation

#5586 opened May 29, 2024 by HeyangQin

Update profiler.py

#5584 opened May 29, 2024 by gameofdimension

reduce cpu host overhead when using moe

#5578 opened May 29, 2024 by ranzhejiang

Reuse KV cache of prefixes

#5572 opened May 27, 2024 by tohtana • Draft

Add support for Microsoft Phi-3 model to DeepSpeed-FastGen

#5559 opened May 21, 2024 by adk9
