[FEATURE]: [PyTorch] per-channel FP8 quantization #5873

BurkeHulk · 2024-07-01T06:21:53Z

Implement per-channel scaling for FP8 quantization in PyTorch.
Support PyTorch's native FP8 formats, such as `torch.float8_e4m3fn` and `torch.float8_e5m2`.
Refer to:
https://pytorch.org/docs/stable/tensors.html#id7
https://arxiv.org/pdf/2209.05433

BurkeHulk closed this as completed Jul 1, 2024

BurkeHulk changed the title ~~[PyTorch] per-channel FP8 quantization~~ [FEATURE] FP8 quantization in communication Jul 4, 2024

BurkeHulk mentioned this issue Jul 4, 2024

[feature]: support FP8 communication in pipeline parallelism #5885

Merged

4 tasks

BurkeHulk reopened this Jul 4, 2024

BurkeHulk self-assigned this Jul 4, 2024

BurkeHulk changed the title ~~[FEATURE] FP8 quantization in communication~~ [FEATURE]: [PyTorch] per-channel FP8 quantization Jul 4, 2024

ver217 linked a pull request Jul 16, 2024 that will close this issue

[feature]: support FP8 communication in pipeline parallelism #5885

Merged

4 tasks
