Add description of fp8 function support in md file #8610

Open
wants to merge 8 commits into develop

Conversation

YZW-explorer

PR types

New features

PR changes

Docs

Description

FP8 PTQ is now supported. This PR adds the corresponding documentation and a usage example for the FP8 PTQ feature.

paddle-bot bot commented Jun 17, 2024

Thanks for your contribution!

codecov bot commented Jun 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 54.42%. Comparing base (65e721e) to head (a3e3862).
Report is 5 commits behind head on develop.

Current head a3e3862 differs from pull request most recent head bbfb24a

Please upload reports for the commit bbfb24a to get more accurate results.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8610      +/-   ##
===========================================
- Coverage    55.80%   54.42%   -1.39%     
===========================================
  Files          620      632      +12     
  Lines        96642    99475    +2833     
===========================================
+ Hits         53928    54135     +207     
- Misses       42714    45340    +2626     


llm/argument.py Outdated
@@ -177,8 +177,8 @@ class ModelArgument:
 @dataclass
 class QuantArgument:
     quant_type: str = field(
-        default="a8w8",
-        metadata={"help": "Quantization type. Supported values: a8w8, weight_only_int4, weight_only_int8"},
+        default="a8w8(int)",
Contributor

Don't use parentheses to distinguish int from float; you could add a new afp8wfp8 value to represent FP8 quantization instead.
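
For illustration, a minimal sketch of what this suggestion could look like in `llm/argument.py`, keeping plain string values and adding a separate `afp8wfp8` value instead of encoding the dtype in parentheses (field layout copied from the diff above; not the code actually merged in this PR):

```python
from dataclasses import dataclass, field


@dataclass
class QuantArgument:
    # "afp8wfp8" marks FP8 PTQ explicitly instead of overloading "a8w8" with "(int)"/"(fp8)" suffixes.
    quant_type: str = field(
        default="a8w8",  # default behavior left unchanged
        metadata={
            "help": "Quantization type. Supported values: a8w8, afp8wfp8, weight_only_int4, weight_only_int8"
        },
    )
```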

llm/quant.py Outdated
-    if quant_args.quant_type == "a8w8":
-        activation = AVGObserver(quant_bits=8)
-        weight = weight_observer(quant_bits=8)
+    if quant_args.quant_type == "a8w8(int)":
Contributor

You could add an enum type in paddlenlp to represent the quantization level.
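
A hedged sketch of such an enum. The member names follow the naming used later in this PR (WINT8AINT8, WFP8AFP8), but the exact string values and where the class would live inside paddlenlp are assumptions:

```python
from enum import Enum


class QuantType(str, Enum):
    # Quantization levels; inheriting from str keeps existing string-based configs working.
    WINT8AINT8 = "wint8aint8"              # INT8 activations + INT8 weights
    WFP8AFP8 = "wfp8afp8"                  # FP8 activations + FP8 weights
    WEIGHT_ONLY_INT4 = "weight_only_int4"  # weight-only INT4, WeightOnly inference
    WEIGHT_ONLY_INT8 = "weight_only_int8"  # weight-only INT8, WeightOnly inference


# llm/quant.py could then dispatch on the enum instead of raw strings such as "a8w8(int)".
quant_type = QuantType("wfp8afp8")  # parses a plain string coming from a JSON config
if quant_type is QuantType.WFP8AFP8:
    print("configure FP8 observers for activations and weights")
```

Because the members subclass str, comparisons such as `quant_type == "wfp8afp8"` keep working for configs that still pass plain strings.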

@@ -214,8 +221,8 @@ class ModelArgument:
 @dataclass
 class QuantArgument:
     quant_type: str = field(
-        default="a8w8",
-        metadata={"help": "Quantization type. Supported values: a8w8, weight_only_int4, weight_only_int8"},
+        default=QuantType.WFP8AFP8,
Contributor

Shouldn't the default still be WINT8AINT8? Don't change the default behavior for now.

Author

Got it.

@@ -77,7 +77,7 @@ python run_finetune.py ./config/llama/awq_argument.json

<summary>&emsp; Quantization arguments (QuantArgument)</summary><div>

-- `quant_type`: PTQ/QAT quantization type, defaults to A8W8. Supported values: A8W8, WINT4, WINT8. A8W8 quantizes activations (inputs) to INT8 and model weights to INT8; WINT4 quantizes only model weights to INT4, with WeightOnly used for subsequent inference; WINT8 quantizes only model weights to INT8, with WeightOnly used for subsequent inference.
+- `quant_type`: PTQ/QAT quantization type, defaults to WINT8AINT8. Supported values: WINT8AINT8, WFP8AFP8, WEIGHT_ONLY_INT4, WEIGHT_ONLY_INT8. WINT8AINT8 quantizes both activations (inputs) and model weights to INT8; WFP8AFP8 quantizes both activations (inputs) and model weights to FP8; WINT4 quantizes only model weights to INT4, with WeightOnly used for subsequent inference; WINT8 quantizes only model weights to INT8, with WeightOnly used for subsequent inference.
Contributor

How about changing these to Wint8Aint8 and Wfp8Afp8?

Author

Got it.
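
To tie the documentation change above to concrete usage, here is a hedged sketch of generating a PTQ config that exercises the new quant_type. Every key besides `quant_type` mirrors the existing llm quantization configs, and the key set, model identifier, and value casing are assumptions rather than content from this PR:

```python
import json
import os

# Hypothetical PTQ config for the FP8 path described in the updated docs.
ptq_config = {
    "model_name_or_path": "facebook/llama-7b",   # placeholder model identifier
    "output_dir": "./checkpoints/llama_ptq_fp8",
    "do_ptq": True,                              # run post-training quantization
    "quant_type": "wfp8afp8",                    # FP8 quantization of activations and weights
}

os.makedirs("./config/llama", exist_ok=True)
with open("./config/llama/ptq_fp8_argument.json", "w") as f:
    json.dump(ptq_config, f, indent=2)
```

The resulting file could then be passed to run_finetune.py in the same way as the ./config/llama/awq_argument.json example referenced in the documentation above.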
