Converting gguf fp16 & bf16 to hf is not supported. #31762
I found that the PPL issue is related to Llama3 or llama.cpp. It doesn't happen with TinyLlama. I'll create another issue to discuss if needed.
It's easy to support GGUF FP16. Since BF16 is not supported by NumPy, my current workaround is to convert BF16 to FP32 using PyTorch, but it's not ideal to rely on PyTorch at this step. Reference: main...PenutChen:transformers:main

```python
def load_dequant_gguf_tensor(shape, ggml_type, data):
    if ggml_type == GGML_TYPES["F32"]:
        values = data
    elif ggml_type == GGML_TYPES["F16"]:
        values = data
    elif ggml_type == GGML_TYPES["BF16"]:
        import torch

        # Reinterpret the raw bytes as bfloat16, then upcast to float32.
        data_uint8 = data.view(np.uint8)
        tensor_uint8 = torch.from_numpy(data_uint8)
        values = tensor_uint8.view(torch.bfloat16).float().numpy()
```

Note that BF16 support requires modifying some code in gguf-py. Since the latest version of gguf-py from the llama.cpp repo doesn't work with the current HF integration (#31725), I modified the version from PyPI as follows:

```python
class GGMLQuantizationType(IntEnum):
    F32 = 0
    F16 = 1
    BF16 = 30
    # ...

GGML_QUANT_SIZES = {
    GGMLQuantizationType.F32: (1, 4),
    GGMLQuantizationType.F16: (1, 2),
    GGMLQuantizationType.BF16: (1, 2),
    # ...
}
```
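As a NumPy-only alternative to the PyTorch workaround above (a sketch, not part of the linked branch; the helper name `bf16_to_fp32` is hypothetical): BF16 is exactly the upper 16 bits of an IEEE-754 float32, so the raw payload can be widened with a bit shift and reinterpreted, avoiding the PyTorch dependency at this step.

```python
import numpy as np


def bf16_to_fp32(data: np.ndarray) -> np.ndarray:
    """Convert raw BF16 payload bytes to FP32 without PyTorch.

    BF16 keeps the sign, exponent, and top 7 mantissa bits of float32,
    so placing each 16-bit value in the high half of a uint32 and
    reinterpreting the buffer as float32 recovers the value exactly.
    """
    u16 = data.view(np.uint16)
    u32 = u16.astype(np.uint32) << 16  # pad the low mantissa bits with zeros
    return u32.view(np.float32)
```

This is lossless, since every BF16 value is representable in FP32; only the reverse direction (FP32 to BF16) would round.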
Hey @SunMarc, would you have some bandwidth to take a look at this? :)
Hey @PenutChen, thanks for your research! I think we should just support FP16 first, since supporting BF16 would require a new gguf release, and the transformers gguf integration is not compatible yet. LMK what you think! If you have some time, would you like to open a PR? Otherwise, I will do it!
System Info
Who can help?
@SunMarc
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
Besides quantization, only F32 is implemented. FP16 and BF16 are not yet supported.
fp16 error log:
bf16 error log:
I tried to add F16 to `GGML_TYPES`. I'm not sure if this is correct, but after converting to hf, the PPL is over 1000.
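For reference, a minimal sketch of what the F16 path needs (hedged; `f16_to_fp32` is a hypothetical helper, not transformers code): GGUF F16 payloads are plain IEEE-754 half-precision values, so the bytes only need to be reinterpreted with the right dtype, with no dequantization math. A PPL blow-up after conversion could indicate the buffer being read with the wrong dtype or an unnecessary dequant step being applied.

```python
import numpy as np


# Hypothetical helper: GGUF F16 tensors store standard IEEE-754
# half-precision values, so view() reinterprets the buffer in place
# and astype() then upcasts without changing any value.
def f16_to_fp32(data: np.ndarray) -> np.ndarray:
    return data.view(np.float16).astype(np.float32)
```

Every float16 value is exactly representable as float32, so a round-trip through this helper should reproduce the original tensor bit-for-bit in value terms.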