[BUG]: Colossal AI failed to load ChatGLM2 #5861

Open
hiprince opened this issue Jun 26, 2024 · 2 comments
Labels: bug (Something isn't working)
hiprince commented Jun 26, 2024

Is there an existing issue for this bug?

  • I have searched the existing issues

🐛 Describe the bug

I failed to run the ChatGLM2 model with ColossalAI 0.3.6.

The traceback is below:

KeyError Traceback (most recent call last)
Cell In[4], line 112
110 else:
111 print('Skip launch colossalai')
--> 112 benchmark_inference(
113 model_id,
114 "fp16",
115 max_input_len=max_input_len,
116 max_output_len=max_seq_len,
117 tp_size=tp_size,
118 batch_size=batch_size)
121 recorder.print()

Cell In[4], line 75, in benchmark_inference(model_id, dtype, max_input_len, max_output_len, tp_size, batch_size)
63 model = model.to(torch.bfloat16)
65 inference_config = InferenceConfig(
66 dtype=dtype,
67 max_batch_size=batch_size,
(...)
73 use_cuda_kernel=True,
74 )
---> 75 engine = InferenceEngine(model, tokenizer, inference_config, verbose=False)
77 generation_config = GenerationConfig(
78 pad_token_id=tokenizer.pad_token_id,
79 max_length=max_input_len + max_output_len,
80 # max_new_tokens=args.max_output_len,
81 )
82 tokens=gen_tokens(tokenizer, dataset, dataset_format)

File ~/.local/lib/python3.10/site-packages/colossalai/inference/core/engine.py:75, in InferenceEngine.__init__(self, model_or_path, tokenizer, inference_config, verbose, model_policy)
72 self.verbose = verbose
73 self.logger = get_dist_logger(name)
---> 75 self.init_model(model_or_path, model_policy)
77 self.generation_config = inference_config.to_generation_config(self.model_config)
79 self.tokenizer = tokenizer

File ~/.local/lib/python3.10/site-packages/colossalai/inference/core/engine.py:148, in InferenceEngine.init_model(self, model_or_path, model_policy)
146 else:
147 model_type = "nopadding_" + self.model_config.model_type
--> 148 model_policy = model_policy_map[model_type]
150 pg_mesh = ProcessGroupMesh(self.inference_config.pp_size, self.inference_config.tp_size)
151 tp_group = pg_mesh.get_group_along_axis(TP_AXIS)

KeyError: 'nopadding_chatglm'
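
For context: the lookup at engine.py:148 only succeeds for model types registered in model_policy_map, so the error means no ChatGLM policy is registered. A minimal sketch to check what the installed build supports is below; the import path is an assumption based on where engine.py appears to take the map from, and may differ across versions.

# Sketch: list the model types this ColossalAI install can serve.
# ASSUMPTION: model_policy_map is importable from
# colossalai.inference.modeling.policy; adjust the import if your
# version keeps it elsewhere.
from colossalai.inference.modeling.policy import model_policy_map

print(sorted(model_policy_map.keys()))
# If "nopadding_chatglm" is absent, any model whose config reports
# model_type == "chatglm" hits the KeyError shown above.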

Environment

ColossalAI 0.3.6
PyTorch 2.3.1
CUDA 12.1
NVIDIA driver 545

hiprince added the bug label on Jun 26, 2024
yuehuayingxueluo (Contributor) commented
We have not yet adapted ChatGLM, but we plan to add support for such general-purpose models in the future.
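
(A possible stopgap until then: the traceback shows that InferenceEngine.__init__ also accepts a model_policy argument, and the failing map lookup sits under an else branch in init_model, so it appears to run only when no policy is supplied. Passing a policy explicitly may therefore bypass the KeyError. The class below is hypothetical; a real one would have to implement ChatGLM2's layers for the engine.)

# Hypothetical sketch: hand the engine a policy object directly instead
# of relying on the model_policy_map lookup. MyChatGLM2InferPolicy is
# NOT a real ColossalAI class; it stands in for a user-written policy.
engine = InferenceEngine(
    model,
    tokenizer,
    inference_config,
    verbose=False,
    model_policy=MyChatGLM2InferPolicy(),  # hypothetical custom policy
)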

hiprince (Author) commented Jul 9, 2024

Could I get an update or a roadmap for ChatGLM2/3 support? I mainly need inference rather than pretraining/fine-tuning.
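
(In the meantime, plain Hugging Face transformers can serve ChatGLM2 inference without ColossalAI. The sketch below follows the usage published in the THUDM/chatglm2-6b model card; the model id and the chat API come from that card, not from this thread.)

# Stopgap: single-GPU ChatGLM2 inference via transformers, per the
# THUDM/chatglm2-6b model card. trust_remote_code=True is required
# because the model ships its own modeling code.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).half().cuda()
model = model.eval()

response, history = model.chat(tokenizer, "Hello", history=[])
print(response)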
