
Pretraining LLaMA-7B on a single machine with 2 GPUs fails with TypeError: an integer is required (got type NoneType) #112

Open
smallYellowCat opened this issue Nov 29, 2023 · 1 comment


@smallYellowCat

Training on a single machine with 2 GPUs fails with the following error:
Traceback (most recent call last):
  File "/home/d00620160/local/project/TencentPretrain/pretrain.py", line 139, in <module>
    main()
  File "/home/d00620160/local/project/TencentPretrain/pretrain.py", line 135, in main
    trainer.train_and_validate(args)
  File "/home/d00620160/local/project/TencentPretrain/tencentpretrain/trainer.py", line 147, in train_and_validate
    worker(args.local_rank, None, args)
  File "/home/d00620160/local/project/TencentPretrain/tencentpretrain/trainer.py", line 732, in worker
    trainer.train(args, local_rank, global_rank, train_loader, model_for_training, optimizer, scheduler)
  File "/home/d00620160/local/project/TencentPretrain/tencentpretrain/trainer.py", line 193, in train
    batch = list(next(loader_iter))
  File "/home/d00620160/local/project/TencentPretrain/tencentpretrain/utils/dataloader.py", line 187, in __iter__
    yield torch.LongTensor(src),
TypeError: an integer is required (got type NoneType)

The training command is as follows:

CUDA_VISIBLE_DEVICES=6,7 deepspeed pretrain.py --deepspeed --deepspeed_config models/deepspeed_zero3_config.json --enable_zero3 --pretrained_model_path models/llama2-7b.bin --dataset_path llama_support.pt --spm_model_path models/llama/tokenizer.model --config_path models/llama/7b_config.json --output_model_path models/llama_support_7b_dpw.bin --world_size 2 --gpu_ranks 0 1 --data_processor lm --deepspeed_checkpoint_activations --total_steps 300000 --save_checkpoint_steps 5000 --batch_size 1

Does this error mean there is a problem with the data, or with how the model is loaded?

@wmpscc
Contributor

wmpscc commented Dec 1, 2023

During data preprocessing, some changes are needed specifically for the LLaMA model.

  • In tencentpretrain/utils/constants.py, line 4, change special_tokens_map.json to llama_special_tokens_map.json
  • See llama-training for reference
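The reported error is consistent with unrecognized special tokens mapping to `None` token ids, which `torch.LongTensor` then refuses to convert. Below is a minimal pure-Python sketch of that failure mode; the `vocab` dictionary, token strings, and `tokens_to_ids` helper are hypothetical stand-ins for the tokenizer lookup, not TencentPretrain's actual code:

```python
# Hypothetical vocabulary standing in for the LLaMA tokenizer's vocab.
# A special-tokens map written for a different model may reference
# tokens such as [CLS]/[SEP] that do not exist in LLaMA's vocab.
vocab = {"<s>": 1, "</s>": 2, "hello": 100, "world": 101}

def tokens_to_ids(tokens, vocab):
    # dict.get returns None for tokens missing from the vocab,
    # mirroring how a mismatched special-tokens map yields None ids.
    return [vocab.get(t) for t in tokens]

# With the wrong special tokens, the id list contains None values;
# passing such a list to torch.LongTensor raises
# "TypeError: an integer is required (got type NoneType)".
bad = tokens_to_ids(["[CLS]", "hello", "world", "[SEP]"], vocab)
print(bad)   # [None, 100, 101, None]

# With matching special tokens, every id is an int and the
# tensor construction succeeds.
good = tokens_to_ids(["<s>", "hello", "world", "</s>"], vocab)
print(good)  # [1, 100, 101, 2]
```

This is why pointing constants.py at llama_special_tokens_map.json matters: the special tokens inserted during preprocessing must actually exist in the tokenizer's vocabulary.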
