I am training the ChatGLM-6B model with 4-bit quantization, after switching the data source to the BELLE math dataset. Training fails with `RuntimeError: expected scalar type Float but found Half`. The error output and the command used are below.
```
Traceback (most recent call last):
  File "main.py", line 434, in <module>
    main()
  File "main.py", line 373, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/data1/ChatGLM-6B/ptuning/trainer.py", line 1639, in train
    ignore_keys_for_eval=ignore_keys_for_eval,
  File "/data1/ChatGLM-6B/ptuning/trainer.py", line 1904, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/data1/ChatGLM-6B/ptuning/trainer.py", line 2647, in training_step
    loss = self.compute_loss(model, inputs)
  File "/data1/ChatGLM-6B/ptuning/trainer.py", line 2679, in compute_loss
    outputs = model(**inputs)
  File "/root/anaconda3/envs/chatglm/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 1199, in forward
    return_dict=return_dict,
  File "/root/anaconda3/envs/chatglm/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 993, in forward
    output_attentions
  File "/root/anaconda3/envs/chatglm/lib/python3.7/site-packages/torch/utils/checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/root/anaconda3/envs/chatglm/lib/python3.7/site-packages/torch/utils/checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "/root/anaconda3/envs/chatglm/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 634, in forward
    output_attentions=output_attentions
  File "/root/anaconda3/envs/chatglm/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 445, in forward
    mixed_raw_layer = self.query_key_value(hidden_states)
  File "/root/anaconda3/envs/chatglm/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py", line 147, in forward
    output = W8A16Linear.apply(input, self.weight, self.weight_scale, self.weight_bit_width)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py", line 53, in forward
    output = inp.mm(weight.t())
RuntimeError: expected scalar type Float but found Half
```
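For context, the failing call is `inp.mm(weight.t())` in `quantization.py`, which suggests the quantized linear layer's dequantized weight is fp16 (Half) while the incoming activations are fp32 (Float). A minimal sketch of that mismatch, with hypothetical shapes and a generic dtype-alignment workaround (in the actual GPU setup the usual fix is the other direction, i.e. casting the model/activations to half before quantizing):

```python
import torch

# Hypothetical shapes; the dtypes mirror the failing mm() call.
inp = torch.randn(2, 4, dtype=torch.float32)     # fp32 activations
weight = torch.randn(3, 4, dtype=torch.float16)  # fp16 dequantized weight

raised = False
try:
    inp.mm(weight.t())  # mixed-dtype matmul raises a RuntimeError
except RuntimeError:
    raised = True

# Workaround sketch: bring both operands to one dtype before mm().
out = inp.mm(weight.t().to(inp.dtype))
```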
Can anyone help with this?
The training command:
```
LR=2e-2 && CUDA_VISIBLE_DEVICES=0 python3 main.py \
    --do_train \
    --train_file /data1/belle_math/math_train.json \
    --validation_file /data1/belle_math/dev.json \
    --output_dir output/6b-4bit-zpj-demo-pt-$LR \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 1.0 \
    --model_name_or_path /data1/chatglm-6b \
    --prompt_column instruction \
    --response_column output \
    --quantization_bit 4
```
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :