After fine-tuning on the ADGEN dataset I want to test the model, but running evaluate.sh always fails with this error and I don't know how to fix it.

Steps To Reproduce
bash evaluate.sh

Environment
- OS:
- Python: 3.7.5
- Transformers: 4.27.1
- PyTorch: 1.13.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
This is the error I get during evaluation:
Traceback (most recent call last):
File "ChatGLM-6B/ptuning/main.py", line 429, in
Could someone take a look?
Hit the same problem, waiting for a solution too.
Ran into the same issue.
me too
This is probably because you are using an old checkpoint with the new version of evaluate.sh. See the 04/10 update notes at https://github.com/THUDM/ChatGLM-6B/tree/main/ptuning#%E6%8E%A8%E7%90%86 .
Update tokenization_chatglm.py in your checkpoint directory (that is the file named at the end of the traceback).
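For reference, the copy step can be sketched like this. All paths here are hypothetical (substitute your own checkpoint directory), and the `touch` is only a stand-in so the sketch runs; in practice you would use the updated tokenization_chatglm.py from the model repo.

```shell
# Sketch with hypothetical paths: evaluate.sh loads tokenizer code from the
# checkpoint directory, so the updated tokenization_chatglm.py has to be
# copied there by hand.
CKPT_DIR=output/adgen-chatglm-6b-pt-128-2e-2/checkpoint-30  # your checkpoint
UPDATED=chatglm-6b/tokenization_chatglm.py  # the updated file from the model repo

mkdir -p "$CKPT_DIR" "$(dirname "$UPDATED")"
touch "$UPDATED"   # stand-in for the real updated file, so this sketch runs
cp "$UPDATED" "$CKPT_DIR"/
```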
Thanks, that fixed it.
@duzx16 I updated evaluate.sh and tokenization_chatglm.py, but I still get an error: RuntimeError: Error(s) in loading state_dict for PrefixEncoder: Unexpected key(s) in state_dict: ".weight", "layernorm.weight", "layernorm.bias", "ion.rotary_emb.inv_freq", "ion.query_key_value.bias", "ion.query_key_value.weight", "ion.query_key_value.weight_scale", "ion.dense.bias", "ion.dense.weight", "ion.dense.weight_scale", "ttention_layernorm.weight", "ttention_layernorm.bias", "nse_h_to_4h.bias", "nse_h_to_4h.weight", "nse_h_to_4h.weight_scale", "nse_4h_to_h.bias", "nse_4h_to_h.weight", "nse_4h_to_h.weight_scale", "_layernorm.weight", "_layernorm.bias", "tion.rotary_emb.inv_freq", "tion.query_key_value.bias", "tion.query_key_value.weight", "tion.query_key_value.weight_scale", "tion.dense.bias", "tion.dense.weight", "tion.dense.weight_scale", "attention_layernorm.weight", "attention_layernorm.bias", "ense_h_to_4h.bias", "ense_h_to_4h.weight", "ense_h_to_4h.weight_scale", "ense_4h_to_h.bias", "ense_4h_to_h.weight", "ense_4h_to_h.weight_scale", ".bias", "".
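The unexpected keys in that error look like full-model parameter names with their first 27 characters cut off, which is what happens when the P-tuning loader strips the `transformer.prefix_encoder.` prefix from a checkpoint that actually contains the whole model (i.e., a checkpoint saved in the old format). A minimal sketch of that mechanism — my reconstruction of the loader's key handling, not a verbatim copy of main.py:

```python
# Sketch (assumption): the new main.py removes the "transformer.prefix_encoder."
# prefix from each state_dict key before calling PrefixEncoder.load_state_dict().
PREFIX = "transformer.prefix_encoder."

def strip_prefix(key: str) -> str:
    """Drop the first len(PREFIX) == 27 characters of a state_dict key."""
    return key[len(PREFIX):]

# Applied to a key from a full-model (old-format) checkpoint, the slice mangles
# the name instead of isolating a prefix-encoder parameter:
full_model_key = "transformer.layers.0.attention.rotary_emb.inv_freq"
print(strip_prefix(full_model_key))  # -> "ion.rotary_emb.inv_freq"
```

Note that "ion.rotary_emb.inv_freq" is exactly one of the unexpected keys in the error above, which suggests the checkpoint being loaded is still an old full-model one rather than a prefix-encoder-only one.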
My evaluate.sh looks like this:

PRE_SEQ_LEN=128
CHECKPOINT=adgen-chatglm-6b-pt-128-2e-2
STEP=30

CUDA_VISIBLE_DEVICES=0 python3 main.py \
    --do_predict \
    --validation_file dataset/devData.json \
    --test_file dataset/devData.json \
    --overwrite_cache \
    --prompt_column query \
    --response_column output \
    --model_name_or_path model/chatGLM6b/model \
    --ptuning_checkpoint ./output/$CHECKPOINT/checkpoint-$STEP \
    --output_dir ./output/$CHECKPOINT \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 64 \
    --per_device_eval_batch_size 1 \
    --predict_with_generate \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4