[THUDM/ChatGLM-6B][BUG/Help] <Error when evaluating the model with evaluate.sh>

2024-06-17

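[image] After fine-tuning on the ADGEN data, I wanted to test the model. Every time I run evaluate.sh I get the error shown in the image, and I don't know how to fix it.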

Steps To Reproduce

bash evaluate.sh

Environment
- OS:
- Python: 3.7.5
- Transformers: 4.27.1
- PyTorch: 1.13.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Answers


This is the error I get during validation:

Traceback (most recent call last):
  File "ChatGLM-6B/ptuning/main.py", line 429, in <module>
    main()
  File "ChatGLM-6B/ptuning/main.py", line 122, in main
    model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
  File "/home/fengpan/anaconda3/envs/omni-event/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1605, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PrefixEncoder:
    Unexpected key(s) in state_dict: ".weight", "layernorm.weight", "layernorm.bias", "ion.rotary_emb.inv_freq", "ion.query_key_value.bias", "ion.query_key_value.weight", "ion.query_key_value.weight_scale", "ion.dense.bias", "ion.dense.weight", "ion.dense.weight_scale", "ttention_layernorm.weight", "ttention_layernorm.bias", "nse_h_to_4h.bias", "nse_h_to_4h.weight", "nse_h_to_4h.weight_scale", "nse_4h_to_h.bias", "nse_4h_to_h.weight", "nse_4h_to_h.weight_scale", "_layernorm.weight", "_layernorm.bias", "tion.rotary_emb.inv_freq", "tion.query_key_value.bias", "tion.query_key_value.weight", "tion.query_key_value.weight_scale", "tion.dense.bias", "tion.dense.weight", "tion.dense.weight_scale", "attention_layernorm.weight", "attention_layernorm.bias", "ense_h_to_4h.bias", "ense_h_to_4h.weight", "ense_h_to_4h.weight_scale", "ense_4h_to_h.bias", "ense_4h_to_h.weight", "ense_4h_to_h.weight_scale", ".bias", "".

  • Python: 3.7.5
  • Transformers: 4.27.1
  • PyTorch: 1.12.1

Could someone who knows this code please take a look?
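A note on the odd-looking keys in that error: they look like full-model parameter names with their first 27 characters cut off, and 27 is exactly the length of "transformer.prefix_encoder.". The sketch below reproduces that pattern. It is only an illustration of one plausible cause, not a copy of main.py: the key names in `full_model_keys` are assumed to resemble a ChatGLM-6B state_dict, and `strip_unconditionally` is a hypothetical helper.

```python
# Assumption: the loading code strips this prefix from every key it finds.
PREFIX = "transformer.prefix_encoder."

# Assumed full-model key names, chosen to resemble a ChatGLM-6B state_dict.
full_model_keys = [
    "transformer.word_embeddings.weight",
    "transformer.layers.0.input_layernorm.weight",
    "transformer.layers.0.attention.rotary_emb.inv_freq",
    "transformer.layers.10.attention.rotary_emb.inv_freq",
    "lm_head.weight",
]


def strip_unconditionally(keys):
    """Drop the first len(PREFIX) characters of each key without checking it."""
    return [key[len(PREFIX):] for key in keys]


truncated = strip_unconditionally(full_model_keys)
for original, short in zip(full_model_keys, truncated):
    print(f"{original!r} -> {short!r}")
```

If this reading is right, the checkpoint's pytorch_model.bin holds the whole model rather than only the prefix-encoder weights, and every key that does not start with the prefix is mangled instead of skipped.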


I ran into the same problem. Waiting for a solution.


Same problem here.


me too


Update the tokenization_chatglm.py in your checkpoint directory to the latest version (that is the file named at the end of the error).


Thanks, it works now.


@duzx16 I updated evaluate.sh and tokenization_chatglm.py, but I still get this error:

RuntimeError: Error(s) in loading state_dict for PrefixEncoder:
    Unexpected key(s) in state_dict: ".weight", "layernorm.weight", "layernorm.bias", "ion.rotary_emb.inv_freq", "ion.query_key_value.bias", "ion.query_key_value.weight", "ion.query_key_value.weight_scale", "ion.dense.bias", "ion.dense.weight", "ion.dense.weight_scale", "ttention_layernorm.weight", "ttention_layernorm.bias", "nse_h_to_4h.bias", "nse_h_to_4h.weight", "nse_h_to_4h.weight_scale", "nse_4h_to_h.bias", "nse_4h_to_h.weight", "nse_4h_to_h.weight_scale", "_layernorm.weight", "_layernorm.bias", "tion.rotary_emb.inv_freq", "tion.query_key_value.bias", "tion.query_key_value.weight", "tion.query_key_value.weight_scale", "tion.dense.bias", "tion.dense.weight", "tion.dense.weight_scale", "attention_layernorm.weight", "attention_layernorm.bias", "ense_h_to_4h.bias", "ense_h_to_4h.weight", "ense_h_to_4h.weight_scale", "ense_4h_to_h.bias", "ense_4h_to_h.weight", "ense_4h_to_h.weight_scale", ".bias", "".

My evaluate.sh looks like this:

PRE_SEQ_LEN=128
CHECKPOINT=adgen-chatglm-6b-pt-128-2e-2
STEP=30

CUDA_VISIBLE_DEVICES=0 python3 main.py \
    --do_predict \
    --validation_file dataset/devData.json \
    --test_file dataset/devData.json \
    --overwrite_cache \
    --prompt_column query \
    --response_column output \
    --model_name_or_path model/chatGLM6b/model \
    --ptuning_checkpoint ./output/$CHECKPOINT/checkpoint-$STEP \
    --output_dir ./output/$CHECKPOINT \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 64 \
    --per_device_eval_batch_size 1 \
    --predict_with_generate \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4
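For anyone still seeing the "Unexpected key(s)" error with a script like this, one thing worth trying is to filter the checkpoint so that only prefix-encoder entries reach load_state_dict. The following is a minimal sketch under assumptions, not the repository's confirmed fix: `extract_prefix_encoder_state` is a hypothetical helper, the toy dictionary stands in for the result of loading pytorch_model.bin from the checkpoint directory, and the key "transformer.prefix_encoder.embedding.weight" is an assumed name.

```python
# Assumption: prefix-encoder weights are stored under this key prefix.
PREFIX = "transformer.prefix_encoder."


def extract_prefix_encoder_state(state_dict):
    """Keep only entries under PREFIX and strip PREFIX from their keys."""
    return {
        key[len(PREFIX):]: value
        for key, value in state_dict.items()
        if key.startswith(PREFIX)
    }


# Toy stand-in for a checkpoint that contains the full model (assumed keys).
checkpoint = {
    "transformer.prefix_encoder.embedding.weight": "prefix tensor",
    "transformer.word_embeddings.weight": "base model tensor",
    "lm_head.weight": "base model tensor",
}

filtered = extract_prefix_encoder_state(checkpoint)
print(filtered)
```

With a real checkpoint, the filtered dictionary is what would be passed to model.transformer.prefix_encoder.load_state_dict. If the filtered dictionary comes back empty, the checkpoint probably contains no prefix-encoder weights at all, and the problem lies in how it was saved rather than in how it is loaded.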