[THUDM/ChatGLM-6B] ptuning: evaluate.sh fails to run

2024-06-17 553 views

[WARNING|tokenization_auto.py:652] 2023-04-03 18:03:23,678 >> Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
[INFO|tokenization_utils_base.py:1800] 2023-04-03 18:03:23,849 >> loading file ice_text.model
[INFO|tokenization_utils_base.py:1800] 2023-04-03 18:03:23,849 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:1800] 2023-04-03 18:03:23,849 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1800] 2023-04-03 18:03:23,849 >> loading file tokenizer_config.json
[WARNING|auto_factory.py:456] 2023-04-03 18:03:24,851 >> Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
[INFO|modeling_utils.py:2400] 2023-04-03 18:03:24,959 >> loading weights file ./output/adgen-chatglm-6b-pt-8-1e-2/checkpoint-1000/pytorch_model.bin
[INFO|configuration_utils.py:575] 2023-04-03 18:03:35,970 >> Generate config GenerationConfig {
  "_from_model_config": true,
  "bos_token_id": 150004,
  "eos_token_id": 150005,
  "pad_token_id": 20003,
  "transformers_version": "4.27.1"
}

[INFO|modeling_utils.py:3032] 2023-04-03 18:05:00,044 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.

[INFO|modeling_utils.py:3040] 2023-04-03 18:05:00,044 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at ./output/adgen-chatglm-6b-pt-8-1e-2/checkpoint-1000.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training.
[INFO|configuration_utils.py:535] 2023-04-03 18:05:00,056 >> loading configuration file ./output/adgen-chatglm-6b-pt-8-1e-2/checkpoint-1000/generation_config.json
[INFO|configuration_utils.py:575] 2023-04-03 18:05:00,057 >> Generate config GenerationConfig {
  "_from_model_config": true,
  "bos_token_id": 150004,
  "eos_token_id": 150005,
  "pad_token_id": 20003,
  "transformers_version": "4.27.1"
}

Quantized to 4 bit
/home/weiqiang/.local/lib/python3.8/site-packages/dill/_dill.py:1705: PicklingWarning: Cannot locate reference to <class 'google.protobuf.pyext._message.CMessage'>.
  warnings.warn('Cannot locate reference to %r.' % (obj,), PicklingWarning)
/home/weiqiang/.local/lib/python3.8/site-packages/dill/_dill.py:1707: PicklingWarning: Cannot pickle <class 'google.protobuf.pyext._message.CMessage'>: google.protobuf.pyext._message.CMessage has recursive self-references that trigger a RecursionError.
  warnings.warn('Cannot pickle %r: %s.%s has recursive self-references that trigger a RecursionError.' % (obj, obj.__module__, obj_name), PicklingWarning)
04/03/2023 18:05:00 - WARNING - datasets.fingerprint - Parameter 'function'=<function main.<locals>.preprocess_function_eval at 0x7f4483d67ca0> of the transform datasets.arrow_dataset.Dataset._map_single couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.
Traceback (most recent call last):
  File "main.py", line 391, in <module>
    main()
  File "main.py", line 229, in main
    eval_dataset = eval_dataset.map(
  File "/home/weiqiang/.local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 563, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/weiqiang/.local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 528, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/weiqiang/.local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 3004, in map
    for rank, done, content in Dataset._map_single(**dataset_kwargs):
  File "/home/weiqiang/.local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 3397, in _map_single
    writer.write_batch(batch)
  File "/home/weiqiang/.local/lib/python3.8/site-packages/datasets/arrow_writer.py", line 554, in write_batch
    pa_table = pa.Table.from_arrays(arrays, schema=schema)
  File "pyarrow/table.pxi", line 3674, in pyarrow.lib.Table.from_arrays
  File "pyarrow/table.pxi", line 2837, in pyarrow.lib.Table.validate
  File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Column 1 named attention_mask expected length 100 but got length 99
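
The final ArrowInvalid line says that the batch written by `map` had columns of unequal length: attention_mask had 99 rows where 100 were expected. One way to narrow this down might be to check the batch returned by the preprocessing function before it is written. The following is only a debugging sketch. It assumes the batch is a plain dict of lists; `find_length_mismatches` is a hypothetical helper (not part of datasets or the ptuning code), and the column names other than attention_mask are made up for the toy data.

```python
from collections import Counter
from typing import Dict, Sequence


def find_length_mismatches(batch: Dict[str, Sequence]) -> Dict[str, int]:
    """Return the columns whose length differs from the most common column length."""
    lengths = {name: len(values) for name, values in batch.items()}
    if not lengths:
        return {}
    # Treat the most frequent length as the expected one.
    expected, _ = Counter(lengths.values()).most_common(1)[0]
    return {name: n for name, n in lengths.items() if n != expected}


# Toy batch mirroring the error above: one column is one row short.
# "input_ids" and "labels" are assumed names used only for illustration.
batch = {
    "input_ids": [[1, 2]] * 100,
    "attention_mask": [[1, 1]] * 99,
    "labels": [[2, 3]] * 100,
}
print(find_length_mismatches(batch))
```

Calling such a helper on the return value of the preprocessing function would point at the offending column before pyarrow rejects the batch.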

I expect the script to run to completion instead of aborting the way it does now.

1. Windows 10, running under WSL2.
2. Ran ptuning by following the tutorial.
3. Training with train.sh completed normally.
4. Evaluation with evaluate.sh failed.

Environment
- OS: Windows 10, WSL2, Ubuntu 20.04
- Python: 3.8
- Transformers: 4.27.1
- PyTorch: 1.13
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`):

Answers

7

I have been struggling with this for a whole day and still have not solved it!

3

Hello, training on my own dataset works fine, but evaluate.sh keeps failing with ModuleNotFoundError: No module named 'transformers_modules.'. Is there a way to fix this?

4

This is a bug in the older version. You need to replace tokenization_chatglm.py under ./output/adgen-chatglm-6b-pt-8-1e-2/checkpoint-1000 with the latest one from the Hugging Face repo.
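
The replacement described above could be scripted roughly as follows. This is a sketch under assumptions: the latest tokenization_chatglm.py has already been downloaded by hand, and `replace_tokenizer_file` is a hypothetical helper name. It keeps a `.bak` copy of the old file so the change can be undone.

```python
import shutil
from pathlib import Path


def replace_tokenizer_file(new_file: str, checkpoint_dir: str,
                           name: str = "tokenization_chatglm.py") -> Path:
    """Copy new_file over checkpoint_dir/name, backing up any existing file first."""
    target = Path(checkpoint_dir) / name
    if target.exists():
        # Keep the old version next to it as tokenization_chatglm.py.bak.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(new_file, target)
    return target


# Example call (the download location is an assumption):
# replace_tokenizer_file("/path/to/downloaded/tokenization_chatglm.py",
#                        "./output/adgen-chatglm-6b-pt-8-1e-2/checkpoint-1000")
```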

2

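"Hello, training on my own dataset works fine, but evaluate.sh keeps failing with ModuleNotFoundError: No module named 'transformers_modules.'. Is there a way to fix this?"

Change the relative path used when loading the model to an absolute path.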

3

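"This is a bug in the older version. You need to replace tokenization_chatglm.py under ./output/adgen-chatglm-6b-pt-8-1e-2/checkpoint-1000 with the latest one from the Hugging Face repo."

Thank you very much, your suggestion worked! It now runs normally.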

3

Yes, I ran into the same problem. I am working on Windows 10, and I solved it by re-pointing the environment variables.

5

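"Hello, training on my own dataset works fine, but evaluate.sh keeps failing with ModuleNotFoundError: No module named 'transformers_modules.'. Is there a way to fix this?"

"Change the relative path used when loading the model to an absolute path." I still have not solved this problem. Could you describe it in more detail?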

5

"Change the relative path used when loading the model to an absolute path" refers to the --model_name_or_path argument in evaluate.sh.
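
As a small illustration of that suggestion, the absolute form of the checkpoint path can be computed as below and then pasted into evaluate.sh as the value of --model_name_or_path. `to_absolute` is a hypothetical helper name, and the result depends on the directory the script is run from.

```python
import os


def to_absolute(model_path: str) -> str:
    """Expand '~' and resolve a relative path against the current working directory."""
    return os.path.abspath(os.path.expanduser(model_path))


# Run this from the ptuning directory so the relative path resolves correctly.
print(to_absolute("./output/adgen-chatglm-6b-pt-8-1e-2/checkpoint-1000"))
```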

9

Does ./output/adgen-chatglm-6b-pt-8-1e-2/checkpoint-1000 refer to the --model_name_or_path argument in evaluate.sh?

1

Hello, I would like to ask why inference fails with ValueError: Need either a dataset name or a training/validation file.