[THUDM/ChatGLM-6B] [BUG/Help] After fine-tuning: "You should probably TRAIN this model on a down-stream task"

2024-06-17 599 views
7

After fine-tuning, loading the model and checkpoint produces the following message:

Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at D:\glm\chatglm_webui\chatglm-6b and are newly initialized: ['transformer.prefix_encoder.embedding.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

This is a warning saying that some weights of ChatGLMForConditionalGeneration were not initialized from the provided checkpoint and were instead randomly initialized, and that the model should be trained on a down-stream task before it is used for predictions or inference.
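For context, a minimal sketch of how this warning typically arises (pre_seq_len=128 is an assumed value; use whatever your training run set): the base chatglm-6b checkpoint contains no prefix encoder, so when the model is built with pre_seq_len set, as P-tuning v2 does, transformers creates transformer.prefix_encoder.embedding.weight and initializes it randomly.

```python
from transformers import AutoConfig, AutoModel

# Minimal sketch (pre_seq_len=128 is an assumption): the base checkpoint has no
# prefix encoder, so transformers creates transformer.prefix_encoder.embedding.weight
# and initializes it randomly, which is exactly what the warning reports.
config = AutoConfig.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", config=config, trust_remote_code=True)
# -> Some weights ... are newly initialized: ['transformer.prefix_encoder.embedding.weight']
```

The warning itself is harmless as long as you afterwards load the trained prefix-encoder weights from your fine-tuning checkpoint (see the deployment answer below).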

Environment
- OS: win11
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Answers

8

I ran into the same problem, and in the end it never ran successfully. The error was: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 7.80 GiB total capacity; 7.25 GiB already allocated; 71.44 MiB free; 7.25 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
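If you want to try the mitigation the error message itself suggests, here is a minimal sketch; the value 128 is only an example, and this helps with fragmentation, not with an outright shortage of VRAM:

```python
import os

# Cap the CUDA caching allocator's split size to reduce fragmentation.
# This must be set before the first CUDA allocation, ideally before importing torch.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported only after the environment variable is set
```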

2

For deploying the fine-tuned model, please refer to https://github.com/THUDM/ChatGLM-6B/tree/main/ptuning#%E6%A8%A1%E5%9E%8B%E9%83%A8%E7%BD%B2
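Roughly, that document loads the original chatglm-6b weights first and then loads only the trained PrefixEncoder weights from the P-tuning checkpoint. A sketch following that procedure; CHECKPOINT_PATH and pre_seq_len=128 are placeholders and must match your own training run:

```python
import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

MODEL_PATH = "THUDM/chatglm-6b"  # or a local copy such as the one in the question
CHECKPOINT_PATH = "output/checkpoint-3000"  # placeholder: your own P-tuning output dir

config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True, pre_seq_len=128)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_PATH, config=config, trust_remote_code=True)

# Load only the prefix-encoder weights from the fine-tuned checkpoint -- exactly the
# tensor the warning says was "newly initialized".
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

model = model.half().cuda()
model.transformer.prefix_encoder.float()
model = model.eval()
```

Once the prefix encoder is loaded this way, the "newly initialized" warning about transformer.prefix_encoder.embedding.weight no longer matters for inference.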

1

Has this been resolved?

1

I was originally using a 2080 with 8 GB of VRAM; after switching to a T4 with 16 GB it ran fine. In theory 8 GB should be enough, and I'm not sure why it wasn't.

3

That is caused by insufficient GPU memory.
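If switching to a larger card is not an option, quantized loading usually reduces memory enough for an 8 GB GPU. A sketch using the standard ChatGLM-6B loading API (quantize(8) is the less aggressive alternative; the model identifier can be replaced by a local path such as the one in the question):

```python
from transformers import AutoModel, AutoTokenizer

# Load ChatGLM-6B quantized to INT4 to shrink the GPU memory footprint.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).quantize(4).half().cuda()
model = model.eval()
```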