[THUDM/ChatGLM-6B] chatglm-6b-int4 loads from the cache even when pointed at a local pretrained copy

2024-05-10

tokenizer = AutoTokenizer.from_pretrained("model/chatglm-6b-int4", trust_remote_code=True)

I have placed the model/chatglm-6b-int4 folder under my working directory, but the model insists on loading from .cache and fails with:

ModuleNotFoundError: No module named 'transformers_modules.model/chatglm-6b-int4'

Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
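
A note on the traceback: the module name it complains about, transformers_modules.model/chatglm-6b-int4, is built from the path passed to from_pretrained, and the embedded path separator breaks the dynamic import in some transformers versions. Two things reported to help are upgrading transformers (see the replies below) and passing a path that contains no separator. A minimal sketch of the latter, assuming the weights really are under model/chatglm-6b-int4:

import os
from transformers import AutoTokenizer

# Assumption: the full snapshot, including the custom *.py files
# loaded via trust_remote_code, sits in ./model/chatglm-6b-int4.
os.chdir("model")  # so the path passed below contains no '/'
tokenizer = AutoTokenizer.from_pretrained("chatglm-6b-int4", trust_remote_code=True)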

Answers

Mine loads from the cache too. By the way, how did you get chatglm-6b-int4 running on the GPU? On my machine it only does CPU inference.
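
For reference, the pattern in the ChatGLM-6B README moves the model onto the GPU explicitly; without .half().cuda() it stays on the CPU. A minimal sketch, assuming a CUDA-enabled PyTorch build:

from transformers import AutoModel

model = AutoModel.from_pretrained("model/chatglm-6b-int4", trust_remote_code=True)
model = model.half().cuda()  # fp16 weights on the GPU
model = model.eval()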

Upgrade transformers to a newer version; that worked for me.
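
For example (a reply further down reports 4.31.0):

pip install --upgrade transformers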

@2019switch Upgrading did get it running. But why does inference feel slower after 4-bit quantization, even though GPU memory usage went down?

@rxy1212 Hi, could I first ask how you downloaded the pretrained weights? huggingface.co doesn't seem to be accessible for me. Also, which files does the int4 model need? Thanks.

@Albert337

import os
from huggingface_hub import login, logout, hf_hub_download

repo_id = 'THUDM/chatglm-6b-int4'
local_dir = './chatglm-6b-int4'

# Read the access token from the environment and log in.
token = os.getenv('HUGGINGFACE_TOKEN')
login(token=token, add_to_git_credential=True)

# Every file the int4 model needs: tokenizer assets, weights, and the
# custom modeling/quantization code loaded via trust_remote_code.
filenames = [
    "ice_text.model",
    "pytorch_model.bin",
    "configuration_chatglm.py",
    "config.json",
    "modeling_chatglm.py",
    "quantization.py",
    "quantization_kernels.c",
    "quantization_kernels_parallel.c",
    "tokenization_chatglm.py",
    "tokenizer_config.json",
]
for filename in filenames:
    hf_hub_download(repo_id=repo_id, filename=filename,
                    local_dir=local_dir, local_dir_use_symlinks=False)

logout()

You can download them with a script; the models are all on Hugging Face: https://huggingface.co/THUDM/chatglm-6b-int4
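
A possibly simpler alternative, sketched here on the assumption that your huggingface_hub version ships snapshot_download, is to fetch the whole repository in one call; files that already finished downloading are skipped on re-runs:

from huggingface_hub import snapshot_download

# Downloads every file in the repo into local_dir.
snapshot_download(repo_id='THUDM/chatglm-6b-int4',
                  local_dir='./chatglm-6b-int4',
                  local_dir_use_symlinks=False)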

First of all, thanks a lot for the answer, but I still get requests.exceptions.ConnectionError, and it still seems to fail even over a proxy.

@Albert337 That's how it goes; just retry a few times (keep re-running the script whenever it errors out). Usually the first two large files finish, and the remaining files need another run.
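
The retrying can also be scripted. A sketch of a hypothetical wrapper, relying on the fact that hf_hub_download skips files that are already complete, so each retry only fetches what is still missing:

import requests
from huggingface_hub import hf_hub_download

def download_with_retry(repo_id, filename, local_dir, attempts=10):
    # Hypothetical helper: keep retrying through flaky connections.
    for attempt in range(attempts):
        try:
            return hf_hub_download(repo_id=repo_id, filename=filename,
                                   local_dir=local_dir, local_dir_use_symlinks=False)
        except requests.exceptions.ConnectionError:
            print(f"connection dropped, retrying ({attempt + 1}/{attempts})")
    raise RuntimeError(f"giving up on {filename} after {attempts} attempts")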

@rxy1212 I really have tried many times, and it's still not solved. Is there a more stable way to download? Or has anyone already downloaded the files and can share a cloud-drive link?
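
One workaround sometimes used when huggingface.co itself is unreachable, under the assumption that a mirror such as hf-mirror.com is reachable from your network and carries this repo, is pointing huggingface_hub at the mirror via the HF_ENDPOINT environment variable:

import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"  # must be set before huggingface_hub is imported

from huggingface_hub import snapshot_download

snapshot_download(repo_id='THUDM/chatglm-6b-int4', local_dir='./chatglm-6b-int4')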

Re: "Upgrade transformers to a newer version; that worked for me."

I tried that and it doesn't work on my side. transformers is already upgraded to 4.31.0. As soon as I use the int8 model it goes off to the .cache directory, yet the fp16 model never had this problem.
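
One thing worth checking in that situation, as a diagnostic sketch (the path and file list below are assumptions based on the int4 repo; the int8 repo's contents may differ): make sure every file, including the custom *.py modules, is actually present in the local folder, since an incomplete local copy is one reason loading can end up involving the cache:

import os

local_dir = "model/chatglm-6b-int8"  # hypothetical local path for the int8 variant
required = [
    "config.json", "tokenizer_config.json", "ice_text.model",
    "tokenization_chatglm.py", "configuration_chatglm.py",
    "modeling_chatglm.py", "quantization.py", "pytorch_model.bin",
]
missing = [f for f in required if not os.path.exists(os.path.join(local_dir, f))]
print("missing files:", missing or "none")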