[THUDM/ChatGLM-6B] How can I run this model on an AMD GPU?


How can I run this model on an AMD GPU?

Environment
- OS: CentOS 7.9
- Python: 3.7
- Transformers: 4.26.1
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Answers


Just use the CPU (a 13th-gen Intel in my case). It won't start replying the moment you hit Enter, but if you don't mind sipping tea while you wait for the response, it's perfectly usable.
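For reference, a minimal CPU-only sketch following the CPU-deployment pattern from the ChatGLM-6B README (weights loaded in float32 via `.float()` instead of half precision); the prompt is just a placeholder:

```python
from transformers import AutoModel, AutoTokenizer

# Load in float32 for CPU inference; the default half-precision weights
# are intended for GPU use.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float()
model = model.eval()

# Single-turn chat; expect this to take a while on CPU.
response, history = model.chat(tokenizer, "hi", history=[])
print(response)
```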


After installing AMD ROCm you can use CUDA as usual: PyTorch's ROCm builds expose the AMD GPU through the regular torch.cuda API.
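As a quick sanity check (a sketch, assuming a ROCm build of PyTorch is already installed), the usual torch.cuda calls should then report the AMD device:

```python
import torch

# On ROCm builds of PyTorch, the HIP backend is reported through torch.cuda.
print(torch.cuda.is_available())   # expected: True
print(torch.version.hip)           # ROCm/HIP version string; None on CUDA builds
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # should name the AMD GPU
```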


Environment

- OS: Ubuntu 22.04
- Python: 3.10
- Transformers: 4.26.1/4.27.1
- PyTorch: 2.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True
- GPU: AMD RX6600
- ROCm: 5.4.3

Error

Welcome to the ChatGLM-6B model. Type your input to chat, enter "clear" to clear the conversation history, "stop" to terminate the program.

User: hi
Fatal Python error: Segmentation fault

Current thread 0x00007fe057f23000 (most recent call first):
  File "/home/aomi/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 1235 in stream_generate
  File "/home/aomi/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35 in generator_context
  File "/home/aomi/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 1163 in stream_chat
  File "/home/aomi/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35 in generator_context
  File "/home/aomi/projects/ChatGLM-6B/cli_demo.py", line 47 in main
  File "/home/aomi/projects/ChatGLM-6B/cli_demo.py", line 62 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, yaml._yaml, sentencepiece._sentencepiece, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, PIL._imaging, PIL._imagingft, google.protobuf.pyext._message (total: 25)
Segmentation fault (core dumped)

Does that mean it simply can't be used?


Overriding to gfx1030 doesn't work either; it still complains that CUDA is required. It feels like a ROCm bug. In any case, I can't get it to work.
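For context, "overriding to gfx1030" presumably refers to the common ROCm workaround for officially unsupported RDNA2 cards such as the RX6600 (gfx1032): setting HSA_OVERRIDE_GFX_VERSION so the runtime uses the gfx1030 kernels. A sketch of how that is typically applied, with the override set before torch first touches the GPU:

```python
import os

# Must be set before the HIP/HSA runtime is initialized (i.e. before the
# first torch.cuda call); 10.3.0 corresponds to gfx1030.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"

import torch

print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```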


Why does it also crash on Linux with a 3090 card?


You need to install ROCm, and then install a PyTorch build with ROCm support.

In my testing, tensors in PyTorch 2.0+ may have a bug; torch-1.13.1+rocm5.2 together with ROCm 5.5 runs fine here on a gfx906 device. Also use

```bash
export HIP_VISIBLE_DEVICES=0
```

to exclude unsupported devices and prevent the segmentation fault.
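Putting that answer together, a sketch of how such a setup might be verified before launching cli_demo.py. The package versions and index URL are assumptions based on the versions named above; ROCm wheels are usually installed with something like `pip install torch==1.13.1+rocm5.2 --extra-index-url https://download.pytorch.org/whl/rocm5.2`.

```python
import os

# Restrict HIP to the first (supported) device, as suggested above; must be
# set before torch initializes the HIP runtime.
os.environ["HIP_VISIBLE_DEVICES"] = "0"

import torch
from transformers import AutoModel, AutoTokenizer

assert torch.cuda.is_available(), "ROCm-enabled PyTorch not detected"
print("torch", torch.__version__, "| HIP", torch.version.hip,
      "|", torch.cuda.get_device_name(0))

# Usual ChatGLM-6B GPU loading pattern: half precision on the GPU.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

response, _ = model.chat(tokenizer, "hi", history=[])
print(response)
```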