How to run this model on an AMD GPU
Environment
- OS: CentOS 7.9
- Python: 3.7
- Transformers: 4.26.1
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
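Just use the CPU. Mine is a 13th-gen chip; it doesn't reply the moment you press Enter, but waiting about as long as it takes to fetch a cup of tea is acceptable.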
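For the CPU route suggested above, a minimal sketch of loading the model in float32 follows. The model ID `THUDM/chatglm-6b` and the helper name `load_chatglm_on_cpu` are assumptions for illustration; a local checkpoint path can be passed instead. Float32 weights for a model of roughly 6B parameters take on the order of 25 GB, so plenty of RAM is needed.

```python
def load_chatglm_on_cpu(model_path="THUDM/chatglm-6b"):
    """Load ChatGLM-6B in float32 on the CPU (model ID is an assumed default)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    # .float() keeps the weights in float32; no .half() or .cuda() call is made,
    # so the model stays on the CPU.
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True).float()
    return tokenizer, model.eval()


# Usage (downloads the weights on first run, so it is left commented out):
# tokenizer, model = load_chatglm_on_cpu()
# response, history = model.chat(tokenizer, "hi", history=[])
# print(response)
```

Generation speed will depend heavily on the CPU and memory bandwidth.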
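Once AMD ROCm is installed, the CUDA code path can be used (ROCm builds of PyTorch expose the GPU through the same `torch.cuda` API).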
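To confirm which backend an installed PyTorch build actually targets, a small check like the following may help. It is only a sketch; the helper name `describe_torch_backend` is made up for illustration. On ROCm builds `torch.version.hip` holds a version string, while on CUDA or CPU-only builds it is `None`.

```python
def describe_torch_backend():
    """Return a one-line summary of the GPU backend this PyTorch build targets."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    # torch.version.hip is set on ROCm builds; torch.version.cuda on CUDA builds.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)
    if hip:
        backend = f"ROCm/HIP {hip}"
    elif cuda:
        backend = f"CUDA {cuda}"
    else:
        backend = "CPU-only"

    # On a working ROCm install this reports True even though the GPU is from AMD.
    available = torch.cuda.is_available()
    return f"torch {torch.__version__}, build: {backend}, torch.cuda.is_available(): {available}"


print(describe_torch_backend())
```

If the build is reported as CPU-only or CUDA, the ROCm wheel of PyTorch is probably not the one installed.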
Environment
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : true
Error
欢迎使用 ChatGLM-6B 模型,输入内容即可进行对话,clear 清空对话历史,stop 终止程序
用户:hi
Fatal Python error: Segmentation fault
Current thread 0x00007fe057f23000 (most recent call first):
File "/home/aomi/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 1235 in stream_generate
File "/home/aomi/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35 in generator_context
File "/home/aomi/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py", line 1163 in stream_chat
File "/home/aomi/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35 in generator_context
File "/home/aomi/projects/ChatGLM-6B/cli_demo.py", line 47 in main
File "/home/aomi/projects/ChatGLM-6B/cli_demo.py", line 62 in <module>
Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, yaml._yaml, sentencepiece._sentencepiece, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, PIL._imaging, PIL._imagingft, google.protobuf.pyext._message (total: 25)
段错误 (核心已转储)
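Does this mean it can't be used?
Changing it to gfx1030 doesn't work either; it says CUDA is required. It feels like a ROCm bug. Either way, I can't get it to work.
Why does it also crash on Linux with a 3090?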
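You need to install ROCm, along with a PyTorch build that has ROCm support.
In my testing, tensors in PyTorch 2.0+ may be buggy. torch-1.13.1+rocm5.2 with ROCm 5.5 runs correctly here on a gfx906 device, together with
export HIP_VISIBLE_DEVICES=0
which hides unsupported devices and prevents the segmentation fault.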
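The same device restriction can be applied from inside Python, as long as the variable is set before `torch` is imported. The sketch below assumes, as in the comment above, that index 0 is the supported GPU; on another machine the right index may differ (`rocminfo` lists the devices). The helper name `list_visible_gpus` is hypothetical.

```python
import os

# Must be set before torch initializes the HIP runtime, so do it before the import.
# Index 0 is the value used in the comment above; adjust it for your machine.
os.environ["HIP_VISIBLE_DEVICES"] = "0"


def list_visible_gpus():
    """Return the names of the GPUs PyTorch can see, or [] if there are none."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]


print(list_visible_gpus())
```

If the list still contains an unsupported device (for example an integrated GPU), a different index is likely needed.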