[THUDM/ChatGLM-6B][BUG/Help] GPU memory cannot be released after it fills up

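An overly long input triggers an out-of-memory error. After clearing history and calling torch.cuda.empty_cache(), the GPU memory is still occupied, and entering a new question may again fail with an out-of-memory error.

After del-ing and reloading the model, the GPU memory still shows as occupied, but the same question then works normally. Is there any way to recover other than reloading the model?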

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.25 GiB (GPU 0; 23.65 GiB total capacity; 22.79 GiB already allocated; 81.31 MiB free; 22.82 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
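A quick way to read this message: torch.cuda.empty_cache() can only hand back memory that PyTorch has reserved but is not currently using, that is, roughly "reserved" minus "already allocated". The sketch below does that arithmetic on the figures quoted above; the helper name parse_gib is hypothetical and the parsing assumes the message format shown in this report.

```python
import re

# Error text copied from the report above (shortened to the relevant part).
MESSAGE = (
    "CUDA out of memory. Tried to allocate 1.25 GiB (GPU 0; 23.65 GiB total capacity; "
    "22.79 GiB already allocated; 81.31 MiB free; 22.82 GiB reserved in total by PyTorch)"
)


def parse_gib(message: str, label: str) -> float:
    """Return the GiB figure that directly precedes `label` in the message."""
    match = re.search(r"([\d.]+) GiB " + re.escape(label), message)
    if match is None:
        raise ValueError(f"no '{label}' figure found in message")
    return float(match.group(1))


allocated = parse_gib(MESSAGE, "already allocated")
reserved = parse_gib(MESSAGE, "reserved in total")

# Cached-but-unused memory: the most that empty_cache() could return.
slack_mib = (reserved - allocated) * 1024
print(f"allocated: {allocated} GiB, reserved: {reserved} GiB")
print(f"cached but unused: roughly {slack_mib:.0f} MiB")
```

With these numbers the unused cache is only a few tens of MiB, while about 22.79 GiB is held by tensors that are still referenced somewhere. Under that reading, repeated empty_cache() calls would not be expected to help, and the max_split_size_mb hint (aimed at fragmentation, where reserved is much larger than allocated) probably does not apply here either.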

Environment

- OS:Ubuntu 18.04.6 
- Python:3.9
- Transformers:4.26.1
- PyTorch:1.12
- CUDA Support : True

fangmy1993

For reference only: #243

torch.cuda.OutOfMemoryError: CUDA out of memory.

ZhangErling

Try running torch.cuda.empty_cache() a few more times.

cywjava

That doesn't help.

hugefrog

if torch.cuda.is_available():
    torch.cuda.empty_cache()
    torch.cuda.ipc_collect()
st.session_state["state"] = []
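Since the original report traces the error to overly long input, a gentler alternative to resetting the whole history might be to keep only the most recent turns. This is a minimal sketch, not part of ChatGLM-6B: trim_history and max_chars are hypothetical names, the history is assumed to be a list of (question, answer) pairs, and character count is used only as a crude stand-in for token count.

```python
from typing import List, Tuple

History = List[Tuple[str, str]]


def trim_history(history: History, query: str, max_chars: int = 2000) -> History:
    """Keep the most recent turns whose combined length, plus the query, fits max_chars."""
    budget = max_chars - len(query)
    kept: History = []
    # Walk from the newest turn backwards and stop at the first turn that no longer fits.
    for question, answer in reversed(history):
        cost = len(question) + len(answer)
        if cost > budget:
            break
        kept.append((question, answer))
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept


history = [("a" * 10, "b" * 10), ("c" * 5, "d" * 5)]
trimmed = trim_history(history, "q" * 5, max_chars=20)
print(trimmed)  # only the latest turn fits in the remaining budget
```

A suitable budget depends on the GPU and on how the prompt is tokenized, so the default above is a placeholder rather than a recommendation.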

JamePeng

I tried all of the methods above, including this one that I saw on Zhihu:

CUDA_DEVICE = "cuda:0"

def torch_gc():
    """Clear the GPU cache."""
    if torch.cuda.is_available():
        with torch.cuda.device(CUDA_DEVICE):
            torch.cuda.empty_cache()
            torch.cuda.ipc_collect()

import torch

torch.cuda.empty_cache()
torch.cuda.reset_max_memory_allocated()

Close the session

torch.cuda.current_device()
torch.cuda.device(0)
torch.cuda.empty_cache()
torch.cuda.ipc_collect()
torch.cuda.empty_cache()
torch.cuda.memory_allocated()
torch.cuda.memory_reserved()

None of these worked; I have to restart the kernel to free the GPU memory.
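One possible explanation, described in the PyTorch FAQ, is that a Python exception object keeps a reference to the stack frame in which it was raised, so the tensors local to the failed call stay alive for as long as the exception is stored anywhere. The sketch below reproduces that mechanism in plain Python with a hypothetical FakeTensor stand-in, so it needs no GPU; whether this is what actually happens in the cases reported in this thread is an assumption.

```python
import gc
import weakref


class FakeTensor:
    """Hypothetical stand-in for a large CUDA tensor."""


def failing_step(probes):
    big = FakeTensor()                  # local to the frame that raises
    probes.append(weakref.ref(big))     # weak reference: does not keep `big` alive
    raise MemoryError("simulated out-of-memory")


probes = []
saved = None
try:
    failing_step(probes)
except MemoryError as err:
    saved = err  # mimics an interactive session remembering the last exception

gc.collect()
# Still alive: saved -> traceback -> frame of failing_step -> `big`
alive_while_saved = probes[0]() is not None

saved = None  # drop the stored exception, and with it the traceback and frame
gc.collect()
alive_after_clear = probes[0]() is not None

print(alive_while_saved, alive_after_clear)
```

If this is the cause, dropping the stored exception (in an interactive session this may mean sys.last_value and sys.last_traceback) and calling gc.collect() before torch.cuda.empty_cache() might release the memory without restarting the kernel. Note also that torch.cuda.reset_max_memory_allocated() only resets a statistics counter, and torch.cuda.device(0) on its own line just constructs a context manager, so neither of those calls frees anything.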

GoldExperience

Same question here. CPU memory only grows and never shrinks; clearing history, calling torch.cuda.empty_cache(), and similar methods all have no effect.

Kou-Guandong

Has this still not been resolved?

wfwei
