[2noise/ChatTTS] dev branch: enabling use_flash_attn and use_vllm together raises AttributeError: 'GPT' object has no attribute 'gpt'

2024-08-19
Traceback (most recent call last):
  File "/workspace/ChatTTS/examples/cmd/stream.py", line 189, in <module>
    chat.load(compile=False, use_flash_attn=True, use_vllm=True)
  File "/workspace/ChatTTS/ChatTTS/core.py", line 134, in load
    return self._load(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/ChatTTS/ChatTTS/core.py", line 291, in _load
    gpt.prepare(compile=compile and "cuda" in str(device))
  File "/workspace/ChatTTS/ChatTTS/model/gpt.py", line 189, in prepare
    self.gpt = self.gpt.to(dtype=torch.float16)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1695, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'GPT' object has no attribute 'gpt'

ChatTTS/model/gpt.py (constructor excerpt)

        self.use_flash_attn = use_flash_attn
        self.is_te_llama = False
        self.is_vllm = use_vllm

        if self.is_vllm:
            return

        self.gpt, self.llama_config = self._build_llama(gpt_config, self.device_gpt)

ChatTTS/model/gpt.py (prepare)

    def prepare(self, compile=False):
        if self.use_flash_attn and is_flash_attn_2_available():
            self.gpt = self.gpt.to(dtype=torch.float16)
        if compile and not self.is_te_llama and not self.is_vllm:
            try:
                self.compile(backend="inductor", dynamic=True)
                self.gpt.compile(backend="inductor", dynamic=True)
            except RuntimeError as e:
                self.logger.warning(f"compile failed: {e}. fallback to normal mode.")

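When use_vllm is enabled, the constructor returns before self.gpt is ever assigned, so the error is raised as soon as prepare runs and touches self.gpt.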
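To illustrate the failure mode and one possible guard, here is a minimal sketch. `GPTSketch` is a hypothetical stand-in, not the real ChatTTS class: it only mirrors the early return quoted above and skips any access to `self.gpt` in `prepare` when the vLLM path is active. Whether the project would want this exact guard is an assumption.

```python
class GPTSketch:
    """Hypothetical stand-in for the GPT class in the excerpts above."""

    def __init__(self, use_flash_attn=False, use_vllm=False):
        self.use_flash_attn = use_flash_attn
        self.is_vllm = use_vllm
        if self.is_vllm:
            # Mirrors the quoted early return: self.gpt is never assigned.
            return
        self.gpt = object()  # placeholder for the model built by _build_llama

    def prepare(self):
        # Guard: on the vLLM path there is no self.gpt to cast or compile.
        if self.is_vllm:
            return "skipped"
        if self.use_flash_attn:
            # The real code casts self.gpt to float16 at this point.
            return "cast"
        return "unchanged"
```

With this guard, `GPTSketch(use_flash_attn=True, use_vllm=True).prepare()` returns without touching the missing attribute.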

Answers

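  1. Do not enable both at the same time; vllm and flash_attn are not compatible.
  2. vllm support is not fully adapted yet, which is why it is still on the dev branch.
  3. Do not enable flash_attn at all; it either slows inference down or leaves the speed unchanged.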
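On the caller side, the advice above could be enforced before calling `chat.load`. The helper below is a hypothetical sketch (`resolve_load_flags` is not part of ChatTTS); it only assumes the three keyword arguments that appear in the traceback.

```python
def resolve_load_flags(use_flash_attn: bool, use_vllm: bool) -> dict:
    """Hypothetical helper: build load() keyword arguments that avoid the bad combination."""
    if use_vllm and use_flash_attn:
        # The two options are reported as incompatible; keep vLLM and drop flash_attn.
        use_flash_attn = False
    return {
        "compile": False,
        "use_flash_attn": use_flash_attn,
        "use_vllm": use_vllm,
    }
```

It would be used as `chat.load(**resolve_load_flags(True, True))`, which ends up passing `use_flash_attn=False`.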

OK, understood.