[2noise/ChatTTS] Infinite loop occurs while using the zero-shot function

2024-08-19

Issue Description: When I use the zero-shot function to synthesize audio, the program gets stuck in an infinite loop. The code is as follows:

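import logging

import ChatTTS
# Assumption: run from the ChatTTS repository root, where these helper modules live.
from tools.audio import load_audio
from tools.logger import get_logger

logger = get_logger("Test #511", lv=logging.WARN)
chat = ChatTTS.Chat(logger)
chat.load(compile=False, source="huggingface")  # Set compile=True for better performance
texts = [
    "你今天似乎心情不太好,是发生了什么事情吗",  # "You seem to be in a bad mood today; did something happen?"
]

# Zero-shot: sample a speaker from the prompt audio and supply its transcript.
params_infer_code = ChatTTS.Chat.InferCodeParams(
    spk_smp=chat.sample_audio_speaker(load_audio("debug.wav", 24000)),
    txt_smp="就是,我是去唐山那个城市,接我爸的遗体回来的。",  # transcript of debug.wav
)
wavs = chat.infer(
    texts,
    skip_refine_text=False,
    params_infer_code=params_infer_code,
)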

The output is as follows:

/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:462: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:647: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:162: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/xxx/env/ChatTTS/lib/python3.12/site-packages/torch/cuda/__init__.py:128: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
[+0800 20240807 18:07:27] [WARN] Test #511 | gpu | no GPU found, use CPU instead
/xxx/opensource/ChatTTS/ChatTTS/model/tokenizer.py:24: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  tokenizer: BertTokenizerFast = torch.load(
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/residual_fsq.py:170: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with autocast(enabled = False):
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:192: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with quantization_context():
text:   0%|                                                                                                                                                                                                        | 0/384(max) [00:00, ?it/s]We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
text:   7%|█████████████▏                                                                                                                                                                                     | 26/384(max) [00:01, 19.11it/s]
code:   0%|                                                                                                                                                                                                       | 0/2048(max) [00:00, ?it/s][+0800 20240807 17:27:49] [WARN] Test #511 | gpt | unexpected end at index [0]
code:   0%|                                                                                                                                                                                                       | 0/2048(max) [00:00, ?it/s]
[+0800 20240807 17:27:49] [WARN] Test #511 | gpt | regenerate in order to ensure non-empty
code:   0%|                                                                                                                                                                                                       | 0/2048(max) [00:00, ?it/s][+0800 20240807 17:27:49] [WARN] Test #511 | gpt | unexpected end at index [0]
code:   0%|                                                                                                                                                                                                       | 0/2048(max) [00:00, ?it/s]
[+0800 20240807 17:27:49] [WARN] Test #511 | gpt | regenerate in order to ensure non-empty
[... the same pair of warnings, "unexpected end at index [0]" followed by "regenerate in order to ensure non-empty", keeps repeating about twice per second ...]

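The log above keeps repeating indefinitely. The debug.wav I used has a sample rate of 16 kHz. I also tried converting it to 24 kHz with sox, but the same problem occurs. I tried other prompt audio files as well, with the same result. In addition, for convenience I generated the audio on the CPU rather than a GPU; I am not sure whether that makes a difference.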
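Since the sample rate of the prompt audio comes up here, a quick check of what the file actually contains can rule out a mismatch before the audio reaches `sample_audio_speaker`. The following is a minimal sketch using only Python's standard `wave` module. The helper name `describe_wav` is an assumption made for illustration and is not part of ChatTTS, and the `wave` module reads PCM WAV files only.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic header facts for a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        frame_count = wav_file.getnframes()
        return {
            "sample_rate": sample_rate,               # frames per second, e.g. 16000 or 24000
            "channels": wav_file.getnchannels(),      # 1 = mono, 2 = stereo
            "duration_s": frame_count / sample_rate,  # length in seconds
        }
```

Calling `describe_wav("debug.wav")` before and after the sox conversion would show whether the file really changed from 16 kHz to 24 kHz and how long the prompt is.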

Answers

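After reading the code carefully, this looks like the same problem as issue #648: no code can be produced, so gpt keeps regenerating. However, I raised temperature to 0.5, trimmed the audio to 8 seconds, and carefully checked the transcript against the audio, and it still loops forever.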
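The behaviour described here, regenerating whenever a pass ends without output, only terminates if some pass eventually succeeds. As an illustration of how such a loop can be bounded, here is a minimal sketch of a retry limit. It is not the ChatTTS implementation: `generate_once`, `generate_with_retry_limit`, and `EmptyGenerationError` are hypothetical names, and `generate_once` stands in for a single generation pass.

```python
from typing import Callable, List


class EmptyGenerationError(RuntimeError):
    """Raised when every attempt produced an empty result."""


def generate_with_retry_limit(
    generate_once: Callable[[int], List[int]],
    max_attempts: int = 5,
) -> List[int]:
    """Call generate_once(attempt) until it returns a non-empty list.

    generate_once is a hypothetical stand-in for one generation pass;
    it is not a ChatTTS function.
    """
    for attempt in range(1, max_attempts + 1):
        codes = generate_once(attempt)
        if codes:  # non-empty output: stop retrying
            return codes
    # Every attempt came back empty: fail loudly instead of looping forever.
    raise EmptyGenerationError(f"no output after {max_attempts} attempts")
```

With a limit like this, input that always ends at index 0 surfaces as an error after a few attempts rather than as an endless stream of warnings.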

Try the latest dev version.

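> Try the latest dev version.

Thanks for the reply.

Are you suggesting that I check out the dev branch? I switched to it and it still does not work.

Also, a small bug report: in ChatTTS/model/gpt.py on the dev branch, a manual_seed argument is not being passed between lines 651 and 652.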

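> I switched to it and it still does not work.

Perhaps the audio is the problem. You could try the webui instead of hand-written code and experiment with the various parameters. If it still fails, post the audio so we can take a look.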