Issue Description: When I use the zero-shot feature to synthesize audio, the program gets stuck in an infinite loop. The code is as follows:
import logging

import ChatTTS
from tools.audio import load_audio    # helper module from the ChatTTS repo
from tools.logger import get_logger   # helper module from the ChatTTS repo

logger = get_logger("Test #511", lv=logging.WARN)

chat = ChatTTS.Chat(logger)
chat.load(compile=False, source="huggingface")  # set compile=True for better performance

texts = [
    "你今天似乎心情不太好,是发生了什么事情吗",
]

# Zero-shot: sample the speaker from a 24 kHz prompt clip plus its transcript.
params_infer_code = ChatTTS.Chat.InferCodeParams(
    spk_smp=chat.sample_audio_speaker(load_audio("debug.wav", 24000)),
    txt_smp="就是,我是去唐山那个城市,接我爸的遗体回来的。",
)

wavs = chat.infer(
    texts,
    skip_refine_text=False,
    params_infer_code=params_infer_code,
)
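Since `load_audio("debug.wav", 24000)` suggests the speaker sampler expects 24 kHz input, one thing I considered is a sample-rate mismatch (my source clip is 16 kHz). As a minimal sketch of resampling in-process instead of relying on an external sox conversion — the function below is hypothetical, just naive linear interpolation; a windowed-sinc resampler (soxr, torchaudio) would be higher quality:

```python
import numpy as np

def resample_linear(samples: np.ndarray, sr_in: int, sr_out: int) -> np.ndarray:
    """Naive linear-interpolation resampler (sketch only)."""
    n_out = int(round(len(samples) * sr_out / sr_in))
    x_old = np.linspace(0.0, 1.0, num=len(samples), endpoint=False)
    x_new = np.linspace(0.0, 1.0, num=n_out, endpoint=False)
    return np.interp(x_new, x_old, samples).astype(np.float32)

# A one-second 16 kHz clip becomes 1.5x as many samples at 24 kHz.
clip_16k = np.zeros(16000, dtype=np.float32)
clip_24k = resample_linear(clip_16k, 16000, 24000)
print(len(clip_24k))  # 24000
```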
The output is as follows:
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:462: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:647: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:162: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/xxx/env/ChatTTS/lib/python3.12/site-packages/torch/cuda/__init__.py:128: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
return torch._C._cuda_getDeviceCount() > 0
[+0800 20240807 18:07:27] [WARN] Test #511 | gpu | no GPU found, use CPU instead
/xxx/opensource/ChatTTS/ChatTTS/model/tokenizer.py:24: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
tokenizer: BertTokenizerFast = torch.load(
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/residual_fsq.py:170: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with autocast(enabled = False):
/xxx/env/ChatTTS/lib/python3.12/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:192: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with quantization_context():
text: 0%| | 0/384(max) [00:00, ?it/s]We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
text: 7%|█████████████▏ | 26/384(max) [00:01, 19.11it/s]
code: 0%| | 0/2048(max) [00:00, ?it/s][+0800 20240807 17:27:49] [WARN] Test #511 | gpt | unexpected end at index [0]
code: 0%| | 0/2048(max) [00:00, ?it/s]
[+0800 20240807 17:27:49] [WARN] Test #511 | gpt | regenerate in order to ensure non-empty
code: 0%| | 0/2048(max) [00:00, ?it/s][+0800 20240807 17:27:49] [WARN] Test #511 | gpt | unexpected end at index [0]
code: 0%| | 0/2048(max) [00:00, ?it/s]
[+0800 20240807 17:27:49] [WARN] Test #511 | gpt | regenerate in order to ensure non-empty
...(the two WARN lines above, "unexpected end at index [0]" and "regenerate in order to ensure non-empty", repeat indefinitely)
The log above keeps looping forever. The debug.wav I used is a 16 kHz audio file; I also tried converting it to 24 kHz with sox, but the same problem occurs. I have also tried other prompt audio files, with the same result. Additionally, for convenience I did not use a GPU and generated the audio on the CPU instead — I am not sure whether that matters.
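To rule out a wrong header on my side after the sox conversion, the sample rate of the prompt file can be checked with the standard-library wave module. A small sketch (the file name below is hypothetical; in my case it would be the converted debug.wav):

```python
import wave

def wav_sample_rate(path: str) -> int:
    """Return the sample rate recorded in a PCM WAV header."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate()

# Demonstrate on a generated file: write one second of 24 kHz silence,
# then read the rate back from the header.
with wave.open("check_24k.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)        # 16-bit PCM
    wf.setframerate(24000)
    wf.writeframes(b"\x00\x00" * 24000)

print(wav_sample_rate("check_24k.wav"))  # 24000
```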