The code is as follows:

import torch
import ChatTTS
from IPython.display import Audio

# Initialize ChatTTS
chat = ChatTTS.Chat()
chat.load_models()

# Define the text to convert to speech
texts = ["你好,欢迎使用ChatTTS!"]

# Generate speech
wavs = chat.infer(texts, use_decoder=True)

# Play the generated audio
Audio(wavs[0], rate=24_000, autoplay=True)
The terminal output is as follows:

E:\workplace\ChatTTS\venvs\Scripts\python.exe E:\workplace\ChatTTS\test.py
INFO:ChatTTS.core:Load from cache: E:\huggingface\hub/models--2Noise--ChatTTS/snapshots\c0aa9139945a4d7bb1c84f07785db576f2bb1bfa
INFO:ChatTTS.core:use cuda:0
INFO:ChatTTS.core:vocos loaded.
INFO:ChatTTS.core:dvae loaded.
INFO:ChatTTS.core:gpt loaded.
INFO:ChatTTS.core:decoder loaded.
INFO:ChatTTS.core:tokenizer loaded.
INFO:ChatTTS.core:All initialized.
INFO:ChatTTS.core:All initialized.
  0%|          | 0/384 [00:00<?, ?it/s]E:\workplace\ChatTTS\venvs\lib\site-packages\transformers\models\llama\modeling_llama.py:649: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(
  4%|▍         | 15/384 [00:03<01:15, 4.91it/s]
  4%|▍         | 85/2048 [00:02<01:06, 29.35it/s]
Process finished with exit code 0
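One likely explanation for the silent run: `IPython.display.Audio(..., autoplay=True)` only plays sound inside a Jupyter/IPython notebook; in a plain script launched with `python.exe` it returns a display object that is never rendered, so the process exits with code 0 without any audio. A minimal sketch of writing the result to a WAV file instead, assuming `wavs[0]` is a float array in [-1, 1] sampled at 24 kHz (the helper name `save_wav` is hypothetical, not part of ChatTTS):

```python
import wave

import numpy as np


def save_wav(samples: np.ndarray, path: str, rate: int = 24_000) -> None:
    """Write a float array in [-1, 1] to `path` as 16-bit mono PCM."""
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)      # mono
        f.setsampwidth(2)      # 16-bit samples
        f.setframerate(rate)   # 24 kHz, matching the inference rate above
        f.writeframes(pcm.tobytes())


# Hypothetical usage with the script's output (squeeze in case of a (1, N) shape):
# save_wav(np.asarray(wavs[0]).squeeze(), "output.wav")
```

The saved file can then be played with any audio player, independent of the notebook environment. The flash-attention `UserWarning` in the log is unrelated to this; it only means PyTorch falls back to a slower attention kernel.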