[2noise/ChatTTS] UnboundLocalError: cannot access local variable 'Normalizer' where it is not associated with a value

2024-06-05 307 views
5

Run: conda install -c conda-forge pynini=2.1.5 && pip install WeTextProcessing
Traceback (most recent call last):
  File "/home/usr/Documents/git/ChatTTS/ChatTTS.py", line 9, in <module>
    wavs = chat.infer(texts, use_decoder=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/usr/Documents/git/ChatTTS/ChatTTS/core.py", line 146, in infer
    self.init_normalizer(_lang)
  File "/home/usr/Documents/git/ChatTTS/ChatTTS/core.py", line 192, in init_normalizer
    self.normalizer[lang] = Normalizer().normalize
                            ^^^^^^^^^^
UnboundLocalError: cannot access local variable 'Normalizer' where it is not associated with a value
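The traceback points at the try/except import inside init_normalizer (quoted in full in an answer below): when `from tn.chinese.normalizer import Normalizer` fails, the except branch only logs a warning, and execution falls through to a line that uses the never-bound name. Trimmed to the essentials:

    try:
        from tn.chinese.normalizer import Normalizer  # fails while WeTextProcessing is missing
    except:
        self.logger.log(logging.WARNING, 'Package WeTextProcessing not found!')
    self.normalizer[lang] = Normalizer().normalize    # UnboundLocalError when the import failed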

Answers

9

pip install WeTextProcessing

3

Collecting WeTextProcessing
  Using cached WeTextProcessing-0.1.12-py3-none-any.whl.metadata (5.2 kB)
Collecting pynini==2.1.5 (from WeTextProcessing)
  Using cached pynini-2.1.5.tar.gz (627 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/private/var/folders/5y/jvr8_n054fbbxzsv_yvpv7640000gn/T/pip-install-8qke8yip/pynini_ea11fdd3588444dbbc24ae05e2221b72/setup.py", line 22, in <module>
          from Cython.Build import cythonize
      ModuleNotFoundError: No module named 'Cython'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
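This first failure is only a missing build dependency: the pynini sdist imports Cython.Build in its setup.py. Installing Cython first and retrying gets past metadata generation — though, as the next attempt below shows, the build then fails later for a different reason:

pip install Cython
pip install WeTextProcessing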

9

(.venv) root@This-is-Python ChatTTS % pip install WeTextProcessing
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting WeTextProcessing
  Using cached https://mirrors.aliyun.com/pypi/packages/04/17/3d624a59ef30484c0f4bb703529f128566fab0e8660bbd2e3a36f5244d2d/WeTextProcessing-0.1.12-py3-none-any.whl (690 kB)
Collecting pynini==2.1.5 (from WeTextProcessing)
  Using cached https://mirrors.aliyun.com/pypi/packages/86/63/6a720dbdf4e7358baa2ab206a51693f9c6e83d174179a95539f501bd4a34/pynini-2.1.5.tar.gz (627 kB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: importlib-resources in ./.venv/lib/python3.11/site-packages (from WeTextProcessing) (6.4.0)
Requirement already satisfied: Cython>=0.29 in ./.venv/lib/python3.11/site-packages (from pynini==2.1.5->WeTextProcessing) (3.0.10)
Building wheels for collected packages: pynini
  Building wheel for pynini (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [55 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-13.0-arm64-cpython-311
      creating build/lib.macosx-13.0-arm64-cpython-311/pywrapfst
      copying pywrapfst/__init__.py -> build/lib.macosx-13.0-arm64-cpython-311/pywrapfst
      creating build/lib.macosx-13.0-arm64-cpython-311/pynini
      copying pynini/__init__.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini
      creating build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/chatspeak.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/chatspeak_model.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/weather.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/__init__.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/numbers.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/plurals.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/t9.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/case.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/dates.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/examples/g2p.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      creating build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/edit_transducer.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/utf8.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/__init__.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/features.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/pynutil.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/paradigms.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/tagger.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/rule_cascade.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/rewrite.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/lib/byte.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      creating build/lib.macosx-13.0-arm64-cpython-311/pynini/export
      copying pynini/export/__init__.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/export
      copying pynini/export/export.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/export
      copying pynini/export/multi_grm_example.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/export
      copying pynini/export/multi_grm.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/export
      copying pynini/export/grm.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/export
      copying pynini/export/grm_example.py -> build/lib.macosx-13.0-arm64-cpython-311/pynini/export
      copying pywrapfst/__init__.pyi -> build/lib.macosx-13.0-arm64-cpython-311/pywrapfst
      copying pywrapfst/py.typed -> build/lib.macosx-13.0-arm64-cpython-311/pywrapfst
      copying pynini/__init__.pyi -> build/lib.macosx-13.0-arm64-cpython-311/pynini
      copying pynini/py.typed -> build/lib.macosx-13.0-arm64-cpython-311/pynini
      copying pynini/examples/py.typed -> build/lib.macosx-13.0-arm64-cpython-311/pynini/examples
      copying pynini/lib/py.typed -> build/lib.macosx-13.0-arm64-cpython-311/pynini/lib
      copying pynini/export/py.typed -> build/lib.macosx-13.0-arm64-cpython-311/pynini/export
      running build_ext
      building '_pywrapfst' extension
      creating build/temp.macosx-13.0-arm64-cpython-311
      creating build/temp.macosx-13.0-arm64-cpython-311/extensions
      clang -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk -I/Users/guoge/Work/DEV/server/chattts/ChatTTS/.venv/include -I/opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11 -c extensions/_pywrapfst.cpp -o build/temp.macosx-13.0-arm64-cpython-311/extensions/_pywrapfst.o -std=c++17 -Wno-register -Wno-deprecated-declarations -Wno-unused-function -Wno-unused-local-typedefs -funsigned-char -stdlib=libc++ -mmacosx-version-min=10.7
      extensions/_pywrapfst.cpp:1291:10: fatal error: 'fst/util.h' file not found
      #include <fst/util.h>
               ^~~~~~~~~~~~
      1 error generated.
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pynini
  Running setup.py clean for pynini
Failed to build pynini
ERROR: Could not build wheels for pynini, which is required to install pyproject.toml-based projects

Environment: python@3.11, pip 24.0, macOS. pynini won't install, so WeTextProcessing can't be loaded either.

3

Run it under Miniconda instead — the 'fst/util.h' error means the pynini source build can't find the OpenFst headers, and conda-forge ships a prebuilt pynini, so nothing has to compile.
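For example (a typical flow; the pinned pynini version comes from ChatTTS's own hint):

conda create -n chattts python=3.11
conda activate chattts
conda install -c conda-forge pynini=2.1.5
pip install WeTextProcessing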

2

Traceback (most recent call last):
  File "/home/brandon/Documents/git/ChatTTS/ChatTTS.py", line 11, in <module>
    torchaudio.save("output1.wav", torch.from_numpy(wavs[0]), 24000)
    ^^^^^^^^^^
NameError: name 'torchaudio' is not defined
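The sample script never imports torchaudio (or torch, which the save line also uses); adding the two imports at the top of the script resolves the NameError:

import torch
import torchaudio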


7

After modifying Core.py, it runs — just comment out the text-normalization code. Here is my core.py:

import os
import logging
from functools import partial
from omegaconf import OmegaConf

import torch
from vocos import Vocos
from .model.dvae import DVAE
from .model.gpt import GPT_warpper
from .utils.gpu_utils import select_device
from .utils.infer_utils import count_invalid_characters, detect_language, apply_character_map, apply_half2full_map
from .utils.io_utils import get_latest_modified_file
from .infer.api import refine_text, infer_code

from huggingface_hub import snapshot_download

logging.basicConfig(level = logging.INFO)

class Chat:
    def __init__(self, ):
        self.pretrain_models = {}
        self.normalizer = {}
        self.logger = logging.getLogger(__name__)

    def check_model(self, level = logging.INFO, use_decoder = False):
        not_finish = False
        check_list = ['vocos', 'gpt', 'tokenizer']

        if use_decoder:
            check_list.append('decoder')
        else:
            check_list.append('dvae')

        for module in check_list:
            if module not in self.pretrain_models:
                self.logger.log(logging.WARNING, f'{module} not initialized.')
                not_finish = True

        if not not_finish:
            self.logger.log(level, f'All initialized.')

        return not not_finish

    def load_models(self, source='huggingface', force_redownload=False, local_path='<LOCAL_PATH>', **kwargs):
        if source == 'huggingface':
            hf_home = os.getenv('HF_HOME', os.path.expanduser("~/.cache/huggingface"))
            try:
                download_path = get_latest_modified_file(os.path.join(hf_home, 'hub/models--2Noise--ChatTTS/snapshots'))
            except:
                download_path = None
            if download_path is None or force_redownload:
                self.logger.log(logging.INFO, f'Download from HF: https://huggingface.co/2Noise/ChatTTS')
                download_path = snapshot_download(repo_id="2Noise/ChatTTS", allow_patterns=["*.pt", "*.yaml"])
            else:
                self.logger.log(logging.INFO, f'Load from cache: {download_path}')
        elif source == 'local':
            self.logger.log(logging.INFO, f'Load from local: {local_path}')
            download_path = local_path

        self._load(**{k: os.path.join(download_path, v) for k, v in OmegaConf.load(os.path.join(download_path, 'config', 'path.yaml')).items()}, **kwargs)

    def _load(
        self, 
        vocos_config_path: str = None, 
        vocos_ckpt_path: str = None,
        dvae_config_path: str = None,
        dvae_ckpt_path: str = None,
        gpt_config_path: str = None,
        gpt_ckpt_path: str = None,
        decoder_config_path: str = None,
        decoder_ckpt_path: str = None,
        tokenizer_path: str = None,
        device: str = None,
        compile: bool = True,
    ):
        if not device:
            device = select_device(4096)
            self.logger.log(logging.INFO, f'use {device}')

        if vocos_config_path:
            vocos = Vocos.from_hparams(vocos_config_path).to(device).eval()
            assert vocos_ckpt_path, 'vocos_ckpt_path should not be None'
            vocos.load_state_dict(torch.load(vocos_ckpt_path))
            self.pretrain_models['vocos'] = vocos
            self.logger.log(logging.INFO, 'vocos loaded.')

        if dvae_config_path:
            cfg = OmegaConf.load(dvae_config_path)
            dvae = DVAE(**cfg).to(device).eval()
            assert dvae_ckpt_path, 'dvae_ckpt_path should not be None'
            dvae.load_state_dict(torch.load(dvae_ckpt_path, map_location='cpu'))
            self.pretrain_models['dvae'] = dvae
            self.logger.log(logging.INFO, 'dvae loaded.')

        if gpt_config_path:
            cfg = OmegaConf.load(gpt_config_path)
            gpt = GPT_warpper(**cfg).to(device).eval()
            assert gpt_ckpt_path, 'gpt_ckpt_path should not be None'
            gpt.load_state_dict(torch.load(gpt_ckpt_path, map_location='cpu'))
            if compile and 'cuda' in str(device):
                gpt.gpt.forward = torch.compile(gpt.gpt.forward,  backend='inductor', dynamic=True)
            self.pretrain_models['gpt'] = gpt
            spk_stat_path = os.path.join(os.path.dirname(gpt_ckpt_path), 'spk_stat.pt')
            assert os.path.exists(spk_stat_path), f'Missing spk_stat.pt: {spk_stat_path}'
            self.pretrain_models['spk_stat'] = torch.load(spk_stat_path).to(device)
            self.logger.log(logging.INFO, 'gpt loaded.')

        if decoder_config_path:
            cfg = OmegaConf.load(decoder_config_path)
            decoder = DVAE(**cfg).to(device).eval()
            assert decoder_ckpt_path, 'decoder_ckpt_path should not be None'
            decoder.load_state_dict(torch.load(decoder_ckpt_path, map_location='cpu'))
            self.pretrain_models['decoder'] = decoder
            self.logger.log(logging.INFO, 'decoder loaded.')

        if tokenizer_path:
            tokenizer = torch.load(tokenizer_path, map_location='cpu')
            tokenizer.padding_side = 'left'
            self.pretrain_models['tokenizer'] = tokenizer
            self.logger.log(logging.INFO, 'tokenizer loaded.')

        self.check_model()

    def infer(
        self, 
        text, 
        skip_refine_text=False, 
        refine_text_only=False, 
        params_refine_text={}, 
        params_infer_code={'prompt':'[speed_5]'}, 
        use_decoder=True,
        do_text_normalization=True,
        lang=None,
    ):

        assert self.check_model(use_decoder=use_decoder)

        if not isinstance(text, list): 
            text = [text]

        # if do_text_normalization:
        #     for i, t in enumerate(text):
        #         _lang = detect_language(t) if lang is None else lang
        #         self.init_normalizer(_lang)
        #         text[i] = self.normalizer[_lang](t)
        #         if _lang == 'zh':
        #             text[i] = apply_half2full_map(text[i])

        for i, t in enumerate(text):
            invalid_characters = count_invalid_characters(t)
            if len(invalid_characters):
                self.logger.log(logging.WARNING, f'Invalid characters found! : {invalid_characters}')
                text[i] = apply_character_map(t)

        if not skip_refine_text:
            text_tokens = refine_text(self.pretrain_models, text, **params_refine_text)['ids']
            text_tokens = [i[i < self.pretrain_models['tokenizer'].convert_tokens_to_ids('[break_0]')] for i in text_tokens]
            text = self.pretrain_models['tokenizer'].batch_decode(text_tokens)
            if refine_text_only:
                return text

        text = [params_infer_code.get('prompt', '') + i for i in text]
        params_infer_code.pop('prompt', '')
        result = infer_code(self.pretrain_models, text, **params_infer_code, return_hidden=use_decoder)

        if use_decoder:
            mel_spec = [self.pretrain_models['decoder'](i[None].permute(0,2,1)) for i in result['hiddens']]
        else:
            mel_spec = [self.pretrain_models['dvae'](i[None].permute(0,2,1)) for i in result['ids']]

        wav = [self.pretrain_models['vocos'].decode(i).cpu().numpy() for i in mel_spec]

        return wav

    def sample_random_speaker(self, ):

        dim = self.pretrain_models['gpt'].gpt.layers[0].mlp.gate_proj.in_features
        std, mean = self.pretrain_models['spk_stat'].chunk(2)
        return torch.randn(dim, device=std.device) * std + mean

    # def init_normalizer(self, lang):

    #     if lang not in self.normalizer:
    #         if lang == 'zh':
    #             try:
    #                 from tn.chinese.normalizer import Normalizer
    #             except:
    #                 self.logger.log(logging.WARNING, f'Package WeTextProcessing not found! \
    #                     Run: conda install -c conda-forge pynini=2.1.5 && pip install WeTextProcessing')
    #             self.normalizer[lang] = Normalizer().normalize
    #         else:
    #             try:
    #                 from nemo_text_processing.text_normalization.normalize import Normalizer
    #             except:
    #                 self.logger.log(logging.WARNING, f'Package nemo_text_processing not found! \
    #                     Run: conda install -c conda-forge pynini=2.1.5 && pip install nemo_text_processing')
    #             self.normalizer[lang] = partial(Normalizer(input_case='cased', lang=lang).normalize, verbose=False, punct_post_process=True)

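Incidentally, the crash can be avoided without deleting the feature: just don't fall through when the import fails. A sketch of that idea (my suggestion, not the upstream patch) — on ImportError, bind a no-op normalizer instead:

    def init_normalizer(self, lang):
        if lang in self.normalizer:
            return
        try:
            if lang == 'zh':
                from tn.chinese.normalizer import Normalizer
                self.normalizer[lang] = Normalizer().normalize
            else:
                from nemo_text_processing.text_normalization.normalize import Normalizer
                self.normalizer[lang] = partial(Normalizer(input_case='cased', lang=lang).normalize, verbose=False, punct_post_process=True)
        except ImportError:
            # Import failed: warn and fall back to a no-op, so infer() neither
            # hits an unbound name here nor a KeyError later.
            self.logger.log(logging.WARNING, 'Text normalizer unavailable! Run: conda install -c conda-forge pynini=2.1.5 && pip install WeTextProcessing')
            self.normalizer[lang] = lambda text: text
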
9

I commented out both places, but I still get the same error as before — it didn't help.

5

I copied the project's sample code:

import ChatTTS
from IPython.display import Audio

chat = ChatTTS.Chat()
chat.load_models(compile=False) # set to True for faster inference

texts = ["Enter your text here",]

wavs = chat.infer(texts, use_decoder=True)

torchaudio.save("output1.wav", torch.from_numpy(wavs[0]), 24000)

Is this supposed to run as-is, or does it need changes?

5

If you download the entire official project, it runs normally. Note: commenting out those two places disables text normalization, so numbers and English words in the text can no longer be converted for Chinese speech, and you get lots of [cat] artifacts in the output.

Download the whole project, run it with Python 3.11, install the dependencies into a .venv (not the global environment), then run python webui.py.
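Roughly (the repo URL comes from the issue title; the requirements.txt name is the usual convention — check your checkout):

git clone https://github.com/2noise/ChatTTS.git
cd ChatTTS
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python webui.py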

9

You can also just turn off the do_text_normalization switch in Core.py by changing its default to False:

def infer(
        self,
        text,
        skip_refine_text=False,
        refine_text_only=False,
        params_refine_text={},
        params_infer_code={'prompt':'[speed_5]'},
        use_decoder=True,
        do_text_normalization=False,
        lang=None,
    ):
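
The same switch is exposed as a parameter of infer(), so you can also leave Core.py untouched and pass it per call:

wavs = chat.infer(texts, use_decoder=True, do_text_normalization=False)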