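import torch
from pathlib import Path

def generate_speaker_tensor(mean: float = 0.0, std: float = 15.247) -> torch.Tensor:
    # Sample all 768 dimensions from a single normal distribution (scalar mean and std).
    return torch.normal(mean, std, size=(768,))

def generate_speaker_tensor_a() -> torch.Tensor:
    # spk_stat.pt is split into two halves by chunk(2): std first, then mean.
    std, mean = torch.load(f'{Path(__file__).resolve().parent}/models/asset/spk_stat.pt').chunk(2)
    rand_spk = torch.randn(768) * std + mean
    return rand_spk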
I use generate_speaker_tensor to generate speaker1 and generate_speaker_tensor_a to generate speaker2, then save each of them to disk. At inference time, I load speaker1 and speaker2 from disk, set rand_spk in params_infer_code to the loaded tensor, and generate multiple utterances:

```
params_infer_code = {
    'spk_emb': rand_spk,  # add sampled speaker
    'temperature': .3,    # using custom temperature
    'top_P': 0.7,         # top P decode
    'top_K': 20,          # top K decode
}
```
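For reference, the save-and-reload step described above can be sketched roughly as follows. This is a minimal sketch, not the exact code I ran: the helper names `save_speaker`, `load_speaker`, and `round_trip_is_exact`, as well as the file name `speaker.pt`, are hypothetical and only serve the illustration. It checks that a tensor written with `torch.save` comes back bit-for-bit identical from `torch.load`.

```python
import os
import tempfile

import torch


def save_speaker(spk: torch.Tensor, path: str) -> None:
    """Persist a sampled speaker embedding so it can be reused later."""
    torch.save(spk, path)


def load_speaker(path: str) -> torch.Tensor:
    """Load a previously saved speaker embedding onto the CPU."""
    return torch.load(path, map_location="cpu")


def round_trip_is_exact(spk: torch.Tensor) -> bool:
    """Save `spk` to a temporary file, reload it, and compare element-wise."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "speaker.pt")  # hypothetical file name
        save_speaker(spk, path)
        reloaded = load_speaker(path)
    return torch.equal(spk, reloaded)


if __name__ == "__main__":
    # Same sampling as generate_speaker_tensor() with its default arguments.
    speaker1 = torch.normal(0.0, 15.247, size=(768,))
    print(round_trip_is_exact(speaker1))
```

If the check passes for both tensors, the embedding passed as `spk_emb` is the same on every run, so any difference between runs would not come from the saving and loading itself.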
Why does speaker2 give a relatively stable timbre, while speaker1 does not and sounds different on every generation?