【Audio】05 TTS
【Audio】05 TTS
1 fishspeech
python: 3.9.16 kernel: 5.4.0-42-generic
doc: https://speech.fish.audio/samples/
it requires python 3.10, after i installed python3.10, and install env refer to the document, and start it, it shows nvidia-driver version is too low, i don’t know how to solve it, so i ignore it for now.
2 sambert
2.1 inference directly
python: 3.8.19 kernel: 5.4.0-42-generic
https://www.modelscope.cn/models/iic/speech_sambert-hifigan_tts_zh-cn_16k
1
2
pip install torch==1.11.0 "numpy<2" torchvision==0.12.0 torchaudio==0.11.0 modelscope[framework] matplotlib pytorch_wavelets tensorboardX
pip install kantts -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
1
2
3
4
5
6
7
8
9
10
11
from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
text = '待合成文本'
model_id = 'damo/speech_sambert-hifigan_tts_zh-cn_16k'
sambert_hifigan_tts = pipeline(task=Tasks.text_to_speech, model=model_id)
output = sambert_hifigan_tts(input=text, voice='zhitian_emo')
wav = output[OutputKeys.OUTPUT_WAV]
with open('output.wav', 'wb') as f:
f.write(wav)
This post is licensed under CC BY 4.0 by the author.