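Bert-vits2-v2.2 local training and inference bundle (Genshin Impact Yae Miko English model miko)
Bert-vits2-v2.2 was recently released on schedule. The main change in v2.2 is that the emotion model is replaced with the CLAP multimodal model: inference now accepts a text prompt and an audio prompt to guide stylized synthesis, which gives the synthesized voice more emotional character. The release also introduces a new preprocessing webui that makes the workflow more approachable.
For more details, see the Bert-vits2 release page: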
https://github.com/fishaudio/Bert-VITS2/releases/tag/v2.2
Meanwhile, the FastAPI-based inference web UI project has also been adapted to Bert-vits2-v2.2. Its repository is:
https://github.com/jiangyuxiaoxiao/Bert-VITS2-UI
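In this article we use these two projects to clone miko, an English voice model of the Genshin Impact character Yae Miko.
New base model and emotion model for Bert-vits2-v2.2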
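First, clone the official Bert-vits2 project at the v2.2 tag:
git clone -b v2.2 https://github.com/fishaudio/Bert-VITS2.git
Note that this checks out the v2.2 tag: upstream is updated frequently, and the main branch may contain bugs.
Enter the project directory:
cd Bert-VITS2
Install the dependencies:
pip3 install -r requirements.txt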
Then download the new base model and emotion model from:
https://openi.pcl.ac.cn/Stardust_minus/Bert-VITS2/modelmanage/show_model
Place the new emotion model clap-htsat-fused in the project's emotional directory. The structure is as follows:
E:\work\Bert-VITS2-v22\emotional>tree /f
Folder PATH listing for volume myssd
Volume serial number is 7CE3-15AE
E:.
├───clap-htsat-fused
│ .gitattributes
│ config.json
│ merges.txt
│ preprocessor_config.json
│ pytorch_model.bin
│ README.md
│ special_tokens_map.json
│ tokenizer.json
│ tokenizer_config.json
│ vocab.json
│
└───wav2vec2-large-robust-12-ft-emotion-msp-dim
.gitattributes
config.json
LICENSE
preprocessor_config.json
pytorch_model.bin
README.md
vocab.json
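Note that wav2vec2-large-robust-12-ft-emotion-msp-dim is the emotion model from Bert-vits2-v2.1 and must also be kept. For details, see the earlier article on recreating Ma Dugong's voice with Bert-vits2 V210 (Python3.10); they are not repeated here.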
At this point the new models are configured.
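Before moving on, it can help to confirm that both model folders are actually in place. The following is a minimal sketch, not part of Bert-VITS2: the helper name missing_emotion_files is hypothetical, and the file names are simply taken from the directory listing above.

```python
from pathlib import Path

# Folder and file names copied from the directory listing above.
# The helper itself is hypothetical and not part of Bert-VITS2.
REQUIRED = {
    "clap-htsat-fused": [
        "config.json", "preprocessor_config.json", "pytorch_model.bin",
    ],
    "wav2vec2-large-robust-12-ft-emotion-msp-dim": [
        "config.json", "preprocessor_config.json", "pytorch_model.bin",
    ],
}


def missing_emotion_files(emotional_dir):
    """Return the relative paths of expected model files that are absent."""
    root = Path(emotional_dir)
    missing = []
    for model_dir, files in REQUIRED.items():
        for name in files:
            if not (root / model_dir / name).is_file():
                missing.append(f"{model_dir}/{name}")
    return missing


if __name__ == "__main__":
    # Run from the project root; prints nothing when everything is present.
    for path in missing_emotion_files("emotional"):
        print("missing:", path)
```

Running it from the project root prints one line per missing file, which is quicker than comparing tree output by eye.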
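Training a Bert-vits2-v2.2 model
First download the training set. We use the English voice lines of the Genshin Impact character Yae Miko as the example. The dataset is available at: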
https://github.com/AI-Hobbyist/Genshin_Datasets
Then create the miko character directory:
mkdir miko
Name the transcription file esd.list and place it in the miko directory.
Also place the sliced audio clips in the raw directory.
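A malformed transcription line tends to surface only later, during preprocessing, so a quick check of esd.list can save a round trip. The sketch below assumes each line has the form wav_path|speaker|language|text and that the language code is one of ZH, JP, or EN; this layout is an assumption that the article does not spell out, and both function names are hypothetical.

```python
# Assumed line layout (not stated in the article): wav_path|speaker|language|text
LANGUAGES = ("ZH", "JP", "EN")


def parse_esd_line(line):
    """Split one esd.list line into its four assumed fields."""
    parts = line.rstrip("\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    wav, speaker, language, text = parts
    return {"wav": wav, "speaker": speaker, "language": language, "text": text}


def check_esd(lines, speaker="miko", languages=LANGUAGES):
    """Return (line_number, message) pairs for lines that look wrong."""
    problems = []
    for number, line in enumerate(lines, 1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            entry = parse_esd_line(line)
        except ValueError as exc:
            problems.append((number, str(exc)))
            continue
        if entry["speaker"] != speaker:
            problems.append((number, f"unexpected speaker {entry['speaker']!r}"))
        if entry["language"] not in languages:
            problems.append((number, f"unknown language {entry['language']!r}"))
        if not entry["text"].strip():
            problems.append((number, "empty text"))
    return problems


if __name__ == "__main__":
    sample = ["a.wav|miko|EN|Hello there.", "b.wav|miko|EN"]
    for number, message in check_esd(sample):
        print(f"line {number}: {message}")
```

To check a real file, pass the lines of miko/esd.list (opened with encoding="utf-8") instead of the sample list.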
Finally, create the configuration file miko/configs/config.json:
{
"train": {
"log_interval": 50,
"eval_interval": 50,
"seed": 42,
"epochs": 1000,
"learning_rate": 0.0002,
"betas": [
0.8,
0.99
],
"eps": 1e-09,
"batch_size": 6,
"fp16_run": false,
"lr_decay": 0.99995,
"segment_size": 16384,
"init_lr_ratio": 1,
"warmup_epochs": 0,
"c_mel": 45,
"c_kl": 1.0,
"skip_optimizer": false,
"freeze_ZH_bert": false,
"freeze_JP_bert": false,
"freeze_EN_bert": false
},
"data": {
"training_files": "data/miko/train.list",
"validation_files": "data/miko/val.list",
"max_wav_value": 32768.0,
"sampling_rate": 44100,
"filter_length": 2048,
"hop_length": 512,
"win_length": 2048,
"n_mel_channels": 128,
"mel_fmin": 0.0,
"mel_fmax": null,
"add_blank": true,
"n_speakers": 1,
"cleaned_text": true,
"spk2id": {
"miko": 0
}
},
"model": {
"use_spk_conditioned_encoder": true,
"use_noise_scaled_mas": true,
"use_mel_posterior_encoder": false,
"use_duration_discriminator": true,
"inter_channels": 192,
"hidden_channels": 192,
"filter_channels": 768,
"n_heads": 2,
"n_layers": 6,
"kernel_size": 3,
"p_dropout": 0.1,
"resblock": "1",
"resblock_kernel_sizes": [
3,
7,
11
],
"resblock_dilation_sizes": [
[
1,
3,
5
],
[
1,
3,
5
],
[
1,
3,
5
]
],
"upsample_rates": [
8,
8,
2,
2,
2
],
"upsample_initial_channel": 512,
"upsample_kernel_sizes": [
16,
16,
8,
2,
2
],
"n_layers_q": 3,
"use_spectral_norm": false,
"gin_channels": 256
},
"version": "2.2"
}
Note "version": "2.2" here: the version number is the latest, v2.2.
Adjust the other parameters as appropriate for your hardware.
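When adjusting these parameters, a few values in the file depend on each other. The sketch below is hypothetical helper code, not part of Bert-VITS2; it assumes the usual VITS-style convention that the decoder's upsample_rates must multiply to hop_length, and it checks that against the values shown above (8 x 8 x 2 x 2 x 2 = 512).

```python
import json
from math import prod


def check_config(cfg):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    data, model, train = cfg["data"], cfg["model"], cfg["train"]
    if cfg.get("version") != "2.2":
        problems.append("version is not 2.2")
    # Every speaker id must fit inside the speaker embedding table.
    if max(data["spk2id"].values()) >= data["n_speakers"]:
        problems.append("a speaker id in spk2id is >= n_speakers")
    # Assumed VITS convention: one spectrogram frame is upsampled
    # to exactly hop_length audio samples.
    if prod(model["upsample_rates"]) != data["hop_length"]:
        problems.append("product of upsample_rates != hop_length")
    if train["segment_size"] % data["hop_length"] != 0:
        problems.append("segment_size is not a multiple of hop_length")
    return problems


def segment_seconds(cfg):
    """Length in seconds of one training segment."""
    return cfg["train"]["segment_size"] / cfg["data"]["sampling_rate"]


if __name__ == "__main__":
    # Path taken from the article; adjust it to where your config lives.
    with open("miko/configs/config.json", encoding="utf-8") as f:
        cfg = json.load(f)
    print(check_config(cfg) or "config looks consistent")
    print(f"segment length: {segment_seconds(cfg):.3f} s")
```

With the values above, a segment of 16384 samples at 44100 Hz is about 0.37 seconds, or 32 frames of 512 samples.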
Then start the preprocessing page:
python3 webui_preprocess.py
Open http://127.0.0.1:7860/:
Follow the steps on the page; the process is simple and convenient.
When preprocessing is finished, run the training command:
python3 train_ms.py
The trained models are saved in the data/miko/models directory, structured as follows:
E:\work\Bert-VITS2-v22\Data\miko\models>tree /f
Folder PATH listing for volume myssd
Volume serial number is 7CE3-15AE
E:.
│ DUR_0.pth
│ DUR_100.pth
│ DUR_150.pth
│ DUR_50.pth
│ D_0.pth
│ D_100.pth
│ D_150.pth
│ D_50.pth
│ events.out.tfevents.1702457087.ly.13044.0
│ events.out.tfevents.1702458207.ly.12416.0
│ githash
│ G_0.pth
│ G_100.pth
│ G_150.pth
│ G_50.pth
│ train.log
│
└───eval
events.out.tfevents.1702457087.ly.13044.1
events.out.tfevents.1702458207.ly.12416.1
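Note that the listing is sorted as text, so DUR_50.pth appears after DUR_150.pth. When picking the newest checkpoint from a script, compare the step numbers numerically. A minimal sketch follows; the helper name latest_checkpoints is hypothetical, and the only thing it relies on is the G_/D_/DUR_ naming visible above.

```python
import re

# Matches the checkpoint names shown in the listing, e.g. G_150.pth.
CHECKPOINT = re.compile(r"^(G|D|DUR)_(\d+)\.pth$")


def latest_checkpoints(names):
    """Map each prefix (G, D, DUR) to its file with the highest step."""
    latest = {}
    for name in names:
        match = CHECKPOINT.match(name)
        if not match:
            continue  # skip logs, githash, event files, and so on
        prefix, step = match.group(1), int(match.group(2))
        if step > latest.get(prefix, (-1, ""))[0]:
            latest[prefix] = (step, name)
    return {prefix: name for prefix, (step, name) in latest.items()}


if __name__ == "__main__":
    import os

    # Directory taken from the article; adjust it to your own layout.
    print(latest_checkpoints(os.listdir("data/miko/models")))
```

The G_ file is the generator, which is the checkpoint to load for inference.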
That concludes the training stage.
Inference with the Bert-vits2-v2.2 model
For inference we use the page from the Bert-vits2-UI project. Clone the web project:
git clone https://github.com/jiangyuxiaoxiao/Bert-VITS2-UI
Place the Web project in the root directory of Bert-vits2-v2.2. The directory structure is as follows:
E:\work\Bert-VITS2-v22_lilith\Web>tree /f
Folder PATH listing for volume myssd
Volume serial number is 7CE3-15AE
E:.
│ index.html
│
├───assets
│ index-21bc6a28.css
│ index-402c0217.js
│
└───img
helps1.png
helps2.png
Hiyori.ico
It contains the main page, the stylesheet, and the JS file, and is based on Hiyori.
Then start the inference page:
python3 server_fastapi.py
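Open http://127.0.0.1:5000/:
Load the model and run inference.
In addition, inference is available through the FastAPI endpoints: in other words, sending an HTTP request returns the synthesized audio. The API schema is as follows: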
{
"openapi": "3.1.0",
"info": {
"title": "FastAPI",
"version": "0.1.0"
},
"paths": {
"/": {
"get": {
"summary": "Index",
"operationId": "index__get",
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
}
}
}
},
"/voice": {
"post": {
"summary": "Voice",
"description": "語(yǔ)音接口,若需要上傳參考音頻請(qǐng)僅使用post請(qǐng)求",
"operationId": "voice_voice_post",
"parameters": [
{
"name": "model_id",
"in": "query",
"required": true,
"schema": {
"type": "integer",
"description": "模型ID",
"title": "Model Id"
},
"description": "模型ID"
},
{
"name": "speaker_name",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "說話人名",
"title": "Speaker Name"
},
"description": "說話人名"
},
{
"name": "speaker_id",
"in": "query",
"required": false,
"schema": {
"type": "integer",
"description": "說話人id,與speaker_name二選一",
"title": "Speaker Id"
},
"description": "說話人id,與speaker_name二選一"
},
{
"name": "sdp_ratio",
"in": "query",
"required": false,
"schema": {
"type": "number",
"description": "SDP/DP混合比",
"default": 0.2,
"title": "Sdp Ratio"
},
"description": "SDP/DP混合比"
},
{
"name": "noise",
"in": "query",
"required": false,
"schema": {
"type": "number",
"description": "感情",
"default": 0.2,
"title": "Noise"
},
"description": "感情"
},
{
"name": "noisew",
"in": "query",
"required": false,
"schema": {
"type": "number",
"description": "音素長(zhǎng)度",
"default": 0.9,
"title": "Noisew"
},
"description": "音素長(zhǎng)度"
},
{
"name": "length",
"in": "query",
"required": false,
"schema": {
"type": "number",
"description": "語(yǔ)速",
"default": 1,
"title": "Length"
},
"description": "語(yǔ)速"
},
{
"name": "language",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "語(yǔ)言",
"title": "Language"
},
"description": "語(yǔ)言"
},
{
"name": "auto_translate",
"in": "query",
"required": false,
"schema": {
"type": "boolean",
"description": "自動(dòng)翻譯",
"default": false,
"title": "Auto Translate"
},
"description": "自動(dòng)翻譯"
},
{
"name": "auto_split",
"in": "query",
"required": false,
"schema": {
"type": "boolean",
"description": "自動(dòng)切分",
"default": false,
"title": "Auto Split"
},
"description": "自動(dòng)切分"
},
{
"name": "emotion",
"in": "query",
"required": false,
"schema": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
},
{
"type": "null"
}
],
"description": "emo",
"title": "Emotion"
},
"description": "emo"
}
],
"requestBody": {
"required": true,
"content": {
"multipart/form-data": {
"schema": {
"$ref": "#/components/schemas/Body_voice_voice_post"
}
}
}
},
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
},
"get": {
"summary": "Voice",
"description": "語(yǔ)音接口",
"operationId": "voice_voice_get",
"parameters": [
{
"name": "text",
"in": "query",
"required": true,
"schema": {
"type": "string",
"description": "輸入文字",
"title": "Text"
},
"description": "輸入文字"
},
{
"name": "model_id",
"in": "query",
"required": true,
"schema": {
"type": "integer",
"description": "模型ID",
"title": "Model Id"
},
"description": "模型ID"
},
{
"name": "speaker_name",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "說話人名",
"title": "Speaker Name"
},
"description": "說話人名"
},
{
"name": "speaker_id",
"in": "query",
"required": false,
"schema": {
"type": "integer",
"description": "說話人id,與speaker_name二選一",
"title": "Speaker Id"
},
"description": "說話人id,與speaker_name二選一"
},
{
"name": "sdp_ratio",
"in": "query",
"required": false,
"schema": {
"type": "number",
"description": "SDP/DP混合比",
"default": 0.2,
"title": "Sdp Ratio"
},
"description": "SDP/DP混合比"
},
{
"name": "noise",
"in": "query",
"required": false,
"schema": {
"type": "number",
"description": "感情",
"default": 0.2,
"title": "Noise"
},
"description": "感情"
},
{
"name": "noisew",
"in": "query",
"required": false,
"schema": {
"type": "number",
"description": "音素長(zhǎng)度",
"default": 0.9,
"title": "Noisew"
},
"description": "音素長(zhǎng)度"
},
{
"name": "length",
"in": "query",
"required": false,
"schema": {
"type": "number",
"description": "語(yǔ)速",
"default": 1,
"title": "Length"
},
"description": "語(yǔ)速"
},
{
"name": "language",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "語(yǔ)言",
"title": "Language"
},
"description": "語(yǔ)言"
},
{
"name": "auto_translate",
"in": "query",
"required": false,
"schema": {
"type": "boolean",
"description": "自動(dòng)翻譯",
"default": false,
"title": "Auto Translate"
},
"description": "自動(dòng)翻譯"
},
{
"name": "auto_split",
"in": "query",
"required": false,
"schema": {
"type": "boolean",
"description": "自動(dòng)切分",
"default": false,
"title": "Auto Split"
},
"description": "自動(dòng)切分"
},
{
"name": "emotion",
"in": "query",
"required": false,
"schema": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
},
{
"type": "null"
}
],
"description": "emo",
"title": "Emotion"
},
"description": "emo"
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/models/info": {
"get": {
"summary": "Get Loaded Models Info",
"description": "獲取已加載模型信息",
"operationId": "get_loaded_models_info_models_info_get",
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
}
}
}
},
"/models/delete": {
"get": {
"summary": "Delete Model",
"description": "刪除指定模型",
"operationId": "delete_model_models_delete_get",
"parameters": [
{
"name": "model_id",
"in": "query",
"required": true,
"schema": {
"type": "integer",
"description": "刪除模型id",
"title": "Model Id"
},
"description": "刪除模型id"
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/models/add": {
"get": {
"summary": "Add Model",
"description": "添加指定模型:允許重復(fù)添加相同路徑模型,且不重復(fù)占用內(nèi)存",
"operationId": "add_model_models_add_get",
"parameters": [
{
"name": "model_path",
"in": "query",
"required": true,
"schema": {
"type": "string",
"description": "添加模型路徑",
"title": "Model Path"
},
"description": "添加模型路徑"
},
{
"name": "config_path",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "添加模型配置文件路徑,不填則使用./config.json或../config.json",
"title": "Config Path"
},
"description": "添加模型配置文件路徑,不填則使用./config.json或../config.json"
},
{
"name": "device",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "推理使用設(shè)備",
"default": "cuda",
"title": "Device"
},
"description": "推理使用設(shè)備"
},
{
"name": "language",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "模型默認(rèn)語(yǔ)言",
"default": "ZH",
"title": "Language"
},
"description": "模型默認(rèn)語(yǔ)言"
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/models/get_unloaded": {
"get": {
"summary": "Get Unloaded Models Info",
"description": "獲取未加載模型",
"operationId": "get_unloaded_models_info_models_get_unloaded_get",
"parameters": [
{
"name": "root_dir",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "搜索根目錄",
"default": "Data",
"title": "Root Dir"
},
"description": "搜索根目錄"
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/models/get_local": {
"get": {
"summary": "Get Local Models Info",
"description": "獲取全部本地模型",
"operationId": "get_local_models_info_models_get_local_get",
"parameters": [
{
"name": "root_dir",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "搜索根目錄",
"default": "Data",
"title": "Root Dir"
},
"description": "搜索根目錄"
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/status": {
"get": {
"summary": "Get Status",
"description": "獲取電腦運(yùn)行狀態(tài)",
"operationId": "get_status_status_get",
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
}
}
}
},
"/tools/translate": {
"get": {
"summary": "Translate",
"description": "翻譯",
"operationId": "translate_tools_translate_get",
"parameters": [
{
"name": "texts",
"in": "query",
"required": true,
"schema": {
"type": "string",
"description": "待翻譯文本",
"title": "Texts"
},
"description": "待翻譯文本"
},
{
"name": "to_language",
"in": "query",
"required": true,
"schema": {
"type": "string",
"description": "翻譯目標(biāo)語(yǔ)言",
"title": "To Language"
},
"description": "翻譯目標(biāo)語(yǔ)言"
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/tools/random_example": {
"get": {
"summary": "Random Example",
"description": "獲取一個(gè)隨機(jī)音頻+文本,用于對(duì)比,音頻會(huì)從本地目錄隨機(jī)選擇。",
"operationId": "random_example_tools_random_example_get",
"parameters": [
{
"name": "language",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "指定語(yǔ)言,未指定則隨機(jī)返回",
"title": "Language"
},
"description": "指定語(yǔ)言,未指定則隨機(jī)返回"
},
{
"name": "root_dir",
"in": "query",
"required": false,
"schema": {
"type": "string",
"description": "搜索根目錄",
"default": "Data",
"title": "Root Dir"
},
"description": "搜索根目錄"
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
},
"/tools/get_audio": {
"get": {
"summary": "Get Audio",
"operationId": "get_audio_tools_get_audio_get",
"parameters": [
{
"name": "path",
"in": "query",
"required": true,
"schema": {
"type": "string",
"description": "本地音頻路徑",
"title": "Path"
},
"description": "本地音頻路徑"
}
],
"responses": {
"200": {
"description": "Successful Response",
"content": {
"application/json": {
"schema": {}
}
}
},
"422": {
"description": "Validation Error",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HTTPValidationError"
}
}
}
}
}
}
}
},
"components": {
"schemas": {
"Body_voice_voice_post": {
"properties": {
"text": {
"type": "string",
"title": "Text"
},
"reference_audio": {
"type": "string",
"format": "binary",
"title": "Reference Audio"
}
},
"type": "object",
"required": [
"text"
],
"title": "Body_voice_voice_post"
},
"HTTPValidationError": {
"properties": {
"detail": {
"items": {
"$ref": "#/components/schemas/ValidationError"
},
"type": "array",
"title": "Detail"
}
},
"type": "object",
"title": "HTTPValidationError"
},
"ValidationError": {
"properties": {
"loc": {
"items": {
"anyOf": [
{
"type": "string"
},
{
"type": "integer"
}
]
},
"type": "array",
"title": "Location"
},
"msg": {
"type": "string",
"title": "Message"
},
"type": {
"type": "string",
"title": "Error Type"
}
},
"type": "object",
"required": [
"loc",
"msg",
"type"
],
"title": "ValidationError"
}
}
}
}
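As a rough illustration of the schema above, here is a minimal client sketch using only the Python standard library. The parameter names come from the schema, and the base URL is the address used earlier. The function names, the out.wav file name, and the assumption that the response body is the raw audio are illustrative guesses: the schema does not declare the response format, so the sketch simply saves whatever bytes come back.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5000"  # address of server_fastapi.py used above


def voice_url(text, model_id, base_url=BASE_URL, **options):
    """Build a GET /voice URL; options left as None are omitted."""
    params = {"text": text, "model_id": model_id}
    params.update({k: v for k, v in options.items() if v is not None})
    return f"{base_url}/voice?{urlencode(params)}"


def add_model_url(model_path, base_url=BASE_URL, **options):
    """Build a GET /models/add URL for loading a checkpoint."""
    params = {"model_path": model_path}
    params.update({k: v for k, v in options.items() if v is not None})
    return f"{base_url}/models/add?{urlencode(params)}"


def synthesize(text, model_id, out_path="out.wav", **options):
    """Request speech and write the response body to out_path.

    Assumes the body is audio data; the schema does not say so explicitly.
    """
    with urlopen(voice_url(text, model_id, **options)) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


if __name__ == "__main__":
    # Only prints the URLs; call synthesize() once the server is running
    # and a model has been loaded.
    print(add_model_url("Data/miko/models/G_150.pth", language="EN"))
    print(voice_url("Hello there", 0, speaker_name="miko", language="EN"))
```

A model has to be loaded before /voice can use it, either through the web page or through /models/add; the model_id of a loaded model can then be looked up via /models/info.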
Finally, here is the Bert-vits2-v2.2 local training and inference bundle:
https://pan.baidu.com/s/1OVX9seRwZR6bZ-xsE_nRLg?pwd=v3uc
Shared here for everyone to enjoy.