Commit

create singer
MaxMax2016 committed May 28, 2023
1 parent 503b32e commit 2ab6ace
Showing 3 changed files with 46 additions and 5 deletions.
28 changes: 24 additions & 4 deletions README.md
@@ -21,6 +21,8 @@

 - [No leakage] Supports multiple speakers

+- [Timbre crafting] Create your own unique speaker
+
 - [With accompaniment] Conversion also works on audio with light accompaniment

 - [With Excel] Tune the raw data by hand, purely manually
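@@ -29,9 +31,9 @@

 This project will go on to finish the BIGVGAN-based model (32K); after that, the project will be updated when there are new results.

-## Models and logs: https://github.com/PlayVoice/so-vits-svc-5.0/releases/tag/v5.3
+## Models and logs: https://github.com/PlayVoice/so-vits-svc-5.0/releases/tag/base_release_hifigan

-- [5.0.epoch1200.full.pth](https://github.com/PlayVoice/so-vits-svc-5.0/releases/download/v5.3/5.0.epoch1200.full.pth): the model contains generator + discriminator = 176M and can be used as a pretrained model
+- [5.0.epoch1200.full.pth](https://github.com/PlayVoice/so-vits-svc-5.0/releases/download/base_release_hifigan/5.0.epoch1200.full.pth): the model contains generator + discriminator = 176M and can be used as a pretrained model
 - The speaker files (56 speakers) are in the configs/singers directory and can be used for inference tests, especially to test for timbre leakage
 - Speakers 22, 30, 47, and 51 are relatively distinctive; audio samples are in the configs/singers_sample directory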

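@@ -42,7 +44,7 @@
 | natural speech | Microsoft || Reduces mispronunciation | - |
 | neural source-filter | NII || Fixes sound breaks (dropouts) | Parameter optimization |
 | speaker encoder | Google || Timbre encoding and clustering | - |
-| GRL for speaker | Ubisoft || Prevents the encoder from leaking timbre (written 泄露) | Works on a principle similar to the adversarial training of a discriminator |
+| GRL for speaker | Ubisoft || Prevents the encoder from leaking timbre (written 泄漏) | Works on a principle similar to the adversarial training of a discriminator |
 | one shot vits | Samsung || One-sentence voice cloning with VITS | - |
 | SCLN | Microsoft || Improves cloning | - |
 | band extension | Adobe || Upsampling from 16K to 48K | Data processing |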
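@@ -60,7 +62,7 @@
 💗Required preprocessing:
 - 1 Denoising and accompaniment removal
 - 2 Frequency extension
-- 3 Audio quality enhancement, based on https://github.com/openvpi/vocoders , to be integrated
+- 3 Audio quality enhancement
 - 4 Cut the audio into segments shorter than 30 seconds, as whisper requires

 Then put the dataset into the dataset_raw directory using the following file structure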
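
Preprocessing step 4 in the hunk above asks for segments shorter than 30 seconds. The following is a minimal sketch of one way to do that, not code from this repository. It assumes the audio is already loaded as a 1-D NumPy array; the function name `split_into_segments` and the 16000 Hz sample rate are hypothetical.

```python
import numpy as np


def split_into_segments(samples, sample_rate, max_seconds=30.0):
    """Split a 1-D array of audio samples into consecutive chunks shorter than max_seconds."""
    # One sample fewer than the limit, so every chunk is strictly shorter than max_seconds.
    max_len = int(sample_rate * max_seconds) - 1
    if max_len <= 0:
        raise ValueError("sample_rate * max_seconds must cover more than one sample")
    return [samples[start:start + max_len] for start in range(0, len(samples), max_len)]


if __name__ == "__main__":
    sr = 16000  # hypothetical sample rate
    audio = np.zeros(sr * 70, dtype=np.float32)  # 70 s of silence as stand-in data
    segments = split_into_segments(audio, sr)
    print([round(len(s) / sr, 2) for s in segments])
```

This cuts at fixed positions, so a cut can fall in the middle of a sung note. Cutting at silences would usually be preferable for a real dataset.
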
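@@ -255,6 +257,24 @@ data_svc/
 | --- | --- | --- | --- | --- | --- | --- | --- |
 | name | Config file | Model file | Timbre file | Audio file | Audio content | Pitch content | Pitch shift |

+## Timbre crafting
+The name is a pure coincidence: average -> ave -> eva; Eve (eva) stands for conception and reproduction
+
+> python svc_eva.py
+```python
+eva_conf = {
+    './configs/singers/singer0022.npy': 0,
+    './configs/singers/singer0030.npy': 0,
+    './configs/singers/singer0047.npy': 0.5,
+    './configs/singers/singer0051.npy': 0.5,
+}
+```
+
+The generated timbre file is: eva.spk.npy
+
+💗Both the Flow and the Decoder take a timbre input; you can even feed the two modules different timbre parameters to craft a more distinctive timbre.
+
 ## Datasets

 | Name | URL |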
20 changes: 20 additions & 0 deletions svc_eva.py
@@ -0,0 +1,20 @@
+import os
+import numpy as np
+
+# average -> ave -> eva :haha
+
+eva_conf = {
+    './configs/singers/singer0022.npy': 0,
+    './configs/singers/singer0030.npy': 0,
+    './configs/singers/singer0047.npy': 0.5,
+    './configs/singers/singer0051.npy': 0.5,
+}
+
+if __name__ == "__main__":
+
+    eva = np.zeros(256)
+    for k, v in eva_conf.items():
+        assert os.path.isfile(k), k
+        spk = np.load(k)
+        eva = eva + spk * v
+    np.save("eva.spk.npy", eva, allow_pickle=False)
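
The script above builds a new timbre as a weighted sum of existing speaker vectors, and the README change notes that the Flow and the Decoder could be given different timbre parameters. The sketch below shows the same idea as a reusable function. It is an illustration rather than part of the commit: `blend_timbres` is a hypothetical helper, the weight normalization is an added assumption (`svc_eva.py` uses the weights as given), and random vectors stand in for the `.npy` files.

```python
import numpy as np


def blend_timbres(embeddings, weights):
    """Return the weighted sum of speaker vectors, with weights normalized to sum to 1."""
    w = np.asarray(weights, dtype=np.float64)
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()                 # assumption: normalize so the blend stays on the same scale
    stacked = np.stack(embeddings)  # shape: (num_speakers, dim)
    return w @ stacked              # shape: (dim,)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for the 256-dimensional vectors stored in configs/singers/*.npy
    spk_a, spk_b = rng.normal(size=256), rng.normal(size=256)
    spk_for_flow = blend_timbres([spk_a, spk_b], [0.5, 0.5])
    spk_for_decoder = blend_timbres([spk_a, spk_b], [0.2, 0.8])
    print(spk_for_flow.shape, spk_for_decoder.shape)
```

How two separate vectors would be passed to the Flow and the Decoder at inference time is not shown in this commit, so the two blends here are only prepared, not used.
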
3 changes: 2 additions & 1 deletion svc_inference.py
@@ -56,8 +56,9 @@ def main(args):
     ppg = torch.FloatTensor(ppg)

     pit = load_csv_pitch(args.pit)
+    print("pitch shift: ", args.shift)
     if (args.shift == 0):
-        print("don't use pitch shift")
+        pass
     else:
         pit = np.array(pit)
         source = pit[pit > 0]
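
The rest of this hunk is collapsed on the page, so the code that applies the shift is not visible here. As a rough sketch only, and not the commit's implementation, a pitch shift is commonly applied by scaling the F0 contour. This assumes the shift is measured in semitones and that a value of 0 marks an unvoiced frame; `shift_pitch` is a hypothetical name.

```python
import numpy as np


def shift_pitch(pit, semitones):
    """Scale an F0 contour in Hz by a number of semitones; zeros (unvoiced frames) stay zero."""
    pit = np.asarray(pit, dtype=np.float64)
    if semitones == 0:
        return pit
    # Assumption: twelve semitones per octave, so each semitone multiplies F0 by 2 ** (1 / 12).
    return pit * 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    f0 = [0.0, 220.0, 440.0, 0.0]
    print(shift_pitch(f0, 12))  # one octave up
```

Because the shift is a multiplication, unvoiced frames stored as 0 need no special handling.
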