From 14e528c550ec1a9fe6cac57c3daa637a35f2ec53 Mon Sep 17 00:00:00 2001
From: MaxFish <525942103@qq.com>
Date: Tue, 28 Feb 2023 00:08:06 +0800
Subject: [PATCH] student

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 4fdb401..2bb2e37 100755
--- a/README.md
+++ b/README.md
@@ -54,15 +54,15 @@ put 000001-010000.txt to ./data/000001-010000.txt
 ![bert_lose](https://user-images.githubusercontent.com/16432329/220883346-c382bea2-1d2f-4a16-b797-2f9e2d2fb639.png)
 
 ### Model compression based on knowledge distillation
-Student model has 3× speed of teacher model.
+Student model has 53M size and 3× speed of teacher model.
 
 To train:
 
 > python train.py -c configs/bert_vits_student.json -m bert_vits_student
 
-To infer, pretrained student model link:https://drive.google.com/file/d/1hTLWYEKH4GV9mQltrMyr3k2UKUo4chdp/view?usp=sharing
+To infer, get studet model at release page or
 
-Also get studet model at release page.
+Google: :https://drive.google.com/file/d/1hTLWYEKH4GV9mQltrMyr3k2UKUo4chdp/view?usp=sharing
 
 > python vits_infer.py --config ./configs/bert_vits_student.json --model vits_bert_student.pth
 
@@ -98,4 +98,4 @@ TODO ~
 
 [Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers](https://arxiv.org/abs/2211.00585)
 
-[Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation](https://arxiv.org/abs/2210.15868)
\ No newline at end of file
+[Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation](https://arxiv.org/abs/2210.15868)