SpeechColab ASR leaderboard


1. Overview

"If you can’t measure it, you can’t improve it." -- Peter Drucker

The SpeechIO leaderboard serves as an ASR benchmarking platform built from three components:

  1. TestSet Zoo: A collection of test sets covering a wide range of speech recognition tasks and scenarios.

  2. Model Zoo: A collection of models, including commercial APIs and open-source models.

  3. Benchmarking Pipeline: A simple, well-specified pipeline that handles data preparation, recognition, post-processing, and error rate evaluation.

This lets people easily benchmark, reproduce, and examine one another's ASR systems.



2. TestSet Zoo: datasets/*

Academic Test Sets (EN & ZH)

| DATASET_ID | DESCRIPTION | LANGUAGE |
|---|---|---|
| AISHELL1_TEST | test set of AISHELL-1 | zh |
| AISHELL2_IOS_TEST | test set of AISHELL-2 (iOS channel) | zh |
| AISHELL2_ANDROID_TEST | test set of AISHELL-2 (Android channel) | zh |
| AISHELL2_MIC_TEST | test set of AISHELL-2 (Microphone channel) | zh |
| ALIMEETING_EVAL_NEAR_FIELD | AliMeeting | zh |
| ALIMEETING_TEST_NEAR_FIELD | AliMeeting | zh |
| ALIMEETING_EVAL_FAR_FIELD | AliMeeting | zh |
| ALIMEETING_TEST_FAR_FIELD | AliMeeting | zh |
| LIBRISPEECH_TEST_CLEAN | "test_clean" set of LibriSpeech | en |
| LIBRISPEECH_TEST_OTHER | "test_other" set of LibriSpeech | en |
| TEDLIUM_RELEASE3_LEGACY_DEV | TED-LIUM release 3, legacy dir, dev set | en |
| TEDLIUM_RELEASE3_LEGACY_TEST | TED-LIUM release 3, legacy dir, test set | en |
| GIGASPEECH_V1.0.0_DEV | dev set of GigaSpeech | en |
| GIGASPEECH_V1.0.0_TEST | test set of GigaSpeech | en |
| VOXPOPULI_V1.0_EN_DEV | dev set of VoxPopuli | en |
| VOXPOPULI_V1.0_EN_TEST | test set of VoxPopuli | en |
| VOXPOPULI_V1.0_EN_ACCENTED_TEST | accented test set of VoxPopuli | en |
| COMMON_VOICE_V11.0_DEV | dev set of Common Voice | en |
| COMMON_VOICE_V11.0_TEST | test set of Common Voice | en |

SpeechIO Test Sets (ZH)

SpeechIO test sets are carefully curated by the SpeechIO authors: crawled from publicly available sources (YouTube, TV programs, podcasts, etc.), covering various well-known scenarios and topics, and transcribed by paid professional annotators.

| DATASET_ID | NAME | SCENARIO | TOPIC | DURATION (HOURS) | DIFFICULTY (1-5) |
|---|---|---|---|---|---|
| SPEECHIO_ASR_ZH00000 | for debugging | Conference & speech | Economy, currency, finance | 1.0 | ★★☆ |
| SPEECHIO_ASR_ZH00001 | 新闻联播 | TV news | News & politics | 9 | |
| SPEECHIO_ASR_ZH00002 | 鲁豫有约 | TV interview | Celebrity & film & music & daily life | 3 | ★★☆ |
| SPEECHIO_ASR_ZH00003 | 天下足球 | TV program | Sports & football & World Cup | 2.7 | ★★☆ |
| SPEECHIO_ASR_ZH00004 | 罗振宇跨年演讲 | Stadium public speech | Society & culture & business trends | 2.7 | ★★ |
| SPEECHIO_ASR_ZH00005 | 李永乐讲堂 | Online education | Popular science | 4.4 | ★★★ |
| SPEECHIO_ASR_ZH00006 | 王者荣耀 (张大仙 & 骚白) | Live broadcasting | Gaming | 1.6 | ★★★☆ |
| SPEECHIO_ASR_ZH00007 | 直播带货 (李佳琪 & 薇娅) | Live broadcasting | Makeup & online shopping/advertising | 0.9 | ★★★★☆ |
| SPEECHIO_ASR_ZH00008 | 老罗语录 | Offline lecture | Life & purpose & ethics | 1.3 | ★★★★☆ |
| SPEECHIO_ASR_ZH00009 | 故事FM | Podcast | Ordinary life storytelling | 4.5 | ★★☆ |
| SPEECHIO_ASR_ZH00010 | 创业内幕 | Podcast | Startup & entrepreneur & product & investment | 4.2 | ★★☆ |
| SPEECHIO_ASR_ZH00011 | 罗翔刑法法考 | Online education | Law & lawyer qualification exams | 3.4 | ★★☆ |
| SPEECHIO_ASR_ZH00012 | 张雪峰考研 | Online education | University & graduate school entrance exams | 3.4 | ★★★☆ |
| SPEECHIO_ASR_ZH00013 | 谷阿莫, 牛叔说电影 | Vlog | Movie cuts | 1.8 | ★★★ |
| SPEECHIO_ASR_ZH00014 | 贫穷料理, 琼斯爱生活 | Vlog | Food & cooking & gourmet | 1 | ★★★☆ |
| SPEECHIO_ASR_ZH00015 | 单田芳 白眉大侠 | Traditional podcast | Kung fu fiction | 2.2 | ★★☆ |
| SPEECHIO_ASR_ZH00016 | 德云社演出 | Theater crosstalk show | Funny stories | 1 | ★★★ |
| SPEECHIO_ASR_ZH00017 | 吐槽大会 | Stand-up comedy | Celebrity jokes | 1.8 | ★★☆ |
| SPEECHIO_ASR_ZH00018 | 小猪佩奇, 熊出没 | Children's cartoon | Fairy tales | 0.9 | ★☆ |
| SPEECHIO_ASR_ZH00019 | CCTV5 NBA 转播 | Live sports commentary | NBA games | 0.7 | ★★★ |
| SPEECHIO_ASR_ZH00020 | 篮球人物 | Documentary | NBA superstars' life & history | 2.2 | ★★ |
| SPEECHIO_ASR_ZH00021 | 汽车之家评测 | Vlog | Car benchmarks, road driving tests | 1.7 | ★★★☆ |
| SPEECHIO_ASR_ZH00022 | 小艾大叔 豪宅带看 | Vlog | Real estate, mansion tours | 1.7 | ★★★ |
| SPEECHIO_ASR_ZH00023 | 无聊开箱, Zealer评测 | Vlog | Unboxing | 2 | ★★★ |
| SPEECHIO_ASR_ZH00024 | 付老师种植技术 | Vlog | Agriculture, planting | 2.7 | ★★★☆ |
| SPEECHIO_ASR_ZH00025 | 石国鹏讲历史 | Offline lecture | History, Greek philosophy | 1.3 | ★★☆ |
| SPEECHIO_ASR_ZH00026 | 张震鬼故事 | Broadcasting program | Horror stories | 2.4 | ★★★ |
| SPEECHIO_ASR_ZH00027 | 华语辩论世界杯 | Debate contest | Hobby, skill, growth | 1.4 | ★★★ |
| SPEECHIO_ASR_ZH00028 | 时政现场同传 | Simultaneous translation | News & events on public governance | 2.1 | ★★★☆ |
| SPEECHIO_ASR_ZH00029 | 港台明星访谈 (周杰伦, 曾志伟, 张家辉, 陈小春, 周星驰) | Hong Kong/Taiwan accents | Entertainment, acting, music | 1.5 | ★★★☆ |
| SPEECHIO_ASR_ZH00030 | 世界青年说 | Foreigner accents | Cultural differences | 2 | ★★★☆ |
| SPEECHIO_ASR_ZH00031 | 东方甄选 | Live broadcasting | Online advertising & English education | 2.4 | ★★★☆ |
| SPEECHIO_ASR_ZH00032 | 郎朗钢琴课 | Long-form video | Music & piano | 1.7 | ★★☆ |
| SPEECHIO_ASR_ZH00033 | 老石谈芯 | Vlog | Chips | 2.8 | ★★★ |
| SPEECHIO_ASR_ZH00034 | 电丸科技AK | Vlog | Internet tech, IT | 1.4 | ★★★☆ |
| SPEECHIO_ASR_ZH00035 | 新氧医美 | Vlog | Medical cosmetology | 1.4 | ★★ |
| SPEECHIO_ASR_ZH00036 | 交通广播 | Traffic radio | Traffic, entertainment | 1.2 | ★★★☆ |
| SPEECHIO_ASR_ZH00037 | 老俞闲聊 | Online meeting | Chat | 2.4 | ★★★ |
| SPEECHIO_ASR_ZH00038 | 电影:疯狂石头+疯狂赛车 | Film | Multiple accents (Chongqing, Qingdao (Shandong), Chengdu (Sichuan), Tangshan (Hebei), Cantonese, Tianjin, Henan, Shaanxi, Minnan, Wuhan, etc.) | 1.3 | ★★★★☆ |
| SPEECHIO_ASR_ZH00039 | 电影:1942 | Film | Henan accent | 0.9 | ★★★★ |
| SPEECHIO_ASR_ZH00040 | 电影:白鹿原 | Film | Shaanxi accent | 1.1 | ★★★★★ |
| SPEECHIO_ASR_ZH00041 | 电影:让子弹飞 | Film | Sichuan accent | 1.1 | ★★★★☆ |
| SPEECHIO_ASR_ZH00042 | 电影:人生大事 | Film | Wuhan accent | 0.8 | ★★★★ |

Download Dataset


3. Model Zoo: models/*

EN Models

| MODEL_ID | TYPE | PROVIDER/AUTHOR | DESCRIPTION | URL |
|---|---|---|---|---|
| aliyun_api_en | Cloud | Alibaba | | link |
| amazon_api_en | Cloud | Amazon | AWS | link |
| baidu_api_en | Cloud | Baidu | | link |
| google_api_en | Cloud | Google | | link |
| google_USM_en | Cloud | Google | | request access |
| microsoft_sdk_en | Cloud | Microsoft | Azure | link |
| tencent_api_en | Cloud | Tencent | | link |
| coqui_model_en | Local | coqui | | link |
| deepspeech_model_en | Local | deepspeech | | link |
| k2_gigaspeech | Local | k2-fsa | | link |
| nemo_conformer_ctc_large_en | Local | NVidia NeMo | | link |
| nemo_conformer_transducer_xlarge_en | Local | NVidia NeMo | | link |
| vosk_model_en | Local | alphacephei | | link |
| vosk_model_en_large | Local | alphacephei | | link |
| whisper_large | Local | OpenAI | | link |
| whisper_large_v2 | Local | OpenAI | | link |
| data2vec_audio_large_ft_libri_960h | Local | Facebook AI | | link |
| hubert_xlarge_ft_libri_960h | Local | Facebook AI | | link |
| wav2vec2_large_robust_ft_libri_960h | Local | Facebook AI | | link |
| wavlm_base_plus_ft_libri_clean_100h | Local | Microsoft, patrickvonplaten | | link |

ZH Models

Cloud Models

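| MODEL_ID | TYPE | PROVIDER | DESCRIPTION | URL |
|---|---|---|---|---|
| aispeech_api_zh | Cloud | AISpeech | AISpeech Open Platform | link |
| aliyun_api_zh | Cloud | Alibaba | Alibaba Cloud, short-utterance recognition | link |
| aliyun_ftasr_api_zh | Cloud | Alibaba | Alibaba Cloud, file recognition (non-streaming) | link |
| baidu_pro_api_zh | Cloud | Baidu | Baidu AI Cloud (fast edition) | link |
| bilibili_api_zh | Cloud | bilibili | bilibili AI Open Platform | not available yet |
| ximalaya_api_zh | Cloud | ximalaya | Ximalaya AI Open Platform (transcription, non-streaming) | link |
| iflytek_lfasr_api_zh | Cloud | IFlyTek | iFlytek Open Platform (transcription, non-streaming) | link |
| microsoft_sdk_zh | Cloud | Microsoft | Azure (streaming) | link |
| microsoft_batch_zh | Cloud | Microsoft | Azure (offline transcription) | link |
| tencent_api_zh | Cloud | Tencent | Tencent Cloud | link |
| yitu_api_zh | Cloud | YituTech | Yitu Speech Open Platform | link |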

Local Models

| MODEL_ID | TYPE | AUTHOR | DESCRIPTION |
|---|---|---|---|
| speechio_kaldi_multicn | Local | Xingyu NA (那兴宇) | Kaldi multi_cn recipe |
| vosk_model_cn | Local | alphacephei | Chinese engine of Vosk |
| paraformer_large_offline_zh | Local | modelscope | Paraformer, default Chinese 16k model, offline, supports long-form audio recognition |

Download Model

To submit a model

Follow this specification. Existing models are good references as well.


4. Benchmarking Pipeline

Benchmark
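
The overview lists four pipeline stages: data preparation, recognition, post-processing, and error rate evaluation. As a rough illustration of the last two, here is a minimal sketch of a character error rate (CER) computation. It is not the repository's actual implementation: the function names `normalize`, `edit_distance`, and `cer` are hypothetical, and the normalization rule (dropping whitespace and punctuation) is an assumption chosen only to keep the example small.

```python
import unicodedata


def normalize(text: str) -> str:
    """Assumed post-processing: drop whitespace and punctuation characters."""
    return "".join(
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters, after normalization."""
    ref_n, hyp_n = normalize(ref), normalize(hyp)
    if not ref_n:
        raise ValueError("reference is empty after normalization")
    return edit_distance(ref_n, hyp_n) / len(ref_n)
```

For example, `cer("今天天气很好。", "今天天汽很好")` counts one substitution against six reference characters, giving 1/6. A real pipeline would typically also normalize numerals, casing, and fillers before scoring, which this sketch leaves out.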


5. Latest Results

Public Models

Unlocked SpeechIO test sets (ZH00001 ~ ZH00026)

| Rank | Model | CER | Date |
|---|---|---|---|
| 1 | ximalaya_api_zh | 1.72% | 2024.08 |
| 2 | aliyun_ftasr_api_zh | 1.80% | 2024.08 |
| 3 | microsoft_batch_zh | 1.95% | 2024.08 |
| 4 | iflytek_lfasr_api_zh | 3.02% | 2024.08 |
| 5 | tencent_api_zh | 3.20% | 2024.08 |
| 6 | aispeech_api_zh | 3.61% | 2024.08 |
| 7 | baidu_pro_api_zh | 7.28% | 2024.08 |

Locked SpeechIO test sets (ZH00027 ~ ZH00042)

| Rank | Model | CER | Date |
|---|---|---|---|
| 1 | microsoft_batch_zh | 4.79% | 2024.08 |
| 2 | aliyun_ftasr_api_zh | 6.09% | 2024.08 |
| 3 | ximalaya_api_zh | 6.35% | 2024.08 |
| 4 | tencent_api_zh | 7.22% | 2024.08 |
| 5 | iflytek_lfasr_api_zh | 7.84% | 2024.08 |
| 6 | aispeech_api_zh | 9.15% | 2024.08 |
| 7 | baidu_pro_api_zh | 15.31% | 2024.08 |

All SpeechIO test sets (ZH00001 ~ ZH00042)

| Rank | Model | CER | Date |
|---|---|---|---|
| 1 | microsoft_batch_zh | 2.81% | 2024.08 |
| 2 | aliyun_ftasr_api_zh | 3.09% | 2024.08 |
| 3 | ximalaya_api_zh | 3.12% | 2024.08 |
| 4 | tencent_api_zh | 4.42% | 2024.08 |
| 5 | iflytek_lfasr_api_zh | 4.48% | 2024.08 |
| 6 | aispeech_api_zh | 5.29% | 2024.08 |
| 7 | baidu_pro_api_zh | 9.71% | 2024.08 |
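
Each table above reports a single CER over a group of test sets. This README does not state how per-set results are combined, so the following is only a sketch of two common conventions, with made-up counts: pooling errors and reference characters across sets (`pooled_cer`) versus averaging the per-set rates (`mean_cer`). Both function names and all numbers are hypothetical.

```python
from typing import Iterable, Tuple

# Each entry is (num_errors, num_reference_chars) for one test set.
Counts = Iterable[Tuple[int, int]]


def pooled_cer(counts: Counts) -> float:
    """Total errors divided by total reference characters across all sets."""
    total_err = 0
    total_ref = 0
    for errors, ref_chars in counts:
        total_err += errors
        total_ref += ref_chars
    if total_ref == 0:
        raise ValueError("no reference characters")
    return total_err / total_ref


def mean_cer(counts: Counts) -> float:
    """Unweighted mean of the per-set error rates."""
    rates = [errors / ref_chars for errors, ref_chars in counts]
    if not rates:
        raise ValueError("no test sets")
    return sum(rates) / len(rates)


# Hypothetical counts: a large easy set and a small hard set.
example = [(10, 1000), (90, 500)]
```

With these counts, `pooled_cer(example)` is 100/1500 (about 6.67%) while `mean_cer(example)` is 9.5%, so the two conventions can differ noticeably when test sets vary in size and difficulty.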

Private Models

Unlocked SpeechIO test sets (ZH00001 ~ ZH00021)

| Model | CER | Date |
|---|---|---|
| bilibili_api_zh(*) | 2.49% | 2024.04 |

Locked SpeechIO test sets (ZH00022 ~ ZH00042)

| Model | CER | Date |
|---|---|---|
| bilibili_api_zh(*) | 5.36% | 2024.04 |

All SpeechIO test sets (ZH00001 ~ ZH00042)

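| Model | CER | Date |
|---|---|---|
| bilibili_api_zh(*) | 3.36% | 2024.04 |

Detailed results (CER %)

| Test Set ID | Scenario & Topic | bilibili_api_zh | Date |
|---|---|---|---|
| SPEECHIO_ASR_ZH00001 | 新闻联播 | 0.54 | 2024.04 |
| SPEECHIO_ASR_ZH00002 | Interview: 鲁豫有约 | 2.78 | 2024.04 |
| SPEECHIO_ASR_ZH00003 | TV program: 天下足球 | 0.81 | 2024.04 |
| SPEECHIO_ASR_ZH00004 | Stadium speech: 罗振宇跨年 | 1.48 | 2024.04 |
| SPEECHIO_ASR_ZH00005 | Online education: 李永乐, popular science | 1.47 | 2024.04 |
| SPEECHIO_ASR_ZH00006 | Live streaming: 王者荣耀, 张大仙 & 骚白 | 5.85 | 2024.04 |
| SPEECHIO_ASR_ZH00007 | Live streaming: e-commerce sales, 李佳琪 & 薇娅 | 6.21 | 2024.04 |
| SPEECHIO_ASR_ZH00008 | Offline lecture: 老罗语录 | 3.69 | 2024.04 |
| SPEECHIO_ASR_ZH00009 | Podcast: 故事FM | 3.18 | 2024.04 |
| SPEECHIO_ASR_ZH00010 | Podcast: 创业内幕 | 3.51 | 2024.04 |
| SPEECHIO_ASR_ZH00011 | Online education: 罗翔, criminal law bar exam | 1.77 | 2024.04 |
| SPEECHIO_ASR_ZH00012 | Online education: 张雪峰, graduate school entrance exam | 2.11 | 2024.04 |
| SPEECHIO_ASR_ZH00013 | Short video: movie cuts, 谷阿莫 & 牛叔说电影 | 2.96 | 2024.04 |
| SPEECHIO_ASR_ZH00014 | Short video: food & cooking | 3.56 | 2024.04 |
| SPEECHIO_ASR_ZH00015 | Traditional storytelling: 单田芳, 白眉大侠 | 4.71 | 2024.04 |
| SPEECHIO_ASR_ZH00016 | Crosstalk: 德云社 show | 2.99 | 2024.04 |
| SPEECHIO_ASR_ZH00017 | Stand-up comedy: 吐槽大会 | 2.94 | 2024.04 |
| SPEECHIO_ASR_ZH00018 | Children's cartoon: 小猪佩奇 & 熊出没 | 1.97 | 2024.04 |
| SPEECHIO_ASR_ZH00019 | Sports commentary: NBA games | 2.32 | 2024.04 |
| SPEECHIO_ASR_ZH00020 | Documentary: 篮球人物 | 1.51 | 2024.04 |
| SPEECHIO_ASR_ZH00021 | Short video: 汽车之家, car reviews | 1.75 | 2024.04 |
| SPEECHIO_ASR_ZH00022 | Short video: 小艾大叔, mansion tours | 3.29 | 2024.04 |
| SPEECHIO_ASR_ZH00023 | Short video: unboxing, Zealer & 无聊开箱 | 2.19 | 2024.04 |
| SPEECHIO_ASR_ZH00024 | Short video: 付老师, agriculture & planting | 4.81 | 2024.04 |
| SPEECHIO_ASR_ZH00025 | Offline lecture: 石国鹏, ancient Greek philosophy | 3.32 | 2024.04 |
| SPEECHIO_ASR_ZH00026 | Radio program: 张震鬼故事 | 3.69 | 2024.04 |
| SPEECHIO_ASR_ZH00027 | Chinese-language university debate contest: hobby, skill, growth | 2.07 | 2024.04 |
| SPEECHIO_ASR_ZH00028 | Simultaneous interpretation: politics & public governance | 1.90 | 2024.04 |
| SPEECHIO_ASR_ZH00029 | Hong Kong/Taiwan accents: celebrity interviews | 3.89 | 2024.04 |
| SPEECHIO_ASR_ZH00030 | Foreigner accents: 世界青年说 | 3.87 | 2024.04 |
| SPEECHIO_ASR_ZH00031 | Live-stream e-commerce: 东方甄选 | 3.80 | 2024.04 |
| SPEECHIO_ASR_ZH00032 | Music: 郎朗钢琴课 | 3.86 | 2024.04 |
| SPEECHIO_ASR_ZH00033 | Chips: 老石谈芯 | 2.70 | 2024.04 |
| SPEECHIO_ASR_ZH00034 | Internet & IT: 电丸科技AK | 5.48 | 2024.04 |
| SPEECHIO_ASR_ZH00035 | 新氧医美 | 1.17 | 2024.04 |
| SPEECHIO_ASR_ZH00036 | Traffic radio: 信不信由你 | 5.94 | 2024.04 |
| SPEECHIO_ASR_ZH00037 | Online meeting chat: 老俞闲话 | 2.86 | 2024.04 |
| SPEECHIO_ASR_ZH00038 | Film: 疯狂石头 + 疯狂赛车 (mixed dialects) | 18.29 | 2024.04 |
| SPEECHIO_ASR_ZH00039 | Film: 1942 (Henan dialect) | 13.96 | 2024.04 |
| SPEECHIO_ASR_ZH00040 | Film: 白鹿原 (Shaanxi dialect) | 26.38 | 2024.04 |
| SPEECHIO_ASR_ZH00041 | Film: 让子弹飞 (Sichuan dialect) | 10.84 | 2024.04 |
| SPEECHIO_ASR_ZH00042 | Film: 人生大事 (Wuhan dialect) | 18.04 | 2024.04 |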

Note: models marked with (*) are listed in the Model Zoo but are not yet available to the general public.


Contacts

Email: [email protected]