Thanks so much for joining our project! This is the beginning of your journey with us.
We hope that by participating in Common Voice, your language community can build speech recognition models using the Common Voice Dataset. You can learn more about how language communities have used the dataset by watching our contribute-athon videos.
If you prefer to be onboarded with slides, please check out our onboarding slides.
Common Voice is a publicly available voice dataset, powered by the voices of volunteer contributors around the world. People who want to build voice applications can use the dataset to train machine learning models.
At present, most voice datasets are owned by companies, which stifles innovation. Voice datasets also over-represent white, English-speaking men. This means that voice-enabled technology doesn’t work at all for many languages, and where it does work, it may not perform equally well for everyone. We want to change that by mobilizing people everywhere to share their voice.
Collecting and validating public domain sentences
Recording and validating the recordings of the sentences
Repeating this process to grow the size of the data
Generating a dataset
Using machine learning to train speech-to-text models on this dataset
If you have never used the Common Voice Platform before, feel free to check out our demo mode.
Common Voice has a variety of communities that support the project in different important areas. They are usually grouped by language.
👥 A language’s journey onto Common Voice is made possible by communities of multidisciplinary teams of committed people. Roles range from those that need no coding to organizing roles. Our community mobilization resource and Community page can connect you to resources and existing language communities that can support you.
ℹ️ Note: Mozilla welcomes small and minority language communities, and we understand some of these goals may seem out of reach. In that case, feel free to share with us how they are different for you, and we will try to help. Connect with the Common Voice Team on Discourse or GitHub issues.
Check out our short guide on community building to help you think about your language community.
All the best with your journey!
You are welcome to chat with us on Matrix if you have any questions!