Create Voiceovers with AI Using Bark (Text to Audio Model)
We finally have a text-to-audio model available on Stablecog and a new interface for voiceover models! This text-to-audio model is called Bark, made by Suno. We have 7 languages and 36 speakers available in beta.
Create Voiceovers with AI
Creating voiceovers with AI works similarly to image generation. You enter a text prompt, select a speaker and a language, and click generate. If you like, you can also adjust the “voice stability” setting, which makes the voice more stable or more variable. You can also remove silence and background noise from the audio. Here is our new interface:
Here is an example you can listen to. It is voiced by Rachel:
Rachel
Technology is a bridge between imagination and reality.
What is Bark?
Bark is an open-source, transformer-based text-to-audio model created by Suno. It is quite popular, with 21.9K stars on GitHub.
It can create realistic speech in various languages. It is also able to generate music and simple sound effects in some cases. For example, you can make the speaker laugh using “[laugh]”.
Here is an example with the prompt: “I wouldn’t say so [laugh].”:
Paul
I wouldn't say so [laugh].
Create Voiceovers with Our API
The voiceover feature is also available on our API. It supports all the same functionality that is available in the UI.
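Since the API mirrors the UI, a request would carry the same options described above: prompt, speaker, language, voice stability, silence removal and noise removal. The sketch below only shows how such a request body might be assembled and validated before sending. Every field name, the 0.0 to 1.0 stability range, the default values and the "en" language code are assumptions made for illustration, not the actual Stablecog API schema, so check the API documentation for the real names.

```python
import json
from dataclasses import dataclass, asdict

# NOTE: all field names and value ranges here are hypothetical. They mirror
# the UI options described in this post, not a documented API schema.


@dataclass
class VoiceoverRequest:
    prompt: str
    speaker: str
    language: str
    stability: float = 0.5        # assumed range: 0.0 (variable) to 1.0 (stable)
    remove_silence: bool = True   # assumed flag for the "remove silence" option
    denoise_audio: bool = True    # assumed flag for the "remove background noise" option

    def validate(self) -> None:
        # Reject obviously invalid input before it is sent anywhere.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 0.0 <= self.stability <= 1.0:
            raise ValueError("stability must be between 0.0 and 1.0")

    def to_json(self) -> str:
        # Serialize the validated request to a JSON string.
        self.validate()
        return json.dumps(asdict(self), ensure_ascii=False)


request = VoiceoverRequest(
    prompt="Technology is a bridge between imagination and reality.",
    speaker="Rachel",
    language="en",
)
body = request.to_json()
print(body)
```

The resulting JSON string could then be sent as the body of an authenticated HTTP request with whichever client you prefer. Validating on the client side first keeps obviously bad requests from using up API calls.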
What’s Coming Next?
This marks a new beginning for Stablecog. We intend Stablecog to be a suite of AI-powered, open-source tools for creators, regardless of which language they speak. Next, we’ll create the long-requested Discord bot. Afterwards, we’ll be experimenting with animations and writing.
Start Creating Voiceovers
You can start creating voiceovers on Stablecog now. Join our Discord to let us know how it goes. Click the button below and start creating!