Create Voiceovers with AI Using Bark (Text to Audio Model)

Jun 22, 2023•5 min read

We finally have a text-to-audio model available on Stablecog and a new interface for voiceover models! This text-to-audio model is called Bark, made by Suno. We have 7 languages and 36 speakers available in beta.

Create Voiceovers with AI

Creating voiceovers with AI works similarly to image generation: you enter a text prompt, select a speaker and a language, and click generate. If you like, you can also set the “voice stability”, which makes the voice more consistent or more variable. You can also remove silence and background noise from the generated audio. All of this is available in our new interface.

Here is an example you can listen to. It is voiced by Rachel:

Rachel

Technology is a bridge between imagination and reality.

What is Bark?

Bark is an open-source, transformer-based text-to-audio model created by Suno. It is quite popular, with 21.9K stars on GitHub at the time of writing.

It can create realistic speech in various languages. It is also able to generate music and simple sound effects in some cases. For example, you can make the speaker laugh using “[laugh]”.

Here is an example with the prompt “I wouldn’t say so [laugh].”:

Paul

I wouldn't say so [laugh].
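
If you want to try the open-source model directly, here is a minimal sketch based on the usage shown in Bark's README. It assumes the `bark` package and `scipy` are installed. `find_cues` and `synthesize` are helper names made up for this example, and the preset `v2/en_speaker_6` is simply one of Bark's published English presets, not necessarily the voice behind any Stablecog speaker.

```python
import re

# Bracketed non-speech cues such as "[laugh]" in the prompt above.
CUE_PATTERN = re.compile(r"\[[^\[\]]+\]")


def find_cues(prompt: str) -> list[str]:
    """Return the bracketed cues found in a prompt, in order of appearance."""
    return CUE_PATTERN.findall(prompt)


def synthesize(prompt: str, out_path: str = "voiceover.wav",
               speaker: str = "v2/en_speaker_6") -> str:
    """Generate speech with Bark and write it to a WAV file.

    Assumption: bark and scipy are installed. The imports are kept inside
    the function because Bark downloads its model weights on first use.
    """
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()  # loads (and, if needed, downloads) the model weights
    audio = generate_audio(prompt, history_prompt=speaker)
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path


# The cue is passed to the model as part of the prompt text itself.
cues = find_cues("I wouldn't say so [laugh].")
```

Calling `synthesize("I wouldn't say so [laugh].")` should write `voiceover.wav` to the current directory. The first run can take a while because of the model download, and results vary from run to run.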

Create Voiceovers with Our API

The voiceover feature is also available on our API. It supports all the same functionality that is available in the UI.
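
As a rough illustration of what a voiceover request could look like, here is a sketch that only builds the HTTP request, using the options described earlier in this post. Everything API-specific in it is an assumption: the URL is a placeholder, the JSON field names are hypothetical, and bearer-token authentication is a guess. Take the real endpoint, fields, and auth scheme from the API documentation.

```python
import json
import urllib.request

# Placeholder URL: NOT the real endpoint. Replace it with the one in the docs.
API_URL = "https://api.example.com/v1/voiceover"


def build_voiceover_request(api_key: str, prompt: str, speaker: str,
                            language: str, stability: float = 0.5,
                            remove_silence: bool = True,
                            denoise: bool = True) -> urllib.request.Request:
    """Build (but do not send) a POST request for a voiceover.

    All payload keys are hypothetical names chosen to mirror the UI options.
    """
    payload = {
        "prompt": prompt,
        "speaker": speaker,
        "language": language,
        "stability": stability,            # the "voice stability" setting
        "remove_silence": remove_silence,  # trim silent stretches
        "denoise": denoise,                # remove background noise
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_voiceover_request(
    "YOUR_API_KEY",
    "Technology is a bridge between imagination and reality.",
    speaker="Rachel",
    language="en",
)
```

Once the placeholders are replaced with real values, the request can be sent with `urllib.request.urlopen(request)`.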

What’s Coming Next?

This marks a new beginning for Stablecog. We intend Stablecog to be a suite of AI-powered, open-source tools for creators, regardless of which language they speak. Next, we’ll create the long-requested Discord bot. After that, we’ll be experimenting with animations and writing.