Artificial intelligence is now used for a wide range of purposes, including generating images, text, and even sound and music. Stable Audio AI is one of the best AI models currently available for audio production.
In this guide, we'll be looking at what this AI platform is and how you can use it to create original audio without needing any musical instruments.
What is Stable Audio AI?
Stable Audio AI is an AI model from Stability AI, the company best known for Stable Diffusion, which generates images from user-provided prompts. It was created by Harmonai, Stability AI's audio research lab.
Stable Audio uses diffusion models to create audio from text descriptions. You type in the kind of music or sounds you want, and it generates original audio within seconds.
What makes this audio generation model impressive is its training data: more than 800,000 audio files, amounting to 19,500 hours of audio, from the digital music library AudioSparx. It can generate 95 seconds of 44.1 kHz stereo audio in less than a second on an Nvidia A100 GPU.
It does so using latent diffusion, a technique similar to the one behind the company's Stable Diffusion AI for image generation. And unlike other audio generation AI models, Stable Audio AI can create sounds of different lengths.
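To put the figures above in perspective, here is a minimal arithmetic sketch of how much raw audio a 95-second, 44.1 kHz stereo clip represents. The 16-bit sample depth is an assumption made for illustration; the text does not state the bit depth the model outputs.

```python
# Figures quoted in the text above.
SAMPLE_RATE_HZ = 44_100  # 44.1 kHz
CHANNELS = 2             # stereo
CLIP_SECONDS = 95        # clip length quoted for the A100 benchmark


def total_samples(seconds, sample_rate=SAMPLE_RATE_HZ, channels=CHANNELS):
    """Number of individual sample values in a clip of the given length."""
    return int(seconds * sample_rate * channels)


def uncompressed_size_mb(seconds, bit_depth=16):
    """Raw PCM size in megabytes. The 16-bit depth is an assumption."""
    return total_samples(seconds) * bit_depth / 8 / 1_000_000


if __name__ == "__main__":
    print(total_samples(CLIP_SECONDS))                    # 8379000
    print(round(uncompressed_size_mb(CLIP_SECONDS), 2))   # 16.76
```

In other words, producing such a clip in under a second means generating more than eight million sample values per second of compute time.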
You can use Stable Audio AI to generate sounds of single instruments, ambient sounds, or even a full ensemble. Now let's understand how to generate audio using it.
Get Started on Stable Audio AI
You can try Stable Audio AI for free, but you will need to create an account on the Stable Audio website. The free account also comes with some limitations.
- Launch your browser and go to the Stable Audio website. Once there, click the blue capsule-shaped button on the top right corner that says 'Try It Out For Free'.
- Clicking the button will take you to a new page where you can create the account required for using Stable Audio AI. You can either set up a new account by entering your email address and a password or log in with your Google account.
- If you choose to sign in using your Google account, you will need to authorize Google to share your account details with the website. After entering your Google account username and password, click the 'Continue' button to provide the required authorization.
- Once you are signed in, you will be greeted by the home page of the Stable Audio AI platform. This will contain the terms and conditions that you will have to accept to proceed. You can also choose to sign up for the Stable Audio newsletter from here.
Click the button next to where it says 'I have read and accept the terms and conditions' and, if you want, the one below to subscribe to their newsletter. Then, click the 'Next' button at the bottom.
- Accepting the terms and conditions will bring you to the Stable Audio dashboard, which you can use to generate original audio using prompts.
Using Stable Audio AI
The upper left section of the page is where you enter prompts for audio generation.
- You can enter a prompt like
Hard rock, concert promotion, metal, 180 bpm
into the box. Then, click the 'Generate' button at the bottom to start creating audio.
- Another way to generate audio is to use ready-made prompts from the 'Prompt Library' located below the prompt section, where you can access different audio prompt styles. To do so, click the capsule-shaped button labeled 'None'. This will open the library, which contains various styles you can choose from to add to your audio.
- To select any of the audio prompt styles from the library, simply click on it. The prompt section above the library will show you the prompts the style contains. The selected style will acquire a play icon, and the label on the prompt library button will also change to reflect the selected style.
- Underneath the 'Prompt Library', you can see the AI Model used for the audio generation process. Clicking on the model name, in this case, 'Stable-audio-audiosparx-v1-0', will show all available models. There is only one additional model available right now, which is in Beta.
You will need to upgrade to the Pro plan to use the Beta AI model.
- Next is the 'Duration' section, where you can control the length of the generated audio. Clicking the downward-pointing arrow decreases the duration, while clicking the upward-pointing one increases it.
The free version of Stable Audio allows you to generate audio that is 45 seconds long. If you upgrade to the Pro version, you can generate audio that is one minute and 30 seconds long.
- The last item on the left side is the option 'Add Extras' that you can use to customize your audio. Click the '+' button to view the available options, which currently include 'Steps', 'Number Of Results', 'Seed', and 'Prompt Strength'.
- Each of these extra options has settings that can be customized. For instance, if you click the 'Steps' option, you can increase the number of steps the AI takes to generate audio. The default is 50 steps, and you can raise it to 100 by typing the number into the provided box.
- You can go back to the default value by clicking the 'Reset' button at the bottom. If you want to remove the extras altogether, click the 'X' button next to the box where the number of steps has to be entered.
- Once you're done, click the 'Generate' button, and Stable Audio will start generating your audio. Alternatively, you can remove the 'Steps' extra and use another one, such as the 'Seed' option, which controls the randomness of the audio generation.
By default, the option is set to 'Random', which means the AI model will use a different audio arrangement each time it generates audio. To make the AI use the same arrangement each time, click the 'Random' label and type in a fixed value such as '222222'.
- Other extras include 'Number of Results' and 'Prompt Strength'. The former is a Pro feature that allows you to control the number of tracks the AI will return for the prompt (5 maximum) and is unusable in the free version. However, you can try out the 'Prompt Strength' option by clicking it.
This will bring up a slider that controls how closely the generated audio follows the prompt provided. By default, it is set to 80%, but you can drag the slider left or right to decrease or increase the strength as required.
- Once you've adjusted the prompt strength, click the 'Generate' button to tell the AI to start creating the audio.
- The upper right side of the Stable Audio AI page also contains a few items. The first is a musical note symbol that shows how many credits you have. You can only generate audio as long as you have credits, and free users get 20 credits each month. Next to it are a button that allows you to upgrade to the Pro version, an option to check the details of your account, and a hamburger menu containing additional options.
- Clicking the 'Upgrade To Pro' button will show you the available pricing plans. Besides the free plan, you can choose between the Pro, Studio, and Max plans that cost $11.99/month, $29.99/month and $89.99/month respectively. The 'Free' plan allows generating 20 tracks per month, while the 'Pro' allows 500 tracks. This goes up to 1,350 with the 'Studio' plan and 4,500 with the 'Max' plan.
Additionally, while the track duration in the 'Free' plan is 45 seconds, it is 90 seconds in all other plans. The 'Free' plan comes with a personal license, while you get a Creator license with the other options.
- The option next to the upgrade button shows details of your account on the Stable Audio AI platform. Clicking it will show you your current plan and what it offers.
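To compare the plans described above, the following sketch works out the cost per track and caps a requested duration at each plan's limit. The prices, track counts, and durations are the ones quoted in this guide and may have changed since. The `Plan` class and both helper functions are hypothetical names used only for illustration; they are not part of any Stable Audio API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    price_usd: float   # monthly price in US dollars
    tracks: int        # tracks allowed per month
    max_seconds: int   # longest track the plan can generate


# Figures as quoted in this guide (hypothetical data structure).
PLANS = {
    "Free": Plan(0.00, 20, 45),
    "Pro": Plan(11.99, 500, 90),
    "Studio": Plan(29.99, 1350, 90),
    "Max": Plan(89.99, 4500, 90),
}


def cost_per_track(plan_name):
    """Monthly price divided by the monthly track allowance."""
    plan = PLANS[plan_name]
    return plan.price_usd / plan.tracks


def clamp_duration(plan_name, requested_seconds):
    """Cap a requested track length at the plan's maximum duration."""
    return min(requested_seconds, PLANS[plan_name].max_seconds)


if __name__ == "__main__":
    for name in PLANS:
        print(f"{name}: ${cost_per_track(name):.3f} per track")
    print(clamp_duration("Free", 90))  # 45
```

On these figures, the paid plans all work out to roughly two cents per track, so the choice between them comes down mostly to how many tracks you need each month.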
Refining Your Prompts
By refining your prompts, you can fine-tune the output that Stable Audio provides. When working with generative AI, the better your prompts, the better the output will be. Here are some ways to improve your prompts.
- If the output sounds too electronic or digital, consider adding words like 'Band' or 'Live' to the prompt.
- You may be able to improve the quality of the output generated by including words like '44.1kHz', 'high-quality', and 'stereo' in the prompt.
- Use the word 'Solo' after the name of the leading instrument in the track to enhance the output. For instance, if the primary instrument is a violin, you can use 'Solo Violin' in the prompt.
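The tips above can be sketched as a small helper that assembles a comma-separated prompt in the same style as the example used earlier in this guide. The function name `build_prompt` and its parameters are hypothetical; Stable Audio only sees the final text you paste into the prompt box, and the ordering of terms here is just one reasonable choice.

```python
def build_prompt(genre, descriptors=(), bpm=None,
                 solo_instrument=None, live=False, quality_tags=False):
    """Assemble a comma-separated text prompt from its parts."""
    parts = [genre]
    parts.extend(descriptors)
    if solo_instrument:
        # Tip: put 'Solo' before the name of the leading instrument.
        parts.append(f"Solo {solo_instrument}")
    if live:
        # Tip: 'Live' can help when the output sounds too electronic.
        parts.append("Live")
    if quality_tags:
        # Tip: quality-related words may improve the output.
        parts.extend(["high-quality", "stereo", "44.1kHz"])
    if bpm is not None:
        parts.append(f"{bpm} bpm")
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt("Hard rock", ["concert promotion", "metal"], bpm=180))
    # Hard rock, concert promotion, metal, 180 bpm
    print(build_prompt("Classical", solo_instrument="Violin",
                       live=True, quality_tags=True))
    # Classical, Solo Violin, Live, high-quality, stereo, 44.1kHz
```

Keeping the parts separate like this makes it easy to change one element at a time and hear how each affects the result.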
With Stable Audio AI, you can easily generate impressive audio tracks just by using prompts. If you are a beginner, the free plan is an excellent way to try the model out, while professional musicians can upgrade to the paid plans and check out the more advanced features offered by the platform.