Stable Audio Open: Free Model Released for Local Execution

Stable Audio Open is a variant of the Stable Audio model trained only on Creative Commons-licensed audio, a choice meant to address copyright concerns around music generation.

Since the model has been released publicly for free, I set up the official demo and ran it locally. Here is how to do it.

Requirements

  • PyTorch 2.0 or later
  • A GPU with CUDA support is recommended (it is quite slow on CPU)
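
A quick way to check both requirements is a short script like the one below. It is only a sketch: `meets_min_version` and `report` are helper names made up for this example, and `report()` assumes PyTorch is already installed.

```python
import re


def meets_min_version(version: str, minimum=(2, 0)) -> bool:
    """Return True if a version string such as '2.3.1+cu121' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


def report() -> None:
    """Print the installed PyTorch version and whether CUDA is usable."""
    import torch  # imported here so the helper above works without PyTorch

    status = "OK" if meets_min_version(torch.__version__) else "too old"
    print("PyTorch:", torch.__version__, status)
    print("CUDA available:", torch.cuda.is_available())
```

Calling `report()` in the environment you plan to use shows whether the demo will run on the GPU or fall back to the much slower CPU path.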

Execution Method

Step 1: HuggingFace Setup

Request access to Stable-Audio-Open on HuggingFace

You need to be logged in to accept the terms and request access.

stabilityai/stable-audio-open-1.0 · Hugging Face

Generate an Access Token

Once logged in, generate a token at the link below. You’ll need this to authenticate with the HuggingFace CLI.

Hugging Face – The AI community building the future.
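
If you would rather authenticate from Python than from the CLI, something along these lines should work. This is a sketch under assumptions: the token is stored in an environment variable (here called `HF_TOKEN`), the `huggingface_hub` package is installed, and `read_token` and `login_with_env_token` are hypothetical helpers written for this example.

```python
import os


def read_token(env_var: str = "HF_TOKEN") -> str:
    """Fetch the access token from the environment, failing loudly if unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token first")
    return token


def login_with_env_token() -> None:
    """Log in to the Hugging Face Hub using the token from the environment."""
    from huggingface_hub import login  # lazy import: only needed when logging in

    login(token=read_token())
```

Keeping the token in an environment variable avoids pasting it into scripts or shell history.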

Step 2: Prepare Environment and Run the Demo

  • I used Python 3.10.14
  • If you don’t want to clutter your local environment, run this inside a venv.

Run the following commands in a terminal:
# Install dependencies (alternatively, run "pip install ." inside the cloned repo)
pip install stable-audio-tools
# Additional packages I needed in my environment
sudo apt install libsndfile1
sudo apt install nvidia-cuda-toolkit
pip install flash-attn
# Authenticate with HuggingFace
huggingface-cli login
# Clone and run the demo
git clone https://github.com/Stability-AI/stable-audio-tools.git
python ./stable-audio-tools/run_gradio.py --pretrained-name stabilityai/stable-audio-open-1.0

Step 3: Access the Demo

Once the demo starts successfully, a URL will be displayed. Open it in your browser.

Demo Screen

Bonus: Disabling the Public URL

By default, Gradio creates a public URL that is reachable from anywhere on the internet.

If you don’t want to expose the demo publicly, set the share parameter of interface.launch (line 18 of run_gradio.py) to False.

Alternatively, pass the --username and --password options to run_gradio.py to put the demo behind a username and password login.
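
To make the two options concrete, here is a rough sketch of how they map onto the arguments of Gradio's `launch()`. `launch_kwargs` is a hypothetical helper written for this post, not code taken from run_gradio.py; `share` and `auth` are existing `launch()` parameters.

```python
from typing import Any, Dict, Optional


def launch_kwargs(share: bool = False,
                  username: Optional[str] = None,
                  password: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for a Gradio launch() call."""
    kwargs: Dict[str, Any] = {"share": share}
    # Only enable the login prompt when both credentials are supplied.
    if username is not None and password is not None:
        kwargs["auth"] = (username, password)
    return kwargs
```

With this helper, `interface.launch(**launch_kwargs())` would keep the demo on the local URL only, while passing `share=True` together with a username and password would publish it behind a login.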

Generation Result

positive prompt:

Trance, Progressive, Rock, EDM

negative prompt:

harsh, loud, chaotic, aggressive, dissonant, jarring, abrupt, noisy, overpowering, unsettling, atonal, disruptive
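
The same prompts can in principle be used without the Gradio UI. The sketch below follows the usage example on the model card as far as I can tell. The `negative_conditioning` argument, the 30-second length, and the sampler settings are assumptions to check against your installed stable-audio-tools version, and `make_conditioning` and `generate` are helper names made up for this example.

```python
def make_conditioning(prompt: str, seconds_total: int = 30) -> list:
    """Build the text-and-timing conditioning list for one clip."""
    return [{"prompt": prompt, "seconds_start": 0, "seconds_total": seconds_total}]


def generate(positive: str, negative: str, out_path: str = "output.wav") -> None:
    """Generate one clip and save it as a 16-bit WAV file."""
    # Heavy imports stay inside the function so the helper above is usable alone.
    import torch
    import torchaudio
    from einops import rearrange
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
    model = model.to(device)

    audio = generate_diffusion_cond(
        model,
        steps=100,
        cfg_scale=7,
        conditioning=make_conditioning(positive),
        negative_conditioning=make_conditioning(negative),  # assumed parameter name
        sample_size=config["sample_size"],
        sigma_min=0.3,
        sigma_max=500,
        sampler_type="dpmpp-3m-sde",
        device=device,
    )

    # Merge the batch into one stereo sequence, peak-normalize, convert to int16.
    audio = rearrange(audio, "b d n -> d (b n)").to(torch.float32)
    audio = audio.div(audio.abs().max()).clamp(-1, 1).mul(32767).to(torch.int16).cpu()
    torchaudio.save(out_path, audio, config["sample_rate"])
```

Calling `generate("Trance, Progressive, Rock, EDM", "harsh, loud, chaotic")` would then write the result to output.wav, assuming access to the model has been granted and you are logged in.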

Conclusion

I’m not very knowledgeable about music, but as an amateur I think the output roughly matched what I had in mind.

The stable-audio-tools repository also includes training scripts, so if you want to generate melodies similar to a specific song, that might be worth trying. That would most likely mean fine-tuning the model, though, so expect to need substantial GPU memory and training data.

Personally I was happy to get it running successfully, so I’ll leave it here for now.