Stable Audio Open is a variant of the Stable Audio model trained on royalty-free sound sources to address copyright concerns.
With the recent free release of Stable Audio Open, I decided to set up and run the official demo locally. Here’s how you can do it too.
Requirements
- PyTorch 2.0 or later
- A CUDA-capable GPU, if possible (generation is quite slow on a CPU)
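To confirm both requirements before going further, a quick check like the one below can help. It is only a sketch: the helper names parse_major_minor and check_requirements are my own, and torch is imported lazily so the script still runs (and reports that PyTorch is missing) on a machine where it is not installed yet.

```python
import importlib.util
import re


def parse_major_minor(version):
    """Turn a version string such as '2.3.1+cu121' into the tuple (2, 3)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_requirements():
    """Report whether PyTorch >= 2.0 is installed and whether CUDA is usable."""
    report = {"torch_installed": False, "torch_ok": False, "cuda": False}
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed at all

    import torch  # lazy import so the check also works without PyTorch

    report["torch_installed"] = True
    report["torch_ok"] = parse_major_minor(torch.__version__) >= (2, 0)
    report["cuda"] = torch.cuda.is_available()
    return report


print(check_requirements())
```

If "cuda" comes back False, the demo should still run, just slowly on the CPU.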
How to Run It
Step 1: Hugging Face Setup
Request access to Stable Audio Open on Hugging Face
You need to be logged in. The access request is made on the stabilityai/stable-audio-open-1.0 model page.
Generate an Access Token
Once you are logged in, you can create a token from the access token page in your Hugging Face account settings.
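If you would rather register the token from a script than through an interactive prompt, something like the following might work. This is a sketch under assumptions: the token is read from an environment variable (HF_TOKEN here), the helper names get_token and login_to_hub are hypothetical, and the only library call used is huggingface_hub.login, imported lazily so the helper for reading the token can be tried without the package installed.

```python
import os


def get_token(env=None, var_name="HF_TOKEN"):
    """Return the access token from the environment, or None if unset or blank."""
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    return token or None


def login_to_hub(token):
    """Register the token with the Hugging Face Hub client."""
    from huggingface_hub import login  # lazy import; needs huggingface_hub installed

    login(token=token)


# Demonstration with a placeholder value; a real token is never hard-coded.
print(get_token({"HF_TOKEN": "  hf_placeholder  "}))
print(get_token({}))
```

In real use you would call login_to_hub(get_token()) after checking that get_token() did not return None.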
Step 2: Set Up the Environment and Run the Demo
- Python: I used 3.10.14
- If you don’t want to pollute your local environment, run everything inside venv or a similar tool.
# Install dependencies (running pip install . inside the cloned repository also works)
pip install stable-audio-tools
# Additional packages I had to install in my environment
sudo apt install libsndfile1
sudo apt install nvidia-cuda-toolkit
pip install flash-attn
# Register your Hugging Face credentials
huggingface-cli login
# Run the demo
git clone https://github.com/Stability-AI/stable-audio-tools.git
python ./stable-audio-tools/run_gradio.py --pretrained-name stabilityai/stable-audio-open-1.0
Step 3: Access the Demo
Once the demo starts successfully, a URL is printed in the terminal. Open it in your browser.

Bonus: If You Don’t Want a Public URL
By default, a public URL that can be reached from anywhere in the world is generated.
If you don’t want to expose the demo on the network, change the share parameter passed to interface.launch (line 18 of run_gradio.py) to False.
Alternatively, if you pass the username and password options to run_gradio.py, the demo is protected with a username and password login.
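The two options above map onto the share and auth parameters of Gradio's launch(). The sketch below only assembles those keyword arguments so the logic is easy to see. build_launch_kwargs is a hypothetical helper of mine, not something found in run_gradio.py, and the credentials are placeholders.

```python
def build_launch_kwargs(share=False, username=None, password=None):
    """Assemble keyword arguments for a Gradio launch() call.

    share=False keeps the app local (no public URL is created).
    auth is only added when both a username and a password are given.
    """
    kwargs = {"share": share}
    if username is not None and password is not None:
        kwargs["auth"] = (username, password)
    return kwargs


print(build_launch_kwargs())
# {'share': False}
print(build_launch_kwargs(username="demo", password="change-me"))
# {'share': False, 'auth': ('demo', 'change-me')}
```

With a Gradio interface object in hand, the result would be used as interface.launch(**build_launch_kwargs(...)).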
Generation Results
Positive prompt: Trance, Progressive, Rock, EDM
Negative prompt: harsh, loud, chaotic, aggressive, dissonant, jarring, abrupt, noisy, overpowering, unsettling, atonal, disruptive
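For reference, the same pair of prompts could probably be run without the web UI. The sketch below is based on my understanding of the stable-audio-tools Python API (get_pretrained_model and generate_diffusion_cond). The parameter names, the 30-second length, and the step and cfg_scale values are assumptions to verify against the version you installed, and make_conditioning and generate are helper names I made up. The heavy generate function is defined but not called here, since it downloads the model.

```python
def make_conditioning(prompt, seconds_total, seconds_start=0):
    """Build a one-item conditioning list: the text prompt plus timing info."""
    return [{
        "prompt": prompt,
        "seconds_start": seconds_start,
        "seconds_total": seconds_total,
    }]


def generate(positive, negative, seconds_total=30, steps=100, cfg_scale=7.0):
    """Generate audio from a positive and a negative prompt (assumed API usage)."""
    # Lazy imports: these need torch and stable-audio-tools installed.
    import torch
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
    model = model.to(device)

    return generate_diffusion_cond(
        model,
        steps=steps,
        cfg_scale=cfg_scale,
        conditioning=make_conditioning(positive, seconds_total),
        negative_conditioning=make_conditioning(negative, seconds_total),
        sample_size=model_config["sample_size"],
        device=device,
    )


positive_prompt = "Trance, Progressive, Rock, EDM"
negative_prompt = "harsh, loud, chaotic, aggressive, dissonant, jarring"
print(make_conditioning(positive_prompt, 30))
print(make_conditioning(negative_prompt, 30))
```

Calling generate(positive_prompt, negative_prompt) should return the generated audio as a tensor, which could then be saved with a library such as torchaudio.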
Conclusion
So, how was it? I’m not very familiar with music, but to my amateur ears the result had a melody close to what I had imagined.
The training scripts are also published in stable-audio-tools, so if you want to generate melodies similar to a specific song, you might want to try them as well. (That would probably be fine-tuning, so expect to need a lot of memory and training data.)
Personally, I’m satisfied that generation worked, so I’ll stop here for now.









