Guide
How to write a music prompt that works
The anatomy of a prompt that gets you the track you hear in your head, and the common mistakes that get you generic output.
Updated 2026-01-24
Writing a music prompt is a skill, and it is mostly about being specific. A generator can only act on what you name, so the difference between a forgettable result and one that matches your idea is almost always the difference between a vague prompt and a concrete one.
This guide gives you a repeatable structure, the vocabulary that actually moves results, and a clear sense of what to leave out. Once you have written a handful of prompts this way, the structure becomes automatic and you stop thinking about it.
A structure you can reuse
Order your prompt from the most defining detail to the least. A reliable template is: genre, tempo, key and mode, instrumentation, mood and texture, then any production notes. You do not need every slot, and a textural ambient piece can skip tempo and key entirely, but the more of the concrete ones you fill, the closer the output lands. Leading with genre matters because it sets the template the model reasons from; everything after it is a refinement of that starting point.
| Part | Why it matters | Example |
|---|---|---|
| Genre | Sets the whole template the model works from | synthwave, drill, ambient |
| Tempo | Controls energy and feel more than any adjective | ~90 BPM, driving, slow |
| Key and mode | Steers brightness and emotion | A minor, F major |
| Instruments | The literal sounds you want to hear | analog synth, 808s, strings |
| Mood and texture | Colours the performance and mix | nostalgic, warm, vinyl crackle |
| Production | Final polish cues | wide stereo, tape saturation |
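As a rough illustration of the ordering in the table, the template can be treated as a set of slots that are joined from most defining to least. This is only a sketch: the `MusicPrompt` class, its field names and the comma-separated output format are assumptions made for this example, not part of any particular generator.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MusicPrompt:
    """Hypothetical container for the prompt slots, in priority order."""
    genre: str
    tempo_bpm: Optional[int] = None      # optional: ambient pieces can skip it
    key: Optional[str] = None            # optional: e.g. "A minor"
    instruments: List[str] = field(default_factory=list)
    mood_and_texture: List[str] = field(default_factory=list)
    production: List[str] = field(default_factory=list)
    instrumental: bool = True            # say it explicitly unless vocals are wanted

    def render(self) -> str:
        # Genre always leads; every later slot refines that starting point.
        parts = [self.genre]
        if self.tempo_bpm is not None:
            parts.append(f"~{self.tempo_bpm} BPM")
        if self.key:
            parts.append(self.key)
        parts.extend(self.instruments)
        parts.extend(self.mood_and_texture)
        parts.extend(self.production)
        if self.instrumental:
            parts.append("instrumental")
        return ", ".join(parts) + "."


prompt = MusicPrompt(
    genre="Lo-fi hip-hop",
    tempo_bpm=72,
    instruments=["jazzy electric piano", "mellow boom-bap drums", "warm sub bass"],
    mood_and_texture=["vinyl crackle", "nostalgic and relaxed"],
)
print(prompt.render())
```

Empty slots are simply skipped, which mirrors the point that you do not need to fill every one.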
Specifics beat adjectives
The single biggest upgrade you can make is replacing vague feelings with named sounds. "Chill" becomes jazzy Rhodes chords and brushed drums. "Epic" becomes layered strings, taiko drums and a brass swell. "Dreamy" becomes lush reverb, soft synth pads and a slow, floating tempo. The feeling still comes through, but now the model has something concrete to render rather than an abstraction it has to guess at. Adjectives are not banned; they are best used as seasoning on top of the named ingredients, not as a substitute for them.
- Name two or three lead instruments rather than listing ten. A long instrument list reads as noise and dilutes the ones that matter.
- Give a number for tempo when you can, even an approximate one. "Around 90 BPM" beats "mid-tempo" every time.
- Say "instrumental" explicitly unless you want vocals, or you may get an unrequested vocal line.
- Add one or two texture words (warm, gritty, glossy) rather than five, which start to cancel each other out.
- Put the emotional payload in the mood, not the genre. The genre defines the sound; the mood bends it.
Build texture vocabulary
Texture is where prompts come alive, and most people have a smaller vocabulary for it than they think. These words map onto real production choices a generator can act on, so reach for them when a result sounds technically correct but emotionally flat.
| Word | What it implies in the mix |
|---|---|
| Warm | Rolled-off highs, analog feel, gentle low-mid body |
| Gritty | Distortion, saturation, raw edges left in |
| Glossy | Bright, polished, modern pop sheen |
| Lush | Dense layers, rich reverb, wide chords |
| Sparse | Few elements, lots of space, room to breathe |
| Lo-fi | Vinyl crackle, tape wow, filtered highs |
| Cinematic | Big dynamics, wide stereo, orchestral depth |
Before and after
Before (weak prompt)
a chill beat that sounds cool and modern
After (strong prompt)
Lo-fi hip-hop, ~72 BPM, jazzy electric piano, mellow boom-bap drums, warm sub bass, vinyl crackle, nostalgic and relaxed, instrumental.
Nothing in the after version is fancy. It just answers the questions the before version left open: what genre, how fast, which instruments, what mood. That is the whole game. Read both aloud and you can hear how much more the second one tells a generator to do.
Let the tools do the heavy lifting
If naming instruments and tempo feels hard, reverse a reference track to get them measured automatically, or paste a rough idea into the enhancer and it will fill in the missing dimensions for you. Writing prompts well is a skill worth having, but you do not have to start from a blank page.
Common mistakes
- Stacking too many genres, which pulls the model in opposite directions and produces a muddy average of all of them.
- Describing the scene instead of the sound. A rainy city street is not a sound; rain ambience and minor-key piano are.
- Forgetting to say "instrumental" and getting unexpected vocals over an otherwise good track.
- Writing one long run-on sentence with no clear structure, so no single element stands out as the priority.
- Contradicting yourself, for example asking for both "minimal" and "richly layered", which leaves the model to pick one at random.
- Over-describing texture until the adjectives blur. Two strong texture words beat six competing ones.
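Some of these mistakes are mechanical enough to check before you generate. The sketch below is a hypothetical `lint_prompt` helper, not a feature of any tool: the word lists are small assumptions chosen for illustration (the texture words come from the vocabulary table earlier in this guide), and simple substring matching will miss plenty of real cases.

```python
import re
from typing import List

# Assumed word lists for illustration only; extend them for real use.
TEXTURE_WORDS = {"warm", "gritty", "glossy", "lush", "sparse", "lo-fi", "cinematic"}
CONTRADICTIONS = [("minimal", "richly layered")]
VOCAL_HINTS = ("instrumental", "vocal", "singer", "lyrics")
MAX_TEXTURE_WORDS = 2


def lint_prompt(prompt: str) -> List[str]:
    """Return a list of warnings for a few of the common mistakes."""
    text = prompt.lower()
    warnings = []

    # Mistake: never saying whether the track is instrumental.
    if not any(hint in text for hint in VOCAL_HINTS):
        warnings.append("vocals unspecified: say 'instrumental' or describe the vocals")

    # Mistake: asking for two things that cannot both be true.
    for first, second in CONTRADICTIONS:
        if first in text and second in text:
            warnings.append(f"contradiction: '{first}' and '{second}'")

    # Mistake: piling on texture adjectives until they blur.
    words = set(re.findall(r"[a-z]+(?:-[a-z]+)*", text))
    textures = sorted(words & TEXTURE_WORDS)
    if len(textures) > MAX_TEXTURE_WORDS:
        warnings.append("too many texture words: " + ", ".join(textures))

    return warnings


for warning in lint_prompt("minimal, richly layered, warm gritty glossy techno"):
    print(warning)
```

A check like this cannot judge whether a prompt sounds good; it only flags the slips that are easy to make and easy to spot.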
Iterate one variable at a time
When a result is close but not right, change exactly one thing and regenerate. Drop the tempo by ten BPM, or swap one instrument, or flip the key to minor. If you change three things at once and the result improves, you have learned nothing you can reuse. Treat each generation as an experiment with a single variable and you build real intuition for how the model responds, which is worth far more than any list of magic words.
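One way to keep yourself honest about this is to generate the candidate prompts up front, each differing from the base in exactly one slot. The following is a minimal sketch; `single_change_variants`, the slot names and the example values are all assumptions made for illustration.

```python
from typing import Dict, List


def single_change_variants(
    base: Dict[str, str], alternatives: Dict[str, List[str]]
) -> List[Dict[str, str]]:
    """Build variants of `base` that each change exactly one slot."""
    variants = []
    for slot, options in alternatives.items():
        for option in options:
            if option == base.get(slot):
                continue  # identical to the base, so it would teach us nothing
            variant = dict(base)   # copy so the base stays untouched
            variant[slot] = option
            variants.append(variant)
    return variants


base = {
    "genre": "lo-fi hip-hop",
    "tempo": "~72 BPM",
    "key": "A major",
    "lead": "jazzy electric piano",
}
alternatives = {
    "tempo": ["~62 BPM"],          # drop the tempo by ten BPM
    "key": ["A minor"],            # flip the key to minor
    "lead": ["nylon guitar"],      # swap one instrument
}

for variant in single_change_variants(base, alternatives):
    print(", ".join(variant.values()))
```

Because every variant differs from the base in a single slot, any change you hear can be attributed to that slot.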
Frequently asked questions
- How long should a music prompt be?
  One or two focused sentences are usually ideal: long enough to cover genre, tempo, instruments and mood, short enough that no part contradicts another. Past a certain length, extra words tend to dilute the important ones rather than add detail.
- Should I include a key?
  Include it when emotion matters, since minor keys read darker and major keys brighter. For loop-style or textural music you can leave it out, because the harmonic centre is doing less emotional work there.
- What if I do not know the genre?
  Describe the instruments and tempo instead. Genre is just shorthand for a set of sounds, so naming the sounds works just as well. You can also reverse a reference track and let the tool suggest the genre for you.
- Why do I keep getting generic results?
  Almost always because the prompt is too vague or too crowded. Vague prompts give the model nothing to commit to; crowded ones give it too many conflicting instructions. Strip back to a clear genre, a tempo and two or three named instruments, and the results become specific again.
- Do the same prompts work across different generators?
  The structural parts transfer well: genre, tempo, key and instruments mean the same thing everywhere. Formatting conventions and how each tool handles vocals differ, so expect to adjust the wrapping rather than the substance.