Music to Prompt

Guide

How to write a music prompt that works

The anatomy of a prompt that gets you the track you hear in your head, and the common mistakes that get you generic output.

Updated 2026-01-24

Writing a music prompt is a skill, and it is mostly about being specific. A generator can only act on what you name, so the difference between a forgettable result and one that matches your idea is almost always the difference between a vague prompt and a concrete one.

This guide gives you a repeatable structure, the vocabulary that actually moves results, and a clear sense of what to leave out. Once you have written a handful this way, the structure becomes automatic and you stop thinking about it at all.

A structure you can reuse

Order your prompt from the most defining detail to the least. A reliable template is: genre, tempo, key and mode, instrumentation, mood and texture, then any production notes. You do not need every slot, and a textural ambient piece can skip tempo and key entirely, but the more of the concrete ones you fill, the closer the output lands. Leading with genre matters because it sets the template the model reasons from; everything after it is a refinement of that starting point.

What each part does
| Part | Why it matters | Example |
| --- | --- | --- |
| Genre | Sets the whole template the model works from | synthwave, drill, ambient |
| Tempo | Controls energy and feel more than any adjective | ~90 BPM, driving, slow |
| Key and mode | Steers brightness and emotion | A minor, F major |
| Instruments | The literal sounds you want to hear | analog synth, 808s, strings |
| Mood and texture | Colours the performance and mix | nostalgic, warm, vinyl crackle |
| Production | Final polish cues | wide stereo, tape saturation |
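
The slot order above can be sketched in a few lines of Python. This is only an illustration, not part of any generator's API: the `PromptParts` class and `build_prompt` function are hypothetical names, and the comma-separated output format is an assumption modelled on the example prompts in this guide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PromptParts:
    """Hypothetical container for the slots in the table above."""
    genre: str
    tempo: Optional[str] = None   # e.g. "~90 BPM"
    key: Optional[str] = None     # e.g. "A minor"
    instruments: List[str] = field(default_factory=list)
    mood_texture: List[str] = field(default_factory=list)
    production: List[str] = field(default_factory=list)


def build_prompt(parts: PromptParts) -> str:
    """Join the filled slots in order, most defining detail first."""
    ordered = [parts.genre, parts.tempo, parts.key]
    ordered += parts.instruments + parts.mood_texture + parts.production
    # Empty slots are skipped: not every prompt needs tempo or key.
    return ", ".join(p for p in ordered if p) + "."


# A textural ambient piece that leaves tempo and key out entirely.
ambient = PromptParts(
    genre="ambient",
    instruments=["soft synth pads"],
    mood_texture=["warm", "sparse"],
    production=["wide stereo"],
)
print(build_prompt(ambient))
```

Keeping the slots as separate fields rather than one string makes it easy to see which ones are still empty before you generate.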

Specifics beat adjectives

The single biggest upgrade you can make is replacing vague feelings with named sounds. "Chill" becomes "jazzy Rhodes chords and brushed drums". "Epic" becomes "layered strings, taiko drums and a brass swell". "Dreamy" becomes "lush reverb, soft synth pads and a slow, floating tempo". The feeling still comes through, but now the model has something concrete to render rather than an abstraction it has to guess at. Adjectives are not banned; they are best used as seasoning on top of the named ingredients, not as a substitute for them.

  • Name two or three lead instruments rather than listing ten. A long instrument list reads as noise and dilutes the ones that matter.
  • Give a number for tempo when you can, even an approximate one. "Around 90 BPM" beats "mid-tempo" every time.
  • Say "instrumental" explicitly unless you want vocals, or you may get an unrequested vocal line.
  • Add one or two texture words (warm, gritty, glossy) rather than five, which start to cancel each other out.
  • Put the emotional payload in the mood, not the genre. The genre defines the sound; the mood bends it.
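
The checklist above is mechanical enough to sketch as a small pre-flight check. The function name `check_prompt_parts` and its warning strings are hypothetical; the thresholds (three instruments, two texture words) are taken from the bullets, and the vocals test is a deliberately crude substring match.

```python
import re
from typing import List


def check_prompt_parts(instruments: List[str], tempo: str,
                       textures: List[str], prompt_text: str) -> List[str]:
    """Return a warning for each checklist item the prompt misses."""
    warnings = []
    if len(instruments) > 3:
        warnings.append("more than three instruments named")
    # "mid-tempo" has no digit; "~90 BPM" does.
    if not re.search(r"\d", tempo):
        warnings.append("tempo has no number")
    if len(textures) > 2:
        warnings.append("more than two texture words")
    lowered = prompt_text.lower()
    # Assumption: any mention of "vocal" counts as a deliberate choice.
    if "instrumental" not in lowered and "vocal" not in lowered:
        warnings.append("neither 'instrumental' nor vocals mentioned")
    return warnings


for warning in check_prompt_parts(
    instruments=["piano", "strings", "brass", "choir pads"],
    tempo="mid-tempo",
    textures=["warm", "gritty", "glossy"],
    prompt_text="a chill beat that sounds cool and modern",
):
    print("warning:", warning)
```

An empty list means the prompt passes all four checks; it says nothing about whether the prompt is musically interesting.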

Build texture vocabulary

Texture is where prompts come alive, and most people have a smaller vocabulary for it than they think. These words map onto real production choices a generator can act on, so reach for them when a result sounds technically correct but emotionally flat.

Texture words and what they imply
| Word | What it implies in the mix |
| --- | --- |
| Warm | Rolled-off highs, analog feel, gentle low-mid body |
| Gritty | Distortion, saturation, raw edges left in |
| Glossy | Bright, polished, modern pop sheen |
| Lush | Dense layers, rich reverb, wide chords |
| Sparse | Few elements, lots of space, room to breathe |
| Lo-fi | Vinyl crackle, tape wow, filtered highs |
| Cinematic | Big dynamics, wide stereo, orchestral depth |

Before and after

Before (weak prompt)

a chill beat that sounds cool and modern

After (strong prompt)

Lo-fi hip-hop, ~72 BPM, jazzy electric piano, mellow boom-bap drums, warm sub bass, vinyl crackle, nostalgic and relaxed, instrumental.

Nothing in the after version is fancy. It just answers the questions the before version left open: what genre, how fast, which instruments, what mood. That is the whole game. Read both aloud and you can hear how much more the second one tells a generator to do.

Let the tools do the heavy lifting

If naming instruments and tempo feels hard, reverse a reference track to get them measured automatically, or paste a rough idea into the enhancer and it will fill in the missing dimensions for you. Writing prompts well is a skill worth having, but you do not have to start from a blank page.

Common mistakes

  • Stacking too many genres, which pulls the model in opposite directions and produces a muddy average of all of them.
  • Describing the scene instead of the sound. A rainy city street is not a sound; rain ambience and minor-key piano are.
  • Forgetting to say "instrumental" and getting unexpected vocals over an otherwise good track.
  • Writing one long run-on sentence with no clear structure, so no single element stands out as the priority.
  • Contradicting yourself, for example asking for both "minimal" and "richly layered", which leaves the model to pick one at random.
  • Over-describing texture until the adjectives blur. Two strong texture words beat six competing ones.
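
The contradiction mistake is the easiest one to catch before you generate. A minimal sketch, assuming you keep your descriptors as a list of strings: `OPPOSING_PAIRS` and `find_contradictions` are hypothetical names, and apart from "minimal" versus "richly layered", the pairs are illustrative guesses based on the texture table, not an authoritative list.

```python
from typing import List, Tuple

# Hypothetical pairs of descriptors that pull in opposite directions.
OPPOSING_PAIRS: List[Tuple[str, str]] = [
    ("minimal", "richly layered"),  # the example from the list above
    ("sparse", "lush"),             # assumption: few elements vs dense layers
    ("gritty", "glossy"),           # assumption: raw edges vs polished sheen
]


def find_contradictions(descriptors: List[str]) -> List[Tuple[str, str]]:
    """Return every opposing pair where both words appear in the prompt."""
    present = {d.lower() for d in descriptors}
    return [
        pair for pair in OPPOSING_PAIRS
        if pair[0] in present and pair[1] in present
    ]


print(find_contradictions(["Minimal", "richly layered", "warm"]))
```

Extend the pair list with whatever clashes you notice in your own results; the check is only as good as the pairs you give it.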

Iterate one variable at a time

When a result is close but not right, change exactly one thing and regenerate. Drop the tempo by ten BPM, or swap one instrument, or flip the key to minor. If you change three things at once and the result improves, you have learned nothing you can reuse. Treat each generation as an experiment with a single variable and you build real intuition for how the model responds, which is worth far more than any list of magic words.
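
One way to keep yourself honest about single-variable changes is to generate the variants programmatically. This is a sketch under stated assumptions: the prompt is held as a dictionary of slots, `single_variable_variants` is a hypothetical helper, and the slot values (including the "A major" key and the "nylon guitar" swap) are made up for illustration.

```python
from typing import Dict, List


def single_variable_variants(base: Dict[str, str],
                             changes: Dict[str, str]) -> List[Dict[str, str]]:
    """Build one variant per change, each differing from base in one slot."""
    variants = []
    for slot, new_value in changes.items():
        variant = dict(base)       # copy, so the base prompt stays untouched
        variant[slot] = new_value  # change exactly one slot
        variants.append(variant)
    return variants


base = {
    "genre": "lo-fi hip-hop",
    "tempo": "~72 BPM",
    "key": "A major",
    "lead": "jazzy electric piano",
}
changes = {
    "tempo": "~62 BPM",      # drop the tempo by ten BPM
    "key": "A minor",        # flip the key to minor
    "lead": "nylon guitar",  # swap one instrument
}

for variant in single_variable_variants(base, changes):
    changed = [slot for slot in base if base[slot] != variant[slot]]
    print(changed, "->", ", ".join(variant.values()))
```

Because each variant differs from the base in exactly one slot, any change you hear in the result can be attributed to that slot.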

Frequently asked questions

How long should a music prompt be?
One or two focused sentences is usually ideal: long enough to cover genre, tempo, instruments and mood, short enough that nothing contradicts itself. Past a certain length, extra words tend to dilute the important ones rather than add detail.
Should I include a key?
Include it when emotion matters, since minor keys read darker and major keys brighter. For loop-style or textural music you can leave it out, because the harmonic centre is doing less emotional work there.
What if I do not know the genre?
Describe the instruments and tempo instead. Genre is just shorthand for a set of sounds, so naming the sounds works just as well. You can also reverse a reference track and let the tool suggest the genre for you.
Why do I keep getting generic results?
Almost always because the prompt is too vague or too crowded. Vague prompts give the model nothing to commit to; crowded ones give it too many conflicting instructions. Strip back to a clear genre, a tempo and two or three named instruments and specificity returns.
Do the same prompts work across different generators?
The structural parts transfer well: genre, tempo, key and instruments mean the same thing everywhere. Formatting conventions and how each tool handles vocals differ, so expect to adjust the wrapping rather than the substance.