Layer 1 is the warm room: Tibetan singing bowls, crystal bowls, soft tuning-fork blooms. It is the layer that makes an AmberRoom session feel like a sound bath instead of a science experiment. It now plays via modal synthesis, generated client-side in your browser, at $0 cost.
Earlier versions of AmberRoom listed Layer 1 as playing even though it generated no audio. We fixed that by shipping the modal synthesizer first and deferring the licensed-sample upgrade until Pro revenue justifies the spend. The recipes you see in the inspector are now honest: every layer marked LIVE is actually in your headphones.
What modal synthesis does
A real Tibetan singing bowl, when struck, produces a stack of partials at inharmonic frequencies; that is, the overtones are not integer multiples of the fundamental. A 220 Hz bowl might have its first overtone at ~611 Hz (ratio 2.78), the next at ~1192 Hz (ratio 5.42), and so on. Each partial decays at its own rate: the upper ones fade fast, while the fundamental rings for many seconds. The fundamental also drifts slightly in pitch, which produces the characteristic shimmer or "wobble" that sound healers describe.
We model this directly. AmberRoom's bowl synthesizer (lib/audio/bowls.ts) spawns one Web Audio sine oscillator per partial, with frequencies and decay envelopes measured from spectral analysis of public-domain bowl recordings. A slow LFO (0.6 Hz) wobbles the fundamental by ±1.2 Hz, recreating the shimmer. Result: a bowl strike that sounds — honestly — like a synthesized bowl in a room. Not indistinguishable from a real recording, but credible and pleasant.
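The idea above can be sketched as plain additive synthesis rendered into a buffer, which makes the math visible without needing a browser. This is a minimal sketch, not the contents of lib/audio/bowls.ts: the names `PartialSpec`, `planPartials`, and `renderStrike` are hypothetical, and the gain and decay values are placeholder assumptions rather than the measured ones. Only the ratios, the 0.6 Hz LFO rate, and the ±1.2 Hz depth come from the text.

```typescript
// Sketch of modal (additive) bowl synthesis rendered offline into a buffer.
// All gains and decay times below are illustrative assumptions, not
// AmberRoom's measured values.

interface PartialSpec {
  frequencyHz: number;
  peakGain: number; // relative loudness at the strike
  decaySeconds: number; // time constant of the exponential decay
}

// One partial per ratio; upper partials are assumed quieter and shorter-lived.
function planPartials(
  fundamentalHz: number,
  ratios: number[],
  fundamentalDecaySeconds = 8,
): PartialSpec[] {
  return ratios.map((ratio, i) => ({
    frequencyHz: fundamentalHz * ratio,
    peakGain: 1 / (i + 1),
    decaySeconds: fundamentalDecaySeconds / (i + 1),
  }));
}

// Sum decaying sines. Only the first partial (the fundamental) is wobbled
// by the LFO: 0.6 Hz rate, +/-1.2 Hz depth, as described in the text.
function renderStrike(
  partials: PartialSpec[],
  durationSeconds: number,
  sampleRate = 44100,
  lfoHz = 0.6,
  lfoDepthHz = 1.2,
): Float32Array {
  const length = Math.floor(durationSeconds * sampleRate);
  const out = new Float32Array(length);
  const phases = partials.map(() => 0); // running phase per partial, radians
  const gainSum = partials.reduce((sum, p) => sum + p.peakGain, 0) || 1;

  for (let n = 0; n < length; n++) {
    const t = n / sampleRate;
    let sample = 0;
    partials.forEach((p, i) => {
      const wobble = i === 0 ? lfoDepthHz * Math.sin(2 * Math.PI * lfoHz * t) : 0;
      sample += p.peakGain * Math.exp(-t / p.decaySeconds) * Math.sin(phases[i]);
      phases[i] += (2 * Math.PI * (p.frequencyHz + wobble)) / sampleRate;
    });
    out[n] = sample / gainSum; // keep the sum within [-1, 1]
  }
  return out;
}

// A 220 Hz Tibetan-style strike, two seconds long at a low test sample rate.
const tibetan220 = planPartials(220, [1.0, 2.78, 5.42, 8.93, 13.4]);
const tibetanStrike = renderStrike(tibetan220, 2, 8000);
```

In a browser, each `PartialSpec` would plausibly map to one OscillatorNode feeding a GainNode whose gain is ramped down over `decaySeconds`, with a second low-frequency oscillator connected to the fundamental's frequency parameter.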
Tibetan vs crystal partials
// Tibetan: denser, shorter sustain, prominent overtone about an octave plus a sharp fourth up
[1.00, 2.78, 5.42, 8.93, 13.4]
// Crystal: cleaner, longer sustain, octave + 12th + upper shimmer
[1.00, 2.00, 3.00, 5.04]
Tibetan bowls have that characteristic dissonant warmth: the 2.78× partial sits roughly an octave plus a sharp fourth above the fundamental (about 17.7 semitones), an interval that matches no simple harmonic, and that is the Tibetan signature. Crystal bowls are cleaner and longer-sustained, closer to a pure tone with a glassy shimmer. AmberRoom uses both depending on the recipe: Tibetan leads anxiety; crystal leads sleep, grief, and energy.
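To check where each partial sits relative to the fundamental, a ratio can be converted to equal-tempered semitones with 12 · log2(ratio). The helper names below (`ratioToSemitones`, `describePartials`) are hypothetical and exist only for illustration; the ratio lists are the ones given above.

```typescript
// Convert a frequency ratio to an interval in equal-tempered semitones.
function ratioToSemitones(ratio: number): number {
  return (12 * Math.log(ratio)) / Math.LN2;
}

// Expand a ratio list into absolute frequencies and intervals.
function describePartials(fundamentalHz: number, ratios: number[]) {
  return ratios.map((ratio) => ({
    ratio,
    frequencyHz: fundamentalHz * ratio,
    semitonesAboveFundamental: ratioToSemitones(ratio),
  }));
}

const TIBETAN_RATIOS = [1.0, 2.78, 5.42, 8.93, 13.4];
const CRYSTAL_RATIOS = [1.0, 2.0, 3.0, 5.04];

const tibetanTable = describePartials(220, TIBETAN_RATIOS);
const crystalTable = describePartials(220, CRYSTAL_RATIOS);
```

An octave is 12 semitones, so the crystal bowl's 2.00 and 3.00 partials land on exactly 12 and about 19.02 semitones (an octave and a twelfth). The Tibetan ratios fall between the notes of the scale, which is what makes them sound inharmonic.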
Convolution reverb — the missing piece that made it not sound clinical
A bowl played dry (no reverb) sounds like a tone generator. A bowl played in a meditation hall has 2–3 seconds of reverb tail, and that tail is half of what makes the experience feel sacred.
Web Audio's ConvolverNode applies an impulse response (IR) to any signal that flows through it. We synthesize the IR procedurally: exponentially decaying noise, decorrelated between the left and right channels, with a 2.6-second tail. The bowls flow through the convolver before reaching the speakers. Result: the bowls now sound like they're being played in a stone room.
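A procedural IR of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the contents of lib/audio/reverb.ts: the name `makeImpulseResponse`, the choice to end the tail at -60 dB, and the injectable `random` parameter (handy for repeatable tests) are all hypothetical. Only the 2.6-second tail and the stereo-decorrelated decaying noise come from the text.

```typescript
// Sketch of a procedural reverb impulse response: two channels of
// independent noise under a shared exponential decay envelope.
// The -60 dB end point and the injectable random source are assumptions.

interface StereoImpulse {
  left: Float32Array;
  right: Float32Array;
  sampleRate: number;
}

function makeImpulseResponse(
  tailSeconds = 2.6,
  sampleRate = 44100,
  random: () => number = Math.random,
): StereoImpulse {
  const length = Math.round(tailSeconds * sampleRate);
  const left = new Float32Array(length);
  const right = new Float32Array(length);
  // Envelope falls by 60 dB (a factor of 1000) over the tail.
  const decayRate = Math.log(1000) / tailSeconds;

  for (let n = 0; n < length; n++) {
    const envelope = Math.exp(-decayRate * (n / sampleRate));
    // Separate random draws per channel decorrelate left from right.
    left[n] = (random() * 2 - 1) * envelope;
    right[n] = (random() * 2 - 1) * envelope;
  }
  return { left, right, sampleRate };
}

// In a browser the arrays could be loaded into a ConvolverNode like so
// (ctx is an AudioContext, convolver comes from ctx.createConvolver()):
//   const ir = makeImpulseResponse(2.6, ctx.sampleRate);
//   const buffer = ctx.createBuffer(2, ir.left.length, ir.sampleRate);
//   buffer.copyToChannel(ir.left, 0);
//   buffer.copyToChannel(ir.right, 1);
//   convolver.buffer = buffer;
```

Generating the IR at the context's own sample rate avoids a mismatch when the buffer is assigned to the convolver.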
The reverb is shared across all layers (binaural + noise + bowls all flow through the same convolution chain), which means the entire mix has a coherent spatial signature. You're not hearing dry oscillators sitting on top of dry noise; you're hearing a mix in a space.
How bowls are scheduled
Real bowls in a sound bath aren't continuous — practitioners strike each bowl every 30–90 seconds, letting the previous one ring down. AmberRoom's BowlSection does the same: a rotation of bowls (low / mid / high), struck on a slow interval with subtle jitter so it feels human-paced, not metronomic.
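The rotation-with-jitter idea can be sketched as a small planner that walks through the bowls in order and nudges each interval by a random fraction. This is a hedged illustration, not BowlSection itself: `jitteredInterval`, `planStrikes`, and the 15% default jitter are assumptions made for the example.

```typescript
// Sketch of a jittered strike scheduler over a bowl rotation.
// jitterFraction and the function names are illustrative assumptions.

interface ScheduledStrike {
  timeSeconds: number;
  bowlHz: number;
}

// Base interval nudged by up to +/- jitterFraction of itself.
function jitteredInterval(
  baseSeconds: number,
  jitterFraction: number,
  random: () => number = Math.random,
): number {
  const offset = (random() * 2 - 1) * jitterFraction;
  return baseSeconds * (1 + offset);
}

// Walk the rotation in order, spacing strikes by jittered intervals.
function planStrikes(
  rotationHz: number[],
  baseSeconds: number,
  sessionSeconds: number,
  jitterFraction = 0.15,
  random: () => number = Math.random,
): ScheduledStrike[] {
  const strikes: ScheduledStrike[] = [];
  // Guard against empty rotations and intervals that could stall the loop.
  if (rotationHz.length === 0 || baseSeconds <= 0 || jitterFraction >= 1) {
    return strikes;
  }
  let time = 0;
  let index = 0;
  while (time < sessionSeconds) {
    strikes.push({ timeSeconds: time, bowlHz: rotationHz[index % rotationHz.length] });
    time += jitteredInterval(baseSeconds, jitterFraction, random);
    index++;
  }
  return strikes;
}

// Low / mid / high rotation at a ~35 s interval over a five-minute session.
const anxietyPlan = planStrikes([110, 165, 220], 35, 300);
```

Planning the whole session up front keeps the timing logic separate from playback; a live scheduler could instead compute only the next interval after each strike.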
Per-intent rotations:
- Anxiety: low/mid/high Tibetan (110, 165, 220 Hz), struck every ~35s. Velocity 0.5 — not too forceful.
- Sleep: two crystal bowls (174 Hz solfeggio, 261 Hz / C4), struck every ~60s. Soft velocity 0.35 — won't wake you.
- Grief: single 396 Hz crystal bowl ("release" solfeggio frequency), struck every ~50s. Held tone, no rotation — the recipe is the staying.
- Meditation: three-bowl stack mixing crystal + Tibetan, every ~40s.
- Energy: brighter crystal bowls (528 Hz "transformation", 396 Hz), struck more frequently (~30s).
- Pain: low Tibetan-leaning gong-like bowls (80, 120 Hz), every ~45s.
- Focus: intentionally no bowls — too rich for sustained work attention.
- Tinnitus: intentionally no bowls — would clash with notch-filtered masking.
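The rotations above could be expressed as a typed lookup table. The shape below is a sketch, not the actual orchestrator config: `BowlRecipe`, `BOWL_RECIPES`, and `hasBowls` are hypothetical names. Fields are left out wherever the list does not state a value (for example, the meditation bowl frequencies and the velocity for most intents).

```typescript
type BowlKind = "tibetan" | "crystal" | "mixed";

interface BowlRecipe {
  kind: BowlKind;
  intervalSeconds: number; // approximate time between strikes
  bowlsHz?: number[]; // omitted where the list gives no frequencies
  velocity?: number; // omitted where the list gives no velocity
}

type Intent =
  | "anxiety"
  | "sleep"
  | "grief"
  | "meditation"
  | "energy"
  | "pain"
  | "focus"
  | "tinnitus";

// null means the intent intentionally plays no bowls.
const BOWL_RECIPES: Record<Intent, BowlRecipe | null> = {
  anxiety: { kind: "tibetan", bowlsHz: [110, 165, 220], intervalSeconds: 35, velocity: 0.5 },
  sleep: { kind: "crystal", bowlsHz: [174, 261], intervalSeconds: 60, velocity: 0.35 },
  grief: { kind: "crystal", bowlsHz: [396], intervalSeconds: 50 },
  meditation: { kind: "mixed", intervalSeconds: 40 },
  energy: { kind: "crystal", bowlsHz: [528, 396], intervalSeconds: 30 },
  // "Tibetan-leaning gong-like" is recorded here simply as tibetan.
  pain: { kind: "tibetan", bowlsHz: [80, 120], intervalSeconds: 45 },
  focus: null,
  tinnitus: null,
};

function hasBowls(intent: Intent): boolean {
  return BOWL_RECIPES[intent] !== null;
}
```

Using `null` for focus and tinnitus makes "no bowls" an explicit decision in the table rather than a missing entry.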
Tradeoffs we're being honest about
- A practitioner will hear it's synthesized. Modal synthesis captures the spectral fingerprint and decay character of a real bowl, but not the strike texture (mallet against metal) or the minute imperfections of a hand-hammered instrument. Trained ears will notice. Most users on AirPods or Sony headphones won't.
- The Tibetan bowl cortisol study used real bowls. We can claim mechanism-alignment with the study (the same partial structure, the same frequency ranges) but not full study-fidelity. When Pro revenue justifies the upgrade to a real licensed sample pack ($150–500 one-time, e.g. Soniccouture), we'll swap to the recordings.
- Vibrotactile is not delivered. A bowl played live in a room produces low-frequency pressure waves you feel through your body. Headphones can't deliver this. That's a fundamental limit of digital sound therapy — solved only with a subwoofer or wearable haptics (V3+).
Why this approach was the right V1 call
The honest comparison:
- Modal synthesis (what we shipped): $0 cost, ~3 weeks of audio engineering, infinite variation, no dependency on third-party licensing, sounds like a synthesized bowl in a room.
- Soniccouture license: $149 cost, ~1 week of integration, high-quality real recordings, finite variation per pack, sounds like a real bowl in a room.
For V1, the modal-synth path lets us validate traffic and demand without committing budget to assets we might want to swap later (e.g. if grief users prefer crystal over Tibetan, we'd want to reorient the library). For V2 once Pro conversion data exists, swapping to real samples becomes the obvious next investment — the upgrade is non-destructive (the orchestrator code stays the same; just the source of bowl audio changes).
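One way a non-destructive swap like this could be structured is to hide the bowl audio behind a small interface, so the calling code never knows whether a strike was synthesized or sampled. Everything in this sketch is hypothetical (`BowlSource`, `modalSource`, `makeSampleSource`, `strikeWith`), and the modal source is reduced to a single decaying sine for brevity; it is not a description of how lib/audio/orchestrator.ts is actually written.

```typescript
// Hypothetical seam between the orchestrator and the bowl audio.
interface BowlSource {
  label: string;
  renderStrike(bowlHz: number, velocity: number, sampleRate: number): Float32Array;
}

// Stand-in for the modal synthesizer: one decaying sine, one second long.
const modalSource: BowlSource = {
  label: "modal",
  renderStrike(bowlHz, velocity, sampleRate) {
    const out = new Float32Array(sampleRate);
    for (let n = 0; n < out.length; n++) {
      const t = n / sampleRate;
      out[n] = velocity * Math.exp(-t / 0.5) * Math.sin(2 * Math.PI * bowlHz * t);
    }
    return out;
  },
};

// Stand-in for a licensed sample pack: recordings keyed by bowl frequency.
function makeSampleSource(recordings: Record<number, Float32Array>): BowlSource {
  return {
    label: "samples",
    renderStrike(bowlHz, velocity) {
      const recording = recordings[bowlHz];
      if (!recording) throw new Error(`no recording for ${bowlHz} Hz`);
      return recording.map((s) => s * velocity);
    },
  };
}

// Caller code is identical for either source.
function strikeWith(source: BowlSource, bowlHz: number, velocity: number): Float32Array {
  return source.renderStrike(bowlHz, velocity, 8000);
}
```

With a seam like this, moving from synthesis to recordings would mean constructing a different `BowlSource`, while scheduling and per-intent recipes stay untouched.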
Source code
- lib/audio/bowls.ts: modal synthesis + bowl scheduling
- lib/audio/reverb.ts: procedural impulse response + ConvolverNode chain
- lib/audio/orchestrator.ts: wires bowls into the per-intent recipe