Mivia Sign — How it works

01 / 07

Your audio, full length.

A typical 3-minute stereo track at 44.1 kHz is about 7.9 million samples per channel. Mivia Sign treats it as a stream of independent frames: 4,096 samples each, or 93 ms of audio at the engine's 44,100 Hz working rate.

format: WAV · FLAC · AIFF · MP3
channels: mono or stereo
preserves: sample rate · bit depth · container
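
The framing arithmetic is easy to check by hand or in a few lines of Python. This is a minimal sketch that assumes frames are counted at the track's 44.1 kHz rate; the constant and function names are illustrative and not part of Mivia Sign.

```python
SAMPLE_RATE_HZ = 44_100   # assumed rate of the example track
FRAME_LEN = 4_096         # samples per frame


def frame_stats(duration_s: float,
                sample_rate: int = SAMPLE_RATE_HZ,
                frame_len: int = FRAME_LEN) -> dict:
    """Return per-channel sample count, complete frames, and frame duration."""
    samples = int(duration_s * sample_rate)
    return {
        "samples_per_channel": samples,
        "complete_frames": samples // frame_len,
        "frame_ms": 1000 * frame_len / sample_rate,
    }


stats = frame_stats(180)  # a 3-minute track
print(stats)
```

For the 3-minute example this works out to 7,938,000 samples per channel, 1,937 complete frames, and roughly 92.9 ms per frame.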

02 / 07

Zoom in. One frame.

Each frame carries the full watermark payload independently. Every frame is a fresh chance to recover the signature — so even aggressively cropped audio still verifies, as long as one complete frame survives.

frame length: 4,096 samples · 93 ms
payload: full 20-byte envelope, BCH-encoded
redundancy: every frame holds a copy
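
To see why one surviving frame is enough, count the complete frames left after a crop. This sketch assumes frames sit on a fixed grid of 4,096 samples measured from the start of the original file. How the real decoder locates frame boundaries in cropped audio is not described here, so treat the function as illustrative only.

```python
FRAME_LEN = 4_096


def surviving_frames(crop_start: int, crop_end: int,
                     frame_len: int = FRAME_LEN) -> range:
    """Indices of frames lying wholly inside the half-open
    sample range [crop_start, crop_end)."""
    first = -(-crop_start // frame_len)   # ceiling division
    last = crop_end // frame_len          # exclusive upper index
    return range(first, max(first, last))


kept = surviving_frames(10_000, 30_000)
print(list(kept))
```

A crop shorter than one frame, or one that straddles a single boundary without covering a whole frame, returns an empty range.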

03 / 07

Move to the frequency domain.

The frame goes through a discrete cosine transform, splitting it into 2,048 frequency coefficients. Human hearing is most sensitive to mid-range frequencies — so the watermark skips those and hides in the perceptually quiet bands above.

transform: DCT-II · orthonormal
coefficients: 2,048 per frame
target band: above the mid-range, where hearing is less sensitive
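
For readers who want to see the transform itself, here is a direct, unoptimised orthonormal DCT-II and its inverse on a toy 8-sample frame. It is a sketch of the textbook definition, not the engine's code: it returns as many coefficients as input samples, and how the engine arrives at 2,048 coefficients per frame is not modelled.

```python
import math


def dct2_ortho(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def idct2_ortho(c):
    """Inverse of dct2_ortho (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        acc = c[0] * math.sqrt(1 / n)
        acc += sum(c[k] * math.sqrt(2 / n)
                   * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                   for k in range(1, n))
        out.append(acc)
    return out


frame = [0.5, -0.25, 0.125, 0.0, 0.3, -0.6, 0.2, 0.1]  # toy frame
coeffs = dct2_ortho(frame)
restored = idct2_ortho(coeffs)
print(coeffs)
```

Because the transform is orthonormal, the energy of the coefficients equals the energy of the samples, so a small change to one coefficient is an equally small change to the signal. For real frame sizes an FFT-based routine such as `scipy.fft.dct(x, type=2, norm="ortho")` would be the practical choice.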

04 / 07

A keyed subset of bins.

A blake3-derived master key picks which coefficients hold the watermark — roughly 120 bins out of the 2,048. Without the key, even a forensic listener can't find them; with the key, extraction is O(1) per frame.

selection: blake3(platform_key) → bin indices
carriers per frame: ~120
payload bits per frame: 160 (20 bytes × 8)
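
Keyed selection can be sketched as a hash used in counter mode. blake3 is not in Python's standard library, so this example substitutes keyed `hashlib.blake2b` as a stand-in. The band limits (bins 1,024 to 2,047) and the function name are assumptions made for illustration, not Mivia Sign's actual parameters.

```python
import hashlib


def select_bins(platform_key: bytes, n_carriers: int = 120,
                lo: int = 1024, hi: int = 2048) -> list:
    """Derive n_carriers distinct bin indices in [lo, hi) from a key.

    Stand-in construction: keyed BLAKE2b over an incrementing counter.
    The key must be at most 64 bytes (a BLAKE2b limit).
    """
    chosen, seen, counter = [], set(), 0
    while len(chosen) < n_carriers:
        digest = hashlib.blake2b(counter.to_bytes(4, "big"),
                                 key=platform_key, digest_size=4).digest()
        idx = lo + int.from_bytes(digest, "big") % (hi - lo)
        if idx not in seen:            # skip repeats so every bin is unique
            seen.add(idx)
            chosen.append(idx)
        counter += 1
    return chosen


bins = select_bins(b"demo-platform-key")
print(len(bins), bins[:5])
```

The same key always yields the same bins, and a different key yields an unrelated set, which is the property the text relies on.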

05 / 07

Embed one bit. Imperceptibly.

Each carrier coefficient is nudged to the nearest point on a quantisation lattice with spacing δ = 0.002. That step is about −54 dB relative to full scale (20·log10(0.002) ≈ −54), below the noise floor of any practical listening environment and well below the threshold of audibility.

method: keyed QIM (quantisation index modulation)
step size: δ = 0.002
peak distortion: −54 dBFS (inaudible)
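
A one-coefficient version of QIM fits in a few lines. This sketch uses the standard binary form: bit 0 snaps to multiples of δ, and bit 1 snaps to the same lattice shifted by δ/2. That reading of "spacing δ" is an assumption, and the key-dependent part of "keyed QIM" is left out.

```python
import math

DELTA = 0.002  # lattice spacing from the text


def qim_embed(coeff: float, bit: int, delta: float = DELTA) -> float:
    """Move coeff to the nearest point of the lattice assigned to `bit`."""
    offset = bit * delta / 2          # bit 1 uses the lattice shifted by delta/2
    return delta * round((coeff - offset) / delta) + offset


def qim_extract(coeff: float, delta: float = DELTA) -> int:
    """Return the bit whose lattice lies closest to coeff."""
    d0 = abs(coeff - qim_embed(coeff, 0, delta))
    d1 = abs(coeff - qim_embed(coeff, 1, delta))
    return 0 if d0 <= d1 else 1


marked = qim_embed(0.03141, 1)
print(marked, qim_extract(marked))
print(round(20 * math.log10(DELTA), 1))   # step size in dB relative to full scale
```

In this form a coefficient never moves by more than δ/2, and the bit survives any later disturbance smaller than δ/4.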

06 / 07

The envelope.

The 20 bytes embedded into every frame are a magic byte, a format version, a 4-byte issuer fingerprint, an 8-byte content hash pointing at the full manifest in the content-addressed store, a 4-byte random nonce, and a 16-bit CRC. The envelope is BCH-coded into 70 bytes before embedding, so majority-vote recovery tolerates up to 40 bit errors per frame.

byte 0: FB (magic)
byte 1: 01 (version)
bytes 2–5: issuer fingerprint
bytes 6–13: content hash (pointer)
bytes 14–17: nonce
bytes 18–19: CRC-16/CCITT
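
The byte layout above maps directly onto a fixed-size struct. This sketch packs and checks the 20-byte envelope before any BCH coding. Big-endian field order and the CRC starting value of 0xFFFF are assumptions, since "CRC-16/CCITT" names several variants; the function names are illustrative.

```python
import binascii
import os
import struct

MAGIC = 0xFB
VERSION = 0x01
_BODY = ">BB4s8s4s"   # magic, version, issuer, hash, nonce (big-endian assumed)


def pack_envelope(issuer: bytes, content_hash: bytes, nonce=None) -> bytes:
    """Build the 20-byte envelope: 18 body bytes plus a 16-bit CRC."""
    nonce = os.urandom(4) if nonce is None else nonce
    if (len(issuer), len(content_hash), len(nonce)) != (4, 8, 4):
        raise ValueError("issuer, hash, nonce must be 4, 8, 4 bytes")
    body = struct.pack(_BODY, MAGIC, VERSION, issuer, content_hash, nonce)
    crc = binascii.crc_hqx(body, 0xFFFF)   # polynomial 0x1021, assumed init
    return body + struct.pack(">H", crc)


def unpack_envelope(blob: bytes) -> dict:
    """Validate length, CRC, and magic, then return the fields."""
    if len(blob) != 20:
        raise ValueError("envelope must be 20 bytes")
    body = blob[:18]
    (crc,) = struct.unpack(">H", blob[18:])
    if binascii.crc_hqx(body, 0xFFFF) != crc:
        raise ValueError("CRC mismatch")
    magic, version, issuer, content_hash, nonce = struct.unpack(_BODY, body)
    if magic != MAGIC:
        raise ValueError("bad magic byte")
    return {"version": version, "issuer": issuer,
            "content_hash": content_hash, "nonce": nonce}


env = pack_envelope(b"\x01\x02\x03\x04", bytes(range(8)), b"\xaa\xbb\xcc\xdd")
print(env.hex(), unpack_envelope(env))
```

The CRC lets a decoder reject a frame whose recovered bytes are wrong, so a bad frame is skipped and the next one is tried.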

07 / 07

Your file, signed.

The output matches your input in every respect except the imperceptible modifications inside the DCT coefficients. WAV stays WAV at the same sample rate and bit depth. MP3 stays MP3 at the same bitrate with the same frame count. Nothing about the file's identity, structure, or listening experience changes, except that it now carries your signature.

size delta: 0 bytes (MP3) · unchanged (WAV)
perceptual delta: none
cryptographic delta: one 20-byte envelope per frame
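
You can check the WAV half of this claim yourself with Python's standard `wave` module. This is a sketch: the helper names are made up for the example, and the two in-memory files stand in for an original and a signed copy.

```python
import io
import wave


def same_wav_format(original, signed) -> bool:
    """True if two WAV files share channels, sample width, rate, and length.

    Note: `nframes` is the wave module's count of sample frames,
    not the 4,096-sample watermark frames.
    """
    with wave.open(original, "rb") as a, wave.open(signed, "rb") as b:
        pa, pb = a.getparams(), b.getparams()
    keys = ("nchannels", "sampwidth", "framerate", "nframes")
    return all(getattr(pa, k) == getattr(pb, k) for k in keys)


def make_wav(samples: bytes, rate: int = 44_100) -> io.BytesIO:
    """Build a tiny mono 16-bit WAV in memory for the demo."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(samples)
    buf.seek(0)
    return buf


before = make_wav(b"\x00\x00" * 100)
after = make_wav(b"\x01\x00" * 100)   # same shape, slightly different samples
print(same_wav_format(before, after))
```

The function also accepts file paths, so it can be pointed at a real file and its signed counterpart.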

A signature
embedded in the
sound itself.
