Canonicalization

This section explains why JSON needs canonicalization and how MCPTrust makes its output deterministic.


The JSON Problem: JSON is Not Deterministic

Problem Statement: Two semantically identical JSON objects can have different byte representations:

// Representation A
{"name": "foo", "age": 30}
 
// Representation B (same data, different order)
{"age": 30, "name": "foo"}
 
// Representation C (same data, with whitespace)
{ "name": "foo", "age": 30 }

Impact: Hashing these three byte strings yields three different hashes, even though the data is identical. Signature verification then fails whenever the same lockfile is serialized by different tools.
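The effect is easy to reproduce. The sketch below hashes the three representations as raw bytes; SHA-256 and the helper name digest are assumptions chosen for illustration, since this section does not say which hash MCPTrust uses.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digest is a hypothetical helper: hex-encoded SHA-256 of the raw bytes of s.
// SHA-256 is an assumption here, not a statement about MCPTrust internals.
func digest(s string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(s)))
}

func main() {
	a := `{"name": "foo", "age": 30}`   // Representation A
	b := `{"age": 30, "name": "foo"}`   // Representation B (different key order)
	c := `{ "name": "foo", "age": 30 }` // Representation C (extra whitespace)

	// Same data, three different byte strings, three different digests.
	fmt.Println(digest(a))
	fmt.Println(digest(b))
	fmt.Println(digest(c))
}
```

Any byte-oriented hash behaves the same way: it sees bytes, not JSON semantics.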

Canonicalization Defined

Definition: Canonicalization is the process of converting JSON to a single, deterministic byte representation.

Goal: Canonicalize(A) === Canonicalize(B) === Canonicalize(C) for semantically equivalent inputs.

MCPTrust's Canonicalization Schemes: Two Versions

v1 (mcptrust-canon-v1) — Default

  • Rules:
    • Keys sorted by UTF-8 byte order (Go string comparison), which is not strictly alphabetical: "Z" sorts before "a"
    • Compact output (no whitespace)
    • Standard JSON string escaping
    • Numbers preserved as-is from source
  • Use Case: Internal MCPTrust operations, backward compatibility.
  • Example:
    • Input: { "z": 1, "a": 2 }
    • Output: {"a":2,"z":1}

v2 (mcptrust-canon-v2) — Interoperable

  • Rules:
    • Keys sorted by UTF-16 code unit order (per RFC 8785 / JCS)
    • Compact output
    • Go-native number formatting
  • Use Case: Cross-language integrations, new projects.
  • Example:
    • Input: { "é": 1, "e": 2 }
    • v1: {"e":2,"é":1} // UTF-8 order
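    • v2: {"e":2,"é":1} // UTF-16 order (same here; the two orders differ only when a key containing a character above U+FFFF is compared with a key containing a character in U+E000 to U+FFFF)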
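To see a case where the two orderings disagree, the sketch below sorts the same keys both ways. The comparator lessUTF16 is a hypothetical helper written for this example; the keys U+FB01 and U+1F600 are chosen because the second needs a surrogate pair in UTF-16.

```go
package main

import (
	"fmt"
	"sort"
	"unicode/utf16"
)

// lessUTF16 (hypothetical helper) reports whether a sorts before b when both
// are compared as sequences of UTF-16 code units, as RFC 8785 requires.
func lessUTF16(a, b string) bool {
	ua, ub := utf16.Encode([]rune(a)), utf16.Encode([]rune(b))
	for i := 0; i < len(ua) && i < len(ub); i++ {
		if ua[i] != ub[i] {
			return ua[i] < ub[i]
		}
	}
	return len(ua) < len(ub)
}

func main() {
	keys := []string{"\uFB01", "\U0001F600", "e", "\u00e9"}

	// v1-style: plain Go string comparison, i.e. UTF-8 byte order.
	byBytes := append([]string(nil), keys...)
	sort.Strings(byBytes)

	// v2-style: UTF-16 code unit order.
	byUTF16 := append([]string(nil), keys...)
	sort.Slice(byUTF16, func(i, j int) bool {
		return lessUTF16(byUTF16[i], byUTF16[j])
	})

	// Byte order:   e, U+00E9, U+FB01, U+1F600
	// UTF-16 order: e, U+00E9, U+1F600, U+FB01
	fmt.Printf("%q\n", byBytes)
	fmt.Printf("%q\n", byUTF16)
}
```

U+1F600 encodes as the surrogate pair D83D DE00, and 0xD83D is less than 0xFB01, so it moves ahead of U+FB01 under UTF-16 ordering even though its UTF-8 bytes sort later.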

When Canonicalization Happens

  1. Lockfile Creation: When mcptrust lock hashes the input schema, it canonicalizes first.
  2. Signing: When mcptrust sign creates a signature, it signs the canonical form, not the pretty-printed file.
  3. Verification: When mcptrust verify checks a signature, it re-canonicalizes the lockfile and checks the signature against the resulting canonical bytes.

Practical Implications

Key Point: You can edit mcp-lock.json to add whitespace or reorder keys for readability. The signature will still verify because verification uses the canonical form.

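[!WARNING] Do NOT change values. Only formatting changes (whitespace, key order) are safe; any change to a value changes the canonical form and invalidates the signature.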