November 18, 2025

What Is TOON Format? The Compact JSON Alternative for LLM Prompts

As AI models become more accessible and context windows grow larger, the cost of tokens matters more than ever.
Standard JSON is verbose — full of brackets, quotes, and repeated field names that consume tokens without adding meaning.

TOON (Token-Oriented Object Notation) is a compact, human-readable encoding of the JSON data model designed specifically for LLM prompts.
It reduces token usage by ~40% on typical datasets while remaining fully lossless and bidirectional.

This guide explains what TOON is, how it works, when to use it, and how to integrate it into your AI workflows.

What Is TOON?

TOON is a serialization format that encodes the same data model as JSON — objects, arrays, strings, numbers, booleans, and null — but with significantly fewer tokens.

It combines:

  • YAML-like indentation for nested objects
  • CSV-style tables for uniform arrays
  • Explicit metadata for array lengths and field headers

Here's a real example comparing JSON and TOON:

JSON (288 tokens)

{
  "context": {
    "task": "Our favorite hikes together",
    "location": "Boulder",
    "season": "spring_2025"
  },
  "friends": ["ana", "luis", "sam"],
  "hikes": [
    {
      "id": 1,
      "name": "Blue Lake Trail",
      "distanceKm": 7.5,
      "elevationGain": 320,
      "companion": "ana",
      "wasSunny": true
    },
    {
      "id": 2,
      "name": "Ridge Overlook",
      "distanceKm": 9.2,
      "elevationGain": 540,
      "companion": "luis",
      "wasSunny": false
    },
    {
      "id": 3,
      "name": "Wildflower Loop",
      "distanceKm": 5.1,
      "elevationGain": 180,
      "companion": "sam",
      "wasSunny": true
    }
  ]
}

TOON (168 tokens, 42% fewer)

context:
  task: Our favorite hikes together
  location: Boulder
  season: spring_2025
friends[3]: ana,luis,sam
hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
  1,Blue Lake Trail,7.5,320,ana,true
  2,Ridge Overlook,9.2,540,luis,false
  3,Wildflower Loop,5.1,180,sam,true

The data is identical — but TOON eliminates:

  • repeated field names
  • excessive braces and brackets
  • redundant quotes
  • visual noise

How TOON Works

TOON uses three key innovations to achieve token efficiency:

1. Indentation Instead of Braces

Like YAML, TOON uses indentation to represent nesting:

user:
  name: Alice
  settings:
    theme: dark

No { or } needed.

2. Tabular Arrays with Field Headers

For arrays of objects with the same structure, TOON declares fields once:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Instead of:

{
  "users": [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"}
  ]
}

3. Explicit Metadata for LLM Guardrails

TOON includes:

  • [N] — array length declaration
  • {fields} — explicit field headers

This gives models clear structure to follow, improving parsing accuracy.
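
The same metadata can also be checked programmatically. As a rough sketch, the hypothetical helper below (not part of the official library) compares a declared [N] length against the number of rows a model actually emitted, which is handy when validating TOON that comes back from a model:

// Hypothetical helper: verify that each tabular TOON block contains as many
// rows as its [N] declaration promises. Not part of @toon-format/toon.
function checkDeclaredLength(toon: string): boolean {
  const lines = toon.split('\n')
  for (let i = 0; i < lines.length; i++) {
    // Match tabular headers such as `hikes[3]{id,name,...}:`
    const header = lines[i].match(/^(\s*)\w+\[(\d+)\]\{[^}]*\}:\s*$/)
    if (!header) continue
    const indent = header[1].length
    const declared = Number(header[2])
    // Count the contiguous, more deeply indented data rows that follow.
    let rows = 0
    for (let j = i + 1; j < lines.length; j++) {
      const row = lines[j]
      if (row.trim() === '') break
      if (row.length - row.trimStart().length <= indent) break
      rows++
    }
    if (rows !== declared) return false
  }
  return true
}

// Example: the header promises 3 rows, but only 2 were produced.
const truncated = 'users[3]{id,name}:\n  1,Alice\n  2,Bob'
console.log(checkDeclaredLength(truncated)) // false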

In benchmarks across four LLMs, TOON achieved 73.9% accuracy on data-retrieval tasks, versus 69.7% for JSON.

When Should You Use TOON?

TOON excels in specific scenarios:

✅ Use TOON When You Have:

  • Uniform arrays of objects (same fields across items)
  • LLM prompts with large datasets
  • Token-sensitive applications (cost optimization)
  • Tabular or semi-tabular data (logs, analytics, records)

❌ Avoid TOON When You Have:

  • Deeply nested non-uniform structures
  • Pure flat tables (use CSV instead)
  • Systems requiring strict JSON compatibility
  • Data where token count doesn't matter

Real-World Token Savings

Here are token counts from actual benchmarks using GPT-5's tokenizer:

Dataset                       JSON tokens   TOON tokens   Savings
100 employee records          126,860       49,831        60.7%
E-commerce orders (nested)    108,806       72,771        33.1%
Time-series analytics         22,250        9,120         59.0%
GitHub repositories           15,145        8,745         42.3%
Event logs (semi-uniform)     180,176       153,211       15.0%

TOON consistently saves 15-60% depending on data structure.

For uniform tabular data, TOON approaches CSV-level efficiency while preserving full structure.

How to Use TOON in LLM Prompts

1. Wrap TOON in Code Blocks

Always use ```toon code blocks when including TOON in prompts:

```toon
users[3]{id,name,email}:
  1,alice,alice@example.com
  2,bob,bob@example.com
  3,charlie,charlie@example.com
```

This signals the format to the model.

2. Show the Format, Don't Describe It

Models parse TOON naturally when they see examples.
Don't explain the syntax — just show valid TOON.
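
For example, a prompt that asks for TOON output can embed one valid sample instead of a syntax explanation. The task and field names below are purely illustrative:

// One-shot prompt sketch: demonstrate the shape with a valid TOON sample
// rather than describing the syntax. Task and fields are illustrative only.
const prompt = `
Return your answer as TOON, matching the shape of this example:

\`\`\`toon
cities[2]{name,country,population}:
  Tokyo,Japan,13960000
  Berlin,Germany,3645000
\`\`\`

Now list the three most populous cities in France in the same format.
`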

3. Use Tab Delimiters for Maximum Efficiency

The official library supports tab-delimited mode:

users[2]{id,name,role}:
  1	Alice	admin
  2	Bob	user

Tabs tokenize more efficiently than commas in many models.
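
Here is a minimal sketch of producing tab-delimited output with the library. It assumes encode accepts a delimiter option as described in the library documentation, so verify the option name against the current @toon-format/toon API:

import { encode } from '@toon-format/toon'

// Assumption: encode() takes an options object with a `delimiter` setting
// (per the library docs). Check the exact option name in the current release.
const data = {
  users: [
    { id: 1, name: 'Alice', role: 'admin' },
    { id: 2, name: 'Bob', role: 'user' }
  ]
}

const tabDelimited = encode(data, { delimiter: '\t' })
console.log(tabDelimited) // row values separated by tabs instead of commas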

4. Validate Round-Trips Before Production

TOON is lossless and deterministic, but always test a full encode/decode round-trip to verify your data survives conversion:

import { encode, decode } from '@toon-format/toon'

const original = { users: [{ id: 1, name: 'Alice' }] }
const toon = encode(original)
const restored = decode(toon)

console.log(JSON.stringify(original) === JSON.stringify(restored)) // true

Converting Between JSON and TOON

You can convert between formats using the free browser-based converter tools, which run entirely in your browser with no server-side processing.

Example (TypeScript/JavaScript)

import { encode, decode } from '@toon-format/toon'

const data = {
  items: [
    { id: 1, name: 'Item A', price: 10.5 },
    { id: 2, name: 'Item B', price: 20.0 }
  ]
}

const toon = encode(data)
console.log(toon)
// items[2]{id,name,price}:
//   1,Item A,10.5
//   2,Item B,20.0

const json = decode(toon)
console.log(json) // back to the original data structure

TOON vs JSON vs YAML vs CSV

Feature             JSON       TOON        YAML        CSV
Token Efficiency    Baseline   40% fewer   18% fewer   Best (flat only)
Nested Structures   ✅ Yes     ✅ Yes      ✅ Yes      ❌ No
LLM Accuracy        69.7%      73.9%       69.0%       50.5%
Sortable Fields     ❌ No      ✅ Yes      ❌ No       ✅ Yes
Lossless            ✅ Yes     ✅ Yes      ✅ Yes      ❌ No
Human Readable      Medium     High        High        High

TOON balances all these factors better than alternatives for LLM use cases.

Best Practices for Production Use

1. Integrate the Library Directly

For production pipelines, use the official @toon-format/toon library:

npm install @toon-format/toon

2. Pre-Process Data Before Sending to LLMs

Convert JSON to TOON in your backend before including it in prompts:

import { encode } from '@toon-format/toon'

const userData = { users: [{ id: 1, name: 'Alice', plan: 'pro' }] } // any JSON-serializable payload

const prompt = `
Analyze this user data:

\`\`\`toon
${encode(userData)}
\`\`\`

What patterns do you see?
`

3. Cache Encoded Results

TOON encoding is deterministic — cache results to avoid re-encoding identical data.
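
A minimal sketch of such a cache, using an in-memory Map keyed by a hash of the serialized input (any key-value store would work the same way):

import { createHash } from 'node:crypto'
import { encode } from '@toon-format/toon'

// Illustrative in-memory cache: because TOON encoding is deterministic,
// identical inputs can reuse a previously encoded string.
// Note: JSON.stringify is key-order sensitive, so reordered keys miss the cache.
const toonCache = new Map<string, string>()

function encodeCached(data: unknown): string {
  const key = createHash('sha256').update(JSON.stringify(data)).digest('hex')
  const cached = toonCache.get(key)
  if (cached !== undefined) return cached

  const encoded = encode(data)
  toonCache.set(key, encoded)
  return encoded
}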

4. Monitor Token Savings

Track actual token usage before and after switching to TOON to measure ROI.
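
One way to measure this is to run the same payload through a tokenizer before and after conversion. The sketch below uses the gpt-tokenizer npm package as a stand-in; swap in whatever tokenizer matches your target model:

import { encode as encodeToon } from '@toon-format/toon'
import { encode as tokenize } from 'gpt-tokenizer' // stand-in; use your model's tokenizer

// Rough before/after comparison of token counts for the same payload.
function reportSavings(data: unknown): void {
  const jsonTokens = tokenize(JSON.stringify(data, null, 2)).length // formatted JSON, as it would appear in a prompt
  const toonTokens = tokenize(encodeToon(data)).length
  const saved = ((1 - toonTokens / jsonTokens) * 100).toFixed(1)
  console.log(`JSON: ${jsonTokens} tokens, TOON: ${toonTokens} tokens (${saved}% saved)`)
}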

5. Test with Your Specific Models

Different models tokenize differently. Benchmark on your exact use case.

Common Misconceptions

"TOON is just CSV with extra steps"

No. TOON supports:

  • nested objects
  • mixed types
  • arrays of primitives
  • full JSON data model

CSV is flat and lossy. TOON is lossless.

"Models can't parse TOON"

Benchmarks show TOON achieves higher accuracy than JSON on retrieval tasks.
Models parse it naturally when they see examples.

"TOON is only for AI"

TOON is optimized for LLMs, but it's also:

  • human-readable
  • useful for logs
  • good for config files
  • great for data exchange

"It's just YAML"

TOON uses indentation like YAML, but adds:

  • explicit array lengths
  • tabular encoding
  • field headers

These make it more token-efficient and LLM-friendly.

Specification and Implementations

TOON is an open format with:

  • Full specification (v2.0)
  • Conformance test suite
  • Multi-language implementations:
    • TypeScript/JavaScript (official)
    • Python, Go, Rust, .NET (official)
    • 15+ community implementations

Check github.com/toon-format/toon for the latest.

Try TOON Today

The easiest way to experiment with TOON is our free JSON to TOON Converter.

Features:

  • Bidirectional JSON ⇄ TOON conversion
  • No server uploads (runs in browser)
  • Copy-ready output
  • Real-time validation

Paste your JSON, convert to TOON, and see the token savings instantly.

Final Thoughts

TOON solves a real problem: LLM tokens are expensive, and JSON is wasteful.

By combining YAML's readability with CSV's compactness, TOON achieves:

  • ~40% token savings on typical datasets
  • Better LLM parsing accuracy
  • Lossless bidirectional conversion
  • Human-friendly syntax

If you're building AI applications that send structured data to models, TOON is worth testing.

Start with our converter, measure the results, and integrate the library when it makes sense.

Your token bills will thank you.
