As AI models become more accessible and context windows grow larger, the cost of tokens matters more than ever.
Standard JSON is verbose — full of brackets, quotes, and repeated field names that consume tokens without adding meaning.
TOON (Token-Oriented Object Notation) is a compact, human-readable encoding of the JSON data model designed specifically for LLM prompts.
It reduces token usage by ~40% on typical datasets while remaining fully lossless and bidirectional.
This guide explains what TOON is, how it works, when to use it, and how to integrate it into your AI workflows.
What Is TOON?
TOON is a serialization format that encodes the same data model as JSON — objects, arrays, strings, numbers, booleans, and null — but with significantly fewer tokens.
It combines:
- YAML-like indentation for nested objects
- CSV-style tables for uniform arrays
- Explicit metadata for array lengths and field headers
Here's a real example comparing JSON and TOON:
JSON (288 tokens)
```json
{
  "context": {
    "task": "Our favorite hikes together",
    "location": "Boulder",
    "season": "spring_2025"
  },
  "friends": ["ana", "luis", "sam"],
  "hikes": [
    {
      "id": 1,
      "name": "Blue Lake Trail",
      "distanceKm": 7.5,
      "elevationGain": 320,
      "companion": "ana",
      "wasSunny": true
    },
    {
      "id": 2,
      "name": "Ridge Overlook",
      "distanceKm": 9.2,
      "elevationGain": 540,
      "companion": "luis",
      "wasSunny": false
    },
    {
      "id": 3,
      "name": "Wildflower Loop",
      "distanceKm": 5.1,
      "elevationGain": 180,
      "companion": "sam",
      "wasSunny": true
    }
  ]
}
```
TOON (168 tokens, 42% fewer)
```toon
context:
  task: Our favorite hikes together
  location: Boulder
  season: spring_2025
friends[3]: ana,luis,sam
hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
  1,Blue Lake Trail,7.5,320,ana,true
  2,Ridge Overlook,9.2,540,luis,false
  3,Wildflower Loop,5.1,180,sam,true
```
The data is identical — but TOON eliminates:
- repeated field names
- excessive braces and brackets
- redundant quotes
- visual noise
How TOON Works
TOON uses three key innovations to achieve token efficiency:
1. Indentation Instead of Braces
Like YAML, TOON uses indentation to represent nesting:
```toon
user:
  name: Alice
  settings:
    theme: dark
```
No { or } needed.
2. Tabular Arrays with Field Headers
For arrays of objects with the same structure, TOON declares fields once:
```toon
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
```
Instead of:
```json
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}
```
3. Explicit Metadata for LLM Guardrails
TOON includes:
- `[N]`: array length declaration
- `{fields}`: explicit field headers
This gives models clear structure to follow, improving parsing accuracy.
In benchmarks across four LLMs, TOON achieved 73.9% accuracy on data-retrieval tasks, versus 69.7% for JSON.
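These declarations also make model output easy to sanity-check. Below is a minimal TypeScript sketch (the `checkDeclaredLengths` helper is our own illustration, not part of any library) that scans a TOON block and compares each declared `[N]` against the number of rows that follow:

```ts
// Illustrative guardrail check: confirm that each declared array length
// matches the number of data rows beneath its header. Pure string
// processing; `checkDeclaredLengths` is a hypothetical helper.
function checkDeclaredLengths(toon: string): boolean {
  const lines = toon.split('\n')
  for (let i = 0; i < lines.length; i++) {
    // Match tabular headers such as "hikes[3]{id,name,...}:".
    const header = lines[i].match(/^(\s*)\w+\[(\d+)\]\{[^}]*\}:\s*$/)
    if (!header) continue
    const indent = header[1].length
    const declared = Number(header[2])
    // Count the consecutive, more deeply indented rows under the header.
    let rows = 0
    for (let j = i + 1; j < lines.length; j++) {
      if (lines[j].trim() === '' || lines[j].search(/\S/) <= indent) break
      rows++
    }
    if (rows !== declared) return false
  }
  return true
}

const block = [
  'hikes[3]{id,name,distanceKm}:',
  '  1,Blue Lake Trail,7.5',
  '  2,Ridge Overlook,9.2',
  '  3,Wildflower Loop,5.1',
].join('\n')
console.log(checkDeclaredLengths(block)) // true
```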
When Should You Use TOON?
TOON excels in specific scenarios:
✅ Use TOON When You Have:
- Uniform arrays of objects (same fields across items)
- LLM prompts with large datasets
- Token-sensitive applications (cost optimization)
- Tabular or semi-tabular data (logs, analytics, records)
❌ Avoid TOON When You Have:
- Deeply nested non-uniform structures
- Pure flat tables (use CSV instead)
- Systems requiring strict JSON compatibility
- Data where token count doesn't matter
Real-World Token Savings
Here are token counts from actual benchmarks using GPT-5's tokenizer:
| Dataset | JSON | TOON | Savings |
|---|---|---|---|
| 100 employee records | 126,860 | 49,831 | 60.7% |
| E-commerce orders (nested) | 108,806 | 72,771 | 33.1% |
| Time-series analytics | 22,250 | 9,120 | 59.0% |
| GitHub repositories | 15,145 | 8,745 | 42.3% |
| Event logs (semi-uniform) | 180,176 | 153,211 | 15.0% |
TOON consistently saves 15-60% depending on data structure.
For uniform tabular data, TOON approaches CSV-level efficiency while preserving full structure.
How to Use TOON in LLM Prompts
1. Wrap TOON in Code Blocks
Always use ```toon code blocks when including TOON in prompts:
```toon
users[3]{id,name,email}:
  1,alice,alice@example.com
  2,bob,bob@example.com
  3,charlie,charlie@example.com
```
This signals the format to the model.
2. Show the Format, Don't Describe It
Models parse TOON naturally when they see examples.
Don't explain the syntax — just show valid TOON.
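For instance, you might embed a short TOON sample directly in the prompt with no commentary on the syntax; the dataset and field names below are invented purely for illustration:

```ts
// Illustrative prompt: the TOON block speaks for itself, with no
// explanation of the format. Field names and values are made up.
const prompt = `
Here is the order data:

\`\`\`toon
orders[2]{id,total,status}:
  101,49.90,shipped
  102,15.00,pending
\`\`\`

Which orders are still pending?
`
```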
3. Use Tab Delimiters for Maximum Efficiency
The official library supports tab-delimited mode:
```toon
users[2]{id,name,role}:
  1	Alice	admin
  2	Bob	user
```
Tabs tokenize more efficiently than commas in many models.
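To produce tab-delimited TOON programmatically, the official encoder takes an options argument; the sketch below assumes a `delimiter` option, so confirm the exact option name against the version of `@toon-format/toon` you install:

```ts
import { encode } from '@toon-format/toon'

const data = {
  users: [
    { id: 1, name: 'Alice', role: 'admin' },
    { id: 2, name: 'Bob', role: 'user' }
  ]
}

// Assumption: the encoder accepts `{ delimiter: '\t' }` for tab-separated
// rows. Verify against the library's current documentation.
console.log(encode(data, { delimiter: '\t' }))
```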
4. Validate Round-Trips Before Production
TOON is lossless and deterministic, but always test decode mode to verify your data survives conversion:
```ts
import { encode, decode } from '@toon-format/toon'

const original = { users: [{ id: 1, name: 'Alice' }] }
const toon = encode(original)
const restored = decode(toon)

console.log(JSON.stringify(original) === JSON.stringify(restored)) // true
```
Converting Between JSON and TOON
You can convert between formats using:
- The official `@toon-format/toon` library
- Our free JSON to TOON Converter

The converter runs entirely in your browser with no server-side processing.
Example (TypeScript/JavaScript)
```ts
import { encode, decode } from '@toon-format/toon'

const data = {
  items: [
    { id: 1, name: 'Item A', price: 10.5 },
    { id: 2, name: 'Item B', price: 20.0 }
  ]
}

const toon = encode(data)
console.log(toon)
// items[2]{id,name,price}:
//   1,Item A,10.5
//   2,Item B,20.0

const json = decode(toon)
console.log(json) // back to the original data
```
TOON vs JSON vs YAML vs CSV
| Feature | JSON | TOON | YAML | CSV |
|---|---|---|---|---|
| Token Efficiency | Baseline | 40% fewer | 18% fewer | Best (flat only) |
| Nested Structures | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| LLM Accuracy | 69.7% | 73.9% | 69.0% | 50.5% |
| Sortable Fields | ❌ No | ✅ Yes | ❌ No | ✅ Yes |
| Lossless | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| Human Readable | Medium | High | High | High |
TOON balances all these factors better than alternatives for LLM use cases.
Best Practices for Production Use
1. Integrate the Library Directly
For production pipelines, use the official @toon-format/toon library:
```bash
npm install @toon-format/toon
```
2. Pre-Process Data Before Sending to LLMs
Convert JSON to TOON in your backend before including it in prompts:
```ts
const prompt = `
Analyze this user data:
\`\`\`toon
${encode(userData)}
\`\`\`
What patterns do you see?
`
```
3. Cache Encoded Results
TOON encoding is deterministic — cache results to avoid re-encoding identical data.
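A minimal sketch of that idea, using an in-memory `Map` keyed by the serialized input (the cache itself is our illustration, not part of the library):

```ts
import { encode } from '@toon-format/toon'

// Because encoding is deterministic, identical inputs always produce the
// same TOON string, so a simple cache avoids repeated work.
const toonCache = new Map<string, string>()

function encodeCached(data: unknown): string {
  const key = JSON.stringify(data)
  const cached = toonCache.get(key)
  if (cached !== undefined) return cached
  const toon = encode(data)
  toonCache.set(key, toon)
  return toon
}
```

For large payloads you would likely hash the key or cache at a coarser granularity, but the principle is the same.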
4. Monitor Token Savings
Track actual token usage before and after switching to TOON to measure ROI.
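One way to measure this is to tokenize both serializations of the same payload. The sketch below assumes the `js-tiktoken` package with bundled encodings; swap in whatever tokenizer matches the models you actually call:

```ts
import { encode as encodeToon } from '@toon-format/toon'
import { getEncoding } from 'js-tiktoken'

// Assumption: js-tiktoken with bundled encodings; choose the encoding
// that corresponds to your target model.
const enc = getEncoding('o200k_base')

function compareTokenCounts(data: unknown) {
  const jsonTokens = enc.encode(JSON.stringify(data, null, 2)).length
  const toonTokens = enc.encode(encodeToon(data)).length
  return { jsonTokens, toonTokens, savings: 1 - toonTokens / jsonTokens }
}
```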
5. Test with Your Specific Models
Different models tokenize differently. Benchmark on your exact use case.
Common Misconceptions
"TOON is just CSV with extra steps"
No. TOON supports:
- nested objects
- mixed types
- arrays of primitives
- full JSON data model
CSV is flat and lossy. TOON is lossless.
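A quick way to see the difference is to encode something CSV cannot hold: a nested object, a primitive array, and a table in one payload. The expected output in the comments is illustrative, based on the format shown earlier:

```ts
import { encode, decode } from '@toon-format/toon'

// Nested object, primitive array, and tabular rows in one payload;
// flat CSV cannot represent this without losing structure.
const payload = {
  report: { period: 'Q1', finalized: true },
  tags: ['alpha', 'beta'],
  rows: [
    { id: 1, score: 9.5, passed: true },
    { id: 2, score: 4.0, passed: false }
  ]
}

const toon = encode(payload)
console.log(toon)
// Illustrative output:
// report:
//   period: Q1
//   finalized: true
// tags[2]: alpha,beta
// rows[2]{id,score,passed}:
//   1,9.5,true
//   2,4,false

console.log(decode(toon)) // round-trips back to the original structure
```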
"Models can't parse TOON"
Benchmarks show TOON achieves higher accuracy than JSON on retrieval tasks.
Models parse it naturally when they see examples.
"TOON is only for AI"
TOON is optimized for LLMs, but it's also:
- human-readable
- useful for logs
- good for config files
- great for data exchange
"It's just YAML"
TOON uses indentation like YAML, but adds:
- explicit array lengths
- tabular encoding
- field headers
These make it more token-efficient and LLM-friendly.
Specification and Implementations
TOON is an open format with:
- Full specification (v2.0)
- Conformance test suite
- Multi-language implementations:
  - TypeScript/JavaScript (official)
  - Python, Go, Rust, .NET (official)
  - 15+ community implementations
Check github.com/toon-format/toon for the latest.
Try TOON Today
The easiest way to experiment with TOON is our free JSON to TOON Converter.
Features:
- Bidirectional JSON ⇄ TOON conversion
- No server uploads (runs in browser)
- Copy-ready output
- Real-time validation
Paste your JSON, convert to TOON, and see the token savings instantly.
Final Thoughts
TOON solves a real problem: LLM tokens are expensive, and JSON is wasteful.
By combining YAML's readability with CSV's compactness, TOON achieves:
- ~40% token savings on typical datasets
- Better LLM parsing accuracy
- Lossless bidirectional conversion
- Human-friendly syntax
If you're building AI applications that send structured data to models, TOON is worth testing.
Start with our converter, measure the results, and integrate the library when it makes sense.
Your token bills will thank you.