Your ADRs Are Costing You Tokens

ADRs burn through your LLM's context window fast. A lightweight index and targeted rules files give your AI agent the same knowledge without the bloat.

Architecture Decision Records are great documentation. A well-documented repo accumulates dozens of them, and they're valuable for the whole team. But they're also quietly degrading your AI coding assistant's performance.

The Problem: Context Rot

Research from Chroma and Adobe shows that LLM accuracy drops
non-linearly as input grows. Structured text like ADRs can degrade performance more than random content, because the model follows logical threads that aren't relevant to the current task.

A typical project with 40 ADRs weighs in at ~50K tokens. That's 25%
of a 200K context window spent on decisions like "use PrimeNG over
Angular Material", a conclusion that takes 3 words but lives inside
150 lines of stakeholder context.

In our project, the LLM regularly assumed we were still using Cosmos DB as the primary data store. We had switched to Azure SQL mid-project and kept Cosmos only for ingestion. All of that was documented in ADRs, just not in a token-efficient way.

The Fix: Two-Tier Progressive Disclosure

The solution isn't "stop writing ADRs." It's to stop loading them
all into context. Think of it as a budget:

Tier 1: Rules files (.claude/rules/)    → "What to do"  (always loaded)
Tier 2: Full ADRs (docs/adr/)           → "Why we chose it" (on-demand)

Step 1: Create a Lightweight ADR Index

Add a rules file (~400 words, ~1K tokens) that gives the LLM a map
of all decisions without the full rationale:

# Active Architectural Decisions (ADR Index)

Consult the full ADR in `docs/adr/` when you need rationale
or detailed context. This index shows current decisions only.

## Database & Data
- ADR-0021: Hybrid — SQL primary, Cosmos for measurements only
- ADR-0020: Read-Modify-Replace for nested array updates

## Auth & Roles
- ADR-0013: Entra External Identities (not B2C)
- ADR-0038: All roles stored in Azure SQL (not Cosmos)

## Superseded (DO NOT USE — read the replacement instead)
- ~~ADR-0010~~ -> ADR-0021 (database)
- ~~ADR-0016~~ -> ADR-0017 -> ADR-0038 (roles source)

The LLM now knows which ADR to pull in when it encounters a
relevant task, without pre-loading 50K tokens of rationale.

One obvious objection: this index needs updating when you add a new ADR. In practice it's a 1-line addition. If you want to enforce it, a CI check or Claude Code hook that compares the index against docs/adr/ catches drift.
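Such a drift check can be small. Here is a sketch, assuming ADR files are named like 0021-hybrid-database.md, the index references them as "ADR-0021", and the index lives at a path like .claude/rules/adr-index.md (that path is an assumption, adjust to your layout):

```python
# CI drift check: every ADR file on disk must appear in the index.
# Assumes filenames like "0021-hybrid-database.md" and index
# references like "ADR-0021".
import re
import sys
from pathlib import Path

def find_missing_adrs(index_path: str, adr_dir: str) -> set[str]:
    indexed = set(
        re.findall(r"ADR-(\d{4})", Path(index_path).read_text(encoding="utf-8"))
    )
    on_disk = {
        m.group(1)
        for p in Path(adr_dir).glob("*.md")
        if (m := re.match(r"(\d{4})-", p.name))
    }
    return on_disk - indexed

if __name__ == "__main__":
    missing = find_missing_adrs(".claude/rules/adr-index.md", "docs/adr")
    if missing:
        print(f"ADRs missing from index: {sorted(missing)}")
        sys.exit(1)
```

Run it in CI (or from a Claude Code hook) and the index can't silently fall behind.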

Step 2: Extract Actionable Conclusions into Rules

ADRs contain decisions buried in context. Your rules files should
surface the coding-relevant parts:

## Roles (ADR-0038)

- SQL is the single source of truth for all roles
- `SqlUserRepository` manages role storage and retrieval
- Claims enrichment flow: `UserClaimsTransformation.cs`

## Error Responses (ADR-0033)

- All API errors use RFC 7807 ProblemDetails format
- `AddProblemDetails()` registered in `Program.cs`

If you can't code from the rule alone, it's missing information.
If the rule repeats the full ADR, it's too much.

Step 3: Fix Stale Statuses

Superseded ADRs are the most dangerous context for an LLM. A single audit of 40 ADRs turned up three status issues:

  • One ADR still marked proposed despite being superseded months ago
  • One using non-standard approved instead of accepted
  • One missing a partial supersession note

An LLM that reads a superseded ADR before checking its status will
confidently generate code for the old architecture.
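Status audits like the one above are easy to automate. A sketch, assuming each ADR contains a line like "Status: accepted" and that your standard vocabulary is the set below (adjust both to your template):

```python
# Flag ADRs whose status line is missing or non-standard
# (e.g. "approved" instead of "accepted").
import re
from pathlib import Path

STANDARD = {"proposed", "accepted", "superseded", "deprecated", "rejected"}

def audit_statuses(adr_dir: str) -> list[tuple[str, str]]:
    """Return (filename, status) pairs that need attention."""
    issues = []
    for p in sorted(Path(adr_dir).glob("*.md")):
        m = re.search(r"^Status:\s*(\S+)", p.read_text(encoding="utf-8"), re.MULTILINE)
        status = m.group(1).lower() if m else "(missing)"
        if status not in STANDARD:
            issues.append((p.name, status))
    return issues
```

This won't catch a factually wrong status, but it does catch missing and non-standard values, which covered two of the three issues in our audit.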

With ADRs sorted, there's one more source of context bloat worth addressing.

Taming the Rest: .claudeignore

ADRs aren't the only context bloat. A .claudeignore file can
exclude high-token, zero-signal files:

# Auto-generated EF Core migration snapshots (~1.9 MB)
backend/**/Migrations/*.Designer.cs
backend/**/Migrations/*ModelSnapshot.cs

# Generated changelog
CHANGELOG.md

# Lock files
package-lock.json
frontend/package-lock.json

# Historical plan documents from past sessions
docs/plans/

# Instructions for other AI tools
AGENTS.md
GEMINI.md

This cuts ~2.5 MB of content the LLM would never usefully reference.
The key question for each entry: "Would an AI coding assistant ever
need this to write correct code?" If no, ignore it.

What to Look For

Here is a quick checklist for what to exclude and how:

Category              Examples                                           Action
Generated code        Migration snapshots, lock files, compiled output   .claudeignore
Generated docs        Changelogs, auto-generated API docs                .claudeignore
Other AI configs      AGENTS.md, GEMINI.md, .cursorrules                 .claudeignore
Historical artifacts  Past session plans, old design docs                .claudeignore
Build artifacts       obj/, bin/, node_modules/                          .gitignore (already excluded)

The Context Budget

Think of your LLM context like a budget:

Budget Item                  Tokens      Priority
System prompt                ~15K        Fixed
CLAUDE.md + rules            ~4K         Always loaded — keep lean
ADR quick-reference index    ~1K         Always loaded
Individual ADRs              1-4K each   Load only when relevant
Code being worked on         20-80K      Variable
Conversation history         50-100K     Grows over session
Remaining for reasoning      30-100K     This is what matters

Every token you save on always-loaded context is a token the model
can spend on reasoning about your actual code.

TL;DR

  1. ADRs are still worth writing: they're Markdown, LLM-native,
    and the "why" prevents uninformed changes
  2. Don't bulk-load them. Instead create a ~1K token index as a rules file
  3. Extract coding conclusions into rules and keep rationale in ADRs
  4. Audit superseded statuses. Stale ADRs generate wrong code
  5. Use .claudeignore for generated files, lock files, and
    other-tool configs
  6. Measure your context budget. Maximize reasoning space

Further Reading