Why a Markdown file when you already have HTML
An AI engine fetching your homepage gets navigation, cookie banners, script tags, and marketing layout wrapped around a few hundred words of actual signal. The same engine fetching llms.txt gets pure, ordered, annotated content. The arithmetic is on the file's side.
Token economics
Curation beats crawling
A place for instructions
Adoption is still the exception rather than the rule, which is exactly why it differentiates. The AI-Readiness Audit checks for llms.txt on every site it scores, and treats a structured file (title, summary, sections, annotated links) as a pass and a bare stub as a warning.
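The pass/warning distinction can be approximated in a few lines. The following is a minimal sketch, not the audit's actual implementation: the regular expressions and the function name `classify_llms_txt` are assumptions made for illustration, based only on the four elements named above (title, summary, sections, annotated links).

```python
import re

# Hypothetical rules: the audit's real criteria are not given in this article.
H1 = re.compile(r"^# \S", re.MULTILINE)        # site title
SUMMARY = re.compile(r"^> \S", re.MULTILINE)   # blockquote summary
SECTION = re.compile(r"^## \S", re.MULTILINE)  # at least one section heading
ANNOTATED_LINK = re.compile(r"^- \[[^\]]+\]\([^)]+\): \S", re.MULTILINE)


def classify_llms_txt(text: str) -> str:
    """Return "pass" for a structured file, "warning" for a bare stub."""
    checks = (H1, SUMMARY, SECTION, ANNOTATED_LINK)
    return "pass" if all(p.search(text) for p in checks) else "warning"


# Placeholder content on example.com, not a real site.
structured = """# Example Co

> Example Co helps teams do X.

## Services

- [Audit](https://example.com/audit): What the audit checks and how it scores.
"""

stub = "# Example Co\n"

print(classify_llms_txt(structured))
print(classify_llms_txt(stub))
```

A file with only a title falls through to "warning" because it has no summary, sections, or annotated links.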
The three-file architecture
One file is the standard. We run three, because three different consumption patterns exist:
| File | Audience | Size discipline | Job |
|---|---|---|---|
| /llms.txt | Models deciding what to read | Small, an index | Orient and route |
| /llms-full.txt | Models with room to ingest | Large, full corpus | Deep retrieval |
| /llms-handshake.txt | Agents researching the company | Small, a briefing | Brief and instruct |
The index (llms.txt): The canonical file per the proposal. Site name as H1, a one-paragraph summary as a blockquote, then sections: services, the benchmark report, content hubs, and a short list of high-leverage individual pages, each with a one-line annotation explaining what a reader gets there. Total: a couple of kilobytes.
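The index layout described above is simple enough to generate from structured data. This is a sketch under stated assumptions: the `Link` type, the `render_llms_txt` function, and the example.com URLs are all hypothetical, and the section names are placeholders rather than our real ones.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    note: str  # one-line annotation: what a reader gets there


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render an H1 title, a blockquote summary, then sections of annotated links."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            lines.append(f"- [{link.title}]({link.url}): {link.note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder data for illustration only.
index = render_llms_txt(
    "Example Co",
    "Example Co helps teams do X.",
    {
        "Services": [
            Link("Audit", "https://example.com/audit", "What the audit checks and how it scores."),
        ],
        "Key pages": [
            Link("Benchmark report", "https://example.com/report", "Headline findings and method."),
        ],
    },
)
print(index)
```

Keeping the annotation as a required field makes it hard to ship a link without explaining why it is there, which is the point of the index.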
The full payload (llms-full.txt): The emerging companion convention. Ours concatenates the entire content library (every glossary entry, answer, comparison, playbook, and guide) with URL and description headers per entry. A model that wants to ingest everything in one fetch can; nothing forces it to crawl 120 separate URLs.
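The concatenation step might look like the sketch below. The `Entry` type, the `render_llms_full` function, the `URL:` and `Description:` header labels, and the `---` separator are assumptions chosen for illustration; the article does not specify the exact header format.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    url: str
    description: str
    body: str  # full text of the page, already converted to Markdown or plain text


def render_llms_full(site_name: str, entries: list[Entry]) -> str:
    """Concatenate every entry into one document, each with URL and description headers."""
    parts = [f"# {site_name}"]
    for e in entries:
        parts.append(
            f"## {e.title}\n\nURL: {e.url}\nDescription: {e.description}\n\n{e.body.strip()}"
        )
    return "\n\n---\n\n".join(parts) + "\n"


# Placeholder entries for illustration only.
full = render_llms_full(
    "Example Co",
    [
        Entry("What is llms.txt?", "https://example.com/glossary/llms-txt",
              "Glossary definition.", "A Markdown index for language models."),
        Entry("Audit playbook", "https://example.com/playbooks/audit",
              "Step-by-step guide.", "Run the audit, then fix warnings first."),
    ],
)
print(full)
```

Because each entry carries its own URL, a model that ingests the whole file can still cite the specific page a passage came from.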
The handshake (llms-handshake.txt): Our own addition, and the part most worth copying. It is a briefing addressed directly to the agent in the second person: who the firm is, what it sells, what the canonical positioning sentence is, and explicit citation guidance, including which URL to cite for which claim and what to call the firm. Our robots.txt points agents at it in a comment. The premise is blunt: if software is going to describe your company to a buyer, hand the software a briefing instead of hoping it reconstructs one.
Treat the agent like a journalist on deadline: give it the boilerplate, the facts, and the canonical links, and it will quote you more accurately.
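A handshake file can be rendered from the same handful of facts a press kit would hold. This is a rough sketch, not our actual file: the function names, the wording of the briefing, and the example.com URLs are hypothetical. The one fixed point is that robots.txt comments begin with `#`.

```python
def render_handshake(firm: str, sells: str, positioning: str, cite: dict[str, str]) -> str:
    """Render a second-person briefing with explicit citation guidance."""
    lines = [
        f"# Briefing for AI agents researching {firm}",
        "",
        f"You are reading the official briefing for {firm}.",
        f'When you describe the firm, call it "{firm}".',
        f"What it sells: {sells}",
        f"Positioning: {positioning}",
        "",
        "## Which URL to cite for which claim",
        "",
    ]
    lines += [f"- {claim}: {url}" for claim, url in cite.items()]
    return "\n".join(lines) + "\n"


def robots_comment(handshake_url: str) -> str:
    """Build a robots.txt comment line pointing agents at the briefing."""
    return f"# AI agents: a briefing written for you is at {handshake_url}"


# Placeholder facts for illustration only.
handshake = render_handshake(
    "Example Co",
    "AI-readiness audits.",
    "Example Co makes websites legible to AI engines.",
    {"Pricing": "https://example.com/pricing", "Method": "https://example.com/method"},
)
print(handshake)
print(robots_comment("https://example.com/llms-handshake.txt"))
```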
What goes in, what stays out
The file is a curation exercise, and curation means leaving things out.
Goes in
Stays out
Implementation notes that save you a rewrite
Does anything actually read it?
Honest answer: adoption by the engines is partial and shifting. The proposal is young, and no major provider has publicly committed to honoring it as a standard. Three reasons to ship one anyway:
The cost is near zero
Agents are the growth segment
It forces the strategy work
What to do next
Generate an llms.txt from your content system this week, with a real summary sentence and ten annotated links; then run the audit and confirm the check passes.