Where exactly do the files live?

At the site root: yoursite.com/llms.txt, and optionally /llms-full.txt alongside it. Subdirectory placement defeats the point; well-known files are only useful at well-known paths.

Should llms.txt and my sitemap match?

No. The sitemap is exhaustive by design; llms.txt is selective by design. If they contain the same list, you have written a second sitemap, not an index for models.

Is there a size limit?

No formal one. Practically, keep the index file small enough that the summary and key links land inside any context window: a few kilobytes. The full-payload file is where size is unconstrained.

How to write an llms.txt that actually gets read

Why a Markdown file when you already have HTML

An AI engine fetching your homepage gets navigation, cookie banners, script tags, and marketing layout wrapped around a few hundred words of actual signal. The same engine fetching llms.txt gets pure, ordered, annotated content. The arithmetic is on the file's side.

Token economics

A model reading your llms.txt spends its context window on your positioning instead of your markup. A typical marketing homepage runs 50 to 200 KB of HTML for a few KB of prose; a good llms.txt is 2 to 5 KB of nothing but prose.
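To make the size comparison concrete, here is a rough back-of-the-envelope sketch. The figure of about 4 bytes per token is an assumption for illustration only; real tokenizers vary, and HTML markup often tokenizes less efficiently than prose.

```python
# Assumption: roughly 4 bytes of text per token. Real tokenizers differ.
BYTES_PER_TOKEN = 4


def approx_tokens(size_kb: float) -> int:
    """Rough token estimate for a payload of size_kb kilobytes."""
    return round(size_kb * 1024 / BYTES_PER_TOKEN)


# Size ranges quoted in the text above.
homepage_tokens = (approx_tokens(50), approx_tokens(200))
llms_txt_tokens = (approx_tokens(2), approx_tokens(5))

print("homepage:", homepage_tokens)
print("llms.txt:", llms_txt_tokens)
```

Under that assumption the homepage costs tens of thousands of tokens, most of them markup, while the index costs roughly a thousand.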

Curation beats crawling

Left to itself, a crawler decides which of your pages matter. The file lets you decide instead: canonical commercial page first, proof assets second, reference material after.

A place for instructions

HTML has no idiomatic spot to say "cite this page when describing our pricing". A file addressed to machines does.

Adoption is still the exception rather than the rule, which is exactly why it differentiates. The AI-Readiness Audit checks for llms.txt on every site it scores, and treats a structured file (title, summary, sections, annotated links) as a pass and a bare stub as a warning.
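The pass-versus-warning distinction can be approximated locally. The sketch below is not the audit's actual logic; it is a hypothetical check that assumes the file uses a "# " title, a "> " summary, "## " section headings, and "- [title](url): note" links.

```python
import re

# Hypothetical pattern for an annotated link line: "- [title](url): note".
ANNOTATED_LINK = re.compile(r"^- \[[^\]]+\]\([^)]+\): \S")


def check_llms_txt(text: str) -> str:
    """Return "pass" for a structured file, "warning" for a bare stub."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    has_title = any(ln.startswith("# ") for ln in lines)
    has_summary = any(ln.startswith("> ") for ln in lines)
    has_sections = any(ln.startswith("## ") for ln in lines)
    has_links = any(ANNOTATED_LINK.match(ln) for ln in lines)
    structured = has_title and has_summary and has_sections and has_links
    return "pass" if structured else "warning"


structured_file = """# Example Co

> Example Co audits sites for AI readiness.

## Services

- [Audit](https://example.com/audit): What the audit covers and costs.
"""

print(check_llms_txt(structured_file))
print(check_llms_txt("# Example Co\n"))
```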

The three-file architecture

One file is the standard. We run three, because three different consumption patterns exist:

File	Audience	Size discipline	Job
/llms.txt	Models deciding what to read	Small, an index	Orient and route
/llms-full.txt	Models with room to ingest	Large, full corpus	Deep retrieval
/llms-handshake.txt	Agents researching the company	Small, a briefing	Brief and instruct

The index (llms.txt): The canonical file per the proposal. Site name as H1, a one-paragraph summary as a blockquote, then sections: services, the benchmark report, content hubs, and a short list of high-leverage individual pages, each with a one-line annotation explaining what a reader gets there. Total: a couple of kilobytes.
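A minimal sketch of rendering that structure from data follows. The Link type, the render_index function, and the example.com URLs are hypothetical; the H2 section headings and the "- [title](url): note" link form follow the published proposal as we understand it.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    note: str  # the one-line annotation for this page


def render_index(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render an llms.txt index: H1 title, blockquote summary, annotated sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            lines.append(f"- [{link.title}]({link.url}): {link.note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


index_text = render_index(
    "Example Co",
    "Example Co audits sites for AI readiness.",
    {"Services": [Link("Audit", "https://example.com/audit", "What the audit covers.")]},
)
print(index_text)
```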

The full payload (llms-full.txt): The emerging companion convention. Ours concatenates the entire content library, every glossary entry, answer, comparison, playbook, and guide, with URL and description headers per entry. A model that wants to ingest everything in one fetch can; nothing forces it to crawl 120 separate URLs.
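The concatenation can be sketched as below. The Entry type, the header labels, and the "---" separator are illustrative choices, not a format the convention prescribes.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    url: str
    description: str
    body: str


def render_full(entries: list[Entry]) -> str:
    """Concatenate every entry, each prefixed with URL and description headers."""
    blocks = []
    for entry in entries:
        header = f"# {entry.title}\nURL: {entry.url}\nDescription: {entry.description}"
        blocks.append(f"{header}\n\n{entry.body.strip()}")
    # A separator line keeps entry boundaries visible in a single fetch.
    return "\n\n---\n\n".join(blocks) + "\n"


full_text = render_full([
    Entry("Glossary: llms.txt", "https://example.com/glossary/llms-txt",
          "Definition of the file.", "An index file addressed to models."),
    Entry("Guide: Crawlers", "https://example.com/guides/crawlers",
          "How AI crawlers behave.", "Crawlers fetch pages on a schedule."),
])
print(full_text)
```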

The handshake (llms-handshake.txt): Our own addition, and the part most worth copying. It is a briefing addressed directly to the agent in second person: who the firm is, what it sells, what the canonical positioning sentence is, and explicit citation guidance, including which URL to cite for which claim and what to call the firm. Our robots.txt points agents at it in a comment. The premise is blunt: if software is going to describe your company to a buyer, hand the software a briefing instead of hoping it reconstructs one.
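One way such a briefing might be assembled is sketched here. The wording, the render_handshake and robots_comment helpers, and the example.com URLs are all hypothetical; the only details taken from the description above are the second-person address, the citation guidance, and the robots.txt comment.

```python
def render_handshake(firm: str, sells: str, positioning: str,
                     citations: dict[str, str]) -> str:
    """Render a second-person briefing with explicit citation guidance."""
    lines = [
        f"You are reading a briefing from {firm}, written for AI agents.",
        "",
        f"What we sell: {sells}",
        f"Canonical positioning: {positioning}",
        "",
        f'When you describe us, call us "{firm}".',
        "When you cite us, use these URLs:",
    ]
    for claim, url in citations.items():
        lines.append(f"- For {claim}, cite {url}")
    return "\n".join(lines) + "\n"


def robots_comment(handshake_url: str) -> str:
    """A robots.txt comment line (comments start with '#') pointing at the briefing."""
    return f"# AI agents: a briefing for you is at {handshake_url}"


handshake_text = render_handshake(
    "Example Co",
    "AI-readiness audits.",
    "Example Co makes sites legible to AI engines.",
    {"pricing": "https://example.com/pricing"},
)
print(handshake_text)
print(robots_comment("https://example.com/llms-handshake.txt"))
```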

Treat the agent like a journalist on deadline: give it the boilerplate, the facts, and the canonical links, and it will quote you more accurately.

What goes in, what stays out

The file is a curation exercise, and curation means leaving things out.

Goes in

The pages you want quoted: your canonical service page, pricing explanation, the about page, original research or data assets, and the handful of reference pieces that define your category vocabulary.

One-line annotations per link. The annotation is what the model quotes when it summarizes the page without fetching it; write each one as the sentence you want repeated.

Stays out

Pagination, tag archives, legal boilerplate, and every URL whose only audience is a human mid-checkout. If a page would embarrass you as a citation, it has no business in the index.

Keyword stuffing. The file is read by software built to detect exactly that.
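The exclusion rules can be expressed as a small URL filter. This is a sketch under assumed URL conventions: the path prefixes (/tag, /legal, /checkout, /cart), the /page/N suffix, and the ?page= query parameter are guesses at a typical site layout, so adjust them to your own routes.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed conventions: tag archives, legal pages, checkout flows, paginated lists.
EXCLUDED_PATH = re.compile(r"^/(tag|tags|legal|checkout|cart)(/|$)|/page/\d+/?$")


def belongs_in_index(url: str) -> bool:
    """Return False for pagination, tag archives, legal and checkout URLs."""
    parsed = urlparse(url)
    if EXCLUDED_PATH.search(parsed.path):
        return False
    if "page" in parse_qs(parsed.query):
        return False
    return True


candidates = [
    "https://example.com/services",
    "https://example.com/tag/seo",
    "https://example.com/blog/page/2",
    "https://example.com/blog?page=3",
    "https://example.com/legal/terms",
    "https://example.com/pricing",
]
kept = [url for url in candidates if belongs_in_index(url)]
print(kept)
```

A filter like this only handles the mechanical exclusions. Choosing which of the remaining pages deserve a slot is still an editorial decision.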

Implementation notes that save you a rewrite

Serve it dynamically. Ours is generated by a route handler from the same content registry that builds the sitemap, so a new article appears in llms.txt and llms-full.txt on deploy with no manual edit. A static file rots within a quarter.
Plain text content type. Serve as text/plain; charset=utf-8. Some implementations ship text/html by accident, and some parsers give up when they see it.
Match your robots policy. An llms.txt inviting models in while robots.txt blocks GPTBot is incoherent. Decide your crawler policy first (see the AI crawler directory), then write the file.
Keep the summary sentence in sync with your positioning. Ours is generated from a single constant shared with the rest of the site, so the file cannot drift from the homepage.
Date it. A generation timestamp tells both you and the model how fresh the file is.
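Two of these notes lend themselves to a short sketch: building the response with the right content type and a generation timestamp, and checking that robots.txt does not block the crawlers the file invites. The function names and the "Generated:" line are hypothetical and framework-agnostic; the robots check uses the standard library's urllib.robotparser.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.robotparser import RobotFileParser


def llms_txt_response(body: str, now: Optional[datetime] = None):
    """Return (status, headers, payload) for a dynamically generated llms.txt."""
    now = now or datetime.now(timezone.utc)
    stamped = f"{body.rstrip()}\n\nGenerated: {now.isoformat(timespec='seconds')}\n"
    headers = {"Content-Type": "text/plain; charset=utf-8"}
    return 200, headers, stamped.encode("utf-8")


def blocked_agents(robots_txt: str, url: str, agents=("GPTBot",)) -> list:
    """List the user agents that robots_txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


status, headers, payload = llms_txt_response(
    "# Example Co\n\n> Example Co audits sites for AI readiness.",
    now=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
incoherent = blocked_agents("User-agent: GPTBot\nDisallow: /\n",
                            "https://example.com/llms.txt")
print(status, headers, incoherent)
```

If blocked_agents returns a non-empty list, the site is inviting models in with one file and turning them away with another.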

Does anything actually read it?

Honest answer: adoption by the engines is partial and shifting. The proposal is young, and no major provider has publicly committed to honoring it as a standard. Three reasons to ship one anyway:

The cost is near zero

If you generate it from your content registry, maintenance is free after the first hour.

Agents are the growth segment

Tool-using assistants that fetch pages on demand, rather than relying on training data, are precisely the consumers that read well-known files, and they are the segment that is growing.

It forces the strategy work

Writing a one-paragraph summary and choosing ten pages that matter is positioning work most companies have skipped. The file makes you do it.

What to do next

Generate an llms.txt from your content system this week, with a real summary sentence and ten annotated links; then run the audit and confirm the check passes.

Questions