
Search is changing faster than most businesses realise. When someone asks ChatGPT, Perplexity, or Google’s AI Overviews a question about your industry, they are not clicking through ten blue links. They are reading a synthesised answer pulled from sources the AI decided were authoritative, relevant, and easy to understand.
The question is: Does the AI know your website exists? And if it does, is it reading the right pages?
That is the problem llms.txt was designed to solve.
Proposed in September 2024 by AI researcher Jeremy Howard, llms.txt is a plain text file you place at the root of your website that tells large language models (LLMs) what your site is about, which pages matter most, and how to represent your brand accurately in AI-generated responses. Think of it as a curated briefing document for AI systems, written in clean markdown, placed at yourdomain.com/llms.txt.
The core insight: Without llms.txt, AI models crawl your site and make their own decisions about what to surface. They may pull from outdated blog posts, misrepresent your services, or miss your most important pages entirely. With llms.txt, you take control of that narrative.
This guide walks through llms.txt from definition to deployment, including its honest limitations and where it fits in your broader AI SEO strategy.
What this guide covers:
- What llms.txt is and where it came from
- How it differs from robots.txt and sitemap.xml
- Which AI platforms recognise it and how
- Who has already implemented it
- Step-by-step implementation guide with a real example
- Pros, cons, and the honest debate around effectiveness
- How llms.txt fits into your broader GEO strategy
- Common mistakes to avoid
- Frequently asked questions
What is llms.txt and Where Did It Come From?
llms.txt is a markdown-formatted plain text file placed in the root directory of a website, specifically designed to help large language models navigate, understand, and accurately represent that site’s content. It is not an official web standard yet, but it is gaining rapid traction among forward-thinking SEO professionals, developers, and AI-first companies worldwide.
The concept was proposed by Jeremy Howard, co-founder of fast.ai and a prominent AI researcher, in September 2024. Howard’s argument was straightforward: AI systems that crawl websites face the same problem search engine spiders faced in the early days of the web. Without guidance, they waste time on irrelevant pages, misinterpret content structure, and surface outdated or low-quality information in their responses.
His solution was to borrow the concept from robots.txt but flip the intent. Rather than telling crawlers what to avoid, llms.txt tells AI systems what to prioritise.
The Problem llms.txt Solves
When an AI model visits your website without any guidance, it does the following:
- Dumps the entire site into its processing pipeline
- Performs similarity searches and keyword matching across all pages
- Prioritises pages based on its own interpretation of relevance
- May surface a three-year-old blog post instead of your current services page
- Can misattribute, misquote, or misrepresent your brand in generated responses
The result: An AI assistant answering a question about your business might give an answer based on content you would never have chosen to highlight. That is a brand accuracy problem, a competitive visibility problem, and a trust problem all at once.
llms.txt solves this by providing a clean, structured, machine-readable document that says: “Here is who we are. Here are our most important pages. Here is what we want you to know about us.”
The Official Specification
The specification, available at llmstxt.org, defines the following structure for a valid llms.txt file:
- H1 heading: Your website or brand name
- Blockquote: A short, plain-language description of what your site does and who it serves
- Sections (H2): Grouped categories of content (e.g. Services, Blog, Documentation)
- Links with descriptions: Each important page is listed as a markdown link with a brief colon-separated description
- Optional llms-full.txt: A companion file containing the full text of your most important pages in clean markdown
The file lives at yourdomain.com/llms.txt and should be accessible as plain text in any browser.
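To make the structure above concrete, here is a minimal sketch of how a script might read those elements back out of an llms.txt file. The `LlmsTxt` class, the `parse_llms_txt` function, and the regular expression are illustrative names chosen for this example, not part of the specification, and the parser deliberately ignores anything beyond the H1, the blockquote, H2 sections, and link lines.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](url): optional description" (hypothetical helper pattern).
LINK_RE = re.compile(
    r"^[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)

@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # section name -> list of (link title, url, description)
    sections: dict = field(default_factory=dict)

def parse_llms_txt(text: str) -> LlmsTxt:
    """Extract the H1, the blockquote, and the H2 sections with their links."""
    doc = LlmsTxt()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and not doc.summary:
            doc.summary = line[1:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], (match["desc"] or "").strip())
                )
    return doc

SAMPLE = """# Your Brand Name
> A clear description of what the site does.
## Services
- [Service Page Name](https://yourdomain.com/services): Brief description of what this page covers.
## Contact
- [Contact Us](https://yourdomain.com/contact)
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed.title, "|", parsed.summary)
for section, links in parsed.sections.items():
    print(section, links)
```

A parser like this is also a quick way to sanity-check your own file: if the title, summary, or a section comes back empty, the markdown is probably malformed.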
llms.txt vs robots.txt vs sitemap.xml: What Is the Difference?
This is the most common point of confusion. All three files live at your domain root and guide automated systems, but they serve completely different purposes and target different audiences.
| File | Purpose | Targets | What It Does |
|---|---|---|---|
| robots.txt | Access control | Googlebot, Bingbot, all crawlers | Tells crawlers which pages they CAN and CANNOT access |
| sitemap.xml | Discovery | Traditional search engine crawlers | Lists all pages on your site for efficient indexing |
| llms.txt | Context and prioritisation | AI language models (ChatGPT, Claude, Perplexity, Gemini) | Tells AI what your site is about and which content deserves priority |
The Key Distinction
robots.txt and sitemap.xml are about access and discovery for traditional search engines. llms.txt is about context and representation for AI systems.
A useful analogy: robots.txt is the security guard at the door. Sitemap.xml is the building directory. llms.txt is the executive briefing you hand to a VIP visitor before they tour the building.
Important: llms.txt does NOT replace robots.txt or sitemap.xml. All three serve different functions. You need all three to maximise visibility across both traditional search and AI-powered search in 2026.
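One practical consequence of the three files working together: if robots.txt blocks an AI crawler, that crawler may never read your llms.txt at all. The sketch below uses Python's standard `urllib.robotparser` to test this against a sample robots.txt. The bot names are the ones mentioned in this guide, and the `bots_allowed` helper and the sample rules are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from this guide; extend the list as needed.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]

def bots_allowed(robots_txt: str, url: str, bots=AI_BOTS) -> dict:
    """Report, per bot, whether the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}

# Hypothetical robots.txt that blocks GPTBot but allows everyone else.
ROBOTS = """User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

result = bots_allowed(ROBOTS, "https://yourdomain.com/llms.txt")
for bot, allowed in result.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

In this sample, GPTBot is blocked from the whole site, so publishing llms.txt would not help with that crawler until the robots.txt rule is relaxed.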
The Markdown Difference
Unlike robots.txt (plain text directives) and sitemap.xml (structured XML), llms.txt uses markdown formatting. This is intentional. Markdown is readable by both humans and AI systems without requiring a parser, making it the ideal format for a file designed to communicate with language models.
The markdown elements used in llms.txt include:
| Markdown Element | Usage in llms.txt |
|---|---|
| `#` | H1 heading (your brand or site name) |
| `##` | H2 heading (section titles like “Services” or “Blog”) |
| `###` | H3 heading (subsections) |
| `>` | Blockquote (your site description) |
| `-` or `*` | Bullet points |
| `[text](url)` | Hyperlinks to important pages |
| `: description` | Optional descriptions after links |
| `` ``` `` | Code blocks for technical content |
Which AI Platforms Recognise llms.txt?
This is where the honest answer matters. Support for llms.txt is real but uneven. Here is what the data and public disclosures actually show as of April 2026.
Platforms With Confirmed or Indicated Support
Anthropic (Claude): In November 2024, Anthropic listed llms.txt and llms-full.txt in its official Claude documentation, making it one of the first major AI platforms to formally acknowledge the standard. This is the clearest institutional endorsement to date.
ChatGPT (OpenAI): GPTBot, OpenAI’s web crawler, has been observed crawling llms.txt files. One documented case: Ray Martinez published his llms.txt file and tracked GPTBot crawling it the very next day. OpenAI has not officially confirmed llms.txt support in its documentation, but crawl log evidence suggests active use.
Perplexity: Perplexity has indicated it honours llms.txt files as part of its crawling behaviour. The platform is also one of the earliest to reference the standard in its technical documentation.
Common Crawl: Common Crawl, which feeds training data to many AI models, has begun honouring llms.txt directives, making it one of the more impactful early adopters from a data pipeline perspective.
Platforms With Unclear or No Official Confirmation
Google (AI Overviews): Google’s position is nuanced. Google officially states that AI Overviews citations rely on traditional SEO signals, not llms.txt. However, in December 2024, Google added llms.txt files across its own developer and documentation sites, suggesting internal interest even if external policy has not changed. Google’s John Mueller has noted that major crawlers do not currently prioritise llms.txt over standard HTML.
Bing / Microsoft Copilot: No official confirmation, but Bing’s index powers ChatGPT and Copilot responses, making Bing Webmaster Tools submission of your llms.txt a recommended step regardless.
Adoption Data
According to an SE Ranking analysis of 300,000 domains, llms.txt has a 10.13% adoption rate as of early 2026. That figure is low relative to the speed of AI search growth, which is precisely why early adoption represents a meaningful competitive advantage right now.
Notable early adopters include:
- Anthropic (Claude’s own documentation site)
- Vercel
- Hugging Face
- Mintlify (who reported 436 AI crawler visits after implementing it, primarily from ChatGPT)
- Google’s developer documentation (briefly, in December 2024)
How to Implement llms.txt on Your Website: Step-by-Step
Creating and deploying llms.txt takes approximately 30 minutes and requires no technical expertise beyond basic file management. Here is the complete process.
Step 1: Audit Your Pillar Content
Do not try to list every page on your site. AI models prioritise density and clarity over volume. Select 5 to 15 pages that define your business. These typically include:
- Homepage: Your high-level elevator pitch
- About page: Your history, mission, team, and credentials
- Services or product pages: What you actually offer
- Pricing page: Costs and tiers (critical for accurate AI responses)
- Key blog posts or guides: Your highest-authority, most-cited content
- Contact page: How to reach you
Exclude: tag archives, thank-you pages, login pages, duplicate content, and any pages you would not want an AI to surface in response to a question about your business.
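If you are starting from a long URL export (for example from your sitemap), a small filter can produce a first shortlist to review by hand. This is only a sketch: the excluded path segments and the `shortlist_pages` helper are hypothetical, and the final choice of 5 to 15 pages remains an editorial decision.

```python
from urllib.parse import urlparse

# Hypothetical path segments to skip; adjust to your own URL scheme.
EXCLUDED_SEGMENTS = {"tag", "thank-you", "login"}

def shortlist_pages(urls, limit=15):
    """Return up to `limit` URLs, dropping excluded and duplicate paths."""
    seen_paths = set()
    shortlist = []
    for url in urls:
        segments = tuple(s for s in urlparse(url).path.split("/") if s)
        if segments in seen_paths:
            continue  # same path already seen (e.g. with and without a trailing slash)
        seen_paths.add(segments)
        if EXCLUDED_SEGMENTS.intersection(segments):
            continue  # tag archive, thank-you page, login page, and so on
        shortlist.append(url)
    return shortlist[:limit]

urls = [
    "https://yourdomain.com/",
    "https://yourdomain.com/services",
    "https://yourdomain.com/services/",
    "https://yourdomain.com/tag/seo",
    "https://yourdomain.com/login",
    "https://yourdomain.com/pricing",
]
shortlist = shortlist_pages(urls)
print(shortlist)
```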
Step 2: Create the llms.txt File
Open any plain text editor (Notepad, TextEdit, VS Code) and create a file named llms.txt. Structure it as follows:
# Your Brand Name
> A clear, one-to-two sentence description of what your website does and who it serves.
## Services
- [Service Page Name](https://yourdomain.com/services): Brief description of what this page covers.
- [Product Page Name](https://yourdomain.com/products): Brief description.
## About
- [About Us](https://yourdomain.com/about): Company history, team, and mission.
## Key Resources
- [Guide Title](https://yourdomain.com/blog/guide): What this guide covers and who it is for.
- [Case Study](https://yourdomain.com/case-study): What problem was solved and for whom.
## Contact
- [Contact Us](https://yourdomain.com/contact): How to reach the team.
Formatting rules to follow:
- Use `#` for your brand name (H1)
- Use `>` for your site description (blockquote)
- Use `##` for section headings
- Use `- [text](url): description` for each page
- Keep descriptions to one sentence each
- No HTML, no navigation elements, no footers
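If you would rather generate the file than type it by hand, the template and formatting rules above can be sketched as a short script. The `build_llms_txt` function and the placeholder data are assumptions for illustration; the output follows the H1, blockquote, H2, and link-line layout described in this step.

```python
def build_llms_txt(name, summary, sections):
    """Render an llms.txt document.

    sections maps a heading to a list of (title, url, description) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    # Finish with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"

sections = {
    "Services": [
        ("Service Page Name", "https://yourdomain.com/services",
         "Brief description of what this page covers."),
    ],
    "Contact": [
        ("Contact Us", "https://yourdomain.com/contact", "How to reach the team."),
    ],
}
content = build_llms_txt(
    "Your Brand Name",
    "A clear, one-to-two sentence description of what your website does and who it serves.",
    sections,
)
print(content)
```

Saving `content` to a file named llms.txt gives you a draft you can still edit by hand before uploading.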
Step 3: Optionally Create llms-full.txt
For sites with complex or technical content, create a companion file called llms-full.txt. This file contains the full clean text of your most important pages, stripped of navigation, ads, and design elements, formatted in markdown.
This gives AI systems that support it a complete, distraction-free version of your best content in a single file.
Step 4: Upload to Your Root Directory
Place the file at the top level of your server, in the same location as your robots.txt file. The result should be accessible at:
https://yourdomain.com/llms.txt
Test by opening the URL in a browser. It should display as clean, readable plain text.
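The browser test can also be scripted. The sketch below separates the checks (status code, content type, and whether the body looks like markdown rather than an HTML error page) from the network call, so the checks can be run on any response. The function names and the accepted content types are assumptions for this example, not requirements from the specification.

```python
import urllib.request

def check_llms_response(status, content_type, body):
    """Return a list of problems found in a response for /llms.txt."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ("text/plain", "text/markdown"):
        problems.append(f"unexpected Content-Type: {content_type!r}")
    if not body.lstrip().startswith("# "):
        problems.append("file does not start with an H1 heading")
    if "<html" in body.lower():
        problems.append("body looks like HTML, not plain text")
    return problems

def fetch_and_check(url):
    """Fetch the URL and run the checks (urlopen raises HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        return check_llms_response(resp.status, resp.headers.get("Content-Type", ""), body)

# Offline examples; for a live check, call fetch_and_check("https://yourdomain.com/llms.txt").
good = check_llms_response(200, "text/plain; charset=utf-8", "# Your Brand Name\n> Description\n")
bad = check_llms_response(200, "text/html", "<html><body>Not found</body></html>")
print(good)
print(bad)
```

An empty list means the response passed every check in this sketch; a soft 404 that returns an HTML page with status 200 is the most common failure it catches.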
Platform-specific notes:
- WordPress: Upload via FTP to your `public_html` folder, or use a plugin like Yoast SEO (which added llms.txt support in 2025)
- Shopify: Upload as a static file in your theme assets or use the root redirect method
- Squarespace / Wix: May require workarounds; check platform-specific documentation
- Static sites / GitHub Pages: Add the file directly to your repository root
Step 5: Submit and Verify
- Submit to Bing Webmaster Tools: Since Bing powers ChatGPT and Copilot, this is the highest-priority submission. Go to Bing Webmaster Tools and submit `yourdomain.com/llms.txt` as a sitemap.
- Submit to Google Search Console: Add the URL as a sitemap for Google’s awareness.
- Check your server logs: Look for crawl activity from GPTBot, ClaudeBot, PerplexityBot, and Google-Extended within 24-72 hours of publishing.
- Test in AI tools: Ask ChatGPT or Perplexity a question about your business and observe what content they surface.
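The server log check above can be automated with a few lines of Python. This sketch assumes your server writes the common "combined" access log format; the regular expression, the `count_ai_hits` helper, and the sample log lines are invented for illustration, and the bot tokens are simply the names listed in this step, so adjust them to what your own logs show.

```python
import re
from collections import Counter

# Tokens as named in this guide; matched case-insensitively in the user agent.
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")

# Combined log format: "METHOD path protocol" status bytes "referer" "user agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_ai_hits(log_lines, path="/llms.txt"):
    """Count requests for `path` per AI bot token found in the user agent."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match or match["path"].split("?")[0] != path:
            continue
        agent = match["agent"].lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in agent:
                hits[token] += 1
    return hits

# Made-up sample lines, not real traffic.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Apr/2026:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '203.0.113.9 - - [01/Apr/2026:10:05:00 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [01/Apr/2026:10:06:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

hits = count_ai_hits(SAMPLE_LOG)
print(dict(hits))
```

To run it on a real log, pass an open file object instead of the sample list, for example `count_ai_hits(open("access.log"))`, where the file name is whatever your server uses.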
Step 6: Update Quarterly
llms.txt is not a set-and-forget file. Review it every quarter to:
- Add new service or product pages
- Remove outdated content
- Update descriptions to reflect current positioning
- Add new high-authority blog posts or case studies
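Part of the quarterly review can be scripted by comparing the links in your llms.txt with the pages that currently matter on your site. The sketch below assumes you can supply that second list yourself (for example from your sitemap or CMS); the `review_links` helper and the sample URLs are hypothetical.

```python
import re

# Captures the URL from any markdown link of the form [text](url).
LINK_URL_RE = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")

def review_links(llms_text, live_urls):
    """Compare URLs listed in llms.txt with the current set of key pages."""
    listed = set(LINK_URL_RE.findall(llms_text))
    live = set(live_urls)
    return {
        "stale": sorted(listed - live),     # in llms.txt but no longer a key page
        "unlisted": sorted(live - listed),  # key page missing from llms.txt
    }

llms_text = """## Services
- [Services](https://yourdomain.com/services): What we offer.
- [Old Offer](https://yourdomain.com/old-offer): A retired product.
"""
live_urls = [
    "https://yourdomain.com/services",
    "https://yourdomain.com/new-product",
]

report = review_links(llms_text, live_urls)
print(report)
```

The "stale" entries are candidates for removal and the "unlisted" ones are candidates for adding, but both lists still need a human decision.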
Real-World llms.txt Adopters: Who Has Implemented It and What Happened
Seeing who has already implemented llms.txt and what results they reported is more useful than any theoretical argument. Here is the most complete picture available as of April 2026.
Confirmed Adopters and Their Outcomes
| Organisation | Type | What They Did | Reported Result |
|---|---|---|---|
| Anthropic | AI company | Implemented llms.txt across Claude’s documentation site; also listed the standard in the official Claude docs in November 2024 | First major AI platform to formally endorse the standard |
| Vercel | Developer platform | Added llms.txt to their documentation root | Cited as one of the earliest tech-forward adopters |
| Hugging Face | AI/ML platform | Implemented llms.txt across their model and dataset documentation | Used as a reference implementation in the community |
| FastHTML | Web framework | Jeremy Howard’s own project, the first real-world implementation, used as the reference case in the original specification | Demonstrated the full llms.txt + llms-full.txt workflow |
| Mintlify | Documentation platform | Implemented llms.txt and monitored AI crawler logs | Reported 436 AI crawler visits, majority from ChatGPT, within weeks of publishing |
| WordLift | SEO/AI company | Implemented llms.txt as part of their GEO strategy | Reported a 25% increase in organic traffic |
| Springs Apps | Software company | Implemented llms.txt for GenAI-optimised indexing | Reported 20% increase in search engine visibility and 15% improvement in accurate query answers |
| Google | Search/AI | Added llms.txt files across the developer and documentation sites | The file was retrieved within 24 hours; Google has not officially endorsed the standard externally |
| LangChain | AI framework | Maintains a public llms.txt at js.langchain.com/llms.txt | Used as a reference example by the LLMsTxt Architect open-source tool |
| Yoast SEO | WordPress plugin | Added automated llms.txt generation in Yoast SEO v26.8+ | Now generates llms.txt automatically for millions of WordPress sites |
What the FastHTML Reference Implementation Looks Like
The FastHTML project, created by Jeremy Howard himself, is the canonical reference for how llms.txt should work in practice. Their implementation includes:
- A root `llms.txt` file listing all key documentation pages with descriptions
- Individual `.md` files at the same URL as each HTML page (e.g. `docs/page.html.md`) containing clean markdown versions of the content
- Two expanded context files: `llms-ctx.txt` (without optional URLs) and `llms-ctx-full.txt` (with all URLs), generated using the `llms_txt2ctx` command-line tool
This two-layer approach (index file plus full-content companion files) represents the most complete llms.txt implementation and is what Claude’s documentation team modelled their own implementation on.
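The per-page markdown convention is simple enough to express in code. The llmstxt.org proposal suggests serving the markdown version at the page URL with `.md` appended, and using `index.html.md` for URLs that have no file name. The sketch below applies that rule; the `markdown_url` helper is an illustrative name, and it treats only empty paths and paths ending in a slash as "no file name", which is one possible reading of the rule.

```python
from urllib.parse import urlsplit, urlunsplit

def markdown_url(page_url):
    """Return the URL where a clean markdown copy of the page would live."""
    parts = urlsplit(page_url)
    path = parts.path
    if path == "" or path.endswith("/"):
        # No file name in the URL: fall back to index.html.md
        path = (path or "/") + "index.html.md"
    else:
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))

print(markdown_url("https://yourdomain.com/docs/page.html"))
print(markdown_url("https://yourdomain.com/docs/"))
print(markdown_url("https://yourdomain.com"))
```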
The Yoast SEO Development: A Turning Point
The fact that Yoast SEO, which powers SEO settings for tens of millions of WordPress sites worldwide, added automated llms.txt generation in version 26.8 is arguably the most significant adoption signal to date. It means llms.txt is now being generated automatically for a massive portion of the web, even by site owners who have never heard of the standard. That scale of passive adoption will accelerate AI platform support significantly.
Free llms.txt Tools: Generate, Check, and Validate Your File
You do not need to build your llms.txt file from scratch. A growing ecosystem of free tools can generate a compliant draft in minutes by crawling your site automatically. Here are the best options available as of April 2026.
Generator Tools
LLMrefs llms.txt Generator: The most fully-featured free generator available. It deep-scans your website, analyses page metadata, discovers internal links, and uses AI to write descriptions for each page. The output is a ready-to-download llms.txt file you can edit before uploading. Free, no credit card required.
Rankability llms.txt Generator and Validator: Combines generation with validation. You can build a file using their form interface, preview it in real time, and validate that it meets the official specification. Also checks any domain for an existing llms.txt file. Useful for auditing competitors.
Lumina SEO llms.txt Generator: Free GEO-focused tool that generates your llms.txt and also checks AI crawler access. Useful for identifying whether your site is blocking AI bots in robots.txt at the same time as creating the file.
GEOptimizer Chrome Extension: A browser extension that scans any website you visit for an existing llms.txt file and generates one if it is missing. Useful for quickly checking whether competitors have implemented the standard.
Developer Tools (Technical Users)
llms_txt2ctx (Jeremy Howard’s official CLI tool): The command-line tool from the original specification. Expands your llms.txt into full LLM context files (llms-ctx.txt and llms-ctx-full.txt) by fetching and combining the content of all linked pages. The most technically complete implementation approach.
LLMsTxt Architect: A Python package that builds llms.txt automatically from a list of URLs or an existing sitemap. Supports multiple LLM providers (Anthropic, OpenAI, Ollama) for description generation. Useful for large or complex sites where manual description writing is impractical.
llmstxt.directory: A community directory of published llms.txt implementations. Browse how other sites have structured their files, vote for useful examples, and submit your own once published. A practical reference library for anyone building their first file.
Checker Tools
To verify your llms.txt is live and accessible after uploading:
- Browser test: Navigate directly to `yourdomain.com/llms.txt` in any browser. It should render as clean plain text.
- Rankability validator: Paste your domain into the Rankability tool to confirm format compliance.
- Server log monitoring: Check your server logs for `GPTBot`, `ClaudeBot`, `PerplexityBot`, and `Google-Extended` within 24-72 hours of publishing. Crawl activity from these bots is the strongest signal that the file has been discovered.
- Lumina SEO checker: Run your domain through Lumina to verify AI crawler access alongside file validation.
Pro tip: After publishing your llms.txt, ask ChatGPT or Perplexity a direct question about your business, such as “What does [your brand name] do?” and compare the answer to your llms.txt description. If the AI’s answer aligns closely with your file’s content, it is a reasonable indicator the file is being read.
