
If you’re publishing content in 2026 and not measuring traffic from large language models, you’re flying blind on one of the fastest-growing acquisition channels in digital marketing. Across hundreds of websites, we’re already seeing anywhere from 0.5% to 3% of total traffic arriving from LLMs like ChatGPT, Perplexity, Gemini, Copilot, and Claude. That number is only heading in one direction.
The problem is that Google Analytics 4 doesn’t make this easy by default. Without the right setup, most of that AI-driven traffic gets swallowed into your “Referral” bucket or, worse, disappears entirely into “Direct.” You end up with a growing blind spot right as the channel starts to matter.
This guide walks you through every step of setting up LLM referral tracking in GA4 properly: from a quick baseline check, through custom channel groups and regex filters, to recovering the “dark” AI traffic that never shows a referrer. By the end, you’ll have a clean, durable measurement framework you can actually act on.
Key insight: AI referral visitors spend 28% longer per session than organic search users and convert at 1.5x the rate of social traffic. Even at low volumes today, this channel punches well above its weight.
Why LLM Traffic Is Hard to Track (and Why It Matters)
Before jumping into the setup, it helps to understand why GA4 struggles with LLM traffic out of the box. The issue comes down to how AI tools send users to your site.
When someone reads a ChatGPT response and clicks a cited link, the referrer header is sometimes passed cleanly. But many AI apps open links in embedded browsers, suppress referrer data for privacy reasons, or route traffic through intermediary pages. The result is that a meaningful chunk of your AI-driven visits arrives with no source information at all, landing in GA4 as “(direct) / (none).”
The three ways LLM traffic shows up in GA4
- Correctly as referral: The AI tool passes its domain as the referrer (e.g., chatgpt.com, perplexity.ai). GA4 logs it under the Referral channel. This is the best-case scenario.
- Incorrectly as a generic referral: GA4 sees the source but lumps it into the default Referral channel alongside every other non-search referrer. You can’t isolate AI traffic without a custom channel group.
- As direct traffic: The referrer is stripped entirely. This is sometimes called “shadow AI” traffic and is the hardest to recover. Estimates suggest 20-30% of all LLM-driven visits fall into this category.
The practical implication: You need two things working together. First, a custom channel group that correctly classifies the referrals GA4 can see. Second, a behavioural segment that helps you surface the shadow traffic hiding in Direct. This guide covers both.
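To make the three cases concrete, here is a minimal Python sketch of how the referrer on a visit decides which bucket it can end up in. It is an illustration only, not GA4's own logic: the function name `classify_visit` and the short `AI_DOMAINS` set are assumptions made for this example.

```python
from urllib.parse import urlparse

# Illustrative subset of the AI referrer domains discussed in this guide.
AI_DOMAINS = {"chatgpt.com", "perplexity.ai", "claude.ai", "gemini.google.com"}


def classify_visit(referrer: str) -> str:
    """Return the bucket a visit falls into, based only on its referrer."""
    if not referrer:
        # No referrer header at all: GA4 reports this as (direct) / (none).
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    # Match the domain itself or any subdomain of it (e.g. www.perplexity.ai).
    if any(host == d or host.endswith("." + d) for d in AI_DOMAINS):
        # Only separable from other referrals once a custom channel group exists.
        return "ai_referral"
    return "generic_referral"


if __name__ == "__main__":
    for ref in ["https://chatgpt.com/", "https://news.example.com/post", ""]:
        print(repr(ref), "->", classify_visit(ref))
```

The empty-referrer branch is the one no regex can fix, which is why the shadow traffic needs a behavioural proxy instead.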
The Top 20 LLM Referral Sources to Track
Before configuring GA4, you need to know which domains to watch. This table covers the 20 most significant LLM and AI assistant platforms sending referral traffic as of 2026, along with how they typically appear in GA4.
| LLM / AI Platform | Likely GA4 Session Source | Medium | Notes |
|---|---|---|---|
| ChatGPT | chatgpt.com | referral | Most consistent referrer passing. High volume. |
| OpenAI (direct) | openai.com | referral | Some traffic comes via openai.com rather than chatgpt.com |
| Perplexity AI | perplexity.ai | referral | Strong B2B referrer; 25% of AI sign-ups for some brands |
| Google Gemini | gemini.google.com | referral | May vary by app surface and device |
| Microsoft Copilot | copilot.microsoft.com | referral | Can vary; include edgeservices.bing.com as a proxy |
| Edge Services (Copilot proxy) | edgeservices.bing.com | referral | Often masks Copilot traffic |
| Bing Chat | bing.com/chat | referral | Separate from standard Bing organic |
| Claude (Anthropic) | claude.ai | referral | Growing share, especially in professional use cases |
| Grok (xAI) | grok.x.ai | referral | Doubled publisher traffic within one quarter in 2025 |
| You.com | you.com | referral | AI-native search engine |
| Poe (Quora) | poe.com | referral | Aggregates multiple LLMs, including GPT-4 and Claude |
| DuckDuckGo AI Chat | duckduckgo.com/chat | referral | Privacy-focused; referrer sometimes stripped |
| Writesonic | writesonic.com | referral | AI writing tool with browsing capability |
| Copy.ai | copy.ai | referral | Content generation platform |
| Nimble | nimble.ai | referral | AI-powered research assistant |
| iAsk.ai | iask.ai | referral | Factual AI search engine |
| Phind | phind.com | referral | Developer-focused AI search |
| Mistral (Le Chat) | chat.mistral.ai | referral | Growing European user base |
| Meta AI | meta.ai | referral | Integrated into Facebook, Instagram, and WhatsApp |
| Bard (legacy) | bard.google.com | referral | Legacy domain; still appears in some datasets |
Note: This list changes fast. New AI surfaces launch frequently and referrer domains shift. Review and update your regex every 2-3 months.
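Because the list changes so often, it can help to keep the domains in one place and generate the regex from them rather than hand-editing a long pattern. The Python sketch below shows one way to do that; the helper name `build_llm_regex` is an assumption for this example, and the list is a subset of the table above.

```python
import re

# Domains copied from the table above; edit this list at each review.
LLM_DOMAINS = [
    "chatgpt.com", "openai.com", "perplexity.ai", "claude.ai",
    "gemini.google.com", "copilot.microsoft.com", "edgeservices.bing.com",
    "grok.x.ai", "you.com", "poe.com", "phind.com", "chat.mistral.ai", "meta.ai",
]


def build_llm_regex(domains: list[str]) -> str:
    """Escape each domain and join them into one alternation wrapped in .* on both sides."""
    alternation = "|".join(re.escape(d) for d in domains)
    return f".*({alternation}).*"


pattern = build_llm_regex(LLM_DOMAINS)
print(pattern)  # paste the printed pattern into the GA4 regex field
```

Escaping matters here: an unescaped dot matches any character, so `copy.ai` written without the backslash would also match a source like `copyxai`.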
Step 1 – Check Your Baseline in GA4
Before building anything new, check whether GA4 is already picking up some LLM traffic in its default reports. This takes two minutes and tells you what you’re working with.
- Log in to Google Analytics 4 and select your property.
- In the left navigation, go to Reports > Acquisition > Traffic Acquisition.
- In the Session primary dimension dropdown, make sure Session source/medium is selected.
- Scroll through the table and look for entries like chatgpt.com / referral, perplexity.ai / referral, or gemini.google.com / referral.
If you see them, GA4 is already capturing some LLM referrals. If you see nothing, that doesn’t necessarily mean you’re getting zero AI traffic. It could mean all of it is arriving as Direct, which is exactly the problem we’ll fix in the steps below.
What you’re looking for at this stage:
- Any AI domain appearing as a session source
- The volume relative to your total sessions
- Which landing pages those sessions are hitting (blog posts and guides attract the most LLM referrals)
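If you copy the rows out of the report, the baseline share is a one-line calculation. The sketch below uses made-up session counts purely for illustration; `rows`, `AI_SOURCES`, and `ai_share` are names invented for this example, not part of GA4.

```python
# Hypothetical rows copied from a Traffic Acquisition report:
# (session source / medium, sessions). The numbers are invented.
rows = [
    ("google / organic", 8200),
    ("(direct) / (none)", 3100),
    ("chatgpt.com / referral", 95),
    ("perplexity.ai / referral", 40),
    ("news.example.com / referral", 65),
]

AI_SOURCES = ("chatgpt.com", "perplexity.ai", "gemini.google.com", "claude.ai")


def ai_share(rows):
    """Return (AI sessions, total sessions, AI share of total) for the given rows."""
    total = sum(sessions for _, sessions in rows)
    ai = sum(s for source_medium, s in rows if source_medium.split(" / ")[0] in AI_SOURCES)
    return ai, total, (ai / total if total else 0.0)


ai, total, share = ai_share(rows)
print(f"AI sessions: {ai} of {total} ({share:.2%})")
```

Record this number with the date range you used, so later comparisons are like for like.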
Once you have a baseline snapshot, move to Step 2.
Step 2 – Build a Custom Exploration Report for LLM Traffic
The fastest way to get a clear picture of your LLM traffic without changing any GA4 configuration is to build a custom Exploration report. This is non-destructive and gives you an instant view of AI referrals.
Create the exploration
- In GA4, click Explore in the left navigation menu.
- Click Blank to start a new exploration.
- Rename it to something like “LLM Referral Traffic & Conversions.”
- Set the segment to All Users and click Confirm.
Add dimensions and metrics
Under the Variables panel on the left:
- Click + next to Dimensions and add: Session Source/Medium and Page Referrer
- Click + next to Metrics and add: Sessions and Key Events (Key Events is what GA4 now calls Conversions)
- Click Import to confirm each selection
Drag both dimensions into the Rows section of the Settings panel. Drag Sessions and Key Events into the Values section.
Apply the LLM regex filter
This is the critical step. Without a filter, the report shows all traffic.
- Scroll to the bottom of the Settings panel and click + Add filter.
- Select Page Referrer as the dimension.
- Set the match type to matches regex.
- Paste the following regex:
^.*(chatgpt\.com|openai\.com|perplexity\.ai|claude\.ai|gemini\.google\.com|bard\.google\.com|copilot\.microsoft\.com|edgeservices\.bing\.com|bing\.com\/chat|grok\.x\.ai|you\.com|poe\.com|duckduckgo\.com\/chat|writesonic\.com|copy\.ai|nimble\.ai|iask\.ai|phind\.com|chat\.mistral\.ai|meta\.ai).*$
- Click Apply.
Your report will now show only sessions where the page referrer matches one of the major LLM platforms. You’ll see a breakdown by source/medium alongside session counts and key events, giving you a clear picture of which AI tools are driving traffic and whether those visitors are converting.
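Before relying on the filter, it is worth checking the pattern against a few referrer URLs you expect to match and a few you don't. The sketch below runs the regex from this step through Python's `re` module, which handles this particular pattern the same way; the sample URLs and the helper name `matches_llm` are assumptions for illustration.

```python
import re

# The Exploration regex from this step, split across lines for readability only.
LLM_REGEX = (
    r"^.*(chatgpt\.com|openai\.com|perplexity\.ai|claude\.ai|gemini\.google\.com"
    r"|bard\.google\.com|copilot\.microsoft\.com|edgeservices\.bing\.com|bing\.com\/chat"
    r"|grok\.x\.ai|you\.com|poe\.com|duckduckgo\.com\/chat|writesonic\.com|copy\.ai"
    r"|nimble\.ai|iask\.ai|phind\.com|chat\.mistral\.ai|meta\.ai).*$"
)


def matches_llm(page_referrer: str) -> bool:
    """True if the page referrer would pass the LLM filter."""
    return re.search(LLM_REGEX, page_referrer) is not None


# Hypothetical referrer values to sanity-check the pattern.
samples = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=ga4",
    "https://www.bing.com/chat",
    "https://www.bing.com/search?q=ga4",
    "https://www.google.com/",
]

for url in samples:
    print(f"{url} -> {matches_llm(url)}")
```

Note that the path-based entries (`bing.com/chat`, `duckduckgo.com/chat`) only work here because Page Referrer holds a full URL rather than a bare domain.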
Pro tip: Add Date as a secondary dimension to track trends over time. A sudden spike in Perplexity or Gemini referrals often correlates with your content being cited in an AI-generated answer.
Step 3 – Create a Custom Channel Group for LLM Referrals
The Exploration report from Step 2 is great for analysis, but it doesn’t change how GA4 classifies traffic in your standard reports. For that, you need a Custom Channel Group. This is the permanent fix that keeps AI traffic from being buried inside the generic Referral bucket going forward.
Why you must create a NEW channel group (not edit the default)
GA4 applies channel group changes retroactively to historical data if you edit the default group. That means your past data gets rewritten, which makes trend analysis unreliable. Always create a new channel group to preserve your historical baseline.
Setup instructions
- In GA4, go to Admin > Data Display > Channel Groups.
- Click Create new channel group.
- Name it “AI / LLM Referrals.”
- Click Add new channel and name it “AI Referrals.”
- Set the condition to: Session Source matches regex.
- Paste the following regex pattern:
.*(chatgpt\.com|openai\.com|perplexity\.ai|claude\.ai|gemini\.google\.com|bard\.google\.com|copilot\.microsoft\.com|edgeservices\.bing\.com|grok\.x\.ai|you\.com|poe\.com|duckduckgo\.com|writesonic\.com|copy\.ai|nimble\.ai|iask\.ai|phind\.com|chat\.mistral\.ai|meta\.ai).*
- Click Save.
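One caveat with a pattern wrapped in `.*` on both sides is that short domains can match unrelated sources that merely end in the same characters. The sketch below demonstrates the effect and a tighter variant that only accepts the domain itself or one of its subdomains. It assumes the GA4 condition requires the whole Session Source value to match (emulated here with `re.fullmatch`); the `STRICT` pattern is a suggestion to test in your own property, not part of the setup above, and `thankyou.com` is an invented example source.

```python
import re

# Three short domains from the channel regex, enough to show the effect.
LOOSE = r".*(you\.com|meta\.ai|copy\.ai).*"
# Tighter variant: optional subdomain prefix ending in a dot, then the exact domain.
STRICT = r"(.*\.)?(you\.com|meta\.ai|copy\.ai)"


def matches(pattern: str, session_source: str) -> bool:
    """Emulate a whole-value regex match on a Session Source string."""
    return re.fullmatch(pattern, session_source) is not None


for source in ["you.com", "www.you.com", "thankyou.com"]:
    print(f"{source}: loose={matches(LOOSE, source)} strict={matches(STRICT, source)}")
```

If your referral list contains no lookalike domains, the looser pattern is harmless; the check is cheap enough to repeat at each regex review.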
The critical ordering step
This is where most people get it wrong. GA4 processes channel rules from top to bottom and assigns traffic to the first matching rule it finds. If your “AI Referrals” channel sits below the default “Organic Search” or “Referral” channels, traffic from Bing’s chat surfaces can be grabbed by the Organic Search rule, and other AI sources by the Referral rule, before your AI rule ever gets a look.
After saving, drag your “AI Referrals” channel to position 1 in the list, above Organic Search and Referral.
Moving the AI Referrals channel to the top improves classification accuracy from around 60% to 95%, according to testing data from PassionFruit Analytics.
Once saved, your standard Traffic Acquisition report will show a dedicated “AI Referrals” row, separate from organic, direct, and generic referral traffic.
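The first-match-wins behaviour is easy to see in a small simulation. The sketch below is a simplified model, not GA4's actual rule engine: the two rules are hypothetical stand-ins for the real channel definitions, and `assign_channel` is a name invented for this example.

```python
import re


def assign_channel(source: str, medium: str, rules) -> str:
    """Walk the rules top to bottom and return the first channel whose test passes."""
    for name, test in rules:
        if test(source, medium):
            return name
    return "Unassigned"


# Simplified stand-ins for the real channel definitions.
ai_rule = (
    "AI Referrals",
    lambda s, m: re.fullmatch(r".*(chatgpt\.com|perplexity\.ai|claude\.ai).*", s) is not None,
)
referral_rule = ("Referral", lambda s, m: m == "referral")

ai_first = [ai_rule, referral_rule]
referral_first = [referral_rule, ai_rule]

print(assign_channel("chatgpt.com", "referral", ai_first))
print(assign_channel("chatgpt.com", "referral", referral_first))
```

The same session lands in a different channel depending only on rule order, which is why the drag-to-position-1 step is not optional.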
Step 4 – Recover “Shadow AI” Traffic Hidden in Direct
Even with a perfect channel group configuration, 20-30% of LLM-driven visits will still arrive as “(direct) / (none)” because the referrer header was never passed. You can’t eliminate this entirely, but you can surface it using two complementary techniques.
Technique 1: The Shadow AI Segment
AI-referred visitors have a distinctive behavioural fingerprint that separates them from people who typed your URL directly or came from a bookmark. Build this segment in GA4 Explorations:
- In your Exploration, click + Add segment.
- Create a new segment with these conditions:
  - Session Source exactly matches (direct)
  - New/Returning = New Users
  - Landing page does not exactly match / (i.e., not the homepage)
- Save it as “Shadow AI Proxy.”
The logic here: genuine direct traffic usually goes to your homepage. People who arrived via an AI recommendation are new visitors landing deep on a blog post or guide page. This segment won’t be 100% precise, but a spike in it that correlates with a drop in organic search is a strong signal that AI citation is at work.
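Expressed as code, the segment is just three conditions combined with AND. The sketch below applies them to a few invented session records; the `Session` dataclass and its field names are assumptions for illustration and do not correspond to a GA4 export schema.

```python
from dataclasses import dataclass


@dataclass
class Session:
    source: str          # e.g. "(direct)" or "chatgpt.com"
    is_new_user: bool    # True for a first-time visitor
    landing_page: str    # path of the first page viewed, "/" for the homepage


def is_shadow_ai_proxy(s: Session) -> bool:
    """Direct source, new user, and a landing page other than the homepage."""
    return s.source == "(direct)" and s.is_new_user and s.landing_page != "/"


# Invented sessions to show which ones the segment keeps.
sessions = [
    Session("(direct)", True, "/blog/ga4-llm-tracking"),  # kept: deep page, new, direct
    Session("(direct)", True, "/"),                       # dropped: homepage
    Session("(direct)", False, "/blog/ga4-llm-tracking"), # dropped: returning visitor
    Session("chatgpt.com", True, "/blog/ga4-llm-tracking"),  # dropped: has a referrer
]

shadow = [s for s in sessions if is_shadow_ai_proxy(s)]
print(f"{len(shadow)} of {len(sessions)} sessions match the Shadow AI Proxy segment")
```

Treat the count as a trend indicator rather than a measurement, since email clients and untagged app links can produce the same fingerprint.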
Technique 2: UTM-Tagged Share Links
This is the most reliable long-term fix. When your content is distributed in AI-heavy communities (Slack workspaces, Discord servers, internal wikis, newsletters), add UTM parameters to the URLs so GA4 can attribute the traffic correctly regardless of referrer stripping.
A simple UTM scheme for AI traffic:
utm_source=llm
utm_medium=ai-referral
utm_campaign=content-name
When someone clicks a UTM-tagged URL, GA4 reads the parameters directly and attributes the session correctly, bypassing the referrer problem entirely. One SaaS client reduced misattributed Direct traffic by 35% after implementing UTM-tagged share links across their content distribution workflow.
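Tagging links by hand invites typos, so a small helper is worth having. The Python sketch below appends the UTM scheme above to any URL while preserving existing query parameters and fragments; the function name `add_utm` and the example URL are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_utm(url: str, campaign: str, source: str = "llm", medium: str = "ai-referral") -> str:
    """Return the URL with utm_source, utm_medium and utm_campaign set."""
    parts = urlparse(url)
    # Keep whatever query parameters the URL already has.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update({"utm_source": source, "utm_medium": medium, "utm_campaign": campaign})
    return urlunparse(parts._replace(query=urlencode(query)))


print(add_utm("https://example.com/blog/ga4-guide?ref=1", "ga4-guide"))
```

Use lowercase, hyphenated campaign names consistently, because GA4 treats differently cased values as separate campaigns.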
The bottom line: combining the Shadow AI segment with UTM tagging lets you capture 70-80% of true LLM referrals, which is enough to identify patterns, measure growth, and make informed decisions about content investment.
Step 5 – Track the Right Metrics (Not Just Sessions)
Volume alone is a vanity metric for LLM traffic. The real story is in engagement and conversion. Here’s what actually matters and what the benchmarks look like across sites actively measuring this channel.
Key metrics to monitor
| Metric | What to measure | LLM Benchmark |
|---|---|---|
| Sessions | Total visits from AI sources | 0.14-0.17% of all traffic today |
| Engagement Rate | % of sessions that last longer than 10 seconds, include a key event, or have 2+ page views | Higher than social; comparable to organic |
| Average Session Duration | Time spent per visit | 28% longer than organic search visitors |
| Conversion Rate | Key events per session | 2.3% for AI vs 2.0% for search, 1.1% for social |
| Landing Pages | Which pages attract AI citations | 60% land on FAQs, guides, and how-to content |
| Revenue per Session | For e-commerce sites | AI buyers spend ~$30 more per order in some verticals |
How to view these metrics in GA4
- Go to Reports > Acquisition > Traffic Acquisition.
- Switch the channel group to your new “AI / LLM Referrals” group using the dropdown at the top of the report.
- Add secondary dimensions like Landing Page or Device Category to break down the data further.
- For conversion data, make sure your key events (form submissions, purchases, sign-ups) are properly configured under Admin > Events.
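Once the channel group is in place, comparing channels comes down to dividing key events by sessions. The sketch below does that for a handful of channels; every number in `channels` is invented for illustration, and `conversion_rate` is a helper name assumed for this example.

```python
def conversion_rate(key_events: int, sessions: int) -> float:
    """Key events per session, or 0.0 when there are no sessions."""
    return key_events / sessions if sessions else 0.0


# Hypothetical figures as they might be read off the Traffic Acquisition report.
channels = {
    "AI Referrals": {"sessions": 400, "key_events": 10},
    "Organic Search": {"sessions": 10000, "key_events": 200},
    "Organic Social": {"sessions": 2000, "key_events": 20},
}

for name, row in channels.items():
    rate = conversion_rate(row["key_events"], row["sessions"])
    print(f"{name}: {row['sessions']} sessions, {rate:.1%} conversion rate")
```

With session counts this small, a handful of conversions can swing the rate, so compare over a full quarter rather than week to week.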
Building a Looker Studio dashboard
If you’re reporting to a client or team, a Looker Studio dashboard that surfaces AI traffic quarter-over-quarter is far more useful than a raw GA4 export. Connect your GA4 property, use your custom channel group as a filter, and build a comparison view: AI vs Organic vs Social vs Paid. The contrast in conversion rates tends to get attention fast.
Key takeaway: AI referrals are currently small in volume but outsized in quality. A SaaS brand that received just 500 AI-referred visits converted 50 of them into demo requests, a 10% conversion rate that would be exceptional from any channel.
GA4 LLM Tracking Setup Checklist
Use this as your go-to reference before signing off on the setup. Items are listed in priority order.
| Priority | Action | Why It Matters |
|---|---|---|
| 1 | Confirm the GA4 tag fires on all deep content pages | LLM traffic lands on blog/resource pages, not the homepage. Missing tags = undercounting. |
| 2 | Check Traffic Acquisition for existing AI referrals | Establishes your baseline before any changes. |
| 3 | Create an Exploration filtered by LLM regex | Immediate analysis without touching GA4 config. |
| 4 | Create a Custom Channel Group: “AI / LLM Referrals” | Keeps AI traffic separate from generic Referral in all standard reports. |
| 5 | Move AI Referrals to position 1 in channel ordering | Prevents Organic Search or Referral rules from grabbing AI traffic first. |
| 6 | Build the Shadow AI segment in Explorations | Surfaces LLM traffic hiding in Direct. |
| 7 | Implement UTM-tagged share links on key content | Recovers attribution when referrers are stripped. |
| 8 | Configure key events for conversions | Volume without conversion data is meaningless. |
| 9 | Build a Looker Studio dashboard | Makes the data visible to teams and clients on an ongoing basis. |
| 10 | Schedule a quarterly regex review | New AI platforms launch constantly. Your list needs refreshing every 2-3 months. |
What to Do With Your LLM Traffic Data
Tracking is only valuable if it drives action. Once your setup is live and data starts flowing, here’s how to turn those numbers into a content and SEO strategy.
Double down on pages getting AI citations
If your Exploration report shows that three of your blog posts are consistently receiving LLM referrals, those pages are being cited in AI-generated answers. That’s a signal to update them regularly, expand them with more depth and structured data, and build internal links from those pages to your conversion-focused content.
Optimise for AI citation, not just search ranking
Content that gets cited by LLMs tends to share common characteristics:
- Factual density: Specific statistics, named examples, and verifiable data points
- Structured formatting: Clear headings, bullet lists, and FAQ sections that AI can extract cleanly
- Schema markup: FAQ schema and HowTo schema increase citation likelihood by an estimated 30-40%
- Direct answers: Sections that open with a concise, self-contained answer to a clear question
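For the schema markup point, FAQ structured data is a JSON-LD object of type `FAQPage` containing `Question` and `Answer` entries. The Python sketch below generates that JSON from question and answer pairs; the helper name `faq_jsonld` and the sample pair are assumptions for illustration, and the output still needs to be placed in a `<script type="application/ld+json">` tag on the page.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


print(faq_jsonld([
    ("Does GA4 automatically track ChatGPT traffic?",
     "Not in a useful way. By default it lands in the generic Referral channel."),
]))
```

Keep the marked-up questions and answers identical to the visible text on the page, so the structured data describes content readers can actually see.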
Watch for “citation frequency” on your top keywords
Run a manual audit of your top 20 revenue-driving keywords by asking ChatGPT, Perplexity, and Gemini the questions your audience searches for. Note whether your brand is cited. If competitors are being cited and you’re not, that’s a content gap worth addressing, and your GA4 data will confirm whether fixing it translates into referral traffic growth.
Refresh your tracking every quarter
The AI landscape moves fast. Grok doubled its publisher traffic within a single quarter in 2025. New platforms launch, referrer domains change, and the relative weight of each LLM shifts. Set a calendar reminder to review your regex list and channel group every 2-3 months.
Frequently Asked Questions
Can I track 100% of LLM referral traffic in GA4?
No. An estimated 20-30% of AI-driven visits will always appear as Direct traffic because some AI apps strip referrer headers entirely. However, combining a custom channel group with the Shadow AI segment and UTM-tagged links allows you to capture 70-80% of true LLM referrals, which is sufficient to identify trends and inform strategy.
Does GA4 automatically track ChatGPT traffic?
Not in a useful way. By default, GA4 lumps ChatGPT traffic into the generic Referral channel alongside every other non-search referrer. You need to create a Custom Channel Group with a regex filter and place it above the default Referral channel to isolate and measure it properly.
How do I know if a spike in Direct traffic is actually coming from AI tools?
Use the Shadow AI segment: filter for sessions where Session Source exactly matches (direct), New/Returning is New Users, and the landing page is not the homepage. A spike in this segment that coincides with a drop in organic search is a strong indicator that AI citation activity is increasing. It’s a proxy, not a definitive answer, but it’s the best available signal.
Which LLM sends the most referral traffic?
ChatGPT currently sends the highest volume of referral traffic for most sites. Perplexity AI punches above its weight in B2B contexts, accounting for up to 25% of AI-driven sign-ups for some SaaS brands. Gemini is growing rapidly on mobile. The mix varies significantly by industry and content type.
Is LLM traffic worth tracking if my numbers are small?
Yes. AI referrals currently average 0.14-0.17% of total traffic across most sites, but finance and B2B brands are already seeing 2-3% of sessions from AI sources. More importantly, AI-referred visitors convert at higher rates than social traffic and spend more time on the page. The volume is small now; the trajectory is not.
Do I need to edit the default GA4 channel group?
No. Editing the default channel group rewrites your historical data, which makes trend comparisons unreliable. Always create a new, separate channel group for AI / LLM Referrals and leave the default group intact.
How often should I update my LLM regex list?
Every 2-3 months at minimum. New AI platforms launch regularly, existing tools change their referrer domains, and the relative traffic contribution of each platform shifts quickly. Grok, for example, doubled publisher traffic within a single quarter in 2025. Treating your regex as “set and forget” will cause you to miss emerging sources.
What schema markup helps with AI citation?
FAQ schema and HowTo schema are the most impactful for LLM citation. Adding structured data that clearly labels questions and answers makes it significantly easier for AI models to extract and cite your content. Some estimates put the citation likelihood improvement at 30-40% for pages with well-implemented FAQ schema compared to unstructured content.
Can I track LLM traffic in Google Search Console?
Google Search Console tracks clicks from Google Search properties, including AI Overviews. It does not track traffic from third-party AI tools like ChatGPT, Perplexity, or Claude. For those sources, GA4 with a custom channel group is the right tool.
What is the difference between LLM referral traffic and AI Overview traffic?
AI Overview traffic (from Google’s search results) comes through Google Search and appears in GA4 as organic search traffic, not as a referral. LLM referral traffic comes from standalone AI tools like ChatGPT, Perplexity, and Claude, where users click a link in the AI’s response and land on your site. These are two distinct channels that require different measurement approaches.
Final Thoughts
LLM referral traffic is still a small slice of most websites’ total sessions, but the quality of those visitors and the pace of growth make it worth measuring now rather than later. The setup outlined in this guide takes less than an hour to implement and gives you a durable measurement framework that becomes more valuable as AI traffic grows.
The core takeaway: GA4 won’t do this for you automatically. Without a custom channel group, a regex-filtered exploration, and a shadow traffic segment, you’re making content and SEO decisions without visibility into one of the fastest-evolving acquisition channels in digital marketing.
Start with Step 1 today. Even a baseline check of your current Traffic Acquisition report will tell you whether AI tools are already sending visitors to your site. From there, each step builds on the last, and the whole setup can be live within an afternoon.
If you need help configuring GA4 for AI traffic tracking, building Looker Studio dashboards, or developing a content strategy that earns more LLM citations, the team at SaaSLinks works with businesses across Australia on exactly this. Get in touch to discuss your analytics setup.
