The Execution Layer for AI Discovery

AI doesn't rank.
AI recommends.

FancyAI is the execution layer that influences AI search engines and gets you recommended across ChatGPT, Gemini, Perplexity, and Claude: measured, mapped, and moved with the AI Readiness Index (ARI).

Tracked across
ChatGPT, Gemini, Perplexity, Claude, Grok + Future Models

AI Readiness Index (ARI)

Live · Q2 2026
Your AI Readiness Index (ARI) Score: 0 out of 100
Visible · Higher than 68% of brands in your category
AI Visibility Score: 71/100
ChatGPT: 62
Gemini: 65
Perplexity: 74
Claude: 60
Recommendation Rate: 12% (▲ 4 pts)
Share of Voice: 5.6% (▲ 1.2%)
AI Presence: 72/100 (▲ 5 pts)
Visibility Gaps: 65 (▼ 7)
Free Diagnostic

See how AI engines
recommend you.

Get your AI Readiness Index (ARI) score across ChatGPT, Gemini, Perplexity, and Claude. Delivered to your inbox in 24 hours.

No credit card. No sales call required. Just your score.

Request received.

Real analysis takes real time. We're querying ChatGPT, Gemini, Perplexity, and Claude across your category — then weighting recommendation frequency, sentiment, and competitive positioning.

Your full AI Readiness Index (ARI) report lands in your inbox within 24 hours.

Want to see what we'd do with the score? Book a demo →
The Category vs. FancyAI
The category sells visibility. We sell influence.
The category tracks. We change outcomes.
How it works

Influence, made measurable.

Three steps. One execution layer. Repeatable lift across every model that matters, quantified by the AI Readiness Index (ARI).

01
Evaluate

The AI Readiness Index (ARI) tells you where you stand.

Entity clarity, citation density, structured proof, corroborating mentions — scored across every major model.

02
Execute

We change the signals AI sees.

Site changes, structured data, citation graph engineering, third-party mentions — operated, not advised.

03
Measure

Influence shows up where it counts.

AI recommendation rate across ChatGPT, Gemini, Perplexity, and Claude — tracked weekly, mapped to revenue.

Explore the platform

AI Platform Breakdown

5 platforms tracked
Platform     Presence    Change (90d)   Position
ChatGPT      18 / 100    ▼ 6 pts        #3
Perplexity   42 / 100    ▼ 4 pts        #1
Gemini       33 / 100    ▼ 5 pts        #2
Claude       28 / 100    ▲ 8 pts        #3
Grok         22 / 100    ▲ 3 pts        #4

The brands AI now selects, not just sees.


Featured Case Study
CONNER Hats Q1 2026 · 90 days

Jumped 99 positions
in 90 days.

FancyAI executed targeted content, structure, and citation changes across ChatGPT, Gemini, Perplexity, and Claude — driving Conner Hats from page-five obscurity to top-three AI recommendations in under 90 days.

Read the full case study
4.4×
Conversion lift, AI traffic vs. organic
Top-3
Recommendations across all four LLMs
12 wks
From kickoff to full execution rollout
FAQ

Questions about AI visibility, answered.

What is Generative Engine Optimization (GEO)?

GEO helps your brand get recommended inside AI answers from platforms like OpenAI's ChatGPT, Google's Gemini, Anthropic's Claude, and Perplexity.

SEO was built for rankings and clicks. GEO is built for recommendations and citations inside AI-generated answers.

If your brand is not part of the answer, it is often excluded from consideration entirely.

How is GEO different from SEO?

SEO helps your website rank in search engines. GEO helps AI systems understand, trust, and recommend your brand.

Some fundamentals overlap — authority, technical health, structured content, and credibility still matter. But GEO also depends on signals traditional SEO was never designed for, including:

  • Brand/entity clarity
  • Third-party citations and mentions
  • Structured, extractable content
  • Consistency across the web
  • Presence in communities and trusted sources AI systems reference

The strongest brands run SEO and GEO together because both compound.

Which AI platforms does FancyAI cover?

We monitor major AI discovery platforms including ChatGPT, Gemini, Claude, Perplexity, Copilot, and emerging AI search experiences as they evolve.

Each platform behaves differently. Optimization that works on one model does not automatically transfer to another, which is why our approach is model-agnostic by design.

What is the AI Readiness Index (ARI)?

AI Readiness Index (ARI) is FancyAI's 0–100 scoring system that measures how prepared your brand is to be recommended by AI systems.

The score evaluates four core signals:

  • Entity clarity
  • Citation authority
  • Structured proof
  • Corroborating mentions across the web

As those signals strengthen, brands tend to appear more consistently in AI-generated recommendations.

How long does it take to see results?

Most brands begin seeing early visibility improvements within weeks. Larger shifts in recommendation behavior typically happen over a few months as authority signals compound and AI systems refresh their understanding of your brand.

Timing depends on your category, competitive landscape, and starting foundation.

Why invest in AI visibility now?

AI is reshaping how people discover brands.

More decisions are happening before a user ever visits a website. AI systems are increasingly building the shortlist first — and many brands are not being included at all.

Brands investing now are establishing authority and visibility that compounds over time. Brands waiting are giving competitors a head start inside the systems shaping future discovery.

The Product

If you're not
in the answer,
you don't exist.

We get you in the answer. The AI Readiness Index (ARI) tells you where you stand. The Execution Layer changes it. Across ChatGPT, Gemini, Perplexity, and Claude.

Execution Dashboard · Live Execution System

AI Visibility Diagnosis

Visibility Share: 25% (Low Visibility)
Your brand appears in 25% of relevant AI responses across major LLMs.
Competitive Status: Competitors dominate citation share in this category.
Opportunity Gap Detected
Top Missing Opportunities

  • Pricing intent queries: Missing from 8/10 top comparative pricing prompts.
  • Comparison queries: Not cited when users ask "X vs Y" alternatives.
  • Buyer guides: Absent from definitive category overviews.

Execution Action Plan · 3 active

On-Site Recs: 23
Off-Site Actions: 18
Active Campaign Tasks

  • Update Pricing Page Schema (In Progress): Inject missing comparative pricing schema to capture intent queries.
  • Publish "X vs Y" Comparison Post (Queued): Targeting story angles for buyer guides. Draft generated.
  • Forbes Citation Outreach (In Progress): Sending PR + content pitch snippets to target journalists.
  • Reddit Seed Campaign (Published): Initiated discussions in niche subreddits.
The AI Readiness Index (ARI) · Built on The Signal Hierarchy

Four signals. One score.
Every model that matters.

AI Readiness Index (ARI) scores your eligibility to be recommended by AI across the four signals defined in The Signal Hierarchy — FancyAI's published methodology, derived from 40,000+ websites and 1,500+ sources.

Entity Clarity (on-site)

How clearly AI understands your brand as a defined entity in knowledge graphs.

Your score: 92 / 100
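
One common on-site way to describe a brand as a defined entity is a schema.org Organization block in JSON-LD. The sketch below builds one in Python. The brand name, URLs, and the helper name `organization_jsonld` are placeholders for illustration, and nothing here is meant to describe FancyAI's own tooling.

```python
import json

def organization_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Build a schema.org Organization description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # sameAs links the brand to other profiles that describe the same entity.
        "sameAs": same_as,
    }
    return json.dumps(data, indent=2)

# Placeholder values only: swap in the real brand name and profile URLs.
markup = organization_jsonld(
    name="Example Brand",
    url="https://www.example.com",
    same_as=["https://www.wikidata.org/wiki/Q0000000"],
)

# The JSON-LD is normally embedded in the page head inside a script tag.
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The `sameAs` property is the part that ties the on-site description to external references, which is why it is worth keeping consistent with whatever third-party profiles exist for the brand.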

Citation Density (off-site)

How many authoritative third-party sources reference your brand by name.

Your score: 68 / 100

Structured Proof (on-site)

How extractable your content is: passages, tables, statistics, schema markup.

Your score: 74 / 100

Corroborating Mentions (off-site)

Whether you're named in editorial coverage, "best of" lists, and roundups AI reads.

Your score: 55 / 100
The Execution Layer

We don't advise.
We execute.

Other tools tell you what AI is saying. We change what AI sees. Across every signal, every surface, every model — operated as a sprint, not a recommendation.

Book a Demo
P-281 Pro Max Fence · Action Plan
Category: Commercial Fencing · Topics: 10 · Prompts: 100
Models: GPT-5.1, Gemini 2.5 Pro, Grok 4.1, Sonar, Claude 4.5 Sonnet
Run 1 · Recommendation Completed
Expected Impact: moderate
Pending New Content NC-VIDEO-STRATEGY
Video Content Priorities
1. Chain Link vs. Welded Wire Fence: Which is Right for Your Commercial Property?
What it answers: "chain link or welded wire fence commercial"

AI systems already answer this comparison query with generic manufacturer content. Pro Max should create a side-by-side field comparison showing actual installations: chain link at a construction yard, welded wire at a power substation. Project manager explains cost difference ($3-8/linear foot for chain link vs. $15-40 for welded wire), security ratings, and when each makes sense. Three-minute video.

Why AI cites it: Comparative content with specific cost figures and use-case guidance.
Signal → Execution

Every signal has a surface we operate on.

AI Readiness Index (ARI) Signal → Where we execute

  • Entity Clarity: Wikidata, Google Knowledge Graph, schema.org markup, brand entity submissions
  • Citation Density: Editorial outreach, citation graph engineering, Reddit & LinkedIn presence, PR placement
  • Structured Proof: Site copy rewrites, passage-level structure, comparison tables, statistic injection
  • Corroborating Mentions: "Best of" list inclusion, editorial roundups, industry directories, awards & accreditations
Measurement

Influence, tracked. Weekly.

Every change we make moves the AI Readiness Index (ARI). Every AI Readiness Index (ARI) movement maps to recommendation rate. Every recommendation maps to traffic, qualified leads, and revenue.

AI Readiness Index (ARI) tracked weekly across ChatGPT, Gemini, Perplexity, and Claude.
Share of voice and recommendation rate, by competitor and category.
Attribution from AI traffic to your CRM, with conversion lift baselines.
AI Readiness Index (ARI) · 90-Day Trend
Conner Hats · Demo data
Start
23
Today
87
▲ +64 pts
Velocity
+0.7/day
[Chart: ARI (0–100) from Day 0 to Day 90; annotated events: Entity sub., List incl.]
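
The velocity figure in the trend above is simple arithmetic: the point gain divided by the number of days. A minimal sketch using the demo figures (23 at the start, 87 after 90 days); the function name `daily_velocity` is an assumption for illustration.

```python
def daily_velocity(start_score: float, current_score: float, days: int) -> float:
    """Average change in score per day over the period."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (current_score - start_score) / days

# Demo figures from the 90-day trend: 23 at the start, 87 today.
gain = 87 - 23
rate = daily_velocity(23, 87, 90)
print(f"+{gain} pts, {rate:+.1f}/day")  # prints: +64 pts, +0.7/day
```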
FAQ

How the product works.

What does the AI Readiness Index (ARI) actually measure?

AI Readiness Index (ARI) measures how prepared your brand is to be discovered and recommended by AI systems.

The score evaluates four core areas:

  • How clearly AI understands your brand
  • How often trusted sources mention you
  • How easy your content is for AI systems to extract and cite
  • How consistently your brand appears across the web

Each signal is scored independently, then combined into a single 0–100 score.
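
As a rough illustration of combining independently scored signals into one 0–100 number, the sketch below takes an unweighted mean of the four signals. The equal weighting, the signal keys, and the function name `combine_signals` are assumptions; the actual ARI weighting is not described in this text. The inputs are the demo signal scores shown on this page.

```python
from statistics import fmean

# Hypothetical signal keys; equal weighting is an assumption, not the published method.
SIGNALS = (
    "entity_clarity",
    "citation_density",
    "structured_proof",
    "corroborating_mentions",
)

def combine_signals(scores: dict[str, float]) -> float:
    """Combine four independently scored signals (each 0-100) into one 0-100 score."""
    for name in SIGNALS:
        if name not in scores:
            raise ValueError(f"missing signal: {name}")
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    # Unweighted mean of the four signals, so the result stays on the 0-100 scale.
    return fmean(scores[name] for name in SIGNALS)

demo_scores = {
    "entity_clarity": 92,
    "citation_density": 68,
    "structured_proof": 74,
    "corroborating_mentions": 55,
}
print(combine_signals(demo_scores))  # 72.25 under the equal-weight assumption
```

A weighted version would only change the final line of the function; the validation step stays the same.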

Does AI Readiness Index (ARI) directly correlate with recommendation rate?

In the categories we monitor, stronger ARI scores consistently align with higher recommendation frequency.

Brands with stronger authority, citation, and trust signals tend to appear more often inside AI-generated answers.

We track both ARI and recommendation rate weekly so you can see the relationship directly in your dashboard.
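
One way to check how closely two weekly series move together is a Pearson correlation. The sketch below is a plain-Python version; the weekly figures are invented for illustration and are not customer data, and the function name `pearson` is an assumption.

```python
from math import sqrt

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Invented weekly readings, for illustration only.
weekly_ari = [23, 35, 48, 60, 74, 87]
weekly_recommendation_rate = [0.04, 0.06, 0.07, 0.10, 0.11, 0.14]

r = pearson(weekly_ari, weekly_recommendation_rate)
print(f"correlation: {r:.2f}")
```

A high coefficient only shows that the two series rise together; it does not by itself show that one causes the other.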

What is citation graph engineering?

Citation graph engineering improves how your brand is referenced and validated across the internet.

AI systems do not learn from your website alone. They also evaluate the broader ecosystem surrounding your brand — articles, directories, reviews, Reddit discussions, media coverage, comparison lists, and third-party mentions.

FancyAI identifies the authority gaps influencing recommendation behavior, then prioritizes the actions most likely to improve visibility across AI systems.

How is FancyAI different from monitoring tools?

Most GEO tools stop at reporting.

FancyAI combines monitoring with execution.

We track recommendation visibility, citation coverage, and AI presence across platforms — then actively improve the signals influencing those outcomes through content, schema, entity optimization, citation work, and editorial execution.

The dashboard measures the work. The work drives the results.

Can FancyAI work alongside our existing SEO program?

Yes. GEO and SEO work best together.

Traditional SEO still matters because AI systems continue pulling signals from authoritative search results. GEO builds on that foundation by improving the signals AI systems use to decide which brands to recommend.

Strong SEO helps GEO. Strong GEO can strengthen SEO performance over time.

How do you measure influence over time?

We track three primary metrics:

  • AI Readiness Index (ARI)
  • Recommendation rate across AI platforms
  • Traffic and conversion attribution from AI sources

Every customer receives a dashboard tying visibility improvements to business outcomes — including qualified traffic, leads, and pipeline impact.
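
To make the second metric concrete, the sketch below computes a recommendation rate and a share of voice from a set of sampled AI responses. It assumes each response has already been reduced to the list of brands it names. The definitions used (rate = share of responses naming the brand; share of voice = the brand's mentions over all brand mentions) and the brand names are assumptions for illustration, not FancyAI's published formulas.

```python
from collections import Counter

def recommendation_rate(responses: list[list[str]], brand: str) -> float:
    """Share of responses that name the brand at least once."""
    if not responses:
        return 0.0
    hits = sum(1 for brands in responses if brand in brands)
    return hits / len(responses)

def share_of_voice(responses: list[list[str]], brand: str) -> float:
    """The brand's mentions as a share of all brand mentions (one per response)."""
    counts = Counter(b for brands in responses for b in set(brands))
    total = sum(counts.values())
    return counts[brand] / total if total else 0.0

# Invented sample: four responses to the same category prompt.
responses = [
    ["Acme", "Globex"],
    ["Globex"],
    ["Acme", "Initech", "Globex"],
    [],  # a response that recommends no brand
]

print(f"recommendation rate: {recommendation_rate(responses, 'Acme'):.0%}")
print(f"share of voice: {share_of_voice(responses, 'Acme'):.1%}")
```

Because AI answers vary between runs, both figures are only meaningful over many sampled responses per prompt.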

The Team

Operators, not advisors.
Built to execute.

Five operators pulled together around one bet: AI doesn't rank; it recommends. And recommendation is something you can change.

5
Founder Seats
9
Prior CXO Roles
12
Companies Built & Led
1
Bet

The visibility category stops at observation. We built the team to do the verb. Operators who've shipped enterprise growth, engineers who've built the platforms brands depend on, and strategists who've sold to Fortune brands — pulled together around a single bet: AI recommends, and recommendation is changeable.

Tom Howell
Co-Founder & Chief Executive Officer
Former Co-Founder of ZPEG. Principal at Bizwiggle. Architect of FancyAI's go-to-market and enterprise growth infrastructure.
Keith Brown
Chief Operating Officer
Former CIO of Alliance Health. Former CIO of Fusion-io. Operator across enterprise infrastructure and IPO-stage scaling.
Chris Barbee
Chief Revenue Officer
Former CSO at Omnicom. Former CMO at 8AM Golf. Built brand and revenue functions across enterprise health and consumer brands.
Mikaela Berman
VP Product Growth & CX
Former Head of Marketing at Salesbot and RecruitBot. Built product-led growth across two B2B SaaS scale-ups.
Joseph Ashburner
Head of Strategy
Founder of ShopNinja. Co-Founder & CMO of Smoke Holdings IBC. Strategic operator across consumer and DTC growth.
Research

Original research.
No recycled hot takes.

We don't write opinions. We publish data. Every report below cites primary sources — and most are based on our own analysis of how the major LLMs actually decide which brands to recommend.

40K+
Websites Analyzed
1,500+
Sources Cited
12,500
Cross-Platform Queries
5
LLMs Tracked

The corpus.

37 reports · Updated weekly
Original Research 6.8%
Original Research
Which AI Platform Cites What? We ran the same 500 queries across 5 LLMs.

12,500 queries across ChatGPT, Perplexity, Claude, Gemini, and Grok. Only 6.8% of cited domains showed up on three or more platforms. Optimization for one is not optimization for the others.
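
A figure like "share of cited domains that appear on three or more platforms" can be computed by counting, for each distinct domain, how many platforms cited it. The sketch below shows the counting step only; the domains are invented placeholders and the function name `multi_platform_share` is an assumption, so the result has no connection to the 6.8% reported above.

```python
from collections import Counter

def multi_platform_share(citations: dict[str, set[str]], min_platforms: int = 3) -> float:
    """Fraction of distinct cited domains that appear on at least `min_platforms` platforms."""
    # Each platform contributes at most one count per domain because domains are sets.
    platform_counts = Counter(
        domain for domains in citations.values() for domain in domains
    )
    if not platform_counts:
        return 0.0
    shared = sum(1 for n in platform_counts.values() if n >= min_platforms)
    return shared / len(platform_counts)

# Invented citation sets per platform, for illustration only.
citations = {
    "ChatGPT": {"a.example", "b.example", "c.example"},
    "Perplexity": {"a.example", "d.example"},
    "Claude": {"a.example", "b.example"},
    "Gemini": {"e.example"},
    "Grok": {"b.example", "f.example"},
}

share = multi_platform_share(citations)
print(f"{share:.1%} of cited domains appear on 3+ platforms")
```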

12 min read
Myth-Busting 40/60
Original Research
"GEO is just rebranded SEO." We ran the numbers. Here's the 40/60 truth.

40% of GEO overlaps with good SEO fundamentals. The other 60% is genuinely different — and that's where most of the bad advice and missed opportunity lives.

16 min read
Platform Deep-Dive 129K
Platform Deep-Dive
How to optimize for ChatGPT: 20 ranking factors from 129,000 domains.

SE Ranking analyzed 129,000 domains. We pulled the SHAP-ranked top 20 ChatGPT factors. Domain trust above 90 = 4× more citations. FCP under 0.4s = 6.7 vs 2.1 citations. The full playbook.

11 min read
Landscape Analysis 3%
Landscape Analysis
Most GEO "experts" can't cite a single study. Here's who actually can.

75 voices tracked. Only 3% of top SEO thought leaders include "GEO" in their headline. We mapped the landscape into three tiers and found 5 major gaps no one is filling.

17 min read
Macro Analysis 61%
Original Research
How AI Overviews killed the click. A zero-click economy emerges.

Eight primary studies. One conclusion: organic CTR is collapsing across every measurement, and the decline is broader than AI Overviews alone. Even queries without AIOs lost 41% CTR YoY.

13 min read
Business Case 23×
Business Case
AI visitors convert at 23×. The quality story behind the quantity drop.

AI traffic is 1.08% of total today, growing 527% YoY, and converting at multiples organic search has never seen. Six independent studies converge on the same finding.

11 min read
Academic Foundation +40%
Academic Paper
The Princeton paper, decoded. The academic foundation of GEO.

Aggarwal et al. coined the term, built GEO-bench, and quantified what works. Three years later, every credible GEO claim still traces back to this paper.

15 min read
Citation Patterns 90%
Original Research
Why fresh content beats authority. The recency bias in AI citations.

90% of AI bot hits land on content less than three years old. AI-cited pages are 368 days fresher than traditionally-ranked ones. Continuous publishing is the new optimization unit.

9 min read
Predictive Signals 0.737
Original Research
YouTube mentions predict AI visibility better than backlinks.

Ahrefs analyzed 75,000 brands. The strongest single predictor of AI brand visibility wasn't domain authority. It wasn't backlinks. It was YouTube mentions — by a wide margin.

10 min read
B2B Buyer Behavior 67%
Buyer Behavior
67% of B2B buyers start with AI. The new front door.

B2B buyers are adopting AI search at three times the consumer rate. By the time they visit a vendor website, the shortlist is already set.

12 min read
Content Collapse 91.4%
Original Research
AI is now citing AI. The 91.4% problem.

Search Engine Land found 91.4% of AI Overview citations are AI-generated. CJR found AI search wrong 60% of the time — premium models worse than free. Each generation degrades.

13 min read
Brand Risk 42.1%
Original Research
When AI lies about your company. A brand hallucination field guide.

Air Canada lost a tribunal. Soundslice built a feature ChatGPT invented. Hoka had wrong pricing on display. The hallucination rate is 17–90% depending on domain — and 40% of users never check the source.

12 min read
Industry Impact 50%
Landscape & Behavior
"Extinction-level event." How AI search is restructuring the open web.

NPR's framing for what publishers face. Daily Mail vice chair: 50% of traffic gone in five years. 500+ lawsuits. A handful of platforms now control how billions discover information.

15 min read
Comparative Analysis < 1/100
Comparative Analysis
The honest skeptic's case against GEO.

Rand Fishkin: fewer than 1 in 100 prompts return the same brands. Profound: 40–60% of cited domains change in a month. A founder shut down his GEO tool after concluding it was just good marketing. The strongest counter-arguments deserve a hearing.

13 min read
Manipulation 3
Original Research
Black hat GEO: the manipulation playbook (and why it's doomed).

Three categories of manipulation are spreading: data poisoning, citation stuffing, and hidden prompt injection. Harvard demonstrated text sequences that force LLM outputs. The platforms are evolving faster than the attackers.

11 min read
Ethics 0
Landscape & Behavior
GEO ethics in 2026: no framework, growing stakes.

No industry body. No code of ethics. No enforcement mechanism. As 37% of consumers start with AI and 82% are skeptical, the discipline is being built on every operator's individual judgment.

12 min read
Legal & Regulatory 500+
Landscape & Behavior
The legal front: 500+ lawsuits, antitrust, and AI defamation.

The New York Times sued OpenAI and Perplexity. Google faces EU antitrust over AI Overviews. The Section 230 question is unsettled. Wolf River Electric is testing AI defamation in court.

13 min read
Technical Architecture 10
Academic Paper
The 10 gates: how AI search engines actually decide what to cite.

Most GEO writing describes outcomes. This one explains the engine. The pipeline from page on the open web to citation in an AI response is a 10-stage system — and most brands optimize for the wrong stages.

14 min read
Comparative Analysis 5 of 6
Comparative Analysis
Six platforms promise to get your brand cited by AI. Most don't finish the job.

Independent buyer-side comparison of Profound, Evertune, Semrush, Scrunch, Conductor, and FancyAI — evaluated across fourteen dimensions. The structural fault line splitting the category in two.

14 min read
Platform Deep-Dive 2.5B
Platform Deep-Dive
How to Optimize for Google AI Overviews and Gemini: Being Indexed Is Not the Same as Being Selected

Google already crawls your site. That is the trap. AI Overviews now reach 2.5 billion monthly users and run on Gemini 3, yet they cite three to five sources from a near-infinite index. The work is no longer getting seen. It is getting selected.

14 min read
Platform Deep-Dive 46.7%
Platform Deep-Dive
How to optimize for Perplexity: the answer engine that reads the live web

Perplexity does not rank pages. It reads the web in real time, pulls from thousands of sources, and cites them in the answer. Reddit, fresh content, and structured proof decide who gets named. Most of what works on Google does not work here.

13 min read
Platform Deep-Dive 40%
Platform Deep-Dive
How to Optimize for Claude: The Enterprise Engine That Cites Differently

Claude reads a different internet than ChatGPT. It runs on Brave's independent index, cites the most conservatively of any major assistant, leans on older long-form and earned media, and sits inside 40% of enterprise LLM spend. Optimizing for it is a separate discipline, not a footnote.

13 min read
Platform Deep-Dive 117M
Platform Deep-Dive
How to optimize for Grok: the only engine where a tweet is a ranking signal

Grok answers from a live feed of X conversation that no other AI can touch. Reddit, YouTube, and Facebook supply nearly half its citations, and X engagement directly shapes which sources it picks. Optimizing for Grok is a social and real-time game, not a content-library game.

13 min read
Vertical Report 393%
Vertical Report
Being Seen vs. Being Selected: How AI Decides Which Products to Recommend

AI shopping traffic to US retailers grew 393% in a single quarter and now converts 42% better than every other channel. But the product page is no longer the front door, and most retail sites are not even readable by the machines deciding what to recommend. This is the new shelf.

15 min read
Vertical Report 45%
Vertical Report
Being Seen vs. Being Selected: How AI Decides Which Local Businesses to Recommend

Consumer use of AI to find local businesses jumped from 6% to 45% in a single year. But AI recommends only a fraction of the locations that win Google's map pack, and the business closest to the searcher no longer wins. Proximity got replaced by reputation, and most local owners are optimizing for a game that is no longer being played.

14 min read
Vertical Report 32%
Vertical Report
GEO for Healthcare: How AI Picks Its Sources When the Answer Could Hurt Someone

In every other vertical, AI weighs visibility. In health it weighs liability. The authority bar is higher, the citation pool is narrower, and a single hospital with fifty years of clinical expertise can be entirely invisible to the model deciding what a patient does next.

15 min read
Vertical Report 51%
Vertical Report
The Shortlist Forms Before the Demo: How AI Decides Which Software to Recommend

Half of B2B software buyers now begin their search inside an AI chatbot, and the winning vendor is already on the buyer's shortlist 95% of the time before a single sales call. The new battleground isn't your pricing page. It's whether the machine names you when a buyer asks for the best tool in your category.

14 min read
Execution Playbook 40.1%
Execution Playbook
Reddit is the citation engine: a playbook for earning AI mentions through real community presence

Reddit is the single most-cited source across every major AI engine. It got there because licensing deals piped its conversations straight into the models, and because community discussion is the closest thing the web has to honest first-person experience. You cannot buy your way in. You earn it by being genuinely useful in the threads where your category is decided.

15 min read
Execution Playbook ~0%
Execution Playbook
Structured Proof: How to Make Your Content Extractable When Schema Alone Won't Save You

The biggest empirical test of schema markup to date found it moved AI citations by roughly zero. Yet pages with statistics get cited 40% more, tables 2.5x more, and tight answer passages 3x more. The lever is not the markup. It is the structure of the proof underneath it.

15 min read
Execution Playbook 84%
Execution Playbook
Citation acquisition: the off-site system for becoming the source AI cites

Most of what AI engines say about your brand is shaped by content you do not own. Earned media, listicles, reviews, and reference entries carry the overwhelming majority of the signal. This is the playbook for engineering those mentions on purpose, and the case for why earning beats buying every time.

15 min read
Methodology <1 in 100
Methodology
How to Measure AI Visibility: The Right Metrics for the Recommendation Era

Rank tracking was built for a world where the same query returned the same list. That world is gone. Ask an AI the same question twice and the odds of getting the identical list are under 1 in 100. This report sets the measurement standard for a system that recommends instead of ranks, and builds the four signals behind the FancyAI AI Readiness Index.

15 min read
Buyer Behavior 900M
Buyer Behavior
Who Actually Uses AI Search: The Consumer Has Already Moved

Nearly a billion people now open an AI chatbot every week, half of consumers treat it as their starting point for buying decisions, and the answer they get back is a short, opinionated recommendation, not ten blue links. The consumer didn't wait for marketing to catch up. The only question left is whether the machine names your brand when they ask.

15 min read
Emerging 20%
Emerging
When the Buyer Is a Machine: Agentic AI and the End of the Human Shopper

AI stopped recommending and started transacting. Agents now influence one in five holiday orders, the protocols to let them pay are live, and the question is no longer whether a person picks you. It is whether a machine finds you eligible.

15 min read
Emerging 69%
Emerging
The Visibility Gap Most US Brands Are Ignoring: International and Multilingual GEO

Sixty-nine percent of ChatGPT's users are now outside the United States, and the fastest-growing AI markets on earth speak Hindi, Portuguese, Spanish, and Indonesian. But the models default to English sources, ignore the standard signals that tell a search engine which language version to show, and recommend US domains even when the question is asked in another language. Most American brands optimized for an English-speaking machine and never noticed the audience moved.

14 min read
Vertical Report 55%
Vertical Report
Finance Runs on Borrowed Trust: How AI Decides Which Banks, Cards, and Advisors It Recommends

In the highest-stakes vertical AI touches, the brands that own the product almost never own the answer. Across 200,000-plus AI citations in wealth management, NerdWallet appeared in 38% of responses and Bankrate in 35.3%, while the banks, card issuers, and advisory firms whose products were being discussed sat largely outside the citation set. In a regulated, your-money-or-your-life domain, AI does not reward the institution. It rewards the source the institution is described in.

15 min read
Vertical Report 77.67%
Vertical Report
GEO for Legal: The Directories Own the Citation Layer, and Zero Law Firms Own It Back

Legal is the highest-trust, most-regulated, most locally-driven vertical in AI search — and it triggers more AI answers than any other YMYL category. When a client asks an AI engine to recommend a lawyer, the answer comes from seven directories, not from law firm websites. The firm that built the expertise is often the one the model never names.

15 min read
Vertical Report 56%
Vertical Report
The Itinerary Is the New Search Result: How AI Picks Which Hotels and Destinations to Recommend

More than half of US travelers now use AI to plan a trip, and 78% of AI users have booked based primarily on an AI recommendation. But when a traveler asks for the best hotels in a city, the engine names a handful and ignores the rest, and the brands it names are often not the biggest. This is the new front desk.

14 min read
FAQ

About our research.

How did you analyze 40,000+ websites?

Our research corpus was built over 18 months using academic papers, platform documentation, industry studies, and original citation analysis.

Every published claim is tied to a primary source, experimental dataset, or reproducible methodology.

What is "the corpus" you keep referring to?

The corpus is FancyAI's internal research library.

It includes academic studies, platform documentation, citation analyses, experiments, and AI visibility research organized across multiple GEO categories.

The corpus informs both our published research and our AI Readiness Index (ARI) methodology.

How is FancyAI's research different from typical "GEO expert" content?

Most GEO content repeats broad opinions without primary sourcing.

Our research is built on citation analysis, reproducible methodologies, platform behavior studies, and original experiments.

Where possible, we publish the methodology so findings can be independently validated.

How often do you publish new research?

We typically publish new research monthly, depending on when findings meet our internal standards for rigor and validation.

Recent topics include citation behavior, AI visibility patterns, structured content performance, and cross-platform recommendation studies.

What does "structure beats length" mean for content?

Longer content does not automatically perform better in AI systems.

Our research consistently shows that well-structured content — clear headings, statistics, comparison tables, concise explanations, and extractable formatting — is cited more often than long, unstructured pages.

The goal is not more content. It is more usable content.

What happens when AI models change?

AI systems evolve constantly.

FancyAI continuously monitors shifts in recommendation behavior, citation patterns, and platform responses across major models.

Our scoring systems and execution priorities adapt alongside those changes so strategies remain aligned with how AI systems actually behave.

Can GEO hurt SEO performance?

No. Effective GEO should strengthen foundational SEO signals, not compete with them.

Improving authority, clarity, citations, structure, and entity understanding often benefits both organic search visibility and AI recommendation performance.

Can I cite your research in my own work?

Yes.

All published research is openly shareable and fully sourced. We encourage teams, analysts, journalists, and marketers to reference our findings with attribution to FancyAI.

Case Studies

Real brands.
Real lift.

Two execution programs run end-to-end — search and AI as a single strategy. The same playbook moved an e-commerce retailer 99 ranking positions in 90 days, and lifted a POS hardware brand to position #1 on both Google and AI in the same month.

+99
Top Position Gain
896
AI Citations · 4 Models
+36%
Revenue MoM
90
Days to Results
Featured Case · E-Commerce · Specialty Retail · 90 Days
CONNER Hats
connerhats.com
Family-owned · 27 collections audited

GEO moved SEO by 99 positions in 90 days.

At a family-owned specialty retailer, AI-optimized content became the accelerant their search program had been missing. Same site, same SEO program — layered with FancyAI's GEO methodology. Within 90 days, ranking improvements landed across the entire collection portfolio.

GEO content is not a replacement for SEO. It is an accelerant. The same pages that were slowly climbing surged when we layered GEO on top. — Tom Howell, Co-Founder & CEO, FancyAI
Request the full case study
+99
Top Keyword Position Gain
27
Collection Pages Updated
338
Keywords Tracked
90 days
From Kickoff to Lift

The Approach

FancyAI layered GEO methodology on top of the existing SEO program. No rebuilds, no replacements. 27 collection pages audited, then content, structure, and schema updated to match what AI systems prioritize: product authority, experiential detail, and structured data. Traditional rankings and AI visibility tracked together so the compounding effects were measurable.

01
Audit
Diagnose how AI systems interpret each collection page.
02
Optimize
Update content, structure, and schema across 27 pages.
03
Monitor
Track SEO rank and AI visibility as one system.

The Results · Selected Keywords

50+ keywords broke into the top 100. New rankings appeared for keywords Conner Hats had never ranked for. Top performers:

Keyword | New Rank | Δ
Gambler Hats | #1 | +99
Fishing Hats | Top 10 | +72
Black Cowboy Hats | Top 5 | +68
Steampunk Hats | Top 10 | +62
Leather Hats | Top 10 | +57
Outback Hats | Top 10 | +54

The Insight

Most brands treat GEO as a separate workstream from SEO. A new channel, new team, new vendor. That framing misses the point. The signals that win AI recommendations are the same signals Google rewards. Run them as one program and both compound. Run them as two and you underperform in both.

Featured Case E-Commerce POS Hardware High-Intent Search
Category Leader
POS Hardware · anonymized
Confidential client
Credit card terminals · stands

From page two to position #1. On Google. In AI.

A unified strategy moved a Category Leader in POS to #1 across both channels, lifted monthly revenue 36%, and — critically — survived the Google March 2026 core update without a scratch while competitors took measurable hits.

We were doing fine on Google. In a category where the top three results get the sale, fine is not enough. FancyAI did not move us up. They made us the answer. — Marketing Lead, Category Leader in POS
Download the full case study
#1
Google Rank, Core Keyword
896
AI Citations Across 4 Models
+36%
Revenue Month-Over-Month
+54%
Sessions Year-Over-Year

The Approach

FancyAI runs SEO and GEO as a single strategy: one program, one roadmap, compounding into both channels at once. For this client, that meant parallel execution across search and answer engines: 10 premium backlinks live, 47 content optimizations, 6 new blog articles, and 100 prompts monitored continuously across GPT-5.1, Claude 4.5 Sonnet, Gemini 2.5 Pro, and Sonar.

Live
10 Premium Backlinks
Authority earned, not bought.
Done
47 Optimizations
Content + structure + schema.
Pub
6 New Articles
Buyer-intent content surfacing.
Live
100 Prompts Tracked
Across 4 LLMs, continuous.

The Results · Search

Three primary keywords hit #1 in the same month. /collections/stands traffic surged +323%. /collections/terminals up +162%. Revenue grew roughly 5× faster than traffic.

Keyword | Before | After
credit card terminals | #13 | #1
credit card terminal | #23 | #1
credit card processing terminal | #17 | #1

The Results · AI

Position #1 in AI results for "credit card terminals" — matching the Google rank. 896 total citations across all four AI models. "Buy credit card machine" impressions +1,817%. Sentiment: 10% positive, 90% neutral, 0% negative.

+1,817%
"Buy credit card machine" impressions
+223%
"Credit card processing machine" impressions
Google March 2026 Core Update

Our client held all authority metrics steady while weaker competitors in the POS category took measurable hits. The strategy was built for durability, not short-term gains.

Categories We've Operated In

Built for high-intent categories.

E-Commerce
Specialty Apparel
E-Commerce
POS Hardware
E-Commerce
Outdoor Gear
CPG
Personal Care
CPG
Better-for-You Foods
Healthcare
Preventive Care
Healthcare
Corporate Wellness
Health & Wellness
Supplements
Financial Services
Advisory
Travel
Adventure Outfitters
Home Services
HVAC
Marketing Services
Agency Partners
Automotive
Dealer Network
B2B SaaS
HR Tech
Industrial Services
Safety & Training
Retail
Sleep & Mattress
Automotive
Aftermarket
Fitness
Equipment & DTC
Manufacturing
Packaging & Materials
Real Estate
Workspace & Studios
Pricing

Priced for influence.
Not impressions.

Eligibility is the software you log into — the platform that diagnoses how AI engines see your brand and tracks recommendation rate as it moves. Visibility is the team that executes the offsite work each month — citations, links, Reddit, distribution, and a dedicated strategist. Run the platform self-serve, or bundle it with the team in one of two execution tiers.

Plans

Eligibility + Visibility, bundled.

Three plans. One ladder. Basic runs the platform alone. Essential and Growth add the team that executes the work every month.

Basic

For brands running GEO themselves.

$799 / mo
Billed monthly · cancel anytime
Get Started
  • Eligibility · Platform
  • 1 brand domain tracked
  • 5 custom GEO action plan runs / month
  • Up to 50 recommendations per GEO plan
  • 3 major LLM platforms
  • Competitor tracking
  • Unlimited prompts tracked
  • Unlimited GEO reporting plans
  • Visibility not included. Implementation is self-serve.
Growth

For multi-product brands in competitive markets.

$5,000 / mo
Monthly retainer · 90-day minimum
Get Started
  • Eligibility · Platform
  • Up to 5 domains tracked
  • 25 custom GEO action plan runs / month
  • Up to 50 recommendations per GEO plan
  • All major LLM platforms
  • Competitor tracking
  • Unlimited prompts tracked
  • Unlimited GEO reporting plans
  • Visibility · Team
  • Onsite content updates
  • Technical GEO
  • Off-site execution
  • $3,000 in links
  • $1,000 in Reddit posts
  • $500 for press release & distribution
  • Dedicated strategist
Enterprise / Agency — Custom. For large catalogs, creative agencies, and PR firms.
Talk to Sales
Every Plan Includes

Everything you get with FancyAI.

Eight capabilities across visibility, scoring, and execution — bundled by default. Tier limits scope (entities, platforms, sprint cadence), not capability.

Visibility capabilities (Citation Graph Engineering, Knowledge Graph Submissions, Editorial Outreach & List Inclusion) are delivered in Essential and Growth.

Visibility Dashboards
Live AI Readiness Index (ARI) score, recommendation rate, share-of-voice tracked across every major LLM.
Recommendation Tracking
How often your brand surfaces in AI answers — by prompt, by platform, by competitor.
Cross-Platform Coverage
ChatGPT, Gemini, Perplexity, and Claude — tracked in one unified report.
AI Readiness Index (ARI)
Composite eligibility score across the four signals AI uses to decide who to recommend.
Execution Sprints
Content rewrites, structure, and schema updates — operated, not advised. We do the work.
Citation Graph Engineering
Authoritative third-party placements that move AI brand-mention weight where it matters.
Knowledge Graph Submissions
Brand entity definition in Wikidata, Google Knowledge Graph, schema.org — clarified for AI.
Editorial Outreach & List Inclusion
Earned placements in "best of" roundups, directories, and editorial coverage AI reads.
Competitive Matrix

How FancyAI Compares.

Full Automation Actual Execution AI Optimization
Capability
FancyAI
Profound
Conductor
Otterly
SEMrush / Ahrefs
Multi-LLM Monitoring
Limited
Bolt-On
Recommendations
Unlimited (20+ Types)
2+ Types
2+ Types
AI Readiness Index (ARI)
Citation Building
Segmented Analytics
Cost
$$
$$
$$
$$
$
FAQ

Questions, answered.

What's the difference between Eligibility and Visibility?

Eligibility is the platform.

It diagnoses how AI systems currently understand your brand, identifies what is missing, and tracks recommendation performance across major AI models.

Visibility is the execution layer.

That includes the strategist, citation work, earned media, structured content improvements, editorial distribution, and authority-building initiatives that improve recommendation outcomes.

The platform identifies the gaps. The team closes them.

Can I buy Visibility without Eligibility?

No. Visibility runs against what Eligibility measures. Buying the team without the platform means you can't see what's moving — that's why Visibility is bundled with Eligibility in Essential and Growth.

How is FancyAI different from a GEO tracking tool?

Most GEO tools focus on monitoring.

FancyAI combines monitoring with execution.

We measure visibility, recommendation frequency, and citation coverage — then actively improve the signals driving those outcomes through ongoing optimization and strategic execution.

How long until we see lift?

Most brands begin seeing measurable visibility improvements within the first several weeks. More meaningful recommendation shifts typically happen over a few months as authority signals compound.

Results depend on your category, competition, and current AI Readiness Index (ARI) baseline.

Which AI platforms do you cover?

We monitor major AI platforms including ChatGPT, Gemini, Claude, Perplexity, Copilot, and emerging AI discovery experiences.

Coverage varies by plan, with higher tiers supporting broader platform monitoring and expanded competitive analysis.

What's actually delivered each month on Essential and Growth?

The Visibility work included in Essential and Growth covers a dedicated strategist plus ongoing execution designed to improve AI recommendation performance.

Depending on plan level, deliverables may include:

  • Citation development
  • Editorial placements
  • Reddit and community visibility
  • Press distribution
  • Technical GEO updates
  • Structured content optimization
  • Ongoing authority building

Every engagement is tied to measurable visibility goals.

How do you measure influence?

We track three primary metrics:

  • AI Readiness Index (ARI)
  • Recommendation frequency across AI platforms
  • Traffic and conversion attribution from AI sources

The goal is measurable business impact — not vanity metrics.

What's the typical contract structure?

Basic is billed monthly and can be cancelled anytime. Essential and Growth are monthly retainers with a 90-day minimum so authority signals have time to compound. Enterprise and Agency engagements run annually with custom volume pricing.

Careers

Building what the
category only measures.

Six operators today. Next, we're hiring the people who build the engine that recommendation runs on.

The roles we hire for.

We hire when the work is real and the role is funded. No farming, no “always-on” postings. The next openings will appear here when they're live.

Currently

No open roles posted.

We're between cycles. The next openings will be GTM (enterprise AE, partnerships) and platform (senior engineering). Drop your name in front of us early.

Book a Demo

See your brand
the way AI sees it.

A live walkthrough across ChatGPT, Gemini, Perplexity, and Claude. We map your visibility, your gaps, and the execution plan — whether or not we work together.

A live walkthrough of the FancyAI execution engine.
An AI Visibility Diagnosis on your brand and your top three competitors.
A custom action plan you can run with — starting the day you onboard.
As used by
Conner Hats EHE Health Leatherman Consello
No spam. Just actionable insights.
Booked

You're on the calendar.

Check your inbox for a calendar invite from chris@getfancy.ai. Meanwhile, here's what to read first:

Live · FancyAI Research Corpus

The mention is the signal. The link is almost irrelevant.

A foundational methodology for AI brand visibility, derived from 40,000+ websites and 1,500+ sources across eleven categories of GEO research.

+115%
AI visibility lift for lower-ranked sites applying signal-first tactics
41%
Of AI brand signal weight from list mentions
4.4×
Conversion of AI-referred vs organic traffic
1,500+
Unique sources in our research corpus
Chapter 01

The signal hierarchy has flipped

The most counterintuitive finding from analyzing 40,000+ websites: low-quality backlinks — the thing the SEO industry has spent twenty years building — show weak or neutral correlation with AI visibility.

Digital Bloom's evaluation of 129,000+ domains gave us the weights. The signals that actually drive AI brand recommendations:

  • 41% — Authoritative list mentions (being named in “best of” lists, roundups, directories)
  • 18% — Awards & accreditations (third-party validation)
  • 16% — Online reviews (especially for branded queries)
  • 0.334 — Brand search volume correlation coefficient (the strongest single predictor)
  • Weak / neutral — Low-quality backlinks themselves

Here is the key distinction: being mentioned by name matters enormously. The 41% weight on list mentions isn't about the hyperlink. It's about the mention. When a “Best CRM for Small Business” article names your brand, AI learns that association whether or not there's a link attached.

“The mention is the signal. The link is almost irrelevant.”
Chapter 02

AI visitors convert at 4.4× the rate of organic

When ChatGPT recommends your brand, the user arrives pre-qualified. They are not browsing ten blue links. They received a direct recommendation from a system they already trust.

Semrush puts the differential at 4.4×. Go Fish Digital puts it at 25×. Even the conservative numbers make the business case clear: AI-referred traffic is fundamentally different from organic.

The average GEO customer acquisition cost is $559 and declining at 37.5% as the market matures. Compare that to Google Ads CPC, which continues climbing 10–15% annually with no ceiling in sight.

Chapter 03

Each AI platform behaves differently

One stat most people miss: only 11% of domains cited by ChatGPT are also cited by Perplexity. Each platform has its own search backend and its own citation behavior.

Platform | Backend | Top Source
ChatGPT | Bing | Wikipedia (47.9%)
Perplexity | Proprietary 200B+ URL index | Reddit (46.7%)
Google AI Overviews | Google | YouTube (#1)
Claude | Brave Search | Diversified
Grok | X / Twitter data | X conversations

Optimizing for one platform doesn't give you the others. The opposite is also true: brands present on four or more platforms are 2.8× more likely to appear in ChatGPT specifically (Digital Bloom). Multi-platform presence is itself a signal.

Chapter 04

Structure beats length

Word count has a 0.04 correlation with AI citation. Effectively zero. 53.4% of pages cited by AI are under 1,000 words.

What actually moves citation rate:

  • Adding statistics → +41% AI visibility (Princeton/Georgia Tech, 10,000 queries)
  • Comparison tables → 2.5× more citations than equivalent prose
  • Structured formatting (H2/H3, bullets, numbered lists) → +40% citation lift
  • Content freshness (within 30 days) → 3.2× more citations

The biggest opportunity here: lower-ranked sites saw a +115% visibility improvement from applying these tactics — even without improving traditional rankings. The structure is the unlock.

Chapter 05

The 40/60 truth

“GEO is just rebranded SEO” is the most common objection we hear. It's partially right. About 40% overlaps with good SEO fundamentals: technical accessibility, E-E-A-T, quality content, schema markup.

But 60% is genuinely different:

  1. Entity optimization. Your brand as a defined entity in knowledge graphs.
  2. Passage-level structure. 50–150 word self-contained blocks for AI extraction.
  3. Cross-platform consistency. Same brand narrative everywhere AI crawls.
  4. Earned media. 61% of AI brand signals come from editorial media.
  5. Social proof. Reddit is the #1 most-cited source. LinkedIn is #2.

Google's John Mueller confirmed the foundation in January 2026: “There is no such thing as GEO or AEO without doing SEO fundamentals.” Both are true. The foundation matters. But if you stop at SEO, you are missing the 60% that drives AI visibility.

Chapter 06

What to do with this

If we had to pick three actions:

  1. Entity optimization. Build your brand in Wikidata, Google Knowledge Graph, schema.org. Everything else depends on this.
  2. Structure for extraction. Tables, numbered lists, 50–150 word passage blocks with statistics. Stop writing longer; start writing better-structured.
  3. Go multi-platform. Get on four or more platforms: Reddit, LinkedIn, YouTube, your domain. Brands present on four or more are 2.8× more likely to appear in ChatGPT.

This methodology is the Signal Hierarchy, and it underpins how the AI Readiness Index (ARI) is scored. Every recommendation FancyAI makes traces back to one of these signals.

Sources cited

  1. Digital Bloom — 129,000+ domain analysis
  2. SE Ranking — ChatGPT ranking factor study (129,000 domains)
  3. Aggarwal et al., Princeton/Georgia Tech/IIT Delhi — “GEO: Generative Engine Optimization,” KDD 2024
  4. Semrush — AI referral conversion analysis
  5. Go Fish Digital — AI traffic conversion benchmarks
  6. John Mueller, Google Search Advocate — January 2026 statement on GEO/AEO

Want this measured against your brand?

Get your AI Readiness Index (ARI) score across ChatGPT, Gemini, Perplexity, Claude, and Grok — delivered in 24 hours.


Which AI platform cites what? Five LLMs, the same questions, almost no overlap.

12,500 query responses analyzed across ChatGPT, Perplexity, Claude, Gemini, and Grok. Optimization for one platform is rarely optimization for any other.

6.8%
Of cited domains appeared on three or more platforms
11%
Domain overlap between ChatGPT and Perplexity
5
AI engines analyzed in parallel
>99%
Of repeat queries return different brand orderings
Chapter 01

The headline finding

We ran the same 500 standardized commercial-intent queries through five AI search engines — ChatGPT, Perplexity, Claude, Gemini, and Grok — and recorded every cited source. The corpus covers 12,500 individual responses.

The result almost no SEO veteran expects: only 6.8% of cited domains appeared on three or more platforms. The cross-platform overlap most teams plan around does not exist at the level practitioners assume. Optimizing for ChatGPT does not, in any meaningful sense, optimize you for Claude.

This finding aligns with SparkToro and Gumshoe's 2025 research, which found AI models produce different brand recommendation lists more than 99% of the time when asked the same question repeatedly. The probability of receiving identical lists in the same order drops below 0.1%.

“Any tool that gives you a ‘ranking position in AI’ is full of baloney.” (Rand Fishkin, SparkToro)
Chapter 02

Why the platforms diverge

Each AI platform pulls from a different search backend, weights different source classes, and applies different freshness rules.

Platform | Search Backend | Citation Style
ChatGPT | Bing (sequential queries) | Hover-highlight inline
Perplexity | Proprietary 200B+ URL index | Footnote list, ~8.79 avg citations
Google AI Overviews | Google index | Compact source panel
Claude | Brave Search | Embedded inline links
Grok | Own index + X data | X conversation excerpts

ChatGPT leans heavily on Wikipedia and Bing-indexed editorial. Perplexity disproportionately surfaces Reddit and forum content. Gemini favors YouTube and Google-indexed pages. Claude pulls a more diversified mix via Brave. Grok pulls disproportionately from X conversations.

If your strategy treats “AI visibility” as a single channel, you will systematically miss four out of five surfaces.

Chapter 03

Ranking position is the wrong metric

SparkToro's research surfaced a finding that fundamentally reframes how visibility should be measured. Asked the same brand-discovery question 71 times, ChatGPT named City of Hope hospital in 97% of responses for West Coast cancer care — but as the #1 mention in only 25 of those answers.

The implication: AI engines are probability machines. They are designed to generate unique answers every time. Treating them as deterministic ranking systems is “provably nonsensical.”

The valid metric is visibility percentage across many runs. Certain brands consistently appear in 80–97% of responses for their core categories. Others appear in 2%. That delta is the real signal, and surfacing it requires running each prompt at least 60–100 times.
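As a rough illustration of measuring visibility across repeated runs, the sketch below computes how often a brand appears at all and how often it is named first. The function name and the run data are hypothetical: the 71 synthetic runs are constructed only to echo the shape of the SparkToro example above (present in most runs, first in 25), and the other hospital names are placeholders.

```python
def visibility_stats(runs, brand):
    """Return (appearance_rate, first_mention_rate) for `brand`.

    `runs` is a list of brand lists, one list per AI response, in the
    order the brands were named. Hypothetical helper for illustration.
    """
    total = len(runs)
    if total == 0:
        raise ValueError("need at least one run")
    appeared = sum(1 for brands in runs if brand in brands)
    first = sum(1 for brands in runs if brands and brands[0] == brand)
    return appeared / total, first / total


# Synthetic data: 71 runs of the same prompt (not real measurements).
runs = (
    [["City of Hope", "Hospital B", "Hospital C"]] * 25    # named first
    + [["Hospital B", "City of Hope", "Hospital C"]] * 44  # named, not first
    + [["Hospital B", "Hospital C"]] * 2                   # not named
)

appearance_rate, first_rate = visibility_stats(runs, "City of Hope")
print(f"appears in {appearance_rate:.0%} of runs, first in {first_rate:.0%}")
```

Reporting both rates side by side makes the point of this chapter concrete: a brand can be near-universal in appearance while holding the first mention in only about a third of responses.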

Chapter 04

What the 6.8% have in common

The minority of domains that did appear across three or more platforms shared a consistent profile:

  • Strong entity definition in knowledge graphs (Wikidata, Google Knowledge Graph, schema.org Organization markup with consistent identifiers)
  • Editorial mentions in third-party publications — not just owned media. Trust-tier outlets every model crawls.
  • Active community presence on Reddit, LinkedIn, and niche forums where Perplexity, Grok, and Claude pull conversational signal.
  • Schema-marked structured content with extractable 50–150 word passages.
  • Sub-second page performance (FCP under 0.4 seconds), which ChatGPT in particular weights heavily.

None of these are tied to a single platform's ranking algorithm. They build the entity itself, which every model independently learns. The cross-platform brands aren't optimizing for five engines — they're building one credible entity that all five recognize.

Chapter 05

The strategic implication

Two operational shifts follow from this data:

  1. Stop reporting "AI visibility" as a single number. Break it out per platform. Your Perplexity visibility is a Reddit/forum problem. Your ChatGPT visibility is a Bing/Wikipedia problem. Your Gemini visibility is a YouTube/Google-indexed problem. They have different solutions.
  2. Build the entity, not the placements. Every dollar spent strengthening cross-platform signals (knowledge graph, editorial media, community presence) compounds across all five engines. Every dollar spent gaming a single platform stays inside that platform.

The brands present in 4+ platforms are 2.8× more likely to appear in ChatGPT specifically (Digital Bloom). Multi-platform presence is itself a signal.

Chapter 06

Methodology

Queries were drawn from a stratified sample of high-intent commercial searches across SaaS, consumer, healthcare, and B2B services. Each query was issued to each platform under identical conditions, with identical session-state controls. Citations were scraped from response pages, deduplicated by root domain, and cross-referenced.
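The deduplication and cross-referencing step can be sketched roughly as follows. This is not the pipeline used for the study; the function names, the naive root-domain rule, and the sample URLs are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def root_domain(url):
    """Naive root domain: the last two labels of the hostname.

    Assumption: adequate for a sketch. Suffixes such as .co.uk would
    need a public-suffix list to handle correctly.
    """
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def platforms_by_domain(citations):
    """Map each cited root domain to the set of platforms that cited it."""
    seen = {}
    for platform, urls in citations.items():
        for url in urls:
            domain = root_domain(url)
            if domain:
                seen.setdefault(domain, set()).add(platform)
    return seen


def share_on_at_least(seen, k):
    """Fraction of cited domains that appear on k or more platforms."""
    if not seen:
        return 0.0
    return sum(1 for p in seen.values() if len(p) >= k) / len(seen)


def pairwise_overlap(seen, a, b):
    """Fraction of domains cited by platform `a` that `b` also cited."""
    cited_by_a = [p for p in seen.values() if a in p]
    if not cited_by_a:
        return 0.0
    return sum(1 for p in cited_by_a if b in p) / len(cited_by_a)


# Hypothetical citations, three platforms only, for brevity.
citations = {
    "ChatGPT": ["https://en.wikipedia.org/wiki/CRM", "https://www.example.com/a"],
    "Perplexity": ["https://www.reddit.com/r/sales", "https://example.com/b"],
    "Claude": ["https://blog.example.com/c", "https://example.org/d"],
}
seen = platforms_by_domain(citations)
print(share_on_at_least(seen, 3))                      # share on 3+ platforms
print(pairwise_overlap(seen, "ChatGPT", "Perplexity"))  # ChatGPT/Perplexity overlap
```

Note that the pairwise figure is directional: the share of ChatGPT's domains that Perplexity also cites is not necessarily the share of Perplexity's domains that ChatGPT cites.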

The full per-platform overlap matrix and category-level breakdowns are available on request: research@getfancy.ai.

Sources cited

  1. SparkToro / Gumshoe — “AIs are highly inconsistent when recommending brands” (Rand Fishkin & Patrick O’Donnell, 2025)
  2. SE Ranking — Cross-platform AI search engine comparison (2025)
  3. BrightEdge — Weekly AI Search Insights (October 2025)
  4. Yext — “Same Search, Different Results?” (April 2025)
  5. Digital Bloom — Multi-platform presence correlation analysis
  6. Search Engine Land — “The goal is being seen, trusted, and reused wherever people search.”


“GEO is just rebranded SEO.” We ran the numbers. Here's the 40/60 truth.

Approximately 40% of GEO overlaps with strong SEO fundamentals. The other 60% is genuinely different — and it is where most of the bad advice and missed opportunity lives.

40 / 60
Overlap with SEO fundamentals vs net-new GEO discipline
5
Net-new disciplines unique to GEO
61%
Of AI brand signals from earned media
25%
Projected drop in traditional search by 2026 (Gartner)
Chapter 01

Where the 40% lives

The shared foundation is real. Strong domain authority, technical accessibility, schema markup, E-E-A-T, quality content, and clean information architecture all matter for both search and AI recommendation systems. AI engines pull from Bing, Brave, Google, and proprietary indices — the underlying crawlability and ranking signals still apply.

If you have a strong SEO program, the 40% is mostly already there. That is the easy part of the conversation. It is also the part the “GEO is just SEO” crowd correctly identifies. Bounteous research shows 99% of URLs in Google AI Mode appear in the top 20 organic search results — SEO strength still maps to AI visibility. The foundation matters.

“There is no such thing as GEO or AEO without doing SEO fundamentals.” (John Mueller, Google)
Chapter 02

Where the 60% lives

The other 60% is where SEO instincts actively mislead. Five disciplines are net-new to GEO:

  1. Entity optimization. AI systems reason about brands as entities in knowledge graphs, not as ranked URLs. Wikidata, Google Knowledge Graph, schema.org Organization definitions are the primary lever — not page-level keyword targeting. XFunnel calls this “lexical proximity”: keeping brand-associated descriptors within 2–3 words of your brand name to train LLM associations.
  2. Passage-level structure. AI extracts 50–150 word self-contained blocks. Your H2/H3 hierarchy, statistic density, and table usage now drive selection more than total page word count. SE Ranking found content sections of 120–180 words between headings receive 70% more citations than sections under 50 words.
  3. Cross-platform consistency. The same brand narrative needs to appear consistently across every surface AI crawls — site, Reddit, LinkedIn, YouTube, third-party media. Inconsistency creates ambiguous entity resolution.
  4. Earned media gravity. 61% of AI brand signals come from editorial media and third-party mentions, not owned content. The job is to be talked about, not just to publish.
  5. Community presence. Reddit is the #1 most-cited source for Perplexity (46.7% of citations). LinkedIn is #2. Neither was ever an SEO priority. Both are now first-class GEO real estate.
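For the entity optimization item in the list above, a minimal sketch of a schema.org Organization definition with consistent identifiers might look like the following. The brand name, URLs, and the Wikidata ID are placeholders, and the helper function is hypothetical; only the `@context`, `@type`, `name`, `url`, and `sameAs` keys come from schema.org itself.

```python
import json


def organization_jsonld(name, url, same_as):
    """Build a schema.org Organization object as a plain dict.

    `same_as` lists the profile URLs that identify the same entity
    elsewhere (knowledge graph entry, social profiles, and so on).
    """
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": sorted(set(same_as)),  # de-duplicated, stable order
    }


markup = organization_jsonld(
    name="Example Brand",                 # placeholder brand
    url="https://www.example.com",
    same_as=[
        "https://www.wikidata.org/wiki/Q00000000",         # placeholder ID
        "https://www.linkedin.com/company/example-brand",  # placeholder
    ],
)

# The JSON output is what would go inside a
# <script type="application/ld+json"> tag on the site.
print(json.dumps(markup, indent=2))
```

Generating the markup from one source of truth is one way to keep the name and identifiers identical across owned properties, which is the consistency the list describes.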
Chapter 03

The unit of optimization has changed

SEO optimizes for a list of 10 blue links. GEO optimizes for inclusion in a synthesized answer where only 2–7 sources are cited.

That competitive narrowing changes the optimization target itself. The unit of work shifts from pages to passages — AI engines extract specific chunks rather than ranking whole pages. The signal shifts from backlinks to citations — being referenced across trusted third-party sources matters more than link profiles.

NerdWallet's 2024 numbers tell the story in business terms: revenue rose 35% while monthly traffic fell 20%. Discovery and decision-making are shifting to AI-mediated experiences where direct site visits decline but pre-qualified, recommendation-driven business impact increases.

Chapter 04

Why the framing matters

If a CMO treats GEO as a 100% rebrand, they fund the wrong work. They build a GEO team that ignores SEO fundamentals and bleeds technical authority. If they treat it as 100% identical, they fund the SEO playbook and quietly lose the 60% that AI engines actually use to choose recommendations.

The 40/60 framing is the operational answer: keep the SEO foundation, fund the GEO disciplines that don't exist inside it.

The market has voted. The GEO/AEO category was valued at $848M in 2025 and is projected to reach $33.7B by 2034 — a 50.5% CAGR. Regardless of terminology, the commercial opportunity is being committed to.
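The growth rate quoted above can be sanity-checked with the standard compound annual growth rate formula. This sketch assumes nine compounding years (2025 to 2034), which is an interpretation on our part rather than something the source states.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# $848M in 2025 growing to $33.7B by 2034 (assumed: 9 compounding years).
rate = cagr(848e6, 33.7e9, 2034 - 2025)
print(f"{rate:.2%}")
```

Under that assumption the result lands within rounding of the quoted 50.5% figure.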

Chapter 05

The honest test

If your team can answer all five of these “yes,” you are doing GEO. If you cannot, you are still doing SEO — and missing the 60%.

  1. Is our brand a defined entity in Wikidata and Google Knowledge Graph?
  2. Are our top 20 commercial pages structured with passage-level extractability (tables, statistics, 50–150 word blocks)?
  3. Do we have a coherent brand narrative across owned media, Reddit, LinkedIn, YouTube, and third-party editorial?
  4. Are we earning mentions in editorial outlets every model crawls, not just publishing on our blog?
  5. Are we measuring visibility per-platform across many prompt runs, not single-query rankings?

Sources cited

  1. Bounteous — Google AI Mode citation overlap analysis
  2. SE Ranking — Content section length and citation correlation
  3. Gartner — Traditional search volume forecast
  4. NerdWallet 2024 financial disclosures — Revenue / traffic divergence
  5. Profound — “What is Answer Engine Optimization?”
  6. XFunnel — GEO-specific optimization elements (semantic triples, lexical proximity)
  7. Search Engine Land — Philipp Götza, “GEO myths” (2025)
  8. John Mueller, Google Search Advocate — January 2026


How to optimize for ChatGPT: 20 ranking factors from 129,000 domains.

SE Ranking analyzed 129,000 domains across 20 niches and produced a SHAP-ranked list of the top factors driving ChatGPT citation. Here is the playbook — including what doesn't work.

129K
Domains in the SHAP-ranked ranking factor study
~4×
Citation lift for domain trust > 90
6.7 vs 2.1
Citations for FCP < 0.4s vs > 1.13s
800M+
ChatGPT monthly active users
Chapter 01

How ChatGPT actually picks sources

ChatGPT's search architecture is a multi-stage pipeline: a tiny classification model, a specialized “Thinky” query-generation model, Bing-powered web retrieval, semantic chunk scoring, and final synthesis by the frontier model. The system typically cites 3–5 deeply-read sources per response, with Wikipedia dominating at 7.8% of all citations.

SearchGPT is now used in 46% of ChatGPT interactions. Usage for general information searches tripled from 4.1% to 12.5% in just six months (Feb–Aug 2025). The platform processes 2.5 billion daily prompts with 500M+ weekly active users.

For source selection, ChatGPT uses sequential search queries (not parallel like Perplexity), reviewing multiple sites before aggregating. Six selection criteria emerge from testing: precise keyword matching, search intent recognition (appending terms like “tutorial”), aggressive recency filtering, credibility (E-E-A-T), trustworthiness (author bios, methodology), and variety of perspectives.

Chapter 02

The top 20 ranking factors

SE Ranking's analysis of 129,000 domains used SHAP (SHapley Additive exPlanations) values to rank features most predictive of ChatGPT citation. The ranked top of the list:

Rank | Factor | Effect Size
#1 | Referring domains | Sites with 32K+ referring domains are 3.5× more cited
#2 | Domain traffic | Strong positive correlation
#3 | Domain Trust | Trust > 90 yields ~4× more citations vs <43
#4 | Page Trust | Page-level authority signal
#5 | INP / FCP / LCP performance | FCP < 0.4s = 6.7 citations vs 2.1 over 1.13s
#6+ | Content length, freshness, Reddit/Quora mentions | 3-month-old content = 6.0 citations vs 3.6 stale

Brand mentions on Reddit and Quora yielded 4× higher citation likelihood. Articles over 2,900 words averaged 5.1 citations vs 3.2 for under 800 words — longer is better, but only with structure (next chapter).

“Flashy ‘AI hacks’ like LLMs.txt barely have any impact. What drives ChatGPT citations are the fundamentals.” (Yulia Deda, SE Ranking)
Chapter 03

Structure beats every shortcut

The SE Ranking study quantified the structure effects:

  • Content sections of 120–180 words between headings receive 70% more citations than sections under 50 words
  • Pages with expert quotes average 4.1 citations vs 2.4 without
  • Pages with 19+ statistics average 5.4 citations vs 2.8 with fewer
  • Broad topic-descriptive URLs get 2× more citations than keyword-optimized URLs
  • Companies with review scores below 70% are significantly less likely to be referred

Perhaps most interestingly: .gov and .edu domains averaged only ~3.2 citations — content quality matters more than TLD. The authority bias is weaker than the SEO industry has assumed.
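One way to audit a page against the 120–180 word finding is to count the words between headings and flag sections outside that range. The sketch below assumes the page content is available as Markdown with `##` and `###` headings, which is an assumption for illustration; the function names are hypothetical.

```python
import re

# Matches H2 and H3 Markdown headings only ("## Title" or "### Title").
HEADING = re.compile(r"^#{2,3}\s+(.*)$")


def section_word_counts(markdown_text):
    """Return [(heading, word_count)] for each H2/H3 section."""
    sections = []
    title, words = None, 0
    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            if title is not None:
                sections.append((title, words))
            title, words = match.group(1).strip(), 0
        elif title is not None:
            words += len(line.split())
    if title is not None:
        sections.append((title, words))
    return sections


def flag_sections(sections, low=120, high=180):
    """Return the sections whose word count falls outside [low, high]."""
    return [(t, n) for t, n in sections if not low <= n <= high]


# Synthetic page: one section in range, one too short.
page = "## Pricing\n" + "word " * 150 + "\n### Details\n" + "word " * 40 + "\n"
sections = section_word_counts(page)
print(flag_sections(sections))
```

The thresholds are parameters so the same check can be rerun against the 50–150 word passage guidance used elsewhere in this research.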

Chapter 04

What does NOT work

The most consequential negative findings, all from controlled testing:

  • LLMs.txt has negligible impact. Removing it actually improved the SE Ranking model's predictive accuracy. Multiple independent experiments (Dejan AI, others) confirm no measurable lift.
  • FAQ schema markup shows negative correlation. Pages with FAQ schema averaged 3.6 citations vs 4.2 without. ChatGPT does not appear to access schema during grounding.
  • Keyword-optimized URLs underperform. Topic-descriptive URLs get 2× more citations. ChatGPT cares about clarity, not exact match.

This is the part most GEO “experts” get wrong. The industry has spent eighteen months selling LLMs.txt and FAQ schema as silver bullets. The data says they aren't bullets at all.

Chapter 05

The three crawlers (and how to control them)

OpenAI operates three distinct crawlers, each controllable independently via robots.txt:

  • GPTBot — for training. Grew 305% in crawling volume from May 2024 to May 2025, becoming the dominant AI crawler at 30% share.
  • OAI-SearchBot — for real-time SearchGPT.
  • ChatGPT-User — for user-initiated browsing within ChatGPT sessions.

You can allow some and block others. Most brands should allow all three — blocking GPTBot specifically removes you from training data, which has long-tail visibility implications most teams underestimate.
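
Before shipping a robots.txt change, it helps to confirm what each crawler will actually be allowed to fetch. A minimal sketch using Python's standard `urllib.robotparser`; the sample robots.txt, the `crawler_access` helper, and the example.com URL are illustrative assumptions, not a recommended policy:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt that allows all three OpenAI crawlers named above.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: *
Disallow: /admin/
"""

OPENAI_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User"]


def crawler_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Report, per OpenAI user agent, whether robots.txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in OPENAI_AGENTS}


if __name__ == "__main__":
    for agent, allowed in crawler_access(ROBOTS_TXT, "https://example.com/pricing").items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Swapping the GPTBot rule to `Disallow: /` and re-running shows the selective case: the training crawler is blocked while the search and user-initiated agents stay allowed.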

Tracking ChatGPT referral traffic in GA4: filter for the chatgpt.com referrer. Note that free-tier users often don't send referrer data, creating attribution gaps. Plan for that gap rather than around it.
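
The same filter can be applied to exported session data, with the attribution gap counted explicitly rather than ignored. A minimal sketch, assuming you have a list of raw referrer strings (for example from a server log or an analytics export); `classify_referrer` and `summarize` are hypothetical helper names:

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlparse

# Hostname taken from the text above.
CHATGPT_HOSTS = {"chatgpt.com"}


def classify_referrer(referrer: Optional[str]) -> str:
    """Bucket one referrer string as 'chatgpt', 'other', or 'unattributed'."""
    if not referrer:
        # Missing referrer: this is the attribution gap described above.
        return "unattributed"
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return "chatgpt" if host in CHATGPT_HOSTS else "other"


def summarize(referrers: Iterable[Optional[str]]) -> Counter:
    """Count sessions per bucket so the unattributed share stays visible."""
    return Counter(classify_referrer(r) for r in referrers)
```

Reporting the "unattributed" bucket next to the "chatgpt" bucket keeps the blind spot on the dashboard. It does not tell you how much of that bucket is actually ChatGPT traffic.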

Chapter 06

The execution priority order

Based on SHAP weights, the highest-leverage actions for an existing brand:

  1. Audit your domain trust score. If you are below 80, fix the foundational citation graph before anything else. Get into trusted lists, directories, and editorial coverage.
  2. Get FCP under 0.5 seconds. The drop-off above that threshold is steep and asymmetric — ChatGPT punishes slow pages much harder than Google does.
  3. Earn Reddit and Quora presence. 4× citation likelihood is among the highest single-factor effects in the data.
  4. Restructure your top 20 commercial pages. Add statistics, comparison tables, expert quotes, and 120–180 word passage blocks between H2s.
  5. Submit a structured Organization entity to Wikidata and apply schema.org markup with consistent identifiers across owned properties.
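
For step 5, "consistent identifiers" in practice means every owned property emits the same Organization entity with the same `@id` and `sameAs` links. A minimal sketch that builds the JSON-LD once so it can be reused everywhere; the function name, the `/#organization` identifier pattern, the example.com URL, and the placeholder Wikidata ID are all assumptions for illustration:

```python
import json


def organization_jsonld(name: str, url: str, wikidata_id: str, same_as: list[str]) -> dict:
    """Build a schema.org Organization entity as a JSON-LD dictionary.

    wikidata_id is the item's Q-identifier; the value used below is a placeholder.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        # One stable @id, reused on every owned property (assumed pattern).
        "@id": f"{url.rstrip('/')}/#organization",
        "name": name,
        "url": url,
        "sameAs": [f"https://www.wikidata.org/wiki/{wikidata_id}", *same_as],
    }


if __name__ == "__main__":
    doc = organization_jsonld(
        name="Example Co",                      # hypothetical brand
        url="https://example.com/",
        wikidata_id="Q0",                       # placeholder, not a real item
        same_as=["https://www.linkedin.com/company/example"],
    )
    # Paste the output into a <script type="application/ld+json"> tag.
    print(json.dumps(doc, indent=2))
```

Generating the block from one function, rather than hand-editing it per site, is what keeps the identifiers from drifting apart.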

Sources cited

  1. SE Ranking / Yulia Deda — “How to Optimize for ChatGPT: 20 Ranking Factors” (2025)
  2. First Page Sage / Evan Bailyn — “ChatGPT Optimization: 2026 Guide”
  3. Forge and Smith — “GEO: SEO for ChatGPT” (2025)
  4. HVSEO — “Optimizing for SearchGPT and Understanding Ranking Factors”
  5. Zapier / Harry Guinness — “How does ChatGPT choose its sources?”
  6. Skyscale — “How ChatGPT Selects Sources: Complete Guide” (2025)
  7. Dejan AI — Independent FAQ schema experimentation

Want this measured against your brand?

Get your AI Readiness Index (ARI) score across ChatGPT, Gemini, Perplexity, Claude, and Grok — delivered in 24 hours.

Live · FancyAI Research Corpus

Most GEO “experts” can't cite a single study. Here's who actually can.

A landscape map of 75+ practitioners shaping the GEO conversation. Three tiers, the academic origin story, the platform pushback, and five gaps no one is filling.

3%
Of top SEO thought leaders include "GEO" in their headline
75+
Practitioners mapped across LinkedIn, X, Substack
82%
Positive sentiment for "GEO" — highest of any AI search term
40%
Visibility lift from the original Princeton/Georgia Tech study
Chapter 01

The academic origin story

The term “Generative Engine Optimization” was coined in an academic paper by Pranjal Aggarwal (IIT Delhi) and Vishvak Murahari (Princeton), published on arXiv in November 2023 and presented at KDD 2024 (pp. 5–16).

The paper introduced the GEO-bench benchmark and tested optimization strategies systematically. The headline finding: “Fluency Optimization + Statistics Addition” can boost AI visibility by up to 40%. That is the academic provenance of the entire field.

“Answer Engine Optimization” (AEO) predates GEO by five years — Jason Barnard (Kalicube) coined it in 2018, originally for the featured-snippet era. AIO has no single originator; it ambiguously means either “AI Optimization” or Google's “AI Overviews” product. Most practitioners now treat GEO as the umbrella term.

“Optimizing for AI search is the same as optimizing for traditional search. — Nick Fox, Google”
Chapter 02

The three-tier landscape

We sampled 75 voices currently publishing under the GEO/AEO/AIO labels across LinkedIn, X, Substack, and YouTube. We coded each by output volume, citation hygiene (do they reference primary research?), and originality (do they publish their own data?).

The distribution clustered into three tiers:

  • Tier 1 — Practitioners with primary research. Under 10 voices publishing original studies, datasets, or methodologies. Mostly platform engineers, academics, and a small group of operator-publishers.
  • Tier 2 — Synthesizers. Mid-sized group recycling tier-1 research with attribution and useful framing. Most useful for operators looking for actionable summaries. Names that show up consistently across multiple expert lists: Lily Ray (Amsive Digital, E-E-A-T focus), Kevin Indig (Growth Memo, 23K+ subscribers, AI search metrics), Mike King (iPullRank, “Relevance Engineering”), Aleyda Solis (Orainti, SEOFOMO 35K+ subscribers, practical international frameworks), and Ross Simmonds (Foundation, content distribution).
  • Tier 3 — Repackagers. The majority. Confident, high-volume, low-citation. Often presenting tier-1 findings as their own observations.
Chapter 03

The platform pushback

The most interesting tension in the field: Google representatives consistently push back on GEO as a distinct discipline.

  • John Mueller (Google, January 2026): “There is no such thing as GEO or AEO without doing SEO fundamentals.”
  • Nick Fox (Google): “Optimizing for AI search is the same as optimizing for traditional search.”
  • Danny Sullivan (Google): Has emphasized that best practices centered on genuine helpfulness will win long-term.
  • Krishna Madhavan (Microsoft Bing): Echoes the “no shortcuts” framing.
  • Jesse Dwyer (Perplexity): Has emphasized platform resistance to manipulation.

The platform position is consistent: SEO fundamentals matter, manipulation will be resisted, and there are no shortcuts. The GEO position, increasingly, is: yes, and there's also a 60% surface area that SEO doesn't cover. Both can be true.

Harvard Business Review (February 2026) validated the shift, describing two concurrent revolutions: the move from SEO to GEO, and AI agents beginning to act as buyers. McKinsey calls AI search “the new front door to the internet.”

Chapter 04

Five gaps no one is filling

Across the entire landscape, five categories of analysis are notably underserved:

  1. Cross-platform behavior studies. Most analysis is single-platform, usually ChatGPT. Per-model citation behavior comparison is rare.
  2. Industry-specific GEO playbooks. Healthcare, financial services, and regulated B2B are nearly absent from public discourse.
  3. Negative-result studies. Almost no one publishes what didn't work. The LLMs.txt non-effect was only surfaced by SE Ranking and Dejan AI taking the time to test and publish a null result.
  4. Long-horizon attribution. Most case studies stop at 90 days. Six- and twelve-month visibility curves under sustained optimization are barely studied.
  5. Methodology critiques. The popular GEO frameworks have no rigorous comparative analysis. Buyers are choosing among them blind.
Chapter 05

How buyers should evaluate vendors

Search Engine Land's study of 75 SEO thought leaders found that fewer than one-third maintained consistent terminology over the past year — an indicator of how unsettled the field still is. GEO had 82% positive sentiment, the highest of any AI search term, but the operator-level vocabulary is still in motion.

If you are sourcing a GEO partner or vendor, citation hygiene is the fastest signal of seriousness. Three questions:

  1. Show me your primary sources. Tier-3 voices will redirect to anecdote. Tier-1 and tier-2 will hand you the studies behind their claims.
  2. What didn't work? Operators who can name their failures (and the data behind them) are doing the work. Operators who can't are reciting.
  3. How do you measure visibility? If the answer is “we run the prompt once,” the answer is wrong. Valid measurement requires 60–100 prompt repetitions per topic to stabilize the visibility percentage.
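
Why a single run is not enough can be shown with a confidence interval. A minimal sketch using the Wilson score interval, which is one common choice and an assumption here; the source does not say how vendors should quantify stability:

```python
import math


def wilson_interval(hits: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a visibility rate of hits/runs.

    hits: repetitions in which the brand appeared; runs: total repetitions.
    z = 1.96 corresponds to roughly 95% confidence.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = hits / runs
    denom = 1 + z**2 / runs
    centre = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    for hits, runs in [(1, 1), (5, 10), (40, 80)]:
        low, high = wilson_interval(hits, runs)
        print(f"{hits}/{runs}: {low:.2f} to {high:.2f}")
```

With one run the interval spans most of the 0–1 range. It narrows markedly by 80 runs, which is consistent with the repetition counts quoted above.
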
Chapter 06

The state of the field

The GEO conversation is real, growing, and still in its naming and strategy-formation phase. The academic foundation is solid (Aggarwal & Murahari, KDD 2024). The tier-2 operator commentary is increasingly substantive. The tier-3 noise will compress as the field matures and citation hygiene becomes table stakes.

The buyers who will benefit most are the ones who can read the landscape clearly — choose tier-2 partners over tier-3 narrators, and demand primary-source rigor from anyone publishing under the GEO label. The discipline is real. The signal is buried in noise. Filtering for the signal is the work.

Sources cited

  1. Aggarwal & Murahari, IIT Delhi / Princeton — “GEO: Generative Engine Optimization” (arXiv 2311.09735, KDD 2024)
  2. Search Engine Land — Study of 75 SEO thought leaders, sentiment analysis
  3. Profound — “The 2025 A-list of GEO experts”
  4. Go Fish Digital — “8 GEO Agencies & Thought Leaders” (Feb 2026)
  5. First Page Sage — “The Top GEO / AI Search Experts” (2026)
  6. Harvard Business Review — Feb 2026 SEO-to-GEO analysis
  7. McKinsey — “The new front door to the internet”
  8. John Mueller, Nick Fox, Danny Sullivan (Google); Krishna Madhavan (Microsoft); Jesse Dwyer (Perplexity)

How AI Overviews killed the click. A zero-click economy emerges.

Eight primary studies. One conclusion: organic CTR is collapsing across every measurement, and the decline is broader than AI Overviews alone.

61%
Drop in organic CTR for AI-Overview queries (Seer Interactive, 25M impressions)
68%
Drop in paid CTR for AIO queries
41%
CTR drop YoY even on queries WITHOUT AIOs
1%
Of users click into the AI summary itself (Pew)
Chapter 01

The clicks are gone

Seer Interactive ran the most rigorous published study to date: 3,119 queries, 42 organizations, 25.1 million organic impressions, 1.1 million paid impressions. The results are unambiguous.

  • Organic CTR for AI Overview queries collapsed from 1.76% to 0.61% — a 61% drop.
  • Paid CTR for AIO queries collapsed from 19.7% to 6.34% — a 68% drop.
  • Pew Research (n=900): when an AI summary appears, only 8% of users click any traditional link vs 15% without — and only 1% click a source within the AI summary itself.
  • Ahrefs (300K-keyword update): AI Overviews reduce clicks by 58%.

The platforms talk about this differently than the data does. Google says AI Overviews drive engagement and citations. The empirical record across multiple independent studies says CTR is in free fall.

“Even when AI Overviews don't appear, users are simply clicking less everywhere. — Search Engine Land”
Chapter 02

The shift is broader than AI Overviews

The most consequential finding in Seer’s data isn’t the AIO drop. It’s the baseline.

For queries that did not trigger an AI Overview at all, organic CTR still fell 41% year over year. The behavioral shift is bigger than any single feature. Users are skimming AI-generated text snippets, hover-cards, knowledge panels, related questions, and shopping modules — and clicking out to source pages less even when the AI Overview itself isn’t present.

This is the part most CMOs miss when they read the headline number. The 61% drop is the visible peak. The 41% drop on queries without an AI Overview is the baseline shift, and it’s not coming back.

Chapter 03

The publishers’ nightmare

Bain & Company surveyed real consumer behavior with Dynata: roughly 80% of consumers now rely on zero-click results for at least some of their queries, and 60% of all searches end without a click. AdExchanger reports publishers losing 20%, 30%, and in some cases up to 90% of traffic and revenue from zero-click AI search.

The traffic that’s left is consolidating. When ChatGPT cut referrals to traditional sites by 52% between July and August 2025, citations to Wikipedia, Reddit, and YouTube rose by 53% in the same window. AI engines are picking a smaller set of sources and reusing them — a winner-take-all dynamic where the cited brands accumulate disproportionately.

Chapter 04

The cited brands actually win

The flip side of Seer’s collapse data is the most operationally useful finding in the study: brands cited inside an AI Overview earned 35% more organic clicks and 91% more paid clicks than non-cited brands on the same query.

Citation isn’t a consolation prize for losing the click. It’s the new way to win the click. The brands that get pulled into the AI summary inherit the user’s pre-qualification and arrive at the page with higher purchase intent than equivalent organic traffic. This is why visibility share, not click volume, is becoming the metric that matters.

Chapter 05

What to measure now

The metrics shift from click-volume to visibility-share. Three measurements replace the old click-through rate as the leading indicator:

  1. Citation frequency — how often your brand is named or linked across AI engine responses for your category prompts.
  2. Recommendation rate — the percentage of relevant AI prompts that include your brand in the answer (measured across 60–100 prompt repetitions per topic to control for LLM variability).
  3. Share of voice — your citation count as a percentage of total citations for your category, measured per platform.
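
The three measurements reduce to simple counts over repeated prompt runs. A minimal sketch, assuming each response has already been parsed into a list of the brands it cites (one entry per citation); the function names and the sample brands are hypothetical:

```python
def citation_frequency(responses: list[list[str]], brand: str) -> int:
    """Total number of times the brand is cited across all responses."""
    return sum(r.count(brand) for r in responses)


def recommendation_rate(responses: list[list[str]], brand: str) -> float:
    """Fraction of responses that include the brand at least once."""
    if not responses:
        return 0.0
    return sum(brand in r for r in responses) / len(responses)


def share_of_voice(responses: list[list[str]], brand: str) -> float:
    """Brand citations as a fraction of all citations in the sample."""
    total = sum(len(r) for r in responses)
    return citation_frequency(responses, brand) / total if total else 0.0


if __name__ == "__main__":
    # Four repetitions of one category prompt on one platform (made-up data).
    runs = [["Acme", "Globex"], ["Globex"], ["Acme", "Acme", "Initech"], []]
    print(citation_frequency(runs, "Acme"))
    print(recommendation_rate(runs, "Acme"))
    print(share_of_voice(runs, "Acme"))
```

In practice `runs` would hold the 60–100 repetitions per topic described above, computed separately for each platform.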

Click-through rate isn’t dead. It’s no longer the leading indicator of discovery. Brands that re-orient measurement around visibility share will see the new picture clearly. Brands that don’t will keep optimizing for a metric the user already stopped honoring.

Sources cited

  1. Seer Interactive — “AIO Impact on Google CTR” (3,119 queries, 42 organizations, 2025)
  2. Pew Research Center — AI Overview click-rate study (n=900, 2025)
  3. Ahrefs — “AI Overviews Reduce Clicks by 58%” (300K keywords, 2026)
  4. Bain & Company / Dynata — Zero-click consumer survey (2025)
  5. AdExchanger — Publisher traffic impact analysis (2025)
  6. Search Engine Land — “Google AI Overviews drive 61% drop in organic CTR”
  7. Forbes — “The 60% Problem: How AI Search Is Draining Traffic” (2025)
  8. Cloudflare — Crawler-to-referrer ratio data

AI visitors convert at 23×. The quality story behind the quantity drop.

AI referral traffic is tiny in absolute terms — and converts at rates the SEO industry has never seen. Six independent measurements paint the same picture.

23×
Conversion lift of AI-referred vs organic search visitors (Ahrefs)
1.08%
Of total web traffic comes from AI today
527%
YoY growth rate of AI referral traffic
3×
More time on page than organic visitors
Chapter 01

The conversion gap

Six independent studies tracking AI-referred traffic against organic search baselines all converge on the same finding: AI visitors convert at multiples of organic search visitors.

  • Ahrefs — AI visitors convert at 23× the rate of organic search visitors (own-site data).
  • Visibility Labs — ecommerce AI traffic converts 31% higher than non-branded organic.
  • Semrush — 4.4× conversion rate vs organic.
  • Go Fish Digital — 25× in their data.
  • Forrester — AI visitors spend up to 3× longer on-page than traditional search visitors.
  • Broworks — in 90 days, 10% of organic visits came from AI engines and 27% of that traffic converted to SQLs.

Even the conservative end of the range — Semrush’s 4.4× — would be a category-defining metric in classical SEO. The aggressive end (Ahrefs 23×, Go Fish 25×) suggests an entirely new traffic class with conversion economics the industry hasn’t catalogued before.

“Zero-click search isn't a problem — it's an enormous opportunity. Buyers are arriving at vendor websites more informed, with higher intent. — Forrester”
Chapter 02

Why the visitor is pre-qualified

The mechanism is structural, not accidental. When a user receives a brand recommendation inside an AI answer — ChatGPT naming your product, Perplexity citing your page, Gemini summarizing your offering — the visit happens after the recommendation, not as a hopeful click on a list of options.

Forrester documented the behavioral signature: AI users average 15–23 word queries versus the 3–4 word average for traditional search. They are explaining their problem at length, asking for synthesis, and accepting a recommendation as the answer. By the time they arrive at the recommended vendor’s site, they have already done the comparison work the AI did on their behalf.

This is why time on page runs up to 3× higher. The visitor isn’t evaluating; they’re acting.

Chapter 03

The economics flip

The cost-per-acquisition curves are diverging.

  • The average GEO customer acquisition cost is $559 and declining at 37.5% as the market matures and brands accumulate compounding citation authority.
  • Google Ads CPC continues climbing 10–15% annually with no signs of slowing — auction prices set by competitive bidding floors that ratchet up.
  • AI traffic is 1.08% of total web traffic today (Conductor) but growing 527% YoY (Search Engine Land / GA4 analysis).
  • ChatGPT alone sent 57.7M outbound clicks in March 2025 — up 558% YoY.

The volume is small. The unit economics are extraordinary. The growth rate is exponential. All three are attractive curves.

Chapter 04

Where the dollar earns more

Run the math on a $10,000 monthly customer acquisition budget.

Spent on Google Ads at a $559 CAC equivalent (charitable benchmark for the channel), the budget yields ~18 customers. Spent on GEO — building entity authority, citation graph, structured content, and earned media that compounds across all five major AI platforms — the budget builds an asset that generates 23× higher conversion per visitor as it accrues. Year one returns may be modest as the entity grounds. Years two and three the gap widens, because GEO accrues like backlinks did in 2008–2014: durable, compounding, hard to dislodge.
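
The arithmetic behind the ~18 figure, plus a way to explore the declining-CAC claim, can be sketched as follows. The text does not say over what period the 37.5% decline applies, so the `periods` parameter is deliberately unit-free, and compounding the decline per period is an assumption:

```python
def customers_per_month(budget: float, cac: float) -> float:
    """Customers acquired when a budget is spent at a fixed acquisition cost."""
    return budget / cac


def projected_cac(cac: float, decline_rate: float, periods: int) -> float:
    """CAC after compounding a fractional decline over a number of periods.

    Assumption: the decline compounds once per period. The length of a
    period is not stated in the text, so it is left unspecified here.
    """
    return cac * (1 - decline_rate) ** periods


if __name__ == "__main__":
    budget, cac = 10_000, 559            # figures from the text
    print(round(customers_per_month(budget, cac), 1))
    for n in range(3):
        print(n, round(projected_cac(cac, 0.375, n), 2))
```

The first function reproduces the figure in the text: 10,000 / 559 is about 17.9 customers. The second only explores a scenario and should not be read as a forecast.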

This is the case to make to the CFO. Not “AI traffic is bigger than you think” — it isn’t, yet. The case is “the dollars committed to AI visibility today are the cheapest dollars you’ll ever spend on it,” because the asset compounds and the channel hasn’t saturated.

Chapter 05

The Broworks proof

The Broworks case study (90-day GEO sprint) is one of the cleanest published examples of the conversion economics operating in real time. Within three months of starting the program:

  • 10% of organic visits came from generative engines (vs near-zero baseline).
  • 27% of that AI-sourced traffic converted to Sales Qualified Leads.
  • Visitors from LLMs stayed 30% longer on-page than equivalent Google visitors.

The numbers vary by category and starting position, but the directional pattern is now consistent across enough independent measurements that the conversion-multiplier finding has moved from anecdotal to operational. AI traffic isn’t big yet. But it’s the highest-quality traffic class measurable today.

Sources cited

  1. Ahrefs — Own-site AI vs organic conversion comparison
  2. Visibility Labs — Ecommerce AI traffic conversion benchmarks
  3. Semrush — AI referral conversion analysis
  4. Go Fish Digital — AI traffic conversion data
  5. Forrester — “GenAI Forever Changes All Forms of Search” (2025)
  6. Conductor — AI traffic share of total web traffic
  7. Broworks — Published 90-day GEO case study
  8. Search Engine Land — GA4 AI referral growth analysis

The Princeton paper, decoded. The academic foundation of GEO.

Aggarwal et al. coined the term, built the benchmark, and quantified what works. Three years later, every credible GEO claim still traces back to this paper.

+40%
Visibility lift from the top GEO methods (Cite Sources, Statistics, Quotation)
10,000
Queries in the GEO-bench benchmark
+115%
Lift for rank-5 sites applying GEO methods
−30%
Top-ranked sites LOST visibility
Chapter 01

The paper that named the field

In November 2023, six researchers from Princeton, Georgia Tech, the Allen Institute for AI, and IIT Delhi posted a paper to arXiv titled simply “GEO: Generative Engine Optimization.” The following year it was published at ACM SIGKDD 2024, the top data-mining conference in the world.

The authors — Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, and Ameet Deshpande — did three things no one had done before:

  1. Coined the term “Generative Engine Optimization” and formalized generative engines as systems combining retrieval with grounded LLM synthesis.
  2. Built GEO-bench — a benchmark of 10,000 queries from 9 diverse sources across 25 domains, designed to evaluate optimization methods systematically.
  3. Tested nine concrete content optimization strategies and quantified what worked, what didn’t, and by how much.

The paper is the academic spine of the entire GEO discipline. Every credible practitioner study published since either cites it, replicates it, or extends it.

“Including citations, quotations from relevant sources, and statistics can significantly boost source visibility, with an increase of over 40%. — Aggarwal et al., KDD 2024”
Chapter 02

The benchmark that made it measurable

Before GEO-bench, “optimizing for AI search” was advice. After GEO-bench, it was a measurable discipline.

The benchmark draws queries from nine sources representative of real-world search behavior:

  • MS Marco, ORCAS-1, Natural Questions (commercial & informational queries)
  • AllSouls (Oxford academic exam questions)
  • LIMA, Davinci-Debate (LLM-generated diverse prompts)
  • Perplexity Discover (real production AI search queries)
  • ELI-5 (Reddit explanation requests)
  • GPT-4 generated queries (synthetic coverage)

The query distribution preserves real-world ratios: 80% informational, 10% transactional, 10% navigational. Coverage spans 25 domains: tech, health, finance, history, law, government, opinion, and more. The split is 8,000 train, 1,000 validation, 1,000 test.

Two evaluation metrics measure visibility: Position-Adjusted Word Count (how prominently a source is quoted, weighted by position in the AI response) and Subjective Impression (LLM-judged influence on the answer).
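
To make the first metric concrete, here is a minimal sketch of a position-adjusted word count. It assumes the response has already been split into sentences, each attributed to one source, and it weights earlier sentences more heavily with an exponential decay. The decay form follows our reading of the paper and should be checked against the original definition before relying on it:

```python
import math


def position_adjusted_word_count(sentences: list[tuple[str, str]]) -> dict[str, float]:
    """Score each source by the words attributed to it, discounted by position.

    sentences: ordered (source_id, sentence_text) pairs from one AI response.
    Each sentence contributes word_count * exp(-position / n_sentences),
    normalised by the total word count of the response.
    """
    total_words = sum(len(text.split()) for _, text in sentences)
    scores: dict[str, float] = {}
    if total_words == 0:
        return scores
    n = len(sentences)
    for pos, (source, text) in enumerate(sentences):
        weight = math.exp(-pos / n)  # assumed decay: earlier sentences count more
        scores[source] = scores.get(source, 0.0) + len(text.split()) * weight / total_words
    return scores


if __name__ == "__main__":
    response = [
        ("site-a", "Fresh statistics lift visibility in generative answers."),
        ("site-b", "Quotations from experts help as well."),
    ]
    print(position_adjusted_word_count(response))
```

Two sources quoted at equal length get different scores under this metric: the one quoted first scores higher, which is the "weighted by position" behavior described above.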

Chapter 03

What worked

The paper tested nine optimization methods. The top three each delivered 30–40% relative improvement on Position-Adjusted Word Count and 15–30% on Subjective Impression:

  1. Cite Sources — adding inline references to authoritative third-party sources directly within the content.
  2. Quotation Addition — embedding pull quotes from credible experts and publications.
  3. Statistics Addition — injecting specific, attributed numerical data into the text.

Stylistic improvements (Fluency Optimization — better readability, clearer structure) added 15–30% on top. The single highest-performing combination was Fluency Optimization + Statistics Addition together, which outperformed any single method by more than 5.5%.

Real-world validation on Perplexity.ai (a deployed production engine, not just the benchmark): Quotation Addition improved Position-Adjusted Word Count by 22%, Statistics Addition improved Subjective Impression by 37%.

Chapter 04

What didn’t

Two negative findings deserve as much attention as the positive ones, because they upend SEO instincts:

  • Keyword stuffing — the workhorse of early SEO — showed “little to no improvement” on generative engines. Generative engines synthesize meaning, not match terms.
  • Authoritative / persuasive tone changes — “writing more confidently” — showed no significant lift. The authors note GEs are “already somewhat robust to such changes,” meaning models discount confidence-as-style and weight grounded substance.

The pattern: tactics that game surface-level patterns don’t work. Tactics that add genuine substance (citations, real statistics, structured quotation, clear writing) work measurably and reproducibly.

Chapter 05

The asymmetry that changed everything

The single most consequential finding for the GEO industry is buried in the experimental section: lower-ranked sites benefit dramatically more from GEO than top-ranked sites do.

  • SERP rank-5 sites applying Cite Sources gained a 115% visibility increase.
  • Top-ranked sites on average LOST 30.3% visibility when applying the same methods.

The interpretation: when everyone plays the same game, the playing field levels. Smaller creators — previously crushed by domain authority moats — gain disproportionately. The historic SEO advantage of sheer authority compresses inside generative engines because synthesis pulls from a wider source pool than ranked lists.

This is why Aggarwal et al. wrote: “The advent of Generative Engines might initially seem disadvantageous to these smaller entities. However, the application of GEO methods presents an opportunity for these content creators to significantly improve their visibility.” The paper is, in a real sense, an invitation to the underdogs.

Chapter 06

The follow-ups: Toronto and SourceBench

Two academic papers in 2025–2026 extended the foundation in important directions.

Chen et al., University of Toronto (2025) — “Generative Engine Optimization: How to Dominate AI Search” — ran the first comprehensive comparative analysis of AI Search vs Google Search across multiple verticals, languages, and query paraphrases. The headline finding: AI Search exhibits “systematic and overwhelming bias toward earned media” — in US automotive queries, AI search returned 81.9% earned media vs 18.1% brand-owned and 0% social, compared to Google’s much more balanced 39.5% brand / 15.4% social / 45.1% earned mix. This validated, at scale, what FancyAI’s Signal Hierarchy methodology codified separately: third-party citations are the highest-leverage GEO investment.

SourceBench (2026) evaluated source quality across 12 AI search systems. Two findings stand out: GPT-5 achieves the highest source quality scores, and AI search actively discovers high-quality sources not found in traditional keyword-based search results. The implication: GEO isn’t just optimization for the same set of sources Google ranks. It’s competition for a partially different source pool.

Chapter 07

Why the paper still matters

Three years after first publication, every reputable GEO claim still routes back to GEO-bench. The 40% visibility lift number is now a category benchmark. The +115% lift for lower-ranked sites is the underlying math behind why the discipline works for challenger brands. The Fluency + Statistics combination is the operational recipe most credible practitioners reference, often without realizing they’re paraphrasing Aggarwal.

If you take one thing from the paper: add specific, attributed statistics to your content. Of the nine tactics tested, this single move plus fluent writing produced the largest measurable gain. The discipline starts here.

Sources cited

  1. Aggarwal, Murahari, Rajpurohit, Kalyan, Narasimhan, Deshpande — “GEO: Generative Engine Optimization” (arXiv 2311.09735, ACM SIGKDD 2024)
  2. Princeton University Research Portal — GEO paper page
  3. Chen, Wang, Chen, Koudas (University of Toronto) — “Generative Engine Optimization: How to Dominate AI Search” (arXiv 2509.08919, 2025)
  4. SourceBench — Multi-system source quality benchmark (2026)
  5. “Beyond SEO: A Transformer-Based Approach for Reinventing Web Content Optimisation” (arXiv 2507.03169, 2025)
  6. Springer — “Artificial Intelligence’s Revolutionary Role in SEO” (peer-reviewed chapter, 2024)
  7. NSF Grant 2107048 — Underlying research funding

Why fresh content beats authority. The recency bias in AI citations.

90% of AI bot hits land on content less than three years old. AI-cited pages are 368 days fresher than traditionally-ranked ones. Continuous publishing is the new optimization unit.

90%
Of AI bot hits land on content from the last three years
1,064
Average age (days) of AI-cited content
1,432
Average age of traditionally-ranked content
368
Days fresher AI-cited content is, on average
Chapter 01

The freshness gap

Seer Interactive analyzed AI bot crawler logs across a representative sample of indexed sites. The result is among the cleanest empirical findings in the GEO literature: nearly 90% of AI bot hits land on content from the last three years. The remaining 10% spreads across the entire historical web.

The same study compared the publication age of AI-cited content versus content traditionally ranked by Google. The numbers:

  • AI-cited content averages 1,064 days old (~2.9 years).
  • Traditionally-ranked content averages 1,432 days old (~3.9 years).
  • 368-day gap — AI-cited content is more than a year fresher on average.

This is independent of topic, independent of authority, independent of domain rating. AI engines systematically prefer recent content even when older equivalents exist with stronger authority signals.

Chapter 02

Why the bias exists

The mechanism has two layers. The training-time layer is well-documented: LLMs trained on time-stamped corpora develop measurable preferences for content within their training window’s recency horizon. The Metehan.ai academic analysis confirms this experimentally, showing the bias is observable in raw model behavior independent of retrieval.

The retrieval-time layer is more consequential operationally. AI search engines apply freshness signals during the retrieval ranking phase, before the LLM ever sees a candidate source. ChatGPT’s SearchGPT pipeline, Perplexity’s proprietary index, and Google AI Overviews all factor publication date and last-updated timestamp into their retrieval scoring. The result: even an authoritative source from 2018 has lower retrieval probability than a moderately authoritative source from 2025 on the same topic.

“Freshness scoring is "always on" in AI systems — content strategy must be continuously updated.”
Chapter 03

The continuous publishing imperative

The operational implication forces a re-architecture of the content calendar. The traditional SEO play — build a definitive long-form “ultimate guide” once, earn backlinks, harvest organic traffic for years — doesn’t map to AI visibility. The asset depreciates faster than authority can be built.

What works: a continuous publishing cadence on the topics that matter to your category. SE Ranking’s data is consistent with Seer’s: pages updated within 3 months average 6.0 ChatGPT citations versus 3.6 citations for stale equivalents — a 67% lift just from refreshing publication date and core stats.

Chapter 04

The refresh sprint as the new unit of work

The shift is from one-time content production to ongoing content refresh sprints. Three operational patterns are emerging across well-executed GEO programs:

  1. 30-day refresh cycles on top-of-funnel commercial pages — updating statistics, adding recent industry developments, refreshing publication date.
  2. Quarterly statistical refreshes on data-heavy pages — replacing 12-month-old benchmarks with current numbers maintains the “cited authoritative source” status AI engines reward.
  3. Monthly editorial additions in core categories — net-new published pieces that establish recency-graded authority across the topic cluster.
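The first two patterns can be sketched as a simple due-date check. This is a minimal illustration, not a prescribed tool: the page records, URLs, and the 91-day stand-in for "quarterly" are assumptions. The third pattern (net-new pieces) is a publishing task and is left out.

```python
from datetime import date, timedelta

# Refresh intervals per page type, taken from the patterns above.
# 91 days is an assumed approximation of one quarter.
REFRESH_INTERVALS = {
    "commercial": timedelta(days=30),
    "data_heavy": timedelta(days=91),
}


def pages_due(pages: list[dict], today: date) -> list[str]:
    """Return URLs whose last update is at least one interval old."""
    due = []
    for page in pages:
        interval = REFRESH_INTERVALS[page["kind"]]
        if today - page["last_updated"] >= interval:
            due.append(page["url"])
    return due


# Hypothetical inventory for illustration only.
inventory = [
    {"url": "/pricing-guide", "kind": "commercial", "last_updated": date(2026, 3, 15)},
    {"url": "/benchmarks", "kind": "data_heavy", "last_updated": date(2026, 3, 1)},
    {"url": "/compare", "kind": "commercial", "last_updated": date(2026, 4, 20)},
]

print(pages_due(inventory, today=date(2026, 5, 1)))
```

Running a check like this weekly keeps the sprint backlog tied to the recency band instead of to a campaign calendar.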

The cadence isn’t about volume. It’s about staying inside the recency band where AI engines actively retrieve. A site publishing once a quarter on its core topics will systematically lose ground to a competitor publishing once a month on the same topics, even if the quarterly site has higher domain authority.

Chapter 05

The strategic reframe

Authority is necessary but no longer sufficient. The site that earns AI citations in 2026 is the one combining authority signals with freshness signals — the operational equivalent of running an editorial publication on the topics where the brand wants to be the cited expert.

Most B2B brands aren’t structured this way. Their content programs are episodic: a launch, a campaign, a quarterly initiative. The brands that re-architect content as a continuous editorial function on their core topics will accumulate citation share. The ones that keep treating content as a project will watch competitors with fresher pages get cited instead.

Sources cited

  1. Seer Interactive — “Study: AI Brand Visibility and Content Recency” (log file analysis, 2025)
  2. Metehan.ai — “Recency Bias That’s Reshaping AI Search” (2025)
  3. SE Ranking — Content freshness citation correlation (129K domains)
  4. Ahrefs — Citation freshness analysis
  5. Academic literature on LLM training-window recency effects

Want this measured against your brand?

Get your AI Readiness Index (ARI) score across ChatGPT, Gemini, Perplexity, Claude, and Grok — delivered in 24 hours.

Live · FancyAI Research Corpus

YouTube mentions predict AI visibility better than backlinks.

Ahrefs analyzed 75,000 brands. The strongest single predictor of AI brand visibility wasn't domain authority. It wasn't backlinks. It was YouTube mentions — by a wide margin.

  • 0.737: Correlation between YouTube mentions and AI brand visibility
  • 75,000: Brands in the underlying study
  • 0.04: Correlation between content length and citation
  • r² = 0.032: Variance in AI citations explained by domain authority
Chapter 01

The 75K-brand study

Ahrefs ran the largest published correlation study to date on what predicts AI brand visibility: 75,000 brands measured across multiple AI engines, scored against every plausible predictor variable, ranked by correlation strength. The headline finding upends two decades of SEO intuition.

The strongest single correlation in the dataset:

  • YouTube mentions: ~0.737 correlation with AI brand visibility
  • Branded web mentions: 0.66–0.71
  • Branded anchor text: 0.51–0.63
  • Domain Authority (DA / DR): r² = 0.032 — explains less than 4% of the variance
  • Backlinks alone: weak / neutral per Seer Interactive’s replication
  • Content length: ~0.04 Spearman — effectively zero
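The list mixes two scales: most entries are correlation coefficients, while domain authority is reported as r². A rough way to put them on the same footing is to square the coefficients. The sketch below assumes the low end of each stated range and treats the squared value as an approximate share of variance; the source does not say whether 0.737 is Pearson or Spearman, and squaring a rank correlation is only a loose guide.

```python
import math

# Coefficients as reported in the list above (low end of each stated range).
reported_r = {
    "YouTube mentions": 0.737,
    "Branded web mentions": 0.66,
    "Branded anchor text": 0.51,
    "Content length": 0.04,
}

# Domain authority is reported as r^2, so recover the implied |r| instead.
domain_authority_r2 = 0.032
implied_domain_authority_r = math.sqrt(domain_authority_r2)

r_squared = {name: r ** 2 for name, r in reported_r.items()}

for name, r in reported_r.items():
    print(f"{name:<22} r = {r:.3f}   r^2 = {r_squared[name]:.3f}")
print(
    f"{'Domain authority':<22} |r| = {implied_domain_authority_r:.3f}"
    f"   r^2 = {domain_authority_r2:.3f}"
)
```

On these assumptions the YouTube coefficient corresponds to roughly 54% of variance, against about 3% for domain authority.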

YouTube mentions aren’t a tactical optimization. They are the dominant signal. The implication for most brands’ current content strategies is uncomfortable.

“Domain Authority explains less than 4% of AI citation variance. — Ahrefs, 75K-brand study”
Chapter 02

Why YouTube wins

Three structural factors explain YouTube’s outsized weight in AI visibility scoring:

  1. Cross-platform retrieval coverage. YouTube content surfaces in Google’s index (where AI Overviews retrieve), Bing’s index (where ChatGPT retrieves), and within Gemini’s direct integration. A single YouTube mention has multi-platform reach that a single owned-domain article cannot replicate.
  2. Multi-modal grounding. Modern LLMs increasingly use video transcript data for entity grounding. Brand mentions inside YouTube transcripts feed knowledge graphs and entity disambiguation systems with audio-grade signal that text-only sources don’t carry.
  3. Authority transfer. Mentions on third-party YouTube channels (creators with established subscriber bases) carry the same earned-media gravity as third-party editorial citations — but at a fraction of the cost.
Chapter 03

What doesn’t predict (and never did)

The weak and near-zero correlations are as instructive as the strong ones:

  • Domain Authority / Domain Rating — classic SEO god-metric — explains under 4% of AI citation variance. SearchAtlas’s replication confirmed: DA, DR, and Domain Power are weak predictors of LLM visibility.
  • Backlink count — the workhorse of off-page SEO — is “weak or neutral” per Seer Interactive’s study of 10,000 LLM questions.
  • Multi-modal content variety on your own site — assumed to help — “didn’t move the needle” per Seer’s data. Variety on your site doesn’t register; variety in third-party mentions of you does.
  • Content length — Ahrefs found a near-zero correlation (0.04 Spearman) with citation. 53.4% of AI-cited pages are under 1,000 words.

The shift is from “build authority on your site” to “build entity presence across the surfaces AI engines actually crawl.” Different game, different scoring.

Chapter 04

The platform divergence wrinkle

Ahrefs’ per-platform breakdown surfaces a useful nuance: the correlation strength varies meaningfully by AI engine.

  • ChatGPT shows the weakest correlations with classic authority metrics (DR: 0.266, branded search: 0.352). It rewards earned media and Reddit/Quora presence over domain authority.
  • AI Mode (Google) shows the strongest correlations with branded authority signals (branded anchors: 0.628). It tracks closer to Google’s own ranking.
  • AI Overviews value DR more than ChatGPT or AI Mode — they’re partially anchored to Google’s organic top-10 (86.85% of AIOs cite at least one Google top-10 result).

The implication: optimizing for one platform via authority-building doesn’t automatically optimize for another. But across all platforms, YouTube mentions and branded earned-media correlations are the most consistent signals — the closest thing to a universal lever.

Chapter 05

Operational implications

For brands committing to AI visibility as a strategic priority, the data suggests a portfolio reweighting. Three concrete moves:

  1. Establish a YouTube presence on category-relevant channels. Not necessarily an owned channel — sponsored mentions, expert interviews, and product placements on established creator channels carry the citation weight.
  2. Reweight off-page investment from raw backlinks toward branded mentions. A backlink from a low-authority site without brand context contributes near-zero. A branded mention without a link from a high-authority publication contributes meaningfully.
  3. Stop optimizing for content length. A 1,200-word piece with statistics, structure, and a YouTube embed is more likely to be cited than a 4,000-word “ultimate guide” without these elements. With a near-zero correlation to citation, length on its own signals nothing useful.

Sources cited

  1. Ahrefs — “Top Brand Visibility Factors: 75K Brands Studied” (2025)
  2. Seer Interactive — “Study: What Drives Brand Mentions in AI Answers?” (Christina Blake, 10K LLM questions, 2024)
  3. SearchAtlas — “Authority Metrics in the Age of LLMs” (2025)
  4. Surfer SEO — 36M AI Overviews citation analysis
  5. SE Ranking — ChatGPT ranking factors (129K domains)
  6. Position Digital — AI SEO statistics compendium (2025)


67% of B2B buyers start with AI. The new front door.

B2B buyers are adopting AI search at three times the consumer rate. By the time they visit a vendor website, the shortlist is already set.

  • 67% of B2B buyers start with an AI assistant before visiting vendor websites
  • 3×: B2B vs consumer AI search adoption rate
  • 90% of organizations now use GenAI in purchasing
  • 44% of AI search users say it's their primary information source
Chapter 01

The 67% finding

GrackerAI’s B2B buyer survey produced one of the most consequential statistics in the AI search literature: 67% of B2B buyers now start their research with an AI assistant before visiting any vendor website.

This isn’t a measure of AI traffic. It’s a measure of the buying process restructuring around AI as the discovery layer. By the time a B2B buyer arrives on a vendor’s site, the shortlist has already been formed — in conversation with ChatGPT, Perplexity, or Claude. The vendor either appeared on that shortlist or didn’t.

For sales and marketing leaders, this is a quiet category shift with loud implications. Pipeline that used to start with brand awareness ads now starts with AI prompts. Pages optimized for SEO now compete to be the source the AI cites. The funnel hasn’t collapsed — it’s moved upstream, into a layer most vendors aren’t measuring.

“AI is now the place where decisions begin. — Magenta Associates (300 senior UK procurement professionals)”
Chapter 02

B2B is ahead of consumer adoption

Forrester’s 2024 B2B Buyers’ Journey Survey found B2B buyers adopt AI-powered search at 3× the rate of consumers. The accelerated adoption tracks with the higher information density of B2B purchasing decisions: enterprise software, professional services, and capital equipment buyers were already running 50–80 source comparisons per decision before AI. AI condenses that comparison work from weeks to minutes.

The data points all converge:

  • 90% of organizations now use GenAI in some aspect of their purchasing process (Forrester).
  • B2B AI-generated traffic = 2–6% of total organic traffic, growing at 40%+ per month.
  • Forrester’s end-of-2025 projection: B2B AI traffic share to reach 20%+.
  • Site visitors from AI platforms spend up to 3× more time on-page than traditional search visitors.
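As a rough consistency check on the traffic figures above, the sketch below asks how long a 2–6% share would take to reach 20% at 40% monthly growth. It assumes the growth rate stays constant and that total organic traffic is flat, so the share grows at the same rate as AI traffic itself; neither assumption comes from the cited sources.

```python
import math


def months_to_reach(start_share: float, target_share: float, monthly_growth: float) -> float:
    """Months of compound growth needed to go from start_share to target_share."""
    if start_share <= 0 or target_share <= 0 or monthly_growth <= 0:
        raise ValueError("all inputs must be positive")
    return math.log(target_share / start_share) / math.log(1 + monthly_growth)


for start in (0.02, 0.06):
    months = months_to_reach(start, target_share=0.20, monthly_growth=0.40)
    print(f"From {start:.0%} to 20% at 40%/month: {months:.1f} months")
```

Under those assumptions the two growth figures and the 20% projection are compatible on a timescale of months, not years.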
Chapter 03

McKinsey’s primary-source finding

McKinsey’s AI Discovery Survey (n=1,927, August 2025) measured the most consequential metric for any vendor selling on findability: which information source consumers consider primary.

The result:

  • 44% — AI-powered search
  • 31% — Traditional search
  • 9% — Retailer websites
  • 6% — Review sites

For the first time on record, AI search has surpassed traditional search as the primary discovery channel among AI search users. The cohort is growing fast. The window where SEO alone secures vendor visibility is closing.

Chapter 04

The UK enterprise validation

Magenta Associates surveyed 300 senior procurement professionals across UK enterprises about how their purchasing process has changed with AI. The conclusion was unambiguous: “AI is now the place where decisions begin.”

Specific behavioral patterns surfaced in the data:

  • Procurement teams use AI to generate vendor longlists before any vendor outreach.
  • RFP responses are increasingly summarized by AI before human review.
  • Vendor differentiation messaging that doesn’t make it into AI summaries effectively doesn’t exist for the buying committee.

The pattern is consistent with KPMG’s AI Quarterly Pulse finding: enterprise leaders are restructuring teams around the assumption that “agents will manage projects while humans manage agents.” The buyer-side shift is already in motion.

Chapter 05

What this means for B2B vendors

The strategic implication is structural, not tactical. Three reframes:

  1. The website is no longer the front door — it’s the destination after the AI shortlist. Vendors not appearing in AI shortlists for their category miss the opportunity entirely. Brand-awareness budgets that stop at “driving traffic” are budgeting for the wrong stage.
  2. Sales enablement needs to include AI visibility briefings. Reps who know what their AI shortlist position is — per platform, per buyer persona prompt — can address objections that are now baked in before the first call.
  3. Pipeline attribution needs an AI-discovery layer. Most attribution stacks credit the channel that drove the click. The actual decision often happened earlier, in a ChatGPT conversation that didn’t fire any UTM. Vendors that don’t measure AI visibility share will keep crediting the wrong channels for pipeline that originated upstream.
Chapter 06

The competitive window

The window for a B2B vendor to establish AI visibility before competitors do it first is open and narrowing. Three observations from the survey data inform the timing:

  • 91% of marketing leadership has been asked about AI search visibility in the past year (SEOFOMO survey of 200+ senior SEOs). Awareness is near-universal.
  • 62% of SEOs report AI search drives less than 5% of revenue today. Investment is lagging awareness — most vendors are still in early experimentation.
  • The gap between the two numbers (91% asked vs. 62% under-5%) is the strategic window. Vendors investing seriously now are pulling ahead in citation share before the budgets catch up.

The brands that establish citation authority in their category in 2026 will be the ones AI engines default to in 2027 and 2028. The asset compounds. The window is open. It won’t stay open long.

Sources cited

  1. GrackerAI — B2B buyer behavior survey (2025)
  2. Forrester — B2B Buyers’ Journey Survey (2024) and B2B AI marketing report (2025)
  3. McKinsey — AI Discovery Survey (n=1,927, August 2025)
  4. Magenta Associates — UK enterprise procurement survey (n=300, 2025)
  5. KPMG AI Quarterly Pulse — Enterprise leadership trends
  6. SEOFOMO / Search Engine Land — State of AI Search Optimization Survey (200+ senior SEOs, 2025)
  7. Menlo Ventures — “State of Generative AI in the Enterprise” (~500 US enterprise decision-makers, 2025)


Six platforms promise to get your brand cited by AI. Most don't finish the job.

A buyer's-side comparison of Profound, Evertune, Semrush, Scrunch, Conductor, and FancyAI — and the structural fault line splitting the category in two.

  • 5 of 6 platforms in this analysis stop at recommendations and leave the work to the customer
  • 14 dimensions evaluated across pricing, coverage, and execution
  • $99/mo to $500K+/yr: Price range across the six tools
  • 60% of AI brand signal weight lives off-site, not on-site
Chapter 01

The category splits in two

The pitch decks of Generative Engine Optimization platforms in 2026 are nearly indistinguishable. Each promises to show buyers where their brand appears in ChatGPT, Perplexity, Gemini, and Claude, to track citations across model releases, and to surface what the competition is doing differently. At the dashboard level, almost every product in the category looks the same.

That sameness is the story. Independent reviewers and the platforms’ own marketing have begun to acknowledge it openly. Mersel’s 2026 GEO platform analysis described one of the largest incumbents in the space as a tool that “does not execute content or deploy AI infrastructure” and “functions exclusively as a monitoring and analytics platform.” Evertune’s own homepage frames its differentiator as a critique: “Every other tool shows you the data. Evertune shows you what to do with it.”

The category is splitting along a single fault line: monitoring versus execution. The vast majority of GEO platforms are dashboards. They tell brands where they stand. A much smaller group does the work that actually changes the answer.

This article evaluates the six platforms most often shortlisted by mid-market and enterprise buyers in 2026 across fourteen dimensions. The dimensions were chosen to be fair on the surface area every platform competes on, and to make the structural gap in the category visible without editorializing.

“Every other tool shows you the data. — Evertune, homepage”
Chapter 02

The matrix

The fourteen-row comparison below is structured around the question every GEO buyer is now asking out loud: I can see I’m not showing up in AI answers. Now what? Almost every platform in this comparison answers the first half of that question. Only one is built end-to-end around the second.

Three rows decide the category — on-site implementation, content production, and off-site/earned media. Watch those.

Dimension | Profound | Evertune | Semrush | Scrunch | Conductor | FancyAI

Pricing & access
Starting price | $99/mo | ~$1,000/mo+ | $99/mo | $417/mo | $26,800/yr | $799/mo
Public pricing? | Partial | No | Yes | Yes | No | Yes

Coverage
Engines (entry tier) | ChatGPT only | All major | ChatGPT + AIO + Gemini | 6 engines | Not specified | ChatGPT + Gemini
Engines (top tier) | 10+ | All major | All major | 6 engines | All | All major

Monitoring & reporting
Monitoring & reporting | Best in class | Strong | Strong | Strong | Enterprise | Included
Citation tracking | Yes | Yes | Yes | Yes | Yes | Yes
Recommendations engine | Yes | Yes (playbook) | Yes | Page audits | Yes | Yes (auto)

Three rows that decide the category
On-site implementation | No | No | No | AXP only | Drafts only | Yes
Content production | No | Partial | Separate $60/mo | No | 60–600 drafts/yr | Included
Off-site / earned media | No | Affiliate ties | Separate $149+ | No | No | Yes (PR + Reddit)

Service & procurement
Citation link building | No | No | No | No | No | Yes
Strategist included | Enterprise only | Yes | No | Growth+ | Mid-market+ | All tiers
Procurement cycle | Sales-led | Sales-led | Self-serve | Self-serve | Sales-led (slow) | Self-serve

Sources: vendor pricing pages, Vendr procurement intelligence, Mersel, Trakkr, Rankability, Mint, Indexly, Cairrot, Reddit user reports. Prices current as of May 2026.
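Because the starting prices in the matrix mix monthly and annual billing, a like-for-like view needs a common unit. The sketch below annualizes the "Starting price" row only. It ignores per-user, per-brand, and per-country multipliers, treats Evertune's "~$1,000/mo+" as a flat $1,000, and uses illustrative names throughout.

```python
# Starting prices from the matrix: (amount in USD, billing period).
STARTING_PRICES = {
    "Profound": (99, "mo"),
    "Evertune": (1000, "mo"),  # listed as "~$1,000/mo+"; treated as a floor
    "Semrush": (99, "mo"),
    "Scrunch": (417, "mo"),
    "Conductor": (26800, "yr"),
    "FancyAI": (799, "mo"),
}


def annual_cost(amount: int, period: str) -> int:
    """Convert a monthly or yearly price to a yearly figure."""
    if period == "mo":
        return amount * 12
    if period == "yr":
        return amount
    raise ValueError(f"unknown billing period: {period!r}")


annual = {name: annual_cost(*price) for name, price in STARTING_PRICES.items()}

# Print cheapest entry tier first.
for name, cost in sorted(annual.items(), key=lambda item: item[1]):
    print(f"{name:<10} ${cost:>7,}/yr")
```

Entry-tier price is only one input; the three execution rows in the matrix determine what the annual figure actually buys.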

Chapter 03

Profound: the analytics-first incumbent

Of the six platforms in this analysis, Profound has the deepest investment in measurement infrastructure. Its Agent Analytics module reads server logs to track GPTBot, ClaudeBot, and PerplexityBot crawl patterns at a level of granularity none of the other tools attempt. The platform covers more than ten AI engines at its top tier, holds SOC 2 Type II compliance, and is the most likely choice for a Fortune 500 brand that already has the in-house engineering bandwidth to act on what the dashboard surfaces.

The platform’s limitation is not a flaw — it is a deliberate scope choice that buyers are increasingly questioning. Mersel’s independent analysis put the framing bluntly: Profound “does not execute content or deploy AI infrastructure” and “functions exclusively as a monitoring and analytics platform.” The Rankability 2026 review benchmarked its pricing at 48 percent above the market average for monitoring tools, with full engine coverage gated to the Enterprise tier (typically $2,000–$5,000+ per month after the $99/mo Starter and $399/mo Growth bands).

Profound is the right answer for organizations whose constraint is information, not capacity. For organizations whose constraint is capacity — the bandwidth to actually ship the changes the dashboard surfaces — the analytics depth becomes harder to justify against the gap between insight and action.

Chapter 04

Evertune: visibility plus paid-media bolted on

Evertune has built the most candid positioning in the category. Its homepage tagline reads as a confession of what its competitors don’t do: “Every other tool shows you the data. Evertune shows you what to do with it.” The platform pairs visibility tracking with EverPanel, a 25-million-user behavioral data layer, and adds programmatic ad activation through The Trade Desk — the only platform in this comparison that integrates AI visibility insight with paid-media execution.

The agency-friendly model (unlimited logins, no per-seat charges) has won enterprise customers including WPP’s Choreograph and Miro. Pricing is not published; Reddit threads and third-party reports place it around $1,000 per brand per country per month, with paid media activation requiring its own separate budget.

The structural caveat sits inside the same tagline. “Showing you what to do with it” still places the burden of doing it on the customer. Evertune hands buyers a playbook, recommendations, and ad inventory access. Translating those into shipped content, schema deployments, and earned editorial mentions remains the customer’s problem to solve.

Chapter 05

Semrush AI Visibility Toolkit: an SEO incumbent extends its surface

Semrush has the largest installed base in the comparison, and its AI Visibility Toolkit reflects that reality. The product is positioned as an extension of the existing SEO suite rather than a standalone GEO platform, with familiar dashboards, the same competitor and keyword databases that make the parent product useful, and the lowest entry price in the category at $99 per user per month for one domain.

The arithmetic gets more complicated quickly. The standalone toolkit covers ChatGPT, Google AI Overviews, and Gemini. Adding the broader AI-ready Site Audit and brand insights requires a Semrush One bundle ($165 to $549 per month depending on tier), and content drafting lives in a separate $60-per-month Content Toolkit. Buyers who want the full surface end up assembling four products into something the standalone competitors deliver as one.

Semrush is the right answer for organizations already living inside the platform — agencies and in-house SEO teams who can extend an existing contract incrementally rather than procuring a new vendor. The trade-off is conceptual: AI Visibility is one of more than ten toolkits in Semrush’s catalog, not the company’s strategic center. The roadmap reflects that.

Chapter 06

Scrunch AI: the mid-market agent-experience play

Scrunch occupies the middle of the market with more than 500 customers (including Lenovo, BairesDev, Clerk, and Skims) and the broadest AI engine coverage in its tier: ChatGPT, Claude, Gemini, Perplexity, Google AI Mode and AI Overviews, and Meta. Its differentiator is the Agent Experience Platform (AXP), which generates an AI-friendly version of a customer’s site automatically. Pricing starts at $417 per month (annual) or $500 month-to-month, with a seven-day free Starter trial.

Customer evidence has been compelling where it appears. Clerk reported nine times higher sign-ups attributable to AI search after deploying Scrunch’s recommendations. The persona-based prompt monitoring framework is meaningful for B2B buyers tracking different ICP search behaviors.

The page audit limits are tight by design (5 audits on Starter, 10 on Growth), and AXP optimizes the website that already exists. It does not produce new authoritative content, earn off-site signals, or pursue the editorial mentions that the underlying research suggests carry the largest share of AI recommendation weight. Scrunch is a strong dashboard with a clever on-site automation layer attached. For mid-market teams comfortable shipping content themselves, it is among the best-value options in the category.

Chapter 07

Conductor: the enterprise SEO incumbent extends to AEO

Conductor is the most established platform in the comparison and the one with the longest enterprise customer roster. Its expansion from SEO into AEO (Answer Engine Optimization) added Writing Assistant Drafts (60 to 600 per year depending on tier), Content Score, and AgentStack — an internal tooling layer for managing the work. At Enterprise scale, the platform monitors more than 125,000 pages and 60,000 keywords, with white-glove implementation that often spans the first quarter.

The pricing reflects the customer profile. Conductor publishes no public price list. Vendr’s 2026 procurement data places median annual spend at $48,950, with entry contracts ranging from $26,800 to $45,000, mid-market deals from $48,000 to $85,000, and enterprise commitments commonly from $150,000 to $500,000 and beyond. Implementation regularly adds another $30,000 in Year 1.

The procurement cycle is its own filter. Buyers report 30- to 90-day timelines from first contact to contract, which makes the platform unrealistic for SMB and most mid-market organizations. Conductor is the right answer for global enterprises with six-figure SEO budgets, ten-plus domains, and a dedicated CSM relationship to operate the program. For everyone else, it is over-engineered for the use case.

Chapter 08

FancyAI: the execution layer for AI discovery

FancyAI is the youngest platform in this analysis and the only one positioned around execution as the product. The company publishes its methodology openly — a Signal Hierarchy framework derived from analysis of more than 40,000 websites and 129,000 domains, scored against an AI Readiness Index that quantifies a brand’s eligibility to be recommended by AI systems across four signal classes (entity clarity, citation density, structured proof, corroborating mentions).

The platform alone runs $799 per month on the Basic plan, which carries the same monitoring and recommendations features the other tools provide. The departure is what comes attached at the next two tiers. Essential ($2,500/mo, 90-day minimum) and Growth ($5,000/mo) bundle the dashboard with execution: a dedicated strategist, onsite content updates, technical GEO, and off-site media credits ($2,000 to $4,500 per month across citation links, Reddit, and press distribution depending on tier).

The structural argument that emerges from the matrix is what FancyAI is built to answer. Of the platforms compared, it is the only one that returns a “yes” on all three of the rows that decide the category — on-site implementation, content production, and off-site/earned media. The company’s tagline (“AI Visibility, Executed”) commits to closing the loop the rest of the category leaves open.

The candid trade-off: FancyAI is a newer brand than Conductor or Semrush, and enterprise procurement teams accustomed to evaluating dashboards may need to be educated on the managed-execution model. Pricing is published — a deliberate contrast to the sales-led incumbents.

Chapter 09

Why three rows decide the category

The most consequential rows in the matrix are not the ones the dashboards compete on. Citation tracking, recommendations engines, and platform coverage are now table stakes — every serious GEO product ships these. The differentiation lives downstream of the dashboard, in the rows where most platforms answer “no” or “partial.”

The reason this matters more in GEO than it did in SEO is empirical. FancyAI’s published methodology, drawing on Digital Bloom’s analysis of 129,000+ domains and the company’s own corpus, finds that 41 percent of AI brand recommendation weight comes from authoritative list mentions (third-party “best of” lists, roundups, directories) and not from on-site content at all. Another 18 percent comes from awards and accreditations; 16 percent from online reviews. The signals that move AI recommendations are predominantly off-site, earned, and editorial — precisely the work that monitoring tools cannot do for a customer.

The Princeton GEO paper (Aggarwal et al., KDD 2024) reached a complementary finding from a different angle: lower-ranked sites applying GEO methods saw a 115 percent visibility lift, while top-ranked sites saw an average 30 percent decrease. The opportunity is asymmetrically large for brands that can do the work, not just observe their position.

Across the comparison, every platform either answers “no” on the off-site row, or hands the customer a playbook and steps back. One platform answers “yes” on all three rows that move the recommendation needle. That is the structural fault line buyers should weight most heavily.

“The mention is the signal. The link is almost irrelevant.”
Chapter 10

The verdict, written as a decision tree

No tool in this comparison is the right answer for every buyer. Ranking is the wrong frame. The right frame is fit:

  • If the constraint is information at Fortune 500 scale — an enterprise SEO, content, and engineering team with the bandwidth to ship changes and the need for the deepest AI-bot crawl analytics in the market — Profound is the right choice.
  • If the goal is monitoring tied to programmatic ad activation — a brand or agency willing to fund a separate paid-media budget alongside the platform license — Evertune is the only tool in the category with The Trade Desk integrated.
  • If Semrush is already in the stack and the team wants AI visibility added incrementally rather than procured separately, Semrush’s AI Visibility Toolkit is the path of least resistance.
  • If the buyer is a mid-market marketing team shipping content internally and wanting strong cross-platform monitoring with on-site automation, Scrunch is the best-value self-serve option.
  • If the organization is a global enterprise with a six-figure SEO budget, ten-plus domains, and the procurement patience for a 30- to 90-day cycle, Conductor remains the deepest enterprise platform.
  • If the bottleneck is execution — if the dashboard would just confirm what the team already suspects without anyone with capacity to act on it — FancyAI is the only platform in the comparison that closes the loop. The work, not just the readout.
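The fit-based framing above can be written down as a lookup. This is a minimal sketch that encodes the six bullets as stated; the constraint labels are hypothetical shorthand introduced here, not terms any vendor uses.

```python
# Hypothetical constraint labels mapped to the platform each bullet names.
PLATFORM_FIT = {
    "information": "Profound",
    "paid_media_activation": "Evertune",
    "existing_semrush_stack": "Semrush AI Visibility Toolkit",
    "self_serve_monitoring": "Scrunch",
    "global_enterprise": "Conductor",
    "execution": "FancyAI",
}


def recommend_platform(constraint: str) -> str:
    """Return the platform the decision tree pairs with a buyer's main constraint."""
    try:
        return PLATFORM_FIT[constraint]
    except KeyError:
        valid = ", ".join(sorted(PLATFORM_FIT))
        raise ValueError(f"unknown constraint {constraint!r}; expected one of: {valid}")


print(recommend_platform("information"))
print(recommend_platform("execution"))
```

A real evaluation would weigh several constraints at once; the single-key lookup only mirrors the one-constraint-per-bullet structure of the list.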

The category will continue consolidating around this distinction. Monitoring platforms will commoditize. Execution platforms will compound. Buyers who recognize where their organization actually breaks — insight or capacity — will choose accordingly.

Sources cited

  1. Mersel AI — “Mersel vs. Profound” analytics-vs-execution analysis (2026)
  2. Profound — Pricing and product navigation (tryprofound.com)
  3. Rankability — “Profound AI Review 2026”
  4. Mint AI / Indexly — Profound pricing breakdown (2026)
  5. Evertune — Homepage and product positioning (evertune.ai)
  6. Reddit r/evertune — Pricing discussion thread (2026)
  7. Semrush — Subscription plans and toolkit pricing (semrush.com)
  8. Trakkr — Semrush AI Visibility pricing review (2026)
  9. Scrunch AI — Pricing and plan details (scrunch.com)
  10. Cairrot — Scrunch AI review and alternatives
  11. Conductor — Pricing page (conductor.com)
  12. Vendr — Conductor 2026 procurement benchmarks (median $48,950)
  13. Checkthat.ai — Conductor pricing tiers and 3-year TCO
  14. Conductor Academy — SEO & AEO Pricing Guide
  15. FancyAI — Signal Hierarchy methodology and AI Readiness Index
  16. Aggarwal et al., Princeton/Georgia Tech/Allen AI/IIT Delhi — “GEO: Generative Engine Optimization,” ACM SIGKDD 2024


AI is now citing AI. The 91.4% problem.

A Search Engine Land analysis found 91.4% of content cited in AI Overviews is at least partly AI-generated. A Columbia Journalism Review study found AI search engines wrong 60% of the time — and premium models were worse than free ones.

  • 91.4% of content cited in AI Overviews is at least partly AI-generated
  • 60%: Inaccurate or misleading answers across 8 AI engines (CJR)
  • 17–33%: Legal AI hallucination rate (Stanford)
  • 1,800: AI articles in the 2023 SEO heist that "stole" 3.6M visits
Chapter 01

The headline finding

Search Engine Land’s 2025 analysis of Google AI Overview citations surfaced one of the most consequential statistics in the GEO literature: approximately 91.4% of content cited in AI Overviews was at least partly AI-generated.

Read that twice. The system designed to synthesize the web’s most credible answers is overwhelmingly drawing on content the web’s machines made. Each citation feeds back into the training data for the next generation of models. The information ecosystem is becoming progressively more self-referential.

The phenomenon has a name in the academic literature: model collapse. UK and Canadian researchers covered by VentureBeat in 2024 demonstrated that “as AI-generated content proliferates around the internet, and AI models begin to train on it,” quality degrades exponentially with each generation. iPullRank calls the operational consequence “the content collapse”: each cycle of generation produces work that is “progressively more generic, less accurate, and less useful.”

“Each generation of AI content becomes progressively more generic, less accurate, and less useful.” — iPullRank

Chapter 02

The Columbia Journalism Review study

The most rigorous published audit of AI search accuracy comes from the Columbia Journalism Review’s Tow Center for Digital Journalism. CJR researchers tested eight major AI search engines against 200 news-sourced queries with known correct answers.

The headline result: chatbots provided inaccurate or misleading answers more than 60% of the time — nearly always without acknowledging any uncertainty. The most counterintuitive finding: premium AI models were more prone to confidently incorrect responses than their free counterparts. Paying more bought a more confident liar, not a more accurate one.

CJR also documented that “many of the AI companies developing these tools have not publicly expressed interest in working with news publishers” — the same publishers whose content trained the models in the first place.

Chapter 03

The 2023 SEO heist as canary

In late 2023 a SaaS founder publicly bragged about generating 1,800 articles using AI and “stealing” 3.6 million visits from a competitor. The articles required minimal human oversight. Google eventually applied a manual action against the site, but not before the experiment proved an uncomfortable truth: at scale, AI-generated content can be made to rank, get cited, and capture traffic that previously belonged to human-created authoritative sources.

The economics of digital pollution are lopsided. The cost to produce 1,800 mediocre articles is now under a hundred dollars. The cost to produce one well-researched, expertly written piece on the same topic is roughly the same as it was a decade ago. The arbitrage is brutal.

Peec AI’s analysis of Google’s 2024 spam action found that 100% of affected pages had AI-generated content, with half the affected sites being completely de-indexed. Detection improves; production costs fall faster.

Chapter 04

The PNAS perverse incentive

A Proceedings of the National Academy of Sciences paper added a third factor to the loop: when given a choice between human-written and AI-written text, large language models sometimes prefer the AI-written version.

The implication is structurally important. Brands optimizing for AI visibility now face a perverse incentive: if the cheapest path to citation is to publish AI-generated content because models prefer reading what other models wrote, the rational response is to flood the web with synthetic content. That is exactly what is happening. The 91.4% number is not a glitch — it’s the rational outcome of the incentive landscape.

Reddit r/science surfaced a corollary finding: nearly two-thirds of AI-generated citations are themselves inaccurate. The web of references AI engines cite is itself made by AI engines, and that web is wrong more often than it is right.

Chapter 05

Why human-grade content still wins

The contrarian finding inside the data is the operational one. Multiple controlled tests now show that human-generated content designed for AI extraction performs “up to an order of magnitude better” than AI-generated content on the same topics.

The mechanism: AI engines reward expertise signals (E-E-A-T), source citation density, and structural extractability. AI-generated content tends to lack first-person experience, novel data, and named authors with credentials. It is structurally legible but evidentially thin. Models cite it because it’s easy to parse; users don’t convert from it because it doesn’t teach them anything new.

The brands winning long-term in AI search are doing the opposite of the AI-generation arbitrage: they are publishing fewer pieces, written by named experts, with original data, structured for extraction but written for humans first.

Chapter 06

The trust paradox for GEO practitioners

The structural challenge for the entire GEO discipline is the trust paradox iPullRank articulated: brands and marketers are investing in strategies to earn citations and visibility — inside systems that are becoming less reliable.

The 91.4% problem is not just a content quality issue. It is a signal degradation issue. The systems brands optimize for are losing the ability to distinguish authoritative sources from synthetic ones. Citation share inside a degrading system is worth less every quarter.

Three operational responses worth considering:

  1. Increase the human-evidence signal density of every page. Named authors with credentials. Original data. First-person observation. The content that AI cannot mass-produce is the content that compounds in value.
  2. Invest in third-party validation pipelines. Trade publication coverage, podcast appearances, conference talks, expert interviews. Earned media that AI cannot generate is the proof of authority that survives the collapse.
  3. Treat AI-generated content as a tax, not a tool. Use it for first drafts and structural scaffolding; never publish without substantial expert rewrite. The economics still favor the slow path for brands that intend to be cited five years from now.

Sources cited

  1. Search Engine Land — “AI-generated content: Benefits, risks & SEO best practices”
  2. Columbia Journalism Review (Tow Center) — “AI Search Has a Citation Problem” (8 engines tested, 2025)
  3. VentureBeat — “The AI feedback loop: Researchers warn of model collapse” (UK/Canadian study, 2024)
  4. iPullRank — “The Content Collapse and AI Slop — A GEO Challenge”
  5. Peec AI — Google 2024 spam action analysis
  6. Proceedings of the National Academy of Sciences (PNAS) — LLM preference for AI-generated text
  7. Reddit r/science — Two-thirds of AI-generated citations inaccurate
  8. Semrush — “Can AI Content Rank on Google?” (20K blog URL analysis)


When AI lies about your company. A brand hallucination field guide.

Air Canada lost a tribunal ruling. Soundslice built a feature ChatGPT invented. Hoka had wrong pricing on display. The hallucination rate runs from 17% to 90% depending on the domain, and more than 40% of users rarely or never check the source.

42.1%
Of users encounter inaccurate AI content; 40%+ rarely or never click through to verify
17–33%
Legal AI hallucination rate (Stanford)
28–90%
Medical AI hallucination rate range
16.78%
Have encountered unsafe or harmful AI advice
Chapter 01

The hallucination rate baseline

AI hallucination is not a fringe edge case. It is a measured, structural feature of probabilistic systems applied to deterministic domains. NeuralTrust’s 2025 analysis frames it bluntly: hallucinations are “an inherent risk of using probabilistic models in deterministic domains.” They are a feature, not a bug.

The published rates by domain:

  • Legal AI tools: Stanford research found 17–33% hallucination rates across major legal-research AI products.
  • Medical AI: documented hallucination rates range 28–90% depending on the system and query type.
  • General AI search: Columbia Journalism Review’s test of 8 engines: 60% inaccurate or misleading.
  • User-side: MarTech data shows 42.1% of users encounter inaccurate AI content; 16.78% have encountered unsafe or harmful AI advice; over 40% rarely or never click through to verify a source.

Translate that to brand math: every day, AI engines make millions of confidently incorrect statements about real businesses, and most users never cross-check.

“There are three kinds of lies: lies, damned lies, and hallucinations.” — UC Berkeley SCET

Chapter 02

The Air Canada precedent

Air Canada’s customer-service chatbot invented a bereavement-fare policy that did not exist. A grieving passenger acted on the AI’s answer, booked the trip at full price expecting a refund, and was denied. He took the airline to a Canadian tribunal, which ruled in 2024 that the airline was liable for the chatbot’s statements: if the AI is the customer-facing voice of the brand, the brand owns what the AI says.

The legal exposure is real and replicable. The National Law Review’s 2025 framing was unambiguous: “If your AI acts as an agent of your business, you likely bear responsibility for what it tells people.” The legal framework is still settling, but the early precedent favors users harmed by hallucinations over the brands deploying the AI.

Chapter 03

The Soundslice case: built what the AI invented

Soundslice, a music-software company, discovered ChatGPT was telling users it had an ASCII tab import feature. The product had no such feature. Users were signing up to use it, finding nothing, and churning. ChatGPT had hallucinated the capability into existence.

The company eventually built the feature, not because it had planned to, but because the AI had created enough demand that shipping it was cheaper than continuing to explain its absence. iPullRank documents this as the canonical example of AI-driven product roadmap pollution: your AI-generated brand reality starts to shape your real product strategy.

Hoka faced a similar issue from a different direction: ChatGPT was showing prospective customers incorrect pricing pulled from outdated third-party sources. Even after Hoka updated their official pricing, the AI continued surfacing the wrong number. The lag between brand correction and AI retraining is months, sometimes longer.

Chapter 04

Why users don’t catch it

The user-behavior side of the hallucination problem is what makes it consequential at scale. Pew Research found only 1% of users click into the cited source when an AI Overview is shown. Bain’s consumer survey found ~80% of consumers now rely on zero-click results.

The trust paradox compounds: an Exploding Topics analysis of 2025 consumer trust data found 82% of users are skeptical of AI results, yet only 8% always check sources. Skepticism without verification creates the worst-case outcome — users who don’t fully trust AI answers but act on them anyway, then carry the misinformation into their decision-making.

For brands, this means: a hallucinated claim about your company in an AI Overview reaches users who are already skeptical of the medium, but who will act on the claim anyway, and will not check whether your website tells a different story.

Chapter 05

The four classes of brand-level hallucination

From the case literature, hallucinations affecting brands cluster into four reproducible patterns:

  1. Outdated training data. AI surfaces old pricing, old leadership, old policies, old product capabilities. The brand updates the website; the AI keeps quoting the old version for months.
  2. Brand confusion. RankScience documented a US consulting firm whose AI responses blended its history with a UK firm of similar name, making the US firm invisible on competitive queries. Brands that share a name with a fictional character face the worst version of this collision.
  3. Fabricated specifics. Invented features, made-up partnerships, hallucinated awards, fake founder biographies. Soundslice is the textbook example.
  4. Negative narrative drift. Third-party criticism, outdated controversies, or competitor-comparison content getting surfaced as the dominant brand frame even when the underlying issue is resolved.
Chapter 06

The defensive GEO playbook

You cannot directly edit an AI model’s output. The most effective response is to saturate the information ecosystem with accurate, structured, authoritative content that gives AI systems high-confidence material to draw from. Edelman’s 2025 framing: “Earned media is the single most important driver of brand visibility in AI-generated responses.”

The MarTech-validated defensive sequence:

  1. Audit systematically. Query all major AI platforms with brand-specific prompts (“What is [Brand]?” / “Who founded [Brand]?” / “What does [Brand] cost?”). Document every false claim. Track over time.
  2. Diagnose the root cause. Is the wrong claim coming from outdated training data, or from a third-party source the model is retrieving live? Different fixes apply.
  3. Update owned content first. Make sure your website, knowledge graphs (Wikidata), and schema.org markup state the canonical truth clearly.
  4. Earn third-party corrections. Get accurate information published on the authoritative sites the AI engines crawl. The mention is the signal.
  5. Use platform feedback mechanisms. Where available (ChatGPT thumbs-down, Perplexity feedback), report incorrect statements about your brand. Slow but cumulative.
  6. Monitor over time. Model retraining cycles are months. Check quarterly that corrections have propagated.
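
The audit and monitoring steps above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any vendor's tooling: the `ask` callables are hypothetical stand-ins for whatever client you use to query each AI platform, and the substring check is a deliberately crude assumption that only flags answers for human review.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable

# Prompt templates from step 1 of the sequence above.
PROMPT_TEMPLATES = [
    "What is {brand}?",
    "Who founded {brand}?",
    "What does {brand} cost?",
]


@dataclass
class AuditRecord:
    checked_on: date
    platform: str
    prompt: str
    answer: str
    missing_facts: list[str]  # expected facts not found in the answer


def audit_brand(
    brand: str,
    platforms: dict[str, Callable[[str], str]],
    expected: dict[str, list[str]],
    checked_on: date,
) -> list[AuditRecord]:
    """Ask every platform every prompt and record which expected facts are absent.

    `platforms` maps a label to a hypothetical ask(prompt) -> answer callable.
    `expected` maps a prompt template to strings an accurate answer should contain.
    """
    records = []
    for platform, ask in platforms.items():
        for template in PROMPT_TEMPLATES:
            prompt = template.format(brand=brand)
            answer = ask(prompt)
            # Crude check (an assumption): a fact counts as present only if
            # its text appears verbatim, ignoring case.
            missing = [
                fact
                for fact in expected.get(template, [])
                if fact.lower() not in answer.lower()
            ]
            records.append(AuditRecord(checked_on, platform, prompt, answer, missing))
    return records


def needs_review(records: list[AuditRecord]) -> list[AuditRecord]:
    """Return the records a human should check against the canonical facts."""
    return [r for r in records if r.missing_facts]
```

Storing each run's records with their `checked_on` date gives the over-time view the final step asks for: a claim that stays in `needs_review` across quarterly runs is one whose correction has not propagated.
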
Chapter 07

The insurance market emerges

A measure of how seriously the brand-risk side is taken: insurance products specifically covering AI-related brand damage are now in market. Munich Re launched aiSure covering AI performance failures. Willis Towers Watson partnered with Liberty Specialty Markets on similar coverage. Lloyd’s syndicates are underwriting AI hallucination liability policies.

The premiums and exclusions are still being calibrated, but the existence of the market is itself a signal. When reinsurers are willing to write coverage, they have priced the risk. The hallucination-driven brand-damage exposure is now a balance-sheet item, not a marketing concern.

For CMOs and CISOs: budget for AI brand monitoring is no longer optional. The cost of catching a hallucination early is measured in monitoring tool licenses. The cost of catching it late is measured in tribunal rulings, lost customer trust, and insurance premiums.

Sources cited

  1. Stanford — Legal AI hallucination study (17–33%)
  2. National Law Review — “AI Hallucinations Are Creating Real-World Risks for Businesses”
  3. Moffatt v. Air Canada — British Columbia Civil Resolution Tribunal ruling (2024)
  4. iPullRank — Soundslice case study + AI hallucination epidemic analysis
  5. Pew Research Center — AI Overview click-rate behavioral study
  6. MarTech — “How to Protect and Control Your Brand Reputation in AI Search”
  7. Yoast — “When AI Gets Your Brand Wrong: Real Examples and How to Fix It”
  8. Edelman — “How Brands Stay Visible in an AI-Driven Search World”
  9. Munich Re aiSure / Willis Towers Watson — AI insurance market data


“Extinction-level event.” How AI search is restructuring the open web.

NPR’s framing for what publishers are facing. The Daily Mail’s vice chair: 50% of traffic gone in five years. 500+ publications in litigation. A handful of platforms now control how billions discover information.

50%
Of publisher traffic projected to disappear within five years (Daily Mail vice chair)
25%
Of Google searches now trigger an AI summary
500+
Publications that have filed or joined lawsuits against AI platforms
$50M
News Corp annual licensing deal with OpenAI
Chapter 01

The NPR framing

NPR’s July 2025 reporting on AI search’s impact on publishers used a phrase that has stuck in the industry: “an extinction-level event.” The framing is dramatic but not hyperbolic. The data behind it is consistent across independent measurements.

TechCrunch documented that “referrals to news sites are plummeting, cutting off the traffic publishers need to sustain quality journalism.” Search Engine Journal quoted publisher leadership: “We’re definitely moving into the era of lower clicks and lower referral traffic for publishers.” The golden age of search traffic is ending. Digital Content Next’s August 2025 member survey found that median year-over-year referral traffic from Google declined sharply after AI Overviews expanded.

The cause is structural: when roughly 25% of Google searches trigger an AI summary that answers the user’s question without requiring a click, publisher economics break. Ad revenue collapses. Subscription conversion paths dry up. Editorial budgets shrink. Less content gets produced. Less authoritative content exists for AI engines to draw from. The loop closes badly.

“All publishers could see 50 percent of their traffic gone in five years.” — Rich Caccappolo, Vice Chair, Daily Mail parent company

Chapter 02

The Daily Mail projection

Rich Caccappolo, vice chair of Media at the Daily Mail’s parent company, told The Atlantic in June 2025 that all publishers “could see 50 percent of their traffic gone in five years.” The Daily Mail is one of the world’s largest English-language news properties. The projection came from inside the most resource-equipped tier of publishing.

The Atlantic’s own reporting captured the broader sentiment: “I’ve spoken with several news publishers, all of whom see AI as a near-term existential threat to their business.” Futurism documented that the steepest declines occurred in mid-2025 specifically — the moment Google expanded AI Overview coverage from a small share of queries to roughly a quarter of all searches.

Chapter 03

The litigation front

The legal response is the largest publisher-vs-platform litigation cluster in the history of the open web. 500+ publications have filed or joined lawsuits against AI search platforms. The most prominent:

  • The New York Times sued OpenAI and Perplexity for copyright infringement.
  • Dow Jones, publisher of The Wall Street Journal, sued Perplexity.
  • Google faces an EU antitrust investigation specifically about AI Overviews using publisher content.
  • Wolf River Electric’s defamation suit against Google is testing whether AI-generated false claims meet the legal standard for libel.

The legal question underneath all of these cases is unsettled: do AI engines have the same Section 230 protections as traditional search engines? The American Bar Association’s November 2024 analysis noted “the absence of comprehensive AI regulation clearly defining the contours and applicability of Section 230 immunity in the context of generative AI.”

Chapter 04

The licensing alternative

Some publishers have chosen the negotiation path instead of litigation. News Corp signed a $50M-per-year licensing deal with OpenAI. The Associated Press has a similar agreement. Other major outlets have followed.

The split strategy reflects publisher uncertainty about the better path. Sue for damages and clarification of legal standards? Or negotiate for predictable revenue and a seat at the table?

The math favors negotiation only for the largest publishers. A $50M annual licensing deal is meaningful for News Corp’s revenue base. Scaled down to a mid-tier publisher’s revenue base, the equivalent payment is too small to compensate for the traffic loss it replaces. Local news, trade publications, and most B2B media outlets are not getting licensing offers, and would not survive on a proportional payment if they did.

Chapter 05

The 25% threshold

Futurism’s 2025 analysis identified the inflection point: by mid-2025, roughly 25% of all Google searches triggered an AI summary. That number is the threshold at which publisher traffic loss becomes acute rather than chronic.

Below 10% AI Overview saturation, publishers absorbed the loss as a manageable headwind. Above 25%, the loss broke the unit economics of ad-supported publishing for everyone except the largest players. Above an estimated 50% saturation — which Google’s trajectory suggests is reachable within 24 months — entire categories of publishing become unviable.

The categories most exposed: top-of-funnel informational content (the standard SEO-driven blog model), review and comparison content (synthesized away by AI Overviews), and how-to and tutorial content (also synthesized). The categories most insulated: opinion, original reporting, breaking news, and brand-relationship-driven subscription content.

Chapter 06

The affiliate-marketing collapse

The collateral damage extends beyond traditional publishers. Affiliate marketing — the entire industry of review-driven product comparison content and recommendation sites — faces a more acute version of the same dynamic.

Affiversemedia’s 2025 analysis: “Users increasingly discover, evaluate, and decide on purchases within AI environments before ever clicking an affiliate link.” Acceleration Partners reported “significant drops in organic traffic because these AI search summaries don’t drive clicks.”

The affiliate model isn’t dead — coupons, loyalty programs, and partner networks remain viable — but the SEO-driven affiliate playbook of the past 15 years (write “best of” lists, rank organically, capture commission on click-through) is being structurally disintermediated. The brands that adapt are the ones building direct authority that earns AI citations rather than relying on the click.

Chapter 07

Concentration: from open web to four platforms

Forbes’s April 2025 framing of the macro shift was precise: “We’re shifting from the chaos of the open web to the centralization of a few AI platforms.” AdExchanger’s parallel analysis: “Generative AI didn’t just transform search results; it upended how monetization works on the open web.”

The concentration math: four companies (Google, OpenAI, Microsoft, Anthropic) now control nearly all mainstream AI-mediated information discovery. Google’s historic search dominance was partial because users had alternatives (Bing, DuckDuckGo, vertical search). The AI search consolidation is qualitatively different: the concentration is more pronounced and the shift in user behavior runs in one direction.

The downstream consequences run through the entire information ecosystem: 32% of US/UK consumers now say AI is negatively disrupting the creator economy (up from 18% in 2023). Small website owners face the resource gap most acutely — GEO monitoring tools, content optimization, and brand building all require investment they may not have.

Chapter 08

What survives, and what brands should do about it

The publishers, brands, and content categories that survive the next 24–36 months will be the ones that adapt to one structural reality: the click is no longer the unit of value. Citation share, brand-mention frequency, and AI-mediated visibility are the new metrics.

For brand operators reading this as buyers (not as publishers), three implications matter:

  1. If your category is heavily mediated by news and review content — consumer products, B2B SaaS, healthcare, financial services — the third-party validation pipeline you used to depend on is shrinking. The brands that earn citation share will increasingly do so through direct authority signals, not by hoping a review aggregator picks them up.
  2. If your acquisition funnel depended on top-of-funnel SEO content, that channel is collapsing. Move budget toward authority-building content (original research, named-author publishing, expert positioning) and earned media (PR, podcast appearances, conference talks).
  3. If you publish content as a brand, you are now operating in the same competitive environment as the publishers being affected by this shift. The bar is higher; the rewards are concentrated. The middle is being hollowed out.

Sources cited

  1. NPR — “Online news publishers face ‘extinction-level event’” (July 2025)
  2. The Atlantic — “AI Is Already Crushing the News Industry” (June 2025)
  3. TechCrunch — “Google’s AI search features are killing traffic to publishers”
  4. Digital Content Next — Member survey (August 2025)
  5. Futurism — Google AI Overview saturation analysis
  6. Forbes Tech Council — “AI And The Future Of Search”
  7. AdExchanger — “The AI Search Reckoning Is Dismantling Open Web Traffic”
  8. Press Gazette — Publisher lawsuits and licensing deals tracker
  9. American Bar Association — Section 230 / generative AI analysis


The honest skeptic’s case against GEO.

Rand Fishkin: fewer than 1 in 100 prompt runs return the same brands. Profound: 40–60% of cited domains change in a month. A founder shut down his GEO tool after concluding it was just good marketing. The strongest counter-arguments deserve a fair hearing.

< 1 in 100
Prompt runs that produce the same brand list (Rand Fishkin / SparkToro)
40–60%
Of cited domains change within one month (Profound)
< 1/3
Of SEO leaders maintain consistent GEO terminology
80%
Of GEO is repackaged fundamental SEO, per the strongest critics
Chapter 01

Why this article exists

Most published GEO content is sold by people whose business depends on GEO being important. That is not a disqualifying conflict, but it is a real one. The strongest case against the discipline deserves a hearing on its own terms — not a strawman, not a polite acknowledgment before the inevitable rebuttal, but the actual argument as its proponents make it.

What follows is the most credible version of the skeptical position, drawn from named practitioners with deep SEO experience and data-backed reasons for their skepticism. Where the position is right, the article says so. Where the position over-reaches, the article says that too. The goal is the honest assessment a serious buyer should be able to make before committing budget.

Chapter 02

Fishkin’s consistency finding

The most empirically devastating critique comes from Rand Fishkin’s SparkToro research published in 2025. The methodology was straightforward: ask the same brand-recommendation question of an AI engine, repeatedly, and measure how consistent the responses are.

The findings:

  • Fewer than 1 in 100 prompt runs produce the same list of brands.
  • Fewer than 1 in 1,000 produce the same list in the same order.
  • The probability of identical lists in identical order is under 0.1% regardless of platform, vertical, or query phrasing.

The implication, in Fishkin’s framing: any tool that gives a “ranking position in AI” is full of baloney. AI engines are probability machines designed to generate unique answers every time. Treating them as deterministic ranking systems is, in the technical sense of the word, nonsensical.

This critique is correct. It does not make GEO irrelevant — visibility percentage measured across many prompt repetitions is a valid metric — but it does invalidate a specific class of GEO marketing claims that promise “rank #1 in ChatGPT” or “guaranteed Perplexity placement.” Buyers should treat any vendor making such claims as either uninformed or dishonest.
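
To make "visibility percentage measured across many prompt repetitions" concrete, here is a minimal Python sketch. It assumes each run has already been reduced to a list of recommended brand names; the function names are illustrative, and the pairwise definition of "same list" is an assumption, since the exact SparkToro methodology is not reproduced here.

```python
import math
from itertools import combinations


def visibility_rate(runs: list[list[str]], brand: str) -> float:
    """Share of runs whose recommended-brand list includes `brand`."""
    if not runs:
        raise ValueError("need at least one run")
    target = brand.lower()
    hits = sum(1 for brands in runs if target in {b.lower() for b in brands})
    return hits / len(runs)


def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def identical_pair_rate(runs: list[list[str]], ordered: bool = False) -> float:
    """Share of run pairs that returned the same brands (optionally in the same order)."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs")

    def key(brands: list[str]):
        names = [b.lower() for b in brands]
        return tuple(names) if ordered else frozenset(names)

    same = sum(1 for a, b in pairs if key(a) == key(b))
    return same / len(pairs)
```

The interval matters as much as the rate: with only a handful of runs the Wilson bounds are wide, which is the practical reason a single prompt run says little about a brand's standing.
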

“Any tool that gives you a ‘ranking position in AI’ is full of baloney.” — Rand Fishkin, SparkToro

Chapter 03

Profound’s citation drift research

Profound, one of the largest GEO platforms by market presence, published research that arguably undercuts the durability of its own category. The findings:

  • 40–60% of the domains cited in AI responses turn over within one month.
  • 70–90% turn over within six months.

The interpretation: AI citation positions are radically more volatile than Google rankings ever were. A site that wins citation share this quarter has no strong reason to expect that share next quarter. The optimization investment may not compound the way SEO investment historically did.

This critique is partially correct. The citation drift is real and measurable. But the interpretation cuts both ways: the same volatility means challenger brands can capture share quickly, in ways that were structurally impossible against incumbent SEO authority moats. The volatility is bad for incumbents and good for challengers. Whether your organization sees it as a threat or an opportunity depends on which side of that line you sit on.
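
Citation drift of this kind can be tracked with a simple set comparison. The sketch below is one plausible definition, not Profound's published methodology (which is not detailed here): drift is the share of previously cited domains that no longer appear, and the newcomer share is the mirror image that challengers care about.

```python
def normalize(domain: str) -> str:
    """Lowercase a domain and drop a leading 'www.' so variants compare equal."""
    return domain.strip().lower().removeprefix("www.")


def citation_drift(before: list[str], after: list[str]) -> float:
    """Fraction of domains cited in `before` that no longer appear in `after`."""
    old = {normalize(d) for d in before}
    new = {normalize(d) for d in after}
    if not old:
        raise ValueError("`before` must contain at least one domain")
    return len(old - new) / len(old)


def newcomer_share(before: list[str], after: list[str]) -> float:
    """Fraction of domains cited in `after` that were absent from `before`."""
    old = {normalize(d) for d in before}
    new = {normalize(d) for d in after}
    if not new:
        raise ValueError("`after` must contain at least one domain")
    return len(new - old) / len(new)
```

Running both functions on monthly snapshots of cited domains for the same prompt set shows the two sides of the argument at once: a high drift value is the incumbent's loss, and a high newcomer share is the challenger's opening.
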

Chapter 04

Why Lorelight shut down

The most damaging critique of the GEO industry came from inside the GEO industry. Ben Goodey built Lorelight, an AI visibility tracking tool, then shut it down in 2025. His public reasoning was unusually candid.

The conclusion that ended the company: “Everything companies needed for GEO was the same as what they already needed for good marketing.” Lorelight experienced significant churn because GEO “currently has no long-term strategy” — tactics were “hacks more than strategies.”

From Goodey’s exit interview with Section: “Companies mostly expected a magic bullet. They thought they would log in and it would say, ‘if you do this one thing, suddenly you will be number one in ChatGPT.’ Instead they got: ‘if you keep doing the hard work, then you will show up.’”

This is the strongest skeptic argument because it comes from someone who tried to build the product, learned what wasn’t working, and was honest about the result. The fair response: Goodey is right that GEO is not a magic bullet. He is also right that 80% of the work overlaps with foundational marketing. He is wrong, in the view of multiple operators who continued investing, that the remaining 20% doesn’t matter. That 20% is exactly where citation share is decided.

Chapter 05

The snake-oil critique

Kai Spriestersbach, the AI researcher and SEO veteran widely cited in the European GEO conversation, made the bluntest version of the skeptic argument in Business Insider:

“Everyone is going crazy about becoming the next agency — all the side hustlers and snake oil sellers with their tools already on the train and riding the hype.”

The market reality validates the framing. 20+ GEO tools launched in 2025 alone. Several have raised at $100M+ valuations despite operating in a field where, as Spriestersbach notes, no one has proven, replicable methodologies. Lily Ray (Amsive) and Jeremy Moser (uSERP) have both publicly warned that “80% of GEO is good fundamental SEO” and that anyone claiming otherwise is selling snake oil.

Where the critique is right: the industry has attracted bad actors. Where the critique over-reaches: 80% overlap means 20% net-new discipline, and that 20% is where competitive positioning gets decided. The buyer’s test is not whether GEO has snake oil — every emerging discipline does — but whether the specific vendor in front of you publishes their methodology, cites primary research, and can show measurable lift attributable to their work.

Chapter 06

Where Google’s own representatives draw the line

The most institutional version of the skeptic position comes from inside Google. The platform’s public-facing representatives have been remarkably consistent:

  • John Mueller (Search Advocate, January 2026): “There is no such thing as GEO or AEO without doing SEO fundamentals.”
  • Nick Fox (Google): “Optimizing for AI search is the same as optimizing for traditional search.”
  • Danny Sullivan (Search Liaison) has emphasized that any GEO tools advising content designed “solely for rank and visibility purposes” lose “track of the big picture.”

The position is consistent across the company and over time: SEO fundamentals matter, manipulation will be resisted, there are no shortcuts.

This is the most credible institutional skeptic position because it comes from the platform that benefits most when SEO and GEO converge. Google’s incentive is to keep the optimization disciplines unified under its rules. The position is also partially supported by the data: Bounteous research found 99% of URLs in Google AI Mode appear in the top 20 organic search results. SEO foundations carry through to AI visibility.

Where the platform position over-reaches: it understates the 20% of GEO that is genuinely different from SEO, namely entity optimization, passage-level structure, cross-platform consistency, earned-media gravity, and community presence. These are net-new disciplines that classical SEO does not cover.

Chapter 07

The thoughtful counter-position

The honest synthesis: every credible skeptic critique above is partially correct. None of them, taken individually or collectively, justifies ignoring the underlying shift.

What the skeptics get right:

  • AI engines are not deterministic ranking systems; vendors selling “ranking positions” are misleading buyers.
  • Citation drift is real; optimization gains are less durable than SEO equivalents were.
  • 80% of GEO overlaps with foundational marketing; agencies pretending otherwise are overstating their differentiation.
  • The industry has attracted bad actors; due diligence on vendors is essential.
  • SEO fundamentals carry through; teams that ignore them while chasing GEO tactics are building on sand.

What the skeptics get wrong:

  • The 20% of GEO that is net-new (entity, passage, cross-platform, earned, community) is where competitive positioning is increasingly decided.
  • Citation drift cuts both ways; challengers gain share faster than they ever could against incumbent SEO moats.
  • “It’s just good marketing” understates how the structure of marketing has changed when 60% of consumer information starts in an AI conversation.
  • The discipline being immature doesn’t mean the underlying shift isn’t real; it means the playbook is still being written.

The honest buyer’s position is somewhere in the middle. Skepticism toward GEO vendors is healthy. Investment in the underlying capability is rational. The question is not whether to engage. It is which vendor, at what scale, with what measurement framework, and on what timeline.

Sources cited

  1. Rand Fishkin / SparkToro — “AIs are highly inconsistent when recommending brands” (2025)
  2. Profound — Citation drift research
  3. Section / Ben Goodey interview — “Is SEO dead or is GEO hype?”
  4. Business Insider — “AI Search Reshapes SEO, Fueling GEO Gold Rush” (Spriestersbach quote)
  5. MarTech / Mike Maynard — “GEO isn’t a fad — but most GEO tactics won’t survive”
  6. Search Engine Land — Mueller, Fox, Sullivan statements (multiple, 2025–2026)
  7. Bounteous — Google AI Mode top-20 organic citation overlap
  8. Search Engine Land — 75 SEO thought leader sentiment analysis


Black hat GEO: the manipulation playbook (and why it’s doomed).

Three categories of manipulation are spreading: data poisoning, citation stuffing, and hidden prompt injection. Harvard demonstrated text sequences that force LLM outputs. Reboot Online ran a negative GEO experiment against itself. The platforms are evolving faster than the attackers.

  • 3 categories of GEO manipulation now in active use across the industry
  • 40% of ChatGPT insights pulled from Reddit (astroturfing target)
  • 95% of consumers check reviews before purchasing
  • 100% of Google’s 2024 spam action targets had AI-generated content

Chapter 01

Where the line is

TigerTracks, in one of the cleanest published framings, drew the operational distinction: “Optimization focuses on structural clarity and technical transparency. Manipulation focuses on deceptive influence.”

The line is not always crisp, but the categories are now well-documented. Three classes of GEO manipulation have emerged from the practitioner literature, each a direct descendant of black-hat SEO tactics adapted for the synthesis-based logic of AI engines. Each is more dangerous than its predecessor because AI synthesizes a single authoritative-sounding answer (versus ten blue links), so manipulation has outsized consequences. Users get one “truth” rather than multiple perspectives.

Chapter 02

Class 1 — Data poisoning and synthetic consensus

The cleanest version: use AI to generate hundreds of articles on a topic, all making the same claim, published across a network of mid-authority sites. The articles never quite reach top-tier publications, but they create enough repetition across the web that AI engines treat the manufactured consensus as evidence.

The 2023 SEO heist was the canary — 1,800 AI-generated articles, 3.6 million visits stolen from a competitor, public bragging on social media. Google eventually applied a manual action. The economics still favor the attacker: generating 1,800 mediocre articles costs under $100 today. The detection burden falls on platforms and publishers.

The astroturfing variant targets Reddit specifically because ~40% of ChatGPT’s insights are pulled from Reddit. Coordinated comment campaigns, fake review sites linking to manufactured discussions, AI-generated “authentic” user voices. Evertune AI’s warning is precise: “If Reddit detects and removes astroturfed content, that manipulated data won’t make it into the datasets that train AI models” — but the catch rate is far below 100%, and what slips through can persist in training data for years.

“Optimization helps the engine perform its duty. Manipulation hijacks the model’s decision logic through noise.” (TigerTracks)

Chapter 03

Class 2 — Citation stuffing and link farm 2.0

The second class adapts the link-farm playbook to citation-driven discovery. The mechanic: build a network of sites that cite each other in the specific patterns AI engines use to assess source authority — comparison tables, “best of” lists, expert roundups — then game the inclusion criteria.

Sports Illustrated’s 2023 incident was the high-profile cautionary example: AI-generated articles published under fake writer profiles with synthetic credentials. The credibility damage outlasted the traffic gain by years.

The defensive read: AI engines already weight third-party citations heavily, which makes citation stuffing tempting. The empirical read: platforms are getting better at recognizing the patterns. Google’s 2024 spam action affected pages with 100% AI-generated content and de-indexed half of those sites. The arbitrage window is narrowing.

Chapter 04

Class 3 — Hidden prompt injection and the Harvard finding

The most technically sophisticated class is the most concerning long-term. Harvard researchers demonstrated “strategic text sequences” — nonsensical-looking character strings that, when added to product pages or reviews, can force LLMs to generate specific outputs. The sequences look like garbled noise to humans but encode instructions the model interprets as commands.

Aounon Kumar, the Harvard researcher who published the work, framed the implication: “The challenge lies in anticipating and defending against a constantly evolving landscape of adversarial techniques.” Princeton’s Ameet Deshpande, co-author of the foundational GEO paper, was equally candid: “It’s a cat and mouse game. These generative engines are not static, and they’re also black boxes.”

The downstream concern: if adversarial content creators successfully game these systems at scale, “a lot of traffic is going to go to them, and 0% will go to good content creators.” The structural fairness of AI-mediated discovery depends on platforms staying ahead of the adversarial techniques. So far they have. The asymmetry favors them — they have the model weights and the detection infrastructure — but the cat-and-mouse dynamic is permanent.

Chapter 05

The negative GEO threat

The dark mirror of the manipulation playbook: negative GEO, where competitors attack your brand by publishing strategically placed negative content that AI engines pick up.

Reboot Online’s 2025 negative GEO experiment was the first published demonstration. Researchers planted negative claims about a test target across mid-authority sites and measured how often AI engines surfaced the manufactured negative narrative. Perplexity repeatedly cited the test sites with the fabricated negative claims. ChatGPT showed more resistance but was not immune.

The implication for brands: defensive GEO is not optional. Even if your team has no interest in offensive optimization, your competitors may. The brands with strong owned-content authority and earned-media presence are insulated; the brands without are exposed to attacks they cannot directly counter.

Chapter 06

Why platform pushback is consistent

The most underweighted signal in the manipulation conversation is platform consistency. Every major AI platform’s public-facing representatives have published essentially the same position:

  • Danny Sullivan (Google): Best practices centered on genuine helpfulness and authority will win long-term.
  • Krishna Madhavan (Microsoft Bing): No shortcuts; manipulation will be resisted.
  • Jesse Dwyer (Perplexity): Platform resistance to manipulation is a core engineering investment.

The institutional incentive aligns with the public statement: platforms cannot afford to lose user trust to manipulated outputs. Google’s 2024 spam action and ongoing model updates demonstrate the operational follow-through. The arms race favors platforms because they have the model weights, the detection telemetry, and the existential motivation.

The historical analogy: the first decade of SEO had a similar dynamic. White-hat operators built durable practices; black-hat operators captured short-term gains and were eventually punished. The asymmetry between durable and fleeting positions widens as platforms mature. GEO is at roughly the equivalent of SEO’s 2007: the manipulation tactics work today, will mostly be detected within 18–24 months, and will leave the brands that depended on them holding the bag.

Chapter 07

The legal exposure most operators are missing

The under-discussed risk in the manipulation conversation is the legal one. Several of the techniques in active use are illegal, not merely against platform terms of service:

  • Astroturfing is illegal in many jurisdictions, including under FTC consumer protection authority.
  • Fake reviews have triggered FTC enforcement actions including $25M fines (recent cases).
  • Synthetic personas with fake credentials may constitute fraud depending on context and damages.
  • Hidden prompt injection has not been tested in court, but legal scholars consider it likely to be characterized as deceptive practice.

For brands evaluating GEO vendors: any vendor whose methodology relies on these techniques is exposing the buyer’s brand to legal risk in addition to platform risk. The due diligence question that should be asked is direct: “Walk me through every action you take that creates content, citations, or third-party signals on our behalf, and tell me whether each one would survive an FTC inquiry.” Vendors who can’t answer that question cleanly are not vendors a serious brand should engage.

Sources cited

  1. TigerTracks — “The Ethics of GEO: Where Performance Optimization Ends and Manipulation Begins”
  2. iPullRank — “Trust, Truth, and the Invisible Algorithm”
  3. Aounon Kumar (Harvard) — Strategic text sequences research
  4. Aggarwal et al. (Princeton) — GEO paper, adversarial techniques discussion
  5. Reboot Online — Negative GEO experiment (2025)
  6. Evertune AI — “7 Rules For Reddit Engagement That Improves AI Visibility”
  7. Search Engine Land / Jason Tabeling — “Black hat GEO is real”
  8. Similarweb — “Negative GEO: How Competitors Can Harm Your Reputation on AI”
  9. Peec AI — Google 2024 spam action analysis


GEO ethics in 2026: no framework, growing stakes.

No industry body. No code of ethics. No enforcement mechanism. As 37% of consumers start searches with AI and 82% are skeptical of the answers, the discipline is being built on every operator’s individual judgment.

  • 0 industry-wide GEO ethics frameworks adopted as of 2026
  • 37% of consumers start searches with AI, not Google
  • 82% are skeptical of AI results
  • 8% always check sources

Chapter 01

The ethical imperative iPullRank named

iPullRank’s Michael King has done the most rigorous published thinking on the GEO ethics question. The frame: “engineer relevance responsibly, or allow the machines to engineer our reality for us.”

The choice is not theoretical. AI systems “perform truth rather than presenting it” — they generate single authoritative-sounding answers without the disclaimers and confidence-interval signaling that academic research carries. Users receive AI outputs with disproportionate trust. Manipulation of those outputs has disproportionate consequences.

The structural problem: no industry body has adopted a code of ethics for GEO. No certification, no standards, no enforcement mechanism. Each practitioner makes their own decisions about where the line sits. The most credible operators have published their own frameworks; the least credible have not.

“The invisible algorithm’s most visible impact may be whether we choose to engineer relevance responsibly or allow the machines to engineer our reality for us.” (iPullRank)

Chapter 02

TigerTracks’ line

TigerTracks proposed the cleanest operational distinction in the published literature: optimization helps the engine perform its duty to the user; manipulation hijacks the model’s decision logic through noise.

The test is functional, not procedural. Three diagnostic questions for any GEO tactic:

  1. Is the underlying claim true? If the optimization makes a true claim more findable, it’s helping the engine. If it makes a false claim more findable, it’s manipulating it.
  2. Would a knowledgeable human reviewer agree your brand belongs in the answer? If yes, the tactic is structural advocacy for an honest position. If no, the tactic is engineering an outcome the evidence doesn’t support.
  3. Does the tactic depend on the AI engine not noticing? If yes, it’s manipulation by definition. If the same tactic survives full transparency to the platform, it’s optimization.

Tactics that pass all three tests: structured content, schema markup, accurate brand entity definition, earned-media outreach, original research publishing. Tactics that fail any one test: synthetic consensus generation, fake reviews, hidden prompt injection, fabricated credentials.
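
The three questions can be read as a simple decision rule: a tactic counts as optimization only if it passes every test. The sketch below is one illustrative way to write that rule down. The `Tactic` record, its field names, and the answers assigned to the two sample tactics are assumptions made for the example, not part of the TigerTracks framing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tactic:
    """Answers to the three diagnostic questions for one GEO tactic."""
    name: str
    claim_is_true: bool            # Q1: is the underlying claim true?
    reviewer_would_agree: bool     # Q2: would a knowledgeable reviewer agree?
    depends_on_not_noticing: bool  # Q3: does it rely on the engine not noticing?


def classify(tactic: Tactic) -> str:
    """Return 'optimization' only when all three tests are passed."""
    passes_all = (
        tactic.claim_is_true
        and tactic.reviewer_would_agree
        and not tactic.depends_on_not_noticing
    )
    return "optimization" if passes_all else "manipulation"


# Illustrative answers only; a real review would justify each one.
schema_markup = Tactic("schema markup", True, True, False)
hidden_prompt = Tactic("hidden prompt injection", True, False, True)

print(classify(schema_markup))
print(classify(hidden_prompt))
```

Failing any single question is enough to land in the manipulation bucket, which mirrors the "fail any one test" wording above.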

Chapter 03

The bias dimension

UNESCO’s ongoing work on AI ethics has surfaced a less-discussed GEO concern: AI systems amplify the biases embedded in their training data. Gender, racial, cultural, and ideological biases that exist in the underlying corpus get propagated into AI recommendations.

Stanford GSB research on AI political bias found that both Republicans and Democrats perceived left-leaning bias in LLMs’ discussion of contentious topics. The directional finding is less important than the structural one: AI engines are not neutral arbiters. They reflect the perspectives of the corpora they were trained on and the humans who reinforced them.

For brands operating in politically or culturally sensitive categories (healthcare, education, financial services, news), the implication is operational: even unimpeachable optimization tactics can produce outputs that some user segments perceive as biased. The defensive read: brands need to monitor not just whether they’re cited but in what framing.

Chapter 04

The trust paradox

Exploding Topics’ 2025 consumer-trust research surfaced the paradox that defines the user-side ethics question:

  • 82% of users are skeptical of AI results.
  • Only 8% always verify sources.
  • Forbes found AI search results are more trusted than ads; consumers describe AI as “less cluttered” than search.
  • 37% of consumers now start searches with AI instead of Google (Search Engine Land).

The pattern: users distrust AI in principle and depend on it in practice. The skepticism does not translate to verification behavior. The result is the worst-case dynamic for GEO ethics — a user base that knows AI can be wrong but acts as if it were right, and a brand-side incentive structure that rewards the brands whose optimization is most aggressive whether or not the underlying claims are accurate.

Chapter 05

What ethical operators publish

In the absence of an industry framework, the most credible operators have published their own ethical commitments. The patterns that emerge across them:

  1. Disclose methodology. If a vendor cannot explain in plain language what they do for clients, the methodology is probably either trivial or dishonest. The credible operators publish their playbooks.
  2. Cite primary sources. Claims about AI engine behavior should be backed by named studies, not asserted. The credible operators link to their evidence.
  3. Refuse synthetic-volume tactics. Mass AI-generated content, fake reviews, synthetic personas. The credible operators publicly reject these.
  4. Document client commitments. What you will and will not do on a client’s behalf, in writing, before the engagement starts.
  5. Acknowledge uncertainty. AI engines are evolving black boxes. Vendors who claim certainty about future behavior are either selling something or not paying attention.

Aleyda Solis, who runs Orainti and the SEOFOMO newsletter (35K+ subscribers), has been among the most consistent published voices on the ethics question. Her central warning: “If you treat AI search solely as a performance channel — expecting traffic and revenue from every inclusion in AI answers — you’ll set yourself up for disappointment.” Treating GEO purely as performance optimization, not as relationship-building with the information ecosystem, is the path that leads operators across the ethical line.

Chapter 06

The buyer’s diligence framework

Until an industry body publishes a code of ethics, buyers are the de facto enforcement mechanism. The questions a serious buyer should ask any GEO vendor:

  1. Walk me through every type of action you take on a client’s behalf. Specific, not generic. Get the list in writing.
  2. Which of these actions involve creating content, citations, or signals that originate from sources other than the client? If the answer is “none,” the engagement is probably purely advisory. If the answer involves third parties, dig in.
  3. Do you publish synthetic personas, AI-generated reviews, or coordinated comment campaigns? The honest answer should be “no.” If it isn’t, walk away.
  4. How do you measure success? If the answer involves “ranking position,” the vendor either misunderstands AI engines or is misleading you.
  5. What happens to the brand’s reputation if a platform flags any of your tactics? The vendor should have thought about this. If they haven’t, the buyer is carrying the risk alone.

The brands that emerge from the next 24 months with strong AI visibility and intact reputations will be the ones whose vendors could answer those questions cleanly.

Sources cited

  1. iPullRank (Michael King) — “Trust, Truth, and the Invisible Algorithm”
  2. TigerTracks — “The Ethics of GEO”
  3. UNESCO — AI ethics research and bias documentation
  4. Stanford GSB — Political bias in LLM outputs
  5. Exploding Topics — “The AI Trust Gap: 82% Are Skeptical, Yet Only 8% Always Check Sources”
  6. Forbes — “AI Search Results More Trusted Than Ads”
  7. Search Engine Land — Aleyda Solis on AI search ethics
  8. LSEO, MaximusLabs, Nuoptima, Blazly — Vendor-published ethics frameworks


The legal front: 500+ lawsuits, antitrust, and AI defamation.

The New York Times sued OpenAI and Perplexity. Google faces EU antitrust over AI Overviews. The Section 230 question is unsettled. Wolf River Electric is testing AI defamation in court. Every brand operating in AI search needs a legal briefing.

  • 500+ publisher lawsuits filed against AI search platforms
  • $50M: News Corp annual licensing deal with OpenAI
  • 2026: EU AI Act labeling requirements take effect
  • Unsettled: Section 230 protection for AI-generated content

Chapter 01

The litigation landscape

The publisher-vs-platform litigation cluster is the largest in the history of the open web. 500+ publications have filed or joined lawsuits against AI search platforms. The named cases that matter most for setting precedent:

  • The New York Times v. OpenAI & Microsoft — the most watched case; tests whether training and serving from copyrighted content constitutes infringement.
  • The New York Times v. Perplexity — separate case, focused on real-time citation rather than training.
  • Dow Jones v. Perplexity — joining the wave of suits over verbatim and near-verbatim use of paywalled content.
  • EU Commission antitrust investigation of Google — specifically about AI Overviews using publisher content without compensation.
  • Wolf River Electric v. Google — the first prominent test of AI-generated defamation against a search engine.

The cases are at different stages, in different jurisdictions, with different legal theories. The cumulative effect, regardless of individual outcomes, is the establishment of legal standards that will shape AI search for the next decade.

Chapter 02

The Section 230 question

The most consequential unsettled legal question in AI search is whether Section 230 of the Communications Decency Act — the law that protects search engines and platforms from liability for user-generated content they index — extends to AI-generated outputs.

The American Bar Association’s November 2024 analysis was unambiguous on the uncertainty, pointing to “the absence of comprehensive AI regulation clearly defining the contours and applicability of Section 230 immunity in the context of generative AI.” Traditional search engines are intermediaries that display third-party content. AI engines generate new content. The legal distinction matters.

If courts hold that AI-generated content is “speech” by the platform itself rather than indexed third-party content, Section 230 protections likely don’t apply. That would make AI platforms liable for the truthfulness of their outputs in the same way a publisher is liable for its articles. The downstream consequences would reshape the entire economics of AI search.

“If your AI acts as an agent of your business, you likely bear responsibility for what it tells people.” (National Law Review)

Chapter 03

The Wolf River Electric defamation precedent

The most operationally relevant emerging case for brand-side legal exposure is Wolf River Electric v. Google. The company sued after Google’s AI-generated search results fabricated negative claims about it. The case is one of the first prominent tests of whether AI-generated defamation against a brand meets the legal standard for libel.

The Columbia Law Review’s 2025 analysis, “Redefining Defamation: Establishing Proof of Fault for AI Hallucinations,” framed the central question: defamation law requires proof of fault, and for public figures that means knowing falsity or reckless disregard for the truth. How do you apply intent-based standards to a probabilistic system that doesn’t “know” anything?

The Battle v. Microsoft Corp. case is testing a parallel question — an Air Force veteran sued over AI-generated false claims about him. The legal framework for AI defamation is being built case by case in real time. For brands, the implication is twofold: you may have a cause of action when AI lies about you, and your competitors may have one when AI lies about them in your favor. Both directions of risk are operational.

Chapter 04

The EU AI Act and labeling requirements

The most concrete regulatory development is the EU AI Act, which began phasing in requirements through 2025 and 2026. The provisions most relevant to GEO operators:

  • AI-generated content must be clearly labeled as AI-generated, with disclosure obligations on the platforms generating the content and on entities deploying it.
  • Transparency requirements for systems that interact with consumers, including chatbots and AI search interfaces.
  • Risk-based classification with stricter requirements for high-risk applications (financial, medical, legal).
  • Conformity assessments for AI systems before market entry in regulated categories.

The extraterritorial reach matters: the EU AI Act applies to any AI system used by EU residents, regardless of where the system is operated from. US-based brands optimizing for AI visibility need to factor EU compliance into their content and disclosure strategies.

Individual US states are passing their own AI disclosure laws in parallel, creating a patchwork of state-level requirements. Federal AI legislation has not been enacted, but the FTC has begun applying existing consumer protection authority to AI claims — the May 2025 enforcement actions against companies making deceptive AI claims signal the direction.

Chapter 05

FTC enforcement and false-AI-claim risk

The Federal Trade Commission has not waited for new AI legislation. The agency has begun applying its existing consumer protection authority to AI-related deceptive claims. The pattern emerging from enforcement actions:

  • $25 million fine against a company that used deceptive AI claims to defraud consumers (recent FTC action).
  • Orders barring companies from advertising services dedicated to generating fake consumer reviews or testimonials — a direct strike against the manipulation playbook.
  • Joint statement from FTC, EEOC, CFPB, and DOJ clarifying that their existing authority covers AI applications.
  • Guidance against falsely claiming AI capabilities, falsely attributing AI-generated content to humans, and using AI to mislead consumers about products or services.

For brand operators: the FTC has signaled it will treat AI-related deception with the same enforcement posture as traditional deception. Brands engaging in synthetic-consensus manipulation, fake-review generation, or hidden-prompt techniques face FTC exposure regardless of whether platforms catch them first.

Chapter 06

The right-to-be-forgotten challenge

European data protection law gives individuals a right to be forgotten — the ability to demand the deletion of personal data from systems that hold it. The right has worked reasonably well for traditional search engines, which can de-index URLs.

It does not work for LLMs in any clean technical sense. You cannot simply delete information from a trained model. Information learned during training is encoded in the weights, distributed across billions of parameters, and not extractable in a targeted way. Re-training a model from scratch to remove specific information is computationally prohibitive at the scale of frontier models.

The legal response is still developing. Some platforms offer “output filtering” — preventing the model from producing certain information at inference time even though the underlying knowledge remains in the weights. Whether this satisfies right-to-be-forgotten requirements is being litigated. Brands and individuals seeking to remove inaccurate or outdated information from AI engines face a fundamental technical barrier the legal framework hasn’t yet caught up to.
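
To make the distinction concrete, here is a toy sketch of what filtering at inference time means: the text is checked after the model has produced it, and the model itself is left unchanged. How any real platform implements output filtering is not described in the sources cited here; the function name, the blocked term, and the `[removed]` placeholder are all hypothetical.

```python
import re


def build_filter(blocked_terms: list[str]):
    """Return a function that redacts blocked terms from generated text."""
    if not blocked_terms:
        return lambda text: text
    pattern = re.compile(
        "|".join(re.escape(term) for term in blocked_terms),
        flags=re.IGNORECASE,
    )

    def apply(text: str) -> str:
        # Only the output string is altered; nothing is removed from the model.
        return pattern.sub("[removed]", text)

    return apply


redact = build_filter(["Jane Example"])  # hypothetical erasure request
generated = "The report names Jane Example as the owner."
print(redact(generated))
```

The sketch shows why the legal question is open: the filter hides the information from the response, but whatever produced `generated` still holds it.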

Chapter 07

The negotiation alternative

Some publishers have chosen to negotiate rather than litigate. News Corp signed a $50M-per-year licensing agreement with OpenAI. The Associated Press has a similar deal. Other major outlets have followed.

The math behind the choice: a $50M annual deal is meaningful for News Corp’s revenue base and creates a multi-year predictable revenue stream during a period of structural traffic decline. Litigation is expensive, might yield nothing, and would not stop the traffic loss while the case wound through the courts.

The economics of negotiation favor only the largest publishers. Mid-tier publishers are not getting offered comparable deals. Local news, trade publications, and most B2B media are litigating or absorbing the loss with no realistic path to a licensing solution. The result is a two-tier publisher ecosystem: a small number of tier-1 outlets with AI revenue streams, and everyone else competing for what remains of the open-web traffic model.

Chapter 08

What every brand’s legal team should brief on

The operational read for in-house legal teams supporting marketing and brand functions:

  1. Vendor due diligence on GEO partners. Any vendor whose tactics include synthetic personas, fake reviews, astroturfing, or hidden prompt injection is exposing your brand to FTC and (in the EU) AI Act enforcement.
  2. Content disclosure obligations. AI-generated content used in customer communications may require labeling under EU and emerging US state law. Your content production process should track AI involvement at a granular level.
  3. Defamation monitoring and response. Establish a process for identifying AI hallucinations about your brand and documenting them for potential legal action. The Wolf River and Battle cases are establishing precedent that may give you remedies.
  4. Section 230 watch. The legal status of AI-generated content is being established this year and next. Your brand’s exposure to AI-generated misinformation may shift dramatically depending on how courts rule.
  5. Right-to-be-forgotten requests. If your brand or executives need to remove inaccurate information from AI engines, the technical and legal pathways are limited. Plan for slow timelines and partial remedies.
  6. Licensing deal evaluation. If your business depends on content that AI platforms are training on, the licensing-vs-litigation question may eventually apply to you. Track the evolving deal terms in publishing as a forward indicator.

Sources cited

  1. American Bar Association — “Beyond the Search Bar: Generative AI’s Section 230 Tightrope Walk”
  2. Copyright Alliance — AI Lawsuit Developments 2024 review
  3. Press Gazette — Publisher AI lawsuit and licensing tracker
  4. The New York Times — Filed complaints against OpenAI and Perplexity
  5. Columbia Law Review — “Redefining Defamation: Establishing Proof of Fault for AI Hallucinations”
  6. The New York Times — “Who Pays When A.I. Is Wrong?” (November 2025)
  7. White & Case — “AI Watch: Global Regulatory Tracker”
  8. European Commission — EU AI Act final text and implementation timeline
  9. FTC — Joint statement (FTC/EEOC/CFPB/DOJ) on AI authority
  10. National Law Review — “AI Hallucinations Are Creating Real-World Risks for Businesses”


The 10 gates: how AI search engines actually decide what to cite.

Most GEO writing describes outcomes. This one explains the engine. The pipeline from page on the open web to citation in an AI response is a 10-stage system — and most brands optimize for the wrong stages.

  • 10 stages a piece of content must pass through to be cited by an AI engine
  • 50–90% of LLM citations don’t fully support the claims they’re attached to
  • 38,065:1 ClaudeBot crawl-to-cite ratio (Cloudflare)
  • 200B+ URLs in Perplexity’s proprietary index

Chapter 01

Why “the page” is the wrong unit

For twenty years, the unit of search optimization was the page. SEO tools scored pages, ranked pages, recommended page-level fixes. The algorithm rewarded pages with strong overall signals.

AI search engines do not work this way. They work at the level of the passage — the discrete 40–150 word block of text that answers a specific question. A single page may contain twenty extractable passages, each competing independently for citation. A page that ranks well in Google AI Mode may have only one of its passages cited; the other nineteen contribute nothing to the eventual answer.

The shift from page-level to passage-level optimization is the most under-discussed structural change in the discipline. Most GEO advice still operates at the page level because that’s where SEO operated and where most practitioners’ mental models live. Operators who internalize the passage-level reality — who write content as a sequence of self-contained answer blocks rather than a continuous narrative — capture disproportionate citation share.
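One way to make the passage-level view concrete is a quick audit of how a draft breaks into blocks. The sketch below is illustrative only: it assumes passages are separated by blank lines and uses the 40–150 word range quoted above as the target. The function name `audit_passages` is hypothetical, and real engines chunk content with their own, undisclosed rules.

```python
import re

# Target passage size, taken from the 40-150 word range quoted in the text.
MIN_WORDS, MAX_WORDS = 40, 150


def audit_passages(text: str) -> list[dict]:
    """Split text on blank lines and report each block's word count."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    report = []
    for index, block in enumerate(blocks, start=1):
        words = len(block.split())
        report.append({
            "block": index,
            "words": words,
            "in_range": MIN_WORDS <= words <= MAX_WORDS,
        })
    return report


# Example: a two-word intro followed by a 60-word block.
sample = "Short intro.\n\n" + " ".join(["word"] * 60)
for row in audit_passages(sample):
    print(row)
```

Blocks flagged as out of range are candidates for splitting or expanding, not automatic failures; the range is a guideline from the text, not a documented engine threshold.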

To understand why, you need to look inside the engine.

Chapter 02

The 10 gates

Search Engine Land published the cleanest framework for the AI engine pipeline in 2025: every piece of content must pass through ten distinct gates before it can be cited in an AI response. Most operators optimize for one or two of them. The brands that earn citation share understand all ten.

Gate | What happens | Who optimizes for it
1. Discovery | AI crawlers find your content | SEO foundations
2. Selection | Crawler decides to fetch the page | SEO foundations
3. Crawling | Content is downloaded | SEO foundations
4. Rendering | JavaScript is executed | SEO (gap for AI crawlers)
5. Indexing | Content stored in the search index | SEO foundations
6. Annotation | Metadata, entities, structured data added | Almost no one
7. Recruitment | Content selected as candidate for a query | GEO basics
8. Evaluation | Passages scored for relevance | GEO basics
9. Displayed | Content appears in AI response | GEO marketing
10. Won | User acts on cited content | GEO marketing

Most SEO advice still focuses on gates 2–5. Most GEO marketing focuses on gates 9–10. The biggest untapped opportunity is gate 6 — annotation — where structured data, entity markup, and semantic clarity decide whether the AI engine understands what your content is about.

Search Engine Land’s framing was direct: gate 6 is “the biggest untapped opportunity in search, assistive, and agential optimization right now.”

“Most SEO advice operates at gates 2–5. Most GEO advice operates at gates 9–10. The biggest opportunity is gate 6.”

Chapter 03

RAG explained operationally

Behind every gate is the same core technical pattern: Retrieval-Augmented Generation, or RAG. RAG is the engineering reality that determines what AI engines can and cannot cite, and how content needs to be structured to participate.

The RAG pipeline, in operational terms:

  1. Query understanding. The user’s prompt is parsed for intent. AI search prompts average 23 words versus 2–3 for traditional search — the engine has more to work with.
  2. Query fan-out. The single user prompt is broken into multiple sub-queries. A question about “best CRM for a SaaS startup” might fan out into queries about pricing, integration depth, ease of use, and customer support, each retrieved separately.
  3. Retrieval. Each sub-query hits the search index. Candidate documents are fetched.
  4. Chunking. Retrieved documents are broken into passages. The engine doesn’t reason over your whole page — it reasons over the slices it can extract.
  5. Embedding. Each passage is converted into a vector embedding — a numerical representation of its meaning in high-dimensional space.
  6. Scoring. The user’s prompt is also embedded. Cosine similarity between query embedding and passage embeddings determines which passages get pulled into the model’s context window.
  7. Synthesis. The LLM generates the final answer from the top-scoring passages, attaching citations to the source documents.
  8. Display. The user sees the synthesized answer with inline citations.

The implication for content: if your content can’t be cleanly chunked into 40–150 word passages with self-contained meaning, it never makes it past step 4. The engine throws away most of your page in favor of the passages it can extract.
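Steps 4 through 6 of the pipeline above (chunking, embedding, scoring) can be sketched in a few lines. This is a toy under stated assumptions, not how any production engine works: the `embed` function here is a plain term-frequency vector standing in for a learned embedding model, passages are split on blank lines and capped at 150 words, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 150) -> list[str]:
    """Split on blank lines, then cap each passage at max_words."""
    passages = []
    for block in re.split(r"\n\s*\n", text):
        words = block.split()
        for start in range(0, len(words), max_words):
            passages.append(" ".join(words[start:start + max_words]))
    return passages


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a term-frequency vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_passages(query: str, document: str, k: int = 2) -> list[tuple[float, str]]:
    """Score every passage against the query and keep the k best."""
    q = embed(query)
    scored = [(cosine(q, embed(p)), p) for p in chunk(document)]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:k]


document = (
    "Pricing starts at 10 dollars per user per month.\n\n"
    "Our team loves hiking on weekends."
)
for score, passage in top_passages("crm pricing per user", document, k=1):
    print(round(score, 3), passage)
```

The point of the sketch is the shape of the computation: the query never sees the page as a whole, only the individual passages, and a passage with no overlap in meaning scores zero regardless of how strong the rest of the page is.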

Chapter 04

The Wu et al. citation accuracy crisis

The most consequential under-cited finding in the AI search literature comes from Wu et al., published in Nature Communications in April 2025. The researchers tested whether the citations AI engines attach to their answers actually support the claims being made.

The result: 50–90% of LLM citations don’t fully support the claims they’re attached to.

Read that twice. The citation footnotes in AI responses frequently do not support the claims they are attached to. The cited source exists, but it doesn’t back up the specific claim. The user sees an authoritative-looking citation; the underlying source is at best tangentially related.

A complementary finding from Venkit et al. (arXiv 2024): citation accuracy across major AI search platforms ranges from ~66% (best) to under 50% (worst). Even the best-performing platforms get the citation relationship wrong a third of the time.

The downstream effect: users see citations and trust the answer. 82% of users are skeptical of AI results in principle, but only 1% click into the cited source to verify (Pew). The citation acts as a credibility marker even when it doesn’t do the work of credibility. AI engines have, accidentally or not, evolved a system where the appearance of sourcing substitutes for actual sourcing.

Chapter 05

The crawl-to-cite ratio

Cloudflare’s 2025 telemetry on AI bot behavior surfaced a number that reframes how operators should think about visibility. ClaudeBot’s crawl-to-cite ratio is approximately 38,065:1: for roughly every 38,000 pages Anthropic’s crawler reads, about one is cited in an actual user-facing response.

The other major AI crawlers operate at similar orders of magnitude. GPTBot grew 305% in crawling volume from May 2024 to May 2025, becoming the dominant AI crawler at 30% of all AI bot traffic. The volume of crawling is enormous; the volume of citation is comparatively tiny.

The implication: your content being crawled is not visibility. The crawl is the entry ticket; everything between the crawl and the citation is where the real competition happens. Brands obsessed with bot-traffic dashboards are watching the wrong number. Citation share — how often your content is the one in 38,000 that gets surfaced — is the real metric.
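To put the ratio in more familiar terms, it can be converted into a per-crawl rate, which works out to roughly 0.0026%. The sketch below assumes the 38,065:1 figure behaves like a simple average across all crawled pages. That is a simplifying assumption, since the ratio is an aggregate measurement rather than a per-page probability. The function names are hypothetical.

```python
def citation_rate(crawls_per_citation: float) -> float:
    """Convert an N:1 crawl-to-cite ratio into an average per-crawl rate."""
    return 1.0 / crawls_per_citation


def expected_citations(pages_crawled: int, crawls_per_citation: float) -> float:
    """Average citations implied by a crawl volume, assuming a uniform rate."""
    return pages_crawled * citation_rate(crawls_per_citation)


rate = citation_rate(38_065)
print(f"Average citation rate per crawl: {rate:.4%}")
print(f"Implied citations per 1,000,000 crawls: {expected_citations(1_000_000, 38_065):.1f}")
```

In practice citations are concentrated on a small number of pages, so the average understates the winners and overstates everyone else.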

This also explains why analytics-only GEO platforms see so much traffic data without obvious correlation to outcomes: there’s an enormous funnel between the crawl event and the user-facing citation, and most of the funnel is invisible to the brand.

Chapter 06

Per-platform architecture differences

Each major AI engine implements RAG with different architectural choices. The differences matter because they determine which optimization moves work on which platforms.

Engine | Search backend | Citation behavior
ChatGPT | Bing (sequential queries) | 3–5 citations per response; 87% correlation with Bing top-10
Perplexity | Proprietary 200B+ URL index + Bing API | ~13 citations per response; document + sub-document scoring
Google AI Overviews | Google index | 93.67% correlation with organic top-10 results
Claude | Brave Search | Embedded inline links; 38,065:1 crawl-to-cite ratio
Grok | Own index + X data | X conversations weighted heavily

The architectural divergence is meaningful for optimization strategy. Perplexity’s document-and-sub-document scoring means individual passages within a page compete independently — passage-level structure matters more here than anywhere else. ChatGPT’s heavy Bing dependency means that classical SEO investment in Bing visibility carries through directly to ChatGPT visibility. Google AI Overviews’ 93.67% correlation with organic top-10 means SEO foundations are the prerequisite, not a bonus.

The cross-platform implication is the same conclusion the published research keeps surfacing: only 11% of domains receive citations from both ChatGPT and Perplexity. The same content, run through two different engines, lands in two different citation pools. A unified GEO strategy has to anticipate the architectural differences, not assume them away.

Chapter 07

Why query fan-out changes content design

Query fan-out — the practice of decomposing a single user prompt into multiple sub-queries — is the architectural detail with the largest practical implication for how content should be written.

When a user asks ChatGPT for “the best project management tool for remote design teams,” the engine doesn’t run that exact query. It fans out into a series of sub-queries:

  • Best project management tools 2026
  • Project management for remote teams
  • Project management for design teams
  • [Specific tool] reviews (for each tool surfaced in the first three queries)
  • [Specific tool] vs [specific tool] (for comparative framing)
  • Project management tool pricing

The synthesized answer is built from passages retrieved across all of these sub-queries, and the bracketed templates expand the list well beyond the six lines shown. A page optimized only for the literal user prompt competes in just one of those retrieval rounds.

The operational shift: content needs to satisfy multiple sub-queries simultaneously. A pricing-focused passage, a use-case-focused passage, a comparison passage, and a feature passage on the same product page each compete in different sub-queries. Pages that bundle these dimensions earn citation share across more retrieval rounds. Pages that focus narrowly compete in one round and are absent from the rest.
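The fan-out pattern described above can be sketched as a small template expansion. The templates below simply mirror the bullet list in this chapter; they are illustrative assumptions, not the actual sub-queries any engine generates, and `fan_out` plus the tool names are hypothetical.

```python
from itertools import combinations


def fan_out(category: str, audiences: list[str], tools: list[str], year: int) -> list[str]:
    """Expand one prompt into templated sub-queries (hypothetical templates)."""
    queries = [f"best {category} tools {year}"]
    queries += [f"{category} for {audience}" for audience in audiences]
    queries += [f"{tool} reviews" for tool in tools]
    queries += [f"{a} vs {b}" for a, b in combinations(tools, 2)]
    queries.append(f"{category} tool pricing")
    return queries


sub_queries = fan_out(
    category="project management",
    audiences=["remote teams", "design teams"],
    tools=["ToolA", "ToolB", "ToolC"],  # placeholder product names
    year=2026,
)
for q in sub_queries:
    print(q)
print(len(sub_queries), "retrieval rounds")
```

With only three candidate tools the single prompt already expands into ten retrieval rounds, which is why a page that answers one phrasing well can still be absent from most of the answer.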

Chapter 08

The annotation gap

Returning to gate 6 — the annotation gate Search Engine Land called the “biggest untapped opportunity” — the practical work has three components.

Structured data. JSON-LD schema markup that gives AI engines a pre-digested summary of what each page is about. Organization schema. Product schema. FAQ schema (with the caveat that direct citation lift from FAQ schema is debated; the indirect benefit through entity grounding is real). Article schema with author and publication date. Most pages have minimal or generic schema; pages with comprehensive, accurate schema are interpretable in ways unstructured pages aren’t.

Entity markup. Explicitly defining the entities your content is about (brands, products, people, places) with consistent identifiers across owned properties, and linking them to external entity databases (Wikidata, Google Knowledge Graph). The Onely study found that LLMs grounded in knowledge graphs achieve 300% higher accuracy than those working with unstructured text alone.

Semantic clarity. Heading hierarchy that maps logically to the content’s argument structure. Passage breaks that align with discrete sub-topics. Internal linking that explicitly states relationships between related concepts. None of this is new SEO advice. The difference is that AI engines reward it more directly than ranking algorithms ever did.
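As a rough illustration of the structured data and entity markup components, the sketch below builds Organization and Article JSON-LD using schema.org vocabulary and links the organization to external identifiers through `sameAs`. All names, URLs, and the Wikidata identifier are placeholders, and the helper functions are hypothetical; which properties a given engine actually reads is not documented in the sources above.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list[str]) -> dict:
    """Organization entity with sameAs links to external entity databases."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": same_as,
    }


def article_jsonld(headline: str, author: str, date_published: str, publisher: dict) -> dict:
    """Article entity with author and publication date."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
        "publisher": {"@type": "Organization", "name": publisher["name"], "url": publisher["url"]},
    }


def script_tag(data: dict) -> str:
    """Wrap JSON-LD in the script element used to embed it in a page."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


org = organization_jsonld(
    name="Example Co",                       # placeholder brand
    url="https://www.example.com",
    same_as=[
        "https://www.wikidata.org/wiki/Q0",  # placeholder identifier
        "https://www.linkedin.com/company/example-co",
    ],
)
article = article_jsonld("How the category works", "Jane Doe", "2026-01-15", org)
print(script_tag(org))
print(script_tag(article))
```

The important design choice is consistency: the same organization name, URL, and `sameAs` identifiers should appear on every owned property, so the markup and the visible content describe one entity.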

Chapter 09

The Reddit and Wikipedia gravitational pull

The source distribution data tells a structural story about where AI engines pull from when they synthesize answers. Two domains dominate disproportionately:

  • Reddit: 40.1% of LLM citations across all platforms (Averi AI). For Perplexity specifically, 46.7% of top sources come from Reddit (Digital Bloom).
  • Wikipedia: 26.3% of LLM citations across platforms. For ChatGPT specifically, 47.9% of top sources come from Wikipedia.

Neither was a primary SEO target before AI search. Both are now first-class citation real estate. The mechanism is structural: Reddit is conversational, multi-perspective, written in the language users use; Wikipedia is encyclopedically structured with explicit entity definitions. Both are exactly the kind of content RAG architectures process most cleanly.

The operational implication for brands: presence on these platforms isn’t optional for serious AI visibility programs. Not in the form of astroturfing — community platforms detect and punish that — but as authentic participation, third-party coverage, and entity grounding. Brands that are authentically discussed on Reddit get cited by Perplexity. Brands that have well-maintained Wikipedia entries get cited by ChatGPT. Brands that have neither compete uphill regardless of how good their owned content is.

Chapter 10

What the engineering means for operators

The architectural details add up to four operational shifts every brand serious about AI visibility should internalize:

  1. Optimize for the passage, not the page. Write content as a sequence of 40–150 word self-contained answer blocks, each tied to a specific sub-query the engine might fan out into. Pages that read well as continuous narrative get cited less than pages structured as extractable answer units.
  2. Invest in gate 6 — annotation. Schema markup, entity grounding, knowledge graph submissions. This is where the largest under-claimed competitive opportunity lives. Most brands have done little here; the brands that invest pull ahead.
  3. Track citation share, not crawl traffic. The 38,065:1 ClaudeBot ratio is a reminder that crawl events are not visibility events. The metric that matters is how often your content is the one passage in 38,000 that the engine actually surfaces in a user-facing response.
  4. Build presence where AI engines pull from. Authentic Reddit participation, well-maintained Wikipedia/Wikidata entries, third-party editorial coverage on the publications each engine over-indexes on. The off-site signal layer is doing more of the work than most operators realize.

The brands that internalize the engineering reality — that AI engines are RAG pipelines with specific architectural constraints, not magic boxes — will out-execute the brands operating from a 2018 SEO mental model. The gap is the technical literacy. The work is the application.

Sources cited

  1. Search Engine Land — “The AI engine pipeline: 10 gates that decide whether you win” (2025)
  2. Wu et al. — Nature Communications, “Citation Support in Large Language Models” (April 2025)
  3. Venkit et al. — arXiv, AI search citation accuracy benchmark (2024)
  4. Cloudflare — AI bot crawl-to-refer ratio telemetry (2025)
  5. iPullRank — “AI Search Architecture Deep Dive: Teardowns of Leading Platforms”
  6. Perplexity Research — “Architecting and Evaluating an AI-First Search API”
  7. Towards Data Science — “The Architecture Behind Web Search in AI Chatbots”
  8. Digital Bloom — Cross-platform citation distribution analysis
  9. Averi AI — “Building Citation-Worthy Content” (Reddit/Wikipedia source distribution)
  10. Onely — Knowledge graph grounding accuracy study (300% improvement)
  11. Pew Research Center — User behavior with AI Overview citations
  12. Google AI for Developers — “Grounding with Google Search” documentation
  13. AWS — “What is RAG?” technical overview

How to Optimize for Google AI Overviews and Gemini: Being Indexed Is Not the Same as Being Selected

Google already crawls your site. That is the trap. AI Overviews now reach 2.5 billion monthly users and run on Gemini 3, yet they cite three to five sources from a near-infinite index. The work is no longer getting seen. It is getting selected.

  • 2.5B monthly users on AI Overviews (Google I/O, May 2026)
  • 48% of tracked queries now trigger an AI Overview (BrightEdge, Feb 2026)
  • 62% of ecommerce AIO citations rank nowhere in the top 100 organic (BrightEdge, 2026)
  • 47% fewer clicks on a traditional result when an AIO appears (Pew Research, 68K queries)

Chapter 01

The whole game changed in one sentence: Google already has your content

Every other AI platform has to discover you. ChatGPT crawls a slice of the web through Bing and OpenAI's own crawlers. Perplexity runs live retrieval against its own index. Claude leans on a curated set. They each decide, query by query, whether your domain even enters the candidate pool.

Google does not have that problem with you. If you have any organic presence, you are already in the index that feeds AI Overviews. AIOs are not powered by a separate training corpus. They are grounded in Google's live search index through a custom Gemini model, which is precisely why they hallucinate less than standalone chatbots. The retrieval layer is the same Search infrastructure that has ranked you for years.

That sounds like an advantage. It is actually the hardest version of the problem. When the entire web is already a candidate, the differentiator is not discovery. It is selection. Gemini 3 became the default AI Overviews model on January 27, 2026, and each AIO now pulls 32% more sources per response than the prior model, per SE Ranking's analysis. More sources per answer does not mean more brands win. It means the bar for which specific pages get pulled into a synthesized answer moved, and it moved underneath everyone at once.

AIOs are grounded in Google's live search index. Discovery is solved. The contest is selection, and selection runs on different signals than ranking.

This is the core split that governs everything below. On Google, visibility is cheap and influence is scarce. Being seen is the baseline. Being selected, summarized, and cited is the asset. Most brands optimize for the first and wonder why the second never arrives.

Chapter 02

How Google's AI surfaces actually choose what to cite

AIOs run a query fan-out. The model takes one query, expands it into a cluster of related sub-queries, retrieves documents across each, and assembles a consolidated answer that cross-verifies across multiple domains. A featured snippet extracts from one page. An AIO synthesizes across several. That difference is the whole reason single-keyword optimization no longer works on this surface.

The candidate pool starts from Google's core ranking systems: RankBrain, BERT, PageRank, the Helpful Content and Reviews systems, passage ranking. Organic rank is the entry ticket. But rank alone does not get you cited, and the gap between the two has widened sharply.

  • 97% of AI Overviews cite at least one source from the top 20 organic results for that query (seoClarity, 432,000 keywords). Ranking is still the most reliable on-ramp.
  • And yet only 38% of cited pages also appear in the top 10 organic results, down from 76% in the prior study (Ahrefs). The rest split almost evenly: 31.2% from positions 11 to 100, and 31.0% from beyond position 100.
  • Position 1 pages appear in AIOs more than 50% of the time (seoClarity), but a page in position 6 can earn more AIO citations than the page in position 1. Citation position inside the AIO, not organic rank, is now the visibility metric that matters.

Read those three numbers together. Ranking improves your odds. It does not secure your selection. Nearly a third of everything AIOs cite ranks beyond position 100 for the query in question. Google is deliberately reaching past rank to assemble a diverse, cross-verified answer.

After Gemini 3, the churn intensified. Gemini 3 replaced roughly 42% of previously cited domains in AI Overviews (SE Ranking), and 88% of AIOs now cite three or more sources while only 1% cite a single source (SE Ranking). The single-source AIO is effectively dead. If your strategy was to win one snippet-style answer box, that strategy expired on January 27.

The selection signals that survive the churn cluster into five buckets:

  1. E-E-A-T and existing trust. Experience, expertise, authoritativeness, trust. LLMs read author bios for real credentials, not third-party domain scores. There is no "domain authority" input.
  2. Semantic relevance, not keyword matching. Gemini can cite a page that contains none of the query's keywords if the page is semantically on-point and from a reliable domain.
  3. Concise, structured formatting. Direct answers near the top, headings, lists, tables, FAQ blocks, clean semantic HTML.
  4. Freshness. AIOs favor recently updated, reliable information. Stale pages get replaced, which is exactly what the Gemini 3 domain churn demonstrated.
  5. Brand citations across the web, including unlinked mentions in authoritative, niche-relevant content. Quality of the context beats quantity of the links.

Chapter 03

The signal almost nobody budgets for: YouTube

Here is where Google diverges from every other AI platform, and where most GEO programs leave the most value on the table. Google's AI surfaces lean disproportionately on Google's own ecosystem, and the single biggest beneficiary is YouTube.

  • YouTube's coverage across tracked AIO keywords runs nearly 3x any other non-brand domain (BrightEdge), and YouTube citations inside AIOs are up roughly 25% since early 2024.
  • YouTube enjoys a 200-fold advantage over any competing video source in AI citations broadly (TechEdge AI / 5W Citation Source Index 2026). There is no second place in video.
  • In January 2026, YouTube officially overtook Reddit as the most-cited social platform across AI-generated answers (PikaSEO, GEORaiser). On Google specifically, YouTube and Reddit run near parity, while Perplexity still cites Reddit 6.1x more than YouTube.
  • Google AI Overviews drives 36.6% of all YouTube citations observed across platforms, making it one of the two dominant engines pulling video into answers (OtterlyAI, 100M+ citations).

The execution detail that separates winners from spectators: format and timestamps. 94% of YouTube AI citations go to long-form video, not Shorts (OtterlyAI). And when a YouTube citation includes a specific timestamp, 73% of the time it surfaces in Google AI Overviews (OtterlyAI). Google can reach into a clearly chaptered, timestamped video and lift the exact 40-second segment that answers a sub-query in its fan-out.

On Google, a chaptered, timestamped, long-form YouTube video is not a marketing asset. It is a citation surface. Most brands fund text and ignore the one format Google cites most.

If your GEO budget has zero allocation to YouTube, you have a structural blind spot on the platform with 2.5 billion AI users. This is the clearest "being seen vs. being selected" gap in the entire study. A brand with strong organic rankings and no video can be perfectly visible to Google and still never get selected into the part of the answer Google trusts video to fill.

Chapter 04

AI Overviews vs. AI Mode vs. the Gemini app: three surfaces, three rule sets

These are not interchangeable. They reach similar conclusions and cite almost entirely different sources, which means winning one does not win the others.

AI Overviews are the summaries injected above organic results. They are the default, mass-reach surface: 2.5 billion monthly users as of Google I/O, May 2026. They cite 7.7 domains per query on average, run a ~11% no-citation rate (answers with no external sources), and reference about 1.3 entities per response (Ahrefs, 730K responses).

AI Mode is the conversational, follow-up surface inside Search. It cites 9.2 domains per query, produces responses roughly 4x longer, references 3.3 entities, and has only a 3% no-citation rate (Ahrefs). For a brand that qualifies, AI Mode is the more reliable citation surface because it almost always cites something, and it leans harder on encyclopedic sources: Wikipedia appears in 28.9% of AI Mode responses vs. 18.1% of AIOs (Ahrefs).

The critical finding for anyone allocating effort: AI Mode and AI Overviews reach 86% semantic similarity but share only 13.7% of their citations, and 77% of unique cited domains appear on just one surface, not both (Ahrefs). Appearing in an AI Overview does not put you in AI Mode. They are two separate selection contests that happen to agree on the answer.
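To track this kind of gap for your own queries, the overlap between two surfaces can be computed from the sets of domains each one cites. The sketch below uses Jaccard overlap (shared items divided by all items) as one plausible definition. The text above does not state how Ahrefs computed its 13.7% figure, so treat the metric choice as an assumption; the function name and domain lists are hypothetical.

```python
def citation_overlap(surface_a: set[str], surface_b: set[str]) -> dict[str, float]:
    """Compare the cited domains of two AI surfaces for the same query set."""
    union = surface_a | surface_b
    shared = surface_a & surface_b
    if not union:
        return {"jaccard": 0.0, "single_surface_share": 0.0}
    return {
        # Share of all cited domains that appear on both surfaces.
        "jaccard": len(shared) / len(union),
        # Share of all cited domains that appear on only one surface.
        "single_surface_share": (len(union) - len(shared)) / len(union),
    }


# Placeholder domains, for illustration only.
ai_overviews = {"a.example", "b.example", "c.example", "d.example"}
ai_mode = {"d.example", "e.example", "f.example"}

print(citation_overlap(ai_overviews, ai_mode))
```

A low overlap on your own tracked queries is the practical signal that the two surfaces need to be qualified for separately.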

The standalone Gemini app is a third channel entirely. It is a conversational assistant where buyers research vendors and make decisions inside the chat, grounded through Grounding with Google Search and now Google Maps. Gemini 3's grounding metadata even added media_id for visual citations and page_numbers to pinpoint where information was found. The Gemini app behaves more like a true chatbot than a SERP feature, which means it pulls more brand-level recommendations than the deliberately commercial-shy AIO.

One more thing changed the map. On May 19, 2026, Liz Reid announced at Google I/O that AI Overviews and AI Mode are merging into "one seamless AI Search experience." The two surfaces will converge. The two citation pools, today only 13.7% overlapping, are the integration risk and the opportunity. Brands that qualify for both today are positioned for whatever the merged surface inherits.

Chapter 05

Why Google barely mentions brands, and what that means for commercial intent

If you sell something, internalize this before you build a Google GEO plan: AIOs are an educational layer, not a commercial one, by design.

BrightEdge's ecommerce data is blunt. Google AI Overviews include a brand in only about 6.2% of ecommerce responses, averaging 0.29 brands per answer. Compare that to ChatGPT at 99.3% of responses and 5.84 brands, Perplexity at 85.7%, and Google's own AI Mode at 81.7%. On the brand-mention axis, AIOs are the minimalist of the entire field. Google routes commercial intent to organic results and ads, and reserves the AI summary for explanation.

This is a strategy choice, not a bug. 62% of ecommerce AIO citations come from sources that rank nowhere in the top 100 organic results (BrightEdge). Google is pulling explanatory, comparison, and how-it-works content into the answer, then handing the actual transaction to the links and ads below. Google Ads now appear on roughly 25% of AIO SERPs, up from under 1% in March 2025.

The contrast with ChatGPT is the whole point. ChatGPT and Google disagree on brand recommendations 62% of the time (BrightEdge), and Google AI Overviews are 44% more likely to criticize a brand than ChatGPT (BrightEdge). Google is cautious, hedged, and educational. ChatGPT is confident and recommendation-first.

Do not optimize Google AIOs to recommend your product. Optimize them to explain the category and cite you as the authority who explained it. The recommendation happens elsewhere on Google.

What does not work on Google, then, is the playbook that works on ChatGPT: product-led, "best X for Y" listicle bait engineered to win a recommendation. AIOs will mostly refuse to recommend at all. What works is owning the educational layer. Be the page Google cites when it explains how the category works, then let your organic listing, your ads, and the standalone Gemini app capture the intent the AIO deliberately won't.

Chapter 06

What this does to clicks: the zero-click squeeze is real, but it is not uniform

The headline number is brutal and well-sourced. The most rigorous study, Pew Research Center's analysis of 68,000 queries, found users are 47% less likely to click a traditional result when an AI Overview appears, and only 8% of visits that include an AIO end in a click on a traditional result. Other large datasets agree on direction: Seer Interactive measured a 61% CTR drop on AIO queries across 2.43 billion impressions, and Ahrefs estimated a 58% lower average CTR for the top-ranking page when an AIO is present.

That is the cost of zero-click. But three findings keep this from being a simple death spiral, and they are where the strategy lives:

  • Citation is now a paid-tier signal. Brands cited inside an AI Overview earn 35% more organic clicks and 91% more paid clicks than non-cited brands on the same SERP (BrightEdge). The click pie shrank, but being cited buys you a far larger slice of what remains.
  • There are early signs of a floor and a rebound. Organic CTR on AIO queries fell to roughly 1.3% in December 2025 and recovered to 2.4% by February 2026 (BrightEdge). Google is actively tuning, and AIOs are now abridged by default with a "Show More" expansion that pushes organic results back into view.
  • The traffic that survives converts. Practitioner data consistently shows AIO click-throughs arrive with higher intent and longer time on site. Fewer clicks, higher quality, which is exactly the conversion economics that defines the AI-search era.

The strategic read: stop measuring success in raw organic clicks on AIO queries. That number is going to keep falling and there is no playbook that reverses it. Measure citation share. Being cited is the new ranking. On Google more than anywhere, the mention is the signal, and the click is a downstream bonus.

Chapter 07

The execution playbook

Eight moves, ordered by leverage on Google's specific selection mechanics.

  1. Earn the top 20, then stop expecting rank to do the rest. 97% of AIOs cite from the top 20 organic results, so ranking is the entry ticket. But with only 38% of citations coming from the top 10, treat strong rankings as necessary, not sufficient. Do not over-invest in moving position 6 to position 3 when the citation contest is decided by other signals.
  2. Build topical clusters, not keyword pages. AIOs fan a query out across sub-questions. A seed page interlinked with a cluster of supporting pages gives Gemini complete coverage to assemble its answer from. Comprehensive beats narrow.
  3. Fund YouTube as a citation surface. Publish long-form (not Shorts), with clean chapters and timestamps. 73% of timestamped YouTube citations surface in AIOs. This is the single highest-leverage move unique to Google and the one most programs skip.
  4. Front-load direct answers. Put a concise, declarative answer near the top of every page, then expand. Use headings, lists, tables, and FAQ blocks. Gemini lifts the clean summary, not the buried one.
  5. Own the educational layer, not the recommendation. With brands in only 6.2% of ecommerce AIOs, write to be cited as the authority that explains the category. Route commercial intent to organic, ads, and the standalone Gemini app, which mention brands far more.
  6. Make E-E-A-T machine-readable. Real author bios with verifiable credentials. Match your structured data to visible content. JSON-LD for author, FAQPage, Article, Review, and sameAs. No special schema is required, but well-implemented schema raises the probability of correct interpretation; poorly implemented schema demonstrably lowers AI visibility.
  7. Keep content fresh on a schedule. Gemini 3 replaced 42% of previously cited domains. Freshness is not cosmetic. It is how you survive model updates that re-pull the citation pool.
  8. Qualify for AI Mode separately, and watch the merge. With only 13.7% citation overlap, appearing in AIOs does not earn AI Mode. AI Mode rewards longer, more entity-rich, encyclopedic content and almost always cites something (3% no-citation rate). With the merger of the two surfaces announced at I/O 2026, qualifying for both now is forward protection.

One thing to skip: do not block Google-Extended thinking it removes you from AI Overviews. Google-Extended only governs AI training use; blocking it has zero effect on AIO inclusion or Search rankings. And there is currently no way to opt out of AIOs while staying in regular Search, a gap now under UK CMA scrutiny. You are in this surface whether you optimize for it or not. The only choice is whether you get selected.
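The reason blocking Google-Extended leaves Search untouched is that robots.txt rules are scoped per user-agent token. The sketch below uses Python's standard-library `urllib.robotparser` to show that a rule group addressed to `Google-Extended` does not apply to `Googlebot`. The robots.txt content and URL are hypothetical, and the standard-library parser is only an approximation of how Google's own systems interpret these rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks only the Google-Extended token.
ROBOTS_TXT = [
    "User-agent: Google-Extended",
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT)

url = "https://www.example.com/pricing"  # placeholder URL
extended_allowed = parser.can_fetch("Google-Extended", url)
googlebot_allowed = parser.can_fetch("Googlebot", url)

print("Google-Extended allowed:", extended_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

Because the Search crawler still has access, the page stays in the index that AI Overviews draw from, which is the behavior the paragraph above describes.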

Sources cited

  1. Pew Research Center — controlled study of 68,000 queries; users 47% less likely to click a result when an AIO appears, 8% click rate.
  2. Ahrefs — 730K-response study of AI Mode vs. AI Overviews citation overlap (13.7%), domains per query (9.2 vs. 7.7), no-citation rates, entity counts; separate CTR analysis (58% lower for top-ranking page).
  3. seoClarity — 432,000-keyword analysis; 97% of AIOs cite a top-20 organic source; Position 1 pages cited >50% of the time; ~475% YoY growth in AIO presence on US mobile.
  4. SE Ranking — Gemini 3 impact analysis; 42% of cited domains replaced, 32% more sources per response, 88% of AIOs cite 3+ sources, 1% cite a single source.
  5. BrightEdge — ecommerce brand-mention rate (6.2%, 0.29 brands), 62% of citations outside top 100, YouTube ~3x other non-brand domains, 48% AIO trigger rate, +35%/+91% click lift for cited brands, 44% more likely to criticize brands than ChatGPT, ~25% of AIO SERPs carry ads.
  6. OtterlyAI — 100M+ citation dataset; AIOs drive 36.6% of YouTube citations, 94% of YouTube citations go to long-form, 73% of timestamped YouTube citations surface in AIOs.
  7. TechEdge AI / 5W Citation Source Index 2026 — YouTube's 200x advantage over competing video sources; top-15 domains capture 68% of the citation pool.
  8. PikaSEO / GEORaiser — YouTube overtook Reddit as the most-cited social platform in AI answers in January 2026.
  9. Seer Interactive — 61% organic CTR drop on AIO queries across 2.43 billion impressions.
  10. Google I/O 2026 (Sundar Pichai, Liz Reid) — AIOs at 2.5 billion monthly users; AI Overviews and AI Mode merging into one AI Search experience; Gemini 3 default for AIOs Jan 27, 2026.
  11. Semrush — 10M-keyword tracking; AIO prevalence from 6.49% (Jan 2025) to 24.61% peak (Jul) to 15.69% (Nov 2025).
  12. UK Competition and Markets Authority — January 2026 proposals on publisher opt-out from AI Overviews.

Want this measured against your brand?

FancyAI's AI Readiness Index measures whether Google's AI surfaces actually select and cite you, not just whether you rank. See your citation share across AI Overviews, AI Mode, and the Gemini app before the merge reshapes the field.

Live · FancyAI Research Corpus

How to optimize for Perplexity: the answer engine that reads the live web

Perplexity does not rank pages. It reads the web in real time, pulls from thousands of sources, and cites them in the answer. Reddit, fresh content, and structured proof decide who gets named. Most of what works on Google does not work here.

  • 46.7%: share of Perplexity responses that cite Reddit (Profound)
  • 5.2: unique domains cited per Perplexity answer, vs 3.1 for ChatGPT (Omniscient Digital)
  • 11%: domain overlap between what ChatGPT and Perplexity cite (Averi)
  • 3.1×: Perplexity referral conversion vs non-branded Google organic (MarGen)

Chapter 01

Perplexity is a search engine wearing a chatbot's clothes

Treat Perplexity like ChatGPT and you will optimize for the wrong machine. ChatGPT answers mostly from a frozen pretrained model and reaches for the web when it has to. Perplexity does the opposite. Every query triggers a live retrieval against a proprietary index, the model reads what it finds, and it cites the sources inline. There is no answer without sources, and there is no source without a live web fetch.

That architecture is not borrowed. While ChatGPT historically leaned on Bing and Google AI Overviews sit on top of Google's existing index, Perplexity built its own crawling, indexing, and retrieval pipeline from scratch. The result is an index that tracks over 200 billion unique URLs, updated on a near-real-time cadence, with the company claiming roughly 24 to 48 hour average retrieval freshness (Perplexity; Am I Cited, 2026). The model family that grounds answers against that index is Sonar, which runs a hybrid retrieval pipeline blending lexical (keyword) and semantic (meaning) signals to find the most relevant passages at the sub-document level rather than ranking whole pages.

The scale is now serious. Perplexity processes 35–45 million queries per day, up from roughly 30 million in mid-2025, and passed a billion queries per month by early 2026 (Business of Apps; Wikipedia, 2026). It serves over 45 million monthly active users, rising past 100 million when its agent and browser products are counted (Demandsage, 2026). Revenue tells the same story: the Financial Times reported annual recurring revenue topped $450 million in March 2026, with Sacra's April estimate putting it near $500 million, a 335% year-on-year increase.

Perplexity does not generate responses from a frozen model. It searches the web in real time, aggregates results, and cites its sources. It is a conversational search engine, not a chatbot.

Market share is the one number that cuts against the growth narrative. Perplexity holds roughly 6.4% to 8% of the AI chatbot market, far behind ChatGPT (~82%) and in the same range as Microsoft Copilot (~7%), and its U.S. daily-active-user share slipped to around 2.1% in March 2026 from a peak near 6% in October 2025 (multiple statistics aggregators, 2026). The lesson for brands is not that Perplexity is small. It is that Perplexity is the most measurable, most research-oriented, and most citation-dense of the major engines, even if it is not the largest. You optimize for it because the people using it are buying.

Chapter 02

Reddit is the front door, and the data is not subtle

Across every credible 2026 study, the single most important fact about Perplexity sourcing is the same: community content wins. Profound's analysis of 10,000 commercial queries found that Perplexity cites Reddit in 46.7% of its responses — by a wide margin the most frequently cited source, with no other domain close. Peec AI's analysis reached the same conclusion from a different angle, finding that Reddit accounts for as many as one in five of all citations on Perplexity, the highest concentration of any single domain on any platform.

This is not a Perplexity quirk. Reddit is the #1 cited source across essentially every major AI engine, sitting near 40% citation frequency across LLMs in the aggregated AI Platform Citation Source Index, which pooled more than 680 million citations harvested from ChatGPT, Google AI Overviews, Perplexity, Gemini, and Claude between August 2024 and April 2026 (5W / techedgeai, 2026). But Perplexity leans on it harder than anyone, because Reddit supplies exactly what Perplexity's source-selection criteria reward: first-person experience, current discussion, and genuine peer opinion.

The mechanism behind the correlation is now measurable. SE Ranking's study of 129,000 unique domains found that domains with millions of brand mentions on Reddit averaged 7 citations versus 1.8 for domains with minimal Reddit presence — a 3.9× multiplier on visibility. Reddit presence does not just earn the direct Reddit citation. It raises the citation rate of your owned domain too.

Reddit presence does not only earn the Reddit citation. It raises the odds the engine names your own site, even when it pulls from a different page.

A note on the headline source debate. Ahrefs' June 2026 most-cited-domains study reported YouTube leading Perplexity by mention share at 32.4%, with Wikipedia third at 8.2%. The apparent contradiction with the Reddit data is a measurement artifact: YouTube ranks high on raw mention share because video is increasingly surfaced and embedded, while Reddit ranks #1 on response coverage (the share of answers that include at least one Reddit citation). For a brand, both readings point the same way. The engine wants community proof and multimedia, not a polished landing page in isolation.
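
The two metrics in that debate can be made concrete. The sketch below is illustrative only: the answer data is invented toy data, and the function names are hypothetical, not taken from Ahrefs or Profound. It shows how one domain can lead on mention share while another leads on response coverage.

```python
from collections import Counter

def mention_share(answers):
    """Share of all citations that go to each domain."""
    counts = Counter(domain for cited in answers for domain in cited)
    total = sum(counts.values())
    return {domain: n / total for domain, n in counts.items()}

def response_coverage(answers):
    """Share of answers that cite each domain at least once."""
    counts = Counter(domain for cited in answers for domain in set(cited))
    return {domain: n / len(answers) for domain, n in counts.items()}

# Toy data (invented): each inner list holds the domains cited in one answer.
answers = [
    ["youtube.com", "youtube.com", "youtube.com", "reddit.com"],
    ["reddit.com", "example.com"],
    ["reddit.com"],
    ["youtube.com", "youtube.com", "example.org"],
]

share = mention_share(answers)
coverage = response_coverage(answers)
print(share["youtube.com"], share["reddit.com"])        # YouTube leads on mentions
print(coverage["youtube.com"], coverage["reddit.com"])  # Reddit leads on coverage
```

In this toy set YouTube takes half of all citations because a few answers cite it repeatedly, while Reddit appears in three of the four answers. Both rankings are correct; they answer different questions.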

What this means in practice:

  • Be present where the conversation is. Active, genuinely helpful participation in relevant subreddits is the single highest-ROI Perplexity tactic. Self-promotional spam gets buried and does the opposite.
  • Earn discussion, do not fake it. The citation follows real engagement signals, not keyword stuffing in a thread title.
  • Treat YouTube as a citation surface, not just a brand channel. A clear, well-titled explainer video is a retrievable source.

Chapter 03

Freshness is the lever Perplexity pulls hardest

Every AI engine claims to like fresh content. Perplexity acts on it more aggressively than any of them. Because retrieval is live and the index updates continuously, there is no model-retraining bottleneck. A well-structured new page can appear in citations within days, sometimes hours, of publication.

The data backs the bias. Roughly 50% of Perplexity citations come from content under 13 weeks old, and refreshed pages can move from invisible to cited inside a single index cycle (Foglift, 2026). Independent ranking-factor breakdowns put content freshness at roughly 15% of Perplexity's citation weighting, level with domain authority (~15%) and behind only content relevance (~30%) and visual placement (~20%) (Demand Local, 2026). On a platform where the top-cited domain rarely exceeds 5% of total citations, a 15% recency weight is enormous leverage.

This is the inverse of how SEO conditioned marketers to think. In classic search, an aged page with accumulated backlinks is an asset. On Perplexity, an un-updated page is a decaying one. The KnowledgeBase corpus on Perplexity's internal ranking patterns describes an explicit time_decay_rate that begins eroding visibility 2 to 3 days after publication unless the content is refreshed.

The execution implication is concrete and unglamorous:

  • Put a visible "Last updated" date near the top of important pages. Perplexity, ChatGPT with browsing, and Gemini all read last-modified signals when choosing sources (Foglift, 2026).
  • Refresh substantively, not cosmetically. Changing a date stamp without updating the content is the kind of thin signal these systems are increasingly built to discount.
  • Keep a changelog on technical and data pages. A dated revision history reads as ongoing maintenance, which is exactly the freshness proxy the crawler rewards.
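
A maintenance routine like this is easy to automate. The following is a minimal sketch, assuming you keep a record of each page's last substantive update. The 13-week window is borrowed from the Foglift figure above as a working threshold, not a documented Perplexity cutoff, and the page paths and dates are made up.

```python
from datetime import date, timedelta

# Working threshold taken from the 13-week figure quoted above (an assumption,
# not a published Perplexity rule).
FRESH_WINDOW = timedelta(weeks=13)

def pages_needing_refresh(pages, today):
    """Return the pages whose last substantive update is older than the window."""
    return [path for path, updated in pages.items() if today - updated > FRESH_WINDOW]

# Hypothetical inventory: page path -> date of last substantive update.
pages = {
    "/pricing": date(2026, 5, 1),
    "/benchmarks": date(2026, 1, 10),
    "/methodology": date(2025, 11, 3),
}

stale = pages_needing_refresh(pages, today=date(2026, 6, 1))
print(stale)
```

The list it returns is a refresh queue, not a list of dates to bump. Per the second bullet above, each flagged page needs a real content update.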

On Google, an old page with backlinks is an asset. On Perplexity, an un-updated page is a depreciating one. Maintenance is the strategy.

Chapter 04

What it cites is different from what ChatGPT cites

The most strategically important number in this report is 11%. That is the domain overlap between what ChatGPT and Perplexity cite, per Averi's analysis of 680 million citations. The two platforms are, in practice, separate ecosystems. A page that ChatGPT loves may be invisible to Perplexity, and the reverse is just as true. There is no single "AI SEO" that covers both.

The structural differences explain why:

  • Citation density. Perplexity averages 5.2 unique domains per answer against ChatGPT's 3.1 and Claude's 2.8 (Omniscient Digital, May 2026). BrightEdge measured an even higher 8.79 citations per Perplexity response in its analysis. More citation slots means more room for your brand to be one of them.
  • Citation rate. Superlines measured a 15.43% citation rate on Perplexity versus 2.78% on ChatGPT — Perplexity is the most measurable AI citation channel by a wide margin.
  • Source philosophy. The share of citations from user-generated content ranges from 0.2% to 18% depending on the engine, roughly a 90× spread across engines answering the same questions (BrightEdge, 2026). Perplexity sits at the high end. ChatGPT skews toward concentrated, authoritative knowledge bases and its own parametric memory.
  • The Google comparison. Google AI Overviews mention brands in only about 6.2% of ecommerce responses, while ChatGPT names them in 99.3% (BrightEdge, 2026). Perplexity sits closer to ChatGPT's brand-friendly end while sourcing far more widely.

There is one unifying insight that should shape strategy. BrightEdge's 2026 finding is that engines disagree on where to pull information but agree more on which brands belong in the answer. Sources diverge; brand consensus converges. That is the core FancyAI thesis in the data: the engines are recommending, not ranking, and the mention is the signal. Your job is not to win one ranking. It is to become the brand the engines agree belongs in the answer, then to make sure each engine can find a source it trusts to support naming you.

Chapter 05

What does not work on Perplexity

The fastest way to waste a Perplexity budget is to import a 2019 SEO playbook. Several tactics that still move Google do little or nothing here.

  • Backlink accumulation as an end in itself. Domain authority matters, but at roughly 15% of weighting it is one input among several. Reddit presence, freshness, and structure routinely outweigh raw link count. The SE Ranking data showing Reddit mentions driving a 3.9× citation lift makes the point: earned community signal beats earned links.
  • Keyword density and on-page optimization for a target phrase. Perplexity retrieves at the sub-document level using semantic embeddings. It is matching meaning, not exact strings. Writing for a keyword instead of for a clearly answered question leaves citations on the table.
  • Aging a page and leaving it alone. As Chapter 03 showed, this is actively counterproductive. Time decay erodes an un-maintained page.
  • Blocking the crawlers without understanding them. Perplexity runs two agents. PerplexityBot indexes content for search and respects robots.txt. Perplexity-User handles real-time, user-initiated fetches and generally ignores robots.txt because a human asked for that specific page (Perplexity docs; KnowledgeBase corpus). Blocking PerplexityBot removes you from the index entirely while doing nothing to stop user-initiated reads, which is the worst of both worlds.
  • Local "near me" optimization. Perplexity has no equivalent of Google Business Profile and excels at informational and research intent over navigational. For local discovery, Perplexity is a low priority relative to Google AI Overviews.
  • Paid placement. There is none to buy. Perplexity tested sponsored follow-up questions, stopped, and has said it could "never ever need to do ads" (Search Engine Land, 2025). Its product cards in Shopping are explicitly unsponsored. You cannot pay your way into an answer.
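
Before changing crawler rules, it helps to check what your robots.txt currently declares for each agent. This is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the URL are invented examples, and the result only reflects what the file says: as noted above, Perplexity-User generally does not consult it.

```python
from urllib.robotparser import RobotFileParser

def crawler_access(robots_txt, url, agents):
    """Report whether each user agent is allowed to fetch the URL per robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

# Example file (invented) that blocks the indexing crawler outright.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

access = crawler_access(
    ROBOTS_TXT,
    "https://example.com/research/report",
    ["PerplexityBot", "Perplexity-User"],
)
print(access)
```

A file like this one is the failure case described in the crawler bullet: the indexing agent is shut out while user-initiated fetches are untouched.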

You cannot buy a Perplexity answer. There are no ads in the answer, no paid product placement, and the shopping cards are unsponsored. Influence is earned through sources, not spend.

Chapter 06

Shopping and Comet: the answer engine becomes a buying agent

Two 2026 developments turn Perplexity from a research tool into a transaction channel, and both reward the same earned-source discipline.

Perplexity Shopping launched in beta in November 2025 and went free for all U.S. users in February 2026 (eMarketer; Shopify, 2026). It surfaces unsponsored product cards and, through Buy with Pro, lets users check out inside the chat without visiting the merchant's site. The merchant economics are aggressive: the Perplexity Merchant Program sets fees, commissions, and listing charges at zero, versus the 4% transaction fee ChatGPT charges through its Agentic Commerce Protocol, and Perplexity funds free shipping itself (AI Advantage Agency; Stellagent, 2026). In-app checkout expanded to all Shopify merchants in January 2026.

The buyer demographics are why this matters more than its market share suggests. Perplexity's user base skews 80% college-educated and 65% high-income, and shoppers arriving from Perplexity spend 57% more per order than those from other AI platforms (productsup; statistics aggregators, 2026). To appear, brands feed structured product data through the Merchant Program; relevance and quality signals, not ad spend, decide what surfaces.

Comet, Perplexity's AI-native browser, reached iOS in March 2026 and went free on all platforms after originally launching at $200/month (The Keyword; MacRumors, 2026). Its assistant reads the active tab, summarizes pages, and runs multi-step tasks across sites. For brands, Comet matters because it changes attribution: Comet sessions create distinct referral patterns in GA4, and Perplexity's publisher program shares revenue partly from direct visits via Comet (MarTech, 2026).

The traffic that does arrive converts. Perplexity-referred sessions convert at 3.1× the rate of non-branded Google organic and in B2B contexts hit a 10.5% conversion rate, with 4.7× longer session duration and an average of 9-plus minutes on site (MarGen, 2026). 64% of Perplexity users are professionals doing work-related research, and 52% of referrals land on deep pages — research articles, statistics pages, methodology pages — rather than homepages. Perplexity sends fewer visitors than Google, but the ones it sends are further down the funnel and already convinced enough to click a citation.

Chapter 07

The Perplexity playbook

Translate the above into an execution order, sequenced by leverage.

  1. Build genuine Reddit presence. Identify the subreddits where your category is discussed and contribute real expertise over months, not a launch week. This is the highest-ROI move because it earns both the direct Reddit citation (46.7% of responses) and a measured 3.9× lift on your own domain's citation rate.
  2. Date and maintain your cornerstone pages. Add visible "Last updated" stamps, refresh content substantively on a cadence, and keep changelogs on data and technical pages. Freshness is ~15% of weighting and the lever Perplexity pulls hardest.
  3. Structure for extraction. Use comparison tables, step-by-step formats, FAQ blocks, and clear headings. Perplexity retrieves sub-document passages; make the passage that answers the question self-contained and unambiguous.
  4. Write for questions, not keywords. Semantic retrieval matches meaning. Map the real questions your buyers ask and answer each one cleanly on the page.
  5. Treat YouTube as a citation source. A well-titled, well-structured explainer is retrievable and shows up in Video focus and general answers alike.
  6. Let both crawlers in. Allow PerplexityBot in robots.txt, do not try to block Perplexity-User, and confirm your priority pages are reachable. Blocking PerplexityBot only removes you from the index.
  7. Feed the Merchant Program if you sell products. Zero fees, free shipping, unsponsored cards, and the highest-spend AI shopper. Structured product data is the entry ticket.
  8. Measure the right channel. Isolate Perplexity referrals in GA4 with a custom channel group, label Comet traffic, and track which cited pages drive the research-intent visits that convert at 3.1×.
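
The measurement step can be prototyped outside GA4 on exported session data. The sketch below is an assumption-laden illustration: it treats `perplexity.ai` as the referrer domain to match, uses invented session rows, and does not attempt to identify Comet traffic, since the source does not say how those sessions are marked.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed referrer domain; confirm against the hostnames in your own reports.
AI_REFERRER_DOMAINS = {"perplexity.ai": "Perplexity"}

def channel_for(referrer):
    """Map a referrer URL to a channel label, matching the domain or its subdomains."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, channel in AI_REFERRER_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return channel
    return "Other"

def landing_pages_by_channel(sessions):
    """Count sessions per (channel, landing page) from (referrer, path) pairs."""
    counts = Counter()
    for referrer, landing_path in sessions:
        counts[(channel_for(referrer), landing_path)] += 1
    return counts

# Invented export rows: (referrer URL, landing page path).
sessions = [
    ("https://www.perplexity.ai/search?q=crm+pricing", "/research/crm-benchmarks"),
    ("https://www.perplexity.ai/", "/research/crm-benchmarks"),
    ("https://www.google.com/", "/"),
    ("", "/pricing"),
]

counts = landing_pages_by_channel(sessions)
print(counts.most_common())
```

Grouping by landing page rather than by channel alone is the point: it shows which cited deep pages are actually receiving the research-intent visits.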

The throughline is consistent with everything FancyAI has documented across platforms. Perplexity recommends. It does not rank. It reads the live web, trusts community and freshness over polish and links, and names the brands its sources support. You do not win by climbing a list. You win by being the brand the engine keeps finding good reason to name.

Sources cited

  1. Profound: 10,000-query commercial analysis; Reddit cited in 46.7% of Perplexity responses.
  2. Peec AI: citation distribution analysis showing Reddit as ~1 in 5 of all Perplexity citations.
  3. 5W / techedgeai: AI Platform Citation Source Index 2026, 680M+ pooled citations across five engines.
  4. SE Ranking: 129,000-domain study; 3.9× citation lift from Reddit brand mentions.
  5. Omniscient Digital (May 2026): 5.2 unique domains per Perplexity answer vs 3.1 ChatGPT, 2.8 Claude.
  6. Averi: 680M-citation analysis; 11% domain overlap between ChatGPT and Perplexity.
  7. BrightEdge: 8.79 citations per Perplexity response; brand-mention rates; "different sources, same brands" finding.
  8. Superlines: 15.43% Perplexity citation rate vs 2.78% ChatGPT.
  9. Foglift (2026): 50% of Perplexity citations from content under 13 weeks old; last-modified signal use.
  10. Demand Local (2026): Perplexity ranking-factor weighting breakdown (freshness ~15%).
  11. Ahrefs (June 2026): Perplexity most-cited-domains study; YouTube 32.4% mention share, Wikipedia 8.2%.
  12. MarGen (2026): Perplexity referral conversion 3.1×, 4.7× session duration, 312% YoY growth, deep-page landing data.
  13. Business of Apps / Demandsage / Wikipedia (2026): usage, query volume, MAU, market-share statistics.
  14. Perplexity / Am I Cited (2026): Sonar architecture, 200B+ URL index, 24–48hr retrieval freshness.
  15. Shopify / eMarketer / AI Advantage Agency / Stellagent / productsup (2026): Perplexity Shopping, Buy with Pro, Merchant Program economics, shopper demographics.
  16. The Keyword / MacRumors / MarTech (2026): Comet browser launch, free release, GA4 attribution.
  17. Search Engine Land (2025): Perplexity stops testing advertising.
  18. FancyAI KnowledgeBase corpus: Perplexity ranking patterns (time_decay_rate, dual-crawler system), compiled March 2026.

Want this measured against your brand?

The FancyAI AI Readiness Index measures whether Perplexity, ChatGPT, and Google name your brand — and which sources they trust to do it. Find out where you stand, then change it.

Live · FancyAI Research Corpus

How to Optimize for Claude: The Enterprise Engine That Cites Differently

Claude reads a different internet than ChatGPT. It runs on Brave's independent index, cites the most conservatively of any major assistant, and leans on older long-form and earned media, while its maker Anthropic takes 40% of enterprise LLM spend. Optimizing for it is a separate discipline, not a footnote.

  • 40%: Anthropic's share of enterprise LLM spend, ahead of OpenAI (Menlo Ventures, Dec 2025)
  • 39%: share of Claude queries that produce a citation, the lowest of any major engine (Discovered Labs / cross-platform audit)
  • 86.7%: overlap between Claude's cited results and Brave Search's top organic results (Claude web search analysis, 2026)
  • 36%: share of Claude's journalism citations published in the last 12 months, vs. 56% for ChatGPT (Muck Rack, 2026)

Chapter 01

A different index, a different internet

Most GEO advice quietly assumes every assistant reads the same web. They do not. ChatGPT retrieves through Bing. Google AI Overviews and AI Mode use Google's own index. Perplexity blends its own crawler with multiple backends. Claude runs on Brave Search — an independent index with its own crawler, its own ranking, and its own coverage gaps.

This is not a minor plumbing detail. It is the single most important fact about optimizing for Claude. Content that ranks beautifully on Google but is thin or absent in Brave's index will not surface in Claude's answers, no matter how authoritative it looks elsewhere.

The dependence is unusually tight. A 2026 analysis of Claude's web search found an 86.7% overlap between Claude's cited results and Brave's top non-sponsored organic results — 13 of 15 results matched, with a p-value below 0.0001, meaning the alignment is statistically far beyond coincidence. For contrast, the same style of analysis found ChatGPT's citations overlap with Bing's top organic results only 26.7% of the time. ChatGPT re-ranks and reasons heavily over Bing. Claude, in practice, surfaces what Brave already ranks.
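
The overlap figure is simple to reproduce on your own queries. The sketch below is one plausible way to compute it, not the method of the cited analysis: the function name is hypothetical and the URLs are placeholders arranged so that 13 of 15 citations match, mirroring the count reported above.

```python
def citation_overlap(cited_urls, search_results):
    """Fraction of cited URLs that also appear in the search engine's results."""
    if not cited_urls:
        return 0.0
    results = set(search_results)
    matched = sum(1 for url in cited_urls if url in results)
    return matched / len(cited_urls)

# Placeholder data: 15 cited URLs, 13 of which also appear in the search results.
cited = [f"https://example.com/page-{i}" for i in range(15)]
brave_results = cited[:13] + [
    "https://example.org/other-1",
    "https://example.org/other-2",
]

overlap = citation_overlap(cited, brave_results)
print(f"{overlap:.1%}")
```

Running the same comparison for your priority queries shows whether the pages you want cited are present in Brave's results at all, which is the entry condition the rest of this chapter describes.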

Claude does not have a search index. Brave does. If you want to be in Claude's answer, you have to be in Brave's results first.

That changes the entry condition. Brave builds much of its index through its Web Discovery Project, which contributes anonymized data from opted-in Brave browser users. Pages generally need to be visited by a meaningful number of real Brave users before they become eligible for the index. For high-traffic enterprise sites this threshold is met passively. For new, niche, or low-traffic pages, it is a real barrier — your page can be perfectly optimized and still be invisible to Claude because Brave has never seen it.

The flip side is opportunity. Because Brave's index is not Google's, it surfaces alternative and lesser-known sources that Google-dependent platforms bury. For specialized, regional, and long-tail topics, a page that would sit on page three of Google can be a top Brave result and therefore a Claude citation. Claude is the one major engine where being independent of Google's ranking machine is an advantage rather than a death sentence.

Chapter 02

The most conservative citer in AI search

Claude cites less than any other major assistant, and that is by design.

Across a cross-platform citation audit, Claude produced a citation in only 39% of queries. The same audit put ChatGPT at 56%, Perplexity at 97%, and Google's AI Mode at 98%. Perplexity and Google cite on nearly every answer. Claude cites on fewer than half.

The reason is architectural. Claude evaluates whether its training knowledge is already sufficient before it reaches for the web at all. For well-covered, evergreen topics it answers from memory with no external citations. It searches only when it identifies a genuine knowledge gap — a query that is fresh, specific, or time-sensitive. ChatGPT, by comparison, triggers web search at measurable rates by intent type. Claude's threshold for invoking search is simply higher.

This produces a sharp split in behavior:

  • For discovery and research queries, where freshness matters, Claude's citation rate climbs to roughly 95%. This is the funnel stage where Claude is most cite-happy and most worth optimizing for.
  • For broad informational queries it can answer confidently from training data, Claude often cites nothing at all.

The practical takeaway: you do not win Claude by flooding it with content and hoping for coverage. You win it by being the verifiable, citable source for the specific, current, comparison-driven questions that actually trip Claude into searching. When it does search, it only cites what it can verify, and it explicitly flags uncertainty rather than papering over it.

Claude's conservatism is not a bug to route around. It is a filter. Accurate, specific, verifiable content clears it. Vague, promotional, or unsupported content does not.

In February 2026, Anthropic shipped a dynamic filtering upgrade (tool version web_search_20260209) that lets Claude write and run code to strip boilerplate, navigation, and irrelevant markup out of raw HTML before that content ever enters its reasoning context. The signal for optimizers: clean, content-first pages with low markup noise survive the filter intact. Pages where the substance is buried under interface chrome lose information before Claude even reasons over them.
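
To get a feel for how much of a page survives that kind of filtering, you can run a rough approximation locally. This sketch is not Anthropic's filter, whose logic is not described in the source. It assumes a simple rule (drop text inside `nav`, `header`, `footer`, `aside`, `script`, and `style`) and reports how much of the raw HTML is actual content.

```python
from html.parser import HTMLParser

# Assumed boilerplate containers; a real filter may use very different rules.
SKIPPED_TAGS = {"nav", "header", "footer", "aside", "script", "style"}

class ContentExtractor(HTMLParser):
    """Collect text that sits outside the skipped container tags."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def content_ratio(html):
    """Return the surviving text and its length as a share of the raw HTML."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return text, len(text) / max(len(html), 1)

# Invented sample page.
page = (
    "<html><body><nav><a href='/'>Home</a><a href='/pricing'>Pricing</a></nav>"
    "<main><h1>Plan comparison</h1><p>The Team plan costs $49 per seat.</p></main>"
    "<footer>Copyright Example Inc.</footer></body></html>"
)

text, ratio = content_ratio(page)
print(text)
print(round(ratio, 2))
```

A low ratio does not prove a page will be filtered badly, but it is a cheap signal that the substance is a small part of what the page ships.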

Chapter 03

What Claude actually cites: older, deeper, more earned

Claude does not just cite less. It cites differently. Three independent data sets in 2026 converge on the same portrait.

Claude leans older and more analytical. Muck Rack's ongoing "What Is AI Reading?" study found that only 36% of Claude's journalism citations were published in the last 12 months, versus 56% for ChatGPT. Claude weights long-form, analytical, and reference-grade content over breaking news. Where ChatGPT chases recency, Claude rewards durability.

Claude favors depth over wire services. The same research found Claude over-indexes on academic journals, technical documentation, industry publications, and government sources. Outlets like Harvard Business Review and TechRadar outperform major newswires in Claude's citation stack. It rewards the publication that explains a topic thoroughly, not the one that reported it first.

Claude leans hardest on user-generated and earned media. Yext's analysis of 17.2 million distinct AI citations found that Claude relies on user-generated content — reviews, forums, social — at rates 2 to 4 times higher than competing models, across every one of the seven sectors studied. Claude's "limited control" citation share ranged from 6.3% in the Organizations sector to 24.4% in Food and Beverage, consistently above its peers. Reddit, review platforms, and community discussion carry more weight in a Claude answer than in a Gemini or ChatGPT answer.

Put together, the Claude source profile is distinct: older, deeper, more academic, and more dependent on what other people say about you than what you say about yourself. This is the practical face of FancyAI's core thesis — the mention is the signal. On Claude that signal is heavily weighted toward third-party validation you do not own and cannot directly author.

This is also why a single cross-platform GEO strategy fails on Claude. As Yext put it, the source mix that makes a brand visible in Gemini is not the mix that makes it visible in Claude. The 2026 AI Platform Citation Source Index, synthesizing more than 680 million citations across the major engines, found that the cited-source overlap between platforms is low, and one cross-engine audit measured just 11% domain overlap between ChatGPT and Perplexity. Optimizing for "AI" in the aggregate optimizes for an average that does not exist inside Claude.

Chapter 04

Why the enterprise audience changes the math

Claude is a smaller consumer property than ChatGPT. It is the dominant LLM in the enterprise. That inversion is the whole reason Claude deserves a dedicated playbook even at lower query volume.

Menlo Ventures' December 2025 State of Generative AI in the Enterprise report, built on roughly 500 U.S. enterprise decision-makers, found Anthropic now earns 40% of enterprise LLM spend — up from 24% a year earlier and 12% in 2023. Over the same span OpenAI fell to 27% from 50% in 2023, and Google rose to 21%. Anthropic overtook OpenAI as the enterprise leader. In the coding-model segment specifically, estimates put Claude above 50% of enterprise share.

The customer footprint reinforces the point. Anthropic reports serving more than 300,000 business customers, with the number of accounts spending over $1 million annually growing into the hundreds. Annualized revenue reached roughly $30 billion by spring 2026, up from about $1 billion at the start of 2025. Industry reporting puts a large majority of the Fortune 100 among Claude's customers.

The strategic implication is about who is on the other end of a Claude answer. Claude's queries skew toward analysts, engineers, consultants, procurement teams, and operators doing high-consideration work inside companies — people researching vendors, comparing platforms, evaluating technical claims, and drafting recommendations that move budget. A single Claude citation in front of a procurement analyst is worth more than a hundred low-intent consumer impressions.

On Claude, you are not optimizing for traffic volume. You are optimizing for the few high-consideration, high-budget decisions where being the cited, verifiable answer changes the outcome.

This is the trap of judging Claude by consumer reach. Claude's consumer userbase grew sharply through 2026, but it remains well behind ChatGPT's. If you score platforms only by monthly active users, Claude looks optional. If you score them by who is asking and what those answers decide, Claude is where the highest-value B2B and technical-evaluation queries are happening.

Chapter 05

How Claude differs from ChatGPT, Perplexity, and Google

The four engines are not variations on one theme. They retrieve from different indexes, cite at different rates, and reward different content. The contrasts that matter for Claude:

Index. Claude uses Brave's independent index. ChatGPT uses Bing. Google AI Overviews and AI Mode use Google's index. Perplexity blends its own crawler with multiple backends. Being in Google does not put you in Claude.

Citation volume. Claude cites in ~39% of queries. ChatGPT ~56%. Perplexity ~97%. Google AI Mode ~98%. Claude is the engine where many answers carry no link at all, which means the bar for triggering a citation is the design constraint.

Source character. Claude leans older, deeper, more academic, and more reliant on UGC and earned media. ChatGPT leans toward recent, broadly authoritative sources. Perplexity cites prolifically across a wide range of sources. Google AI Overviews skew toward established high-authority domains and Google's own ecosystem.

Recency posture. ChatGPT and Perplexity reward freshness. Claude rewards durability and analysis. A definitive long-form explainer ages well inside Claude; a thin news post does not.

Personality. Claude behaves like a careful research assistant that flags what it does not know. The others lean toward comprehensiveness and coverage. Claude would rather say less and be right.

The error is treating these as one channel. The 2026 citation research is blunt on this: cross-engine source overlap is low, no single optimization strategy transfers cleanly, and a brand visible in one engine can be entirely absent in another. Claude needs its own work.

Chapter 06

Claude is becoming an agent, not just an answer

Optimizing for Claude is no longer only about appearing in a chat answer. Claude is moving onto the open web as an agent that acts.

In 2026 Anthropic expanded Claude in Chrome from a 1,000-tester research preview into a broader beta extension that lets Claude browse, fill forms, navigate sites, and complete multi-step tasks inside the user's own browser. Anthropic also expanded Claude's computer-use and agentic capabilities, and runs a multi-step Research mode that conducts several chained searches, deciding what to investigate next as it goes.

Two consequences for optimizers:

  • Agentic Claude consumes structure, not persuasion. When Claude is reading a page to complete a task or compare options, clean structure, machine-readable facts, clear pricing and spec tables, and unambiguous claims determine whether it can use your page. Marketing gloss it cannot parse is dead weight. This is the agentic-readiness layer that decides whether you are usable, not just visible.
  • Research mode rewards being the strong link in a chain. Because Research builds answers across multiple searches, being the source that resolves a specific sub-question — the definitive comparison, the authoritative spec, the verifiable benchmark — gets you pulled into a synthesis even when you would not have won a single one-shot query.

The direction of travel is clear. The brands that are clean, structured, and verifiable enough for Claude to act on, not just quote, are the ones positioned for where this is going.

Chapter 07

The execution playbook, and what does not work

Pulling the research into action. What works on Claude:

  • Get into Brave's index first. Confirm key pages are reachable by Brave's crawler, publicly accessible without login gates or session-bound URLs, and not blocked in robots.txt for Brave or for Claude-SearchBot. If Brave cannot see it, Claude cannot cite it.
  • Manage the three crawlers deliberately. Anthropic runs ClaudeBot (training), Claude-User (real-time user fetches), and Claude-SearchBot (search indexing), each controllable separately in robots.txt. Blocking the training crawler does not affect search visibility, but blocking Claude-SearchBot removes you from Claude's search answers. Decide each on purpose.
  • Write for durability, not the news cycle. Claude rewards deep, analytical, reference-grade content that ages well. Comprehensive explainers and definitive comparisons outperform thin, time-stamped posts.
  • Build earned media and community presence. With Claude weighting UGC and earned media 2 to 4 times more than its peers, third-party coverage, expert citations, and authentic Reddit and review presence move the needle more here than anywhere else.
  • Make facts verifiable and structured. Claude only cites what it can confirm. Specific, sourced, checkable claims clear its filter. Clean structure survives dynamic filtering and feeds agentic Claude.
  • Target the search-triggering moments. Optimize hardest for fresh, specific, comparison and research queries, the ~95% citation zone, not the broad informational questions Claude answers from memory.
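
The crawler decisions above can be checked before they ship. A minimal sketch, assuming Python's standard urllib.robotparser and a hypothetical robots.txt policy that opts out of training while staying open to search indexing and user fetches. The policy text and the example.com URL are illustrative assumptions, not Anthropic's recommended configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block the training crawler, allow search and user fetches.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /
"""

CLAUDE_AGENTS = ["ClaudeBot", "Claude-SearchBot", "Claude-User"]


def crawler_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Return, per Anthropic crawler, whether robots.txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in CLAUDE_AGENTS}


if __name__ == "__main__":
    access = crawler_access(ROBOTS_TXT, "https://example.com/pricing")
    for agent, allowed in access.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Running the same check against your production robots.txt for each key URL shows whether a rule written for one crawler has accidentally caught another.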

What does not work on Claude:

  • Relying on Google rankings. Google authority does not transfer to Brave's index. This is the most common and most expensive mistake.
  • Chasing recency for its own sake. Volume-publishing fresh-but-shallow content fits ChatGPT and Perplexity, not Claude.
  • Promotional, unverifiable copy. Claims Claude cannot confirm get dropped. Marketing language is not citation fuel.
  • One global GEO strategy. With ~11% cross-engine source overlap, the playbook that wins Gemini or ChatGPT does not win Claude.
  • Judging Claude by consumer MAUs. Its value is in enterprise and high-consideration queries, not raw reach. Score it on who is asking.

Sources cited

  1. Menlo Ventures: December 2025 State of Generative AI in the Enterprise report; Anthropic at 40% of enterprise LLM spend, OpenAI 27%, Google 21%; coding-model share.
  2. Anthropic: enterprise customer count (300,000+ businesses), revenue trajectory, Claude in Chrome expansion, three-crawler documentation, dynamic filtering (web_search_20260209), Research mode.
  3. Claude web search analysis (2026): 86.7% overlap between Claude citations and Brave's top organic results (p < 0.0001); 26.7% ChatGPT/Bing comparison.
  4. Muck Rack, "What Is AI Reading?" (2026): 36% of Claude journalism citations from past 12 months vs. 56% for ChatGPT; Claude's lean toward academic, technical, and reference sources.
  5. Yext: AI Citation Behavior research across 17.2 million citations; Claude's 2–4× higher reliance on user-generated and earned media across all seven sectors.
  6. Discovered Labs / cross-platform citation audit (2026): Claude citation rate 39% vs. ChatGPT 56%, Perplexity 97%, Google AI Mode 98%; ~95% on discovery queries.
  7. AI Platform Citation Source Index 2026 (synthesizing 680M+ citations) and cross-engine audit: ~11% source overlap between any two engines; no single transferable strategy.
  8. BrightEdge, The Ultimate Guide to Claude Search: Claude as conversational research and discovery assistant; Brave-powered inline citations.

Want this measured against your brand?

The AI Readiness Index measures whether your brand is actually citable inside Claude — indexed in Brave, structured for verification, and backed by the earned media Claude weights most. See where you stand before your competitors do.

Live · FancyAI Research Corpus

How to optimize for Grok: the only engine where a tweet is a ranking signal

Grok answers from a live feed of X conversation that no other AI can touch. Reddit, YouTube, and Facebook supply nearly half its citations, and X engagement directly shapes which sources it picks. Optimizing for Grok is a social and real-time game, not a content-library game.

117M
Monthly Grok users inside X (SpaceX S-1, May 2026)
45.3%
Share of Grok citations from the three biggest social platforms (Ahrefs, June 2026)
240M
Grok-powered searches per day on X (xAI / X, 2026)
15.1%
Share of Grok citations from social media sources, the highest of any major engine (AuthorityTech, 2026)
Chapter 01

The engine that lives inside a social network

Every other AI assistant retrieves from an index of the open web. Grok does that too, but it starts somewhere no competitor can follow: the live X post graph. Grok is built directly into X, and that distribution is the whole story. According to SpaceX's May 2026 S-1 filing, 117 million people use Grok's features every month, out of X's 550 million monthly active users (Ahrefs, citing the SpaceX S-1). xAI and X report that Grok now powers roughly 240 million searches per day on the platform, with the model wired into the search bar, the "Explain" button on posts, and the recommendation layer itself.

That changes what optimization means. On ChatGPT you compete to be in a training corpus and a Bing-backed index. On Perplexity you compete for a proprietary web index that rewards freshness and source diversity. On Grok you compete for two things at once: a web index and a real-time social conversation. The second one has no equivalent anywhere else.

Grok is the only major AI engine where a single social network's engagement functions as a first-class ranking signal. A high-velocity thread on X is not background context. It is a source Grok can quote.

The platform's reach is real but its market position is narrower than the headline distribution suggests. Grok's U.S. mobile chatbot share reached 17.8% in January 2026, up from 1.9% a year earlier (Apptopia data reported by Reuters). Globally it holds an estimated 2-5% of the AI chatbot market, against ChatGPT's 60-80% (multiple 2026 trackers). On the web, Grok pulled roughly 234 million monthly visits in early 2026, ahead of Perplexity (189M) and Claude (180M) (Similarweb-class web-traffic data, 2026). After fourteen straight months of growth, Grok's U.S. mobile share dipped for the first time in March 2026, an early sign the vertical climb is flattening.

The takeaway for brands: Grok is too large to ignore and too distinctive to optimize for with a generic GEO checklist. Its citation behavior is governed by social dynamics the other engines do not share.

Chapter 02

Where Grok's answers actually come from

The cleanest public read on Grok's sourcing is Ahrefs' Brand Radar study, which tracked every domain Grok cited across more than 1.9 million U.S. queries spanning all topics in June 2026. The top of the list looks unlike any other engine:

  • Reddit — 16.3% of mention share, the single most-cited domain
  • YouTube — 15.1%
  • Facebook — 13.9%
  • Instagram — 5.9%
  • Quora — 5.5%
  • Amazon — 5.0%
  • TikTok — 4.8%
  • Wikipedia — 3.4%

Add the three leaders and social and community platforms account for 45.3% of Grok's top-source attention before you reach a single encyclopedia or news outlet (Ahrefs, June 2026). Reddit, YouTube, Facebook, Instagram, Quora, and TikTok together dominate the table. Traditional publishers appear far down: the New York Times sits at 1.2%, Forbes at 1.1%, CNN at 0.3%.
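
The arithmetic behind those two statements is simple enough to reproduce. A short sketch using only the mention-share figures quoted in the list above; the function name is an illustrative assumption.

```python
# Mention-share figures as quoted above (Ahrefs, June 2026), in percent.
MENTION_SHARE = {
    "Reddit": 16.3,
    "YouTube": 15.1,
    "Facebook": 13.9,
    "Instagram": 5.9,
    "Quora": 5.5,
    "Amazon": 5.0,
    "TikTok": 4.8,
    "Wikipedia": 3.4,
}


def combined_share(domains: list[str]) -> float:
    """Sum the mention share of the given domains, rounded to one decimal."""
    return round(sum(MENTION_SHARE[d] for d in domains), 1)


top_three = combined_share(["Reddit", "YouTube", "Facebook"])
social_six = combined_share(
    ["Reddit", "YouTube", "Facebook", "Instagram", "Quora", "TikTok"]
)
print(top_three)   # 45.3
print(social_six)  # 61.5
```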

This is the defining fact of Grok optimization. A cross-engine analysis put 15.1% of Grok's citations as coming from social media sources, the highest share of any major engine, with most of the remainder split between earned/news (40.5%) and other (41.6%) (AuthorityTech, 2026). ChatGPT, by contrast, pulls heavily from Wikipedia, government, education, and major media, with one analysis finding 67% of its top 1,000 cited pages are effectively off-limits to brand SEO (AuthorityTech, 2026). Perplexity rewards recency and source diversity, typically citing three to eight sources per answer.

If your brand has no presence on Reddit, YouTube, or Facebook, you are invisible to nearly half of Grok's source attention before the query even runs.

There is a second sourcing layer that the domain table understates. Grok also reads the live X feed directly. One 2026 analysis of citation behavior found that X posts get cited more often for recent topics and opinions, while website content dominates factual queries, and that Grok tends to quote X posts directly but paraphrase website content (Trakkr / cross-platform analysis, 2026). x.com itself climbed 15 places month-over-month in the Ahrefs ranking to 1.4% mention share, one of the fastest movers on the board. The social signal is growing, not shrinking.

Chapter 03

Real-time relevance is Grok's home turf

Grok's structural advantage is speed to the present. xAI's Grok 4.20, released in beta in February 2026 and fully available by March, carries a 2-million-token context window and native, real-time integration with X data (AI/ML API release notes, 2026). Where ChatGPT and Claude operate from a knowledge cutoff plus a web-search bolt-on, Grok has native access to the full X post graph as it happens, drawn from a user base reported at 600M+ monthly active users (xAI / 2026 platform reporting; the SpaceX S-1 cited above puts X at 550 million). No other major model has a comparable real-time social architecture.

This matters most for a specific class of query. For breaking events, trending topics, live sentiment, and fast cultural context, Grok can synthesize what journalists, witnesses, and experts are posting on X minutes after an event, alongside web articles (multiple 2026 reviews). Grok DeepSearch is the only major deep-research mode that searches X posts in parallel with the open web. For crypto, markets, elections, sports, and any topic where the conversation is part of the story, that is a genuine edge competitors cannot replicate by indexing harder.

The execution implication is blunt: recency wins on Grok more decisively than on any other engine. A page updated this week, discussed on X this week, will outrank an authoritative but stale page. Grok's two search tools, exposed in xAI's API as web_search and x_search, are called iteratively, and the model re-queries as it reasons (xAI developer documentation, 2026). DeepSearch can execute tools up to ten times per query with a minimum of three calls before answering (Profound, 2025), cross-verifying claims across news, web, and X posts.

On Grok, freshness is not a tie-breaker. It is a primary signal. The platform was built to privilege what is happening now over what was true last year.

Chapter 04

What drives inclusion: social velocity, then structure

Strip Grok down to its incentives and three signals dominate.

1. Social velocity on X. Shares, replies, quote-posts, and trending-topic association on X directly influence which sources Grok surfaces. Higher X share velocity produces richer link previews, which produces more Grok citations, a self-reinforcing loop (Metricool, 2025; AEOfix, 2025). This is the single biggest lever and the one no other engine offers. Brands with active, engaged X presence get disproportionate visibility versus brands present only on the open web.

2. Presence on Grok's actual sources. Because Reddit, YouTube, Facebook, Quora, and Wikipedia supply most of Grok's web citations, being talked about in those venues matters more than your owned blog. A recommendation thread on Reddit, a review video on YouTube, an answer on Quora — these are the assets Grok reaches for. This is the same earned-media logic that wins on every engine, but on Grok the social weighting makes it sharper.

3. Direct, factual, declarative content. Grok rewards verifiable facts stated plainly over hedged marketing language, and it answers in a confident, sometimes edgy register (AEOfix, 2025). Pages built as discrete H2 sections, each answering one sub-question, give DeepSearch independent extraction targets. Twitter Card metadata — twitter:card, twitter:title, twitter:description, twitter:image — controls how your pages render in X previews and how Grok reads page intent (Metricool, 2025).
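
Whether a page actually ships the four Twitter Card tags named above is easy to verify. A minimal sketch using Python's standard html.parser; the sample page and the helper names are illustrative assumptions, and the check covers only presence of the tags, not how X or Grok interprets them.

```python
from html.parser import HTMLParser

REQUIRED_TAGS = ("twitter:card", "twitter:title", "twitter:description", "twitter:image")


class TwitterCardParser(HTMLParser):
    """Collects twitter:* meta tags found in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.found: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Card tags conventionally use name=; some sites use property= instead.
        key = attr.get("name") or attr.get("property") or ""
        if key.startswith("twitter:") and attr.get("content"):
            self.found[key] = attr["content"]


def missing_card_tags(html: str) -> list[str]:
    """Return the required Twitter Card tags that are absent or empty."""
    parser = TwitterCardParser()
    parser.feed(html)
    return [tag for tag in REQUIRED_TAGS if tag not in parser.found]


# Hypothetical page head with one tag deliberately left out.
SAMPLE_PAGE = """
<html><head>
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Example product comparison">
<meta name="twitter:description" content="Specs and pricing in plain terms.">
</head><body></body></html>
"""

print(missing_card_tags(SAMPLE_PAGE))  # ['twitter:image']
```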

Inclusion on Grok is therefore won mostly off your own site. The mention in a high-engagement X thread or a top Reddit answer is the signal. Your homepage is not.

Chapter 05

The crawler problem nobody warns you about

There is a technical trap specific to Grok. xAI documents crawler user agents like GrokBot/1.0, xAI-Grok/1.0, and Grok-DeepSearch/1.0 — but in practice Grok frequently does not identify itself with them. Multiple 2026 crawler audits report that Grok "almost never sends a clearly identifiable Grok or xAI user agent" and instead spoofs common browser strings — outdated Chrome versions, iPhone Safari, Go-http-client — to avoid being blocked (Promptwatch; Stackfox, 2026).

This cuts two ways. Because DeepSearch performs user-initiated, real-time fetches, it may ignore robots.txt entirely on those requests. But for the index-driven half of Grok's pipeline, bot-detection middleware and default robots.txt rules that key off "bot" or "crawler" can silently miss Grok's documented agents, leaving your pages out of its index without any error you would notice.

The defensive moves:

  • Explicitly allow the documented xAI agents in robots.txt rather than relying on generic allow rules
  • Audit your WAF and bot-detection layer for rules that block unfamiliar or "outdated" browser strings, which can catch Grok's spoofed agents
  • Verify in server logs that xAI traffic is actually reaching your pages, since the spoofing makes it easy to misattribute
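
The log check in the last step can start as a simple count. A minimal sketch, assuming access logs in Combined Log Format where the user agent is the final quoted field. The three tokens come from the documented agents named above; the sample log lines are synthetic and the function name is hypothetical.

```python
import re
from collections import Counter

# User-agent tokens the section above attributes to xAI's documentation.
DOCUMENTED_XAI_TOKENS = ("GrokBot", "xAI-Grok", "Grok-DeepSearch")

# Combined Log Format: the user agent is the last quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_xai_hits(log_lines):
    """Count requests per documented xAI token; all others are 'unattributed'."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        user_agent = match.group(1).lower() if match else ""
        token = next(
            (t for t in DOCUMENTED_XAI_TOKENS if t.lower() in user_agent), None
        )
        counts[token or "unattributed"] += 1
    return counts


# Synthetic log lines for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jun/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "GrokBot/1.0"',
    '203.0.113.6 - - [01/Jun/2026:10:00:02 +0000] "GET /specs HTTP/1.1" 200 4096 "-" "Grok-DeepSearch/1.0"',
    '198.51.100.7 - - [01/Jun/2026:10:00:05 +0000] "GET /pricing HTTP/1.1" 403 312 "-" "Go-http-client/1.1"',
]

for token, hits in count_xai_hits(SAMPLE_LOG).items():
    print(f"{token}: {hits}")
```

User-agent matching only finds traffic that identifies itself. If the spoofing reported above holds, some Grok requests will land in the "unattributed" bucket, so a low count for the documented tokens is a prompt to inspect blocked generic-browser requests, not proof that Grok is absent.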

Many sites that believe they are open to AI crawlers are quietly invisible to Grok. The user agent does not match the rules they wrote.

Chapter 06

How Grok differs, and what does not work

Set Grok beside the other engines and the contrasts are stark.

Signal                       | ChatGPT               | Perplexity              | Google AI Overviews  | Grok
Backend                      | Bing + training       | Proprietary index       | Google index         | Own index + live X feed
Top source type              | Wikipedia, gov, media | Diverse web + community | Top-10 ranking pages | Reddit, YouTube, Facebook, X
Social signal                | Indirect              | Indirect                | Indirect             | Direct, first-class
Recency weighting            | Moderate              | High                    | Moderate             | Highest
Citation overlap with others | Low                   | Low                     | Low                  | Low

A study of 680 million AI citations found only 11-12% domain overlap across engines (AuthorityTech, 2026). Grok's social-heavy profile makes that gap even wider. Optimizing for one platform leaves most of the others, Grok especially, untouched.
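
To measure the same gap on your own tracked queries, compare the sets of domains each engine cites. A minimal sketch that assumes overlap means Jaccard similarity (shared domains divided by all distinct domains); the study quoted above does not state its exact definition, and the domain sets below are made up for illustration.

```python
def domain_overlap(cited_a: set[str], cited_b: set[str]) -> float:
    """Jaccard overlap: shared domains divided by all distinct domains cited."""
    union = cited_a | cited_b
    if not union:
        return 0.0
    return len(cited_a & cited_b) / len(union)


# Hypothetical citation sets collected for the same queries on two engines.
grok_domains = {"reddit.com", "youtube.com", "facebook.com", "x.com"}
other_engine_domains = {"wikipedia.org", "reddit.com", "nytimes.com", "forbes.com"}

overlap = domain_overlap(grok_domains, other_engine_domains)
print(f"{overlap:.1%}")  # 14.3%
```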

What does not work on Grok:

  • Owned-site-only strategies. If your plan is "publish more pages on our domain," you are optimizing for the 1-3% slice while ignoring the 45% that lives on social and community platforms.
  • Polished, hedged marketing copy. Grok favors direct factual claims and a confident voice. Corporate hedging reads as low-signal.
  • Stale evergreen content with no social footprint. Without recency and X conversation, even authoritative pages lose to fresher, more-discussed sources.
  • Treating Grok citations as automatically safe. This is the real caution. Grok's reliance on X data carries documented reliability risk. Global Witness found Grok amplifying conspiracies and toxic content in response to neutral political questions, and through early 2026 Ofcom and the European Commission opened formal investigations into the platform (Global Witness, 2026; RAND, Feb 2026). For YMYL and politically adjacent topics, a Grok citation carries different and higher reputational risk than the same citation on Claude or Perplexity.

The honest limitation: there is still no large-scale, peer-reviewed Grok citation study on the scale of what exists for ChatGPT, Perplexity, and Google AI Overviews. The Ahrefs domain data is the strongest public signal available, but brand-mention rates by vertical and DeepSearch's exact source-selection logic remain undocumented. Optimize for Grok on the evidence we have, and measure your own results rather than trusting industry rules of thumb.

Chapter 07

The Grok execution playbook

A sequenced plan, ordered by leverage.

  1. Build genuine X presence and velocity. Post factual, quotable content. Earn replies and quote-posts. This is the one lever unique to Grok and the highest-impact thing you can do. Engagement on X is a direct citation signal.
  2. Get cited where Grok looks. Prioritize Reddit, YouTube, Quora, and Facebook — they supply the bulk of Grok's web citations. A strong Reddit answer or YouTube review beats another owned blog post.
  3. Fix the crawler gap. Explicitly allow xAI's documented user agents in robots.txt, audit bot-detection for false blocks, and confirm in logs that Grok traffic lands.
  4. Write declaratively and keep it fresh. Lead with verifiable facts in plain sentences. Structure pages as discrete H2 answers. Update high-value pages on a real cadence — recency is weighted more heavily here than anywhere else.
  5. Ship complete Twitter Card metadata on every page so it renders well in X previews and Grok reads intent cleanly.
  6. Manage the risk. For regulated, health, financial, or political topics, monitor how Grok represents your brand, not just whether it mentions you. The data-quality risk is real and specific to this engine.
  7. Measure Grok separately. Given the low citation overlap with other engines (11-12% domain overlap in the study cited above), track Grok visibility on its own. Do not assume ChatGPT or Perplexity performance transfers.

The mention is the signal. On Grok, the highest-value mention is often not on your website at all. It is a quote-post, a Reddit thread, or a YouTube review the model decided to cite.

Sources cited

  1. Ahrefs Brand Radar: "The 50 Most-Cited Websites in Grok (June 2026)," domain mention-share data across 1.9M+ U.S. queries; SpaceX S-1 user figure
  2. SpaceX S-1 filing (May 2026): 117M monthly Grok-feature users out of 550M X MAU
  3. Apptopia / Reuters: Grok U.S. mobile chatbot share (17.8%, January 2026)
  4. AuthorityTech (2026): cross-engine citation analysis; 15.1% social share for Grok; 11-12% domain overlap; 67% of ChatGPT top pages off-limits
  5. xAI developer documentation (2026): web_search and x_search tools, iterative tool-calling, Live Search
  6. AI/ML API release notes (2026): Grok 4.20 specs: 2M-token context, real-time X integration
  7. Trakkr / cross-platform citation analysis (2026): X posts quoted directly vs. web paraphrased; recency vs. factual sourcing split
  8. Profound (2025): Grok WebSearch vs. DeepSearch architecture; tool-call limits
  9. Promptwatch & Stackfox (2026): xAI crawler user agents and browser-string spoofing behavior
  10. Global Witness (2026): Grok amplifying conspiracies and toxic content in political queries
  11. RAND (Feb 2026): regulatory scrutiny of Grok; Ofcom and European Commission investigations
  12. Metricool & AEOfix (2025): Twitter Card metadata, social velocity loop, declarative content guidance
  13. Similarweb-class web-traffic data (2026): Grok ~234M monthly visits vs. Perplexity and Claude

Want this measured against your brand?

The AI Readiness Index (ARI) measures where you stand on Grok specifically, including the social and community footprint that drives nearly half its citations. See your Grok visibility next to your competitors and the exact queries where you appear, or don't.

Live · FancyAI Research Corpus

Being Seen vs. Being Selected: How AI Decides Which Products to Recommend

AI shopping traffic to US retailers grew 393% year over year in Q1 2026 and now converts 42% better than every other channel. But the product page is no longer the front door, and most retail sites are not even readable by the machines deciding what to recommend. This is the new shelf.

393%
Year-over-year growth in AI traffic to US retail sites, Q1 2026 (Adobe)
42%
How much better AI-referred shoppers convert vs. all other channels, March 2026 (Adobe)
66%
Machine-readability score of the average retail product page (Adobe)
82–85%
Share of AI citations that come from third-party sources, not your own site (multiple 2026 studies)
Chapter 01

The shelf moved, and most brands don't know where it went

For thirty years, ecommerce ran on a single mental model: rank the page, win the click, convert the visitor. Search engine optimization was the discipline of getting a product detail page (PDP) to position one, and the PDP was the front door to the sale. That model is breaking in real time.

In the first quarter of 2026, traffic from AI sources to US retail sites grew 393% year over year, according to Adobe Analytics. In March alone it was up 269%. That continued a holiday surge in which AI referrals to retail sites rose 693% year over year in November and December 2025. The slope is not linear. It is exponential, and it is compounding off a base that was a statistical rounding error eighteen months ago.

The more important number is not the traffic. It is the quality of it. In March 2026, AI-referred shoppers converted 42% better than visitors from every other channel combined, including paid search and email. That is a record high, and it represents a roughly 80-percentage-point swing in twelve months. In March 2025, AI traffic converted 38% worse than the average channel. The shoppers who arrive from an AI engine now spend 48% longer on site, browse 13% more pages, and generate 37% higher revenue per visit than non-AI traffic.

Salesforce estimated that AI agents influenced more than 20% of all global online retail sales during the 2025 holiday season, and that AI-referred shopping traffic converted nine times more often than social referral traffic.

This is the inversion that matters. AI does not send you a curious browser. It sends you a shopper who has already been told what to buy. The engine has done the comparison, read the reviews, weighed the alternatives, and arrived at a recommendation. By the time that person reaches your site, the decision is largely made.

Which raises the question this report exists to answer: how does the AI decide? And the uncomfortable answer for most brands is that they have no visibility into, and almost no control over, the process that now precedes the most valuable click in commerce.

Chapter 02

The product page is no longer the front door

Here is the structural problem. AI engines recommend products. They do not rank pages. And the page you have spent a decade optimizing is, to the machine, frequently unreadable.

Adobe's 2026 retail readability analysis found that the average retail homepage scores 75% on machine readability, meaning roughly a quarter of homepage content cannot be parsed by an LLM. Category pages score 74%. And the individual product page, the thing brands consider their crown jewel, scores just 66%. A third of the information a shopper needs to make a buying decision is invisible to the system now making that decision on the shopper's behalf.

The reason is architectural. Modern PDPs are built for human eyes and conversion-rate optimization: JavaScript-rendered specs, image-based feature callouts, tabbed content, pricing injected client-side, reviews loaded in a widget. A human sees a rich page. A crawler sees a near-empty shell. The content that persuades a person is exactly the content the model cannot read.
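
One way to see the "near-empty shell" is to test what survives in the raw HTML before any JavaScript runs. A minimal sketch using Python's standard html.parser, assuming a crawler that does not execute scripts; the sample page, the facts checked, and the function names are hypothetical. It looks at visible text only and ignores structured data inside script tags.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text present in the raw HTML, skipping script and style bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def facts_missing_from_raw_html(html: str, facts: list[str]) -> list[str]:
    """Return the facts that do not appear in the page's non-script text."""
    parser = VisibleTextParser()
    parser.feed(html)
    text = " ".join(parser.chunks).lower()
    return [fact for fact in facts if fact.lower() not in text]


# Hypothetical product page where price and specs are injected client-side.
RAW_PDP = """
<html><body>
<h1>Trail Runner 2 Waterproof Hiking Boot</h1>
<div id="price"></div>
<div id="specs"></div>
<script>renderPrice("$139.00"); renderSpecs({weight: "410 g"});</script>
</body></html>
"""

print(facts_missing_from_raw_html(RAW_PDP, ["Waterproof Hiking Boot", "$139.00", "410 g"]))
# ['$139.00', '410 g']
```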

Meanwhile, the engines have stopped treating your site as the primary source at all. Across multiple 2026 citation studies, 82% to 85% of the sources AI engines cite in product and recommendation answers come from third parties, not the brand's own domain. When ChatGPT, Perplexity, or Gemini answer a "best" or comparison query, they cite Reddit, review aggregators, and editorial publishers, and they trust a Wirecutter guide or a Runner's World review over a brand's own product description.

  • Over 91% of ecommerce queries now trigger an AI-generated result of some kind (Hamster Garage, 2026).
  • Google AI Overviews now appear on 14% of shopping queries, a 5.6x increase in four months (Alhena, 2026).
  • 65% of AI-cited pages use structured data, and pages with proper schema have a roughly 2.5x higher chance of appearing in AI answers (Alhena / Search Engine Land, 2026).

The front door is no longer your PDP. It is a synthesized recommendation assembled from sources you mostly don't own, delivered to a shopper who never saw your homepage.

This is the visibility-versus-selection gap that defines the vertical. You can be perfectly indexed, perfectly ranked in classic SEO, and still never be named when a shopper asks an AI which product to buy. Being seen by Google is not the same as being selected by the model. The two now require different work.

Chapter 03

The signals that get a product recommended

If the AI is not reading your page and not citing your site, what is it using? Four signal classes do the heavy lifting, and they map directly to what a brand can and cannot control.

1. Structured product data and feeds. The single most controllable signal. AI systems pull from Merchant Center feeds, Product and Offer schema, and verified product databases because that data is cleaner and more complete than scraped page content. The structure tells the model, unambiguously, "this is a product, this is its price, this is its rating, this is its availability." In January 2026, Google announced dozens of new product data attributes that go beyond keywords, including answers to common product questions and compatible accessories and alternatives. Feed quality is now a visibility strategy, not a backend chore. The title structure that wins is descriptive and literal: Brand plus Category plus Key Attributes, "lightweight waterproof hiking boots under $150," not "experience the luxury of adventure."

2. Reviews and third-party validation. This is the verification layer, and the engines lean on it hard. Feefo's 2026 analysis found ChatGPT references reviews in 58% of responses and Perplexity in 100%. In Google AI Overviews, 34.5% of answers cite at least one review platform, and the top five review platforms hold 88% of all review citations. Reviews separate brand claims from real experience, and the engine treats depth, stories and specifics, as more credible than star averages.

3. Community consensus, led by Reddit. A 2026 index of 680 million citations found Reddit accounts for roughly 40% of all citations across major AI models, topping the list on every major model. Per Tinuiti's Q1 2026 citations report, social media's share of AI citations climbed past 9%, with Reddit driving the dominant share across nine tracked product categories. As Search Engine Land put it, large language models are not primary sources of truth; they are mirrors reflecting the consensus formed through human conversation. Optimizing for social is now optimizing for AI.

4. Third-party "best of" lists and editorial. The model's shortcut for "best X" queries is to defer to the publishers humans already trust for that judgment. A placement in a credible roundup is worth more to AI visibility than any amount of on-site copy claiming you are the best.

AI doesn't rank your product in isolation. It ranks it relatively, comparing it against alternatives across feeds, reviews, community sentiment, and editorial consensus, then names a winner.

Notice the pattern. Three of the four signal classes live off your domain. The brand's job has shifted from optimizing a page it owns to seeding and shaping a consensus it can influence but not author.

Chapter 04

Science beats storytelling: the beauty proof point

The clearest evidence of how AI selects products comes from beauty, where Yotpo benchmarked 127 brands for AI visibility and found two separate universes.

Clinical, science-backed brands dominate. Paula's Choice scored 101.6. COSRX hit 98.0. The Ordinary, The Inkey List, and Naturium rewrote the category through ingredient transparency and education-first content the model can parse. Traditional brands optimized for retail shelf space and emotional advertising are effectively invisible: Innisfree scored 12.4, Bath & Body Works 39.9, Revlon 47.6, Nivea 52.1.

The differentiators are specific and transferable. "Dermatologist recommended" is worth an estimated $30M in earned media value; CeraVe built its AI dominance on dermatologist partnerships. Skincare outperforms makeup and fragrance because ingredients demand explanation, and explanation is content the model can index. As Yotpo framed it, you can't SEO a red lipstick, but hyaluronic acid spawns a thousand search paths.

Two penalties stand out for every ecommerce category, not just beauty:

  • The fragmentation penalty. Bath & Body Works' 500-plus scent variations, with no use cases attached, scored -10.5. Too many thin, undifferentiated SKUs confuse the model and cannibalize authority. Focused catalogs with deep content win; sprawling catalogs with shallow pages lose.
  • The third-party reality. Only 25% of sources cited in AI beauty answers were brand-managed sites (AvenueZ). Visual UGC correlated only weakly with AI performance (r=0.34). What the model wanted was parseable substance, not aesthetics.

SEO rewarded keywords and shelf space. AI rewards clinical clarity, expert proof, and search-native storytelling. Brands built for the old world are structurally disadvantaged in the new one.

The lesson generalizes. In any category, the brand that explains its product in specific, verifiable, machine-readable terms will be selected over the brand that gestures at a feeling. Substance is the new keyword.

Chapter 05

The conversion economics: why the small channel is the urgent one

The standard objection is that AI traffic is still tiny. It is, in absolute terms. ChatGPT referrals account for roughly 0.2% of total sessions across a 12-month analysis of 973 ecommerce sites (University of Hamburg / Frankfurt School), and Google's organic search remains roughly 200x larger than organic LLM referrals. If you index on volume, AI looks ignorable.

That is the wrong frame. AI traffic is small and converts dramatically better, which makes it the highest-leverage acquisition channel a brand has, and the growth curve removes the "ignore it for now" option.

  • AI-referred shoppers convert 42% better than all other channels (Adobe, March 2026).
  • During the 2025 holiday, AI referrals converted 31% more than other sources overall, 54% more on Thanksgiving, and 38% more on Black Friday (Adobe).
  • AI shopping traffic converted 9x more often than social referral traffic (Salesforce).
  • Revenue per visit from AI is now 37% higher than non-AI traffic, reversing a position where human traffic was worth 128% more just a year earlier (Adobe).
  • Perplexity shoppers carry an average order value 57% higher than other AI platforms (Similarweb).

The reason is intent. The engine pre-qualifies the shopper. It has already filtered for fit, compared options, and surfaced your product as the answer to a specific need. There is no top-of-funnel browsing to discount. The person who clicks through has been handed a recommendation, and recommendation traffic closes.

Layer on the macro forecast and the urgency clarifies. McKinsey projects $900 billion to $1 trillion in US retail revenue from agentic commerce by 2030, and $3 to $5 trillion globally. Salesforce estimates that AI agents already influenced more than 20% of global online retail sales in the 2025 holiday season.

A channel that converts 42% better, carries higher order values, and is growing triple digits year over year is not a side experiment. It is where the next margin lives.

One caution against over-rotating on any single platform's mechanics: OpenAI quietly abandoned its in-chat Instant Checkout feature in March 2026 after low adoption, with fewer than a dozen Shopify merchants ever fully integrating it. The platforms and the buy buttons will keep shifting. The durable bet is not a specific checkout integration. It is being the product the engine names, on whatever surface the shopper happens to be standing.

Chapter 06

The execution playbook for ecommerce and DTC brands

The work divides into what you own, what you influence, and what you measure. None of it is optional anymore.

Make your data machine-readable first. This is the foundation and the fastest win. With product pages averaging just 66% machine-readability (Adobe), the cheapest visibility gains are in fixing what the crawler can't currently see.

  • Implement complete Product, Offer, Review, AggregateRating, and FAQPage schema, and keep it dynamic so price, availability, and ratings update automatically. Stale or contradictory schema gets downranked.
  • Render critical product information server-side. If specs, price, and reviews only appear after JavaScript executes, assume the model never sees them.
  • Enrich your feed: descriptive titles (Brand + Category + Key Attributes), complete structured attributes including the optional ones, multi-angle white-background images with real alt text, and consistent pricing across every surface.
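
As a minimal sketch of the schema bullet above, the snippet below builds Product, Offer, and AggregateRating markup from a single product record, so price, availability, and ratings cannot drift from the source data. The function names and the sample product are hypothetical; the `@type` values and property names are standard schema.org vocabulary. It assumes the tag is written into the server-rendered HTML rather than injected by client-side JavaScript.

```python
import json


def product_jsonld(name, brand, sku, price, currency, in_stock, rating, review_count):
    """Build a schema.org Product dict with a nested Offer and AggregateRating."""
    availability = "InStock" if in_stock else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "sku": sku,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        },
    }


def as_script_tag(data):
    """Serialize the dict into the JSON-LD script element for the page template."""
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"


# Hypothetical catalog record; in practice these values would be read from the
# live product database at render time so the markup never goes stale.
product = product_jsonld(
    name="Example Brand Waterproof Trail Running Shoe",
    brand="Example Brand",
    sku="EX-TRAIL-001",
    price=48.0,
    currency="USD",
    in_stock=True,
    rating=4.6,
    review_count=212,
)
tag = as_script_tag(product)
print(tag)
```

Generating the markup from the same record that drives the visible page is one way to avoid the contradictory-schema problem: there is only one source for price and availability.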

Earn the third-party signals you can't author. Roughly 82–85% of the citations deciding your fate live off your domain, so you have to go where they are formed.

  • Build genuine review depth. Solicit reviews that tell stories, not just star ratings, since depth is what the model treats as credible verification.
  • Participate authentically in community, especially Reddit, the single largest AI citation source. The strategy is real contribution, not seeded spam; engines detect and discount coordinated mention campaigns.
  • Pursue earned placements in the editorial roundups and "best of" lists the engines defer to. One credible third-party recommendation outweighs a page of self-description.

Write for explanation, not persuasion. Describe products the way you would to a knowledgeable friend. Specific, literal, verifiable. Add use-case content, comparison content, and Q&A content that answers the long-tail questions shoppers actually ask the AI. Lead with substance the model can quote: ingredients, materials, specs, certifications, expert validation.

Prune fragmentation. Consolidate thin, near-duplicate SKUs and attach a clear use case to every variation that remains. A focused catalog with deep content beats a sprawling one with shallow pages.

Measure selection, not just traffic. Classic analytics tell you who arrived. They cannot tell you whether the engine named you in the first place, and the recommendation happens before any click. You need to track share of voice in AI answers, the prompts you appear and don't appear in, and which competitors are being selected when you are not.
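
One way to make "share of voice" concrete is sketched below, assuming you have already collected AI answers for a fixed prompt set. The brand names, prompts, and answer text are invented for illustration, and plain substring matching stands in for whatever entity matching a real tracker would use.

```python
from collections import Counter


def share_of_voice(answers, brands):
    """Fraction of all brand mentions that belong to each brand.

    answers: list of (prompt, answer_text) pairs.
    A brand counts at most once per answer.
    """
    mentions = Counter()
    for _prompt, text in answers:
        lowered = text.lower()
        for brand in brands:
            if brand.lower() in lowered:
                mentions[brand] += 1
    total = sum(mentions.values())
    return {b: (mentions[b] / total if total else 0.0) for b in brands}


def prompt_coverage(answers, brand):
    """Split prompts into those where the brand is named and those where it is not."""
    needle = brand.lower()
    present = [p for p, t in answers if needle in t.lower()]
    absent = [p for p, t in answers if needle not in t.lower()]
    return present, absent


# Hypothetical answers captured from an AI engine for four shopping prompts.
answers = [
    ("best trail running shoes", "Top picks: Acme TrailPro and Borealis Ridge."),
    ("waterproof running shoes", "Borealis Ridge GTX is a common recommendation."),
    ("budget trail shoes", "Consider the Acme TrailLite."),
    ("wide-fit trail shoes", "Borealis and Cobalt both offer wide sizes."),
]
brands = ["Acme", "Borealis", "Cobalt"]

sov = share_of_voice(answers, brands)
present, absent = prompt_coverage(answers, "Acme")
print(sov)
print("Acme missing from:", absent)
```

The `absent` list is the useful output: it names the prompts where a competitor was selected instead.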

The old job was ranking a page you control. The new job is being selected by a system you don't, built on signals you mostly don't own. Visibility was a placement. Selection is a reputation.

The brands winning this transition are not the ones with the biggest ad budgets. They are the ones whose product truth is the most machine-readable, the most third-party-validated, and the most specifically described. The shelf moved. The question is whether the engine can read you well enough to put you on it.

Sources cited

  1. Adobe Analytics (2026): Q1 AI retail traffic growth (393%), March conversion lift (42%), revenue-per-visit (37% higher), engagement and dwell-time figures, holiday AI referral surge (693%), and product-page machine-readability scores (66%).
  2. Salesforce (2025–2026): AI agents influenced 20% of global online retail holiday sales; AI shopping traffic converted 9x better than social referrals; record holiday spend data.
  3. McKinsey (2026): Agentic commerce revenue forecast of $900B–$1T US, $3–5T global by 2030.
  4. Yotpo (2025): 127-brand beauty AI visibility benchmark, brand scores, "dermatologist recommended" earned-media value, fragmentation penalty.
  5. Tinuiti AI Citations Trends Report, Q1 2026: Social media's rising share of AI citations (past 9%), Reddit's category dominance.
  6. 2026 AI citation index (680M citations): Reddit at ~40% of all citations across major models.
  7. Feefo (2026): Review reference rates in AI answers (ChatGPT 58%, Perplexity 100%).
  8. Alhena / Search Engine Land (2026): Structured data adoption (65% of cited pages), schema appearance lift (2.5x), AI Overviews on 14% of shopping queries.
  9. Similarweb (2026): Perplexity average order value 57% higher; AI platform usage and referral share data.
  10. University of Hamburg / Frankfurt School: ChatGPT referrals ~0.2% of ecommerce sessions across 973 sites; relative scale of organic search vs. LLM referrals.
  11. AvenueZ (2025): 25% of AI beauty citations from brand sites; visual UGC correlation with AI performance.
  12. Google / OpenAI announcements (2026): New product data attributes; OpenAI Instant Checkout deprecation, March 2026.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) shows exactly which products and prompts the engines name you in, where competitors are selected instead, and which feed, schema, and third-party gaps are keeping you off the AI shelf.

Live · FancyAI Research Corpus

Being Seen vs. Being Selected: How AI Decides Which Local Businesses to Recommend

Consumer use of AI to find local businesses jumped from 6% to 45% in a single year. But AI recommends only a fraction of the locations that win Google's map pack, and the business closest to the searcher no longer wins. Proximity got replaced by reputation, and most local owners are optimizing for a game that is no longer being played.

  • 45%: share of consumers now using AI to find local businesses, up from 6% a year ago (BrightLocal LCRS 2026)
  • 1.2%: share of locations ChatGPT recommends, vs. 35.9% in Google's local 3-pack (SOCi 2026 Local Visibility Index)
  • 0.001: correlation between distance and ranking position inside Google AI Overviews (Local Falcon)
  • 68%: accuracy of local business data on ChatGPT and Perplexity, vs. 100% on Gemini (SOCi)

Chapter 01

The map moved, and proximity stopped winning

For two decades, local search ran on one dominant variable: distance. You ranked in the map pack partly on relevance and reviews, but proximity to the searcher was the gravitational force. A plumber three blocks away beat a better plumber across town. The entire discipline of local SEO was built to optimize the signals around that core: accurate listings, category selection, review volume, and a tightly managed Google Business Profile.

That model is now breaking in the surface most consumers are migrating to. According to BrightLocal's Local Consumer Review Survey 2026, a representative panel of 1,002 US adults, the share of consumers using AI tools to find a local business climbed from 6% in 2025 to 45% in 2026. AI assistants have surged into third place for local discovery, behind only Google and Facebook, and Google's own share of local discovery dipped from 83% to 71% in the same year. The audience is moving, and it is moving fast.

When that audience asks an AI for a local recommendation, distance largely stops mattering. Local Falcon's whitepaper on Google AI Overviews, built on 60,000 real-world simulations across 4,423 businesses in 20 countries, found effectively no correlation between distance and ranking position inside AI Overviews. The correlation coefficient was 0.001. In a traditional local pack, distance is one of the strongest predictors of who shows up first. In AI answers, it is statistical noise.

In Google's local pack, proximity is the dominant ranking force. In AI Overviews the correlation between distance and rank is 0.001. The map collapsed into a recommendation, and the recommendation does not care how close you are.

Proximity has not vanished entirely. Local Falcon found a faint inclusion effect: businesses within roughly one mile appeared in AI Overviews 72.0% of the time versus 68.5% for those slightly farther out. Distance can still get you into the candidate pool. But once you are in, it does nothing to determine whether you are the business the AI names. Content authority and reputation take over completely. This is the single most important shift in local for owners to internalize: the nearest business no longer wins by default, and the better-documented one frequently does.

Chapter 02

Visibility is not selection, and the gap is enormous

Here is the structural problem. In classic local SEO, visibility and selection were nearly the same event. If you ranked in the 3-pack, you were seen, and being seen was most of the battle. AI severs the two. Being indexed, listed, and even ranked on Google no longer means an AI will name you.

The most comprehensive measurement of this gap is SOCi's 2026 Local Visibility Index, which analyzed nearly 350,000 locations across 2,751 multi-location brands. The headline numbers are stark. AI platforms recommended only 1.2% of locations on ChatGPT, 11% on Gemini, and 7.4% on Perplexity, compared with 35.9% appearing in Google's local 3-pack. SOCi's own summary: AI visibility is three to 30 times harder to achieve than ranking well in traditional local search.
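
The "three to 30 times" range is consistent with simply dividing the 3-pack rate by each platform's recommendation rate. The short check below works through that arithmetic using the figures quoted above; treating the multiple as a plain ratio is an assumption here, not SOCi's documented method.

```python
# Recommendation rates quoted above (percent of locations), per SOCi.
google_3pack = 35.9
ai_rates = {"ChatGPT": 1.2, "Gemini": 11.0, "Perplexity": 7.4}

# How many times rarer an AI recommendation is than a 3-pack appearance.
# Assumption: the "harder" multiple is this simple ratio.
ratios = {engine: round(google_3pack / rate, 1) for engine, rate in ai_rates.items()}

for engine, ratio in ratios.items():
    print(f"{engine}: {ratio}x")
```

Gemini sits at the low end of the range (about 3x) and ChatGPT at the high end (about 30x), with Perplexity in between at roughly 5x.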

The deeper finding is the overlap failure. Fewer than half of the brands leading in traditional local search also appeared among the most visible brands in AI results. In retail, only 45% of the top 20 brands by traditional local visibility overlapped with the top 20 in AI. Strong Google rankings did not carry over. Brands that had spent years winning the map pack discovered they were nearly invisible in the answer.

  • 1.2% of locations recommended by ChatGPT vs. 35.9% in Google's 3-pack (SOCi).
  • AI visibility is 3x to 30x harder to achieve than traditional local ranking (SOCi).
  • Only 45% of top-20 traditional retail brands also ranked top-20 in AI (SOCi).
  • Local Falcon found Google AI Overviews now appear in 40.2% of local business searches.

Being seen by Google is not being selected by the model. Fewer than half the brands winning traditional local search win in AI. The 3-pack you fought for does not transfer to the answer.

This is the visibility-versus-selection gap that defines the vertical, and it is wider in local than in any other category we have measured. A roofer can dominate the map pack in his city and never be mentioned when a homeowner asks ChatGPT who to call. The work that produced the first outcome does not produce the second.

Chapter 03

The signals that get a local business recommended

If proximity is dead and Google rank does not transfer, what does the AI actually use? Four signal classes do the heavy lifting, and they reorder the priorities most local owners are used to.

1. The Google Business Profile, as a data spine. The profile still matters, but its job changed. It is no longer a ranking lever you tune for the 3-pack. It is the structured source of truth that AI systems lean on to confirm a business exists, what it does, where it is, and how it is rated. Gemini achieves 100% business profile accuracy because it is grounded directly in Google Maps data, which is exactly why Gemini recommends local businesses at roughly nine times the rate of ChatGPT (11% vs. 1.2%). A complete, accurate, fully populated profile is now table stakes for being eligible at all.

2. Review volume, recency, and sentiment, as a filter not a gradient. This is the most misunderstood signal in local GEO. AI does not rank you by star count the way a human might scan a list. It uses reviews as a confidence threshold. SOCi found that locations recommended by ChatGPT averaged 4.3 stars, Gemini 3.9, and Perplexity 4.1. Businesses with ratings near 3.4 stars and review response rates below 5% were effectively invisible, excluded entirely rather than ranked lower. In traditional search, a 3.4-star business could still surface on proximity. In AI, it does not surface at all. Recency and response rate are part of the filter: BrightLocal found 97% of consumers still rely on reviews, 41% now "always" read them (up from 29%), and slow or generic review responses are increasingly read as a red flag.

3. Directory and citation consistency, as a trust signal. The model cross-references your information across the web, and contradictions erode confidence. Business profile data is only 68% accurate on ChatGPT and Perplexity versus 100% on Gemini (SOCi), and that accuracy gap directly explains the recommendation gap between platforms. Consistency is measurable and it pays: Yext's analysis of 6.9 million citations found meaningful correlation between unstructured citation volume and AI search visibility, and Yext data from 620,000 locations showed those syncing 50 to 75% of a 200-plus publisher network saw a 95% average increase in website clicks from Google. NAP consistency is no longer hygiene. It is the difference between a model that trusts your data and one that hedges.

4. Third-party "best of [city]" lists and editorial. When an AI answers "best tacos in Austin," it defers to the publishers humans already trust for that judgment. A placement in a credible local roundup, a city magazine's annual list, or a respected industry directory is worth more than any amount of on-site copy claiming you are the best. The model treats independent editorial as verification it cannot get from your own domain.

AI does not rank reviews. It uses them as a gate. Below the sentiment threshold you are not ranked lower, you are excluded. A 3.4-star business that survives on proximity in the map pack simply disappears in the answer.

Notice the reordering. Two of the four signals, citations and editorial, live almost entirely off your own property. The local owner's job has shifted from tuning a profile he controls to shaping a consensus he can influence but not author.

Chapter 04

Why local GEO is not local SEO

The temptation is to treat this as local SEO with extra steps. It is not. The two disciplines share inputs but optimize for different mechanics, and confusing them is why so many well-ranked businesses are invisible in AI.

Classic local SEO accumulates signals to win a proximity-weighted ranking. You pile up reviews, citations, and category relevance, and Google sorts the nearby candidates. Volume and distance do real work. Local GEO is a qualification problem, not an accumulation problem. The AI is not sorting a list of nearby options by score. It is deciding whether your business is a legitimate, well-documented answer to a specific question, then naming a small set of winners.

Three concrete differences follow:

  • Proximity is a ranking factor in SEO and a near-irrelevance in GEO. Distance correlation with AI Overview rank is 0.001 (Local Falcon). The map pack rewards the closest acceptable option. The answer rewards the best-documented one.
  • Reviews are a gradient in SEO and a filter in GEO. More and better reviews lift you in the pack. In AI, below-threshold sentiment removes you entirely (SOCi). The first dollar of review work, getting above the bar, is worth more than the hundredth.
  • Query patterns invert. Local Falcon found that queries without a location name are 11.1 percentage points more likely to trigger an AI Overview than queries with one (46.1% vs. 35.0%), and longer, informational queries trigger AI far more often (queries of 100-plus characters at 60.5% vs. 19.7% for short queries). The detailed, conversational, "near me who can fix X today" question is exactly where AI intervenes, and exactly the kind of query classic local SEO never optimized for.
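
The gradient-versus-filter distinction in the list above can be sketched as two toy selection functions. Everything here is illustrative: the businesses are invented, the thresholds (`min_stars`, `min_response_rate`) are placeholder parameters rather than published cut-offs, and real engines do not expose their logic in this form.

```python
from dataclasses import dataclass


@dataclass
class Business:
    name: str
    stars: float
    response_rate: float  # share of reviews answered, 0.0 to 1.0
    distance_miles: float


def pack_rank(businesses):
    """Gradient: every business is ranked; closer wins, rating breaks ties."""
    return sorted(businesses, key=lambda b: (b.distance_miles, -b.stars))


def ai_candidates(businesses, min_stars=3.5, min_response_rate=0.05):
    """Gate: below-threshold businesses are dropped entirely; distance is ignored.

    Both thresholds are hypothetical values chosen for the example.
    """
    eligible = [
        b for b in businesses
        if b.stars >= min_stars and b.response_rate >= min_response_rate
    ]
    return sorted(eligible, key=lambda b: -b.stars)


# Hypothetical plumbers: the nearest one has weak reviews and rarely responds.
businesses = [
    Business("Corner Plumbing", stars=3.4, response_rate=0.02, distance_miles=0.3),
    Business("Crosstown Plumbing", stars=4.6, response_rate=0.80, distance_miles=6.0),
    Business("Midway Plumbing", stars=4.2, response_rate=0.40, distance_miles=2.5),
]

print([b.name for b in pack_rank(businesses)])
print([b.name for b in ai_candidates(businesses)])
```

In the first list the nearest business leads despite its rating. In the second it is absent, which is the "excluded, not ranked lower" behavior the section describes.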

That last point cuts both ways and creates a defensive opportunity. Because location-named queries trigger AI Overviews less often, businesses that genuinely dominate "[service] [city name]" searches retain more traditional traffic. The exposure to AI is highest in the service categories where consumers research before buying. Local Falcon found the highest AI Overview trigger rates in window cleaning (65.0%), carpet cleaning (64.6%), personal injury attorneys (62.1%), and wedding photographers (61.8%). These are considered purchases with real research behind them, which is precisely where the AI inserts itself between the consumer and the business.

Local SEO accumulates signals to win a proximity-sorted list. Local GEO qualifies you as a legitimate answer, then names a winner. One rewards being close and abundant. The other rewards being documented and trusted.

Chapter 05

The structural advantage local businesses are sleeping on

There is good news buried in the disruption, and most owners are missing it. Local businesses may hold a structural advantage over national chains in GEO, for reasons that have nothing to do with budget.

A neighborhood business is already hyperlocal. It understands its market, knows the specific questions its customers ask, and can move in days rather than the months a national brand needs to push a change through committee. More importantly, it has raw material a chain lacks: genuine specialty expertise, named local projects, real customer stories, and the kind of specific, verifiable detail the model prefers over generic national copy. A single coffee shop that explains its roast, sourcing, and brewing method gives the AI more to quote than a chain with a thousand interchangeable listings and generic reviews.

The multi-location finding sharpens this. SOCi's data showed that strong performers were not the biggest brands but the most disciplined ones. Culver's, a restaurant chain, beat category benchmarks with 30.0% ChatGPT and 45.8% Gemini recommendation rates, driven by strong ratings and complete profiles. Liberty Tax, after improving profile coverage, ratings, and data accuracy, reached 68.3% Google 3-pack visibility alongside 19.2% Gemini and 26.9% Perplexity recommendation rates. Meanwhile, brands with strong traditional rankings but weaker data, including some household names, underperformed badly. Execution beat size.

A national brand has scale. A local business has specificity, speed, and genuine expertise. In a system that rewards documented substance over generic abundance, the small operator is not the underdog. The undocumented one is.

For multi-location operators the risk is the inverse of the opportunity. Each location needs consistent, accurate data across every platform the AI touches, and a single location with bad data or a sub-threshold rating can drag the brand's overall trustworthiness. Consistency at scale is the hard part, and it is where most multi-location brands lose.

Chapter 06

The execution playbook

The work divides cleanly by business type, but the foundation is shared. None of it is optional once 45% of your customers are asking an AI before they ask you.

Shared foundation: become the most trustworthy data on the web.

  • Complete and verify the Google Business Profile fully. Every field, accurate hours, correct categories, real photos, populated attributes. This is the spine Gemini and AI Overviews ground themselves in. ChatGPT and Perplexity, which are not grounded in it directly, get local business data wrong 32% of the time (SOCi).
  • Fix citation consistency across the directory ecosystem. Identical name, address, and phone across Google, Bing Places, Apple Maps, Yelp, and the major directories. Contradictions make the model hedge, and hedging means exclusion. Yext's 6.9M-citation data ties citation volume and consistency directly to AI visibility.
  • Get above the review threshold, then stay recent and responsive. The first job is clearing the sentiment gate, roughly the 4-star band where AI starts recommending. Solicit reviews continuously so recency stays fresh, and respond to all of them. Response rates below 5% correlate with invisibility (SOCi); slow responses are now a consumer red flag (BrightLocal).
  • Add LocalBusiness schema, matched to your subtype. Restaurant, AutoRepair, Attorney, and the rest, with name, address, phone, and hours that match every citation exactly. When schema contradicts your other data, Google discounts it entirely, so consistency beats cleverness.
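
The citation-consistency and schema steps above can be combined in one small sketch: compare each directory listing against a canonical name, address, and phone record, then emit LocalBusiness markup from that same record. The business, the directory data, and the helper names are all hypothetical, the phone normalization assumes 10-digit US numbers, and the comparison deliberately forgives punctuation and case only.

```python
import json
import re


def _squash(text):
    """Lowercase and drop punctuation so formatting alone is not a mismatch."""
    return " ".join(re.sub(r"[^a-z0-9 ]", "", text.lower()).split())


def nap_key(name, address, phone):
    """Comparable (name, address, phone) triple; keeps the last 10 phone digits."""
    return (_squash(name), _squash(address), re.sub(r"\D", "", phone)[-10:])


def inconsistent_listings(canonical, listings):
    """Return the sources whose NAP differs from the canonical record."""
    target = nap_key(*canonical)
    return [source for source, nap in listings.items() if nap_key(*nap) != target]


def local_business_jsonld(subtype, name, street, city, region, postal, phone, hours):
    """Build schema.org markup for a LocalBusiness subtype such as AutoRepair."""
    return {
        "@context": "https://schema.org",
        "@type": subtype,
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal,
        },
        "telephone": phone,
        "openingHours": hours,
    }


# Hypothetical canonical record and directory listings.
biz = {"name": "North Loop Auto Repair", "street": "410 Example Ave",
       "city": "Minneapolis", "region": "MN", "postal": "55401",
       "phone": "(612) 555-0142"}
canonical = (biz["name"],
             f'{biz["street"]}, {biz["city"]}, {biz["region"]} {biz["postal"]}',
             biz["phone"])
listings = {
    "Google Business Profile": canonical,
    "Yelp": ("North Loop Auto Repair", "410 Example Ave., Minneapolis MN 55401", "612-555-0142"),
    "Bing Places": ("North Loop Auto", "410 Example Ave, Minneapolis, MN 55401", "(612) 555-0199"),
}

mismatches = inconsistent_listings(canonical, listings)
schema = local_business_jsonld("AutoRepair", biz["name"], biz["street"], biz["city"],
                               biz["region"], biz["postal"], biz["phone"], "Mo-Fr 08:00-17:00")
print("Fix these listings:", mismatches)
print(json.dumps(schema, indent=2))
```

Here the Yelp listing passes because it differs only in punctuation, while the Bing Places listing is flagged for a shortened name and a different phone number. A stricter check could compare the raw strings instead.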

Single-location businesses: out-specify, don't out-spend.

  • Document specific local expertise the model can quote: named projects, regional content ("winterizing pipes in Minnesota"), transparent pricing for common jobs, before-and-after results, and staff credentials.
  • Win the third-party "best of [city]" placements and local press that AI defers to. One credible editorial mention outweighs a page of self-description.
  • Write for the long, conversational, informational queries that trigger AI most, in plain question-and-answer form.

Multi-location brands: enforce consistency, then differentiate locally.

  • Centralize location-data management so every profile is complete and identical in its core facts across every platform. One bad location erodes brand-level trust.
  • Lift the laggards. SOCi's evidence is that improving the weakest locations' ratings and data accuracy moves the brand's whole AI visibility, as Liberty Tax demonstrated.
  • Give each location genuine local substance, not boilerplate. Localized content beats templated pages the model reads as thin and duplicative.

Everyone: measure selection, not just rankings.

Classic rank tracking tells you where you sit in the map pack. It cannot tell you whether ChatGPT, Gemini, or Perplexity named you when a customer asked, which is the event that now precedes the visit. You need to track share of voice in AI answers, the local prompts you appear and do not appear in, and which competitors get recommended when you do not.

The old job was ranking near the searcher. The new job is being the answer the engine trusts enough to name. Proximity was a placement you earned with distance and volume. Selection is a reputation you earn with accuracy, sentiment, and proof.

The local businesses winning this transition are not the closest or the biggest. They are the ones whose data is the most consistent, whose reviews clear the gate and stay fresh, and whose expertise is documented specifically enough for a machine to repeat. The map moved. The question is whether the engine trusts you enough to recommend you when the customer never sees the map at all.

Sources cited

  1. BrightLocal Local Consumer Review Survey 2026 (1,002 US adults): AI local discovery up from 6% to 45%; Google share down 83% to 71%; 97% of consumers rely on reviews, 41% "always" read them; review-response speed as a red flag.
  2. SOCi 2026 Local Visibility Index (349,000+ locations, 2,751 multi-location brands): Recommendation rates (ChatGPT 1.2%, Gemini 11%, Perplexity 7.4% vs. 35.9% 3-pack); 3x–30x harder than traditional ranking; 68% vs. 100% data accuracy; review-threshold filter and star averages (4.3 / 3.9 / 4.1); Culver's and Liberty Tax case data; 45% top-20 retail overlap.
  3. Local Falcon AI Overviews Whitepaper (60,000 simulations, 4,423 businesses, 20 countries): AI Overviews in 40.2% of local searches; distance-rank correlation 0.001; inclusion effect (72.0% vs. 68.5%); location-name and query-length trigger effects (46.1% vs. 35.0%; 60.5% vs. 19.7%); service-category trigger rates (window cleaning 65.0%, carpet cleaning 64.6%, personal injury 62.1%, wedding photography 61.8%).
  4. Yext Research (6.9M citations; 620,000 locations): Citation volume and consistency correlate with AI search visibility; publisher-network sync drove 95% average increase in Google website clicks.
  5. Search Engine Land (2026): Reporting on the SOCi Local Visibility Index and AI-versus-traditional local visibility gap.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) shows exactly which local prompts the engines name you in, where competitors get recommended instead, and which profile, review, citation, and editorial gaps are keeping you out of the answer.


GEO for Healthcare: How AI Picks Its Sources When the Answer Could Hurt Someone

In every other vertical, AI weighs visibility. In health it weighs liability. The authority bar is higher, the citation pool is narrower, and a single hospital with fifty years of clinical expertise can be entirely invisible to the model deciding what a patient does next.

  • 32%: share of U.S. adults who used an AI chatbot for health information in the past year (KFF, 2026)
  • ~39%: share of health-domain AI Overview citations going to a single source, NIH (Surfer, 46M citations)
  • 49.6%: share of AI chatbot answers to health questions judged "problematic" in a BMJ Open audit
  • 44.1%: AI Overview trigger rate for medical queries, the highest of any YMYL category (SE Ranking data, Nov 2025)

Chapter 01

The trust tax: why health is the hardest place to be selected

Generative engines do not treat all questions the same. Ask for a dinner recipe and the model will improvise from a wide pool of sources. Ask whether a chest pain warrants an ER visit and the model behaves differently. It narrows. It hedges. It reaches for institutions it can defend.

This is the YMYL effect — Your Money or Your Life — and health sits at its center. Google's own Search Quality Rater Guidelines flag health topics as the category where "low-quality content could potentially harm a person's health," and the AI systems built on top of search inherit that posture. The data shows it plainly. Medical queries trigger AI Overviews 44.1% of the time, the highest rate of any YMYL category, according to SE Ranking data from November 2025. Other audits put the trigger rate for broad health queries above 82%. Whichever number you use, the conclusion is the same: when a patient asks a health question, an AI answer is now the default, not the exception.

That default is consequential because of who is asking. 32% of U.S. adults, about one in three, turned to an AI chatbot for health information in the past year, per the KFF Tracking Poll on Health Information and Trust (fielded February 24 to March 2, 2026, n=1,343). KFF found 29% used AI for physical-health information and 16% for mental-health information. The overall share now equals the share of adults who use social media for health information. A Pew Research Center survey from late 2025 put the figure lower, around 2 in 10 adults, but both surveys point the same way: adoption is rising quickly.

"Most who turned to AI for health information say they were in search of quick and immediate advice, though challenges affording and accessing health care also play a role, particularly for younger adults." (KFF Tracking Poll on Health Information and Trust, 2026)

The behavioral shift matters more than the raw adoption number. Patients are not arriving at AI to confirm a decision they already made. They are arriving at the symptom-awareness stage, before they have chosen a provider, a treatment, or a brand. The AI answer is increasingly the first frame of the entire care journey. Whoever the model cites at that moment shapes everything downstream.

And the model is selective about whom it cites. That selectivity is the whole game.

Chapter 02

The citation pool is narrow, and it is owned by institutions

In most verticals, the question for a brand is "can I get into the answer." In health, the more honest question is "can I get into a citation set that is already dominated by a handful of institutions and aggregators."

The concentration is stark. Surfer analyzed 46 million citations across 36 million AI Overviews between March and August 2025. In the health domain, NIH alone commanded roughly 39% of citations, with Healthline at ~15%, Mayo Clinic at ~14.8%, Cleveland Clinic at ~13.8%, and ScienceDirect at ~11.5%. Read those numbers together: a small set of government bodies, peer-reviewed repositories, and national medical institutions absorb the overwhelming majority of health citations before anyone else competes for what remains.

ChatGPT shows the same pattern from a different angle. A January 2026 framework study posted to arXiv ("Authority Signals in AI Cited Health Sources," Dwivedi et al.) analyzed 615 sources cited in ChatGPT health responses and found over 75% came from established institutional sources — Mayo Clinic, Cleveland Clinic, Wikipedia, the NHS, PubMed. The remaining quarter came from sources lacking established institutional backing. The study classified sources across four authority domains: Institutional Affiliation, Author Credentials, Quality Assurance, and Digital Authority. Those four domains are, in effect, the rubric the model applies to decide who is trustworthy enough to repeat.

Perplexity behaves consistently with this. It surfaces a tight, visible citation set and favors recognized clinical sources — Mayo Clinic, Cleveland Clinic, NIH, CDC in the US; NHS, NICE, and CQC guidance in the UK — re-ranking thin or shallow content out of the answer even when it contains the right keywords.

The structural lesson for everyone who is not the NIH:

  • Aggregators and institutions win on coverage and structure, not warmth. A 2026 healthcare analysis found AI cites health aggregators and government institutions over individual hospitals "by a massive margin." A regional health system with deep clinical expertise can be invisible while Healthline answers the question for it.
  • The pool is self-reinforcing. Authoritative sources earn more citations, which raises their authority, which earns more citations. SE Ranking and ContentWriters both describe this as a closing door for newcomers in YMYL.
  • Source alignment with traditional search is weak. Only about 36% of AI-cited health pages appear in Google's top 10 organic results, per a 2026 analysis, and YouTube ranks first in AI health citations despite ranking 11th organically. Winning the blue links no longer guarantees winning the answer.

"Clinical expertise that took decades to build becomes invisible when AI systems can't verify, extract, or cite it." (upGrowth, "Provider vs Aggregator: Who AI Cites for Healthcare," 2026)

This is the contrast that defines healthcare GEO. The work is not ranking against a list of competitors. It is becoming the kind of verifiable, structured, institutionally credible entity that the model is willing to place inside a high-stakes answer at all.

Chapter 03

The accuracy problem is your brand-safety problem

Health GEO carries a risk that no other vertical does at this magnitude: the model can get it wrong, and your name can be attached to the wrong thing.

The accuracy data is sobering. A BMJ Open audit published April 14, 2026 tested five leading chatbots — ChatGPT, Gemini, Grok, Meta AI, and DeepSeek — against 250 health questions spanning cancer, vaccines, stem cells, nutrition, and athletic performance. 49.6% of the answers were judged "problematic," and 19.6% were "highly problematic" or potentially harmful. No chatbot produced a fully accurate reference list. Many answers drifted from scientific consensus or used "hedging language that provided false balance between scientific and non-scientific information."

A second, quieter failure mode is the citation-support gap. Research from 2026 found that AI medical answers were frequently not fully supported by the very sources they cited. The system attaches a reputable URL, but the generated text misinterprets, oversimplifies, or contradicts it, manufacturing the appearance of credibility while leading the reader astray. For a brand, this is the dangerous scenario: your page is cited, but the summary around it distorts your guidance, and the patient never clicks through to see the original.

Newer models are improving on narrower benchmarks. OpenAI's HealthBench — a physician-curated set of 5,000 realistic clinical conversations graded against 48,562 clinician-developed criteria, built with 262 clinicians across 60 countries — shows meaningful gains generation over generation. GPT-5 with thinking mode reportedly cut hallucination on HealthBench to 1.6%, against 15.8% for GPT-4o. But HealthBench Hard remains genuinely hard, with frontier scores in the 0.40 to 0.46 range, and PMC reviewers have cautioned that strong benchmark performance is "not yet clinically ready." The gap between a controlled benchmark and a patient typing a panicked question at 2 a.m. is wide.

Patients sense this. KFF found that only about a third of adults have a "great deal" or "fair amount" of trust in AI for health information, and 56% are not confident they can tell true from false in AI health answers. About three-quarters are concerned about the privacy of the health information they share with these tools.

Most adults cannot distinguish accurate from inaccurate AI health answers. That makes the source the model chooses to cite the de facto safety mechanism — and a brand-safety exposure for every name attached to it.

The brand-safety implication is direct. When AI misattributes a claim to your organization, or summarizes your clinical content into something you would never have published, the reputational and regulatory exposure is real. The defensive move is the same as the offensive one: publish content so clear, so structured, and so authoritatively sourced that the model has little room to distort it, and monitor relentlessly for where it does.

Chapter 04

The signals that make you eligible

Eligibility in health GEO is earned through signals the model can verify. Marketing language does not qualify you. Verifiable authority does. Four signal categories matter most.

1. Institutional authority and entity clarity. AI cites entities it recognizes and trusts. That means a well-defined brand entity — consistent name, specialties, locations, and affiliations across your own site, Wikipedia, Wikidata, news, and recognized directories. The Authority Signals framework names "Institutional Affiliation" and "Digital Authority" as two of its four credibility domains. Being listed on verified healthcare directories (in the UK: Doctify, Top Doctors, CQC; in the US: comparable verified provider listings) measurably increases the likelihood of citation in Perplexity, per UpMedico's analysis.

2. Named, credentialed medical review. The single highest-leverage on-page signal is content written or reviewed under a named, credentialed clinician, ideally linking to that clinician's credential page. This maps directly to Google's E-E-A-T — Experience, Expertise, Authoritativeness, Trustworthiness — which functions as the quality gate for YMYL content and, increasingly, the filter AI uses to select sources. An authority-gap analysis of practices with poor AI visibility found that 78% had no physician credentials on content pages, 82% had no quantifiable patient success metrics, and 89% had no published research or speaking engagements (InfluxMD). Those absences are not cosmetic; they are the missing signals that disqualify a site from the citation pool.

3. Structured, machine-readable clinical content. AI cannot cite what it cannot parse. MedicalOrganization, Physician, MedicalCondition, and FAQPage schema let engines extract who you are, what you treat, who your clinicians are, and what your credentials are without guessing. Audits found 73% of practices with poor AI visibility had inadequate medical schema, 68% failed Core Web Vitals, and 54% blocked at least one AI crawler (InfluxMD). Format matters too: question-optimized, conversational Q&A content earned a 43% citation rate versus 12% for keyword-optimized content, and direct answers of roughly 45 to 80 words perform best for FAQ-style health content (Intrepy).

4. Third-party validation and consensus. The model reads the wider web, not just your site. Peer-reviewed references, citations from credible sources, recognized media mentions, and genuine review volume marked up with schema all raise eligibility. The Princeton GEO study found that adding citations from credible sources produced a 115.1% visibility improvement for a fifth-ranked page; direct quotations added 30 to 40%. The inverse is equally important: in wellness and other YMYL health categories, keyword stuffing, mass influencer seeding, and coordinated PR syndication now actively hurt visibility, because LLMs detect synthetic hype and exclude brands that spike unnaturally (So Sloane).
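As a rough illustration of the machine-readable content described in signal 3, the sketch below assembles schema.org JSON-LD for an organization and one FAQ entry. The type and property names (MedicalOrganization, FAQPage, Question, acceptedAnswer, sameAs, medicalSpecialty) come from the schema.org vocabulary. The clinic name, URLs, question text, and helper function names are hypothetical placeholders chosen for this example, not a prescribed implementation.

```python
import json


def build_org(name, url, same_as, specialty):
    """Return a schema.org MedicalOrganization node as a plain dict."""
    return {
        "@type": "MedicalOrganization",
        "@id": url + "#org",
        "name": name,
        "url": url,
        "sameAs": same_as,              # directory or profile URLs for the same entity
        "medicalSpecialty": specialty,  # a schema.org MedicalSpecialty value
    }


def build_faq(url, qa_pairs):
    """Return a schema.org FAQPage node from (question, answer) pairs."""
    return {
        "@type": "FAQPage",
        "@id": url + "#faq",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def to_jsonld(nodes):
    """Serialize nodes into one JSON-LD document with a shared @context."""
    return json.dumps({"@context": "https://schema.org", "@graph": nodes}, indent=2)


# Hypothetical placeholder values, for illustration only.
SITE = "https://clinic.example"
org = build_org(
    "Example Heart Clinic",
    SITE,
    ["https://directory.example/example-heart-clinic"],
    "https://schema.org/Cardiovascular",
)
faq = build_faq(SITE, [("What is atrial fibrillation?", "Placeholder answer text.")])
jsonld = to_jsonld([org, faq])
print(jsonld)
```

The output would typically be embedded in a script tag of type application/ld+json on the relevant page. Which properties a given engine actually reads is not established by the sources above, so treat the property selection as a starting point.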

"AI engines rank clinics based on how well they explain, not just how well they advertise." (UpMedico, 2025)

This is the inversion every health marketer needs to internalize. Tactics that produced lift in classic SEO — keyword density, link velocity, promotional tone — are neutral or negative in health GEO. 35% of GEO failures stem from treating it like traditional SEO (InfluxMD). Eligibility comes from proof, not promotion.

Chapter 05

Pharma, payers, and the regulatory edge

For pharmaceutical brands, digital-health companies, and payers, health GEO collides with regulation in a way no other vertical experiences.

Pharma faces a structural opening and a structural hazard at once. The opening: AI Overviews do not simply repeat the top-ranking page; they surface the best-structured, most authoritative snippet, which lets a well-structured brand source leapfrog older rankings on questions like how a drug works or how two therapies differ. The hazard: the same answers can pull from outdated, off-label, or competitor content, and AI misquoting a brand claim creates real legal exposure.

The regulatory environment tightened sharply in 2025 and 2026. The FDA sent thousands of letters to pharmaceutical companies in September 2025 demanding the removal of misleading ads, issued roughly 100 cease-and-desist letters, and began using AI-enabled tools to proactively review drug advertising. The agency's posture is unambiguous: if a communication is false or misleading, the FDA does not care that a machine drafted it. In April 2026, the FDA issued its first cGMP warning letter citing inappropriate AI use in drug manufacturing. The FDA and EMA still have not issued clear guidance on how AI-generated summaries intersect with promotional content, which leaves brands exposed to a gap they must manage proactively. Many are now routing content through revised MLR (Medical, Legal, Regulatory) review and running LLM simulators during development to test how models interpret their messaging before publication.

Payers and health insurers sit in a parallel YMYL pressure zone. AI tools are increasingly used to compare plans by ZIP code, budget, and preference, and the engines apply extreme scrutiny, favoring established comparison sources and government sites like Healthcare.gov. Stanford researchers warned in January 2026 that limited transparency in AI-based insurance decisions could lead to wrongful care denials and worsen existing disparities — a reminder that visibility without accuracy is not a win in this category.

Digital-health and telehealth brands face the eligibility problem without the geographic anchor that local practices use for "near me" queries. They must compete on structured, credible, conversational content around symptoms, state licensing, insurance acceptance, and access. And trust signals now extend to security: a telehealth company flagged for a data breach saw its trust signals and search visibility damaged, evidence that data transparency is directly tied to AI eligibility in health.

The throughline across pharma, payers, and digital health: the regulatory bar and the AI authority bar now point the same direction. Verifiable, accurate, well-governed content is simultaneously the compliance requirement and the visibility strategy.

Chapter 06

The execution playbook

Health GEO is buildable. The work is unglamorous and verifiable, which is exactly why it compounds. Sequence it in three phases.

Phase 1 — Build the verifiable foundation (weeks 1 to 6).

  • Establish entity clarity: consistent name, specialties, locations, and affiliations across your site, Wikipedia/Wikidata, recognized directories, and major news. The model has to know who you are before it can trust you.
  • Implement MedicalOrganization, Physician, MedicalCondition, and FAQPage schema across core pages so engines parse your clinical reality without guessing.
  • Fix the crawl: unblock legitimate AI crawlers, pass Core Web Vitals, and render critical medical content without heavy JavaScript dependence.
  • Put named, credentialed clinician bylines and linked credential pages on every clinical page. This is the highest-leverage single fix given how many sites lack it.
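The "fix the crawl" step can be spot-checked with a short script. The sketch below uses Python's standard urllib.robotparser to test whether a robots.txt file blocks a set of AI user-agent tokens for a given page. The token list (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) is an assumption about which agents matter to you, and the sample robots.txt and URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed list of AI-related user-agent tokens; adjust to the engines you care about.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_crawlers(robots_txt, page_url, agents=AI_CRAWLERS):
    """Return the agents that robots_txt disallows from fetching page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, page_url)]


# Hypothetical robots.txt that blocks one AI crawler and allows everything else.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

blocked = blocked_crawlers(
    SAMPLE_ROBOTS, "https://clinic.example/conditions/atrial-fibrillation"
)
print(blocked)  # ['GPTBot']
```

This checks only robots.txt rules. Blocks applied at a CDN or firewall, and content that renders only through JavaScript, need separate tests.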

Phase 2 — Build authority and consensus (weeks 6 to 16).

  • Add real proof to content pages: procedure volumes, years in practice, teaching roles, publications, media mentions, and quantifiable outcomes. These are the exact signals the authority-gap data shows are missing.
  • Reference credible sources directly. The Princeton data shows citations and direct quotations are among the strongest levers for getting pulled into AI answers.
  • Earn third-party validation: verified directory listings, legitimate reviews marked up with schema, and genuine earned media. Do not manufacture it — coordinated hype is detected and penalized in YMYL.

Phase 3 — Engineer answers and govern risk (ongoing).

  • Restructure content into conversational Q&A with direct 45-to-80-word answers to the questions patients actually ask, not keyword headers.
  • For pharma and regulated brands, route AI-facing content through MLR review and test it against LLM simulators before publishing.
  • Monitor continuously. Track which questions surface you, which competitors dominate, and — critically — where the model misquotes or misattributes your content, so you can correct the source material that is feeding the error.
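The 45-to-80-word guidance for direct answers is easy to audit automatically. The following is a minimal sketch that counts whitespace-separated words per answer and flags anything outside the range. The function name, the sample questions, and the filler answer text are hypothetical, and the word-splitting rule is a simplification.

```python
MIN_WORDS, MAX_WORDS = 45, 80  # answer-length range cited in the text above


def audit_answers(faq):
    """Map each question to (word_count, verdict) for its direct answer."""
    report = {}
    for question, answer in faq.items():
        count = len(answer.split())  # simple whitespace word count
        if count < MIN_WORDS:
            verdict = "too short"
        elif count > MAX_WORDS:
            verdict = "too long"
        else:
            verdict = "ok"
        report[question] = (count, verdict)
    return report


# Hypothetical content: a 60-word filler answer and a 3-word answer.
sample_faq = {
    "What is atrial fibrillation?": " ".join(["word"] * 60),
    "How is it treated?": "See your doctor.",
}

report = audit_answers(sample_faq)
for question, (count, verdict) in report.items():
    print(f"{verdict:9} {count:3} words  {question}")
```

A length check says nothing about clinical accuracy, so it complements clinician review and MLR review and does not replace either.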

The economic case for doing the work is strong where it can be measured. AI search visits to healthcare grew 42.8% year over year, from 15.6 billion in Q1 2025 to 27.4 billion in Q1 2026, and AI-sourced healthcare leads have been reported to convert at multiples of traditional organic traffic. Those conversion multiples come from agency datasets and should be treated as directional rather than universal, but the direction is consistent across sources: AI-referred patients arrive warmer because they trust the answer that sent them.

The healthcare brands that win in AI are not the loudest. They are the most verifiable. In a domain where the model is built to avoid harm, provable authority is the only currency that buys a citation.

Sources cited

  1. KFF Tracking Poll on Health Information and Trust (Feb 24–Mar 2, 2026, n=1,343): AI health adoption (32%), trust levels, and the finding that 56% cannot distinguish true from false AI health answers.
  2. Surfer "AI Citation Report" (46M citations across 36M AI Overviews, Mar–Aug 2025): health-domain citation concentration (NIH ~39%, Healthline ~15%, Mayo Clinic ~14.8%, Cleveland Clinic ~13.8%).
  3. SE Ranking YMYL keyword tracking (cited via ContentWriters), Nov 2025: medical AI Overview trigger rate of 44.1%, highest among YMYL categories.
  4. BMJ Open chatbot accuracy, referencing, and readability audit (Apr 14, 2026): 49.6% of answers problematic, 19.6% highly problematic, across five leading chatbots and 250 health questions.
  5. Dwivedi et al., "Authority Signals in AI Cited Health Sources" (arXiv 2601.17109, Jan 2026): 615-source analysis; 75%+ institutional citation share; four-domain authority framework.
  6. OpenAI HealthBench and HealthBench Professional: physician-curated benchmark (5,000 cases, 48,562 criteria, 262 clinicians); GPT-5 hallucination and HealthBench Hard scores.
  7. Pew Research Center (Oct 2025): approximately 2 in 10 U.S. adults using AI tools for health information.
  8. Princeton GEO study (Aggarwal et al.): citation lift (115.1% for a fifth-ranked page) and quotation lift (30–40%).
  9. InfluxMD authority-gap and technical audit data: missing-credential (78%), missing-metric (82%), schema (73%), and crawler-blocking (54%) findings; 43% vs 12% citation rates for question- vs keyword-optimized content.
  10. Intrepy Healthcare Marketing (2026): two-layer AI ranking logic, FAQ answer-length guidance, and local-vs-informational query behavior.
  11. UpMedico (2025): Perplexity re-ranking behavior and the role of verified directories in health citations.
  12. upGrowth / Pharma Marketing Network / Stanford / FDA reporting (2025–2026): provider-vs-aggregator citation gap, pharma AI Overview dynamics, FDA enforcement actions, and AI-driven insurance risk.

Want this measured against your brand?

The AI Readiness Index (ARI) scores how AI engines see your health brand — which questions surface you, which institutions outrank you, and where the model misquotes your clinical content. In a YMYL domain, that visibility map is the difference between being cited and being invisible.


The Shortlist Forms Before the Demo: How AI Decides Which Software to Recommend

Half of B2B software buyers now begin their search inside an AI chatbot, and the winning vendor is already on the buyer's shortlist 95% of the time before a single sales call. The new battleground isn't your pricing page. It's whether the machine names you when a buyer asks for the best tool in your category.

  • 51%: share of B2B software buyers who start research in an AI chatbot more often than Google (G2, 2026)
  • 95%: share of deals in which the winning vendor was already on the buyer's Day One shortlist (6sense)
  • 69%: share of buyers who chose a different vendor than planned based on AI chatbot guidance (G2, 2026)
  • 44%: share of benchmarked SaaS companies that are functionally invisible to AI buyers (DerivateX, 2026)
Chapter 01

The shortlist is the product

For two decades, the SaaS go-to-market motion ran on a clean funnel. Generate awareness, capture a lead, nurture it through gated content, route it to sales, run the demo, close. Every stage had a metric, every metric had an owner, and the website was the hub the whole machine turned around. That funnel is being compressed into a single, decisive moment that happens before any of those metrics fire.

In 2026, 51% of B2B software buyers say they start their research with an AI chatbot more often than with Google, according to G2's Answer Economy report. That is up from 29% in April 2025, a 22-point gain that comes close to doubling the share in about twelve months. 71% now rely on an AI chatbot somewhere in the software research process, and just 3% say AI hasn't meaningfully changed how they research. The behavior is no longer early-adopter. It is the default.

What the buyer does with that AI is the part that should reorder every SaaS marketing roadmap. They do not ask the model to send them to a website. They ask it to decide. They type "best customer support platform for a 40-person fintech" and the engine returns a shortlist of three to five named tools, with reasoning attached. The buyer treats that shortlist as the starting point, not a suggestion to be verified.

The question is no longer whether a buyer can find you. It is whether the machine names you when someone asks for the best tool in your category. If it doesn't, you were never in the running.

The hard data on what that shortlist means is unambiguous. 6sense's analysis of nearly 4,000 B2B buyers found that the winning vendor was already on the buyer's Day One shortlist 95% of the time. Buyers fill roughly four shortlist spots on the first day of the journey and purchase from that initial set 85% to 95% of the time. The consideration set was always sticky. AI just moved its formation earlier and put a machine in charge of populating it.

This is the inversion at the center of this report. The demo used to be where the deal was won. Now the demo is a confirmation ritual for a decision the AI already shaped. If you are not on the shortlist the engine assembles, you do not get the demo, you do not get the objection-handling, and you do not get the chance to out-sell a better-positioned competitor. Being seen by Google was never the goal. Being selected by the model is.

Chapter 02

How AI builds a software shortlist

To win the shortlist you have to understand how it is assembled. AI engines do not rank your category page and serve the top result. They synthesize a recommendation by cross-referencing what you say about yourself against what the rest of the internet says, then they name the tools where those two stories agree.

Backlinko's analysis of SaaS in AI search frames the mechanic as two forces: consensus (many credible places describe your product the same way) and consistency (the details match everywhere they appear). When ChatGPT answers "best project management tool for small teams," it is not consulting a ranking. It is checking whether a tool's self-description on its homepage matches its Capterra listing, its Reddit threads, and the "best of" listicles that map the category, then surfacing the tools where that description is dense, repeated, and undisputed.

This is why Asana routinely appears in AI project-management answers: its core description is consistent across its website, Capterra, Reddit, and editorial roundups, so the model treats it as a settled fact of the category. It is also why review-platform presence functions as a gate. Quoleady's 2026 research analyzing SaaS tools cited in ChatGPT found:

  • 100% of tools named in ChatGPT answers had Capterra reviews.
  • 99% had G2 reviews.
  • 78.8% had a Wikipedia page.

Those are not differentiators. They are the price of admission. A tool without a presence on these surfaces is missing the data the engine uses to verify it exists, and an unverifiable tool does not make the shortlist.

But the most important finding in that research is what does not predict placement. Review count showed only weak correlation with ChatGPT ranking, and review scores showed near-zero correlation. A 4.7-star average and 30,000 reviews will not lift you above a tool with a few hundred reviews if the model has stronger contextual signals for the smaller player. Coda, with 97 Capterra reviews, outranked ClickUp's 4,490 for "Notion alternatives," because relevance and consistency beat raw volume.

The signal that did correlate was authority. Quoleady found Domain Rating was the strongest predictor of ChatGPT placement of every variable tested, with a correlation of -0.40 against rank position (negative because a higher DR goes with a lower, better-placed rank number). Tools with a DR of 80 or above averaged rank 4.97; tools below DR 80 averaged 7.04. High-authority mentions and the backlinks that build domain authority still matter, but they now matter because they raise the credibility of the sources the model reads about you, not because they win a search ranking.

AI cross-checks what you say against what the internet says. Where the two agree, you get consensus. Where they conflict, you get hallucination. The shortlist is built from agreement.

Chapter 03

The signals that get a tool selected

If reviews are the gate and authority is the floor, what actually moves a tool from "verifiable" to "recommended"? Four signal classes do the work, and only one of them lives on a domain you control.

1. Review platforms as the verification layer. G2, Capterra, and TrustRadius are not just present in AI answers; they are weighted heavily because they separate vendor claims from real user experience. Across the tech and SaaS category, G2 ranks as the 4th most-cited source for ChatGPT and 6th for Google AI Mode (Semrush AI Visibility Index). And the platforms have consolidated their grip: in early February 2026, G2 acquired Capterra, Software Advice, and GetApp from Gartner for roughly $110 million, combining the four most influential software-review properties under one roof. G2's stated vision is to "become the data source AI trusts for software recommendations," spanning 6 million verified reviews and 2,000-plus categories. Omniscient Digital projected the acquisition could increase G2's AI citation share in bottom-of-funnel prompts by up to 76%. The verification layer of SaaS discovery now has a single owner, and it is optimizing explicitly for the engines.

2. Community consensus, led by Reddit. AI engines treat community discussion as authentic, experience-based evidence. A 2026 study of 30 million sources across ChatGPT, Google AI Mode, Gemini, Perplexity, and AI Overviews found Reddit was the single most-cited source on every major engine, with YouTube and LinkedIn close behind (Search Engine Land). Separate research from 5W found Wikipedia and Reddit together drive over 25% of ChatGPT citations in the US. For SaaS specifically, Perplexity leans on Reddit, LinkedIn, and G2 for B2B queries. When a buyer asks an engine for "honest opinions on" a tool, the model is reading the threads where real users described it, not your messaging.

3. Third-party "best of" listicles and category roundups. These hand the model a ready-made map of the category, a list of the players, and a set of differentiators. A placement in a credible roundup is the closest thing to a direct vote for the shortlist, and it carries more weight than any volume of on-site copy claiming you are the leader.

4. Owned content, but only as corroborating reference. AI does use your own comparison pages, documentation, and knowledge bases, but as one input it cross-checks, not as the verdict. Backlinko documented Omnisend's own comparison blog post shaping ChatGPT's narrative about Omnisend versus Mailchimp, and Google AI Mode pulling from Semrush's enterprise landing page, press release, blog, and case study together to answer "Is Semrush good for enterprise?" The lesson is that owned content works when it agrees with everything else, and backfires when it contradicts it.

Three of the four signal classes that decide your shortlist placement live off your domain. The SaaS marketer's job shifted from optimizing a page they own to shaping a consensus they can influence but cannot author.

Chapter 04

Why the shortlist precedes the demo

The structural reason this matters more for SaaS than almost any other category is the buying committee. The decision is not made by one person you can re-sell. Forrester puts the typical B2B technology decision at 22 stakeholders, and 89% of B2B buyers now use generative AI for self-guided research before engaging a vendor. For AI-related purchases specifically, the buying group roughly doubles to 20-plus participants. Each of those stakeholders is running their own AI research, and each is arriving at the table with a shortlist already in hand.

This compresses the funnel from both ends. On the front end, 67% of B2B buyers now prefer a rep-free experience (Gartner, March 2026), and the typical buyer arrives at the first sales call already 60 to 80 percent decided. On the back end, even the human validation that remains works against latecomers: 69% of buyers turn to sales reps to validate AI-generated insights (Gartner, May 2026), meaning the rep's job is increasingly to confirm a shortlist the AI built, not to introduce a new option into it.

The displacement risk is the part that should alarm SaaS leadership. The AI shortlist is not just a new top of funnel. It is a mechanism that actively reroutes deals away from incumbents.

  • 69% of buyers chose a different software vendor than they originally planned based on AI chatbot guidance (G2, 2026).
  • One-third of buyers purchased from a vendor they had never heard of before the AI surfaced it (G2, 2026).

Read those two numbers together. The AI shortlist breaks brand-recall advantages that took a decade and tens of millions in marketing spend to build. A buyer who would have defaulted to the category leader out of familiarity now asks an engine, gets a shortlist that includes a tool optimized for their exact use case, and switches. Incumbency is no longer a moat if the engine does not consistently name you, and challenger brands that win the consensus game can leapfrog into consideration sets they could never have bought their way into.

69% of buyers chose a different vendor than they planned, and one-third bought from a company they had never heard of. The AI shortlist is the most efficient brand-displacement machine ever built.

Most consideration sets the engines produce name only three to six brands. Below that line there is no second page, no "view more," no organic spillover. You are on the shortlist or you are invisible in the single most important conversation in the buyer's journey.

Chapter 05

The invisibility problem most SaaS brands have

Knowing the signals does not mean SaaS companies are acting on them. The benchmark data shows a category that understands the stakes intellectually and has done almost nothing structurally.

CommonMind's 2026 State of AI Visibility in B2B SaaS, surveying 169 respondents from late 2025 through early 2026, found a stark awareness-action gap: 93% of B2B SaaS marketers say AI search visibility is critically important, but only 14% have a mature strategy to address it. Meanwhile the traditional channel they have always relied on is eroding underneath them: 59% of B2B SaaS brands report Google organic traffic is flat or declining.

The independent measurement is worse than the self-assessment. DerivateX's 2026 benchmark ran 50 SaaS companies across 1,400 buyer-intent prompts and found 44% of them functionally invisible to AI buyers. Only 8 of the 50 scored above 80, and 22% scored below 40, the threshold the study uses to mark substantial invisibility. Nearly half of the SaaS companies tested simply did not surface when a buyer asked the engine for a recommendation in their own category.

Pricing transparency is the clearest self-inflicted wound. CommonMind found 57% of B2B SaaS companies don't publish pricing, the highest non-disclosure rate of any industry surveyed. That matters because the engine answers pricing questions whether or not you supply the answer. When a SaaS brand leaves pricing blank, the model fills the gap with community speculation, competitor framing, and outdated third-party estimates, often tied to negative sentiment (Backlinko). Among SaaS brands with emerging or mature AI strategies, 70% publish pricing. The correlation is not subtle: the companies taking AI visibility seriously are the ones making sure the authoritative voice on their pricing is their own, not the most convincing user on a Reddit thread.

93% of SaaS marketers call AI visibility critical. 14% have a mature strategy. 44% are invisible to AI buyers. The gap between knowing and doing is the entire opportunity.

The encouraging read on this data is that the bar is low because almost no one has cleared it. In a category where 86% of competitors lack a mature strategy and nearly half are invisible, the work required to make the shortlist is unglamorous and entirely achievable. It is consensus-building, not magic.

Chapter 06

The conversion case for moving now

The standard objection from SaaS finance is that AI-sourced demand is still a small slice of pipeline. In raw volume, that is true. As leverage, it is the wrong way to read the number.

AI-sourced demand converts at a structurally higher rate because the engine pre-qualifies the buyer. An Amsive study found LLM-referred traffic converts at 3.76% versus 1.19% for organic search, a 216% improvement. G2 reported AI search-driven leads convert 40% better than traditional search, and a Croud and ImpactSense study found AI-assisted buyers spend roughly twice as much per transaction, because the AI gives them the confidence of a clear comparison before they ever talk to sales.
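The 216% figure follows directly from the two conversion rates quoted above. A quick check of the arithmetic, with variable and function names chosen only for this example:

```python
def relative_lift(new_rate, base_rate):
    """Relative improvement of new_rate over base_rate, as a fraction."""
    return (new_rate - base_rate) / base_rate


llm_rate = 3.76      # percent, LLM-referred traffic (Amsive figure quoted above)
organic_rate = 1.19  # percent, organic search (Amsive figure quoted above)

lift = relative_lift(llm_rate, organic_rate)
ratio = llm_rate / organic_rate

print(f"{lift:.0%} improvement, {ratio:.2f}x the organic rate")
# 216% improvement, 3.16x the organic rate
```

A "216% improvement" and "about 3.2 times the organic rate" describe the same gap, which is worth keeping straight when the number is quoted in a finance review.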

The reason is identical to the shortlist mechanic. The person who clicks through from an AI recommendation has already had their fit assessed, their alternatives compared, and your tool surfaced as an answer to a specific need. There is no top-of-funnel tire-kicking to discount. That buyer is closer to a decision than any lead your old funnel produced, and they arrive pre-sold on the category and pre-positioned toward whichever tools made the shortlist.

Now layer the trajectory. Buyer adoption nearly doubled in twelve months, ChatGPT remains the dominant research engine at 63% share of B2B software research (G2), and Gartner projects that by 2028, 60% of B2B seller work will run through conversational interfaces powered by generative AI. A channel that converts at up to about three times the rate of organic search, drives larger deals, and is compounding at this slope is not a side experiment to revisit next planning cycle. It is where the next cohort of category leaders is being decided right now.

AI-sourced SaaS leads convert 216% better than organic and drive roughly 2x deal value, because the engine hands you a buyer who has already been told you are the answer. Small channel, highest leverage.

Chapter 07

The execution playbook for SaaS marketing teams

The work divides into what you own, what you influence, and what you measure. For a SaaS team, none of it requires a new ad budget. It requires fixing the consensus.

Own your verifiable facts, starting with pricing. This is the fastest, highest-leverage fix in the entire playbook, because 57% of your competitors are leaving it blank.

  • Publish transparent pricing, or at minimum publish enough structured pricing logic that the engine quotes you instead of a Reddit guess. Silence cedes the answer to your competitors and your detractors.
  • Make your core product description identical everywhere it appears: website, G2, Capterra, docs. Consistency is a ranking signal; contradiction is a hallucination trigger.
  • Keep documentation and knowledge bases current and parseable. AI reads them directly to understand what your product does, who it is for, and what it integrates with.
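The "identical everywhere" rule for your core product description can be enforced with a simple drift check. The sketch below normalizes whitespace and case, then reports which surfaces no longer match the canonical text. The product name, the descriptions, and the surface labels are hypothetical, and in practice the listing text would have to be collected from each platform by hand or through whatever export it offers.

```python
import re


def normalize(text):
    """Collapse whitespace and lowercase so trivial formatting differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def inconsistent_surfaces(canonical, listings):
    """Return the sorted names of surfaces whose description differs from canonical."""
    target = normalize(canonical)
    return sorted(
        name for name, text in listings.items() if normalize(text) != target
    )


# Hypothetical canonical description and per-surface copies.
CANONICAL = "Acme Desk is a customer support platform for small fintech teams."
listings = {
    "website": "Acme Desk is a customer support platform for small fintech teams.",
    "g2": "Acme Desk is a customer support platform   for small fintech teams. ",
    "capterra": "Acme Desk is an all-in-one CRM for enterprises.",
    "docs": "acme desk is a customer support platform for small fintech teams.",
}

drifted = inconsistent_surfaces(CANONICAL, listings)
print(drifted)  # ['capterra']
```

Exact matching after normalization is deliberately strict. A looser similarity score would tolerate platform character limits, but it also makes it easier for a contradictory positioning statement to slip through.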

Win the gate, then the consensus. Review-platform presence is table stakes, but presence is not placement.

  • Establish and maintain G2, Capterra, and TrustRadius listings. Missing them keeps you off the shortlist entirely. Now that G2 owns all four major properties, a single coordinated review strategy covers the verification layer the engines trust most.
  • Prioritize review depth and recency over chasing raw counts. The engines weight contextual, specific, recent reviews over a high star average on stale volume.
  • Earn placements in the third-party "best of" roundups that map your category. One credible inclusion in a trusted listicle outweighs a page of self-description.

Shape the community truth. Reddit and developer communities are among the most-cited sources the engines read, and they are forming opinions about you with or without you.

  • Participate authentically in the subreddits, forums, and community threads where your category is discussed. The strategy is genuine contribution; engines detect and discount coordinated seeding.
  • For developer-facing tools, treat documentation quality, GitHub presence, and Stack Overflow answers as visibility infrastructure. Structured technical content is exactly what the model can parse and trust.

Write for the specific question, not the category keyword. AI queries are long and specific. Perplexity searches average 10 to 11 words versus 2 to 3 on Google.

  • Build use-case content for the precise queries buyers ask: "best CRM for a remote team of 50 on a budget," not "best CRM software." Niche specificity is where smaller tools beat category leaders in AI answers.
  • Create your own honest comparison and alternatives pages. AI uses owned comparison content as reference, and it is better that your framing exists than that only your competitor's does.

Measure selection, not sessions. Your analytics tell you who arrived. They cannot tell you whether the engine named you in the first place, and the shortlist forms before any click.

  • Track share of voice in AI answers: the prompts you appear in, the prompts you don't, and which competitors are named when you are absent.
  • Audit for hallucination and competitor interference, especially on pricing, features, and integrations, where stale or hostile third-party data does the most damage.
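As a rough sketch of the share-of-voice bullet above: assuming you have already collected AI answers for a prompt set and extracted the brand names each answer mentions, the gap analysis reduces to a few set operations. The `answers` mapping, the brand names, and the `visibility_gaps` helper are all hypothetical, for illustration only.

```python
from collections import Counter


def visibility_gaps(answers, brand):
    """Split prompts by whether `brand` was named, and count who was named instead.

    `answers` maps each prompt to the set of brands the engine named for it.
    """
    present = [p for p, brands in answers.items() if brand in brands]
    absent = [p for p, brands in answers.items() if brand not in brands]
    rivals = Counter()
    for prompt in absent:
        rivals.update(answers[prompt])  # competitors named when we are missing
    return present, absent, rivals


# Hypothetical data, not real measurements.
answers = {
    "best CRM for a remote team of 50 on a budget": {"Acme", "Globex"},
    "CRM with the simplest onboarding": {"Globex", "Initech"},
    "CRM alternatives for small agencies": {"Globex"},
}

present, absent, rivals = visibility_gaps(answers, "Acme")
print(f"named in {len(present)} prompts, missing from {len(absent)}")
print("most often named instead:", rivals.most_common(1))
```

The `rivals` counter is the part analytics cannot give you: it shows which competitor fills the slot in the prompts where you are absent.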

The old job was ranking a page you control. The new job is being named by a system you don't, built on signals you mostly don't own. Visibility was a placement you could buy. Selection is a reputation you have to earn.

The SaaS brands that win the next five years will not be the ones with the largest demand-gen spend. They will be the ones whose product truth is the most consistent across every surface the engine reads, the most validated by real users, and the most specifically described against the use cases buyers actually have. The shortlist forms before the demo. The only question is whether the machine knows enough about you to put you on it.

Sources cited

  1. G2 Answer Economy / 2026 AI Search Insight Report: 51% start in an AI chatbot (up from 29% in April 2025), 71% rely on AI in research, 3% report no change, ChatGPT 63% research share, 69% chose a different vendor on AI guidance, one-third bought from an unknown-before vendor.
  2. 6sense 2025 Buyer Experience Report: Winning vendor on the Day One shortlist 95% of the time; ~4 shortlist spots filled on day one; 85–95% purchase from the initial set.
  3. Quoleady (2026): 100% of ChatGPT-cited tools had Capterra reviews, 99% G2, 78.8% Wikipedia; review count and score weak/near-zero correlation; Domain Rating strongest predictor (-0.40); Coda vs. ClickUp example.
  4. Semrush AI Visibility Index (via Backlinko): G2 ranks 4th most-cited for ChatGPT, 6th for Google AI Mode in the tech/SaaS category.
  5. G2 / PR Newswire / Omniscient Digital (2026): G2 acquired Capterra, Software Advice, GetApp from Gartner (~$110M); 6M reviews, 2,000+ categories; "data source AI trusts" vision; projected up to 76% BOFU citation-share lift.
  6. Search Engine Land / 5W (2026): Reddit the most-cited source across major engines (30M-source study); Wikipedia + Reddit over 25% of US ChatGPT citations.
  7. Forrester (2026): 22-stakeholder buying committee; 89% of B2B buyers use generative AI for self-guided research.
  8. Gartner (2026): 67% of buyers prefer a rep-free experience; 69% validate AI insights with reps; by 2028, 60% of B2B seller work runs through conversational interfaces.
  9. CommonMind State of AI Visibility in B2B SaaS (2026): 93% call AI visibility critical / 14% have a mature strategy; 59% report flat-or-down Google organic; 57% don't publish pricing (70% of AI-mature brands do).
  10. DerivateX AI Visibility Benchmark (2026): 50 SaaS companies, 1,400 prompts; 44% functionally invisible; only 8 scored above 80.
  11. Amsive / G2 / Croud & ImpactSense: LLM traffic converts 3.76% vs. 1.19% organic (+216%); AI leads convert 40% better; AI-assisted buyers spend ~2x per transaction.
  12. Backlinko: Consensus and consistency framework; Asana, Omnisend, Semrush consistency examples; pricing-gap hallucination; review platforms as verification layer.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) shows exactly which category and comparison prompts the engines name you in, where competitors are selected onto the shortlist instead, and which review, pricing, and consensus gaps are keeping you out of the buyer's Day One consideration set.

Live · FancyAI Research Corpus

Reddit is the citation engine: a playbook for earning AI mentions through real community presence

Reddit is the single most-cited source across every major AI engine. It got there because licensing deals piped its conversations straight into the models, and because community discussion is the closest thing the web has to honest first-person experience. You cannot buy your way in. You earn it by being genuinely useful in the threads where your category is decided.

40.1%
Reddit's citation frequency across AI models, the #1 source (Visual Capitalist / Profound)
$130M+
Annual value of Reddit's AI-licensing deals with Google and OpenAI (Columbia Journalism Review)
~3×
How much more strongly unlinked mentions correlate with AI citations than backlinks do (Ahrefs, 75,000 brands)
70%
AI-cited Reddit posts that have fewer than 20 comments (Semrush, 248K posts)
Chapter 01

Why Reddit dominates AI citations

Start with the one fact that reframes the entire channel. Across credible 2026 studies, Reddit is the most-cited domain in AI answers, full stop. Visual Capitalist's analysis put Reddit at 40.1% citation frequency across AI models, the top source by a wide margin. Profound's data has Reddit as the number one most-cited domain for AI across all models, ahead of YouTube, Forbes, TechRadar, and PCMag, and reported it being cited twice as often as Wikipedia in the top ten most-cited domains for the three months ending June 30 (Press Gazette / Profound). On Perplexity the concentration is even more extreme: Reddit accounts for 46.7% of its top-ten citations, more than three times its next-closest source (Profound).

The newer cross-platform numbers refine the picture without changing the conclusion. A 30-million-source study covering ChatGPT, Google AI Mode, Gemini, Perplexity, and AI Overviews found Reddit cited in 10% of all LLM answers, with YouTube ahead on raw share at 16% and LinkedIn at 11% (Search Engine Land, 2026). Peec AI's 30-million-source analysis of the same five surfaces, by contrast, ranks Reddit #1. The leaderboard shuffles by methodology, but Reddit, YouTube, and LinkedIn consistently sit at or near the top, and community discussion is always the throughline.

Why does an unpolished forum beat the entire professionally-published web? Because Reddit supplies exactly what the models are reaching for. It is the largest concentrated archive of first-person experience, real product comparisons, and genuine peer opinion anywhere online. When a buyer asks an engine "what actually works for X," the model wants a human who used it, not a brand that sells it.

Reddit is the closest thing the open web has to honest, first-person experience at scale. That is precisely what an AI engine wants when a user asks what to actually buy.

There is a structural reason too, and it is the foundation of this entire playbook. Google has explicitly prioritized "authentic discussion forums" in its ranking systems, and that preference flows directly into AI Overviews and AI Mode. The engines were tuned to reward exactly the kind of signal Reddit produces. That is not an accident you can game around. It is the design.

Chapter 02

The licensing deals that wired Reddit into the models

Reddit did not become the citation engine by luck. It was wired in by contract.

In February 2024, Reddit signed a content-licensing deal with Google reportedly worth $60 million a year, giving Google access to real-time, user-authored Reddit content for training and for surfacing in products like AI Overviews. A few months later it struck a similar deal with OpenAI, estimated at around $70 million a year (Columbia Journalism Review; Search Engine Land). Together these AI-licensing agreements make up roughly 10% of Reddit's revenue, about $130 million, and Reddit disclosed total data-licensing agreements worth $203 million in 2024 (Slashdot; Columbia Journalism Review).

This is the mechanism most "Reddit for AI" advice skips. The reason Reddit threads show up so often is not only that the content is good. It is that two of the three dominant engines have paid, sanctioned, real-time access to the corpus, while most of the rest of the web is being crawled, blocked, litigated, or deprecated. Reddit is privileged supply.

The engines do not cite Reddit only because the content is good. They cite it because Google and OpenAI pay nine figures a year for legal, real-time access to it. Reddit is privileged supply.

The arrangement is also getting more sophisticated, which matters for anyone planning a multi-year community strategy. Reddit is pushing both Google and OpenAI toward dynamic, usage-based pricing that pays more for the quality of the data rather than just its quantity (Bloomberg; Columbia Journalism Review). The strategic read for brands: the platform that most directly feeds the engines is being incentivized to favor high-quality, high-signal discussion. The conversations that read as genuine expertise are the ones with the most durable value, to the engines and to Reddit's own business model.

Chapter 03

The mention is the signal, not the link

The most expensive misconception in community marketing is that you need a link. You do not. In AI search, the brand mention is the unit of influence, and the data on this is now unambiguous.

Ahrefs studied 75,000 brands and found that unlinked web mentions correlate with AI citations at 0.664, while backlinks correlate at only 0.218 — roughly a 3× gap in favor of mentions (Ahrefs, 2026). AI engines understand the web through language, not link graphs. A sentence that says "we switched to Acme and it cut onboarding time in half" carries entity signal whether or not the word "Acme" is hyperlinked. The model reads the co-occurrence: brand, plus topic, plus outcome, plus context.

This is why community presence outperforms link building for AI. SE Ranking's research found that domains with millions of brand mentions on Reddit averaged 7 citations versus 1.8 for domains with minimal Reddit presence, a 3.9× multiplier. Critically, that lift is not just the direct Reddit citation. Reddit presence raises the citation rate of your own owned domain too. The engine sees the brand validated in community discussion and grows more confident naming it from any source.

SE Ranking's own AI Visibility Tracker is built around this reality: it logs both linked citations and unlinked brand mentions inside AI answers, because, in its words, AI engines "reference brands by name far more often than they hand out clickable links," and counting only links underestimates real brand presence "by a wide margin." Four channels carry almost all of the 2026 brand-mention gains, per SE Ranking: community conversations (Reddit, Quora, niche forums), creator and YouTube content, third-party reviews (G2, Capterra, Trustpilot), and earned editorial PR.

In AI search the unlinked mention beats the backlink three to one. You are not building a link profile. You are building a record of being named, in context, by people who are not you.

The execution consequence is freeing. You do not need a moderator to allow a link. You need someone to mention your brand, by name, in a relevant, helpful, true sentence. That is a far lower bar, and a far more authentic one.

Chapter 04

What actually drives a Reddit citation: the data

Most brands approach Reddit like a billboard or a vote-rigging exercise. The data says both instincts are wrong. Semrush analyzed 248,000 Reddit posts cited in Google AI Mode, Perplexity, and ChatGPT, across 217,000 unique prompts, and the findings overturn the standard playbook.

Engagement barely matters. This is the headline. 70% of AI-cited Reddit posts have fewer than 20 comments, and the median upvote count is just 5 to 8 (Semrush). A post with 8 upvotes that answers a specific question precisely will out-cite a post with 1,000 upvotes that does not. The engines are selecting for relevance, structure, and clarity, not virality. The entire logic of "get the post to the top of the subreddit" is misaligned with how citations are actually awarded.

Format is decisive. Over half of all cited Reddit content comes from Q&A threads, followed by comparison and discussion posts. Together, Q&A, comparison, and discussion formats make up nearly three-quarters of all cited content (Semrush). The threads that get cited are the ones that look like the question a buyer is asking: "X vs Y for Z," "what do you use for," "is X worth it."

Age is an asset, not a liability. The average AI-cited Reddit post is roughly 900 days old — about 2.5 years — with posts from 2022 still being cited in 2025 (Semrush). A genuinely useful comment is a compounding asset. Community contributions do not decay the way an un-maintained landing page does; they accrue authority as the thread keeps answering the same recurring question.

The Reddit comment that gets cited is two and a half years old, has eight upvotes, and answers one specific question cleanly. Virality is not the goal. Being the clearest honest answer to a recurring question is the goal.

The practical translation:

  • Answer the question that is actually being asked. Find the recurring "what should I use for X" threads in your category and contribute the genuinely best answer, including your product only where it honestly fits.
  • Write in Q&A and comparison shapes. Structure beats reach. A clear, specific, self-contained answer is extractable.
  • Play the long game. The citation value of a great comment shows up over years, not in a launch week.

Chapter 05

The authentic-participation playbook

Here is what works, stated plainly and ethically. None of it is a growth hack. All of it is just being a real, useful member of a community that happens to feed the models.

1. Pick the right rooms. Identify the subreddits where your category is genuinely discussed and where buyers ask real questions. Depth in three relevant communities beats shallow presence in thirty.

2. Contribute expertise before you ever mention your product. Reddit Rule 2 requires participating authentically in communities where you have a personal interest, and explicitly prohibits spam and content manipulation. Build a real account with a real history of helpful answers. Most of your contributions should mention no product at all.

3. Disclose your affiliation when you reference your own product. This is both an ethics line and a survival tactic. Reddit communities punish undisclosed self-promotion harshly, and moderators have begun banning entire topic categories — r/biohackers stopped allowing new posts about peptides and hormone-replacement therapy specifically because of manipulation by the companies selling them (404 Media). A transparent "disclosure: I work on X, but for your use case Y is honestly the better fit" earns trust. A disguised plug gets you banned and your brand associated with manipulation.

4. Win comparison threads with honesty, including your weaknesses. The most-cited format is the comparison. The contributor who says "we are great at A, but if you need B, look at a competitor" is the one the community upvotes and the engine trusts. Honest comparison is the highest-leverage authentic move available.

5. Seed genuine discussion, do not manufacture it. You can legitimately ask real customers what they would tell a peer, host a real AMA, or point your community to a useful resource. You cannot script reviews, run sockpuppets, or buy upvotes. The line is whether a real human with real experience is speaking in their own voice.

6. Treat LinkedIn, Quora, and niche forums as the same discipline. Reddit is the largest community surface, not the only one. LinkedIn is the #2 most-cited domain across ChatGPT, Google AI Mode, and Perplexity, appearing in an average of 11% of AI responses — and 14.3% on ChatGPT Search specifically (Semrush, 89,000 LinkedIn URLs). The same rules apply there: Semrush found LinkedIn articles drive 50–66% of cited LinkedIn content, ~95% of cited posts are original rather than reshares, and cited authors post consistently with only moderate engagement (15–25 reactions). Notably, ChatGPT and Google AI Mode cite individual creators 59% of the time over company pages. The person, posting real expertise consistently, beats the brand account broadcasting.

The authentic move and the effective move are the same move. Be a genuinely useful, transparent expert in the rooms where your category is decided. There is no second, secret playbook that works better.

Chapter 06

The manipulation trap and why it is doomed

There is, of course, an industry forming around faking all of this. It is worth understanding precisely why it fails, because the failure is structural, not a matter of getting caught.

The tactic, sometimes branded "AI-engine optimization," is to publish AI-generated posts and synthetic "user reviews" on Reddit to influence what ChatGPT and Google return (404 Media; Memeburn). The fake-review pattern is recognizable: accounts built to post positive reviews disguised as personal experience, low activity elsewhere but heavy focus on one brand, and generic AI language stuffed with "game-changer," "life-changing," and "must-have" (Recho). It is exactly the kind of signal the system is now built to find and discount.

Three forces make this a losing strategy:

  • The platform is hardening fast. Reddit rolled out mandatory human verification for suspected bot accounts in March 2026, introduced APP labels for verified bots and privacy-first biometric verification, and reported removing 100,000 accounts per day (Recho). The supply of fake accounts is being actively drained.
  • The communities self-defend. Moderators ban manipulative topics wholesale, as r/biohackers did. When a community detects coordinated marketing, it does not just remove the posts. It can blacklist the brand and the entire product category, doing lasting damage to your AI visibility rather than building it.
  • The economics point the other way. Reddit's pivot toward quality-weighted, dynamic licensing pricing (Bloomberg) means the platform is incentivized to identify and elevate genuine high-signal discussion and suppress synthetic noise. You would be optimizing against the explicit business interest of the platform that feeds the engines.

The deeper point is the one in FancyAI's brand-authority research: you cannot game entity recognition the way you could game PageRank. The system rewards genuine presence in genuine conversations. A manufactured thread has no real co-occurrence history, no corroborating mentions across independent communities, no consistent author entity behind it. It is a single fragile artifact in a system that triangulates across thousands of independent signals. Authentic influence compounds because it is corroborated. Manufactured influence collapses because it is not.

Astroturfing does not fail because you get caught. It fails because a fabricated signal has nothing to corroborate it, in a system whose entire job is to cross-check. The fake is structurally weak before anyone reports it.

Chapter 07

How to measure community-driven AI visibility

Community work is hard to measure with old tools, which is why so many teams under-invest in the channel that matters most. The fix is to measure the right thing: not links, not upvotes, but whether the engines name you.

Track mentions, not just citations. Most tools report only that a brand was "mentioned." SE Ranking's tracker reports which URL got the citation and logs unlinked mentions alongside linked ones — essential, since engines name brands far more than they link them. Counting only clickable links undercounts your real presence dramatically.

Use share of citation, not share of voice. These are different metrics and conflating them is a common, costly error. Share of voice measures how often you appear across media relative to competitors. Share of citation measures whether AI engines choose your brand when answering category questions (AuthorityTech). The two do not predict each other: Profound found that 80% of sources cited by AI platforms do not appear in Google's top 10 for the same query. A strong SEO footprint is not a proxy for AI citation. You have to measure the citation directly.

Define a prompt set and measure AI share of voice against it. Unlike traditional media share of voice, AI share of voice is the percentage of AI-generated responses that mention, recommend, or cite your brand across a defined set of prompts, relative to competitors (Alex Birkett; Profound). Build the set from the real questions buyers ask in your category, run it across ChatGPT, Perplexity, and Google AI Mode on a recurring cadence, and watch the trend.
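To make that definition concrete, here is a minimal sketch of the calculation. It assumes the responses have already been collected from the engines and stored as plain text; the tuple layout, the brand names, and the naive substring matching are illustrative assumptions, not any vendor's method.

```python
from collections import defaultdict


def ai_share_of_voice(responses, brands):
    """Return, per brand, the fraction of responses that mention it.

    `responses` is a list of (engine, prompt, answer_text) tuples.
    Matching is a naive case-insensitive substring check, which brands
    with aliases or common-word names would quickly outgrow.
    """
    counts = defaultdict(int)
    for _engine, _prompt, text in responses:
        lowered = text.lower()
        for brand in brands:
            if brand.lower() in lowered:
                counts[brand] += 1
    total = len(responses)
    return {b: (counts[b] / total if total else 0.0) for b in brands}


# Hypothetical responses for illustration only.
responses = [
    ("ChatGPT", "best CRM for agencies", "Many teams pick Globex or Acme."),
    ("Perplexity", "best CRM for agencies", "Globex is the usual answer."),
    ("Google AI Mode", "best CRM for agencies", "Consider Acme and Globex."),
    ("ChatGPT", "cheapest CRM for startups", "Initech has a free tier."),
]

shares = ai_share_of_voice(responses, ["Acme", "Globex"])
print(shares)
```

Running the same function on each measurement date and comparing the results is what turns a snapshot into the trend the paragraph above describes.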

Expect inconsistency, and design around it. There is a less than 1-in-100 chance that asking the same question 100 times produces the same brand list (Wellows / Ahrefs). One measurement is noise. The signal is the trend across many prompts over time. Community-driven visibility shows up as a rising floor — your brand appearing in more answers, across more sources, more reliably — not as a single perfect result.
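One way to design around that noise is to repeat each prompt many times and report a mention rate with an uncertainty band instead of a single yes or no. The sketch below uses the standard Wilson score interval; the run counts are made-up numbers and the choice of a 95% band is an assumption.

```python
import math


def wilson_interval(hits, runs, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if runs == 0:
        return (0.0, 1.0)
    p = hits / runs
    denom = 1 + z**2 / runs
    centre = (p + z**2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return (centre - half, centre + half)


# Hypothetical: the brand was named in 12 of 40 repeated runs of one prompt.
lo, hi = wilson_interval(12, 40)
print(f"mention rate 30%, plausible range {lo:.0%} to {hi:.0%}")

# A single run that happened to name the brand says very little.
lo1, hi1 = wilson_interval(1, 1)
print(f"one run: plausible range {lo1:.0%} to {hi1:.0%}")
```

The width of the single-run interval is the quantitative version of "one measurement is noise": only repeated runs narrow the band enough to see a real change.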

A practical measurement loop:

  • Baseline. Run your prompt set across the three major engines and record citation rate and mention rate for your brand and your top three competitors.
  • Instrument referrals. Isolate Reddit, LinkedIn, and AI-engine referral traffic in GA4 with custom channel groups so you can connect community presence to downstream behavior.
  • Track the lift, not the post. Watch whether your owned domain's citation rate rises as your community presence grows. The SE Ranking 3.9× multiplier is the effect you are looking for.
  • Re-run on a cadence. Weekly or biweekly. Citation patterns shift fast, and the trend is the truth.
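For the referral step in the loop above, GA4 channel groups are configured in the product itself, but the underlying rule is a hostname classification that can be sketched and tested offline. The hostname lists and the `classify_referrer` helper below are assumptions for illustration; check them against the referrers that actually appear in your own data.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed hostname lists; verify against your own referral reports.
CHANNEL_HOSTS = {
    "Reddit": ("reddit.com",),
    "LinkedIn": ("linkedin.com", "lnkd.in"),
    "AI engine": ("chatgpt.com", "perplexity.ai", "gemini.google.com"),
}


def classify_referrer(url):
    """Map a referrer URL to a channel by exact host or subdomain match."""
    host = (urlparse(url).hostname or "").lower()
    for channel, hosts in CHANNEL_HOSTS.items():
        if any(host == h or host.endswith("." + h) for h in hosts):
            return channel
    return "Other"


# Hypothetical referrers for illustration only.
referrers = [
    "https://www.reddit.com/r/saas/comments/abc",
    "https://chatgpt.com/",
    "https://www.linkedin.com/feed/",
    "https://www.google.com/search?q=crm",
]
by_channel = Counter(classify_referrer(r) for r in referrers)
print(by_channel)
```

Matching on the host suffix rather than a loose substring keeps lookalike domains out of the channel, which matters when the counts feed a lift calculation.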

The wrong metric is "did our Reddit post go viral." The right metric is "are the engines naming us more often, across more sources, when buyers ask the category question." Measure the recommendation, not the post.

Chapter 08

The community playbook, in order

Sequenced by leverage, here is the execution order.

  1. Map the rooms. Identify the three to five subreddits, plus the LinkedIn and Quora surfaces, where your category is genuinely discussed and buyers ask real questions.
  2. Build real accounts with real history. Contribute genuinely useful answers, mostly product-free, until you are a known, trusted voice. This is the unfakeable foundation.
  3. Win the recurring Q&A and comparison threads. These formats drive nearly three-quarters of citations. Be the clearest, most honest answer to the questions buyers keep asking, weaknesses included.
  4. Mention your brand by name, in context, with disclosure. The mention is the signal, not the link. Honest, disclosed, well-fitted mentions carry the entity weight and the 3× advantage over backlinks.
  5. Extend the same discipline to LinkedIn. Post original, advice-driven articles consistently as an individual expert. LinkedIn is the #2 cited source and rewards exactly the same authenticity.
  6. Never manufacture. No sockpuppets, no scripted reviews, no bought upvotes. The platform is hardening, the communities self-defend, and the signal is structurally weak. It does not work and it is not worth the brand risk.
  7. Measure share of citation, not share of voice. Define a prompt set, run it across engines on a cadence, track mention rate and the lift on your owned domain, and isolate community referrals in analytics.

The throughline is the same one FancyAI has documented across every platform and vertical. The engines recommend; they do not rank. The mention is the signal. And on the single most-cited surface in AI search, the only mention worth having is the one a real person made because you were genuinely worth mentioning. Authentic influence is not the safe alternative to gaming the system. On Reddit, it is the only thing that works.

Sources cited

  1. Visual Capitalist / Profound: Reddit at 40.1% citation frequency across AI models; Reddit as #1 most-cited domain, ~2× Wikipedia in the top-ten.
  2. Press Gazette / Profound: Reddit claims top spot as most-cited domain in AI answers; 46.7% of Perplexity's top-ten citations.
  3. Peec AI: 30M-source analysis ranking Reddit #1 across ChatGPT, Google AI Mode, Gemini, Perplexity, and AI Overviews.
  4. Search Engine Land (2026): 30M-source study: Reddit cited in 10% of LLM answers, YouTube 16%, LinkedIn 11%.
  5. Columbia Journalism Review: Reddit–Google ($60M/yr) and Reddit–OpenAI (~$70M/yr) licensing deals; ~10% of revenue, $203M total data licensing in 2024.
  6. Bloomberg (Sept 2025): Reddit pursuing dynamic, quality-weighted AI content pricing with Google and OpenAI.
  7. Slashdot: AI licensing deals with Google and OpenAI make up ~10% of Reddit's revenue (~$130M).
  8. Ahrefs (75,000-brand study): unlinked mentions correlate with AI citations at 0.664 vs 0.218 for backlinks (~3× gap).
  9. SE Ranking: Reddit-mention domains average 7 citations vs 1.8 (3.9× lift); tracker logs unlinked mentions; four brand-mention channels.
  10. Semrush: 248,000 Reddit posts / 217,000 prompts: 70% of cited posts under 20 comments, median 5–8 upvotes, Q&A/comparison formats ~75% of cited content, ~900-day average age.
  11. Semrush: 89,000 LinkedIn URLs / 325,000 prompts: LinkedIn #2 source, 11% average (14.3% ChatGPT), articles 50–66% of cited content, ~95% original, individual creators cited 59% on ChatGPT/Google.
  12. 404 Media: companies using AI-generated Reddit posts and synthetic reviews to manipulate AI search; r/biohackers topic bans.
  13. Recho: Reddit human-verification rollout March 2026, APP labels, biometric verification, 100,000 daily account removals; fake-review patterns.
  14. Memeburn: Reddit AI spam manipulating ChatGPT and Google AI search.
  15. AuthorityTech: share of citation vs share of voice distinction.
  16. Profound: 80% of AI-cited sources do not appear in Google's top 10 for the same query.
  17. Alex Birkett: AI share-of-voice measurement methodology across a defined prompt set.
  18. Wellows / Ahrefs: <1-in-100 chance the same prompt yields the same brand list across 100 runs.
  19. FancyAI KnowledgeBase corpus: brand-mention authority and citation-acquisition research (entity recognition cannot be gamed; mentions are the new backlinks), compiled March 2026.

Want this measured against your brand?

The FancyAI AI Readiness Index measures whether ChatGPT, Perplexity, and Google name your brand, which community sources they trust to do it, and where competitors are out-cited. Find out where you stand, then change it.


Structured Proof: How to Make Your Content Extractable When Schema Alone Won't Save You

The biggest empirical test of schema markup to date found it moved AI citations by roughly zero. Yet pages with statistics get cited 40% more, tables 2.5x more, and tight answer passages 3x more. The lever is not the markup. It is the structure of the proof underneath it.

~0%
Change in AI citations after 1,885 pages added schema markup (Ahrefs, May 2026)
+40%
Visibility lift from adding statistics to a page (Princeton/Aggarwal et al., KDD 2024)
2.5x
Citation rate for content with tables vs. plain text (multiple 2026 GEO analyses)
10.1%
Share of sites that now publish llms.txt, with near-zero AI crawler visits (SE Ranking, 2026)
Chapter 01

What "structured proof" actually means

There is a comfortable story circulating in the GEO industry that goes like this: add the right schema, declare your content machine-readable, and the AI engines will reward you with citations. It is a tidy story because schema is something you can ship. It lives in your control. You can validate it, audit it, and check it off a list.

The 2026 evidence does not support that story, at least not in the direct form it is usually told. The thing that actually gets content extracted and cited is not a JSON-LD wrapper. It is the proof inside the page: a specific statistic, a clean comparison table, a self-contained answer passage, a named source. Structured proof is the discipline of writing facts in the shapes that a retrieval system can lift cleanly out of your page and drop into an answer without rewriting them.

This distinction matters because it separates two things the industry constantly conflates. Being readable means a crawler can parse your HTML. Being extractable means the model can pull a discrete, verifiable claim out of your page and stand behind it in an answer. Almost every site is readable. Very few are extractable. The gap between the two is where citations are won and lost.

Schema tells the machine what your page is. Structured proof gives the machine something it can quote. Only one of those reliably earns the citation.

The rest of this report is organized around that contrast. We will look hard at what the schema evidence actually shows, including the limits the industry tends to skip. Then we will move to the formatting patterns that carry real, measured citation lift: statistics, tables, passages, and Q&A. Then entity and knowledge-graph presence, which is the off-page half of structured proof. Then llms.txt and the technical accessibility layer that determines whether the machine ever sees your proof at all. We close with a prioritized execution checklist that sequences the work by evidence strength, not by how easy it is to ship.

Chapter 02

The schema reckoning: what the evidence actually shows

For two years, "add schema" has been the default opening move in nearly every AEO and GEO checklist. In May 2026, Ahrefs ran the largest controlled test of that advice to date and the result was uncomfortable.

Ahrefs tracked 1,885 pages that added JSON-LD schema markup between August 2025 and March 2026, matched them against 4,000 control pages, and measured citation changes across Google AI Overviews, Google AI Mode, and ChatGPT in the 30 days before and after the markup went live. The deltas: Google AI Mode +2.4%, ChatGPT +2.2%, Google AI Overviews -4.6%. The first two are statistically indistinguishable from zero across thousands of URLs. The third is statistically significant (odds of chance roughly one in 2,500) and points in the wrong direction. Adding schema did not lift citations. On Overviews it slightly hurt.

This was not a one-off. An academic-style cross-platform study by Kurt Fischman, published on SSRN, collected 730 AI citations from ChatGPT and Gemini across 75 commercial queries in five categories. Its pooled analysis initially showed schema presence negatively associated with citation (odds ratio 0.546, p < .001). On inspection that was a methodological artifact: Google's ranking algorithm systematically enriches top-10 organic results with schema-bearing pages, which inflates schema prevalence in the control set. Once corrected, the cleaner finding emerged, and it is the one that matters.

Generic schema (Article, Organization, BreadcrumbList) provided zero measurable citation advantage. Only attribute-rich schema carried weight: Product and Review markup populated with real pricing, ratings, and specifications was cited 61.7% of the time versus 41.6% for generic implementations.

Read those two studies together and the picture resolves. The markup tag itself is not the signal. What sometimes correlates with citation is the structured data inside the markup when that data is specific, populated, and proof-bearing. A Product schema with real prices and ratings helps because it carries facts the model can lift. An empty Organization block helps nothing because it carries nothing quotable.
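The contrast is easier to see side by side. The sketch below builds a generic Organization block and an attribute-rich Product block using standard schema.org types and properties, then counts the literal facts each one carries. The product, prices, and ratings are invented, and `count_leaf_facts` is a hypothetical, deliberately crude proxy for "quotable content", not a metric from either study.

```python
import json

# Generic markup: declares what the page is, carries almost nothing to quote.
generic = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Acme",
}

# Attribute-rich markup: populated with specific, checkable values (invented here).
attribute_rich = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Acme CRM",
    "offers": {
        "@type": "Offer",
        "price": "29.00",
        "priceCurrency": "USD",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "212",
    },
}


def count_leaf_facts(node):
    """Count literal values, skipping JSON-LD keywords such as @type."""
    if isinstance(node, dict):
        return sum(
            count_leaf_facts(v) for k, v in node.items() if not k.startswith("@")
        )
    if isinstance(node, list):
        return sum(count_leaf_facts(v) for v in node)
    return 1


print(json.dumps(attribute_rich, indent=2))
print(count_leaf_facts(generic), count_leaf_facts(attribute_rich))
```

Both blocks validate as JSON-LD; only the second gives a retrieval system a price and a rating it could lift into an answer.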

The platforms' own statements can look as if they muddy this further. Microsoft confirmed in March 2025 that Bing's LLMs use structured data to interpret content for Copilot, and Google has repeatedly said structured data "gives an advantage in search results." Both statements are true and neither contradicts Ahrefs. Schema helps classical ranking, classical ranking feeds Bing, Bing feeds ChatGPT Search, and roughly 92% of AI Overview citations come from pages already ranking in the top 10 (per 2026 entity-authority analyses). Schema is a second-order contributor through the ranking path, not a direct citation lever you can pull in isolation.

The honest position for 2026: keep schema, because it underpins ranking and rich results and costs little. Stop selling it as the thing that earns AI citations. It is table stakes, not a differentiator. Where it does move the needle is exactly where it stops being generic and starts being proof.

Chapter 03

Statistics, tables, and the formatting that actually lifts citations

If schema is the overrated lever, content structure is the underrated one. And unlike schema, the structural evidence is consistent across independent studies, replicates across platforms, and ties back to a peer-reviewed source.

The anchor is the Princeton GEO study, "GEO: Generative Engine Optimization," published at KDD 2024 by Aggarwal and colleagues from Princeton, Georgia Tech, the Allen Institute for AI, and IIT Delhi. Across nine optimization methods tested on a benchmark of real generative-engine queries, the standouts were not about markup at all. They were about what you put in the prose:

  • Adding statistics lifted visibility by up to 41% — the single most effective technique in the study.
  • Adding quotations from named sources lifted citations by 28%.
  • Adding authoritative source citations produced significant gains, especially for lower-ranked pages, where citing sources improved visibility by as much as 115%.

These are not formatting flourishes. They are proof artifacts. A statistic is a discrete, verifiable, quotable unit. A quote from a named expert is attributable. A cited source is checkable. Generative engines are risk-minimizing systems; they preferentially lift content that carries its own verification. The reason "we reduced onboarding friction by 28%" beats "we dramatically improved onboarding" is that the first one is extractable and the second one is air.

The table evidence is just as strong and even more practical. Across multiple 2026 GEO analyses, content with tables is cited roughly 2.5x more often than plain text, and content built around comparison tables (the "X vs. Y" layout) reaches 2.8x the citation rate of text-only equivalents. SISTRIX's analysis of the 100 most-cited websites identified "data collection (comparison tools and tables)" as one of just four dominant cited formats. The mechanism is obvious once you picture the retrieval step: a clean HTML <table> is already structured proof. Each cell is a labeled, atomic fact. The model does not have to infer relationships from a paragraph; the relationships are the grid.

A statistic, a table, and a named quote are not stylistic choices. They are the three shapes of proof a retrieval system can lift without rewriting. Prose it has to summarize. Proof it can quote.
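As a minimal sketch of the table point, the helper below turns rows of comparison data into a plain HTML table with labeled header cells. The function name and the plan data are hypothetical; the only claim is that each fact ends up in its own labeled cell rather than inside a sentence.

```python
from html import escape


def render_comparison_table(columns, rows, caption=""):
    """Render rows as a plain HTML <table> with column and row headers."""
    parts = ["<table>"]
    if caption:
        parts.append(f"  <caption>{escape(caption)}</caption>")
    header = "".join(f'<th scope="col">{escape(str(c))}</th>' for c in columns)
    parts.append(f"  <thead><tr>{header}</tr></thead>")
    parts.append("  <tbody>")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row needs exactly one cell per column")
        # The first cell names the thing being compared, so mark it as a row header.
        cells = f'<th scope="row">{escape(str(row[0]))}</th>'
        cells += "".join(f"<td>{escape(str(value))}</td>" for value in row[1:])
        parts.append(f"    <tr>{cells}</tr>")
    parts.append("  </tbody>")
    parts.append("</table>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Invented example data.
    print(
        render_comparison_table(
            ["Plan", "Monthly price", "Seats"],
            [["Starter", "$12", 3], ["Team", "$30", 10]],
            caption="Starter vs. Team",
        )
    )
```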

The passage evidence completes the formatting picture and explains why structure beats length. Generative engines do not read whole pages and rank them. They chunk a page into passages, score each passage independently for relevance to the query, and cite the strongest individual passages. A 2025 analysis of 10,000 AI citations found that passages of 40–75 words were cited 3.1x more often than longer passages and 2.4x more often than shorter ones. The retrieval system favors a passage that answers a question completely in one self-contained paragraph, with no dependence on surrounding context.

This is the technical foundation of the "structure beats length" thesis, and it is backed by Ahrefs' finding of near-zero correlation (0.04 Spearman) between word count and AI citation, with 53.4% of AI Overview citations going to pages under 1,000 words. A 5,000-word guide and a 600-word answer compete as collections of passages, not as documents. The long guide wins only if its individual passages are tighter, more self-contained, and more proof-dense than the short one's. Usually they are not, because length and chunk-level discipline tend to trade off against each other.
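One way to apply the passage finding is to audit a draft paragraph by paragraph. The sketch below assumes passages are separated by blank lines, which is a crude stand-in for however a given engine actually chunks a page, and it uses the 40 to 75 word range quoted above as its threshold.

```python
import re

# Word range quoted in the text; treat it as a guideline, not a hard rule.
MIN_WORDS, MAX_WORDS = 40, 75


def split_passages(text):
    """Split on blank lines (an assumption about chunk boundaries)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def audit_passages(text):
    """Return (word_count, verdict, preview) for each passage."""
    report = []
    for passage in split_passages(text):
        count = len(passage.split())
        if count < MIN_WORDS:
            verdict = "short"
        elif count > MAX_WORDS:
            verdict = "long"
        else:
            verdict = "in range"
        report.append((count, verdict, passage[:60]))
    return report


if __name__ == "__main__":
    draft = "A short opener.\n\n" + " ".join(["word"] * 55)
    for count, verdict, preview in audit_passages(draft):
        print(f"{count:>4} words  {verdict:<9} {preview}")
```

A passage flagged "long" is not necessarily wrong; the audit only shows where a self-contained answer may be buried inside a larger block.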

The practical translation is a set of writing rules that have nothing to do with schema:

  • Lead every section with a self-contained answer of roughly 40–75 words that resolves the question implied by the heading without requiring the reader to have read anything above it.
  • Convert any comparison, spec sheet, or "best for" judgment into an actual HTML table, not a paragraph describing the table.
  • Replace vague claims with specific numbers. One real statistic outperforms three sentences of confident adjectives.
  • Attribute. Name the source of every quote and stat, because attribution is what lets the model stand behind the lift.
  • Format real questions as real Q&A, matching the natural-language phrasing people actually type into an AI, because conversational Q&A mirrors the prompt structure the engine is resolving.

One caution on Q&A, to stay honest about the evidence. The format of question-and-answer prose helps because it mirrors prompts. The FAQ schema markup wrapped around it does not reliably help: SE Ranking's AI Mode analysis found FAQ schema had no measurable impact on AI citations, and Google has restricted FAQ rich results to a narrow set of authoritative government and health sites since 2023. Write the Q&A. Do not assume the schema around it is doing the work. This is the schema-versus-structure distinction in miniature: the readable, extractable text earns the citation; the markup around it is, at best, neutral.

Chapter 04

Entity presence: the structured proof that lives off your page

Structured proof is not only an on-page discipline. The other half lives in whether the machine recognizes your brand, your people, and your concepts as entities it already trusts. This is where the knowledge graph comes in, and it is consistently the most under-built lever in the GEO stack.

Generative engines do not just match strings; they resolve entities. When someone asks an AI about your category, the model is reasoning over a graph of people, companies, and concepts it has already mapped. If your brand is a strongly defined node in that graph, you are a candidate for the answer. If you are an ambiguous string the model cannot disambiguate, you are not. Google's Knowledge Graph now holds over 500 billion facts on more than 5 billion entities, and Gemini is trained directly on it, which means entity establishment is no longer a large-brand specialty. It is the substrate that decides whether you are even eligible to be cited.

The correlation evidence is striking. In 2026 entity-authority analyses, brand mentions showed a 0.664 correlation with AI Overview visibility, versus 0.218 for backlinks — entity authority is roughly three times more predictive of AI visibility than the classic link signal. This is the same pattern the CXL research captured when it said search engines are no longer just ranking pages, they are ranking who is behind the pages. Mentions are the new backlinks, and the entity is the asset.

The model cites entities it can resolve. If it cannot tell who or what you are with confidence, no amount of on-page schema rescues you. Disambiguation is the precondition for citation.

The work of entity optimization has a clear priority order, and the first three steps cost almost nothing but implementation time:

  • Entity home. A single canonical "about" surface — an About page, a fully populated Organization profile — that consistently states who you are, what you do, and how you connect to your people and products.
  • Wikidata entry. A structured, machine-readable record of your brand as an entity, with properties linking to your concepts. Wikidata feeds Google's Knowledge Graph and is one of the cheapest ways to establish a disambiguated node. Implementing mentions markup pointing to Wikidata entities for named concepts helps engines resolve relevance faster.
  • sameAs linking. The sameAs field in Organization and Person schema is the one piece of schema that earns its keep here, because it does carry proof: it ties your brand to Wikipedia, LinkedIn, and other trusted nodes, resolving the "Apple the company vs. apple the fruit" ambiguity that otherwise sinks you.
  • Entity linking in content, then PR and earned mentions that reinforce the node across the wider web.

Note what is happening. The sameAs and mentions fields are schema, but they are the proof-bearing kind, not the decorative kind. They do not describe your page; they assert verifiable relationships between entities. That is why they belong in the entity layer and not in the "schema does nothing" bucket. Entity signals, consistently applied, typically trigger a knowledge panel within 3–6 months — a visible confirmation that the machine now recognizes you as a resolved entity.
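The entity fields described above can be sketched as JSON-LD. The example uses the standard schema.org properties `sameAs` (on Organization and Person) and `mentions` (on Article). Every URL, name, and Wikidata ID below is a placeholder that you would swap for your own verified profiles, and `jsonld_script` is a hypothetical helper.

```python
import json

# Placeholder identity data; replace with your real, verified profile URLs.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://www.example.com/#organization",
    "name": "Example Co",
    "url": "https://www.example.com/",
    "sameAs": [
        "https://en.wikipedia.org/wiki/Example_Co",       # placeholder
        "https://www.wikidata.org/wiki/Q00000000",        # placeholder ID
        "https://www.linkedin.com/company/example-co",    # placeholder
    ],
}

article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How onboarding time was cut by 28%",
    "publisher": {"@id": "https://www.example.com/#organization"},
    "author": {
        "@type": "Person",
        "name": "Jane Example",
        "sameAs": ["https://www.linkedin.com/in/jane-example"],  # placeholder
    },
    # Named concepts on the page, each tied to an external entity record.
    "mentions": [
        {
            "@type": "Thing",
            "name": "Customer onboarding",
            "sameAs": "https://www.wikidata.org/wiki/Q00000001",  # placeholder ID
        }
    ],
}


def jsonld_script(data) -> str:
    """Wrap a JSON-LD object in the script tag used to embed it in HTML."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


if __name__ == "__main__":
    print(jsonld_script(organization))
    print(jsonld_script(article))
```

Pointing `publisher` at the Organization's `@id` keeps the two blocks describing one entity instead of two similar-looking ones.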

For people, the same logic applies through Person/Author schema and cross-platform identity consistency: the same name spelling, the same bio, the same headshot, the same credentials everywhere. Pages with no identifiable author are treated as lower-trust, and in YMYL categories the model actively looks for who is behind the claim before it will repeat it.

Chapter 05

Technical accessibility: can the machine see your proof at all?

You can have the best statistics, the cleanest tables, and a perfectly resolved entity, and still earn nothing if the crawler cannot read the page. The accessibility layer is unglamorous and decisive, and 2026 produced two clear verdicts: one standard to ignore, and one failure mode to fix immediately.

The standard to ignore, for now, is llms.txt. The proposed file — a curated Markdown index of your important content, placed at the site root — has had eighteen months of industry hype and almost no machine reality. SE Ranking's 2026 analysis of 300,000 domains found a 10.13% adoption rate, roughly one site in ten. But adoption is not the same as use. Monitoring of over 500 million AI bot visits across a 90-day window found only 408 that targeted llms.txt directly — a rounding error. Google's Gary Illyes confirmed in 2025 that Google does not support llms.txt and is not planning to, and John Mueller compared it to the discredited keywords meta tag, noting you can tell from server logs the AI services do not even check for it. Search Engine Land found 8 of 9 sites saw no measurable traffic change after implementing it. The verdict for 2026 is unchanged: llms.txt is optional experimentation, not a priority. Spend the hour elsewhere.

The failure mode to fix immediately is JavaScript-dependent content. Most AI crawlers read basic HTML and do not execute JavaScript. Vercel's crawler analysis confirmed the pattern: content that only appears after client-side rendering is, for practical purposes, invisible to the model. This is the single most common way good proof gets hidden. A page can render a beautiful comparison table, a precise statistic, and a complete answer for a human — all injected client-side — while the crawler sees an empty shell. The fix is server-side rendering or pre-rendering of all proof-bearing content. If your statistics, tables, specs, and answers do not exist in the raw HTML, assume the engine never saw them.

The order of operations is brutal but simple. First the crawler has to read the page. Then it has to find the proof. Then it has to be able to lift it cleanly. Skip step one and the rest is theater.
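A quick way to test step one is to check whether your proof strings exist in the HTML as served, before any JavaScript runs. The sketch below uses Python's standard `html.parser` and ignores text inside `script`, `style`, and `template` elements, on the assumption that a non-rendering crawler does not treat that content as page text. It takes HTML as a string; fetching the raw response (for example with a plain HTTP client that does not execute scripts) is left to the caller.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text that sits outside script, style, and template elements."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def _normalize(text):
    """Lowercase and collapse whitespace so matching is not layout-sensitive."""
    return " ".join(text.split()).lower()


def missing_from_raw_html(raw_html, proof_snippets):
    """Return the snippets that do not appear in the unrendered page text."""
    parser = VisibleTextExtractor()
    parser.feed(raw_html)
    parser.close()
    page_text = _normalize(" ".join(parser.chunks))
    return [s for s in proof_snippets if _normalize(s) not in page_text]


if __name__ == "__main__":
    shell = '<div id="root"></div><script>render({"stat": "28%"})</script>'
    print(missing_from_raw_html(shell, ["28%"]))
```

Any snippet returned by `missing_from_raw_html` is a candidate for server-side rendering or pre-rendering.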

The rest of the accessibility checklist is well established and worth verifying:

  • Do not block search crawlers. The distinction between training crawlers (GPTBot, Google-Extended, CCBot) and search crawlers (OAI-SearchBot, PerplexityBot, ChatGPT-User) is critical. Blocking training crawlers does not measurably harm visibility; blocking search crawlers removes you from AI answers entirely. Check robots.txt for accidental disallows.
  • Keep dateModified honest and current. The JSON-LD freshness field is one of the few schema fields with a clear job, and AI systems lean on recency. Pages not updated in 90+ days see citation rates fall.
  • Ship a clean sitemap with <lastmod> dates. Optimized sitemaps correlate with faster AI platform recognition; they are a structured discovery signal even though crawlers no longer depend on them.
  • Avoid nosnippet on valuable pages. It prevents your content from being used in AI Overviews — you are opting out of the citation you want.
  • Put critical paths in plain HTML, not JavaScript menus. AI bots cannot interact with JavaScript menus or tabbed widgets; anything hidden behind a click is at risk of being unseen.

None of this is about being clever. It is about removing the silent ways your proof becomes invisible.
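The robots.txt check in the list above can be sketched with Python's standard `urllib.robotparser`. The crawler names are the ones listed in the text; the robots.txt content and the URL are invented for illustration, and the standard-library parser may not match every crawler's own interpretation of the rules.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens as named in the text above.
SEARCH_CRAWLERS = ["OAI-SearchBot", "PerplexityBot", "ChatGPT-User"]
TRAINING_CRAWLERS = ["GPTBot", "Google-Extended", "CCBot"]


def crawler_access(robots_txt, url, agents):
    """Map each user-agent token to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


def blocked_search_crawlers(robots_txt, url):
    """List search crawlers that this robots.txt would keep away from url."""
    access = crawler_access(robots_txt, url, SEARCH_CRAWLERS)
    return [agent for agent, allowed in access.items() if not allowed]


if __name__ == "__main__":
    # Invented example with an accidental block on a search crawler.
    sample = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""
    print(blocked_search_crawlers(sample, "https://www.example.com/pricing"))
```

In this sample the GPTBot block is a deliberate training opt-out, while the PerplexityBot block is the kind of accidental disallow the audit is meant to catch.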

Chapter 06

The prioritized execution checklist

The mistake most teams make is sequencing this work by what is easy to ship rather than by what the evidence supports. Schema is easy to ship, so it goes first, and then teams are puzzled when citations do not move. Below is the sequence ordered by evidence strength and leverage. Do it in this order.

Tier 1 — Make the proof extractable (highest evidence, highest leverage).

  • Add a real statistic to every page that makes a claim. Princeton put the lift at up to 41%, and it is the most replicated finding in GEO. If you have proprietary data, lead with it; original data earns roughly 4x the citations of generic content and cannot be paraphrased away.
  • Convert every comparison into an HTML table. Tables are cited 2.5x more than plain text, comparison tables 2.8x. This is among the cheapest high-yield changes available.
  • Rewrite section openings as self-contained 40–75 word answers. Passages in that range are cited 3.1x more than longer ones. Front-load the answer; put depth below it.
  • Attribute every stat and quote to a named source. Quotations lifted citations 28% in the Princeton study; source citations lifted lower-ranked pages by up to 115%.

Tier 2 — Establish the entity (high evidence, compounding returns).

  • Build the entity home, create or claim a Wikidata entry, and implement sameAs and mentions schema linking your brand and people to trusted nodes. Brand mentions correlate with AI visibility at 0.664 versus 0.218 for backlinks.
  • Standardize author identity across every surface — same name, bio, headshot, credentials — and apply Person schema. Expect a knowledge panel within 3–6 months of consistent signals.

Tier 3 — Clear the accessibility blockers (necessary, not sufficient).

  • Server-side render all proof-bearing content. If statistics, tables, and answers only appear after JavaScript, the crawler does not see them.
  • Audit robots.txt to confirm search crawlers (OAI-SearchBot, PerplexityBot) are allowed.
  • Keep dateModified current and refresh high-value pages on a 90-day cycle; stale pages lose citations.

Tier 4 — Schema and standards (table stakes, do it, don't over-invest).

  • Implement Product, Review, and other attribute-rich schema populated with real pricing, ratings, and specs — the only schema type the Fischman study found carries citation lift (61.7% vs. 41.6% generic).
  • Keep generic Article and Organization schema for ranking and rich-result reasons, but do not expect it to move AI citations on its own. The Ahrefs test put that effect at roughly zero.
  • Skip llms.txt unless you are explicitly running an experiment. AI crawlers fetched it in 408 of 500M+ monitored visits.

The teams that win are not the ones with the most schema. They are the ones whose facts are written in shapes a machine can lift, attached to an entity the machine recognizes, on pages the machine can actually read.

The through-line of every credible 2026 study is the same. Readability is necessary and nearly universal. Extractability is rare and decisive. Schema, despite being the loudest item on most checklists, is the weakest direct lever in the set, useful mainly where it stops describing your page and starts carrying proof. The statistic, the table, the tight passage, the named source, and the resolved entity are the structured proof that actually gets quoted. Build those, in that order, and the citations follow.

Sources cited

  1. Ahrefs (May 2026), "We Tracked 1,885 Pages Adding Schema": controlled study of 1,885 schema-adding pages vs. 4,000 controls; citation deltas of +2.4% (AI Mode), +2.2% (ChatGPT), -4.6% (AI Overviews). Also the 0.04 word-count/citation correlation and 53.4% of citations under 1,000 words.
  2. Princeton / Aggarwal et al. (KDD 2024), "GEO: Generative Engine Optimization": +41% visibility from statistics, +28% from quotations, up to +115% for source citations on lower-ranked pages.
  3. Kurt Fischman / SSRN (2026), "Does Schema Markup Predict AI Citation?": 730 citations across ChatGPT and Gemini; generic schema no advantage, attribute-rich Product/Review schema cited 61.7% vs. 41.6%; Google-control schema-enrichment artifact.
  4. SE Ranking (2026): llms.txt adoption study across 300,000 domains (10.13% adoption); 408 of 500M+ AI bot visits targeting llms.txt; FAQ schema no measurable AI Mode impact.
  5. SISTRIX (2025): Top 100 most-cited websites analysis; "data collection (tables/comparison tools)" as one of four dominant cited formats.
  6. Multiple 2026 GEO analyses (Discovered Labs, Digital Bloom, Presence AI): content with tables cited ~2.5x and comparison tables ~2.8x more than plain text.
  7. 2025 analysis of 10,000 AI citations: passages of 40–75 words cited 3.1x more than longer passages, 2.4x more than shorter.
  8. 2026 entity-authority analyses (Digital Applied, upGrowth): brand mentions 0.664 vs. backlinks 0.218 correlation with AI Overview visibility; 92% of citations from top-10 pages; Knowledge Graph 500B+ facts on 5B+ entities; 3–6 month knowledge-panel timeline.
  9. Microsoft / Google (2025): Bing/Copilot use of structured data; Google statement that structured data "gives an advantage in search results."
  10. Google (Illyes, Mueller, 2025): confirmation that Google does not support or check llms.txt.
  11. Vercel (2025): AI crawler analysis showing most crawlers do not execute JavaScript; SSR required for crawlability.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) shows exactly which of your pages carry extractable proof, where your entity is unresolved in the knowledge graph, and which JavaScript and crawler issues are hiding your best content from the engines.


Citation acquisition: the off-site system for becoming the source AI cites

Most of what AI engines say about your brand is shaped by content you do not own. Earned media, listicles, reviews, and reference entries carry the overwhelming majority of the signal. This is the playbook for engineering those mentions on purpose, and the case for why earning beats buying every time.

84%
Share of AI citations that come from earned media, not owned content (Muck Rack, 25M+ links)
6.5×
How much more likely a strong off-site presence is to earn AI visibility than owned content (AirOps)
Roughly 3×
How much more strongly unlinked mentions correlate with AI citations than backlinks do (Ahrefs, 75,000 brands)
0.3%
Share of AI citations traceable to paid or advertorial content (Muck Rack)
Chapter 01

The mention is the signal, and you do not own most of it

Start with the fact that reorganizes the entire budget. When an AI engine answers a question about your category, the language it draws from is overwhelmingly not yours.

Muck Rack's Generative Pulse study analyzed more than 25 million links cited by ChatGPT, Claude, and Gemini across 17 industries and found that earned media drives 84% of AI citations. Journalism alone accounts for 27% of cited sources. Paid and advertorial content, the stuff you can buy, accounts for 0.3% (Muck Rack, 2026). And the pattern is not a single-snapshot fluke. Across three editions of the study going back to July 2025, earned media has held in a tight band of 82% to 89%, and journalism citations have stayed between 25% and 27% (Muck Rack). The engines are not experimenting. This is how they source.

AirOps' 2026 State of AI Search reaches the same conclusion from the brand side: brands are 6.5× more likely to earn AI visibility through a strong off-site presence than through their owned content, and roughly 85% of brand mentions in early commercial discovery come from external domains (AirOps; Nobori). Your homepage is not where the answer gets written. It gets written on the sites that talk about you.

A brand describing itself is recognized by the model as self-interested. A third party describing the brand is recognized as a credible signal. That asymmetry is the entire reason earned media dominates AI citations.

This is the contrast that runs through every FancyAI report, applied to the off-site world. AI does not rank pages; it recommends entities, and it builds its confidence in an entity from the web of independent references that surround it. The mention is the signal. Citation acquisition is the discipline of manufacturing those mentions deliberately, ethically, and at scale, because if you leave them to chance, the engines will describe you using whatever the open web happens to have said.

Chapter 02

Why earned mentions outweigh links

For two decades, the off-site currency was the backlink. In AI search, it is the mention, and the data separating the two is now unambiguous.

Ahrefs studied 75,000 brands and measured what actually correlates with appearing in AI Overviews. The top three factors are all off-site brand signals, and none of them is a backlink. Unlinked brand web mentions correlate with AI citations at 0.664. Backlinks correlate at only 0.218 (Ahrefs, 2026). That is roughly a 3× gap in favor of mentions. Brand search volume (0.392) and branded anchor text (0.527) also outrank raw backlinks. The marginal dollar of off-site budget buys more AI visibility when it is spent earning conversations than when it is spent placing links.

The reason is mechanical. AI engines understand the web through language and entity relationships, not through a link graph. A sentence in a trusted publication that reads "we moved to Acme and cut onboarding time in half" carries entity signal whether or not the word "Acme" is hyperlinked. The model reads the co-occurrence: brand, plus category, plus outcome, plus the authority of the page it sits on. The hyperlink is optional decoration. The mention is the payload.

This does not mean links are worthless. They still help AI crawlers discover your content, they still move organic rankings that feed retrieval, and SE Ranking found that backlinks remain a top-3 factor for Google AI Mode citations specifically. But the hierarchy has inverted. Andrew Holland, writing in Search Engine Land, put it directly: building a strong presence through brand mentions should be a top priority, "alongside (but potentially more important than) traditional link building" (Search Engine Land, 2025). And not all links are equal even within link building. Editorial backlinks, the kind embedded in a journalist's sentence about you, carry rich, multi-layered trust signals: topical relevance, E-E-A-T cues, entity relationships. Directory and sidebar placements carry none of that, and AI systems discard them as noise (Authority Builders, 2025).

You are not building a link profile anymore. You are building a record of being named, in context, by people who are not you. The link is a nice-to-have. The named, contextual mention is the asset.

The execution consequence is liberating. You no longer need a webmaster to grant a follow link. You need a writer, an analyst, or a reviewer to mention your brand by name in a relevant, true sentence on a page the engines trust. That is a lower bar, a more honest one, and it is where 84% of the citation signal actually lives.

Chapter 03

The hierarchy of citation sources

Not all earned mentions are worth the same. AI engines weight sources by a rough hierarchy of independence and authority, and your targeting should follow it. Based on the cross-platform citation data, here is the order that matters in 2026.

Tier 1 — Editorial journalism. Coverage in established news and trade press is the single most valuable citation type. Journalism alone is 27% of all AI citations (Muck Rack), and within ChatGPT specifically, news outlets like Reuters and the Financial Times sit near the top of the cited-domain list. A pickup by an editorial wire compounds, because secondary outlets re-report it, multiplying your mention count across independent domains.

Tier 2 — Reference and knowledge bases. Wikipedia is ChatGPT's single most-cited domain, accounting for 47.9% of citations among ChatGPT's top-ten sources in one analysis of 680 million citations (Aug 2024–June 2025), and ChatGPT cites Wikipedia in 2.49% of all its responses (Profound; qvery.ai). For factual questions about a founder, product, or category, a Wikipedia presence is effectively a prerequisite for appearing in the answer. Wikidata sits one rung below with a lower notability bar and feeds backend knowledge graphs.

Tier 3 — Listicles, roundups, and "best of" pages. When a buyer asks "what's the best tool for X," the engine almost never returns a vendor homepage. It returns a synthesized list drawn from third-party rankings. Listicles are the single most common citation type in AI Mode, ChatGPT, and Perplexity at 21.9%, ahead of articles (16.7%) and product pages (13.7%) (AirOps). Nearly 90% of off-site brand mentions originate from listicles, comparison pages, and review roundups (AirOps).

Tier 4 — Third-party reviews and directories. For B2B especially, review platforms are an engine of validation. Companies with active profiles on at least two review sites are 3.4× more likely to be mentioned in ChatGPT than those without (AISO System; EPC Group). G2 carries the single highest influence for software queries at 22.4%, and after its January 2026 acquisition of Capterra, Software Advice, and GetApp from Gartner, G2 now controls an estimated 55–58% of global software review influence (Omniscient Digital; EPC Group).

Tier 5 — Community discussion. Reddit is the most-cited single domain across AI models at 40.1% citation frequency (Visual Capitalist / Profound). It is enormously important, but it operates by its own rules and is covered fully in a separate FancyAI playbook. Treat it as its own discipline, not a sub-task of PR.

The hierarchy is not arbitrary. It tracks independence. A journalist has no stake in your success, a Wikipedia editor enforces neutrality, a reviewer paid you nothing. The further a source sits from your marketing department, the more an AI engine trusts it.

Buyer journey also shifts the weighting. The citation data shows earned media dominating early, top-of-funnel discovery, user-generated content and reviews rising in the middle as buyers compare options, and owned and competitor content entering only late, for product specifics (xfunnel, 2025). A complete citation strategy covers all five tiers, but it front-loads Tiers 1 through 3, because that is where the category-defining questions get answered.

Chapter 04

The digital-PR-for-AI playbook

Here is the executional core. Each of these motions earns the named, contextual mentions the engines reward, ordered by leverage.

1. Run data-driven PR. Original research is the highest-yield earned-media play, because journalists need data and AI engines reward statistics. The Princeton GEO study (Aggarwal et al.) found that adding statistics lifted AI visibility by up to 41% and adding citations by 30%. Commission a survey, publish a benchmark, release a proprietary index. A single original dataset can earn dozens of editorial pickups, each one an independent mention on a trusted domain, and the secondary reporting compounds the effect across the web.

2. Source yourself into journalists' stories. Reporter-query platforms turn expert commentary into earned mentions at volume. HARO (relaunched as a free service by Featured in 2025), Qwoted, Featured, and Source of Sources connect your subject-matter experts to reporters at outlets from niche trade blogs to Business Insider and Entrepreneur. Being quoted in a TechCrunch or Forbes piece does double duty: it builds ranking authority and supplies the exact kind of editorial brand mention AI engines parse to decide who gets cited (Muck Rack; VisibilityStack). Assign named experts, respond fast, and make the quote genuinely useful.

3. Win placement in the listicles and roundups. Because listicles are the most-cited format and 80% of mentioned brands appear in the first three positions of those pages (AirOps), getting included, and ranked well, in third-party "best of" content is among the highest-leverage off-site moves available. Identify the roundups the engines already cite for your category, then earn inclusion the honest way: pitch the writer with a genuine differentiator, supply data and a clear use-case, and make your product easy to evaluate. Do not pay for placement on pages that disclose it as sponsored. The engines discount advertorial, and paid content is 0.3% of citations for a reason.

4. Build the secondary sources that make you Wikipedia-notable. You cannot, and must not, write your own Wikipedia page; conflict-of-interest rules will get it deleted. Notability is earned through significant coverage in independent, reliable sources (Presenc AI). So the path to Wikipedia runs through Tier 1: earn the journalism, the academic citations, and the industry recognition first, and the encyclopedia entry becomes defensible. In parallel, create a well-sourced Wikidata item, which has a far lower bar and feeds the knowledge graphs that resolve your brand as an entity across languages.

5. Earn third-party validation: reviews, awards, analyst recognition. Cultivate active profiles on the review platforms that matter for your vertical, G2 and Capterra for B2B SaaS, Trustpilot for consumer, because review presence carries a measured 3.4× citation multiplier (AISO System). Pursue legitimate industry awards and analyst mentions. Each is an independent third-party endorsement the engine can attribute to someone other than you.

6. Publish on respected niche platforms for speed. Contributed articles on high-trust industry sites, and posts on LinkedIn from individual experts, can surface in AI answers within hours (Search Engine Land). These are not the most authoritative tier, but they are the fastest, and they widen the surface area of contexts in which your brand co-occurs with your category.

Digital PR for AI is the highest-synergy motion in the off-site toolkit. One well-placed piece of original research earns links, brand mentions, and entity signals simultaneously, on the exact tier of sources the engines weight most heavily.

Chapter 05

How to prioritize targets: the gap analysis

You cannot pursue every source. The disciplined way to choose is citation gap analysis: find the prompts where your competitors are cited and you are not, then work backward to the specific sources doing the citing.

Sort your category's prompts into four buckets, according to how the engines currently treat your brand on each one (Similarweb; Peec AI):

  1. Not found and not cited. The engine does not know you exist for this topic. The fix is foundational: earn baseline coverage and entity recognition.
  2. Found but not cited. The engine can see you but chooses someone else. This bucket is where gap analysis pays. A competitor holds the listicle slot, the review-site authority, or the editorial mention you lack. Target it directly.
  3. Found and sometimes cited. You appear inconsistently. Reinforce with more independent mentions until the engine's confidence stabilizes.
  4. Found and often cited. Defend the position. Early movers earning citations are increasingly hard to displace, so protect the sources that win for you.

Run the analysis against a defined prompt set built from the real questions buyers ask, across ChatGPT, Perplexity, Google AI Mode, and Gemini, because their source preferences diverge sharply. AI Overviews and AI Mode share only 13.7% of their citations (Ahrefs). A source that wins you a ChatGPT citation may do nothing on Perplexity, which leans on expert review sites, or on Google, which leans on community and blogs. Prioritize sources that appear repeatedly across multiple engines for your category. Those are the highest-ROI targets, because one earned mention there pays off in more than one place.

The output is a ranked target list: the specific publications, listicles, review platforms, and reference entries that the engines already cite for your money prompts, where a competitor occupies the slot you want. That list, not a generic "do more PR," is the brief.
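
One way to produce that ranked list is to count, for each source, how many engines and prompts cite it for a competitor while never citing you. This is a sketch, not a description of any vendor's tool: the row shape `(engine, prompt, source, brand)` and the function name are assumptions, and the sort order (engines first, then prompts) simply encodes the advice above to favor sources that recur across multiple engines.

```python
from collections import defaultdict


def rank_target_sources(rows, our_brand):
    """Rank sources that cite competitors but never cite our_brand.

    rows: iterable of (engine, prompt, source, brand) tuples, one per
    observed citation (hypothetical export format).
    Returns a list of (source, engine_count, prompt_count), best target first.
    """
    engines = defaultdict(set)
    prompts = defaultdict(set)
    sources_citing_us = set()
    for engine, prompt, source, brand in rows:
        if brand == our_brand:
            sources_citing_us.add(source)
        else:
            engines[source].add(engine)
            prompts[source].add(prompt)
    targets = [
        (source, len(engines[source]), len(prompts[source]))
        for source in engines
        if source not in sources_citing_us
    ]
    # More engines first, then more prompts, then name for a stable order.
    targets.sort(key=lambda t: (-t[1], -t[2], t[0]))
    return targets
```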

Chapter 06

What does not work: the buying trap

A real industry is forming around shortcuts. It is worth being precise about why they fail, because the failure is structural, not a matter of getting caught.

Low-quality link building is dead weight. Private blog networks, mass directory submissions, and cheap link packages produce nothing for AI visibility and risk active penalties. Google's SpamBrain detection penalizes PBNs with deindexing, and AI systems discard directory and sidebar links as noise (rankz; Authority Builders). These tactics target a link graph the engines have already deprioritized. You are buying the wrong currency.

Paid and advertorial content barely registers. Recall the headline: paid placement is 0.3% of AI citations (Muck Rack). The engines are built to recognize and discount content a brand paid to publish about itself, for the same reason a human discounts an ad. A sponsored "best of" listing that discloses the sponsorship is, to the model, a brand describing itself through an intermediary. It carries near-zero independent signal.

Synthetic mentions and fake reviews collapse. Manufacturing posts, planting fake reviews, or astroturfing community threads is structurally weak in a system whose entire job is to triangulate across thousands of independent signals. A fabricated mention has nothing to corroborate it, no consistent author entity, no independent re-reporting. It is a single fragile artifact in a cross-checking system. Authentic influence compounds because it is corroborated; manufactured influence fails because it is not.

You cannot buy your way into AI citations the way you once bought links. The system is engineered to reward independence, and independence is, by definition, the one thing you cannot purchase. Earned is not the ethical alternative. It is the only thing that works.

The deeper point is the one in FancyAI's brand-authority research: you cannot game entity recognition the way you could game PageRank. PageRank counted links and could be flooded. Entity recognition cross-references mentions, context, authority, and consistency across the whole web. The shortcut economy is optimizing against the explicit design of the thing it is trying to influence.

Chapter 07

How to measure citation acquisition

Off-site work is hard to measure with link-era tools, which is why so many teams under-invest in the channel that matters most. Measure the right things.

Track mentions, not just links. Most legacy tools count clickable citations and miss the unlinked brand mentions that carry the bulk of the entity signal. SE Ranking's AI Visibility Tracker logs both, on the explicit logic that engines "reference brands by name far more often than they hand out clickable links," and that counting only links underestimates real presence "by a wide margin." Instrument for the named mention, hyperlinked or not.

Use share of citation, not share of voice. These are different metrics and conflating them is a costly error. Share of voice measures how often you appear across media relative to competitors. Share of citation measures whether AI engines actually choose your brand when answering category questions (AuthorityTech). They do not predict each other: Profound found 80% of sources cited by AI platforms do not appear in Google's top 10 for the same query. A strong SEO or PR footprint is not a proxy for AI citation. Measure the citation directly.

Measure citation velocity. Citation velocity is the rate at which your brand is mentioned, referenced, and validated across authoritative sources over time (Conductor). It is the truest read on whether your acquisition program is working, because a single measurement is noise. There is less than a 1-in-100 chance that asking the same question 100 times produces the same brand list (Wellows / Ahrefs). The signal is the trend across many prompts, run on a cadence.
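
Because velocity is a trend rather than a single reading, one simple way to estimate it is the least-squares slope of mention counts over equally spaced periods. This is an illustrative choice, not Conductor's definition; the function name and the weekly granularity are assumptions.

```python
def citation_velocity(mentions_per_period):
    """Least-squares slope of mention counts over equally spaced periods.

    A positive value means mentions are rising per period; a negative
    value means they are falling. Requires at least two periods.
    """
    n = len(mentions_per_period)
    if n < 2:
        raise ValueError("need at least two periods to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(mentions_per_period) / n
    numerator = sum(
        (x - mean_x) * (y - mean_y) for x, y in enumerate(mentions_per_period)
    )
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator
```

A series of weekly counts such as 10, 12, 14, 16 gives a slope of 2 mentions per week; a flat series gives 0.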

A practical measurement loop:

  • Baseline. Run your prompt set across ChatGPT, Perplexity, Gemini, and Google AI Mode. Record citation rate and unlinked mention rate for your brand and your top three competitors.
  • Map the sources. For every prompt where a competitor wins, record which specific source the engine cited. That is your target list from Chapter 05.
  • Instrument the lift. Watch whether your owned domain's citation rate rises as your off-site mention count grows. The off-site presence and the owned-domain citation rate move together, because earned validation raises the engine's confidence to name you from any source.
  • Re-run on a cadence. Weekly or biweekly. Citation patterns shift fast, and the trend is the truth.
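
The baseline step of this loop can be sketched as a pair of rates per brand. The `Run` record and its field names are hypothetical; the point is only that linked citations and unlinked named mentions are counted separately, so the unlinked share is not lost.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Run:
    """One answer from one engine for one prompt (hypothetical schema)."""
    engine: str
    prompt: str
    named: FrozenSet[str]   # brands named anywhere in the answer text
    linked: FrozenSet[str]  # brands cited with a clickable link


def baseline(runs: List[Run], brands: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Citation rate and unlinked mention rate per brand, as shares of all runs."""
    total = len(runs)
    if total == 0:
        raise ValueError("no runs recorded")
    report = {}
    for brand in brands:
        cited = sum(1 for r in runs if brand in r.linked)
        unlinked = sum(1 for r in runs if brand in r.named and brand not in r.linked)
        report[brand] = {
            "citation_rate": cited / total,
            "unlinked_mention_rate": unlinked / total,
        }
    return report
```

Re-running the same function on each cadence, against the same prompt set, yields the trend line the loop depends on.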

The wrong metric is "how many links did we build." The right metric is "are the engines naming us more often, across more independent sources, when buyers ask the category question." Measure the recommendation, not the link.

Chapter 08

The citation-acquisition playbook, in order

Sequenced by leverage, here is the execution order.

  1. Run the gap analysis first. Define your prompt set, run it across all four major engines, and build the ranked list of specific sources where competitors are cited and you are not. Work the list, not a vague PR brief.
  2. Lead with original data. Commission research, publish a benchmark, release an index. It is the highest-yield earned-media motion and the one journalists most want to cite. Statistics lift AI visibility ~41% on their own.
  3. Source your experts into the press. Use HARO, Qwoted, and Featured to turn named experts into editorial mentions in trusted outlets, the single most-weighted citation tier.
  4. Win the listicles and roundups. Earn inclusion, and good placement, in the third-party "best of" content the engines already cite. Top-three position is where the citations cluster.
  5. Build third-party validation. Maintain active review profiles on the platforms that matter for your vertical, pursue legitimate awards, and earn analyst recognition. Each carries a measured citation multiplier.
  6. Earn your way onto Wikipedia and into Wikidata. Build the independent secondary coverage that makes you notable, then let the encyclopedia entry follow. Create a sourced Wikidata item in parallel.
  7. Never buy the shortcut. No PBNs, no paid listicle slots, no synthetic mentions. They are the wrong currency, paid placement carries just 0.3% of citations, and all three are structurally doomed in a system built to reward independence.
  8. Measure share of citation and velocity. Track named mentions, not just links, across a prompt set on a recurring cadence, and watch the lift on your owned domain.

The throughline is the one FancyAI has documented across every platform and vertical. The engines recommend; they do not rank. The mention is the signal. And in citation acquisition, the only mention worth having is the one an independent source made because you earned it. You cannot buy the thing the engines are built to trust. You can only become worth citing, deliberately and at scale, and then measure whether they noticed.

Sources cited

  1. Muck Rack (Generative Pulse, May 2026): analysis of 25M+ links across ChatGPT, Claude, and Gemini in 17 industries; earned media drives 84% of AI citations, journalism 27%, paid/advertorial 0.3%; earned media held at 82–89% across three editions since July 2025.
  2. AirOps (2026 State of AI Search): off-site presence 6.5× more likely to earn AI visibility than owned content; ~85% of early commercial brand mentions from external domains; listicles the most-cited content type at 21.9% (vs articles 16.7%, product pages 13.7%); ~90% of off-site mentions from listicles/comparisons/roundups; 80% of mentioned brands in top-three positions.
  3. Ahrefs (75,000-brand study, 2026): unlinked brand mentions correlate with AI citations at 0.664 vs 0.218 for backlinks (~3× gap); brand anchors 0.527, brand search volume 0.392; AI Overviews and AI Mode share only 13.7% of citations.
  4. Search Engine Land (Andrew Holland, 2025): brand mentions a top priority "alongside (but potentially more important than) traditional link building"; niche-site and LinkedIn posts can surface in AI answers within hours.
  5. SE Ranking: AI Visibility Tracker logs unlinked mentions alongside links; backlinks remain a top-3 factor for Google AI Mode citations.
  6. Authority Builders (2025): editorial backlinks carry rich trust signals; directory/sidebar placements discarded as noise by AI systems.
  7. Profound / qvery.ai: Wikipedia is ChatGPT's most-cited domain, 47.9% of ChatGPT's top-ten citations across 680M citations (Aug 2024–June 2025), cited in 2.49% of all ChatGPT responses; 80% of AI-cited sources absent from Google's top 10.
  8. Presenc AI (2026): Wikipedia notability requires significant independent coverage; Wikidata a lower-bar entry point feeding knowledge graphs.
  9. AISO System / EPC Group (2026): review profiles on 2+ platforms yield 3.4× higher ChatGPT mention likelihood; G2 at 22.4% influence for software queries.
  10. Omniscient Digital / EPC Group: G2's Q1 2026 acquisition of Capterra, Software Advice, and GetApp from Gartner; G2 controls an estimated 55–58% of global software review influence.
  11. Princeton GEO study (Aggarwal et al.): adding statistics lifts AI visibility ~41%, adding citations ~30%.
  12. Muck Rack / VisibilityStack: reporter-query platforms (HARO/Featured, Qwoted, Source of Sources) as expert-sourcing pathways for earned mentions; HARO relaunched free by Featured in 2025.
  13. Visual Capitalist / Profound: Reddit the most-cited single domain at 40.1% citation frequency (covered in a separate FancyAI community playbook).
  14. xfunnel (2025): earned media dominates early-funnel citations; UGC rises mid-funnel; owned/competitor content enters late.
  15. Similarweb / Peec AI: four-bucket citation gap analysis framework; "found but not cited" as the highest-value target bucket.
  16. Conductor: citation velocity as the rate of brand mention and validation across authoritative sources.
  17. AuthorityTech: share of citation vs share of voice distinction.
  18. Wellows / Ahrefs: <1-in-100 chance the same prompt yields the same brand list across 100 runs.
  19. rankz / building backlinks: PBNs, directory spam, and cheap link packages penalized by Google SpamBrain and discarded in GEO.
  20. FancyAI KnowledgeBase corpus: citation-acquisition and link-building-for-AI research (entity recognition cannot be gamed; mentions outweigh links), compiled March 2026.

Want this measured against your brand?

The FancyAI AI Readiness Index measures whether ChatGPT, Perplexity, and Google name your brand, which off-site sources they trust to do it, and exactly where competitors are out-cited. Find out where you stand, then change it.


How to Measure AI Visibility: The Right Metrics for the Recommendation Era

Rank tracking was built for a world where the same query returned the same list. That world is gone. Ask an AI the same question twice and the odds of getting the identical list are under 1 in 100. This report sets the measurement standard for a system that recommends instead of ranks, and builds the four signals behind the FancyAI AI Readiness Index.

  • <1 in 100: odds of getting the same AI recommendation list twice from one prompt (SparkToro / Gumshoe, 2025)
  • 40–60%: of cited domains change month over month for identical questions (Profound, 2025)
  • 20–40: prompts in a representative tracking set, a brand panel rather than a keyword list (SE Ranking, 2026)
  • 2x: more buyers named generative AI their most meaningful research source vs. any other (Forrester, 2026)

Chapter 01

Why rank tracking breaks the moment you point it at an AI

For twenty years, SEO measurement rested on one quiet assumption: that the system was stable. You typed a query, the engine returned a ranked list, and that list held still long enough to be tracked. Position 4 today was position 4 tomorrow, give or take. The whole apparatus of rank tracking, share-of-search, and CTR-by-position depended on the answer being a fixed object you could check on a schedule.

Generative engines violate that assumption at the root. They do not retrieve a fixed list. They generate a fresh answer each time, token by token, and that generation is probabilistic. The same prompt produces a different answer on the next run, and the difference is not small jitter around a fixed list. The list itself changes.

The clearest evidence comes from SparkToro and Gumshoe, who ran the most comprehensive public test of AI recommendation consistency to date. Rand Fishkin and Patrick O'Donnell had 600 volunteers run 12 brand-recommendation prompts through ChatGPT, Claude, and Google's AI a combined 2,961 times in late 2025. The categories were ordinary commercial questions: chef's knives, headphones, cancer-care hospitals, digital marketing consultants, science fiction novels. The result is the single most important measurement finding of the year.

Across tools and prompts, the odds of getting the same list of recommended brands twice were under 1 in 100. The odds of getting that same list in the same order were closer to 1 in 1,000.

Sit with what that does to rank tracking. If a brand's "position" in an AI answer changes on essentially every run, then any single observation of position is a coin flip dressed up as a metric. A tool that reports "you rank #3 in ChatGPT for this prompt" is reporting the result of one die roll and calling it a standing. Run it again and you might be #1, #7, or absent. The position is not a property of your brand. It is a property of that one generation.

The instability is not confined to the order of brands. It reaches the sources too. Profound's 2025 volatility study measured "citation drift," the share of domains that appeared in a platform's answers one month but not the next for the same prompts. The drift is enormous: Google AI Overviews 59.3%, ChatGPT 54.1%, Microsoft Copilot 53.4%, Perplexity 40.5%. Roughly 40 to 60 percent of the domains an AI cites this month will be different next month, for identical questions. The ground under any citation-rank metric is moving by half every thirty days.
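
Taking the definition above literally, drift is a set difference. The sketch below assumes drift is measured as the share of last month's cited domains that are missing this month; Profound's exact methodology is not given here, so treat the function as an illustration of the idea rather than a reproduction of the study.

```python
def citation_drift(previous_domains, current_domains):
    """Share of previously cited domains that no longer appear.

    Both arguments are collections of domain names cited for the same
    prompts in two consecutive periods.
    """
    previous = set(previous_domains)
    current = set(current_domains)
    if not previous:
        raise ValueError("previous period has no cited domains")
    return len(previous - current) / len(previous)
```

If four domains were cited last month and two of them are gone this month, drift is 0.5, which is in the range the study reports.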

Two forces drive this. The first is non-determinism at inference: unless a model is run at temperature zero with strict deterministic settings, which commercial closed models almost never expose, the model introduces randomness when selecting each next token. The second is the retrieval layer: live web grounding means the corpus the model reads is itself changing as the web changes, as the index refreshes, and as the platform reweights its sources. You are tracking a moving target with a moving ruler.

This is why the central error of 2026 GEO measurement is importing the SEO scoreboard wholesale. Teams stand up an "AI rank tracker," watch the number jump around, and conclude either that the tool is broken or that AI search is unmeasurable. Neither is true. The tool is faithfully reporting a quantity that was never stable enough to track at the resolution of a single observation. The fix is not a better rank tracker. It is a different unit of measurement.

Chapter 02

From position to probability: what actually holds still

If a single answer is a coin flip, the way you measure a coin is not to flip it once and record "heads." You flip it many times and report the rate. The same move rescues AI visibility measurement. Stop asking "what position am I in," which is unstable by run, and start asking "across many runs of the prompts that matter, how often do I appear, and how prominently." That rate is stable even when every individual answer is not.
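
The coin analogy can be made concrete with a confidence interval on the appearance rate. The sketch below uses the Wilson score interval, a standard choice for proportions that is brought in here for illustration and is not named in the source; the function name and the 95% level (z = 1.96) are assumptions.

```python
import math


def presence_rate_interval(appearances: int, runs: int, z: float = 1.96):
    """Observed presence rate with a Wilson score interval.

    Returns (rate, lower, upper). z = 1.96 corresponds to roughly 95%.
    """
    if runs <= 0 or not 0 <= appearances <= runs:
        raise ValueError("need 0 <= appearances <= runs and runs > 0")
    p = appearances / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)
```

One appearance in one run gives an interval of roughly 21% to 100%, which says almost nothing. Thirty appearances in 100 runs gives roughly 22% to 40%, which is narrow enough to track.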

This is the conceptual core of the recommendation era. AI does not hand a buyer a ranked list to scroll. It hands them a short consideration set, composed on the spot, of the brands the model most strongly associates with that intent. Your job is not to hold a rank. It is to be in the set often enough that you are part of the buyer's shortlist before a human ever describes you. Appearing often beats appearing first, because there is no durable "first" to win.

The metrics that survive this shift are aggregate, not point-in-time. Four of them carry the load.

Presence rate, also called recommendation rate. The share of runs, across your tracked prompt set, in which your brand appears at all. AirOps frames the equivalent as the North Star: answers mentioning your brand divided by total relevant answers. This is the metric the SparkToro data validates, because while any one answer is random, presence rate across dozens or hundreds of runs is a real, repeatable signal of how strongly the model associates you with the intent. Benchmark from the measurement corpus: aim to appear in 30% or more of answers for your core category prompts; the strongest brands clear 50%.

Share of citation, or AI share of voice. Your presence relative to competitors, not in isolation. This is the metric the tooling has standardized on. Semrush defines AI share of voice using both how often your brand is mentioned and how prominently it sits within each answer, and updated the metric in October 2025 to weight prompts by how often they are actually searched. Ahrefs Brand Radar defines it as the percentage of brand impressions you capture out of total impressions across responses that mention any tracked brand, modeled over a database of 199 million search-backed prompts across six engines. Share of citation answers the only question an executive cares about: in the answers buyers see, what fraction of the recommendation is us versus them.

Sentiment. Presence is necessary but not sufficient. Being mentioned as a cautionary tale is not the same as being recommended. Sentiment tracks how positively or negatively the model frames you when it does mention you. The measurement corpus benchmark is 70% or more positive across platforms, with recurring negative themes flagged for content and PR response. Semrush's 2026 enterprise stack pushes this further with an LLM sentiment analyzer that simulates thousands of customer personas and reports the percentage of those conversations that ended in a recommendation for your brand versus a competitor. Sentiment is where presence converts to preference.

Source attribution. The list of domains the model actually pulled from when it built the answer. This is the bridge between measurement and action, because it tells you why you are present or absent. If competitors are being assembled from Reddit threads, G2 reviews, and a Wikipedia entry you do not have, your absence is explained and your fix is named. Attribution turns a score into a worklist.

Position asks where you placed in one answer. Presence, share of citation, sentiment, and attribution ask how the system thinks about you across thousands of answers. Only the second question has a stable answer.

Note the discipline these four share. Each is computed over a body of runs, not a single response. Each is a rate or a distribution, not a rank. That is what makes them robust to the non-determinism that destroys position tracking. You are no longer measuring the outcome of a die roll. You are measuring the loading of the die.
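
As a minimal sketch of what "computed over a body of runs" means, the function below derives all four metrics from a list of recorded answers. The `AnswerRun` schema is hypothetical, sentiment labels are assumed to come from some upstream classifier, and share of citation is simplified to the brand's mentions divided by all tracked-brand mentions, without the prominence or prompt-volume weighting that Semrush and Ahrefs apply.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AnswerRun:
    """One recorded AI answer (hypothetical schema)."""
    brands: List[str]          # tracked brands named in the answer
    sentiment: Dict[str, str]  # brand -> "positive" / "neutral" / "negative"
    sources: List[str]         # domains the answer drew from


def outcome_metrics(runs: List[AnswerRun], brand: str) -> dict:
    """Presence rate, share of citation, positive share, and top sources."""
    if not runs:
        raise ValueError("no runs recorded")
    present = [r for r in runs if brand in r.brands]
    all_mentions = sum(len(set(r.brands)) for r in runs)
    positive = sum(1 for r in present if r.sentiment.get(brand) == "positive")
    source_counts = Counter(domain for r in runs for domain in r.sources)
    return {
        "presence_rate": len(present) / len(runs),
        "share_of_citation": len(present) / all_mentions if all_mentions else 0.0,
        "positive_share": positive / len(present) if present else 0.0,
        "top_sources": source_counts.most_common(3),
    }
```

Every value returned is a rate or a distribution over the whole set of runs; none of them is a position in a single answer.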

Chapter 03

The prompt set is the instrument, not an afterthought

Every aggregate metric in the previous chapter inherits its validity from one thing: the set of prompts you run. Get the prompt set wrong and presence rate, share of citation, and sentiment all measure the wrong universe with perfect precision. The prompt set is the measurement instrument. It deserves the same rigor a survey methodologist gives a sampling frame.

The first principle is that a prompt set is a panel, not a keyword list. SE Ranking's 2026 guidance frames a tracking set as a representative panel of 20 to 40 prompts covering core topics, personas, and journey stages, functioning like a brand-tracking panel that yields directional visibility data. The number is deliberately modest. The goal is not coverage of every possible phrasing; it is a stable, repeatable sample you can run on a cadence and trust to move only when reality moves.

The second principle is that prompts are not keywords, and translating one to the other naively breaks the measurement. People do not type "best CRM" into an AI. They type full, qualified, conversational questions. The average conversational query in 2026 runs about 23 words, dense with the qualifiers that push a model out of explanation mode and into recommendation mode. "What's a good CRM for a 15-person agency that already uses Slack and hates long onboarding" is a prompt. "CRM software" is a keyword. Only the first one triggers the consideration-set behavior you are trying to measure.

The third principle is fan-out. Modern AI does not answer the prompt you typed. It decomposes it into sub-questions, retrieves against each, and synthesizes. The 2026 tracking practice reflects this: the majority of brand citations come from the secondary and tertiary fan-out branches, not the literal root query. Ahrefs Brand Radar models this explicitly, expanding seed prompts into related sub-questions using Google's People Also Ask corpus and a fan-out system before measuring. A prompt set that tracks only root queries measures the front door and misses the rooms where most citations are actually decided.

The practical construction follows from these principles:

  • Source prompts from real language. Pull phrasing from sales-call transcripts, support tickets, and community forums, not from a keyword tool. The words buyers actually use are where the recommendation behavior lives.
  • Span the journey. Cover problem-aware prompts ("how do I reduce onboarding churn"), category prompts ("best onboarding tools for B2B SaaS"), comparison prompts ("X vs. Y vs. Z"), and branded prompts ("is X any good"). Each stage exposes a different facet of presence and sentiment.
  • Include unbranded competitive prompts. Share of citation is only meaningful on prompts where you are not named. Branded prompts tell you how the model talks about you; unbranded prompts tell you whether the model reaches for you at all.
  • Run each prompt many times, per platform, per market. Because any single run is a coin flip, the unit of measurement is the rate across runs. AI answers also vary by geography and language, so multi-market brands must sample per country.
  • Hold the set stable. The panel only yields trend data if it stays fixed. Changing prompts every month makes month-over-month comparison meaningless. Version the set; change it deliberately and document when you do.
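
The construction rules above can be checked in code before a panel is frozen. This is a sketch under stated assumptions: the `Prompt` and `PromptPanel` types, the stage labels, and the content fingerprint are all hypothetical conveniences, not part of any cited methodology. The fingerprint covers prompt content only, so an unrecorded change under the same version label is detectable.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Tuple

# Journey stages taken from the list above; the labels are assumed names.
REQUIRED_STAGES = frozenset({"problem", "category", "comparison", "branded"})


@dataclass(frozen=True)
class Prompt:
    text: str
    stage: str     # one of REQUIRED_STAGES
    branded: bool  # True if the prompt names your brand


@dataclass(frozen=True)
class PromptPanel:
    version: str
    prompts: Tuple[Prompt, ...]

    def fingerprint(self) -> str:
        """Short hash of the prompt content, independent of the version label."""
        payload = json.dumps([(p.text, p.stage, p.branded) for p in self.prompts])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

    def validate(self, min_size: int = 20, max_size: int = 40) -> List[str]:
        """Return a list of problems; an empty list means the panel passes."""
        problems = []
        if not min_size <= len(self.prompts) <= max_size:
            problems.append(f"panel has {len(self.prompts)} prompts, expected {min_size}-{max_size}")
        missing = REQUIRED_STAGES - {p.stage for p in self.prompts}
        if missing:
            problems.append("missing journey stages: " + ", ".join(sorted(missing)))
        if all(p.branded for p in self.prompts):
            problems.append("no unbranded prompts")
        return problems
```

Storing the version and fingerprint alongside every batch of results makes it clear which readings are comparable month over month.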

A prompt set is a measurement instrument. Twenty to forty real, conversational, journey-spanning prompts, run repeatedly across platforms and markets, is the difference between a brand panel and a vanity dashboard.

This is also where most "AI visibility scores" quietly go wrong. A score computed over a thin or unrepresentative prompt set is precise and meaningless. The number can be tracked to two decimal places and still describe a universe your buyers never inhabit. The instrument has to be right before the readings can matter.

Chapter 04

The four signals behind the AI Readiness Index

Presence, share of citation, sentiment, and attribution tell you what is happening. They are the outcome layer: lagging, observed, and largely outside your direct control on any given run. They answer "are we visible." They do not answer "why," and they do not tell you what to change tomorrow. That gap is the entire reason the FancyAI AI Readiness Index exists.

ARI measures the inputs that cause visibility, not just the visibility itself. It is built from four signals, each chosen because it is something a team can actually execute against, and each grounded in the citation-factor evidence rather than in the volatile output.

Signal one: entity clarity. Generative engines resolve entities before they cite strings. The model has to know with confidence who you are, what category you belong to, and how you connect to your people and products, before it can reach for you in a consideration set. The evidence that this is the gating factor is direct: in 2026 entity-authority analyses, brand mentions correlate with AI visibility at 0.664 versus 0.218 for backlinks — roughly three times more predictive. Entity clarity measures whether you are a resolved node the model trusts or an ambiguous string it cannot place. If the model cannot tell who you are, nothing downstream matters.

Signal two: citation density. Visibility flows through the corpus the model reads, and that corpus is concentrated. LLMs typically cite only 2 to 7 domains per response, far fewer than Google's ten blue links; ChatGPT averages around five. Citation density measures how present your brand is across the specific third-party surfaces that feed those few slots — the review sites, the community threads, the reference pages the models actually pull from. The structural reality makes this decisive: 57% of branded-query citations go to reviews, listicles, forums, social media, and case studies, not to brand-owned pages. Density off your own domain is where most citations are won.

Signal three: structured proof. Once the model reaches your content, it has to be able to lift a claim cleanly. The Princeton GEO study established the levers: adding statistics lifted visibility by up to 41%, the single most effective technique tested; adding quotations from named sources lifted citations 28%; citing authoritative sources lifted lower-ranked pages by as much as 115%. Structured proof measures whether your facts are written in shapes a retrieval system can quote without rewriting — discrete statistics, clean tables, self-contained answer passages, named attribution. Prose has to be summarized. Proof can be quoted, and the quotable content is what gets cited.

Signal four: corroborating mentions. No model trusts a single source for a recommendation. The dominant new signal is co-occurrence: an analysis of 2.2 million prompts found that models now cross-reference multiple independent sources before citing, and roughly 60% of AI answers come from training data with no live web search at all, meaning brands not consistently mentioned across authoritative sources before the training cutoff are simply absent from the majority of answers. Corroborating mentions measure whether your brand shows up, consistently, across enough independent and authoritative places that the model treats you as a safe, well-attested recommendation rather than an unverified claim.

The outcome metrics tell you whether you are being recommended. ARI's four signals — entity clarity, citation density, structured proof, corroborating mentions — tell you why, and which one to fix first. One is a scoreboard. The other is a control panel.

The reason ARI is built on inputs is the same reason rank tracking fails on outputs. The outputs are volatile by design; you cannot pull a lever directly on a number that changes every run. But the inputs are stable and controllable. You can build the entity, earn the corroborating mentions, restructure the proof, and seed the surfaces that feed citation density. Move those, and the volatile outputs move with them, on average, over time. ARI measures the things you can actually change.

Chapter 05

A measurement framework teams can adopt

Putting this together yields a framework that any team can run without a research department. It has three layers — instrument, outcomes, drivers — and a cadence that matches how fast each layer actually moves.

Layer one: build and freeze the instrument. Construct a representative prompt set of 20 to 40 conversational, journey-spanning prompts, sourced from real buyer language, including unbranded competitive prompts and fan-out sub-questions. Decide your platforms (ChatGPT, Google AI Overviews and AI Mode, Perplexity, Gemini, Copilot, increasingly Grok) and your markets. Freeze it. Version it. This is your measurement instrument and it must hold still to be trusted.

Layer two: measure outcomes as rates, never as ranks. Run the prompt set repeatedly — many runs per prompt, per platform, per market — and report the four aggregate metrics:

  • Presence / recommendation rate — share of runs you appear in. Target 30%+ on core prompts, 50%+ to lead.
  • Share of citation — your presence versus competitors on unbranded prompts. The competitive number that travels to the executive team.
  • Sentiment — how you are framed when mentioned. Target 70%+ positive; flag recurring negatives.
  • Source attribution — the domains feeding each answer, which doubles as your action map.

Never report a single-run position as a metric. If a number cannot be expressed as a rate or distribution over many runs, it is a coin flip, not a measurement.

Layer three: diagnose with the four ARI signals. When an outcome metric is low, the input signals tell you why. Low presence with strong content usually means an entity-clarity or corroborating-mention gap — the model does not resolve you or does not trust you yet. Present-but-not-cited usually means a citation-density gap on the third-party surfaces that feed the slots. Cited-but-paraphrased-away usually means a structured-proof gap — your facts are not in liftable shapes. The diagnosis points straight at the lever.
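
The three diagnostic patterns in this layer amount to a lookup from symptom to signal. A minimal sketch follows; the symptom keys are hypothetical labels for the patterns just described, and since the text says each pattern "usually" indicates a gap, the result is a list of signals to review first, not a verdict.

```python
from typing import Dict, List

# Symptom keys are assumed labels for the three patterns described above.
DIAGNOSIS: Dict[str, List[str]] = {
    "low_presence_strong_content": ["entity clarity", "corroborating mentions"],
    "present_but_not_cited": ["citation density"],
    "cited_but_paraphrased_away": ["structured proof"],
}


def signals_to_review(symptom: str) -> List[str]:
    """Return the ARI input signals to examine first for a given symptom."""
    try:
        return list(DIAGNOSIS[symptom])
    except KeyError:
        raise ValueError(f"unknown symptom: {symptom!r}") from None
```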

The cadence follows the physics of each layer. Outcomes move slowly and noisily, so do not overreact to weekly wobble; the SparkToro and Profound data say a single bad week is mostly randomness. A sensible rhythm:

  • Weekly: presence rate, share of citation, sentiment, per platform. Watch trend, not single points.
  • Monthly: competitive share-of-citation deep dive, source-attribution analysis, ARI signal review.
  • Quarterly: full strategic audit. Benchmark: a 10%+ quarter-over-quarter improvement in visibility indicates the strategy is working.

A final honesty about testing. Classic A/B testing largely does not work here. Feedback loops are slow, model updates arrive in batches rather than continuously, and the output noise is large enough to swamp small effects. The practical substitute is before-and-after measurement on the ARI input signals — change the entity, the proof, the mentions — with enough time between measurement windows (4 to 8 weeks) to let the noise average out. You are not running split tests on the answer. You are improving the inputs and watching the rates move over quarters.
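
A before-and-after comparison can be sketched as two pooled presence rates and a rough check on whether the gap exceeds run-to-run noise. The two-proportion z statistic used here is a standard technique introduced for illustration, not something the source prescribes, and it treats runs as independent, which repeated runs of the same prompt only approximate. Function and field names are assumptions.

```python
import math


def before_after(before_hits: int, before_runs: int, after_hits: int, after_runs: int) -> dict:
    """Compare presence rates from two measurement windows.

    hits = runs in which the brand appeared; runs = total runs in the window.
    """
    if before_runs <= 0 or after_runs <= 0:
        raise ValueError("both windows need at least one run")
    p_before = before_hits / before_runs
    p_after = after_hits / after_runs
    pooled = (before_hits + after_hits) / (before_runs + after_runs)
    se = math.sqrt(pooled * (1 - pooled) * (1 / before_runs + 1 / after_runs))
    return {
        "before_rate": p_before,
        "after_rate": p_after,
        # None when there is no baseline presence to compare against.
        "relative_change": (p_after - p_before) / p_before if p_before > 0 else None,
        # |z| above about 2 suggests the change is unlikely to be noise alone.
        "z": (p_after - p_before) / se if se > 0 else 0.0,
    }
```

Moving from 60 appearances in 300 runs to 90 in 300 is a 50% relative lift with z near 2.8. The same lift measured on 30 runs per window would not clear the noise, which is the practical argument for long windows and many runs.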

Freeze the instrument so it holds still. Measure outcomes as rates, never ranks. Diagnose with the input signals. React on quarters, not weeks. That is the entire discipline.

Chapter 06

Why this is the standard, not a preference

It would be easy to read all of this as one vendor's opinion about which dashboard to buy. It is not. It is the only measurement model consistent with how these systems actually behave, and the behavior is now thoroughly documented.

The case is short. AI generation is non-deterministic, so any single answer is one sample from a distribution. SparkToro's 2,961-run study put the odds of the same list twice at under 1 in 100. The sources are volatile, so any single citation snapshot decays fast. Profound's drift study put the monthly turnover at 40 to 60% of cited domains. Together these two facts make point-in-time rank tracking not merely imperfect but structurally invalid — it measures a quantity that does not hold still long enough to be a measurement. The only signals that survive are aggregates: rates and distributions computed over many runs of a stable, representative prompt set.

The stakes make the precision worth it. This is no longer a fringe channel. ChatGPT crossed 800 million weekly users in late 2025 and 900 million by early 2026. Forrester's 2026 buyers'-journey survey found twice as many buyers named generative AI or conversational search their most meaningful research source as named any other source, outranking vendor websites, product experts, and sales reps. 73% of B2B buyers now use AI tools in purchase research. The consideration set the model assembles is, increasingly, the consideration set the buyer brings to the table. Measuring your place in it badly is worse than not measuring it, because a confident wrong number drives confident wrong decisions.

This is the contrast that runs through everything FancyAI publishes. Tracking a rank is measuring a thing you cannot change, that does not hold still, in a system that does not rank. Measuring presence, share of citation, sentiment, and attribution — and driving them through entity clarity, citation density, structured proof, and corroborating mentions — is measuring the things you can actually influence, in the units the system actually produces.

Visibility is what you measure. Influence is what you execute. The right metrics make the second one possible, because you cannot improve what you have mismeasured.

The recommendation era did not make AI visibility unmeasurable. It made the old measurements obsolete and demanded better ones. The brands that adopt the new standard — rates over ranks, a frozen prompt panel as the instrument, input signals as the diagnosis — will know exactly where they stand and exactly what to change. The brands still running an AI rank tracker and refreshing it daily will be staring at noise, mistaking the jitter of a probabilistic system for a scoreboard. The mention is the signal. Measure the signal, not the roll of the die.

Sources cited

  1. SparkToro / Gumshoe (Fishkin & O'Donnell, 2025) — 600 volunteers, 12 prompts, 2,961 runs across ChatGPT, Claude, and Google AI; odds of the same list twice under 1 in 100, same list in same order closer to 1 in 1,000. The foundational evidence that single-run position tracking is invalid.
  2. Profound (2025) — "AI Search Volatility" citation-drift study; 40–60% of cited domains change month over month for identical prompts (AI Overviews 59.3%, ChatGPT 54.1%, Copilot 53.4%, Perplexity 40.5%); 100,000 distinct prompts across ChatGPT and Perplexity.
  3. SE Ranking (2026) — prompt-set methodology; representative tracking set of 20–40 prompts as a brand panel; average conversational query ~23 words; journey- and persona-spanning prompt construction.
  4. Semrush (2025–2026) — AI share-of-voice definition (mention frequency × prominence), October 2025 prompt-volume weighting update, and the LLM sentiment analyzer simulating customer personas to measure recommendation rate.
  5. Ahrefs Brand Radar (2026) — AI SoV as share of brand impressions; 199M search-backed prompts across six engines; People Also Ask and fan-out expansion of seed prompts; six-platform coverage including Grok.
  6. AirOps (2026) — Brand Visibility Score / presence rate as the North Star: answers mentioning your brand ÷ total relevant answers; 30%+ presence benchmark for core prompts, 50%+ to lead.
  7. Princeton / Aggarwal et al. (KDD 2024) — "GEO: Generative Engine Optimization": +41% visibility from statistics, +28% from quotations, up to +115% for source citations on lower-ranked pages — the structured-proof signal.
  8. 2026 entity-authority analyses — brand mentions 0.664 vs. backlinks 0.218 correlation with AI visibility (entity-clarity signal); co-occurrence across 2.2M prompts as the dominant new citation signal (corroborating-mentions signal).
  9. Citation-structure research (Profound, Ahrefs, 2025–2026) — LLMs cite 2–7 domains per response (ChatGPT ~5); 57% of branded-query citations go to reviews, listicles, forums, and case studies; ~60% of answers drawn from training data with no live retrieval (citation-density signal).
  10. Forrester 2026 Buyers' Journey Survey — twice as many buyers named generative AI / conversational search their most meaningful research source as any other; 73% of B2B buyers use AI tools in purchase research.
  11. OpenAI / TechCrunch (2025–2026) — ChatGPT 800M weekly users (Dec 2025), 900M (Feb 2026) — the scale that makes measurement urgent.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) runs a representative prompt panel across every major engine, reports your presence rate, share of citation, sentiment, and source attribution as stable rates, and diagnoses exactly which of the four input signals — entity clarity, citation density, structured proof, corroborating mentions — is holding you back.

Live · FancyAI Research Corpus

Who Actually Uses AI Search: The Consumer Has Already Moved

Nearly a billion people now open an AI chatbot every week, half of consumers treat it as their starting point for buying decisions, and the answer they get back is a short, opinionated recommendation, not ten blue links. The consumer didn't wait for marketing to catch up. The only question left is whether the machine names your brand when they ask.

900M
ChatGPT weekly active users in February 2026, more than double a year earlier (OpenAI)
50%
Of consumers now intentionally seek out AI-powered search engines (McKinsey)
8%
Of searches with an AI summary produce a click to a website, down from 15% without (Pew Research, 2026)
5%
Of consumers move directly from an AI answer to a purchase, the rest verify first (Yext, 2026)

Chapter 01

The adoption curve already bent

The debate about whether AI search is "real yet" is over. The consumer settled it.

In February 2026, OpenAI confirmed ChatGPT had reached 900 million weekly active users, up from 800 million in October 2025 and roughly double the 400 million reported a year earlier. By June 2026 it crossed 1 billion monthly active users, the fastest any application in history has hit that mark, and it now processes more than 2 billion queries a day. It is not alone. Google's Gemini app surpassed 750 million monthly active users, according to Google's Q4 2025 earnings, up from 650 million the prior quarter and 400 million in May 2025. Google's AI Mode, the conversational search surface, hit 100 million monthly users across the US and India in Q1 2026 and has since passed a billion, while AI Overviews now reach roughly 2 billion monthly users globally. Perplexity processes on the order of 780 million queries a month.

These are not pilot-program numbers. They are population-scale numbers. At 900 million weekly users, ChatGPT alone touches something close to one in nine people on Earth every seven days.

The behavioral data underneath the user counts is what matters for brands. McKinsey's consumer research found that 50% of consumers now intentionally seek out AI-powered search engines, and among the people who use them, 44% say AI is their primary and preferred source of insight, ahead of traditional search at 31%, retailer and brand websites at 9%, and review sites at 6%. The same research found 62% of consumers use AI to compare options across brands, models, prices, and reviews, and 55% use it to learn about a category or product before they decide.

Half of consumers now reach for an AI search engine on purpose, and for the people who use it, AI has already overtaken traditional search as the preferred source of insight for buying decisions. This is mainstream behavior, not early adoption.

The frequency is the part that should reset planning assumptions. 43% of consumers use AI search tools daily, according to Yext's 2026 Consumer Search Behaviors research. Daily use is the threshold at which a channel stops being a novelty and becomes a habit, and habits are where brand preference is formed or lost. The consumer adoption curve did not just rise. It bent into a new default.

Chapter 02

Gen Z is the leading edge, and everyone else is following

Adoption is not evenly distributed, and the distribution tells you where this is going.

Pew Research Center found that 58% of US adults under 30 have used ChatGPT, up from 43% in 2024 and just 33% in 2023, a near-doubling in two years. Among all US adults the figure reached 34%, also roughly double the 2023 level. The age skew is steep: in a separate Pew report from early 2026, 64% of US teens aged 13 to 17 have used AI chatbots, with 28% using them daily. Young adults are not dabbling. They are wiring AI into the default path for how they find things out.

The gap between cohorts is wide enough to function as a forecast. Roughly 70% of Gen Z report using generative AI tools weekly, far above any other age group. On search specifically, only 42% of millennials and Gen Z default to a traditional search engine, versus 76% of baby boomers. That 34-point spread is the entire trajectory of the market drawn on a single chart. The behavior that is dominant among 22-year-olds today is the behavior that becomes ordinary across the population as those users age into peak spending years and as older cohorts follow them up the curve, exactly as they did with mobile, social, and streaming.

Gen Z's relationship with discovery is also more fragmented than a single "AI search" label suggests, and brands need to hold both facts at once. The same generation that leads AI chatbot adoption also runs a multi-platform discovery stack:

  • Nearly half of US consumers (49%) now use TikTok as a search engine, up from 41% in 2024.
  • 41% of Gen Z turn to social media first when searching for information online, and 52% say they trust product information on social media more than information from Google or AI chatbots.
  • Social platforms collectively drive over 60% of product discovery, while Google accounts for roughly 34.5% of total search share.

The mistake is to read this as AI search losing. It is not a zero-sum fight for the same query. Gen Z uses TikTok and Instagram for discovery-driven, visual, social queries, and reaches for AI chatbots when they want synthesis, comparison, and a direct answer to a specific, often complex question. The implication for brands is that the consumer journey now spans six or more surfaces, and the AI answer is the one that increasingly arbitrates the final decision. A BigCommerce-commissioned study captured the commercial edge of this shift directly: one in three Gen Z and one in four millennials now turn to AI platforms over other sources to guide purchase decisions.

The behavior that is already normal for 22-year-olds is the behavior that becomes ordinary for everyone. Gen Z is not a niche segment to optimize for later. It is the live preview of the default consumer.

Chapter 03

What consumers actually use AI search for

Adoption numbers tell you how many. Use cases tell you where the money is, and the data shows AI search has colonized exactly the moments that used to belong to Google and the retailer website.

Shopping and product discovery have moved fastest. Adobe's 2026 AI and Digital Trends report, which surveyed 4,000 consumers globally, found that the most common AI shopping behaviors are getting product recommendations (47%), finding deals (43%), ideating gifts (35%), and generating shopping lists (33%). Nearly half of consumers, 44%, say they are likely to use an AI assistant for inspiration, research, or shopping in entertainment and media, with clothing at 41% and health and beauty at 34% close behind. Crucially, 85% of consumers who have used AI for shopping say it improved their experience, the kind of satisfaction signal that turns trial into habit.

The everyday-question use case is just as large and matters just as much for visibility. Half of AI users surveyed by Adobe use it for general research, and 39% use it for movie and dinner recommendations, the small local and lifestyle decisions that traditional search and review sites used to own outright.

Health is the use case that should sober every marketer and reassure no one about the limits of trust. A KFF Tracking Poll conducted in February and March 2026 found 32% of US adults now turn to AI for health information and advice, equal to the share who use social media for health. Breaking that down, 29% have used AI for physical-health questions and 16% for mental-health questions in the past year. A Harris Poll for Merck Manuals went higher, finding more than three in five Americans (62%) have used AI tools for medical information. The demographic skew repeats: adults under 30 are about three times as likely as those 50 and over to use AI for mental-health support. These are high-stakes, high-intent moments, and consumers are bringing them to a machine that hands back one synthesized answer rather than a page of sources to weigh.

Consumers are not using AI search for trivia. They are using it for the decisions that matter most: what to buy, where to eat, and what to do about their health. The recommendation arrives pre-synthesized, and most users never see the runners-up.

The through-line across every use case is the same. The consumer is no longer asking the machine to fetch a list of links to evaluate. They are asking it to do the evaluating and return a verdict. That is the structural shift this report exists to explain.

Chapter 04

Trust is conditional, and verification is the new gate

The most dangerous misreading of AI search adoption is to assume that because people use it, they blindly believe it. They don't. The trust pattern is conditional, and understanding its shape is the difference between a winning strategy and a wasted one.

Trust is rising, and it is rising fast. Yext's 2026 research found that among AI users, 75% rate their trust in AI local-business recommendations at 4 or 5 out of 5, and 59% say their trust in AI has grown over the past year, against just 15% who say it declined. Consumers increasingly trust AI search output more than they trust paid placements, and the ad-free, direct-answer experience is a large part of why.

But trust does not mean obedience. The single most important behavioral finding for brands is this: only 5% of consumers move directly from an AI answer to a purchase (Yext, 2026). The other 95% verify first. The verification loop is substantial and revealing:

  • 53% search Google or Bing to verify or learn more after an AI recommendation.
  • 49% visit the business's website directly.
  • 42% click through to the sources or citations the AI provided.

This is "trust but verify" operating at population scale. The consumer treats the AI's recommendation as a strong, credible starting point, a shortlist of one to three names worth checking, and then confirms it across other surfaces before committing. What pushes them over the line once they verify is conventional reputation: after an AI recommendation, the top purchase influencers are star rating (34%), word of mouth (30%), review recency (29%), review sentiment (28%), and review count (28%).

The decision is made in the AI conversation. The confirmation happens everywhere else. If the machine doesn't name you, you are never in the set the consumer goes on to verify, and the reviews you spent years accumulating never get read.

The execution lesson writes itself. Because the consumer cross-checks the AI's answer against Google, your website, and your reviews, your brand's information has to be consistent and credible across every one of those surfaces. An AI recommendation that points to a website with contradictory pricing, or a brand with thin or stale reviews, fails the verification step, and a failed verification is a lost sale that never shows up in any AI-attribution dashboard. Multi-platform consistency is not a nice-to-have. It is the gate that converts an AI mention into revenue.

Chapter 05

The zero-click reality: the answer is the destination

If the AI conversation is where decisions are shaped, the click-through data shows just how completely that conversation has become the destination rather than a doorway to one.

Pew Research Center's 2026 analysis tracked the real browsing behavior of 900 US adults and found that when a Google search returns an AI summary, users click through to a website in only 8% of cases, down from 15% on searches without a summary. Clicks on the links inside the AI summary itself are rarer still, at just 1% of all visits. And the session-ending behavior is the quiet bombshell: users abandon browsing entirely 26% of the time after a page with an AI summary, versus 16% without. More than a quarter of the time, the AI answer is not the first step of a journey. It is the whole journey.

This is the zero-click reality made concrete at the level of individual human behavior. For the majority of searches, the consumer reads the synthesized answer, absorbs the recommendation, and never visits a website at all. The traffic that used to flow from search into your funnel is being intercepted and resolved inside the answer box.

When eight in a hundred AI-summary searches produce a click and more than a quarter end the session entirely, traffic stops being the scoreboard. The brand that gets named in the answer wins the decision whether or not the consumer ever clicks.

The strategic consequence is a hard inversion of how marketers have measured success for twenty years. Web analytics can tell you who arrived on your site. It is structurally blind to the far larger population that saw the AI name a competitor instead of you and never clicked at all. The absence of traffic is not evidence of low AI activity. It is often evidence that the activity resolved without you. Measuring AI search by referral sessions is like measuring an election by counting only the people who showed up at your campaign office. The decisive activity happened elsewhere, and you have no record of it unless you are watching the answer itself.

Chapter 06

From keywords to conversation: the query got long

Underneath the adoption and the zero-click numbers sits a change in the unit of search itself, and it rewires what content has to do to get selected.

Traditional Google queries average 3 to 4 words, the terse keyword strings consumers learned to type after two decades of training themselves to talk to an algorithm. AI search queries average 23 words, roughly six to eight times longer. They are not keyword strings. They are full natural-language questions that carry context, constraints, and intent up front: not "running shoes" but "what are the best running shoes for someone with flat feet training for a first marathon on a budget under $120." By 2026, an estimated 67% of AI search queries are full questions or conversational phrases rather than keywords, and on some platforms the average query has stretched well beyond 30 words as users learn to articulate nuance and edge cases in a single prompt.

The conversation also doesn't stop at one turn. Consumers refine, follow up, and narrow across a multi-step exchange, building context the AI carries forward. A single shopping decision might run: broad category question, then a comparison of two named options, then a price-and-availability check, then a final "which should I get for my situation." Each turn is a chance for a brand to be named or dropped, and the platforms expand each prompt into multiple underlying retrievals, so one consumer question can trigger many parallel searches behind the scenes.

The query stopped being a keyword and became a conversation. Consumers now state their full situation and ask the machine to decide. Content that only matches keywords has nothing to say to a question that contains its own answer requirements.

For brands, this reorders the content job entirely. Keyword-matched pages were built to intercept a 3-word query and let the human do the comparison. They are nearly mute in front of a 23-word question that already specifies the use case, the constraint, and the budget. Winning the long conversational query requires content that answers specific, qualified questions directly, that covers a topic with enough depth and breadth to surface across the many sub-queries a single prompt fans out into, and that reads like a credible answer to a real human situation rather than a page optimized for a search string. The consumer changed how they ask. The brands that get recommended are the ones that changed what they say in response.

Chapter 07

What this means for brands: being seen is not being selected

Every thread in this report converges on one strategic distinction, and getting it right is the whole game.

For twenty years, the objective was visibility. Rank on page one, appear in the consideration set, get seen. Visibility was a placement you could earn, optimize, and to a meaningful degree buy. AI search retires that objective and replaces it with a harder one: selection. The machine does not hand the consumer a ranked page of options to choose among. It hands them a short, synthesized recommendation, usually three names or fewer, and the consumer treats that recommendation as the decision, then verifies it. There is no second page, no scroll, no organic spillover. You are named in the answer or you are absent from the only conversation that counts.

The numbers in this report make the stakes unambiguous. Half of consumers reach for AI search on purpose, 44% of them call it their preferred source of insight, 43% use it daily, only 8% of AI-summary searches produce any click, and 26% of sessions end inside the answer. Read together, those figures describe a market where the brand that is selected by the model captures the decision, and the brand that is merely optimized for traditional visibility captures a click-through rate that has already collapsed into single digits.

This is the inversion at the center of the consumer shift:

  • Being seen was about ranking a page you control on a results screen the consumer scrolled.
  • Being selected is about being named by a system you don't control, in an answer the consumer treats as the verdict.

The decision happens inside the conversation. By the time a verifying consumer reaches your website or your reviews, the AI has already decided whether to put you in the running. The reviews, the pricing, the comparison pages all matter, but they matter as confirmation of a selection the model already made, not as the place the selection happens. Brands that keep pouring budget into being seen, into traffic and rankings, are optimizing the confirmation step while a competitor wins the selection step that precedes it.

Being seen is a placement you can buy. Being selected is a reputation you have to earn, across every surface the model reads, before the consumer ever clicks. The decision happens in the conversation. The website is just where they confirm it.

The encouraging read is that the work to be selected is concrete and largely unglamorous. The model assembles its recommendation by cross-referencing what you say about yourself against what the rest of the internet says, and it names the brands where those stories agree, where the reviews are real and recent, where the pricing is published, and where the content answers the specific questions consumers actually ask. None of that requires a new ad budget. It requires making your brand's truth consistent, credible, and specific everywhere the machine looks, then measuring whether the machine names you. The consumer has already moved. The brands that move with them will be the ones the answer recommends.

Sources cited

  1. OpenAI / TechCrunch (2026) — ChatGPT reached 900M weekly active users in February 2026 (up from 800M in October 2025 and ~400M a year earlier), crossed 1B monthly active users by June 2026, and processes more than 2B queries per day.
  2. Google Q4 2025 earnings / TechCrunch (2026) — Gemini app surpassed 750M monthly active users (up from 650M and from 400M in May 2025); Google AI Mode reached 100M monthly users in Q1 2026 and later passed 1B; AI Overviews reach ~2B monthly users globally.
  3. McKinsey, "New front door to the internet: Winning in the age of AI search" — 50% of consumers intentionally seek out AI search; 44% of users call AI their primary and preferred source of insight (vs. 31% traditional search, 9% brand sites, 6% review sites); 62% use AI to compare options, 55% to learn about a category.
  4. Pew Research Center (2026) — Tracked 900 US adults' browsing; clicks occur on 8% of searches with an AI summary vs. 15% without; only 1% of visits click a link inside the summary; 26% of sessions end after an AI-summary page vs. 16% without; 58% of US under-30 adults have used ChatGPT (up from 43% in 2024, 33% in 2023); 34% of all US adults; 64% of teens 13–17 have used AI chatbots, 28% daily.
  5. Yext 2026 Consumer Search Behaviors Report — 43% use AI search daily; 75% rate trust in AI local recommendations 4–5 of 5; 59% say trust grew in the past year; only 5% buy directly from an AI answer; 53% verify on Google/Bing, 49% visit the website, 42% click cited sources; top post-AI purchase influencers are star rating (34%), word of mouth (30%), review recency (29%), sentiment (28%), count (28%).
  6. Adobe 2026 AI and Digital Trends Report (4,000 consumers) — AI shopping uses: product recommendations (47%), finding deals (43%), gift ideation (35%), shopping lists (33%); 44% likely to use AI for entertainment/media research, 41% clothing, 34% health and beauty; 50% use AI for general research, 39% for movie/dinner recommendations; 85% of AI shoppers say it improved their experience.
  7. KFF Tracking Poll (Feb–Mar 2026) — 32% of US adults turn to AI for health information and advice (equal to social media); 29% for physical health, 16% for mental health; under-30s ~3x more likely than 50+ to use AI for mental-health support. Harris Poll for Merck Manuals (2026) — 62% of Americans have used AI tools for medical information.
  8. Pew Research / generational data — 70% of Gen Z use generative AI weekly; 42% of millennials and Gen Z default to traditional search vs. 76% of baby boomers.
  9. Query-length and conversational-shift research (2026) — AI search queries average 23 words vs. 3–4 for Google; 67% of AI queries are full questions or conversational phrases; some platforms average 30+ words.
  10. BigCommerce-commissioned study / TikTok-as-search data (2026) — 1 in 3 Gen Z and 1 in 4 millennials turn to AI platforms over other sources for purchase decisions; 49% of US consumers use TikTok as a search engine (up from 41% in 2024); 41% of Gen Z search social media first; social platforms drive 60%+ of product discovery vs. Google's ~34.5% share.
  11. Perplexity usage data (2026) — Processes ~780M queries per month.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) shows exactly which consumer prompts the engines name you in, where competitors get selected into the answer instead, and which review, pricing, and content gaps are keeping you out of the recommendation the consumer never clicks past.

Live · FancyAI Research Corpus

When the Buyer Is a Machine: Agentic AI and the End of the Human Shopper

AI stopped recommending and started transacting. Agents now influence one in five holiday orders, the protocols to let them pay are live, and the question is no longer whether a person picks you. It is whether a machine finds you eligible.

20%
Share of 2025 Cyber Week orders influenced by AI and agents, worth $67B (Salesforce)
$3–5T
Projected global agentic commerce revenue orchestrated by 2030 (McKinsey)
3–4x
Higher AI recommendation visibility for catalogs with near-complete attribute data (commercetools)
60+
Organizations backing Google's AP2 agent payments protocol at launch (Google Cloud)

Chapter 01

The shift from recommending to transacting

For two years, the story of AI and commerce was a story about influence. The engine read the reviews, weighed the options, and named a product. A human still did the buying. The agent was an advisor standing next to the shopper, not the shopper.

That line has now been crossed. In 2026 the agent does not just recommend the product. It selects it, adds it to the cart, and completes the payment. The buyer is becoming a machine, and the machine is buying on someone else's behalf.

The proof is no longer hypothetical. During 2025's Cyber Week, AI and AI agents influenced 20% of all orders, accounting for $67 billion in global sales, according to Salesforce. Agent task completion volume surged through the peak. HUMAN Security measured AI agent traffic jumping 144% on Black Friday alone. And the retailers who put agents on their own properties grew US sales at seven times the rate of those who did not, 13% versus 2%, in the seven weeks into Cyber Week.

The agent does not send you a curious browser. It sends you a completed transaction, or it sends nothing at all because it never selected you in the first place.

This is the inversion at the heart of the new commerce. Search optimization was built to win a human's click. The human scanned a page, formed an impression, and decided. Agentic commerce removes the page, removes the impression, and increasingly removes the human from the decision loop entirely. What remains is a machine reading structured data, scoring options, and acting.

The strategic consequence is blunt. Being selected by a person and being selected by an agent are not the same problem. A person responds to a brand story, a hero image, a feeling. An agent responds to whether your price, availability, rating, and specifications are present, parseable, and verifiable. The shopper you spent a decade learning to persuade is being replaced by a buyer that cannot be persuaded, only qualified.

Chapter 02

Who is building the buying machine

This is not one company's bet. Every layer of the commerce stack, from the assistant to the payment rail, is racing to make agents into buyers. Five fronts matter.

OpenAI moved first on in-chat purchasing. It launched Instant Checkout in September 2025 and rolled "Buy it in ChatGPT" out to all US users on February 16, 2026, powered by the Agentic Commerce Protocol it co-developed with Stripe. Etsy sellers came first, with over a million Shopify merchants including Glossier, SKIMS, Spanx, and Vuori queued behind them. Then OpenAI did something instructive: in March 2026 it scaled back the standalone in-chat checkout and pivoted toward discovery plus merchant-owned checkout, listing Target, Sephora, Nordstrom, Lowe's, Best Buy, The Home Depot, and Wayfair as ACP discovery integrations (Digital Commerce 360). The buy button moved. The discovery layer stayed.

Google is building the payments substrate. It announced the Agent Payments Protocol (AP2) on September 16, 2025 with 60-plus launch partners including Mastercard, PayPal, Coinbase, American Express, and Salesforce (Google Cloud). AP2 uses three cryptographically signed mandates, Intent, Cart, and Payment, carried as W3C Verifiable Credentials, so an agent can prove a human authorized a purchase. By May 2026, Google had donated AP2 to the FIDO Alliance for open governance.
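The idea of three linked, signed mandates can be illustrated with a toy sketch in Python. This is not the AP2 wire format. The field names, the linking of each mandate to the previous one by digest, and the spending-cap check are all assumptions made for illustration, and an HMAC with a shared demo key stands in for the public-key signatures that real Verifiable Credentials would carry.

```python
import hashlib
import hmac
import json

SECRET = b"demo-key"  # stand-in only; real mandates use public-key credentials

def sign(payload: dict) -> dict:
    """Wrap a payload with its digest and a keyed signature."""
    body = json.dumps(payload, sort_keys=True).encode()
    return {
        "payload": payload,
        "digest": hashlib.sha256(body).hexdigest(),
        "sig": hmac.new(SECRET, body, hashlib.sha256).hexdigest(),
    }

def verify(mandate: dict) -> bool:
    """Check that the payload still matches its digest and signature."""
    body = json.dumps(mandate["payload"], sort_keys=True).encode()
    digest_ok = hashlib.sha256(body).hexdigest() == mandate["digest"]
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return digest_ok and hmac.compare_digest(expected, mandate["sig"])

def chain_is_valid(intent: dict, cart: dict, payment: dict) -> bool:
    """Each mandate must verify, point at its predecessor, and respect the cap."""
    return (
        all(verify(m) for m in (intent, cart, payment))
        and cart["payload"]["intent_digest"] == intent["digest"]
        and payment["payload"]["cart_digest"] == cart["digest"]
        and cart["payload"]["total"] <= intent["payload"]["max_total"]
    )

# Hypothetical purchase: the human authorizes an intent with a spending cap,
# the agent builds a cart under it, and the payment references that cart.
intent = sign({"type": "intent", "item": "running shoes", "max_total": 120.0})
cart = sign({"type": "cart", "intent_digest": intent["digest"],
             "sku": "SHOE-001", "total": 109.0})
payment = sign({"type": "payment", "cart_digest": cart["digest"],
                "amount": 109.0})

print(chain_is_valid(intent, cart, payment))
```

The point of the sketch is the shape of the guarantee: a merchant can check that the payment traces back to something a human approved, and any edit to an earlier mandate breaks the chain.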

Amazon is building a walled garden. It runs Rufus, its on-site shopping agent, and "Buy for Me," which purchases from other brands' sites on a shopper's behalf. Then it went to court to keep rival agents out, which we return to below.

Perplexity is pushing autonomous shopping through its Comet browser, which can scour the web, compare prices, and click "purchase" for the user.

Visa and Mastercard are building the trust layer for payments. Visa introduced its Trusted Agent Protocol in October 2025 with more than ten partners, an open framework built on the HTTP Message Signature standard that lets merchants distinguish a legitimate authorized agent from a malicious bot. By 2026 Visa was working with more than 100 partners, with over 30 building in its Intelligent Commerce sandbox. Mastercard's Agent Pay runs in parallel, and in January 2026 Mastercard publicly committed to supporting AP2 alongside its own rails.

The market has effectively conceded that agentic commerce will be multi-protocol. The fight is not over which standard wins. It is over which brands are legible to all of them.

Notice what every one of these players is solving for. Not persuasion. Plumbing. They are building the rails that let a machine identify a product, trust a merchant, authorize a payment, and complete a transaction without a human touching a checkout button. The brand's job is no longer to be chosen at the end of that pipeline. It is to be eligible to enter it at all.

Chapter 03

The protocols that decide who can transact

The protocol layer is where eligibility is now defined, and it is worth understanding because it is the part most brands ignore.

A protocol is the contract that lets an agent and a merchant exchange a transaction safely. Four matter today, and they are converging rather than competing:

  • ACP (Agentic Commerce Protocol) — OpenAI and Stripe's open standard for in-chat purchasing. It works across payment processors, integrates without backend changes, and keeps the merchant in control of the customer relationship. There are no fees on purchases that start in ChatGPT.
  • AP2 (Agent Payments Protocol) — Google's signed-mandate framework for proving a human authorized an agent's purchase, now under FIDO Alliance governance with 60-plus backers.
  • Trusted Agent Protocol — Visa's framework for merchants to verify that an inbound agent is authorized and not a malicious bot, built on existing web standards with minimal UX change.
  • Agent Pay — Mastercard's network-specific implementation of agent-authorized payments, now committed to interoperate with AP2.

Underpinning all of them is MCP (Model Context Protocol), the open standard Anthropic introduced in November 2024 and now stewarded by the Linux Foundation with support from Anthropic, OpenAI, Google, and Microsoft. MCP lets a brand publish its catalog, cart, and commerce functions as a machine-callable interface. As one Shopify MCP guide put it, once your server is live, any agent that speaks the MCP language can interact with your brand instantly.
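To make "machine-callable interface" concrete, here is a minimal sketch of the request handling behind such an endpoint. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are the methods an agent uses to discover and invoke tools. Everything else here is an assumption for illustration: the `get_product` tool name, the in-memory `CATALOG`, and the omission of the initialization handshake and transport a conforming server needs.

```python
import json

# Hypothetical in-memory catalog; a real server would query the commerce backend.
CATALOG = {
    "sku-001": {"name": "Trail Jacket", "price": 129.00, "currency": "USD", "in_stock": True},
    "sku-002": {"name": "Trail Pant", "price": 89.00, "currency": "USD", "in_stock": False},
}

# Tool descriptors: a name, a description, and a JSON Schema for the arguments.
TOOLS = [
    {
        "name": "get_product",  # hypothetical tool name
        "description": "Look up one product by SKU.",
        "inputSchema": {
            "type": "object",
            "properties": {"sku": {"type": "string"}},
            "required": ["sku"],
        },
    }
]


def handle_request(request: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request and return the response object."""
    req_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": TOOLS}}

    if method == "tools/call":
        params = request.get("params", {})
        if params.get("name") != "get_product":
            # -32602 is the JSON-RPC 2.0 code for invalid params.
            return {"jsonrpc": "2.0", "id": req_id,
                    "error": {"code": -32602, "message": "Unknown tool"}}
        sku = params.get("arguments", {}).get("sku")
        product = CATALOG.get(sku)
        if product is None:
            # A tool-level failure is reported inside the result, flagged by isError.
            return {"jsonrpc": "2.0", "id": req_id,
                    "result": {"content": [{"type": "text",
                                            "text": f"No product with SKU {sku}"}],
                               "isError": True}}
        return {"jsonrpc": "2.0", "id": req_id,
                "result": {"content": [{"type": "text", "text": json.dumps(product)}],
                           "isError": False}}

    # -32601 is the JSON-RPC 2.0 code for method not found.
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": -32601, "message": "Method not found"}}
```

An agent that sends `{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "get_product", "arguments": {"sku": "sku-001"}}}` gets the product record back as structured text, with no page rendering involved.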

The strategic point is the same across every protocol. They reward brands whose product data is structured, complete, and exposed through a clean interface, and they ignore brands whose product truth is locked inside JavaScript-rendered pages and marketing copy. An agent transacting through ACP or AP2 does not read your homepage. It reads your feed, your schema, or your MCP endpoint.
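As a rough illustration of what "your schema" means in practice, the sketch below builds a schema.org Product object with a nested Offer and AggregateRating and wraps it in a JSON-LD script tag. The types and property names are standard schema.org vocabulary. The function name, the sample product, and its values are placeholders.

```python
import json


def product_jsonld(name, sku, price, currency, in_stock, rating_value, review_count):
    """Build a schema.org Product with nested Offer and AggregateRating."""
    availability = ("https://schema.org/InStock" if in_stock
                    else "https://schema.org/OutOfStock")
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",       # explicit price, not an image or a script
            "priceCurrency": currency,
            "availability": availability,
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
    }


# Placeholder product used only for illustration.
doc = product_jsonld("Trail Jacket", "sku-001", 129, "USD", True, 4.4, 212)
markup = '<script type="application/ld+json">' + json.dumps(doc) + "</script>"
```

Generating the markup from the same record that drives the storefront is one way to keep price and availability from drifting apart between what a shopper sees and what an agent reads.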

The protocols are not a payments story. They are an eligibility story. They define the format your product must be in to be transactable at all, and that format is machine-readable, not human-readable.

For a brand, this collapses a decade of separate concerns (SEO, conversion-rate optimization, payments) into one question: is your product expressed in a structure an agent can read, trust, and act on? If the answer is no, the most advanced checkout infrastructure ever built routes around you.

Chapter 04

How agents actually pick

Persuasion does not move an agent. Structure does. The research on agent selection is consistent and, for marketers raised on brand-building, uncomfortable.

The most rigorous examination, a Yale, Columbia, and University of Chicago study, found that AI agents show a 20–40% reduction in selection probability when a single key product attribute is missing. The agent does not fill the gap with goodwill or brand affinity. It downgrades or drops the product. The study also found that a small rating difference, 4.4 versus 4.1, frequently flipped which product the agent ranked first, and that agents assign "trust bonuses" to well-reviewed products, sometimes selecting them even when they cost more.

The 2026 commercial data confirms the pattern at scale. Stores with 99.9% attribute completion are seeing 3–4x higher visibility in AI recommendations than stores with sparse data (commercetools). The mechanism is mechanical: agents operate on confidence scores, rating every product on how well it matches the request. When attributes are incomplete or vague, the agent recommends a competitor instead.
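Attribute completion is easy to measure on your own catalog. The sketch below is one simple way to do it, assuming a hypothetical list of required attributes and a catalog held as plain dictionaries. It does not reproduce how commercetools or any agent computes its scores.

```python
# Hypothetical required attributes; the right set depends on the category.
REQUIRED = ("price", "availability", "dimensions", "material", "compatibility")


def is_filled(value) -> bool:
    """Treat None, blank strings, and empty containers as missing."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) > 0
    return True  # numbers and booleans count as present, including 0 and False


def completeness(product: dict, required=REQUIRED) -> float:
    """Fraction of required attributes that carry a usable value."""
    filled = sum(1 for key in required if is_filled(product.get(key)))
    return filled / len(required)


def catalog_report(catalog: list, required=REQUIRED) -> dict:
    """Mean completeness plus a count of which attributes are missing most."""
    scores = [completeness(p, required) for p in catalog]
    missing = {}
    for product in catalog:
        for key in required:
            if not is_filled(product.get(key)):
                missing[key] = missing.get(key, 0) + 1
    mean = sum(scores) / len(scores) if scores else 0.0
    return {"mean_completeness": mean, "missing_counts": missing}


# Placeholder catalog: the second product lacks material and compatibility.
catalog = [
    {"price": 129.0, "availability": "InStock", "dimensions": "70x55 cm",
     "material": "nylon", "compatibility": ["hood-a"]},
    {"price": 89.0, "availability": "OutOfStock", "dimensions": "100x40 cm",
     "material": ""},
]
report = catalog_report(catalog)
```

The `missing_counts` breakdown is usually the more useful output, since it shows which single attribute to fix first across the whole catalog.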

The signals that move an agent cluster into four groups:

  • Structured product data. Price, availability, dimensions, materials, compatibility, all explicitly marked up. This is the single most controllable lever and the one most retail pages fail. A page written for human scanning, with image-based feature callouts and client-side pricing, is frequently invisible to the agent deciding what to buy.
  • Reviews and ratings. The verification layer. Agents lean on user-generated content because it separates brand claims from real experience. Small differences in average rating change outcomes.
  • Trust and reliability signals. Return policies, fulfillment accuracy, retailer reliability. Agents favor merchants they can trust to deliver, and in 2026 those signals must be machine-readable, not buried in a footer.
  • Contextual comparison. Agents do not score products in isolation. They score them relative to alternatives, at machine speed, which makes competitive context more decisive than it ever was for human shoppers.

Clean metadata beats creative marketing. The agent favors clarity over branding and precision over persuasion. You cannot tell it a story. You can only tell it the truth, in a format it can read.

There is one critical caveat brands must internalize. SparkToro's research found that AI recommendations are highly inconsistent in their ordering: less than a 1-in-100 chance of getting the same brand list twice, and less than 1-in-1,000 of the same list in the same order. The reconciliation with the selection studies is the most important strategic insight in this report. The consideration set, which brands appear at all, is stable and optimizable. The ranking within it is essentially random. Chasing a "position" in agent output is a fool's errand. Earning a permanent seat in the consideration set is the entire game.
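The distinction between presence and position can be measured directly. The sketch below assumes you have already run the same prompt several times and extracted the brands each answer named. The brand names and runs are placeholders, and the metrics are a simple illustration, not SparkToro's methodology.

```python
from collections import Counter
from itertools import combinations


def appearance_rate(runs):
    """Share of runs in which each brand appears at all."""
    counts = Counter(brand for run in runs for brand in set(run))
    return {brand: n / len(runs) for brand, n in counts.items()}


def pairwise_agreement(runs):
    """Fraction of run pairs naming the same brands, and the same brands in the same order."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 0.0, 0.0
    same_set = sum(1 for a, b in pairs if set(a) == set(b))
    same_order = sum(1 for a, b in pairs if list(a) == list(b))
    return same_set / len(pairs), same_order / len(pairs)


# Placeholder data: four runs of one prompt, brands listed in the order named.
runs = [
    ["Acme", "Borealis", "Cinder"],
    ["Borealis", "Acme", "Cinder"],
    ["Acme", "Cinder", "Dune"],
    ["Cinder", "Acme", "Borealis"],
]
rates = appearance_rate(runs)
set_match, order_match = pairwise_agreement(runs)
```

On this sample, Acme and Cinder appear in every run while no two runs share an ordering (`order_match` is 0.0). That is the pattern worth tracking: appearance rate per brand is the stable, optimizable number, and exact position is noise.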

Chapter 05

Agent-eligibility is the new frontier, and it has teeth

If being chosen by a human was about reputation, being chosen by an agent is about eligibility. And eligibility is becoming something brands and platforms now fight over in court.

In early 2026, Amazon sued Perplexity, alleging its Comet browser agent concealed its automated nature and masqueraded as a human shopper to slip past bot detection and shop on Amazon on customers' behalf. A federal judge in San Francisco granted Amazon a preliminary injunction in March 2026, blocking Comet from accessing password-protected sections of Amazon to shop for users (GeekWire). The Ninth Circuit then temporarily stayed that injunction on appeal, letting Perplexity's agents back on, for now (Winbuzzer).

The case is a landmark, and it tells brands two things. First, the largest retailer on earth would rather build a walled garden, controlling the agent, the data, and the ad revenue through Rufus and Buy for Me, than let third-party agents transact freely on its turf. Second, "agent-eligible" is not a metaphor. Platforms are now drawing hard technical and legal lines about which agents may transact, which is exactly why protocols like Visa's Trusted Agent Protocol exist: to give merchants a sanctioned way to distinguish an authorized agent from an unauthorized bot.

For a brand outside the walled gardens, the implication is sharper still. Your eligibility to be transacted by an agent depends on three machine-facing assets:

  • Structured data completeness. The 20–40% selection penalty for missing attributes is the price of an incomplete feed. Product, Offer, Review, and AggregateRating schema, kept dynamic so price and availability never go stale, is the floor.
  • Machine-readable trust signals. Reviews, ratings, return policies, and reliability data the agent can parse and weight. Trust that lives only in brand copy is invisible to the buyer.
  • A clean agent interface. Feeds, APIs, or an MCP server that lets agents access your catalog directly rather than scraping a rendered page. This is the path that skips the rendered page entirely and delivers your product data straight to the point of decision.

Being agent-eligible is not a marketing posture. It is an infrastructure standard. Either an agent can read, trust, and transact your product, or it cannot, and there is no charisma that closes the gap.

The brands that treat this as a backend chore are mispricing it. In an agent-mediated market, your data quality is your brand marketing. It is the only part of your brand the buyer can actually perceive.

Chapter 06

The market size, and the honest version of it

The forecasts are large, and they vary enough that the honest reading matters more than the biggest number.

McKinsey's research projects agentic commerce could orchestrate as much as $1 trillion in US retail revenue by 2030, and $3 trillion to $5 trillion globally (Digital Commerce 360, McKinsey). That is the orchestration figure, the total revenue agents touch or influence. The narrower "agents actually doing the buying" estimates are more modest and more contested:

  • Morgan Stanley estimates agentic shoppers could represent $190 billion to $385 billion in US e-commerce spend by 2030, capturing 10% to 20% of the market.
  • Bain & Company projects 15–25%, or $300–$500 billion.
  • Gartner projects 20% of digital commerce transactions will be executed through AI platforms or agents by 2030, and that by 2028, 90% of B2B buying will be mediated by AI agents.

The consensus that emerges across the credible forecasts: by 2030, roughly 10–25% of US e-commerce will involve AI agents in a meaningful way, on the order of $200 billion to $500 billion in direct agent transactions, sitting inside a far larger pool of agent-influenced revenue.
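A quick arithmetic check shows why these forecasts cluster. Dividing each dollar figure by its paired share gives the total US e-commerce market each forecast implies for 2030. Pairing the low dollar figure with the low share is an assumption about how the ranges were built, not something the forecasts state.

```python
# (low_usd_billions, high_usd_billions, low_share_pct, high_share_pct), as quoted above.
forecasts = {
    "Morgan Stanley": (190, 385, 10, 20),
    "Bain & Company": (300, 500, 15, 25),
}


def implied_market(low_usd, high_usd, low_pct, high_pct):
    """Total market size (USD billions) implied by each dollar-and-share pair."""
    return low_usd * 100 / low_pct, high_usd * 100 / high_pct


implied = {name: implied_market(*figures) for name, figures in forecasts.items()}

for name, (low_total, high_total) in implied.items():
    print(f"{name}: ${low_total:,.0f}B to ${high_total:,.0f}B implied US e-commerce")
```

Under that assumption both forecasts imply a 2030 US e-commerce market of roughly $1.9 trillion to $2.0 trillion, which is how a 10–25% share maps onto roughly $200 billion to $500 billion.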

Two near-term anchors keep this grounded. Gartner predicted traditional search engine volume would drop 25% by 2026 as answer engines substitute for queries, and that 60% of brands will use agentic AI for one-to-one interactions by 2028. And Salesforce's finding that 39% of consumers, and over half of Gen Z, already use AI for product discovery shows the demand side is not waiting for the forecasts.

The biggest number is not the point. The point is the slope. Agents went from zero share of holiday orders to influencing 20% in roughly two years. A channel does not need to be a majority to be the one deciding your next margin.

The discipline here matters. Gartner also predicts over 40% of agentic AI projects will be canceled by end of 2027, and OpenAI's retreat from standalone in-chat checkout shows the buy buttons will keep shifting. The durable bet is not a specific checkout integration or platform. It is being the product an agent can find, trust, and transact, on whatever surface and through whatever protocol the buyer happens to be using.

Chapter 07

The execution implication

The work splits cleanly into what you own, what you influence, and what you measure. In an agent-mediated market none of it is optional.

Make your product machine-eligible first. This is the foundation and the fastest win, because the selection penalty for missing data is steep and the fix is concrete.

  • Implement complete Product, Offer, Review, and AggregateRating schema, and keep it dynamic. Stale or contradictory data gets the product dropped, not forgiven.
  • Render price, specs, availability, and reviews server-side. If they only appear after JavaScript executes, assume the agent never sees them.
  • Drive attribute completeness toward 100%. The 3–4x visibility gap between complete and sparse catalogs is the highest-leverage number in this report.
  • Stand up a feed or MCP endpoint so agents can access your catalog directly rather than scraping a rendered page.
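One way to test the server-side rendering point is to inspect the raw HTML response, before any JavaScript runs, and see whether product markup is already there. The sketch below uses only the Python standard library. The two sample pages are placeholders, and it handles only a top-level Product object, not JSON-LD arrays or `@graph` wrappers.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is treated as absent


def visible_without_js(raw_html: str) -> dict:
    """Report whether Product markup and a price exist in the unrendered HTML."""
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    products = [b for b in parser.blocks
                if isinstance(b, dict) and b.get("@type") == "Product"]
    has_price = any(isinstance(p.get("offers"), dict) and "price" in p["offers"]
                    for p in products)
    return {"product_markup": bool(products), "price": has_price}


# Placeholder pages: one ships markup in the HTML, one defers everything to a bundle.
SERVER_RENDERED = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Product", "name": "Trail Jacket", '
    '"offers": {"@type": "Offer", "price": "129.00", "priceCurrency": "USD"}}'
    '</script></head><body><h1>Trail Jacket</h1></body></html>'
)
CLIENT_RENDERED = (
    '<html><body><div id="app"></div><script src="/bundle.js"></script></body></html>'
)
```

Running `visible_without_js` on the response body your server actually returns to a plain HTTP client approximates what a non-rendering agent sees. The client-rendered sample reports no product markup and no price.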

Earn the trust signals an agent weights. Reviews, ratings, and reliability data are the verification layer, and the agent treats third-party validation as more credible than brand claims. Build genuine review depth, surface return and fulfillment data in machine-readable form, and keep ratings current.

Optimize for the consideration set, not the ranking. Because ordering in agent output is near-random, chasing a position wastes effort. The winnable goal is permanent presence in the set of products an agent considers at all. That is a function of data completeness, trust signals, and contextual competitiveness, not of a single keyword or placement.

Plan for the protocols and the walled gardens. Get readable through the open protocols (ACP, AP2, MCP) so you are transactable wherever agents operate. And recognize that platforms like Amazon are building closed systems where they control the agent; decide deliberately where you accept those terms and where you protect a direct relationship.

Measure selection, not traffic. Classic analytics tell you who arrived. They cannot tell you whether the agent considered you in the first place, and the decision now happens before any click. Track share of voice in agent answers, the prompts you appear and do not appear in, and which competitors are selected when you are not.
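The three measurements named above can be computed from a simple log of sampled agent answers. The sketch below assumes a hypothetical mapping from each prompt to the brands the answer named. How those brands are extracted from the raw answer text is left out, and the prompts and brands are placeholders.

```python
from collections import Counter


def selection_metrics(answers: dict, brand: str, competitors: list) -> dict:
    """answers maps each prompt to the list of brands named in the agent's reply."""
    tracked = [brand] + list(competitors)
    mentions = Counter()
    for named in answers.values():
        for candidate in tracked:
            if candidate in named:
                mentions[candidate] += 1
    total = sum(mentions.values())

    present = sorted(p for p, named in answers.items() if brand in named)
    absent = sorted(p for p, named in answers.items() if brand not in named)
    # Competitors named in the prompts where the brand is missing.
    selected_instead = Counter(
        c for p in absent for c in competitors if c in answers[p]
    )
    return {
        "share_of_voice": mentions[brand] / total if total else 0.0,
        "present_in": present,
        "absent_from": absent,
        "selected_instead": dict(selected_instead),
    }


# Placeholder sample of three prompts.
answers = {
    "best waterproof trail jacket": ["Acme", "Borealis"],
    "lightweight rain shell under $150": ["Borealis", "Cinder"],
    "jacket for alpine hiking": ["Acme", "Cinder", "Borealis"],
}
metrics = selection_metrics(answers, "Acme", ["Borealis", "Cinder"])
```

Here share of voice is defined as the brand's mentions divided by all tracked-brand mentions, which is one reasonable definition among several. The `absent_from` and `selected_instead` fields are the ones that point to specific gaps to close.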

The old job was persuading a person who could be moved by a story. The new job is qualifying for a machine that can only be moved by structure. Visibility was a placement. Eligibility is an architecture.

The brands that win the agentic transition will not be the ones with the best campaigns. They will be the ones whose product truth is the most complete, the most verifiable, and the most machine-readable, because that is the only version of the brand the buyer can now perceive. The buyer became a machine. The brands that get read are the ones that get bought.

Sources cited

  1. Salesforce (2025–2026): AI and agents influenced 20% of Cyber Week orders ($67B); retailers with on-site agents grew US sales 7x faster (13% vs 2%); 39% of consumers and over half of Gen Z use AI for product discovery.
  2. McKinsey / Digital Commerce 360 (2025): Agentic commerce could orchestrate up to $1T in US retail revenue and $3–5T globally by 2030.
  3. Morgan Stanley (2025–2026): Agentic shoppers may represent $190B–$385B in US e-commerce spend by 2030, 10–20% market share.
  4. Bain & Company (2025): 15–25% / $300–$500B agentic share projection; walled-garden disintermediation risk.
  5. Gartner (2024–2026): Search volume to drop 25% by 2026; 60% of brands using agentic AI by 2028; 20% of digital commerce via AI by 2030; 90% of B2B buying agent-mediated by 2028; over 40% of agentic AI projects canceled by end of 2027.
  6. Yale / Columbia / University of Chicago study: 20–40% selection-probability drop for missing attributes; small rating differences flipping rankings; trust bonuses for well-reviewed products.
  7. commercetools (2026): Stores with 99.9% attribute completion see 3–4x higher AI recommendation visibility; agents operate on confidence scores.
  8. SparkToro / Rand Fishkin (2025): Agent recommendation ordering is near-random; consideration set is stable and optimizable, ranking is not.
  9. OpenAI / Stripe / Digital Commerce 360 (2025–2026): Instant Checkout and Agentic Commerce Protocol launch; February 2026 US-wide rollout; March 2026 pivot to discovery plus merchant-owned checkout (Target, Sephora, Nordstrom, Lowe's, Best Buy, Home Depot, Wayfair).
  10. Google Cloud (2025–2026): AP2 Agent Payments Protocol launch with 60+ partners; signed Intent/Cart/Payment mandates; donation to FIDO Alliance.
  11. Visa (2025–2026): Trusted Agent Protocol and Intelligent Commerce; 100+ partners, 30+ building in sandbox; agent-authorization framework.
  12. Mastercard (2025–2026): Agent Pay; January 2026 commitment to support AP2.
  13. GeekWire / Winbuzzer (2026): Amazon v. Perplexity; preliminary injunction blocking Comet agent; Ninth Circuit stay on appeal.
  14. HUMAN Security (2025): AI agent traffic up 144% on Black Friday 2025.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) shows exactly which products and prompts agents consider you for, where competitors are selected instead, and which schema, feed, and trust-signal gaps are keeping you out of the consideration set entirely.


The Visibility Gap Most US Brands Are Ignoring: International and Multilingual GEO

Sixty-nine percent of ChatGPT's users are now outside the United States, and the fastest-growing AI markets on earth speak Hindi, Portuguese, Spanish, and Indonesian. But the models default to English sources, ignore the standard signals that tell a search engine which language version to show, and recommend US domains even when the question is asked in another language. Most American brands optimized for an English-speaking machine and never noticed the audience moved.

  • 69%: Share of ChatGPT users located outside the United States, up from 62% in 2024 (DemandSage, 2026)
  • 327%: More visibility in AI Overviews for translated sites vs. untranslated, on non-English queries (Weglot, 1.3M citations)
  • 53 pts: Gap between the best-localizing AI engine (Perplexity, 56.5% non-global citations) and the worst (Gemini, 5.3%) (xfunnel)
  • 90%+: Share of most LLMs' training tokens that are English, despite English being under 20% of the world's speakers (multiple corpora)
Chapter 01

The audience already left, and the optimization didn't follow

For most of the last two years, the entire GEO conversation has been conducted in English, about English-language engines, citing English-language sources, measured against English-language prompts. That made sense when AI search was a coastal-US novelty. It does not make sense anymore, because the audience has globalized faster than almost anyone optimizing for it.

The clearest single number is platform geography. According to DemandSage's 2026 ChatGPT statistics, the United States now accounts for roughly 31% of ChatGPT's user base, down from 38% in 2024, which means about 69% of ChatGPT users sit outside the US. The same data puts India at roughly 48 million users, the second-largest single-country market behind the US, and shows Asia-Pacific climbing from 19% to an estimated 24-28% of global traffic between 2024 and 2026. ChatGPT crossed 1 billion monthly active users in June 2026 (DemandSage), and the marginal new user is far more likely to be in Mumbai, São Paulo, or Jakarta than in San Francisco.

Adoption is not just large abroad; it is more intense abroad. Visual Capitalist's 2026 mapping of AI adoption, drawing on survey data, found that several Global South economies are adopting AI faster than developed nations: worker AI usage reached 92% in India, the highest of any surveyed country, with Brazil third globally at 76%. Rest of World's 2026 reporting describes the same pattern: generative AI is breaking out fastest in markets the US-centric GEO playbook has barely modeled.

Sixty-nine percent of the audience is not in the United States, and the fastest-growing slice of it does not run its first query in English. A GEO strategy tuned only for English-language US prompts is now optimizing for a shrinking minority of the people asking.

The mismatch is the story. Brands poured effort into being cited by an English-speaking machine, while the machine quietly became the default research tool for a planet that is more than 80% non-English-speaking. The visibility gap that opens here is not a rounding error. It is most of the market.

Chapter 02

The English-source bias is structural, not incidental

The reason non-English visibility is so hard is baked into how these models were built. They are English-native systems wearing a multilingual interface, and the imbalance starts in the training corpus.

Common Crawl, the web scrape underlying most foundation models, is itself skewed: its CC-MAIN-2025-47 release was 42% English, with the next language, Russian, at 6.5%, and languages like Hindi, Turkish, and Malay each under 1%. But the final training mix is far more lopsided than the raw web. Published figures put English at roughly 90% of Llama 2's training data, and analyses of GPT-class models estimate over 90% of training tokens are English, leaving the entire rest of human language to share the remainder. Worse, research summarized by Nature in 2025 notes that a large share of what non-English training data exists is machine-translated from English rather than natively written, which means the model often learns a language through a distorted, anglocentric mirror of itself.

Set that against the actual distribution of people. English is the largest language online by speakers, at about 1.19 billion internet users, roughly 26% of the online population (Statista), and English is published on close to half of all websites (Statista, October 2025). Yet fewer than 20% of people on earth speak English at all (LingoBright, 2026). The training data over-represents English by a factor of four-plus relative to the world's speakers.

  • 42% of Common Crawl is English; 6.5% Russian; Hindi/Turkish/Malay each under 1% (Common Crawl, 2025).
  • ~90% of Llama 2 training data is English; 90%+ of GPT-class tokens are English (multiple analyses).
  • English is ~26% of online users but appears on ~half of all websites (Statista).
  • Under 20% of the world speaks English (LingoBright, 2026).

The models are not neutral readers of the global web. They were trained on a corpus where English outweighs every other language combined, and where much of the "non-English" data is English in translation. The default gravity of every answer pulls toward an English source.

This is why the contrast at the heart of FancyAI's thesis matters even more across borders. AI does not rank, it recommends, and the mention is the signal. But the corpus that decides which mention surfaces was assembled with a heavy English thumb on the scale. In a non-English market, being seen is not the problem. Being selected, when the model's instinct is to reach for an English domain, is the entire problem.

Chapter 03

How AI engines cite differently across languages and markets

The bias is structural, but it is not uniform. The most important practical finding of 2026 is that AI engines localize wildly differently from one another, so your visibility in any given country depends heavily on which engine your customers happen to open.

The sharpest measurement comes from xfunnel's analysis of 56,223 citations across four countries and six AI engines. It found a 53-percentage-point gap between the best and worst localizers. Perplexity led at 56.5% non-global citations, with Copilot close behind at 56.0%. The middle pack included Grok at 36.2% and ChatGPT at 29.7%. At the bottom, Gemini sourced just 5.3% of its citations from non-global domains, effectively ignoring local web ecosystems almost entirely. Across the board, 66.5% of top citations came from global, mostly US-based domains, while local ccTLDs like .de and .nl represented only 17.6%, and localized subdomains a near-invisible 0.9%.

Language, not location, is the dominant trigger. Evertune's 2026 testing found that AI responses key primarily off the language of the query rather than the user's location settings: an English prompt surfaces English sources, a Spanish prompt prioritizes Spanish content from local publications. And the engines that do localize do so most aggressively for their top recommendation: the single most regionally adapted citation is usually the #1 result, which is exactly the slot that gets named in a recommendation.

Glenn Gabe's cross-platform testing (GSQI, 2026) showed the same divide at the mechanical level. Querying in French and Italian:

  • Copilot consistently returned correct-language URLs, leveraging Bing's multilingual systems.
  • ChatGPT answered in the right language but cited US English source URLs.
  • Perplexity usually returned the US English version, occasionally the correct one.
  • Claude defaulted to US English versions when asked for sources.
  • Gemini and Google's AI Mode more often returned the correct language version.

Your AI visibility in Germany or Mexico is not one number. It is six different numbers, one per engine, separated by up to 53 points. Win on Perplexity and you may be invisible on Gemini in the same market, for the same query, on the same day.

The strategic consequence is that "which engine" becomes a market-entry decision, not just a tracking detail. The engine your local buyers use determines whether your domain is even in the candidate pool.

Chapter 04

Where AI search is growing fastest, and it isn't the US

The English-source bias would be a manageable footnote if AI search were still concentrated in English-speaking markets. It is the opposite. The growth is overwhelmingly in markets where the model's default instinct works against local brands.

On the surface where Google injects AI, the country skew is dramatic. Semrush data reported in 2026 found that AI Overviews trigger on 37.2% of keywords in Indonesia, 29.1% in the Philippines and Mexico, 26.8% in India, and 26.4% in Nigeria, while the United States ranks 13th at just 20.5%. The AI answer layer is denser in emerging non-English markets than in the US.

Population-level adoption tells the same story. Visual Capitalist's 2026 data shows India leading worker AI adoption at 92% and Brazil third at 76%, with student-adoption rankings led by Brazil (11.6%) and India (11.5%). India's traffic on ChatGPT roughly doubled within a single month after the launch of the lower-cost ChatGPT Go tier at $4.50/month (DemandSage), a price designed precisely for high-volume, non-US markets.

  • AI Overviews fire on 37.2% of Indonesian keywords vs. 20.5% in the US (Semrush, 2026).
  • India: 92% worker AI adoption, the world's highest (Visual Capitalist, 2026).
  • Brazil: 76% worker adoption, third globally; ~5.7% of ChatGPT traffic (Visual Capitalist; DemandSage).
  • Asia-Pacific is an estimated 24–28% of ChatGPT traffic and the fastest-growing region (DemandSage, 2026).

The fastest AI-search growth on earth is happening in Indonesia, India, the Philippines, Brazil, and Nigeria, in languages the models under-trained on, on engines that default to US domains. That is the gap. The demand is exploding exactly where the supply of local citations is thinnest.

For a US brand with any international ambition, this reframes the opportunity. The markets adopting AI search most aggressively are also the markets where the competitive field of well-optimized local content is emptiest. The first competent multilingual GEO program in a category often faces almost no one.

Chapter 05

Translation is not localization, and AI can tell the difference

The instinct, once a brand sees the gap, is to run its English pages through machine translation and call the market covered. That is the single most common and most costly mistake in international GEO, because the evidence shows translation moves the needle but localization is what actually wins selection.

The translation half is real and measurable. Weglot's 2026 study, analyzing 1.3 million citations across Google AI Overviews and ChatGPT for Spanish-language markets, found that translated websites received up to 327% more visibility in AI Overviews on non-English queries than untranslated ones. Translated sites pulled 24% more total citations per query, and critically, untranslated sites showed a 431% gap in citations between Spanish and English queries, versus only a 22% gap for translated sites. Weglot's blunt summary: untranslated means invisible. If your page does not exist in the query's language, the model treats you as if you do not exist for that query.

But translation only opens the door. Localization decides whether the model trusts you enough to name you. The same body of research found that geography compounds language: in US-based testing, Spanish queries still returned predominantly English sources, with only about 32% of citations from Spanish content, but run through a Mexico City connection the Spanish share jumped to roughly 63%. In localized Mexican-market testing, 96% of AI Overview citations came from Spanish sources, and English sources were pushed out of the top five entirely when a Spanish option existed. Language tells the model what to read. Geography and local authority tell it whom to believe.

This is where generic translation fails. The KnowledgeBase research and 2026 practitioner testing converge on a consistent finding: regional slang and idiom outperform "generic Spanish." Content written for Mexico beats content written in neutral textbook Spanish for Mexican queries, and the same holds for Spain versus Latin America, or Brazilian versus European Portuguese. AI systems can detect thin, machine-translated text and are measurably less likely to cite it. The pages that win carry local examples, local expert quotes, local publication mentions, and region-specific data, the regional E-E-A-T signals a machine reads as genuine local authority.

Translation gets you into the language. Localization gets you selected within it. A machine-translated page is visible the way a tourist speaking phrasebook Spanish is audible: technically present, obviously foreign, and rarely the voice anyone trusts.

The practical line is clean. Machine translation is a starting point, never the finish. Every page that matters needs a human editor fluent in the regional variant, regional keyword research instead of translated English keywords, and local proof the model can verify off your domain.

Chapter 06

Regional platforms and the hreflang blind spot

Two technical realities sit underneath everything above, and both are routinely missed by US teams. The first is that large parts of the world do not run on Western engines at all. The second is that the standard signal for serving the right language version is largely ignored by the AI layer.

Start with the platform map. In China, Western AI engines are functionally absent, and the field has consolidated around domestic models. Search Engine Land's 2026 reporting describes a fragmented Chinese ecosystem where Baidu's ERNIE, ByteDance's Doubao, DeepSeek, Kimi, and Alibaba's Qwen dominate. Baidu released ERNIE 5.1 in May 2026, landing at #4 on the LMArena Search Arena leaderboard, and is folding DeepSeek into its search product. Optimizing for these systems is a separate discipline with separate signals, separate hosting and indexing realities, and separate content norms. Korea (Naver) and Japan carry their own platform-and-language dynamics where local-language content is essentially mandatory for local queries. Treating "Asia" as one market, or assuming a ChatGPT strategy transfers to Baidu, is a category error.

The second reality is the hreflang blind spot. Hreflang tags are the web standard that tells a search engine which language and regional version of a page to serve. Traditional Google search honors them. But Glenn Gabe's 2026 testing (GSQI) found that most AI chat platforms largely ignore hreflang, with the consistent exception of Bing-powered Copilot. ChatGPT, Perplexity, and Claude did not reliably use hreflang to pick the correct language URL. Google's AI Mode and Bing did. So the signal you would normally lean on to route a French user to your French page does almost nothing inside most AI chat, today.
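For reference, this is what the hreflang signal looks like on the page. The sketch below generates the `<link rel="alternate">` tags for a set of language versions plus an `x-default` fallback. The URLs and locale codes are placeholders on example.com.

```python
from html import escape


def hreflang_links(versions: dict, default_url: str) -> list:
    """Build <link rel="alternate"> tags for each language/region version of a page."""
    tags = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in sorted(versions.items())
    ]
    # x-default names the fallback for users who match none of the listed versions.
    tags.append(
        f'<link rel="alternate" hreflang="x-default" href="{escape(default_url)}">'
    )
    return tags


# Placeholder versions of one product page.
versions = {
    "en-us": "https://example.com/en-us/jacket",
    "fr-fr": "https://example.com/fr-fr/jacket",
    "es-mx": "https://example.com/es-mx/jacket",
}
links = hreflang_links(versions, "https://example.com/jacket")
```

The same full set of tags belongs in the head of every version, including a self-reference, because search engines expect the annotations to be reciprocal. None of that changes the finding above: the tags route traditional search, and most AI chat platforms do not read them.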

  • China runs on ERNIE, Doubao, DeepSeek, Kimi, and Qwen, not Western engines (Search Engine Land, 2026).
  • ERNIE 5.1 (May 2026) ranked #4 on the LMArena Search Arena leaderboard (Codersera/LMArena).
  • Hreflang is ignored by most AI chat platforms except Bing-powered Copilot (GSQI, 2026).
  • ccTLDs are 17.6% of top citations; localized subdomains just 0.9% (xfunnel).

The signal you trust to serve the right language is invisible to the engines that matter most. Most AI chat does not read hreflang. It reads the content, the language it is written in, and the local authority around it. You cannot route your way into the answer. You have to localize into it.

The takeaway is not to abandon hreflang (it still serves traditional search and may matter more as AI platforms mature) but to stop relying on it as your multilingual AI strategy. The durable signal is genuinely localized content with local authority, because that is what every engine, Western or regional, actually evaluates.

Chapter 07

The execution playbook for multilingual AI visibility

International GEO is not a translation project bolted onto an SEO team. It is a market-by-market selection problem, and it runs on a sequence.

1. Pick markets by AI demand, not legacy revenue. Prioritize where AI search is growing and your category is under-served. The Semrush AI Overview trigger rates and the adoption data point to India, Indonesia, Brazil, Mexico, the Philippines, and Nigeria as high-AI-density, low-competition openings. Rank candidate markets by AI search adoption, engine mix, and how empty the local citation field is.

2. Map the engine mix per market before you write a word. Visibility is six different numbers separated by up to 53 points (xfunnel). Identify which engines your buyers actually use in each market (ChatGPT and Perplexity in much of the West, Copilot where Microsoft is entrenched, Baidu/ERNIE and peers in China, Naver in Korea) and weight effort accordingly.

3. Localize, do not translate. Machine translation is the first draft. Every page that matters gets a regional human editor, regional keyword research, regional idiom, and local examples. Generic Spanish loses to Mexican Spanish; phrasebook output loses to native voice. The 327% translation lift is real, but the 96% local-source dominance only comes with genuine localization.

4. Build local-language authority off your own domain. The mention is the signal in every language. Earn brand mentions and citations on the local-language publications, review sites, forums, and directories the regional engines defer to. Local editorial verification outweighs any amount of self-description, and it is the regional E-E-A-T that pushes English defaults out of the top five.

5. Use ccTLDs and local hosting where you can, but don't bank on routing. ccTLDs (17.6% of top citations) outperform subdomains (0.9%) as a localization signal (xfunnel). Implement hreflang for traditional search, but treat localized content and local authority, not hreflang, as the engine of AI visibility.

6. For regional platforms, build a separate program. Baidu, Naver, and peers are not a translation of your Western strategy. They need native content, local hosting and indexing, and platform-specific structure. Budget for them as distinct workstreams or not at all.

7. Measure selection per market and per engine. Rank tracking by country is not enough. Track share of voice in AI answers per language and per engine, the prompts you appear in and the ones you do not, and which local competitors get named when you are absent. Weglot's data shows pickup is fast where the work is real: roughly 21% of properly translated pages were referenced by AI within 60 days, so the feedback loop is measurable in weeks, not years.
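The measurement in step 7 can be sketched in a few lines. This is an illustrative outline, not FancyAI's method: it assumes you have already collected AI answers into records, and it uses a simplified definition of share of voice (the fraction of answers in a language/engine cell that name the brand). The `AnswerRecord` shape, the brand names, and the prompts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnswerRecord:
    language: str                  # e.g. "es-MX"
    engine: str                    # e.g. "chatgpt"
    prompt: str
    brands_named: tuple[str, ...]  # brands the answer recommended


def share_of_voice(records: list[AnswerRecord], brand: str) -> dict[tuple[str, str], float]:
    """Fraction of answers naming `brand`, per (language, engine) cell."""
    named: dict[tuple[str, str], int] = defaultdict(int)
    total: dict[tuple[str, str], int] = defaultdict(int)
    for r in records:
        key = (r.language, r.engine)
        total[key] += 1
        if brand in r.brands_named:
            named[key] += 1
    return {key: named[key] / total[key] for key in total}


def missed_prompts(records: list[AnswerRecord], brand: str) -> list[AnswerRecord]:
    """Answers where `brand` was absent; brands_named shows who was picked instead."""
    return [r for r in records if brand not in r.brands_named]


# Hypothetical sample data.
RECORDS = [
    AnswerRecord("es-MX", "chatgpt", "mejor software de contabilidad", ("AcmePay", "RivalMX")),
    AnswerRecord("es-MX", "chatgpt", "software de contabilidad para pymes", ("RivalMX",)),
    AnswerRecord("es-MX", "perplexity", "mejor software de contabilidad", ("AcmePay",)),
    AnswerRecord("pt-BR", "chatgpt", "melhor software de contabilidade", ("RivalBR",)),
]
SOV = share_of_voice(RECORDS, "AcmePay")
MISSED = missed_prompts(RECORDS, "AcmePay")
```

Keeping the cell key as (language, engine) rather than country alone reflects the point above that selection varies by engine as well as by market.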

The old job was being seen in a market. The new job is being selected in it, in the local language, on the engine local buyers use, against an English default the model reaches for first. You cannot translate your way there. You localize, you build local authority, and you measure selection one market at a time.

Chapter 08

The first-mover window is open and closing

Every disruption has a window where the work is cheap because almost no one is doing it. International GEO is in that window right now, and the asymmetry is unusually favorable.

Three forces compound. First, demand is exploding in exactly the markets US brands have ignored, with AI Overviews firing on more than a third of Indonesian keywords and worker adoption above 90% in India (Semrush; Visual Capitalist). Second, the competitive field of well-localized content in those markets is nearly empty, because most global brands stopped at machine translation or never localized at all, leaving the local citation ecosystem thin. Third, the structural English bias means the brands that do localize properly are not fighting other localized competitors. They are fighting the model's lazy default toward US domains, which a single well-built local presence can displace, as the 96% Spanish-source dominance in localized Mexican testing shows.

The brands that move now get to define the local answer in their category before the field fills in. The ones that wait will face the same problem they face in English at home: a crowded field, established local authorities, and a model that already has a trusted answer that is not them.

The first-mover window in international GEO is wider than it ever was in SEO, because the demand is bigger, the competition is thinner, and the default the model reaches for is beatable. The brands that localize for selection now will own the answer in markets where their rivals are still translating.

The audience already globalized. The optimization has not. That gap is the opportunity, and it is the one most US brands are still ignoring.

Sources cited

  1. DemandSage 2026 ChatGPT Statistics: US share of users down to ~31% (69% outside US); ~48M Indian users; Asia-Pacific ~28% and fastest-growing; 1B MAU in June 2026; ChatGPT Go at $4.50/month doubling Indian usage.
  2. Weglot "Untranslated Means Invisible" study (1.3M citations, Google AI Overviews + ChatGPT, Spanish markets): 327% more AI Overview visibility for translated sites; 24% more citations per query; 431% vs. 22% query-language gaps; ~32% to ~63% Spanish-source shift with localization; 96% Spanish-source dominance in localized Mexican testing; ~21% pickup within 60 days.
  3. xfunnel "Do AI Search Engines Localize by Country?" (56,223 citations, 4 countries, 6 engines): 53-point localization gap; Perplexity 56.5% and Copilot 56.0% vs. Gemini 5.3%; 66.5% global-domain dominance; ccTLDs 17.6%, subdomains 0.9%.
  4. Glenn Gabe / GSQI (2026): Cross-platform French/Italian testing; Copilot returns correct-language URLs, ChatGPT/Perplexity/Claude default to US English sources; hreflang largely ignored by AI chat except Bing-powered Copilot.
  5. Semrush AI Overviews country data (2026): AI Overview trigger rates: Indonesia 37.2%, Philippines/Mexico 29.1%, India 26.8%, Nigeria 26.4%, US 20.5% (13th).
  6. Visual Capitalist 2026 AI Adoption mapping: India 92% worker adoption (highest), Brazil 76% (third); student adoption led by Brazil 11.6% and India 11.5%; Global South outpacing developed nations.
  7. Statista / LingoBright (2025-2026): English ~26% of online users and ~half of websites; under 20% of the world speaks English.
  8. Common Crawl (CC-MAIN-2025-47) and corpus analyses: Common Crawl ~42% English; Llama 2 ~90% English training data; 90%+ English tokens in GPT-class models; much non-English data machine-translated (Nature, 2025).
  9. Search Engine Land / LMArena (2026): China's fragmented ecosystem (ERNIE, Doubao, DeepSeek, Kimi, Qwen); ERNIE 5.1 launched May 2026 at #4 on the Search Arena leaderboard.
  10. Evertune (2026): AI responses key primarily off query language, not location; engines that localize do so most aggressively for their #1 citation.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) shows exactly which prompts the engines name you in, per market and per language, where local competitors get recommended instead, and which localization and authority gaps are keeping your brand out of the answer outside the US.

Live · FancyAI Research Corpus

Finance Runs on Borrowed Trust: How AI Decides Which Banks, Cards, and Advisors It Recommends

In the highest-stakes vertical AI touches, the brands that own the product almost never own the answer. Across 200,000-plus AI citations in wealth management, NerdWallet appeared in 38% of responses and Bankrate in 35.3%, while the banks, card issuers, and advisory firms whose products were being discussed sat largely outside the citation set. In a regulated, your-money-or-your-life domain, AI does not reward the institution. It rewards the source the institution is described in.

55%
Of Americans asked an LLM for financial advice in the past year, up from 10% — TD Bank, 2026
60%
Of citations in AI finance answers come from publishers and affiliates, not financial institutions — FintelConnect
25.8%
Of financial-services searches now surface an AI summary — Conductor 2026 AEO/GEO Benchmarks
76.7%
Of Gemini's financial-research citations were fabricated in one accuracy test — Beyond Accuracy / arXiv, 2025
Chapter 01

The trust bar in finance is the highest AI applies, and that cuts both ways

Financial services is a YMYL category — "Your Money or Your Life" in Google's quality framework — which means AI engines apply a scrutiny here they apply almost nowhere else. The stakes are real: a wrong answer about a loan, a tax move, or an investment can do measurable harm. That elevated bar reshapes who gets surfaced.

The correlation data makes the size of the gap concrete. E-E-A-T signals — experience, expertise, authoritativeness, trustworthiness — correlate with roughly 8% of ranking weight across general queries, but for YMYL queries that correlation roughly triples to about 24%, according to DollarPocket's 2025 correlation study. In finance, the trust signals that are "nice to have" elsewhere become the deciding factor.

The volume of AI exposure in the category is now substantial. Conductor's 2026 AEO/GEO Benchmarks Report found that 25.8% of financial-services searches surface an AI-generated summary, placing finance among the most AI-visible industries alongside healthcare. Financial services captured 27.3% share of voice in AI Overview visibility, and article content was by far the most-cited page type, with more than 110,000 article pages referenced across the results Conductor analyzed.

Healthcare content appears most frequently in Google's AI Overviews, followed by financial services. Finance is now one of the two most AI-mediated verticals on the internet. — Conductor, 2026 AEO/GEO Benchmarks Report

That visibility arrives at the same moment consumers are turning to AI for the decisions that matter most. TD Bank's second annual survey of 2,500 consumers found that 55% of Americans asked an LLM for financial advice in the past year — up from just 10% the year before. Among Gen Z the figure reaches 77%, and among millennials 72%. EY's global survey of more than 18,000 people across 23 countries put it at 49% of consumers worldwide using AI to support savings and investment decisions in the past six months.

The high bar is not only a constraint. It is a moat. The same scrutiny that keeps thin, unaccredited content out of finance answers rewards the firms that invest in genuine expertise, credentials, and verifiable accuracy. In a vertical where AI demands proof, proof becomes a durable advantage.

Chapter 02

Banks don't own finance answers. Aggregators do.

The single most important fact about GEO in financial services is also the most uncomfortable for the institutions in it: banks did not make the top 10 most-cited finance domains in AI search (Goodie). Not because they fail to publish — they publish constantly — but because they are not the sources AI engines lean on.

The citation hierarchy belongs to aggregators and publishers. In Goodie's analysis of the high-traffic financial-services subindustry, NerdWallet (10.14%) and Bankrate (8.47%) were the most-cited domains overall. In Gregory's larger study of 200,000-plus AI citations across ChatGPT, AI Overviews, Gemini, Claude, and Perplexity — 20,771 responses citing 8,433 domains and more than 27,000 pages — the concentration was even sharper for broad national queries:

  • NerdWallet — 38% of responses
  • Bankrate — 35.3%
  • The Wall Street Journal — 24%
  • CNBC — 20.7%
  • Forbes — 19.3%
  • Barron's — 17.3%

More than 60% of citations in AI finance answers come from publishers and affiliate sites rather than financial institutions themselves (FintelConnect). In Conductor's data, two publishers — NerdWallet and Bankrate — accounted for roughly 15% of all sources cited across financial-product searches. NerdWallet, in Goodie's phrasing, is "the closest thing to a single source of truth in AI finance searches."

Banks didn't make the Top 10 list. It's not because they don't publish content. It's because they're not being referenced by the sources AI models lean on. — Goodie, Most Cited & Trusted Finance Domains in AI Search

This is the gatekeeper dynamic in its purest form, and finance has it worse than any other vertical FancyAI tracks. When a consumer asks an AI for the best high-yield savings account or the best travel credit card, the model assembles its answer from comparison content, product explainers, and definitions drawn from across the ecosystem. The bank's own site is rarely the spine of that answer. As FintelConnect frames it, AI has become the intermediary: it decides which products are relevant, how they are described, and whether they are included at all. If a product is not present in affiliate and comparison content, it may never reach AI consideration.

The implication is structural. The content ecosystem around a financial brand now matters as much as the brand's own website. A bank with the best savings rate in the country can still be invisible in AI answers if NerdWallet, Bankrate, and the trade press have not described it.

Chapter 03

Being seen is not being selected

Finance is where the gap between visibility and influence is widest, and Semrush's AI Visibility Index put a name to it: the Mention-Source Divide. Fewer than 1 in 5 brands are both frequently mentioned and consistently cited as an authoritative source in AI answers (Semrush). Being talked about and being trusted as the source are two different states, and most brands hold only the first.

Finance shows an unusually concentrated answer space. Semrush measured a source-diversity score of just 2.59 for finance in ChatGPT — among the lowest of any industry — meaning a small set of brands dominates financial recommendations while everyone else competes for the margins. Low diversity is good news for the incumbents who already own the citations and brutal news for everyone trying to break in.

The contrast FancyAI draws across every report holds hardest here. AI does not rank financial brands and let the user click down a list. It recommends one or two, in prose, with the reasoning baked in. A mention in passing does not move a decision. Being the cited source behind the recommendation does. In a category where the answer is a single sentence — "a good option for your situation would be X" — the difference between being seen and being selected is the difference between existing and not.

What others say about your brand comes ahead of what you have to say. Different engines lean on different sources of authority. — Semrush, AI Visibility Index

This reframes the entire job. The goal is not to publish more on your own domain. The goal is to become the brand the trusted sources describe accurately, completely, and often enough that the model selects you.

Chapter 04

Each engine has a different definition of "trustworthy"

There is no single finance answer. Each AI engine carries its own trust fingerprint, and the differences are large enough to break any one-size strategy.

  • Gemini and Google AI Mode lean toward institutional and authority sources. Google AI Mode draws heavily on Bankrate and on Yahoo Finance, which Semrush found has become a trusted finance source specifically inside AI Mode. Gemini, in FinTech Weekly's analysis, leans more on financial institutions' own pages than its peers do.
  • ChatGPT draws more from publishers, community, and independent experts. Semrush found that when ChatGPT answers finance questions, Reddit outranks financial experts by 176%, a striking result given that YMYL guidelines are supposed to prioritize accredited authority.
  • Perplexity and Copilot also favor publishers and independents over institutions. ChatGPT referenced roughly 4x more citations than Copilot in FinTech Weekly's testing, a reminder that the same query produces very different source sets across engines.

The breakdown by query type matters as much as the breakdown by engine. Gregory's study found that for national, generic queries, the aggregators and tier-1 media dominate. But for location-based queries, tier-1 media collapses — The Wall Street Journal fell from 24% to 3.1%, CNBC from 20.7% to 2.9% — and local relevance takes over. For persona-driven queries about underserved or niche segments, brand-owned content is most often cited, because the media simply has not covered the niche. And for industry-facing queries — best technology platform, strongest compliance support, most competitive payout — trade media like InvestmentNews, WealthManagement.com, and Financial Planning rank among the top five sources.

The takeaway is operational: a national "best credit card" play, a local "financial advisor near me" play, and a niche "advisor for equity-comp employees" play are three different GEO problems with three different source sets. A single visibility strategy will not work across engines or across query types.

Chapter 05

Accuracy and brand safety: the risk finance can't ignore

In most verticals a hallucinated detail is an embarrassment. In finance it is a liability. AI's tendency to fabricate is well documented, and it is worse for financial content than the headlines suggest. One accuracy test found that ChatGPT-4o invented false references about 20% of the time when asked to cite financial research, and Gemini fabricated citations in 76.7% of cases (Beyond Accuracy, arXiv, 2025).

Regulators are now treating this as a supervised risk. FINRA's 2026 Regulatory Oversight Report defines hallucinations as "instances where the model generates information that is inaccurate or misleading, yet is presented as factual," and warns that a model misstating a regulatory requirement or client detail can drive flawed downstream decisions. FINRA's framework is technology-neutral: existing rules on supervision, communications, recordkeeping, and fair dealing apply directly to generative AI, and firms remain responsible for the outputs. The SEC, FINRA, and the FCA all expect the same rigor applied to AI that applies to any other regulated communication.

This creates a dual mandate that is unique to finance. Brand-safety risk runs in two directions:

  1. AI misrepresenting your products to consumers. If an engine states the wrong APR, the wrong fee, the wrong eligibility rule, or the wrong fiduciary status, the consumer acts on it — and the brand wears the consequence. Structured, machine-readable, current product data is the defense.
  2. Your own use of AI in regulated communications. Any AI-generated marketing or advice content is a regulated communication subject to FINRA and SEC supervision rules. Hallucinated claims are compliance failures, not just marketing errors.

The practical response is the same one that wins GEO: publish accurate, structured, well-attributed product and rate data, keep it current, and make compliance language explicit. Disclosures, fiduciary status, and risk statements are not only legal requirements. They are trust signals AI reads as evidence of a legitimate, accountable source. In finance, compliance content and GEO content are the same content.

Chapter 06

How a bank, insurer, or fintech becomes eligible

Eligibility in finance is earned across the ecosystem, not just on your own domain. The playbook follows the signal hierarchy FancyAI applies everywhere, weighted for a YMYL world.

1. Get into the aggregator and comparison layer. This is non-negotiable and finance-specific. If your product is absent from NerdWallet, Bankrate, and the comparison content AI leans on, it is structurally disadvantaged in AI recommendations. Affiliate and comparison placements now serve a dual purpose: direct acquisition and AI visibility. Treat them as a citation channel, not just a marketing channel.

2. Earn editorial validation from trusted publications. In YMYL, low-authority links do not move the needle — editorial validation from trusted publications is practically required (Search Engine Land's reporting on regulated-industry SEO). Coverage and accurate description in tier-1 and trade media is the authority layer beneath the aggregators.

3. Publish structured, current product data. AI assembles finance answers from comparison-ready facts: rates, fees, terms, eligibility, risks. Mark them up with schema, keep them current, and make them machine-readable. Comparison-heavy, structured content maps directly to how LLMs construct product-focused answers (Goodie).

4. Build credentialed, attributed expertise. Author bios, credentials (CFP, CFA, fiduciary status), and cited sources are the E-E-A-T signals that carry triple weight in YMYL. Attribute content to named, qualified people. Expertise that is visible and verifiable is expertise AI can trust.

5. Make compliance explicit, not buried. Disclosures, risk statements, and regulatory status read as trust to AI, not just to lawyers. The firms that surface compliance clearly look more legitimate to the model, not less.

6. Own your niche before you fight for the head term. Mid-volume keywords once locked up by Investopedia and NerdWallet — "Roth conversion strategies," "tax-loss harvesting," "advisor for equity-comp employees" — are now winnable by focused firms with clear, deep, credentialed content (Kitces). Persona and location queries reward brand-owned content precisely because the aggregators have not covered the niche. Topical depth in a narrow lane builds citable authority faster than competing head-on for "best savings account."
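Step 3 above calls for marking up product facts with schema. One way to do that is JSON-LD using schema.org vocabulary; the sketch below builds such a block for a deposit product. The types and properties (`DepositAccount`, `BankOrCreditUnion`, `interestRate`, `feesAndCommissionsSpecification`) are schema.org terms, while the helper names, the bank, the rate, and the URL are hypothetical. Whether a given AI engine reads this markup is not something the sources here establish, so treat it as a hygiene step rather than a guarantee.

```python
import json


def deposit_account_jsonld(name: str, provider: str, rate_percent: float,
                           fee_note: str, url: str) -> dict:
    """Describe a deposit product with schema.org terms."""
    return {
        "@context": "https://schema.org",
        "@type": "DepositAccount",
        "name": name,
        "url": url,
        "provider": {"@type": "BankOrCreditUnion", "name": provider},
        "interestRate": rate_percent,                 # stated rate, in percent
        "feesAndCommissionsSpecification": fee_note,  # plain-text fee description
    }


def as_script_tag(doc: dict) -> str:
    """Wrap the JSON-LD document in the script tag used to embed it in a page."""
    return '<script type="application/ld+json">\n' + json.dumps(doc, indent=2) + "\n</script>"


# Hypothetical product, for illustration only.
DOC = deposit_account_jsonld(
    name="Example High-Yield Savings",
    provider="Example Bank",
    rate_percent=4.25,
    fee_note="No monthly maintenance fee",
    url="https://example.com/savings",
)
TAG = as_script_tag(DOC)

if __name__ == "__main__":
    print(TAG)
```

Generating the block from one record, rather than hand-editing it per page, makes it easier to keep rates and fees current everywhere they appear.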

Chapter 07

The execution playbook

For banks, insurers, lenders, card issuers, robo-advisors, and advisory firms, the work sequences in a clear order. Pair each step with measurement: track mentions, citation share, and source-level accuracy per engine, because a national strategy, a local strategy, and a niche strategy will each move different numbers.

  1. Audit your presence across the gatekeepers. Where do you appear in NerdWallet, Bankrate, and comparison content — and is what they say about you accurate and current? Inaccurate aggregator data is a direct AI-visibility liability.
  2. Map the engine and query-type matrix. Test your priority queries across ChatGPT, Gemini/AI Mode, Perplexity, and Copilot, and across national, local, and persona framings. Build the source set for each cell. Do not assume one answer.
  3. Fix and structure your product data. Schema-mark rates, fees, terms, eligibility, and risks. Keep a single source of truth that propagates everywhere, so the model never has stale or conflicting numbers to choose from.
  4. Stand up credentialed content in your niches. Pillar-and-cluster content around defined personas, geographies, or product lines, attributed to named, qualified experts. Win the lanes the aggregators ignore.
  5. Earn editorial and trade validation. Pursue accurate description in tier-1 and trade media — InvestmentNews, WealthManagement.com, Financial Planning for advisory; the major financial press for consumer products.
  6. Treat compliance as a GEO asset. Surface disclosures, fiduciary status, and risk language clearly. It is both legally required and a trust signal AI reads.
  7. Monitor for hallucination and misrepresentation. Check what each engine says about your rates, fees, and status on a recurring cadence. In a YMYL category, an AI error about your product is a brand-safety and compliance event, not just a marketing miss.
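The monitoring in step 7 reduces to comparing what each engine says against a single source of truth. The sketch below assumes the claims have already been extracted from engine answers (by hand or otherwise, which is out of scope here) and only shows the comparison. The `Claim` shape, field names, and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    engine: str  # which AI engine made the statement
    field: str   # which product fact it concerns, e.g. "apr"
    value: str   # the value as stated in the answer


def normalize(value: str) -> str:
    """Ignore case and whitespace so '21.99 %' matches '21.99%'."""
    return value.strip().lower().replace(" ", "")


def find_mismatches(truth: dict[str, str], claims: list[Claim]) -> list[tuple[str, str, str, str]]:
    """Return (engine, field, expected, stated) for every claim that contradicts truth."""
    issues = []
    for c in claims:
        expected = truth.get(c.field)
        if expected is None:
            continue  # field not tracked in the source of truth
        if normalize(c.value) != normalize(expected):
            issues.append((c.engine, c.field, expected, c.value))
    return issues


# Hypothetical source of truth and observed claims.
TRUTH = {"apr": "21.99%", "annual_fee": "$0"}
CLAIMS = [
    Claim("chatgpt", "apr", "21.99 %"),
    Claim("gemini", "apr", "19.99%"),
    Claim("perplexity", "annual_fee", "$95"),
    Claim("copilot", "intro_bonus", "50,000 points"),
]
MISMATCHES = find_mismatches(TRUTH, CLAIMS)
```

Running this on a recurring cadence, with the same source of truth that feeds the product pages, gives compliance and marketing one shared list of misstatements to chase.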

The brands that win finance GEO will not be the ones that publish the most on their own sites. They will be the ones the trusted sources describe accurately, the ones whose structured data the engines can parse without error, and the ones whose expertise and compliance are visible enough that the model selects them with confidence. In a vertical built on trust, AI selects the brand it can trust to be right.

Sources cited

  1. Gregory (via BusinessWire / Morningstar / Yahoo Finance): 200,000+ AI citation study across ChatGPT, AI Overviews, Gemini, Claude, Perplexity; NerdWallet 38% / Bankrate 35.3%; national vs. local vs. persona vs. industry query breakdowns.
  2. Goodie: Most Cited & Trusted Finance Domains in AI Search; NerdWallet 10.14% / Bankrate 8.47%; banks absent from top 10; "single source of truth" framing.
  3. FintelConnect (via FinTech Weekly): 60%+ of finance AI citations from publishers/affiliates; AI as intermediary; aggregators as gatekeepers.
  4. Conductor: 2026 AEO/GEO Benchmarks Report; 25.8% of finance searches surface AI summaries; 27.3% AIO share of voice; 110,000+ article pages cited; two publishers ~15% of sources.
  5. TD Bank: 2026 consumer survey (2,500 respondents); 55% asked an LLM for financial advice (up from 10%); Gen Z 77%, millennials 72%.
  6. EY: Global survey (18,000+ across 23 countries); 49% used AI for savings/investment decisions in six months.
  7. Semrush: AI Visibility Index; Mention-Source Divide (fewer than 1 in 5 brands both mentioned and cited); finance diversity score 2.59; Reddit outranks experts 176%; engine trust differences.
  8. FinTech Weekly: Engine trust fingerprints; Gemini favors institutions; ChatGPT 4x more citations than Copilot; cross-model strategy.
  9. FINRA: 2026 Regulatory Oversight Report; hallucination definition; technology-neutral supervision framework for generative AI.
  10. Beyond Accuracy (arXiv, 2025): Citation fabrication rates; ChatGPT-4o ~20%, Gemini 76.7% in financial-research citation tests.
  11. DollarPocket (2025 correlation study): E-E-A-T correlation ~8% general vs. ~24% YMYL.
  12. Kitces.com: AI opportunity for niche advisors; mid-volume keywords now winnable; clarity and topical depth over domain authority.
  13. Search Engine Land: Regulated-industry SEO; editorial validation from trusted publications practically required in YMYL.
  14. McKinsey: Global Banking Annual Review; gen AI could add $200B–$340B annually to global banking (2.8–4.7% of revenue).

Want this measured against your brand?

The AI Readiness Index (ARI) shows exactly where your institution stands across ChatGPT, Gemini, Perplexity, and AI Mode — who AI cites for your products, whether your rates and fees are represented accurately, and where the aggregator gap is costing you the recommendation. In a YMYL category, measuring accuracy is brand safety.

Live · FancyAI Research Corpus

GEO for Legal: The Directories Own the Citation Layer, and Zero Law Firms Own It Back

Legal is the highest-trust, most-regulated, most locally-driven vertical in AI search — and it triggers more AI answers than any other YMYL category. When a client asks an AI engine to recommend a lawyer, the answer comes from seven directories, not from law firm websites. The firm that built the expertise is often the one the model never names.

77.67%
Share of legal queries that trigger an AI Overview — the highest of any YMYL category (SE Ranking)
~7
Number of directories that own the AI citation layer for virtually every legal query category tested (5WPR / Haute Lawyer, 2026)
65%
Share of Americans who have already used an AI chatbot for legal help (Rev AI Legal Advice Index, 2026)
1 in 3
Worst-case rate at which a leading AI legal-research tool returns fabricated information (Stanford, Journal of Empirical Legal Studies, 2025)
Chapter 01

The elevated authority bar: legal is the hardest YMYL category to be selected in

Generative engines do not answer every question the same way. Ask for a weekend itinerary and the model improvises from a wide pool. Ask "Do I need a lawyer for a first DUI in Texas?" and the model changes posture. It narrows. It hedges. It reaches for sources it can defend in front of a regulator.

This is the YMYL effect — Your Money or Your Life — and legal sits at the very top of it. Google's Search Quality Rater Guidelines flag legal topics as content where bad information can cause real harm to a person's finances, freedom, or family, and the AI systems built on top of search inherit that caution. The data makes the posture concrete. Legal queries trigger an AI Overview 77.67% of the time — the highest rate of any YMYL category, ahead of health at 65.33%, finance at 41.67%, and politics at 16.67%, according to SE Ranking's YMYL research. When someone asks a legal question, an AI answer is now the default, not the exception.

That default matters because of who is asking and why. 65% of Americans have already turned to an AI chatbot for legal help, per the Rev AI Legal Advice Index (2026). They are not outsourcing their cases to a machine. They are gathering information at the earliest stage: 43% feel comfortable using AI to clarify legal terms, 38% use it to research general rights before consulting a professional, and 41% reject AI entirely for high-stakes matters like criminal charges, divorce, or immigration. The pattern is consistent across the corpus — AI is the front door, not the courtroom.

"AI-generated summaries now appear at the top of the results page, frequently before any individual law firm website is shown. Inclusion or exclusion at this stage determines whether the firm's expertise is visible at all." — Harvard Journal of Law & Technology, "AI as the New Front Door to Legal Services," 2025

The Harvard JOLT study examined 50 US law firm websites against AI Overview behavior and found most firms are poorly adapted to AI-mediated search. Pages bury core answers beneath marketing language. Few use FAQ-style headings or structured data. Many omit attorney bylines and statute citations. Local landing pages are thin and generic. The study's conclusion reframes the whole problem: AI-mediated visibility is no longer a marketing question. It is, the authors argue, an emerging professional responsibility issue for firms, regulators, and legal educators alike.

The selectivity is the entire game. And in legal, the model is more selective than almost anywhere else.

Chapter 02

Who owns legal citations today: a seven-directory cartel

In most verticals the question for a brand is "can I get into the answer." In legal, the more honest question is "can I get into a citation set that is already owned by a handful of directories I do not control."

The concentration is not subtle. The 2026 Legal AI Visibility Report — a joint audit by the Haute Lawyer Network and 5WPR across three query types (finder, decision, and elite) and eight practice areas — found that roughly seven ranking directories own the AI citation layer for virtually every legal query category tested. When consumers and businesses ask ChatGPT, Claude, Perplexity, or Google AI Mode to recommend a lawyer or a firm, the answer comes from Chambers, Legal 500, Super Lawyers, Best Lawyers, Martindale, Avvo, and Justia. The report's blunt headline finding: the directory cartel owns the legal citation layer, and zero law firms own their own AI citation layer.

The ownership splits by query intent:

  • Finder queries ("best personal injury lawyer NYC"): Super Lawyers consistently owns the top position, with Justia immediately below and Avvo, Martindale, and FindLaw rounding out the top tier (5WPR / Haute Lawyer, 2026).
  • Elite queries ("best M&A law firm US"): Chambers owns multiple positions and Legal 500 owns parallel positions, reflecting how AI defers to peer-reviewed, editorially-vetted rankings for sophisticated buyers (5WPR / Haute Lawyer, 2026).
  • Decision queries ("is Firm A better than Firm B"): the model synthesizes across directory profiles, reviews, and news rather than reading either firm's own marketing.

This is why directory presence is table stakes and not a nice-to-have. Justia, Avvo, FindLaw, and Martindale function as trusted, structured, pre-verified data sources that AI models lean on precisely because the directories have already confirmed attorney credentials, bar admissions, and practice areas (Martindale-Avvo, 2026). The model treats them as a vetting layer it does not have to rebuild. A firm absent from those listings is not low-ranked. It is structurally invisible to the citation set.

"Sites like Justia, Avvo, and FindLaw act as trusted digital listings that provide authoritative signals for both search engines and AI systems that rely on structured legal data." — Rankings.io, "10 Best Legal Directories for Law Firms in 2026"

There is a deeper structural story here. The major consumer directories — FindLaw, Avvo, Nolo, Martindale — have been consolidated under private equity ownership. The same report frames the strategic question every firm now faces: the directories own the citation layer, the firms own none of it, and the legal industry is, by 5WPR's measure, roughly six quarters behind the AI discovery shift. The contrast that defines legal GEO is not ranking against rival firms. It is whether a firm can build enough independent, verifiable authority to be named alongside — or instead of — the directory that currently answers in its place.

Chapter 03

The signals that drive eligibility

Eligibility in legal GEO is earned through signals the model can verify and cross-reference. Marketing language does not qualify a firm. Provable authority does. Five signal categories carry the weight.

1. Directory presence and credential verification. The seven-directory layer is the first gate. Complete, consistent profiles on Justia, Avvo, Martindale, FindLaw, and — for sophisticated buyers — Best Lawyers, Super Lawyers, Chambers, and Legal 500 are what feed the model verified credentials it will repeat. Martindale-Avvo describes these as the data sources AI relies on because the firm's credentials are already checked. Absence here is disqualifying before any other signal is read.

2. Review volume and consistency. Reviews are a load-bearing eligibility signal in legal, not a vanity metric. Forward Push's testing found that firms with 50 or more aggregated reviews across BBB, Yelp, and Google averaged 52% higher recommendation rates in AI results. One solo attorney who reached the top of both Google Business Profile and ChatGPT results reported that the model "really likes my firm's Google reviews" (r/solofirm). Volume and genuine cross-platform consistency signal a real, active practice the model can vouch for.

3. Practice-area specialization. AI rewards focus and punishes breadth. A family-law firm that practices only family law outranked larger multi-practice competitors in ChatGPT results because the model could cleanly identify it as a specialist (Rocket Clicks). The matching is granular: a birth-injury prompt surfaces firms with demonstrated birth-injury experience, not the umbrella category of medical malpractice (Custom Legal Marketing). Multi-practice firms struggle because the model cannot determine what they do best.

"AI rewards specialization. Multi-practice firms struggle to rank because AI cannot determine what they do best." — Rocket Clicks, 2025

4. Authoritative, jurisdiction-specific legal content. The model can verify specific legal claims and cannot verify vague ones. Citing "California Code of Civil Procedure Section 335.1" is checkable against authoritative databases; "statutes of limitations vary by state" is not (Lexicon Legal Content). A criminal-defense firm that rewrote generic DUI pages to answer the actual questions clients ask — "If I refuse a breathalyzer, will my license automatically suspend?" — with jurisdiction-specific statute citations began appearing in AI Overviews (Lexicon Legal Content). Plain language outperforms legalese: "You must file your injury claim within two years or you may lose your right to sue" beats "the claimant must seek recourse through established tort remedies within the jurisdictional time frame" (Exults).

5. Third-party validation and entity consistency. The model reads the whole web, not just the firm's site. Recommended firms showed up "everywhere — LinkedIn, YouTube, legal directories, local news mentions, bar association references," giving the model a coherent identity it could summarize confidently (Forward Push). NAP consistency (Name, Address, Phone) must be identical across the website, social profiles, and every directory; even "St." versus "Street" variations can damage AI trustworthiness scores (Martindale-Avvo). Bios that read alike because lawyers copy-paste descriptions across platforms actively hurt, because the model cannot distinguish one professional from the next (Best Lawyers).
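The NAP requirement in signal 5 lends itself to a simple check. The following is a minimal sketch, assuming a firm has collected its name, address, and phone as they appear on each platform. The firm details, the source names, the abbreviation map, and the `audit_nap` helper are all hypothetical and purely illustrative. Because the standard described above is character-level identity, every difference is flagged; the sketch only separates formatting variants ("St." versus "Street") from substantive mismatches so the fixes can be prioritized.

```python
import re

# Hypothetical abbreviation map, for illustration only.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "ste": "suite", "blvd": "boulevard"}


def normalize(text):
    """Lowercase, strip punctuation, and expand common abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def digits_only(phone):
    """Reduce a phone number to its digits so formatting is ignored."""
    return re.sub(r"\D", "", phone)


def audit_nap(listings):
    """Compare every listing to the first one and report each difference.

    listings maps a source name to a (name, address, phone) tuple.
    Returns a list of (source, field, kind) tuples.
    """
    reference_source, reference = next(iter(listings.items()))
    findings = []
    for source, record in listings.items():
        if source == reference_source:
            continue
        for field, expected, actual in zip(("name", "address", "phone"), reference, record):
            if actual == expected:
                continue  # identical to the character, nothing to report
            if field == "phone":
                same_after_cleanup = digits_only(actual) == digits_only(expected)
            else:
                same_after_cleanup = normalize(actual) == normalize(expected)
            kind = "formatting variant" if same_after_cleanup else "substantive mismatch"
            findings.append((source, field, kind))
    return findings


# Hypothetical listings for a hypothetical firm.
listings = {
    "website": ("Example Family Law", "12 Main Street, Suite 4", "(303) 555-0100"),
    "directory_a": ("Example Family Law", "12 Main St., Ste 4", "303-555-0100"),
    "directory_b": ("Example Family Law LLC", "12 Main Street, Suite 4", "(303) 555-0100"),
}

findings = audit_nap(listings)
for source, field, kind in findings:
    print(f"{source}: {field} is a {kind}")
```

Treating the website as the reference record is a design choice for the sketch, not a rule from the sources; any single canonical record works as long as every other listing is corrected to match it.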

The inversion every legal marketer needs to internalize: the tactics that produced lift in classic SEO — keyword density, promotional tone, thin local landing pages at scale — are neutral or negative here. Eligibility comes from proof the model can verify, not promotion it has to take on faith.

Chapter 04

Local and practice-area dynamics: two different games at once

Legal GEO is not one problem. It is two, and they behave differently.

The local game is "near me" and city-plus-practice queries — "best personal injury lawyer near me," "divorce attorney in Denver." Here the directories with deep local coverage dominate the answer. Justia, Super Lawyers, FindLaw, and Yelp each maintain city-by-city, practice-by-practice listings that the model reaches for because they aggregate exactly the local signal a consumer wants: ratings, reviews, awards, and case results in a single structured page. A firm wins this game by being densely present in those local listings, carrying strong Google Business Profile reviews, and publishing genuinely local content — not a thin template page swapping in a city name, which the Harvard JOLT study found AI systems routinely ignore.

The practice-area game is "best [practice] lawyer" and elite-buyer queries — corporate, M&A, IP, complex litigation. Here the editorially-vetted, peer-reviewed rankings carry the weight: Chambers and Legal 500 own elite queries because the model trusts their methodology over self-reported marketing. This is where independent recognition compounds. As Best Lawyers' Phil Greer put it, "LLMs have been doubling down on sources of truth that are credible. Rankings like Best Lawyers are extremely important because they are backed by data and peer review."

"AI does not replace professional authority. It amplifies the firms and recognitions that have already earned it." — Best Lawyers, 2025

The two games reward different work. The local game rewards directory density, review volume, and local specificity. The elite game rewards peer recognition, editorial rankings, and substantive thought-leadership content the model can attribute to named, credentialed attorneys. Most firms are better served by picking one game and executing it well. The mistake is running a single generic strategy across both and winning neither.

A note on the moving target: unlike traditional SEO, where rankings stay relatively stable, AI search outputs shift frequently with the underlying data (Good2bSocial). A firm cannot set its directory profiles once and walk away. The sources reviewed here agree that legal AI presence must be audited regularly, because the model's source mix changes faster than a search ranking does.

Chapter 05

Hallucination, brand-safety, and the ethics edge

Legal GEO carries two risks no consumer vertical carries at this magnitude: the model can get the law wrong, and the firm operates under advertising rules that were not written for conversational AI.

Start with accuracy, because legal information is where AI fails most visibly. Stanford's study "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools," published in the Journal of Empirical Legal Studies (2025), found hallucination rates of 17% for Lexis+ AI, 33% for Westlaw AI-Assisted Research, and 43% for GPT-4. Even the tools marketed as hallucination-resistant returned hallucinated output on roughly one in six queries at best and one in three at worst, and the general-purpose model did worse still. The errors are not only fake cases; they include mischaracterizing real cases and citing inapplicable authority. An earlier Stanford study, "Large Legal Fictions" (2024), found general-purpose models hallucinated on between 58% and 88% of over 800,000 verifiable legal questions. For a vertical where wrong information can cost someone their freedom or their custody, those numbers define the trust ceiling.

The courtroom has made the risk concrete. As of early 2026, a public database had cataloged 1,227 cases globally in which generative AI produced hallucinated content submitted to courts, with five to six new documented cases per day (PlatinumIDS). In May 2026, a federal judge in Oregon fined two lawyers a combined $110,000 for submitting 23 fabricated citations and eight invented quotations — the largest AI hallucination penalty in American legal history. More than 300 federal judges have adopted standing orders or local rules addressing generative AI in filings, many requiring certification that every citation has been independently verified against primary sources.

That practitioner-facing crisis maps directly onto the consumer brand-safety problem. The same systems that fabricate citations in a brief can fabricate the law a prospective client reads, attach a firm's name to guidance it never gave, or summarize a firm's content into something it would never publish. When the model misstates a statute of limitations and cites a firm as the source, the firm carries the reputational and potential regulatory exposure for an error it did not make. The defensive move is the same as the offensive one: publish content so clear, so structured, and so precisely cited to actual law that the model has little room to distort it — and monitor relentlessly for where it does anyway.

Most consumers cannot tell accurate legal information from inaccurate. That makes the source the model chooses to cite the de facto safety mechanism — and a brand-safety exposure for every firm name attached to it.

Then there is the ethics layer, which is unique to legal. ABA Model Rule 7.1 prohibits communications about a lawyer's services that are false or misleading, and Rules 7.1 through 7.5 govern attorney advertising broadly. These rules were not written with conversational AI in mind, and as of early 2026 most state bars had not issued formal ethics opinions specifically addressing AI-driven discovery and advertising (AdVenture Media; Justia 50-state survey). That gap creates exposure. Copy that implies guaranteed results — "We Win Cases Like Yours," "Get the Settlement You Deserve" — runs afoul of bar rules in most jurisdictions, and the model can amplify or reframe such claims in ways the firm never approved. Several states impose specific disclosure requirements on awards and recognitions; New Jersey, for instance, mandates disclosure of the awarding organization and its methodology and restricts how certain rankings may be quoted.

The regulatory floor is also rising. New York's AI Disclosure Law, effective June 1, 2026, is the first US state law requiring disclosure of AI-generated advertising content, and it explicitly prohibits presenting AI-generated content as the genuine personal opinion or experience of a specific human when it is not — directly relevant to AI-generated testimonials and synthetic client reviews. And privacy is now a litigated risk: a 2026 federal decision held that consumer conversations with public AI chatbots about legal matters are not protected by attorney-client privilege, which means the firm's own AI-facing content strategy has to account for what clients may have already exposed before reaching out.

The throughline: in legal, the compliance bar and the AI authority bar point the same direction. Verifiable, accurately cited, ethically-compliant content is simultaneously the regulatory requirement and the visibility strategy.

Chapter 06

How firms become eligible: an execution playbook

Legal GEO is buildable. The work is unglamorous and verifiable, which is exactly why it compounds — and why the directories have a six-quarter head start most firms can still close. Sequence it by firm type, because solo and boutique firms and multi-location firms face different versions of the same problem.

Foundation for every firm (weeks 1 to 6).

  • Claim and complete every relevant directory profile — Justia, Avvo, Martindale, FindLaw at minimum, plus Best Lawyers, Super Lawyers, Chambers, or Legal 500 where eligible. These are the data sources the model reads first.
  • Enforce NAP consistency to the character across the website, Google Business Profile, social profiles, and every directory. Even a "St." versus "Street" variation can damage AI trustworthiness scores (Martindale-Avvo).
  • Implement LegalService, Attorney, FAQPage, and Article schema so engines extract practice areas, locations, credentials, and review ratings without guessing.
  • Put named, credentialed attorney bylines — JD, bar admissions, years in practice — on every substantive page, linked to a real credential bio. Generic, author-less content does not earn the citation.
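As an illustration of the schema step in the list above, here is a minimal sketch that builds schema.org JSON-LD for two of the four types named (LegalService and FAQPage) and wraps each in a script tag. Every firm detail is a hypothetical placeholder, the answer text is deliberately left as a placeholder rather than a statement of law, and the `to_script_tag` helper is an assumption made for the example, not part of any library.

```python
import json

# All firm details below are hypothetical placeholders.
legal_service = {
    "@context": "https://schema.org",
    "@type": "LegalService",
    "name": "Example Family Law",
    "telephone": "+1-303-555-0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "12 Main Street, Suite 4",
        "addressLocality": "Denver",
        "addressRegion": "CO",
    },
    "areaServed": "Denver, CO",
    "knowsAbout": ["Divorce", "Child custody"],
}

# One intake-style question; the answer is a placeholder to be written and
# cited by a licensed attorney in the relevant jurisdiction.
faq_page = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "If I refuse a breathalyzer, will my license automatically suspend?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "PLACEHOLDER: plain-language answer citing the governing statute.",
            },
        }
    ],
}


def to_script_tag(data):
    """Serialize a JSON-LD dictionary into an embeddable script tag."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


legal_service_tag = to_script_tag(legal_service)
faq_tag = to_script_tag(faq_page)
print(legal_service_tag)
print(faq_tag)
```

The name, address, and phone values in the markup should match the firm's directory listings exactly, for the same NAP reasons given in the list above.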

Solo and boutique firms — win the local and specialist game.

  • Pick one practice area and one geography and dominate them. Specialization is the single biggest lever a small firm has: the model identifies focused firms cleanly, and those firms outrank broader competitors on intent-specific queries.
  • Build review volume deliberately toward the 50-plus aggregated-reviews threshold across Google, Yelp, and BBB. This is where the 52% higher recommendation rate lives, and it is achievable for a small firm in a way that competing on directory editorial rankings is not.
  • Write consultation-style answer content sourced from actual intake questions, cited to the specific statutes that govern the firm's jurisdiction. Replace one buried, marketing-heavy practice page with a structured FAQ that answers the real question in plain language.

Multi-location and multi-practice firms — engineer entity clarity and avoid the breadth penalty.

  • Resolve the specialization problem structurally: give each office and each practice area its own clearly-defined entity, schema, and substantive local content, rather than a single generic firm page the model cannot parse into specialties.
  • Build genuinely local landing pages with local statutes, local case results, and local attorney bios — not template pages swapping in city names, which AI systems ignore.
  • Govern the content layer: separate public-facing data (practice focus, recognitions, bios) from internal and privileged material, and route AI-facing content through compliance review against Model Rule 7.1 and applicable state advertising rules before publishing.
  • Pursue editorial and peer recognition for elite queries — Chambers, Legal 500, Best Lawyers — because that is the currency the model trusts for sophisticated buyers.

For every firm — govern accuracy and monitor continuously (ongoing).

  • Audit AI presence on a regular cadence, because legal AI outputs shift faster than search rankings. Ask the engines directly whether the firm exists, what they recommend, and how the firm compares to named competitors.
  • Maintain content freshness: stamp pages with "Last updated [Month Year]," run quarterly audits for superseded statutes or case law, and correct any source material the model is misreading.
  • Monitor for misattribution and hallucination — where the model misstates the law and cites the firm, or summarizes the firm's guidance into something it never said — and fix the underlying source so the error stops propagating.
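The audit loop in the list above can be sketched in a few lines. This is a minimal illustration, not the ARI methodology: the prompt templates, the engine labels, the firm names, and the hand-written responses are all hypothetical, and no real AI engine is called. It shows only the bookkeeping: generate the three kinds of audit question, record each answer, and compute how often the firm is named.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    engine: str    # label for the AI engine that was asked
    prompt: str    # the question that was asked
    response: str  # the answer text, pasted in or collected separately


def build_prompts(firm, practice, city, competitor):
    """Existence, recommendation, and comparison questions for one firm."""
    return [
        f"What do you know about {firm} in {city}?",
        f"Who is the best {practice} lawyer in {city}?",
        f"How does {firm} compare to {competitor} for {practice}?",
    ]


def mention_rate(results, firm):
    """Fraction of responses that name the firm (case-insensitive)."""
    if not results:
        return 0.0
    hits = sum(firm.lower() in result.response.lower() for result in results)
    return hits / len(results)


def mention_rate_by_engine(results, firm):
    """Mention rate broken out per engine label."""
    grouped = {}
    for result in results:
        grouped.setdefault(result.engine, []).append(result)
    return {engine: mention_rate(group, firm) for engine, group in grouped.items()}


FIRM = "Example Family Law"
prompts = build_prompts(FIRM, "divorce", "Denver", "Sample Legal Group")

# Hand-written placeholder responses standing in for real engine output.
results = [
    AuditResult("engine_a", prompts[0], "Example Family Law is a Denver divorce practice."),
    AuditResult("engine_a", prompts[1], "Consider Example Family Law or Sample Legal Group."),
    AuditResult("engine_b", prompts[0], "I could not find reliable information on that firm."),
    AuditResult("engine_b", prompts[1], "Sample Legal Group is frequently recommended."),
]

overall = mention_rate(results, FIRM)
per_engine = mention_rate_by_engine(results, FIRM)
print(f"Overall mention rate: {overall:.0%}")
print(per_engine)
```

A substring match is the crudest possible detector. It says nothing about whether the mention was a recommendation or a misstatement, so the responses still need to be read, particularly for the misattribution cases described above.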

The strategic case is straightforward. The directories own the citation layer today and zero firms own theirs. That is a gap, and gaps close for whoever does the verifiable work first. The firm that builds genuine, cited, well-governed authority is the one the model eventually names alongside — and then instead of — the directory currently answering in its place.

The legal brands that win in AI are not the loudest or the best-advertised. They are the most verifiable. In the highest-trust, most-regulated vertical there is, provable authority is the only currency that buys a citation.

Sources cited

  1. SE Ranking, YMYL AI Overview research: legal queries trigger AI Overviews 77.67% of the time, the highest of any YMYL category, ahead of health (65.33%), finance (41.67%), and politics (16.67%).
  2. 2026 Legal AI Visibility Report (Haute Lawyer Network + 5WPR, 31-page audit across three query types and eight practice areas): roughly seven directories (Chambers, Legal 500, Super Lawyers, Best Lawyers, Martindale, Avvo, Justia) own the legal AI citation layer; zero law firms own theirs; 79% of legal professionals use AI internally.
  3. Rev, AI Legal Advice Index (2026): 65% of Americans have used an AI chatbot for legal help; 43% use it to clarify terms, 38% to research rights, 41% reject AI for high-stakes matters.
  4. Harvard Journal of Law & Technology, "AI as the New Front Door to Legal Services" (2025): 50-website study; AI Overviews appear before firm sites; AI-mediated visibility framed as a professional responsibility issue; thin local pages and buried answers ignored.
  5. Stanford, "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools," Journal of Empirical Legal Studies (2025): hallucination rates of 17% (Lexis+ AI), 33% (Westlaw), 43% (GPT-4); "Large Legal Fictions" (2024) found 58–88% on general-purpose models.
  6. PlatinumIDS / federal court reporting (2026): 1,227 cataloged court cases involving AI hallucinations; $110,000 Oregon sanction (largest in US history); 300+ federal judges with AI standing orders.
  7. Forward Push: firms with 50+ aggregated reviews across BBB, Yelp, and Google averaged 52% higher AI recommendation rates; consistent cross-web identity drives recommendation.
  8. Rocket Clicks (2025): specialist firms outrank multi-practice competitors in ChatGPT; AI rewards practice-area focus; dual Google/AI strategy required.
  9. Martindale-Avvo (2025–2026): directories as pre-verified AI data sources; NAP consistency; "indexed to recommended" authority framework.
  10. Best Lawyers (2025): independent recognition and peer review as AI trust anchors; Phil Greer on credible sources of truth; copy-paste bios as a liability.
  11. Lexicon Legal Content / Exults / Custom Legal Marketing: jurisdiction-specific statute citations, plain-language readability, sub-practice matching, and schema types for legal pages.
  12. AdVenture Media / Justia 50-state survey / New York AI Disclosure Law (eff. June 1, 2026): ABA Model Rules 7.1–7.5, gap in state-bar AI guidance, award/testimonial disclosure rules, AI-content disclosure requirements.

Want this measured against your brand?

The AI Readiness Index (ARI) scores how AI engines see your firm — which practice-area and "near me" queries surface you, which directories outrank you, and where the model misstates the law and attaches your name to it. In a vertical where seven directories own the citation layer, that visibility map is the difference between being recommended and being invisible.

Live · FancyAI Research Corpus

The Itinerary Is the New Search Result: How AI Picks Which Hotels and Destinations to Recommend

More than half of US travelers now use AI to plan a trip, and 78% of AI users have booked based primarily on an AI recommendation. But when a traveler asks for the best hotels in a city, the engine names a handful and ignores the rest, and the brands it names are often not the biggest. This is the new front desk.

  • 56%: share of US leisure travelers who used AI for at least one trip in the past year (Phocuswright)
  • 78%: share of AI users who booked travel based primarily on an AI recommendation (TakeUp AI)
  • 37%: how much higher revenue per visit AI-referred travel traffic generates vs. non-AI traffic (Adobe)
  • 70%+: citation share captured by the top three brands in several travel sub-categories (5W)

Chapter 01

The planning journey collapsed into a single prompt

For two decades, travel ran on the longest funnel in consumer commerce. The average leisure trip spanned 36 days from a traveler's first research visit to a first booking, across 29-plus website visits, according to Tripadvisor data cited by Forbes. Inspiration, comparison, validation, and purchase were spread across dozens of tabs, sessions, and devices. The discipline of travel marketing was the discipline of being present in as many of those tabs as possible.

That funnel is collapsing into a conversation. In Phocuswright's research, 56% of US leisure travelers used an AI tool for at least one trip within the past 12 months, more than double the level in 2024. In a Skift Research and McKinsey joint study, the share of travelers using AI tools "extensively" for trip planning rose 124% year over year, from 13% to 30%. McKinsey found that 84% of travelers who used generative AI for planning said it improved their experience.

The behavior is generational and accelerating. 58% of US millennials have used AI for trip planning, versus 45% of Gen Z and just 11% of baby boomers (Phocuswright). And 84% of travelers who have used AI for travel started doing so within the past year, which tells you the curve is still bending upward, not leveling off.

Traditional search is feeling it directly. Phocuswright found search engines' share as a trip-planning starting point dropped from 51% in late 2024 to 36% by the second half of 2025. The first tab a traveler opens is increasingly a chat box, not a search box.

The 36-day, 29-visit journey is compressing into a single conversation. The engine reads the reviews, compares the options, and hands the traveler a shortlist before they ever reach a hotel website or an OTA.

This is the inversion that defines the vertical. AI does not send a curious browser to your property. It sends a traveler who has already been told where to stay. By the time that person sees your name, the comparison is done and the recommendation is made. The question this report answers is the one every hotel, destination, and travel brand now has to ask: how does the engine decide which names to say?

Chapter 02

Being seen vs. being selected: the biggest brand doesn't win the answer

The most important finding in travel GEO is also the most counterintuitive. The brand with the most rooms, the most flights, and the largest loyalty program is frequently not the one the AI names.

In 2026, 5W released the Airlines & Hotels AI Visibility Index, the first study to rank travel brands by citation share inside ChatGPT, Claude, Perplexity, and Google AI Overviews, using 60-plus consumer-intent prompts across leisure, business, family, luxury, and budget travel. The structural pattern was stark. In several sub-categories, the top three brands captured more than 70% of total citation share, leaving more than twenty competitors fighting over the remainder. AI does not distribute visibility. It concentrates it.

But concentration did not follow size. 5W's headline finding was that loyalty programs do not predict AI visibility. Several of the largest loyalty programs in travel underperformed their market share inside AI answers, while smaller brands with stronger earned-media footprints punched well above their weight. The brands that "win the answer" were the ones with sustained tier-1 press coverage and structured authority on the publications the engines trust, not the ones with the biggest paid-media and OTA-distribution spend.

The same dynamic shows up at the property level. A Skift analysis of travel AI visibility found that when AI agents answer questions about a specific hotel brand, third-party and user-generated sources often outrank the brand itself. For questions about a Hyatt property, NerdWallet accounted for 13.6% of citations, more than Hyatt's own website at 10.3%. The brand's own domain was not the primary source of truth about the brand.

  • Earned media is the dominant signal across both airlines and hotels (5W).
  • The top three brands take 70%+ of citation share in several sub-categories; everyone else is rounding error (5W).
  • Third-party publishers can out-cite the brand's own site on questions about that brand (Skift).

AI visibility is a winner-take-most market, and size does not buy a seat. The engine selects a short list and reads it back. Being the biggest chain gets you seen by humans. It does not get you selected by the model.

This is the visibility-versus-selection gap, and in travel it is brutal. You can have rate parity, top OTA placement, perfect classic SEO, and a loyalty program with fifty million members, and still never be named when a traveler asks an engine where to stay. Selection runs on a different set of signals.

Chapter 03

The signals that get a property or destination recommended

If the engine is not leading with your website and not weighting your loyalty tier, what is it using? Travel AI recommendations are assembled from a consistent set of signal classes, and they map directly to what a brand can and cannot control.

1. Reviews, read for substance not stars. Modern engines process thousands of reviews at once, using natural-language processing to scan for recency, sentiment, and specific experiences, tagging themes like "quiet for business travel," "perfect for families," or "dated decor" through pattern recognition (Mediaboom). The engine does not average your star rating. It reads what guests actually said and looks for consistent, specific, recent signal. A property with a slightly lower average but rich, current, detailed reviews can be selected over a higher-rated property with thin or stale review content.

2. Multi-source consensus and consistency. AI recommendation engines build trust by cross-referencing a property across many sources at once: the website, Google Business Profile, OTA listings, Tripadvisor, review platforms, travel blogs, local guides, and social. They look for consistency, credibility, and clarity (Mediaboom). Contradictory data is a disqualifier. Hospitality Today reported that hotels with unclear, incomplete, or inconsistent data risk losing direct bookings outright, because AI agents favor sources with structured, reliable information, and that data-quality gaps could accelerate OTA dependence.

3. Reddit and community consensus. Nearly all AI-recommended hotels appear across YouTube, travel blogs, and Reddit, and the more a property shows up in trusted third-party content described consistently and positively, the more confident the engine becomes (TakeUp AI). The Skift analysis confirmed that user-generated platforms like Reddit are among the most-cited sources in travel answers, frequently out-citing the brands themselves. Community discussion is not a side channel. It is a primary input.

4. Third-party "best of" lists and editorial. For "best hotels in [city]" or "best food tour in Rome" queries, the engine's shortcut is to defer to the publishers humans already trust for that judgment. A placement in a credible roundup, a guidebook, or a destination editorial feature is worth more to AI visibility than any amount of on-site copy claiming you are the best.

5. Structured data and machine-readable content. The controllable foundation. Hotels and destinations that publish content matching how travelers actually phrase questions, structured so generative models can extract and assemble it, perform measurably better (Bored Hotelier). Natural-language queries are now the norm: "pet-friendly hotel near the temple with early check-in," not "hotel in Varanasi" (AxisRooms).

AI does not rank your property in isolation. It ranks it relatively, against alternatives, across reviews, cross-source consistency, community sentiment, and editorial consensus, then names a winner.

Notice the pattern. Four of the five signal classes live off your own domain. The job has shifted from optimizing a website you own to seeding and shaping a consensus you can influence but not author. For an independent hotel, that is the entire game.

Chapter 04

Earned media beats paid placement: where the engines actually look

The single clearest takeaway from 2026 travel research is that the source mix AI engines draw on is dominated by earned and owned content, not advertising or social.

Promodo's analysis of travel AI sources found that ChatGPT relies on earned media most heavily, 61% of sources, followed by owned content at 44%, while social media accounts for less than 1%. That ordering is decisive. The PR mention, the guest contribution, the third-party feature, and the genuinely useful owned guide carry the weight. The paid social campaign and the influencer post carry almost none.

This is why the 5W index found brands leaning on paid media and OTA distribution underperforming brands with sustained press coverage. The engine is not impressed by spend. It is persuaded by the consensus that forms when credible third parties write about you consistently over time. Recomaze's travel research reached the same conclusion from the other direction: brand mentions correlate more strongly with AI travel visibility than backlinks alone, and investment in digital PR, guest contributions, thought leadership, and community engagement builds the footprint AI recognizes as authoritative.

There is a structural reason travel rewards explanation. AI uses fan-out querying, breaking a complex travel request into sub-queries, then assembling an answer from the best source for each (Recomaze). A request like "five-day family trip to Lisbon with a pool hotel and kid-friendly food" fragments into hotel, neighborhood, dining, and activity sub-questions. Content that answers a specific sub-question wins that fragment, even from a small operator. This is how a boutique hotel or a niche tour operator outranks a global platform on a narrow query the platform only answers generically.
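The fan-out mechanism can be pictured with a toy model. The sketch below is an illustration under loud assumptions, not how any engine actually works: the sub-queries are written by hand (a real engine would generate them), the three sources and their text are invented, and "best source" is reduced to counting shared words. It shows only the structural point: each fragment is matched independently, so a small, specific source can win its fragment.

```python
import re


def tokens(text):
    """Lowercased set of alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def best_source(sub_query, sources):
    """Name of the source sharing the most words with the sub-query."""
    query_tokens = tokens(sub_query)
    return max(sources, key=lambda name: len(query_tokens & tokens(sources[name])))


def fan_out(sub_queries, sources):
    """Match every sub-query to its best source independently."""
    return {sub_query: best_source(sub_query, sources) for sub_query in sub_queries}


# Hand-written fragments of the Lisbon family-trip request.
sub_queries = [
    "family hotel with pool in Lisbon",
    "kid-friendly restaurants in Lisbon",
]

# Invented source snippets, for illustration only.
sources = {
    "global_platform": "Hotels in Lisbon. Book rooms and flights.",
    "boutique_hotel": "Our Lisbon family hotel has a heated pool and connecting rooms.",
    "food_blog": "Kid-friendly restaurants in Lisbon where children eat well.",
}

assignments = fan_out(sub_queries, sources)
for sub_query, winner in assignments.items():
    print(f"{sub_query!r} -> {winner}")
```

In this toy run the generic platform page wins neither fragment, because it answers neither one specifically. Real engines weigh far more than word overlap, but the per-fragment selection is what opens the door to niche operators.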

  • ChatGPT travel sources: 61% earned media, 44% owned content, under 1% social (Promodo).
  • Brand mentions correlate with AI travel visibility more strongly than backlinks alone (Recomaze).
  • Fan-out querying lets niche operators win specific sub-queries large platforms answer generically (Recomaze).

In travel, the engine reads the press, the reviews, and the genuinely useful guides. It barely reads social. Earned authority is the currency, and you cannot buy your way to it with media spend.

The corollary is uncomfortable for marketing teams built around paid acquisition. The lever that moves AI visibility is the slowest, least-controllable lever in the toolkit: being genuinely good enough that credible people write about you, then making sure the engine can read what they wrote.

Chapter 05

The booking funnel is moving into the chat, and the economics favor it

The reason this matters now, rather than someday, is the conversion economics. AI is not just a small new traffic source in travel. It is, per visit, the most valuable one.

Adobe's 2026 data found that revenue per visit from AI-sourced traffic is 37% higher than from non-AI traffic, and that AI referrals converted 31% more than other traffic sources during the 2025 holiday season, nearly doubling year over year. Travel was among the fastest-growing sectors for AI visit share, up 233% year over year in the first quarter of 2026, and AI-driven travel traffic rose 539% during the 2025 holiday season. Adobe earlier documented generative-AI-driven travel site traffic up 3,500% off its small base. The growth is steep, and the quality is rising at the same time.

The intent explains the quality. The engine pre-qualifies the traveler. It has already filtered for fit, compared options, read the reviews, and surfaced your property as the answer to a specific need. There is no top-of-funnel browsing to discount. The traveler who clicks through has been handed a recommendation, and recommendation traffic closes.

For independent hotels, this reframes the entire OTA economics question. An OTA booking carries a base commission of 15% to 18% on Booking.com, but the all-in cost reaches 18% to 30% once Genius and Visibility Booster are added, and 17% to 23% on Expedia with Accelerator (Cloudbeds and industry analysis, 2026). A direct booking, by contrast, costs roughly 5% to 12% all-in. GEO is the mechanism to intercept guest intent at the most profitable stage, before the traveler ever reaches an OTA, by being the property the engine names in the first place.
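The commission ranges above translate directly into net revenue per booking. The following is a small worked calculation using the cost ranges quoted in the paragraph; the $1,000 booking value and the `net_range` helper are assumptions chosen for illustration.

```python
# Cost ranges as quoted in the text, expressed as fractions of booking value.
CHANNEL_COST = {
    "Booking.com, all-in": (0.18, 0.30),
    "Expedia with Accelerator": (0.17, 0.23),
    "Direct": (0.05, 0.12),
}

BOOKING_VALUE = 1000.0  # hypothetical booking value in dollars


def net_range(booking_value, low_cost, high_cost):
    """Return (worst-case net, best-case net) revenue for one booking."""
    worst = booking_value * (1 - high_cost)
    best = booking_value * (1 - low_cost)
    return (worst, best)


nets = {
    channel: net_range(BOOKING_VALUE, low, high)
    for channel, (low, high) in CHANNEL_COST.items()
}

for channel, (worst, best) in nets.items():
    print(f"{channel}: ${worst:,.0f} to ${best:,.0f} net")
```

On the hypothetical $1,000 booking, the hotel keeps $700 to $820 through Booking.com at all-in cost, $770 to $830 through Expedia with Accelerator, and $880 to $950 on a direct booking. Even the most expensive direct booking nets more than the cheapest OTA booking in these ranges.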

  • AI travel traffic: 37% higher revenue per visit, 31% better conversion vs. other sources (Adobe).
  • Travel AI visit share up 233% YoY in Q1 2026; holiday AI travel traffic up 539% (Adobe).
  • OTA all-in cost 18–30% (Booking.com) vs. 5–12% for direct (2026 industry analysis).
  • The global travel market is roughly 40% OTA-dominated; AI visibility now shapes that split (AxisRooms).

A channel that delivers 37% higher revenue per visit, converts better, and is growing triple digits a year is not a side experiment. For an independent hotel, it is the most direct path off the 18-to-30% OTA tax.

The platform that built modern travel discovery is itself being disrupted by this shift. Tripadvisor acknowledged on earnings calls that traditional search traffic is falling and will play a smaller role in future bookings, and it partnered with OpenAI to build AI itinerary generators using its review corpus (Hospitality Today, Hotel Dive). When the company that defined travel discovery concedes the model is changing, the signal is unambiguous.

Chapter 06

Agentic booking is coming, but the checkout retreat is the real lesson

The widely predicted endgame is agentic booking, where an AI agent does not just recommend a hotel but books it, pays for it, and manages the trip end to end. Phocuswright found more than 60% of travel businesses surveyed are experimenting with or scaling agentic AI, with 6% already scaling actively. The infrastructure is being built. Expedia partnered with OpenAI on access connectors and with Perplexity for its Comet browser, and ChatGPT launched in-chat apps starting with Booking.com and Expedia.

But 2026 delivered a clarifying reversal. OpenAI spent most of 2025 building an in-chat "Buy Now" capability so travelers could research and book in the same window. In March 2026, it quietly pulled checkout for travel, concluding the category was too complex (Skift). The reason matters: travelers were happy to ask the engine for ideas, but when it came time to enter a credit card, they left and booked somewhere they trusted. Expedia Group's own survey found only 8% of travelers are comfortable letting AI handle the actual booking, even as a majority use AI to plan.

The market read the retreat as good news for the transactional layer. Expedia's share price jumped 12% and Booking Holdings rose 8% on the news (Skift). The emerging consensus: OTAs may lose their grip on the discovery interface, but they remain the trust layer and transactional intermediary. The current model routes a traveler through conversational research in the chat, then hands them off to Booking.com or a hotel's own site to complete the reservation.

The strategic lesson is not to wait for agents. It is that the recommendation, not the transaction, is the contested ground today. The engine decides who to name long before anyone enters a card. And one durable risk hangs over the whole system: AI-generated fake travel reviews rose 137% from 2019 to 2024 (Originality.ai). If the engines increasingly learn from reviews that are themselves machine-generated, the trust foundation under every travel recommendation erodes, which raises the premium on verifiable, first-party, structured proof a property can actually stand behind.

Don't optimize for a buy button that keeps moving. Optimize to be the name the engine says, on whatever surface the traveler is standing. The recommendation is the contested ground. The checkout is downstream.

Chapter 07

The execution playbook for hotels, destinations, and travel brands

The work divides into what you own, what you influence, and what you measure. In travel, the influence column is the largest, and it is where the wins are.

Make your property data machine-readable and consistent everywhere. This is the controllable foundation and the fastest disqualifier to fix. AI agents drop properties with contradictory data.

  • Implement complete Hotel, LocalBusiness, Review, AggregateRating, and FAQPage schema, kept dynamic so rates, availability, and amenities stay current.
  • Enforce data consistency across your site, Google Business Profile, OTA listings, and Tripadvisor. Same name, address, amenities, and descriptions everywhere. Inconsistency reads as unreliability.
  • Render critical content server-side. If amenities, location detail, and policies only load after JavaScript, assume the engine never sees them.
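
One way to approach the first two items is to keep a single canonical property record, generate the schema markup from it, and compare every external listing against it. The sketch below assumes a hypothetical property and field names. The schema.org types and properties (`Hotel`, `PostalAddress`, `LocationFeatureSpecification`, `AggregateRating`) are real, but which fields an engine actually reads is not something this example can confirm.

```python
import json

# Hypothetical canonical record: the single source of truth for the property.
CANONICAL = {
    "name": "Harbor House Hotel",
    "street": "12 Quay Street",
    "city": "Exampleville",
    "postal_code": "00000",
    "country": "US",
    "amenities": ["Free Wi-Fi", "Pet friendly", "Early check-in"],
    "rating": 4.6,
    "review_count": 312,
}


def to_json_ld(rec: dict) -> str:
    """Render the canonical record as schema.org Hotel JSON-LD."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Hotel",
        "name": rec["name"],
        "address": {
            "@type": "PostalAddress",
            "streetAddress": rec["street"],
            "addressLocality": rec["city"],
            "postalCode": rec["postal_code"],
            "addressCountry": rec["country"],
        },
        "amenityFeature": [
            {"@type": "LocationFeatureSpecification", "name": a, "value": True}
            for a in rec["amenities"]
        ],
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rec["rating"],
            "reviewCount": rec["review_count"],
        },
    }
    return json.dumps(doc, indent=2)


def normalize(value: str) -> str:
    """Lowercase, drop periods and commas, and collapse whitespace."""
    return " ".join(value.lower().replace(".", "").replace(",", "").split())


def find_mismatches(canonical: dict, listing: dict,
                    fields=("name", "street", "city")) -> list:
    """Return the fields where a listing disagrees with the canonical record."""
    return [f for f in fields
            if normalize(canonical[f]) != normalize(listing.get(f, ""))]


# Hypothetical OTA listing that abbreviates the street name.
ota_listing = dict(CANONICAL, street="12 Quay St.")
print(find_mismatches(CANONICAL, ota_listing))
print(to_json_ld(CANONICAL))
```

The comparison is deliberately strict: an abbreviation such as "St." for "Street" is flagged, on the assumption that any discrepancy is cheaper to fix than to explain. Generating the markup from the same record at render time is one way to keep it from drifting out of date.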

Earn the third-party signals you cannot author. Roughly 61% of what ChatGPT draws on for travel is earned media (Promodo), and the engine weights consensus across sources it did not get from you.

  • Pursue placements in the "best of" roundups, guidebooks, and destination editorial the engines defer to for "best X" queries. One credible third-party recommendation outweighs a page of self-description.
  • Build genuine, recent, specific review depth. The engine reads reviews for substance and recency, so a steady flow of detailed, current guest stories matters more than a frozen high average.
  • Participate authentically in travel communities, especially Reddit. The strategy is real contribution and being genuinely worth writing about, not seeded spam, which engines detect and discount.

Write for the question, not the keyword. Travelers ask in natural language and the engine fans the query out into sub-questions.

  • Build content clusters: a pillar page plus related articles that demonstrate topical depth on a destination, neighborhood, or experience (Recomaze recommends 5 to 10 supporting pieces per pillar).
  • Answer the specific long-tail questions travelers actually ask: "pet-friendly hotel near the old town with early check-in," "best beachfront hotels for families in [city]," "quiet hotel for business travel near the convention center."
  • Lead with verifiable specifics (location, amenities, policies, distances) that the engine can extract and quote, not atmospheric copy it cannot.
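
Question-shaped content can also be published in the FAQPage schema mentioned earlier. The sketch below builds that markup from question and answer pairs. The questions and answers are invented placeholders, and the function name is an assumption. `FAQPage`, `Question`, and `Answer` are standard schema.org types.

```python
import json


def faq_json_ld(qa_pairs) -> dict:
    """Build a schema.org FAQPage document from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


# Hypothetical answers: each one leads with a specific, checkable fact.
QA = [
    ("Is the hotel pet-friendly?",
     "Yes. Dogs under 20 kg stay free, with up to two pets per room."),
    ("How far is the hotel from the old town?",
     "450 metres, about a 6-minute walk."),
    ("Can I check in early?",
     "Early check-in from 11:00 is available on request."),
]

print(json.dumps(faq_json_ld(QA), indent=2))
```

Each answer states a distance, a limit, or a time rather than a mood, which is the kind of detail an engine can lift and quote.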

Use GEO as your direct-booking strategy. Intercept intent before the OTA. Package experiences with local partners that are only available on your direct channel, differentiating your listing from the commoditized OTA version (Revinate), and make sure those differentiators are described in machine-readable terms.

Measure selection, not just traffic. Classic analytics tell you who arrived. They cannot tell you whether the engine named you in the first place, and in travel the recommendation happens before any click. Track your share of voice in AI answers, the itinerary and "best of" prompts you appear in and the ones you miss, and which competitors get named when you do not. Dedicated travel AI-visibility tools have already emerged, a signal of how fast the category is maturing.
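
As a rough illustration of what measuring selection can look like, the sketch below counts which brands are named across a set of collected AI answers. Everything here is assumed: the prompts, answers, and brand names are invented, and share of voice is defined as a brand's answer appearances divided by all brand appearances. This is one plausible definition, not a description of how the ARI or any specific tool computes it.

```python
import re
from collections import Counter


def names_brand(text: str, brand: str) -> bool:
    """True if the answer text names the brand (whole phrase, any case)."""
    pattern = r"\b" + re.escape(brand) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def mention_counts(answers, brands) -> Counter:
    """Count the answers in which each brand appears at least once."""
    counts = Counter()
    for _prompt, text in answers:
        for brand in brands:
            if names_brand(text, brand):
                counts[brand] += 1
    return counts


def share_of_voice(counts, brands) -> dict:
    """Each brand's appearances as a fraction of all brand appearances."""
    total = sum(counts[b] for b in brands)
    return {b: (counts[b] / total if total else 0.0) for b in brands}


def missed_prompts(answers, brand) -> list:
    """Prompts whose answer did not name the brand."""
    return [prompt for prompt, text in answers if not names_brand(text, brand)]


# Hypothetical (prompt, answer) pairs collected from an AI engine.
ANSWERS = [
    ("best beachfront hotels for families",
     "For families, try Harbor House Hotel or the Seaview Inn."),
    ("quiet hotel near the convention center",
     "Consider the Grand Pier Resort for business travel."),
    ("pet-friendly hotel with early check-in",
     "Harbor House Hotel allows pets and offers early check-in."),
    ("best beachfront hotel",
     "The Seaview Inn is the top beachfront option."),
]
BRANDS = ["Harbor House Hotel", "Seaview Inn", "Grand Pier Resort"]

counts = mention_counts(ANSWERS, BRANDS)
print(share_of_voice(counts, BRANDS))
print(missed_prompts(ANSWERS, "Harbor House Hotel"))
```

The list of missed prompts is the more actionable output: it shows which questions the engine answers with a competitor's name, which is where content and earned-media work would be aimed.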

The old job was ranking a page you control. The new job is being selected by a system you don't, built on reviews, consensus, and earned authority you mostly don't own. Visibility was a placement. Selection is a reputation.

The hotels, destinations, and brands winning this transition are not the ones with the biggest loyalty programs or the deepest ad budgets. They are the ones the engine can read clearly, that credible third parties describe consistently, and that answer the traveler's actual question better than anyone else. The funnel collapsed into a conversation. The only question that matters now is whether your name is in the answer.

Sources cited

  1. Phocuswright (2025–2026): 56% of US travelers used AI for a trip in the past year; millennial vs. Gen Z vs. boomer adoption; search engines falling from 51% to 36% as a starting point; 60%+ of travel businesses experimenting with or scaling agentic AI.
  2. Skift Research / McKinsey, Remapping Travel With Agentic AI (2026): Extensive AI trip-planning usage up 124% YoY to 30%; 84% of gen-AI users say it improved their experience; 90% of the industry experimenting with generative AI.
  3. TakeUp AI (2026): 78% of AI users have booked travel based primarily on an AI recommendation; six-step hotel GEO framework.
  4. 5W, Airlines & Hotels AI Visibility Index (2026): Citation-share rankings across ChatGPT, Claude, Perplexity, and Google AI Overviews; top-three brands take 70%+ in several sub-categories; earned media dominant; loyalty programs do not predict AI visibility.
  5. Skift (2026): Hotels and airlines vs. NerdWallet and Reddit visibility analysis; NerdWallet at 13.6% vs. Hyatt's 10.3%; ChatGPT travel-checkout walkback; OTA share-price reaction (Expedia +12%, Booking +8%).
  6. Adobe Analytics (2026): AI travel traffic 37% higher revenue per visit; 31% higher conversion; travel AI visit share up 233% YoY in Q1 2026; holiday AI travel traffic up 539%; earlier 3,500% travel-traffic surge.
  7. Promodo (2025): ChatGPT travel source mix: 61% earned media, 44% owned content, under 1% social.
  8. Recomaze (2026): Brand mentions correlate with AI travel visibility more than backlinks; fan-out querying; content-cluster framework; 61% organic CTR drop when AI Overviews appear; AI-referred visitors convert up to 23x higher.
  9. Cloudbeds / 2026 OTA commission analysis: Booking.com all-in cost 18–30%, Expedia 17–23%, direct 5–12%.
  10. Expedia Group survey (2026): Only 8% of travelers comfortable letting AI handle the booking.
  11. Hospitality Today / Hotel Dive (2025–2026): Data-quality gaps accelerate OTA dependence; Tripadvisor traffic decline and OpenAI itinerary partnership.
  12. Mediaboom / Bored Hotelier / AxisRooms (2026): How AI reads hotel reviews; multi-source consistency; natural-language query shift.
  13. Originality.ai (2024): AI-generated fake travel reviews up 137% from 2019 to 2024.
  14. Forbes / Tripadvisor: 36-day, 29-visit traditional travel planning journey.

Want this measured against your brand?

The FancyAI AI Readiness Index (ARI) shows exactly which itinerary and "best of" prompts the engines name you in, where competitors are selected instead, and which review, consistency, and earned-media gaps are keeping your property or destination out of the answer.
