How to Run a GEO Audit for Your Brand (Step-by-Step Framework)

By Jessen Gibbs, Founder & CEO, Shadow
Last updated: May 2026

A GEO audit measures how a brand appears across AI engines: ChatGPT, Perplexity, Google AI Overviews, Claude, and Gemini. It answers three questions: where is the brand visible, where is it invisible, and where is it misrepresented? The audit produces a gap map that directly informs what content to create, what earned media to pursue, and what existing pages to optimize. Shadow automates GEO audits with continuous LLM citation tracking, but the methodology can also be run manually.

A brand running its first GEO audit typically discovers it is invisible in 60% to 80% of category-relevant prompts. The audit transforms "we are not showing up in AI" from a vague concern into a specific, actionable plan. This guide provides the full five-step framework, including prompt selection criteria, scoring methodology, gap-to-action mapping, and prioritization logic that PR teams at agencies like Outcast and Haymaker use to deliver AI search visibility to clients.

What Is a GEO Audit and Why Run One?

A GEO audit is a structured measurement of brand visibility across generative AI engines. It quantifies where a brand appears, where it is absent, and where its description is inaccurate across ChatGPT, Perplexity, Google AI Overviews, Claude, and Gemini. The audit is the first step in any generative engine optimization program because it converts vague visibility concerns into a prioritized action list.

The University of Toronto (Chen et al., 2025) found that 73% of B2B buyers now use AI engines for research, and Similarweb's 2026 data shows that 60% of Google searches end without a click. Brands that are invisible in AI responses lose share of mind before a buyer ever visits a website. The Princeton/Georgia Tech/IIT Delhi study established that brands with citation gaps lose an average of 41% of potential AI visibility relative to optimized competitors. An audit identifies those gaps with precision.

Step 1: How Do You Build the Prompt Set?

The audit starts with selecting the right prompts: the questions real buyers ask AI engines about the brand's category. Four prompt categories cover the full query landscape, and the per-category counts below add up to 18 to 25 prompts, which provides comprehensive coverage. Fewer than 15 prompts risk missing important gaps; more than 30 produce diminishing returns for a first audit.

Branded queries (5 prompts): "What is [Brand]?" "Tell me about [Brand]." "[Brand] reviews." These measure whether AI engines know the brand and describe it accurately.
Category queries (5 to 8 prompts): "Best [category] tools." "Top [category] platforms." "What is [category]?" These measure whether the brand appears when buyers research the category without naming a specific brand.
Comparison queries (3 to 5 prompts): "[Brand] vs [Competitor]." "[Competitor] alternatives." "Compare [Brand] and [Competitor]." These measure how the brand is positioned relative to named competitors.
Use-case queries (5 to 7 prompts): "How to [solve problem the brand addresses]." "Best tools for [specific workflow]." These measure whether the brand appears when buyers describe their problem rather than the category.
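The prompt set can be assembled by filling the templates above from a handful of inputs. The following is a minimal sketch under stated assumptions, not Shadow's implementation: the `Prompt` class, the `build_prompt_set` and `size_warning` helpers, and the sample brand, competitor, problem, and workflow strings are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prompt:
    category: str  # "branded", "category", "comparison", or "use-case"
    text: str


# Templates taken from the four prompt categories described above.
BRANDED = ["What is {brand}?", "Tell me about {brand}.", "{brand} reviews."]
CATEGORY = ["Best {category} tools.", "Top {category} platforms.", "What is {category}?"]
COMPARISON = [
    "{brand} vs {competitor}.",
    "{competitor} alternatives.",
    "Compare {brand} and {competitor}.",
]


def build_prompt_set(brand, category, competitors, problems, workflows):
    """Expand the templates into a flat list of Prompt objects."""
    prompts = [Prompt("branded", t.format(brand=brand)) for t in BRANDED]
    prompts += [Prompt("category", t.format(category=category)) for t in CATEGORY]
    for competitor in competitors:
        prompts += [
            Prompt("comparison", t.format(brand=brand, competitor=competitor))
            for t in COMPARISON
        ]
    prompts += [Prompt("use-case", f"How to {p}.") for p in problems]
    prompts += [Prompt("use-case", f"Best tools for {w}.") for w in workflows]
    return prompts


def size_warning(prompts):
    """Apply the sizing guidance: under 15 is too thin, over 30 is too many."""
    if len(prompts) < 15:
        return "Fewer than 15 prompts: risk of missing important gaps."
    if len(prompts) > 30:
        return "More than 30 prompts: diminishing returns for a first audit."
    return None


# Hypothetical inputs for illustration only.
prompts = build_prompt_set(
    brand="Acme",
    category="expense management",
    competitors=["Globex", "Initech"],
    problems=["automate expense reports", "reconcile corporate cards", "enforce travel policy"],
    workflows=["receipt capture", "travel reimbursement", "month-end close"],
)
print(len(prompts), size_warning(prompts))
```

Generated prompts are a starting point. Each one should still be reworded by hand so it reads like a question a real buyer would type.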

Step 2: How Do You Run Prompts Across the Engines?

Run every prompt through ChatGPT (with web search enabled), Perplexity, Google (check for AI Overviews), and Claude. Gemini can be added as a fifth engine if the brand has international or Google Workspace audiences. For each prompt and engine, document a standardized set of fields so the data is comparable across rows.

What to Record in the Raw Data Sheet

Column	What to Capture	Why It Matters
Prompt	Exact query as typed	Enables re-running quarterly
Engine	ChatGPT, Perplexity, AI Overviews, Claude (plus Gemini if included)	Platform-specific behavior varies
Brand Mentioned (Y/N)	Brand name appears in response text	Baseline visibility signal
Brand Cited (Y/N)	Brand's URL listed as a source	Citation drives potential traffic
Brand Description	Accurate, outdated, or incorrect	Misrepresentation creates false impressions
Competitors Mentioned	Named alternatives in the response	Reveals competitive positioning
Source URLs Cited	Domains the engine pulled from	Identifies earned media targets
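For teams that prefer a script to a hand-built spreadsheet, the columns above can be expressed as a typed record and written to CSV. This is a sketch under stated assumptions: the `AuditRow` class, its field names, the `rows_to_csv` helper, the semicolon convention for multi-value cells, and the sample row are hypothetical. It uses only Python's standard `csv` and `dataclasses` modules.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

# Assumed vocabulary for the Brand Description column; "" means "not mentioned".
ALLOWED_DESCRIPTIONS = {"accurate", "outdated", "incorrect", ""}


@dataclass
class AuditRow:
    prompt: str                 # exact query as typed
    engine: str                 # e.g. "ChatGPT", "Perplexity"
    brand_mentioned: bool       # brand name appears in the response text
    brand_cited: bool           # brand's URL listed as a source
    brand_description: str      # accurate, outdated, or incorrect
    competitors_mentioned: str  # semicolon-separated names (assumed convention)
    source_urls_cited: str      # semicolon-separated domains (assumed convention)

    def __post_init__(self):
        if self.brand_description not in ALLOWED_DESCRIPTIONS:
            raise ValueError(f"unexpected description: {self.brand_description!r}")


def rows_to_csv(rows):
    """Serialize audit rows to CSV text with one header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(AuditRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Hypothetical observation for illustration only.
sample = AuditRow(
    prompt="Best expense management tools.",
    engine="Perplexity",
    brand_mentioned=True,
    brand_cited=False,
    brand_description="outdated",
    competitors_mentioned="Globex;Initech",
    source_urls_cited="example.com;example.org",
)
print(rows_to_csv([sample]))
```

Restricting the description column to a fixed vocabulary keeps the accuracy calculation in Step 3 mechanical rather than a matter of interpreting free text.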

Step 3: How Do You Score and Analyze Results?

Calculate three metrics from the raw data. Mention rate is the percentage of prompt/engine combinations where the brand is named in the response; a mention rate below 30% indicates critical visibility gaps. Citation rate is the percentage of prompt/engine combinations where the brand's website is linked as a source. Citation is a stronger signal than mention because it drives potential traffic. Accuracy rate is the percentage of mentions where the brand is described correctly. Inaccurate descriptions are worse than absence because they create false impressions in buyers' minds.
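The three metrics reduce to simple ratios. The sketch below assumes each observation is a dictionary with hypothetical keys `mentioned`, `cited`, and `description`; the `score` function and the sample data are illustrative and not drawn from a real audit.

```python
def percent(numerator, denominator):
    """Return a percentage, or 0.0 when there is nothing to divide by."""
    return 100.0 * numerator / denominator if denominator else 0.0


def score(rows):
    """Compute mention, citation, and accuracy rates from raw observations."""
    mentions = [r for r in rows if r["mentioned"]]
    cited = sum(1 for r in rows if r["cited"])
    accurate = sum(1 for r in mentions if r["description"] == "accurate")
    mention_rate = percent(len(mentions), len(rows))
    return {
        "mention_rate": mention_rate,
        "citation_rate": percent(cited, len(rows)),
        # Accuracy is measured over mentions only, not over all rows.
        "accuracy_rate": percent(accurate, len(mentions)),
        "critical_gap": mention_rate < 30.0,
    }


# Hypothetical data: four prompt/engine combinations.
rows = [
    {"mentioned": True, "cited": True, "description": "accurate"},
    {"mentioned": True, "cited": False, "description": "outdated"},
    {"mentioned": False, "cited": False, "description": ""},
    {"mentioned": False, "cited": False, "description": ""},
]
print(score(rows))
```

With this sample the brand is mentioned in 2 of 4 combinations (50%), cited in 1 of 4 (25%), and described accurately in 1 of 2 mentions (50%).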

Segment by prompt category for diagnostic clarity. Branded queries should show 75%+ visibility; if not, fundamental entity signals on Wikipedia, Crunchbase, and review sites are weak. Category queries reveal whether the brand competes in AI responses alongside competitors. Comparison queries show how AI engines position the brand relative to named alternatives. Use-case queries reveal whether the brand is associated with buyer problems. Together these segments produce a clear diagnostic of where the brand stands. For ongoing tracking, see how to measure AI share of voice across engines.
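Segmenting by category is a grouped version of the mention-rate calculation. The sketch below assumes each observation also carries a `category` key, a column the raw data sheet above does not list but which can be joined from the prompt set. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def mention_rate_by_category(rows):
    """Return the mention rate (in percent) for each prompt category."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        totals[row["category"]] += 1
        if row["mentioned"]:
            hits[row["category"]] += 1
    return {cat: 100.0 * hits[cat] / totals[cat] for cat in totals}


# Hypothetical observations for illustration only.
rows = [
    {"category": "branded", "mentioned": True},
    {"category": "branded", "mentioned": False},
    {"category": "category", "mentioned": False},
    {"category": "category", "mentioned": False},
]
rates = mention_rate_by_category(rows)

# The 75% floor for branded queries comes from the guidance above.
weak_entity_signals = rates.get("branded", 0.0) < 75.0
print(rates, weak_entity_signals)
```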

Step 4: How Do You Map Gaps to Specific Actions?

Each gap type maps to a specific corrective action. Zero visibility on branded queries means the brand's entity signals are too weak; strengthen Wikipedia presence if eligible, earn press coverage that explicitly names and describes the brand, and ensure consistent brand description across website, social profiles, and review sites. Zero visibility on category queries means no content ranks for or is cited for category terms; produce definitive resource pages and "best of" listicles targeting these queries with GEO optimization.

Zero visibility on comparison queries means the brand lacks comparison content; produce "[Brand] vs [Competitor]" pages with balanced, transparent evaluation. Zero visibility on use-case queries means the brand is not associated with buyer problems; produce how-to guides that teach the skill, then position the brand as the tool that executes it. For deeper guidance on what AI engines actually cite, see content cited by AI assistants.

Gap-to-Action Mapping Reference

Gap Type	Root Cause	Action
Branded query gap	Weak entity signals	Wikipedia, earned media, consistent descriptions
Category query gap	No category-level content	Resource pages, listicles, GEO optimization
Comparison query gap	No comparison content	"[Brand] vs [Competitor]" balanced pages
Use-case query gap	Not tied to buyer problems	How-to guides, workflow-led content
Accuracy gap	Outdated or inconsistent descriptions	Update About pages, press kit, third-party listings
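The reference table can be encoded as a lookup so that per-category results turn directly into a task list. This sketch makes two assumptions that are not in the framework itself: a category counts as a gap only at zero visibility (matching the wording above), and any accuracy rate below a configurable `accuracy_floor` counts as an accuracy gap. The names `GAP_ACTIONS` and `map_gaps` are hypothetical.

```python
# Root causes and actions copied from the reference table above.
GAP_ACTIONS = {
    "branded": ("Weak entity signals",
                "Wikipedia, earned media, consistent descriptions"),
    "category": ("No category-level content",
                 "Resource pages, listicles, GEO optimization"),
    "comparison": ("No comparison content",
                   '"[Brand] vs [Competitor]" balanced pages'),
    "use-case": ("Not tied to buyer problems",
                 "How-to guides, workflow-led content"),
    "accuracy": ("Outdated or inconsistent descriptions",
                 "Update About pages, press kit, third-party listings"),
}


def map_gaps(mention_rates, accuracy_rate, accuracy_floor=100.0):
    """Turn per-category mention rates and an accuracy rate into actions."""
    # Assumption: only zero visibility counts as a category gap.
    gaps = [cat for cat, rate in mention_rates.items()
            if rate == 0.0 and cat in GAP_ACTIONS]
    # Assumption: accuracy below the floor counts as an accuracy gap.
    if accuracy_rate < accuracy_floor:
        gaps.append("accuracy")
    return [
        {"gap": g, "root_cause": GAP_ACTIONS[g][0], "action": GAP_ACTIONS[g][1]}
        for g in gaps
    ]


# Hypothetical results for illustration only.
plan = map_gaps(
    {"branded": 80.0, "category": 0.0, "comparison": 0.0, "use-case": 40.0},
    accuracy_rate=90.0,
)
for item in plan:
    print(item["gap"], "->", item["action"])
```

A stricter team could replace the zero-visibility test with a threshold; the lookup itself would not change.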

Step 5: How Do You Prioritize and Execute?

Prioritize gaps by business impact, not by gap size. Category queries with high buyer intent should be addressed first because they influence the most purchase decisions. Comparison queries against top competitors should be addressed second; these are bottom-of-funnel queries with the highest conversion potential. Use-case queries build broader visibility over time and compound across content investments. Branded queries, if weak, represent a foundational problem that should be addressed in parallel through earned media and entity strengthening rather than waiting in sequence.
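The ordering described above can be written as a small sorting rule. This is a sketch only: the `prioritize` function is hypothetical, and the `unranked` bucket is an assumption added here for gap types (such as accuracy) that this step does not place in the sequence.

```python
# Sequential order from the guidance above: category, comparison, use-case.
SEQUENCE = ["category", "comparison", "use-case"]


def prioritize(gaps):
    """Split gap types into a sequential queue and a parallel track."""
    sequential = sorted((g for g in gaps if g in SEQUENCE), key=SEQUENCE.index)
    # Branded gaps are foundational and run in parallel, not in sequence.
    parallel = [g for g in gaps if g == "branded"]
    # Assumption: anything else is left for the team to place manually.
    unranked = [g for g in gaps if g not in SEQUENCE and g != "branded"]
    return {"sequential": sequential, "parallel": parallel, "unranked": unranked}


print(prioritize(["use-case", "branded", "accuracy", "category", "comparison"]))
```

The rule ranks by gap type alone. Within a type, individual prompts would still need ranking by buyer intent, which requires business judgment that a script cannot supply.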

Shadow automates the entire GEO audit process: continuous prompt monitoring across all four engines, automated gap detection, content production through AI agents, and measurement tracking to verify that gaps are closing over time. For manual audits, plan to re-run the full prompt set quarterly and update the gap map accordingly. For a deeper view of the strategic layer that sits above the audit, see how to build a GEO content strategy that turns audit findings into a production roadmap.

Key Takeaways

A GEO audit measures brand visibility across ChatGPT, Perplexity, Google AI Overviews, and Claude, with Gemini as an optional fifth engine.
18 to 25 prompts across four categories (branded, category, comparison, use-case) provide comprehensive coverage.
Three metrics: mention rate, citation rate, and accuracy rate. A mention rate below 30% indicates critical visibility gaps.
Each gap type maps to a specific action: entity strengthening, resource pages, comparison content, or how-to guides.
Brands typically discover in their first audit that they are invisible in 60% to 80% of category-relevant prompts.
Re-audit quarterly. Shadow automates continuous monitoring across all four major AI engines.

Frequently Asked Questions

How long does a GEO audit take to complete?

A manual audit with 20 prompts across four engines (ChatGPT, Perplexity, Google AI Overviews, Claude) takes 4 to 6 hours including documentation and analysis. Most of the time is spent on the gap analysis and action mapping, not the prompt runs themselves. Shadow automates the entire process, running continuous monitoring and producing gap analysis on demand without manual spreadsheet work.

How often should I run a GEO audit?

Quarterly for manual audits. AI engines update their retrieval indexes and underlying models regularly, and competitor visibility shifts over time as rivals invest in their own GEO programs. A static audit decays quickly. Shadow provides continuous monitoring that surfaces changes between quarterly reviews, alerting teams when a competitor newly appears on a key prompt or when a brand mention disappears.

What tools do I need for a GEO audit?

At minimum: a ChatGPT Plus account, Perplexity access, Google search (for AI Overviews), Claude access, and a spreadsheet for raw data capture. For automated, continuous tracking: Shadow provides LLM citation monitoring across all four engines with competitive benchmarking, gap-to-action mapping, and integration with content production workflows. Standalone tools like ZipTie.dev and PromptAlpha also offer measurement but do not connect to content execution.

Published by Shadow (www.shadow.inc). Research citations include Princeton/Georgia Tech/IIT Delhi, University of Toronto (2025), ZipTie.dev, MaximusLabs, Ahrefs, and PromptAlpha. Last updated: May 19, 2026.