Checking whether your brand shows up in AI search isn't one prompt in one tool. It's a structured set of prompts, run in fresh sessions, across all four major platforms. A single ChatGPT check tells you almost nothing, because the platforms disagree about who to cite often enough that any one of them on its own is misleading.
This is the prompt set I actually use, organised by buyer intent, plus what to watch for on each platform. Copy it, swap in your category and competitors, and you've got a repeatable visibility test you can run every month.
One prompt in one tool is a vibe check. Four intent types across four platforms is a measurement.
People don't arrive at your brand in one step, so testing only branded prompts tells you the least useful thing. Run four categories, in this order, from least to most branded. The earlier categories are where you're most likely to be absent, which is exactly why they matter. This applies whether you sell software, run a clinic, fit kitchens, or publish recipes: the intent ladder is the same, only the wording changes.
- Problem-aware. The person has a problem and no solution in mind. "How do I fix [problem]", "best way to [achieve outcome]". This is where most brands are invisible, and where being cited matters most because the buyer hasn't chosen anyone yet.
- Comparison. They're weighing options. "[Option A] vs [Option B]", "best [category] for [need]". Watch whether you're in the considered set at all.
- Category. They're learning the space. "What should I look for in [category]", "how does [category] work". Tests whether you're treated as an authority on the topic, not just a name for sale.
- Brand-check. They already know you. "What do you know about [your brand]", "is [your brand] any good for [use case]". Tests what the models say about you when asked directly, which is its own reputation signal worth watching.
The bracketed templates are the skeleton. What they look like filled in depends on your sector and your size, so below is a full library: twelve sectors, each with three lists of twenty prompts for small, mid-market and enterprise businesses. That's 720 worked prompts, spread across the funnel from problem-aware to brand-check. Open the sector closest to you, copy the size that fits, and you've got a ready-made test set.
Why twenty per list and not a handful? Because a single phrasing under-samples. A model can cite you for "best running shoes for flat feet" and miss you entirely on "what trainers are good for fallen arches", which is the same intent worded differently. Twenty variations across the funnel is the smallest set that gives you a reading you can trust rather than a one-off result.
Put your brand in the box below and every brand-check prompt updates to use it. Add up to three competitors and the comparison prompts fill in their names too. Then open your sector, hit Copy CSV on the list you want, and paste it straight into a spreadsheet to start logging results.
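If you'd rather build the same sheet by script, here is a minimal sketch in Python. It fills one template per intent type and writes one row per prompt per platform. The template wording comes from the four categories above; the column names, the `build_rows` and `to_csv` helpers, and the "{brand} vs {competitor}" pairing are my own assumptions for illustration, not the format the Copy CSV button produces.

```python
import csv
import io

PLATFORMS = ["ChatGPT", "Perplexity", "Claude", "Gemini"]

# One template per intent type. A real test set would carry many
# phrasings per intent; these four are only the skeleton.
TEMPLATES = [
    ("problem-aware", "How do I fix {problem}"),
    ("comparison", "{brand} vs {competitor}"),  # assumed pairing
    ("category", "What should I look for in {category}"),
    ("brand-check", "What do you know about {brand}"),
]

# Hypothetical column layout for the results log.
FIELDS = ["intent", "prompt", "platform", "date", "cited", "cited_page"]


def build_rows(brand, competitors, category, problem):
    """Return one empty log row per filled prompt per platform."""
    rows = []
    for intent, template in TEMPLATES:
        # Comparison prompts get one variant per competitor.
        rivals = competitors if "{competitor}" in template else [None]
        for rival in rivals:
            prompt = template.format(
                brand=brand, competitor=rival,
                category=category, problem=problem,
            )
            for platform in PLATFORMS:
                rows.append({
                    "intent": intent, "prompt": prompt,
                    "platform": platform,
                    "date": "", "cited": "", "cited_page": "",
                })
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text, header first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_csv(build_rows("Acme", ["Rival One"], "running shoes", "flat feet pain"))` gives you text you can paste into a spreadsheet, with the date and citation columns left blank for you to fill in as you run each prompt. The brand and competitor names there are placeholders.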
The pattern holds across every sector. The problem-aware rows almost never name a brand, which is exactly why they're the hardest to show up in and the most valuable when you do. If you only test the brand-check rows, you'll feel great and learn nothing: of course the model knows who you are when you name yourself. The test that matters is whether it reaches for you when the person describes a need and names no one.
The reason you test all four platforms is that they retrieve differently, so they cite differently. A page that Perplexity loves can be invisible in Gemini, and vice versa. Here's the practical shape of each.
The split that matters most: Perplexity and Claude lean on live retrieval, so freshness and crawlability decide whether you're in. Gemini leans on Google's index, so your existing Google footprint carries more weight. ChatGPT sits in between and will sometimes skip retrieval entirely. Test all four and the disagreement between them tells you where your content is and isn't reaching. If you're focused on Google specifically, the AI Mode playbook is the next step.
Running a prompt is easy. Reading the result correctly is where people slip. When you run "best running shoes for flat feet" in Perplexity, you'll get an answer with a handful of numbered citations down the side. The questions to ask, in order:
- Are you cited at all? Not "is your category mentioned", but does a link to your domain appear in the sources. Brand name in the prose without a citation is weaker, but still worth logging separately.
- Which page got cited? Often it's not the page you'd expect. A model will pull your buried comparison section over your polished category page if the section answers the facet better. That tells you which content is actually doing the work.
- Who else is in the answer? The other cited sources are your real AI-search competitors for that prompt, which is frequently not the same list as your ranking competitors. Publishers, forums, and review sites show up here constantly.
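Those three questions can be turned into a small check once you've copied the answer text and its source links out of the platform by hand. This is a rough sketch, not a scraper: `read_result`, its argument names, and the returned keys are hypothetical, and it assumes you've pasted the cited URLs into a list yourself.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the hostname of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def read_result(answer_text, cited_urls, brand, own_domain):
    """Summarise one answer: cited or not, which page, who else."""
    own_pages = []
    other_sources = []
    for url in cited_urls:
        domain = domain_of(url)
        # Count subdomains of your own site as yours.
        if domain == own_domain or domain.endswith("." + own_domain):
            own_pages.append(url)
        elif domain and domain not in other_sources:
            other_sources.append(domain)
    return {
        # A link to your domain appears in the sources.
        "cited": bool(own_pages),
        # The exact page(s) the model pulled from.
        "cited_pages": own_pages,
        # Named in the prose but not linked: weaker, logged separately.
        "mentioned_only": (
            not own_pages and brand.lower() in answer_text.lower()
        ),
        # Everyone else cited for this prompt.
        "other_sources": other_sources,
    }
```

The `other_sources` list is the one to keep an eye on over time, since it shows which publishers, forums, and review sites you're actually competing with for that prompt.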
One honest warning: the same prompt can return different sources on different days, and the same prompt run logged-in versus logged-out can differ too. That's not a bug in your method, it's the nature of live retrieval. It's also the reason a single run is anecdote and a monthly average is data. Don't over-react to one result in either direction.
The prompts are only half the job. Without a consistent way to record results, you've got anecdotes, not a trend. Keep it simple: a spreadsheet, one row per prompt per platform.
- Run each prompt in a fresh, logged-out session so personalisation doesn't skew the result.
- Record whether you were cited (yes or no), and if yes, which page or source was cited.
- Note the date. AI answers change, so a result without a date is noise.
- Repeat monthly and track the direction, not just the absolute number.
Count the prompts where you were cited and divide by the total prompts tested on that platform: that ratio is your citation share. It's the metric I'd report instead of keyword position, and I've written up the full method in "citation share is the new keyword ranking".
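As a worked sketch of that arithmetic, the Python below computes citation share per platform per month from logged rows, then the month-over-month change. The row keys (`month`, `platform`, `cited`) and both function names are assumptions about how you might lay out your own sheet, and the sample rows are made up purely to show the calculation.

```python
from collections import defaultdict


def citation_share(rows):
    """Map (month, platform) to cited prompts / total prompts tested."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        key = (row["month"], row["platform"])
        total[key] += 1
        if row["cited"]:
            cited[key] += 1
    return {key: cited[key] / total[key] for key in total}


def direction(shares, platform, earlier, later):
    """Change in share between two months; positive means improving."""
    return shares[(later, platform)] - shares[(earlier, platform)]


# Illustrative data only: four prompts on one platform, two months.
sample = [
    {"month": "2025-01", "platform": "Perplexity", "cited": True},
    {"month": "2025-01", "platform": "Perplexity", "cited": False},
    {"month": "2025-01", "platform": "Perplexity", "cited": False},
    {"month": "2025-01", "platform": "Perplexity", "cited": False},
    {"month": "2025-02", "platform": "Perplexity", "cited": True},
    {"month": "2025-02", "platform": "Perplexity", "cited": True},
    {"month": "2025-02", "platform": "Perplexity", "cited": False},
    {"month": "2025-02", "platform": "Perplexity", "cited": False},
]

shares = citation_share(sample)
change = direction(shares, "Perplexity", "2025-01", "2025-02")
```

With this sample, one of four prompts is cited in the first month and two of four in the second, so the share moves from 0.25 to 0.5. Keeping the share separate per platform matters because the platforms disagree; a blended number would hide exactly the gaps you're testing for.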
Start with ten prompts this week, not forty. A small set run consistently beats a big set run once and abandoned. The point isn't a perfect audit. It's a number you can watch move.