Lead with the distinction that matters: data type. Everything else follows from whether the number was measured at the source or inferred from the outside.
Bing's AI Performance report is the only source of first-party AI citation data any platform gives you, and it's free. SimilarWeb and the third-party AI trackers are modelled estimates built from panels and traffic inference. Both are useful. They answer different questions, and the mistake I see most often is people quoting a third-party estimate to a CMO as if it were measured fact.
This isn't a "how to use the Bing report" post. I covered that in the Bing AI Performance report walkthrough. This is about what first-party data and third-party estimates each can and can't tell you, and when to trust which.
One of these is measured. The other is modelled. Report them as if they're the same and you'll get caught.
The comparison.
| Source | Data type | What it measures | Coverage |
|---|---|---|---|
| Bing AI Performance | First-party, measured. | Which of your pages Copilot cited, and the grounding queries that retrieved them. | Your verified site only. Bing and Copilot only. Free. |
| SimilarWeb / traffic trackers | Third-party, estimated. | Modelled referral traffic from AI platforms, inferred from panels and clickstream. | Any domain, including competitors. Cross-platform. Paid. |
| GEO citation trackers | Third-party, sampled. | Citation presence from running prompt sets against the platforms and recording results. | Any brand, multiple platforms, limited to the prompts sampled. Paid. |
What each is actually good for.
Bing AI Performance: ground truth, narrow scope
Because it comes from inside Microsoft's system, the Bing report is the closest thing to certainty you can get. When it says a page was cited for a grounding query, it was. The trade-off is scope: it only covers your verified site, and only Bing and Copilot. It tells you nothing about competitors and nothing about ChatGPT, Perplexity, or Gemini. Use it as your baseline truth for the slice it covers.
Third-party trackers: breadth, with error bars
The value of SimilarWeb and the GEO trackers is everything Bing can't give you: competitor visibility, cross-platform coverage, and a market-level view. The cost is accuracy. These are models. SimilarWeb infers AI referral traffic from panels and clickstream data; GEO trackers sample a finite set of prompts. Both are directional. They're genuinely useful for spotting trends and benchmarking against competitors, as long as you hold them at the right confidence level.
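To put a rough number on "directional" for the prompt-sampling case, here is a minimal sketch using the Wilson score interval, a standard way to put error bars on a proportion from a small sample. It's my illustration, not a method any tracker vendor publishes, and the counts are hypothetical. It also assumes each prompt is an independent yes/no trial, which real prompt sets only approximate.

```python
import math


def wilson_interval(cited: int, prompts: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a citation rate.

    cited   -- number of sampled prompts whose answer cited the brand
    prompts -- total number of prompts sampled
    """
    if prompts <= 0:
        raise ValueError("prompts must be positive")
    if not 0 <= cited <= prompts:
        raise ValueError("cited must be between 0 and prompts")
    p = cited / prompts
    denom = 1 + z**2 / prompts
    centre = (p + z**2 / (2 * prompts)) / denom
    half = z * math.sqrt(p * (1 - p) / prompts + z**2 / (4 * prompts**2)) / denom
    # Clamp to [0, 1] so rounding never produces an impossible rate.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: the same 30% citation rate from two sample sizes.
low_small, high_small = wilson_interval(3, 10)
low_large, high_large = wilson_interval(30, 100)
print(f"3 of 10 prompts:    {low_small:.0%} to {high_small:.0%}")
print(f"30 of 100 prompts:  {low_large:.0%} to {high_large:.0%}")
```

With ten prompts, a 30% citation rate is compatible with anything from roughly 11% to 60%. With a hundred, the range narrows to roughly 22% to 40%. Same headline figure, very different confidence, which is why the sample size belongs next to any sampled number you quote.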
The competitor angle is what makes them worth paying for. Bing can tell a hotel chain which of its own pages Copilot cited. Only a third-party tool can estimate that a rival chain is pulling roughly twice the AI referral traffic, or that a travel publisher is eating both their lunches in the answers. You'd never get that from any first-party source, because no platform shows you a competitor's citation data. The trade is breadth for precision, and for competitive context that's usually the right trade.
A worked example
Say you run marketing for a mid-size accountancy firm. Bing's report shows Copilot cited your "self-assessment deadlines" guide on 40-odd grounding queries last month. That's measured, defensible, and tells you that guide is your strongest AI asset on Bing. Useful, but it's one platform. You open SimilarWeb and it estimates a larger national firm is getting several times your AI referral traffic, and that a personal-finance publisher is showing up across the same topics. Also useful, but it's modelled. Then you run ten prompts yourself in ChatGPT and Perplexity ("when is the self-assessment deadline", "how do I file a tax return") and find you're cited in neither, while the publisher is in both. Three sources, three confidence levels, one coherent picture: strong on Bing, invisible on the platforms your clients probably use more. No single tool would have told you that.
What neither one can see.
The gap between them gets the attention. The gap they share gets missed, and it's the more important one. Even taken together, Bing's first-party report and the third-party trackers leave a large blind spot, and pretending otherwise is how measurement goes wrong.
Bing's report covers Bing and Copilot, full stop. It says nothing about ChatGPT, Perplexity, Claude, or Google's own AI surfaces. The third-party trackers model traffic or sample prompts, so they can gesture at those platforms but can't tell you with certainty whether a specific page was cited for a specific query. Put bluntly: nobody hands you measured citation data about your own site for ChatGPT or Gemini. That data doesn't exist as a feed.
This matters because the platforms your customers use may be exactly the ones in the blind spot. A consumer brand whose buyers live in ChatGPT, a clinic whose patients ask Gemini, a software company whose users trust Perplexity: in every case the tools above give you a partial picture at best. The honest position is that AI search measurement in 2026 is incomplete, and the way you fill the gap is to run the prompts yourself.
How to combine them.
You don't pick one. You layer them, and you keep straight in your own head which layer is measured and which is modelled.
- Bing AI Performance as your baseline truth. It's free and first-party. Start every measurement with what it actually shows about your own citations.
- Third-party trackers for competitive context. Use SimilarWeb or a GEO tracker to see where competitors sit and how the market is moving, holding the numbers as directional.
- Your own prompt testing for the gaps. Neither source cleanly covers ChatGPT, Perplexity, Claude and Gemini for your brand, so run a structured prompt set yourself to fill that in. The method is in "The prompts I use to test AI search visibility", and the metric those prompts feed is in "Citation share is the new keyword ranking".
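For the third layer, a spreadsheet is enough, but the arithmetic is easy to sketch in code. The example below assumes you record by hand which brands each answer cited, and it takes citation share to mean the fraction of tested prompts on a platform whose answer cited the brand. That definition is my assumption for illustration. The log rows and the brand labels (`us`, `publisher`, `national_firm`) are placeholders, not real results.

```python
from collections import defaultdict

# Hypothetical hand-recorded log: one row per prompt per platform,
# listing the brands cited in that answer.
PROMPT_LOG = [
    {"platform": "ChatGPT", "prompt": "when is the self-assessment deadline",
     "cited": ["publisher"]},
    {"platform": "ChatGPT", "prompt": "how do I file a tax return",
     "cited": ["publisher", "national_firm"]},
    {"platform": "Perplexity", "prompt": "when is the self-assessment deadline",
     "cited": ["publisher"]},
    {"platform": "Perplexity", "prompt": "how do I file a tax return",
     "cited": ["publisher"]},
]


def citation_share(log: list[dict], brand: str) -> dict[str, float]:
    """Fraction of tested prompts, per platform, whose answer cited `brand`."""
    tested = defaultdict(int)
    cited = defaultdict(int)
    for row in log:
        tested[row["platform"]] += 1
        if brand in row["cited"]:
            cited[row["platform"]] += 1
    return {platform: cited[platform] / tested[platform] for platform in tested}


for brand in ("us", "publisher", "national_firm"):
    print(brand, citation_share(PROMPT_LOG, brand))
```

Keeping the raw log matters more than the share figure itself. It lets you rerun the same prompts next month and compare like with like, and it makes the sample size visible to anyone who checks the number.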
The one rule when you report these numbers.
First-party data gets reported as fact. Estimates get reported as estimates, with the source named. "Bing's report shows we were cited on 38 grounding queries" is a measurement. "SimilarWeb estimates competitor X gets roughly 2x our AI referral traffic" is a model output. Blur the line and the first time someone checks, your whole report loses credibility.
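If your report is assembled from a script or a data export, one way to enforce the rule is to make the data type a required field on every number, so a figure can't reach a slide without its label. This is a minimal sketch. The `Metric` class, the `VERBS` mapping and the three type names are hypothetical, chosen to mirror the comparison table above.

```python
from dataclasses import dataclass

# Hypothetical mapping from data type to the verb that signals it in a report.
VERBS = {
    "measured": "shows",
    "estimated": "estimates",
    "sampled": "found, in a limited sample, that",
}


@dataclass(frozen=True)
class Metric:
    source: str  # who produced the number, named explicitly
    kind: str    # "measured", "estimated" or "sampled"
    claim: str   # the finding, written without a verb


def report_line(metric: Metric) -> str:
    """Render a metric so its data type is visible in the sentence itself."""
    if metric.kind not in VERBS:
        raise ValueError(f"unknown data type: {metric.kind!r}")
    return f"[{metric.kind.upper()}] {metric.source} {VERBS[metric.kind]} {metric.claim}."


metrics = [
    Metric("Bing's report", "measured", "we were cited on 38 grounding queries"),
    Metric("SimilarWeb", "estimated", "competitor X gets roughly 2x our AI referral traffic"),
    Metric("Our prompt test", "sampled", "we were cited in 0 of 10 ChatGPT prompts"),
]
for metric in metrics:
    print(report_line(metric))
```

Rejecting an unlabelled or mislabelled number outright is deliberate. A figure with no stated data type is exactly the one that gets quoted as fact later.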
The honest summary: there's one free first-party window, it's narrow, and everything wider than it is an estimate. Use the truth where you have it, use the estimates where you don't, and never let the two get confused in a slide.
NOTES
- The Bing AI Performance report launched in public preview in February 2026 and is, at time of writing, the only first-party AI citation dashboard offered by any AI search platform.
- Third-party AI referral estimates (SimilarWeb and similar) are modelled from panel and clickstream data. Methodologies differ by vendor and none is independently audited here. Treat absolute figures as directional and trust the trend over the point value.
Frequently asked
Is the Bing AI Performance report free?
Yes. It's part of Bing Webmaster Tools, which is free, and you access it by verifying your site. That's a large part of why it's underused: people assume first-party AI data must be paid or gated, and it isn't.
Does SimilarWeb measure ChatGPT citations?
Not directly. It estimates referral traffic from AI platforms based on its panel and clickstream models. It infers visits, not citations, and the figures are modelled rather than measured. Useful for trend and competitor context, not for a precise citation count.
Which should I report to a CMO?
Lead with the Bing first-party data because it's defensible. Add third-party estimates for competitive context, clearly labelled as estimates. The combination is more honest and more persuasive than either alone, and it survives scrutiny.
Why don't the numbers match across sources?
Because they measure different things by different methods. Bing counts actual Copilot citations of your site. Third-party tools model traffic or sample prompts. Expecting them to agree is like expecting your bank statement to match a budgeting app's forecast. Reconcile the question each one answers, not the raw figure.