SealMetrics
Data Quality

GA4 Data Sampling: Why Your Traffic Numbers Are Wrong


If your website receives significant traffic, GA4 is not showing you all of it. It is showing you a statistical estimate based on a sample. This is called data sampling, and it is one of the most misunderstood aspects of Google Analytics.

What is data sampling in GA4?

Data sampling occurs when GA4 analyzes a subset of your data and extrapolates the results to represent the full dataset. Instead of processing every event, GA4 takes a statistical sample and applies mathematical models to estimate what the full picture would look like.
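The idea can be sketched in a few lines of Python. This is only an illustration using simple random sampling on synthetic data; GA4's actual sampling method may differ, and the function name and the 10% rate are assumptions made for the example.

```python
import random

def estimate_from_sample(events, sample_rate, predicate, seed=0):
    """Estimate how many events match `predicate` by inspecting only a random sample."""
    rng = random.Random(seed)
    sample_size = max(1, round(len(events) * sample_rate))
    sample = rng.sample(events, sample_size)
    matches = sum(1 for event in sample if predicate(event))
    # Extrapolate: scale the sampled count back up to the full dataset.
    return matches * len(events) / sample_size

# Synthetic dataset (not real traffic): 100,000 events, 2,000 of them purchases.
events = ["purchase"] * 2_000 + ["page_view"] * 98_000
true_count = sum(1 for event in events if event == "purchase")
estimate = estimate_from_sample(events, 0.10, lambda event: event == "purchase")

print(f"true count: {true_count}, estimate from a 10% sample: {estimate:.0f}")
```

The estimate will usually land near the true count but rarely on it, and the gap widens as the sample rate drops or the segment being counted gets smaller.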

In GA4, sampling is triggered when an exploration report queries more events than the property's quota allows. Google documents these quotas as 10 million events per query for standard (free) properties and 1 billion for Analytics 360, and the data quality icon at the top of the report shows when sampling is active. Because the free tier's quota is far lower, sampling kicks in sooner for most businesses.

Why sampling matters for decision-making

Sampling introduces a margin of error. For high-level traffic trends, this might be acceptable. But for specific analyses — campaign performance by segment, conversion path analysis, revenue attribution by creative — even small margins of error compound into unreliable conclusions.

Consider a scenario:

Your actual data shows Campaign A generated 342 conversions and Campaign B generated 298. After sampling, GA4 estimates Campaign A at 310 and Campaign B at 320. You increase budget for Campaign B.

The decision was based on sampled data. The reality was the opposite.
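To see how easily the ranking can flip, here is a rough sketch of the arithmetic. It assumes a hypothetical 10% sampling rate and that each conversion lands in the sample independently of the others; neither is something GA4 reports for this scenario, and the function names are illustrative.

```python
import math

def estimate_std_error(true_count, sample_rate):
    """Standard error of a scaled-up count when each event is kept with probability `sample_rate`."""
    return math.sqrt(true_count * (1 - sample_rate) / sample_rate)

def prob_wrong_order(count_a, count_b, sample_rate):
    """Approximate chance that the sampled estimates rank B above A (normal approximation)."""
    diff = count_a - count_b
    sd = math.sqrt(
        estimate_std_error(count_a, sample_rate) ** 2
        + estimate_std_error(count_b, sample_rate) ** 2
    )
    if sd == 0:
        # No sampling noise at all: the order can only be wrong if B really is ahead.
        return 0.0 if diff > 0 else 1.0
    # P(Z < -diff / sd) for a standard normal Z.
    return 0.5 * (1 + math.erf(-diff / (sd * math.sqrt(2))))

# Campaign figures from the scenario above; the 10% rate is an assumption.
se_a = estimate_std_error(342, 0.10)
se_b = estimate_std_error(298, 0.10)
flip = prob_wrong_order(342, 298, 0.10)

print(f"standard error A: {se_a:.1f}, B: {se_b:.1f}, chance of wrong ranking: {flip:.0%}")
```

Under those assumptions the standard error on each estimate is more than 50 conversions, larger than the 44-conversion gap between the campaigns, and the chance of the sampled numbers ranking Campaign B ahead of Campaign A comes out at a bit over one in four.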

Sampling is only part of the problem

Sampling reduces the accuracy of data you do have. But the larger problem is the data you never collect in the first place. Before sampling even begins, GA4 has already lost visitors to consent banner rejection, ad blockers, and browser cookie restrictions.

On a typical European website, GA4 captures roughly 13% of actual traffic. Sampling then degrades the accuracy of that 13% even further. You are making decisions based on an estimate of a fraction.

The alternative: full-resolution analytics

Cookieless analytics platforms like SealMetrics take a fundamentally different approach. By collecting data through first-party server-side methods, every session is captured regardless of consent banner status, ad blocker usage, or browser restrictions. And because the data volume is managed at the infrastructure level, there is no need for statistical sampling.

When you see 72,847 visitors in SealMetrics, that number represents 72,847 actual sessions. Not a sample. Not an estimate. Not a projection from the subset that happened to accept cookies.

What you can do about it

If you are using GA4, check your exploration reports for the sampling indicator. If you see it, your data is approximate. Standard reports are not sampled, but they can include modeled data (under the reporting identity Google calls “Blended”), which introduces its own estimation layer.

The most reliable way to understand the gap is to run a complete analytics tool alongside GA4 and compare the numbers.
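A side-by-side comparison can be as simple as dividing one total by the other. The sketch below assumes you have exported daily session totals from both tools; the dates and figures are invented placeholders, and `capture_rate` is a hypothetical helper, not part of any product's API.

```python
def capture_rate(ga4_sessions, reference_sessions):
    """Share of the reference tool's sessions that GA4 also recorded (0.0 to 1.0)."""
    if reference_sessions <= 0:
        raise ValueError("reference_sessions must be positive")
    return ga4_sessions / reference_sessions

# Placeholder daily totals: (date, GA4 sessions, reference tool sessions).
daily_totals = [
    ("2024-03-01", 1_200, 10_000),
    ("2024-03-02", 1_450, 10_000),
    ("2024-03-03", 1_250, 10_000),
]

per_day = {day: capture_rate(ga4, ref) for day, ga4, ref in daily_totals}
overall = capture_rate(
    sum(ga4 for _, ga4, _ in daily_totals),
    sum(ref for _, _, ref in daily_totals),
)

for day, rate in per_day.items():
    print(f"{day}: GA4 captured {rate:.1%} of reference sessions")
print(f"overall: {overall:.1%}")
```

Comparing over several days rather than one smooths out daily noise and makes it easier to spot whether the gap is stable or tied to specific campaigns or traffic sources.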