
What It Takes to Make Self-Service Analytics Actually Work

Pointing an LLM at your data is the easy part. The hard part is making its answers trustworthy — and it's the part SealMetrics built first, then shipped as LENS AI.

10 min read · By Rafa Jiménez

Ask your analytics team why sales were soft last week. If the answer already lives in a dashboard, you have it in a minute. If it doesn't, you have a ticket — and a two-day wait — because the handful of people who can safely query the warehouse are the only ones who know which of its several revenue columns is the one that reconciles with finance.

That is the self-service problem, and it has resisted every obvious fix. Lock the data behind curated dashboards and you get consistency at the cost of every question nobody anticipated. Open the warehouse to everyone and you get a hundred conflicting definitions of the same metric. Either way, the question that doesn't fit an existing chart ends up in a queue.

Large language models look like the escape hatch: connect one, ask in plain English, get an answer. For writing code, that instinct is roughly right — you run the output and it either works or it doesn't. Analytics has no such safety net. A number that is subtly wrong looks exactly like a number that is right. Point a capable model at a raw warehouse and the most likely outcome is not insight; it is confident, well-formatted, false precision — the most dangerous kind of wrong, because nobody thinks to double-check it.

So the real question isn't whether an LLM can answer analytics questions. It's what has to be true underneath for the answers to be worth trusting. There are four things, and they stack.

04 · The question
Plain language, from anyone: a CMO, a growth lead, an ecommerce manager. No SQL, no dashboard spelunking.

03 · The playbook
Picks the method: which tools to call, in what order, and how to read the result. This is the routine a senior analyst runs on instinct.

02 · Semantic tools (the MCP)
Several dozen named, read-only tools. Each maps one business concept to one canonical metric. Nothing to misread.

01 · Complete data (the foundation)
Cookieless, first-party, 100% of traffic, never sampled. The input every answer inherits.

A question travels down the stack; a traceable answer comes back up. Get any layer wrong and the answer is only as good as the weakest one.

Layer 01 — Complete data, or none of the rest matters

Self-service analytics is a data-quality problem before it is an AI problem. If your measurement layer only captures the visitors who accepted a cookie banner, every answer built on top of it inherits that bias — and no amount of prompting fixes a dataset that was never collected. A model reasoning over partial data isn't wrong because it's a bad model; it's wrong because it's reasoning over a fraction of reality and has no way to know it.

SealMetrics starts here. Cookieless, first-party measurement counts events anonymously on your own domain: no cookies to reject, no third-party endpoint for ad blockers to target, nothing on the device to expire, and no sampling at volume. When the model asks “how many conversions from paid search last week,” the number describes the whole of your traffic — not the compliant remainder of it. If you want the arithmetic of how the alternative erodes, we walked through why GA4 ends up showing a sliver of EU traffic.
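To make the cost of partial data concrete, here is a small sketch of how consent-gated collection can distort the channel mix, not just shrink the totals. Every number below is hypothetical and chosen only to show the mechanism; the channel names and consent rates are assumptions, not measurements.

```python
# Hypothetical numbers, chosen only to illustrate the mechanism.
# true_conversions: what actually happened on each channel.
# consent_rate: share of visitors a cookie-based tool is allowed to observe.
true_conversions = {"paid_search": 400, "direct": 400, "email": 200}
consent_rate = {"paid_search": 0.50, "direct": 0.25, "email": 0.75}


def observed(true_counts, rates):
    """Conversions a consent-gated tool would record per channel."""
    return {ch: true_counts[ch] * rates[ch] for ch in true_counts}


def shares(counts):
    """Each channel's share of the total, as a fraction."""
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


seen = observed(true_conversions, consent_rate)
true_mix = shares(true_conversions)
seen_mix = shares(seen)

for ch in true_conversions:
    print(f"{ch:12s} true {true_mix[ch]:.0%}  observed {seen_mix[ch]:.0%}")
```

In this made-up case, direct drives 40% of real conversions but only about 22% of the observed ones, because its consent rate is the lowest. A model reading the observed table has no signal that the ranking is off.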

Layer 02 — A surface the model can't misread

The second failure mode — the model guessing which field means what — is solved by not giving it a warehouse to guess about. The SealMetrics MCP (Model Context Protocol) server doesn't hand the model a SQL prompt. It exposes several dozen named, read-only tools, and each maps a single business concept to a single canonical metric with a fixed contract: get_overview for headline KPIs, get_channels and get_campaigns for acquisition, get_funnel and get_conversions for outcomes, get_landing_pages for entry performance.

When the model wants revenue by country, there is one tool for it and one definition behind it. It cannot accidentally sum a staging column, because there is no staging column within reach — only the metric your team actually agreed on. This is the same idea a good data org implements internally as a semantic layer: force every question through a small set of governed definitions before it touches raw data. The difference is that here it ships as the product. The tool contract is the guardrail — you don't build or maintain it.
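As a rough illustration of what "a fixed contract" can mean in practice, the sketch below models a read-only tool registry that rejects anything outside it. The tool names are the ones listed above; the `ToolContract` shape, the concept labels, the parameter names and `validate_call` are all hypothetical and are not the SealMetrics API.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class ToolContract:
    """One business concept mapped to one named tool (hypothetical shape)."""
    name: str
    concept: str
    allowed_params: frozenset


# Tool names come from the article; concepts and parameters are assumptions.
_TOOLS = [
    ToolContract("get_overview", "headline KPIs", frozenset({"period", "compare_to"})),
    ToolContract("get_channels", "acquisition by channel", frozenset({"period"})),
    ToolContract("get_campaigns", "acquisition by campaign", frozenset({"period"})),
    ToolContract("get_funnel", "funnel outcomes", frozenset({"period"})),
    ToolContract("get_conversions", "conversion outcomes", frozenset({"period"})),
    ToolContract("get_landing_pages", "entry performance", frozenset({"period"})),
]
# Read-only view: callers can look tools up but cannot add or replace them.
REGISTRY = MappingProxyType({t.name: t for t in _TOOLS})


def validate_call(tool_name: str, params: dict) -> ToolContract:
    """Accept a call only if the tool exists and every parameter is in its contract."""
    if tool_name not in REGISTRY:
        raise LookupError(f"unknown tool: {tool_name}")
    contract = REGISTRY[tool_name]
    extra = set(params) - contract.allowed_params
    if extra:
        raise ValueError(f"{tool_name} does not accept: {sorted(extra)}")
    return contract


validate_call("get_channels", {"period": "last_week"})  # accepted
# validate_call("get_conversions", {"column": "staging_revenue"})  # would raise ValueError
```

The design point is that the model never names a column. It can only pick a tool and fill in parameters the contract already lists, so a request for a raw field fails before any data is touched.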

| | LLM + raw warehouse | SealMetrics MCP |
|---|---|---|
| How it answers | Writes SQL against raw tables it has never seen | Calls a named tool with a fixed, documented contract |
| Metric definitions | Whatever the model infers this time | One canonical definition per concept, always the same |
| Wrong-but-plausible answers | High: a staging column looks like a real one | Structurally constrained: no raw columns in reach |
| Personal data exposure | Possible: raw rows may carry PII | Impossible: 0 PII by construction, rejected at the tool |
| Attribution | Whatever the schema happens to encode | Last-click, aggregate, on 100% of traffic |
| Where the AI runs | Warehouse credentials handed to a model | EU private AI, your own key, or hosted MCP |

Layer 03 — A senior analyst's workflow, encoded

Naming the metrics is half the job. The other half is knowing what to do with them — the sequence a good analyst runs almost without thinking when someone asks “why did sales drop.” SealMetrics encodes that sequence as a marketing playbook the model invokes on its own. Ask a diagnostic question and it doesn't improvise; it pulls the playbook, which hands it a method rather than an answer: start with the overview against the previous period, decompose by channel, isolate the campaigns and landing pages that moved, separate traffic from conversion rate, and only then form a diagnosis with a prioritized action list. Then it executes that method by calling the data tools in order.
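One way to picture "a method rather than an answer" is as an ordered list of tool calls that a runner executes step by step. The sketch below is an assumption about shape only: the tool names are from this article, but the step list, the pairing of each step with a tool, `run_playbook` and `fake_call_tool` are hypothetical stand-ins, not the actual playbook.

```python
from typing import Callable

# Each step names a tool (names from the article) and the question it answers.
# The order mirrors the paragraph above; the tool-to-step pairing is an assumption.
DIAGNOSTIC_PLAYBOOK = [
    ("get_overview", "compare this period with the previous one"),
    ("get_channels", "decompose the change by channel"),
    ("get_campaigns", "isolate the campaigns that moved"),
    ("get_landing_pages", "isolate the landing pages that moved"),
    ("get_conversions", "separate traffic from conversion rate"),
]


def run_playbook(playbook, call_tool: Callable[[str], dict]) -> list:
    """Call each tool in order and keep (tool, purpose, result) for the diagnosis."""
    findings = []
    for tool_name, purpose in playbook:
        findings.append((tool_name, purpose, call_tool(tool_name)))
    return findings


def fake_call_tool(tool_name: str) -> dict:
    """Stand-in for a real MCP client call; returns a placeholder payload."""
    return {"tool": tool_name, "rows": []}


trace = run_playbook(DIAGNOSTIC_PLAYBOOK, fake_call_tool)
print([step[0] for step in trace])
```

Keeping the sequence as data rather than as prompt text means the order of calls is fixed and inspectable, and the diagnosis step only ever sees results that the earlier steps produced.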

Ecommerce · a real question, start to finish

You: “Sales look soft this week versus last. What happened?”

→ get_overview · this_week vs previous — entrances flat, conversions down
→ get_channels — paid search steady; direct conversions dropped
→ get_landing_pages — one checkout page's conversion rate fell by a third
→ get_conversions — revenue loss concentrated in a single category

LENS AI: Traffic held, so this is a conversion problem, not an acquisition one. It's isolated to your checkout page, where the rate fell sharply mid-week and dragged direct conversions with it while paid search was unaffected. That pattern points to something on the page (a checkout regression or a stock issue), not your media spend. Start there; the budget isn't the cause.

Illustrative worked example. The tool calls and their order are real; the movements are representative. No SQL was written, no dashboard opened, no analyst interrupted.
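The step "separate traffic from conversion rate" rests on simple arithmetic, sketched here under the assumption that conversions equal entrances times conversion rate. The function name and the week-over-week figures are hypothetical; they are picked to match the shape of the example (flat entrances, rate down by a third), not taken from any account.

```python
def decompose(prev_entrances, prev_conversions, cur_entrances, cur_conversions):
    """Split the change in conversions into a traffic effect and a rate effect.

    Uses the identity  C1 - C0 = (E1 - E0) * r0  +  E1 * (r1 - r0),
    where C = conversions, E = entrances, r = C / E.
    """
    r0 = prev_conversions / prev_entrances
    r1 = cur_conversions / cur_entrances
    traffic_effect = (cur_entrances - prev_entrances) * r0
    rate_effect = cur_entrances * (r1 - r0)
    return traffic_effect, rate_effect


# Hypothetical week-over-week numbers: flat entrances, rate down from 3% to 2%.
traffic, rate = decompose(10_000, 300, 10_000, 200)
print(f"traffic effect: {traffic:+.0f}, rate effect: {rate:+.0f}")
```

With these inputs the traffic effect is zero and the whole loss of 100 conversions lands on the rate term, which is the arithmetic behind "a conversion problem, not an acquisition one".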

The method travels to any vertical — only the vocabulary changes. Here is the same loop for a hotel group asking about direct bookings instead of checkout conversions:

Hotels · the same loop, direct bookings

You: “Direct bookings are down week over week. Is it our channels or the site?”

→ get_overview · this_week vs previous — sessions steady, bookings down
→ get_channels — metasearch and paid referrals held; direct softened
→ get_landing_pages — one property's booking-engine step lost conversion
→ get_conversions — revenue loss concentrated in that property's rooms

LENS AI: Demand held. Metasearch and paid referrals are steady, so this isn't an acquisition problem. The drop is concentrated on one property's booking page, where the conversion rate fell mid-week while the rest of the portfolio held. That points to something in that property's booking flow (rate availability, inventory, or a broken step), not your channel mix. Check that booking engine first.

Illustrative. Same method, a hotel portfolio measuring direct bookings — attributed last-click, in aggregate, with no per-guest journey.

The output reads like a junior analyst's first pass — because it followed a senior analyst's checklist. That is the work being automated: not the analyst's judgment, but the retrieval-and-first-pass layer that consumed most of their week. It is the same shift teams are seeing across the stack as AI agents move into the analytics workflow.

Layer 04 — Why the answers stay honest

Self-service is dangerous when it lets people generate authoritative-looking numbers no one can defend. Four things keep it grounded, and all four are structural rather than promised:

  • Zero PII by construction. The event-level tools validate against personal data and reject it. The model cannot surface a person because a person was never stored.
  • Aggregate-only measurement. No per-user journeys, no cross-session identifiers, no multi-touch models — so the model cannot fabricate one. It answers only what aggregate, anonymous counts can answer.
  • One definition per concept. Because each tool carries a single canonical metric, two people asking the same thing in different words get the same number. Consistency is enforced by the surface, not by discipline.
  • Provenance you can trace. Every answer resolves to a named tool over an explicit period in your account timezone, so you can always see which metric produced it. Attribution is last-click at the event level.
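The provenance point above can be pictured as a small record that travels with every number. This is a minimal sketch under stated assumptions: the `Provenance` and `Answer` classes, their fields, and the sample values (dates, timezone, count) are hypothetical and do not describe the actual response format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Provenance:
    """Where a number came from: tool, period, timezone (hypothetical record)."""
    tool: str
    period_start: date
    period_end: date
    timezone: str
    attribution: str = "last-click"


@dataclass(frozen=True)
class Answer:
    """A metric value that cannot exist without its source attached."""
    metric: str
    value: float
    source: Provenance

    def cite(self) -> str:
        """Render the trace a reader could use to re-run the same query."""
        s = self.source
        return (f"{self.metric} = {self.value} via {s.tool}, "
                f"{s.period_start} to {s.period_end} ({s.timezone}), "
                f"{s.attribution} attribution")


# Hypothetical values, for illustration only.
answer = Answer(
    metric="conversions",
    value=200,
    source=Provenance("get_conversions", date(2025, 1, 6), date(2025, 1, 12), "Europe/Madrid"),
)
print(answer.cite())
```

Because the source is a required field, a value without a tool, a period and a timezone simply cannot be constructed, which is one way to make traceability structural rather than a convention.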

And the model runs where your compliance team wants it to. With LENS private AI, inference runs on an open-source model (Gemma) hosted by Scaleway in Paris while your analytics data stays in Dublin. Both are in the EU, and your data is never shared and never used to train third-party models. Prefer your own stack? Bring your own Anthropic, OpenAI or Gemini key, or connect the hosted MCP at mcp.sealmetrics.com from any compatible client. The data foundation is identical either way; you're only choosing which model does the reasoning and where it runs.

What it does not replace — on purpose

It is worth being precise about the boundary. LENS AI automates the reporting queue: the “pull me the numbers,” “why did this move,” “how did the campaign do” questions that make up the bulk of inbound analyst requests. It does not design your experiments, reason about causality beyond what the data supports, or set strategy — that judgment stays human, and freeing up time for it is the entire point.

It also won't reconstruct a customer journey or split credit across touchpoints, because SealMetrics doesn't collect the per-user data those reports require. If your model of the world depends on stitching one person's path across sessions, this is the wrong tool, and honestly so. What you get instead is a defensible, complete, aggregate picture that a non-analyst can interrogate directly.

Getting started

The lightest possible start is the open LENS demo at lens-lite.sealmetrics.com, which runs the whole self-service loop on sample data — ask it to boost ROAS, find growth, or cut waste, and watch it work through the method. When you're ready with real data, connect the MCP at mcp.sealmetrics.com from Claude, ChatGPT, Cursor or Claude Code; the same server can even provision a fresh site from the chat, so the tool that answers your questions is also the one that sets you up. The measurement foundation is covered end-to-end on the how it works page.

The upshot is simple, and it's the through-line of all four layers: self-service analytics doesn't start with the model. It starts with data complete enough to trust and a surface constrained enough that the model can't misread it. Get those right — as SealMetrics does — and the analyst's queue mostly answers itself.

Questions teams ask

What is self-service analytics with LENS AI?

It is the ability for a non-technical person (a CMO, a growth lead, an ecommerce manager) to ask a question in plain language and get an answer directly from their own analytics, without writing SQL, hunting through dashboards, or filing a request with a data analyst. LENS AI is the umbrella brand for SealMetrics AI; the SealMetrics MCP exposes several dozen read-only tools that an AI client such as Claude, ChatGPT or Cursor calls on the user's behalf to pull real, complete data.

Why does pointing an LLM at GA4 usually produce wrong answers?

Two reasons. First, the data is incomplete — cookie-based tools miss most EU traffic after consent rejection, ad blockers and browser restrictions, so the model reasons confidently over a fraction of reality. Second, an open warehouse forces the model to guess which of hundreds of fields represents a business concept like 'revenue' or 'conversions,' which produces plausible but wrong numbers. Complete data plus a constrained, named tool surface removes both failure modes.

Does LENS AI reconstruct customer journeys or do multi-touch attribution?

No. SealMetrics measures aggregate, anonymous events and attributes revenue last-click at the event level. It does not identify individuals, does not stitch pageviews into per-user journeys, and does not run multi-touch models. The model can only answer questions the underlying aggregate data can answer — which is exactly what keeps the answers honest.

Where does the AI run, and does my data leave the EU?

With LENS private AI, inference runs on an open-source model (Gemma) hosted by Scaleway in Paris, and your analytics data stays in Dublin — both in the EU. Your data is never shared with any third party and never used to train third-party models. You can also bring your own key (Anthropic, OpenAI or Gemini) or connect the hosted MCP at mcp.sealmetrics.com from your own client.

Does this replace my data analyst?

It replaces the queue in front of your analyst — the steady stream of 'can you pull me the numbers for X' requests. It does not replace the judgment work: experiment design, causal reasoning, and strategy. The realistic outcome is that the analyst stops being a reporting bottleneck and spends their time on the questions that actually need a human.
