The doctrine, in one line
LLMs are essential for the parts of market intelligence that require language understanding. They are unsafe for the parts that require graph structure.
Most "AI-driven" market research tools use LLMs for everything. The result is intelligence that drifts every run, retrieval that hallucinates, and a graph that has to be rebuilt every quarter.
Theia's architecture splits the work. LLMs do extraction. Math does connections. The graph is stable. The intelligence layer compounds rather than decays.
Where LLMs are the right tool
Four jobs that genuinely require language understanding:
| Job | Why LLM | Theia implementation |
|---|---|---|
| Feature / benefit / use-case extraction from messy text | Requires semantic understanding of "battery lasts ages" → BATTERY_LIFE = positive | Claude Haiku on review/transcript text |
| Sentiment scoring in source language | Requires native-language nuance | Claude Haiku, per-language calibration |
| Cross-language harmonisation (label A → canonical property) | Requires multilingual semantic mapping | Claude Haiku batched harmonisation |
| Strategy synthesis at L1-L4 | Requires reasoning about competitive dynamics | Claude Sonnet, structured output |
These jobs share a property: the input is unstructured language, the output benefits from semantic understanding, and there's no reasonable mathematical alternative.
Where math is the right tool
Four jobs that look like LLM jobs at first glance but are better done with math:
| Job | Why math, not LLM | Theia implementation |
|---|---|---|
| Product → product similarity | Reproducible, set-independent edges required | Cosine similarity on raw mention vectors |
| Keyword cluster discovery | Stable communities, no resolution-limit failure | Leiden community detection with Surprise optimisation |
| Keyword → cluster naming | Distinctive, non-generic, interpretable | HHI × traffic scoring |
| Selection (which URLs to scrape, which keywords to keep) | Bounded cost, interpretable cutoffs | Cumulative CTR threshold (e.g. 65%) |
Each of these can be implemented with an LLM. Each is dramatically worse when you do.
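As an illustration of the selection row, here is a minimal sketch of a cumulative CTR cutoff. It assumes each candidate URL or keyword carries an estimated click-through share, and that "cumulative CTR threshold" means keeping the top items until their combined share reaches the cutoff. The function name, the input shape, and the sample values are hypothetical, not Theia's actual code.

```python
from typing import List, Tuple


def select_by_cumulative_ctr(
    items: List[Tuple[str, float]], threshold: float = 0.65
) -> List[str]:
    """Keep the highest-CTR items until their cumulative share of total CTR
    reaches `threshold` (assumed interpretation of the cutoff)."""
    total = sum(ctr for _, ctr in items)
    if total <= 0:
        return []
    # Sort by CTR descending; break ties by name so the result is reproducible.
    ranked = sorted(items, key=lambda pair: (-pair[1], pair[0]))
    selected: List[str] = []
    cumulative = 0.0
    for name, ctr in ranked:
        selected.append(name)
        cumulative += ctr / total
        if cumulative >= threshold:
            break
    return selected


# Hypothetical candidates with estimated click-through shares.
candidates = [("url_a", 0.40), ("url_b", 0.30), ("url_c", 0.20), ("url_d", 0.10)]
kept = select_by_cumulative_ctr(candidates, threshold=0.65)
```

The cutoff is interpretable (everything kept can be justified by its rank and share) and the cost is bounded by the number of items that clear it.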
The five problems with all-LLM connections
01 — Drift. Re-running an "ask the LLM to cluster these products" call produces different clusters each time. The graph is unstable. Two analyses on the same data give two answers.
02 — Cost scales with volume. Math operations on a 10K-product corpus cost roughly the same as on a 100-product one. LLM operations cost roughly 100× more at that volume, because spend grows with every item sent to the model. The all-LLM approach can't scale to production volumes without bankruptcy or aggressive sampling.
03 — Set-dependence. Ask an LLM "which products are similar to Canon EOS R6 II?" and the answer depends on which other products are in context. Add Sony A7C II to the prompt and the similarity it reports between the R6 II and the R8 changes. Math edges don't have this problem: each edge is computed from the two products alone, so it stays the same regardless of which other products are in the analysis.
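The set-independence claim can be checked directly. The sketch below assumes a "raw mention vector" is a mapping from feature label to mention count per product; the feature names and counts are made up for illustration. Cosine similarity only reads the two vectors being compared, so adding a third product to the catalogue cannot move the score.

```python
import math
from typing import Dict


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse mention-count vectors."""
    # Sorted keys keep the floating-point summation order identical across runs.
    shared = sorted(u.keys() & v.keys())
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical mention counts per product (not real data).
r6_ii = {"autofocus": 120, "low_light": 80, "battery_life": 40}
r8 = {"autofocus": 90, "low_light": 30, "weight": 60}
a7c_ii = {"autofocus": 100, "weight": 70, "battery_life": 50}

catalogue_small = {"Canon EOS R6 II": r6_ii, "Canon EOS R8": r8}
catalogue_large = {**catalogue_small, "Sony A7C II": a7c_ii}

score_small = cosine(catalogue_small["Canon EOS R6 II"], catalogue_small["Canon EOS R8"])
score_large = cosine(catalogue_large["Canon EOS R6 II"], catalogue_large["Canon EOS R8"])

# The edge is identical whether or not the Sony is in the analysis.
assert score_small == score_large
```

This is also why the vectors being raw matters: a corpus-weighted scheme such as TF-IDF would reintroduce set-dependence, because its weights change whenever products are added or removed.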
04 — Super-nodes form. LLMs tend to over-cluster generic concepts. Without distinctiveness scoring, "autofocus" becomes the cluster name for half the camera market and stops being useful for retrieval. Math-based HHI scoring (the Herfindahl-Hirschman Index, a standard concentration measure) prevents this.
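One way HHI × traffic scoring could work is sketched below. The source does not spell out the formula, so this is an assumed reading: compute each keyword's HHI over its traffic shares across clusters (1.0 when it lives in a single cluster, lower when it is spread out), multiply by its traffic inside a cluster, and name the cluster after the top-scoring keyword. All names and numbers are hypothetical.

```python
from typing import Dict, Tuple


def name_clusters(traffic: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """traffic[keyword][cluster] = traffic attributed to that keyword in that cluster.
    Returns the highest-scoring keyword per cluster under HHI x traffic."""
    best: Dict[str, Tuple[float, str]] = {}
    for keyword in sorted(traffic):  # sorted for reproducible tie-breaking
        per_cluster = traffic[keyword]
        total = sum(per_cluster.values())
        if total <= 0:
            continue
        # HHI: sum of squared shares. Generic keywords spread thin and score low.
        concentration = sum((t / total) ** 2 for t in per_cluster.values())
        for cluster, t in per_cluster.items():
            score = concentration * t
            if cluster not in best or score > best[cluster][0]:
                best[cluster] = (score, keyword)
    return {cluster: keyword for cluster, (_, keyword) in best.items()}


# Hypothetical traffic figures: "autofocus" is large but spread evenly.
traffic = {
    "autofocus": {"wildlife": 500, "vlogging": 500, "travel": 500},
    "bird tracking": {"wildlife": 400},
    "flip screen": {"vlogging": 350},
    "lightweight body": {"travel": 300, "vlogging": 100},
}
names = name_clusters(traffic)
```

In this toy data "autofocus" has the most traffic in every cluster, yet its HHI of one third drops it below the concentrated keywords, so it never becomes a cluster name.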
05 — No audit trail. "The LLM said these products are similar" is not a defensible answer in front of a board, a regulator, or a sceptical analyst. "Cosine similarity 0.74 on raw mention vectors, computed on this run_id" is. The math approach is auditable; the LLM approach isn't.
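An audit record for a single edge might look like the following sketch. The field names, the `cosine_raw_mentions` metric label, the run identifier format, and the sample vectors are all assumptions made for illustration. The idea is that the score is stored alongside the run_id and a digest of the exact inputs, so anyone can recompute and verify it later.

```python
import hashlib
import json
import math
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class EdgeAudit:
    run_id: str
    source: str
    target: str
    metric: str
    score: float
    input_digest: str  # SHA-256 of the two input vectors


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    dot = sum(u[k] * v[k] for k in sorted(u.keys() & v.keys()))
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def digest_inputs(u: Dict[str, float], v: Dict[str, float]) -> str:
    # sort_keys makes the digest independent of dict insertion order.
    payload = json.dumps({"source": u, "target": v}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def audit_edge(run_id: str, source: str, target: str,
               u: Dict[str, float], v: Dict[str, float]) -> EdgeAudit:
    return EdgeAudit(run_id, source, target, "cosine_raw_mentions",
                     round(cosine(u, v), 4), digest_inputs(u, v))


# Hypothetical vectors and run identifier.
record = audit_edge(
    "run-0001", "Canon EOS R6 II", "Canon EOS R8",
    {"autofocus": 120, "low_light": 80},
    {"low_light": 30, "autofocus": 90},
)
audit_line = json.dumps(asdict(record), sort_keys=True)
```

Writing `audit_line` to an append-only log per run gives a reviewer the metric, the inputs' fingerprint, and the run that produced the edge.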
The 80/20 split
Across the Theia production pipeline:
- ~80% of LLM cost is upstream extraction (review → snippet, transcript → features)
- ~15% is downstream strategy synthesis at L1-L4
- ~5% is harmonisation maintenance
Connection-building operations (clustering, similarity, distinctiveness) account for 0% of LLM cost. They run as math on the pre-extracted intelligence layer. Re-clustering the Canon EU market takes ~10 minutes from Parquet files and costs roughly nothing.
A competitor running connection-building through LLMs would spend $5K-50K per re-cluster, depending on volume. At that price they re-cluster less often, and their graphs decay.
Where the discourse is wrong
The popular AI market research discourse frames the question as a yes/no: "should we use AI for market research?" The right question is: where in the pipeline is the LLM the right tool, and where is it the wrong tool?
The doctrine: LLM where the input is language and the output benefits from semantic understanding. Math everywhere else.
This isn't a Theia-specific opinion. It's the operational discipline that separates production-grade AI research infrastructure from demos.
Strategic implication
For any brand evaluating an AI research vendor, ask: "show me the part of your pipeline that doesn't use an LLM."
- If the answer is "everything uses an LLM" — that vendor is shipping demos.
- If the answer is "extraction and synthesis use LLMs; clustering, similarity, selection, and distinctiveness use math" — that vendor is shipping infrastructure.
The brands that compound intelligence value over 18-24 months are buying from the second group. The brands that have to rebuild every quarter are buying from the first.