
AI Due Diligence 2026: How Pre-Seed Startups Build a Real Tech Moat

In 2026 no serious VC invests in thin AI wrappers anymore. If you do not want to die at the next OpenAI update, you need a real moat. Here is the architecture that makes it provable — plus the formula to test your own risk.

By Janni Hares · May 19, 2026 · 16 min read

Every time OpenAI or Google ships an update, startups that only put a thin surface on someone else's API disappear. The industry calls this "Sherlocking", and in 2026 it is no longer a fringe phenomenon but the central platform risk of every pre-seed round in AI.

In English there are solid analyses from Andreessen Horowitz and Sequoia. In German a concrete, technical guide is still missing: not "build a moat", but which layers, which database, which architecture, and how a tech auditor in an investor call recognizes whether the moat is real.

This article closes exactly that gap. decivo is a Lean Software Studio following the Clarity Before Code principle: we build prototypes and code prototypes so teams can decide well before development gets expensive. We use the architecture below whenever founders ask: "Would our MVP survive a technical due diligence?"

TL;DR — the 6 key points

Sherlocking: a large platform natively integrates your feature — devaluing thin wrappers overnight. In 2025/2026 this hit exactly the startups without a proprietary data asset or workflow depth.
An AI moat in 2026 is not the model. It consists of three layers: System-of-Workflow Moat, Proprietary Context Engine, Agentic Orchestration.
The a16z nuance: raw data is not a moat. The moat is the refinement pipeline that turns usage into compounding context.
Sherlock Risk = (API dependency × UI copyability) ÷ proprietary context depth. Each factor 1–10, with clear risk bands.
An MVP becomes investable in 3 steps: context isolation & vector DB, agentic orchestration instead of a single prompt, workflow integration with measurable switching costs.
decivo clarifies the moat architecture in the Innovation Workshop (€7,500 net) and builds the technical proof on demand as a Code Prototype (€12,500 net) — Clarity Before Code.

The death of the AI wrapper: why VCs say no in 2026

An AI wrapper is a product whose entire value sits in someone else's model: prompt in, answer out, a UI around it. The problem is not that it does not work — it often works great. The problem is that nobody owns it.

In 2025 OpenAI shipped AgentKit and Anthropic shipped "Skills", capability layers that made entire categories of workflow-automation startups redundant overnight. In early 2026 a Google VP publicly warned that two types of AI startups will not survive, with pure model wrappers first in line. Investor patience for "white-labeled models" is gone.

Definition

Sherlocking (AI, 2026)

Sherlocking is the practice of a dominant platform natively integrating a feature a third party used to offer — devaluing the third party's reason to exist. For AI startups it hits products whose only value creation was model access.

The flip side is the good news: a platform update is a threat to a wrapper startup — but a tailwind for a startup with a real moat. When the foundation layer gets cheaper and stronger, the product whose value is not in the model is exactly the one that benefits.

If your code came from a fast prototype: Vibe coding — from prototype to product.

The 3-Tier Moat Architecture

A modern AI moat in 2026 consists of three layers. None replaces the other — they stack. The more layers you can demonstrate, the lower the Sherlock risk.

Tier 1

System-of-Workflow Moat

The depth of integration into the user's daily workflow. Not "yet another chat app", but the place where the work actually happens, with data, state and decisions the user does not want to migrate away from.

Proof for VCs: measurable switching costs. Daily/weekly active use, stored artifacts per account, number of connected third-party systems. A workflow moat shows up in retention curves that flatten instead of decaying.
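One way to make the retention claim checkable is to compute the curve from raw activity data. The following is a minimal sketch, assuming activity is available as a set of active weeks per user; the function names, the `tail` window and the `tolerance` threshold are illustrative assumptions, not a standard definition of a flattening curve.

```python
def retention_curve(active_weeks_by_user: dict[str, set[int]], n_weeks: int) -> list[float]:
    """Share of the cohort active in each week since signup (week 0 = signup week)."""
    cohort_size = len(active_weeks_by_user)
    if cohort_size == 0:
        return [0.0] * n_weeks
    return [
        sum(1 for weeks in active_weeks_by_user.values() if week in weeks) / cohort_size
        for week in range(n_weeks)
    ]


def flattens(curve: list[float], tail: int = 3, tolerance: float = 0.02) -> bool:
    """True if the last `tail` points stay above zero and lose at most `tolerance` in total.

    Both thresholds are assumptions chosen for illustration.
    """
    if len(curve) < tail:
        return False
    last = curve[-tail:]
    return last[-1] > 0 and (last[0] - last[-1]) <= tolerance


# Hypothetical cohort of four users and the weeks in which each was active.
cohort = {"u1": {0, 1, 2, 3}, "u2": {0, 1, 2, 3}, "u3": {0}, "u4": {0, 1}}
curve = retention_curve(cohort, n_weeks=4)
```

A decaying curve keeps losing users in its tail; a flattening one settles at a stable share of the cohort, which is the pattern the workflow moat is supposed to produce.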

Tier 2

Proprietary Context Engine

Not a data dump — a refinement pipeline. Your own RAG architecture (retrieval-augmented generation), proprietary embeddings and a feedback loop that turns every use into better context. a16z puts it bluntly: raw data is not a moat, the refinement is.

Proof for VCs: compounding. Answer quality measurably improves with usage time, not with the next foundation-model release. The data asset can only emerge through your product and cannot be rebuilt in 48 hours.
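The compounding claim can be tested by grouping a quality metric by account age rather than by calendar date or model release. This is a minimal sketch under assumptions: `TaskOutcome`, its fields and the `min_gain` threshold are hypothetical, and "success" stands for whatever quality signal the product actually records.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskOutcome:
    account_id: str
    account_age_weeks: int  # weeks since the account's first use (hypothetical field)
    success: bool           # e.g. the answer was accepted without edits


def success_rate_by_cohort_age(outcomes: list[TaskOutcome]) -> dict[int, float]:
    """Task success rate per account age in weeks, ordered by age."""
    totals: dict[int, int] = defaultdict(int)
    wins: dict[int, int] = defaultdict(int)
    for outcome in outcomes:
        totals[outcome.account_age_weeks] += 1
        wins[outcome.account_age_weeks] += int(outcome.success)
    return {age: wins[age] / totals[age] for age in sorted(totals)}


def is_compounding(rates: dict[int, float], min_gain: float = 0.05) -> bool:
    """Crude check: the oldest cohort age beats the youngest by at least `min_gain`."""
    if len(rates) < 2:
        return False
    ages = sorted(rates)
    return rates[ages[-1]] - rates[ages[0]] >= min_gain


# Hypothetical outcomes: new accounts succeed half the time, 4-week-old ones always.
outcomes = [
    TaskOutcome("a", 0, True),
    TaskOutcome("b", 0, False),
    TaskOutcome("a", 4, True),
    TaskOutcome("b", 4, True),
]
rates = success_rate_by_cohort_age(outcomes)
```

To separate the effect from foundation-model upgrades, the same comparison would have to be run within a single model version; the sketch does not do that.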

Tier 3

Agentic Orchestration

The shift from a single prompt to autonomous, multi-step agent systems with a typed shared state, memory, self-correction loops and checkpoints for recovery. The orchestration — not the model — is the IP.

Proof for VCs: robustness under model swaps. The system keeps running with GPT, Claude or a local open-source model because the logic lives in the orchestration layer — not in a hardcoded system prompt.
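The elements named above (typed shared state, self-correction loop, checkpoints, swappable model) can be sketched without any framework. The following is an illustrative sketch, not a LangGraph implementation: `Model`, `AgentState`, `run_agent` and `ScriptedModel` are hypothetical names, and checkpoints are kept in memory where a real system would persist them.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Protocol


class Model(Protocol):
    """Anything that maps a prompt to text: a hosted API client or a local model."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class AgentState:
    """Typed shared state that every step reads and writes."""

    task: str
    draft: str = ""
    attempts: int = 0
    errors: list[str] = field(default_factory=list)
    checkpoints: list[dict] = field(default_factory=list)

    def checkpoint(self, step: str) -> None:
        # Snapshot the state after each step; a real system would persist this.
        self.checkpoints.append(
            {"step": step, "draft": self.draft, "attempts": self.attempts}
        )


def run_agent(
    model: Model,
    task: str,
    validate: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> AgentState:
    """Generate, validate, and retry with the validator's error as feedback."""
    state = AgentState(task=task)
    while state.attempts < max_attempts:
        state.attempts += 1
        feedback = f"\nPrevious error: {state.errors[-1]}" if state.errors else ""
        state.draft = model.complete(state.task + feedback)
        state.checkpoint("generate")
        error = validate(state.draft)
        if error is None:
            state.checkpoint("done")
            return state
        state.errors.append(error)  # self-correction: feed the error into the next try
    state.checkpoint("gave_up")
    return state


class ScriptedModel:
    """Stand-in model for tests: returns canned replies in order."""

    def __init__(self, replies: list[str]) -> None:
        self._replies = iter(replies)

    def complete(self, prompt: str) -> str:
        return next(self._replies)
```

Because `run_agent` depends only on the `complete` method, swapping the provider means passing a different object; the retry and checkpoint logic stays untouched.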

Diagram: the three tiers stacked on a swappable foundation model (GPT / Claude / local).
Tier 1 · System-of-Workflow Moat · deep workflow embedding · high switching costs
Tier 2 · Proprietary Context Engine · own RAG pipeline · compounding data refinement
Tier 3 · Agentic Orchestration · autonomous agents · memory · self-correction

The single most important sentence for any pre-seed pitch: the foundation model is the swappable pedestal under this architecture, not the architecture itself. a16z nails it in "The Empty Promise of Data Moats": what matters is not how much data you have, but whether you have a pipeline that refines it.

The Sherlock Risk Formula

Use this formula to test your startup before a VC does. It is deliberately simple — it should trigger a discussion, not deliver a decimal place.

Sherlock Risk = (API dependency × UI copyability) ÷ Proprietary context depth

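Each factor is rated from 1 to 10. For the two factors in the numerator, 10 is the worst case; for proprietary context depth in the denominator, 10 is the best case. The score therefore ranges from 0.1 to 100. The higher the raw dependency on a standard API and the easier the surface is to copy, the higher the risk. A deep, proprietary context layer in the denominator pushes the risk toward zero.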

The three factors

API dependency (1–10)

10 = the product is instantly dead without exactly one model provider. 1 = the model is swappable behind an abstraction layer, a local fallback exists.

UI copyability (1–10)

10 = a developer rebuilds the surface in 48 hours. 1 = the value sits in the workflow and data, not in the visible surface.

Proprietary context depth (1–10)

10 = a compounding data asset that can only emerge through your product, with a refinement pipeline. 1 = no own context, pure forwarding.

Risk bands

Score ≥ 50 — red flag. Classic wrapper. This does not survive a technical DD.
Score 15–49 — yellow. There is an approach, but at least one tier is undemonstrated.
Score < 15 — green. Demonstrable moat. The platform risk is structurally addressed.

Worked example

Plain GPT wrapper: API dependency 9, UI copyability 8, context depth 1 → (9 × 8) ÷ 1 = 72. Deep red flag. The same product with its own context engine and workflow anchoring: API dependency 4, UI copyability 3, context depth 8 → (4 × 3) ÷ 8 = 1.5. Green, and exactly that jump is the job before the pre-seed round.
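The formula and its risk bands fit in a few lines of code. This is a minimal sketch; the function names are illustrative, and treating fractional scores between 49 and 50 as yellow is an assumption, since the bands above only name whole numbers.

```python
def sherlock_risk(api_dependency: int, ui_copyability: int, context_depth: int) -> float:
    """(API dependency x UI copyability) / proprietary context depth, each rated 1-10."""
    factors = {
        "api_dependency": api_dependency,
        "ui_copyability": ui_copyability,
        "context_depth": context_depth,
    }
    for name, value in factors.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return (api_dependency * ui_copyability) / context_depth


def risk_band(score: float) -> str:
    """Map a score to the article's bands: red (>= 50), yellow (15 to below 50), green (< 15)."""
    if score >= 50:
        return "red"
    if score >= 15:
        return "yellow"
    return "green"


wrapper = sherlock_risk(9, 8, 1)      # plain GPT wrapper from the worked example
with_moat = sherlock_risk(4, 3, 8)    # same product with context engine and workflow anchoring
```

Running both cases reproduces the worked example: 72 (red) for the wrapper and 1.5 (green) for the product with a moat.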

Wrapper vs. moat: the side-by-side

This table is the checklist a tech auditor mentally walks through. The red flag is on the left, the green light on the right, one row per checkpoint.

VC checkpoint	Red flag — simple wrapper	Green light — real tech moat
Tech stack	Plain API call to OpenAI or Anthropic, thin UI on top.	Orchestration layer (e.g. LangGraph) + swappable model, incl. local open-source fallback.
Data handling	No storage, just forwarding the prompt.	Vector database (pgvector or Qdrant) with proprietary user embeddings and a feedback loop.
Logic layer	System prompt hardwired into the app.	Autonomous agents with memory, self-correction and a typed state.
Context	Generic — the same as ChatGPT with a different logo.	Refined per account — answers get better the longer it is used.
Switching costs	User is gone in seconds, nothing is lost.	Workflow, data and history are anchored in the product.
IP protection	Any developer rebuilds it in 48 hours.	Complex, proprietarily orchestrated interfaces — not trivially replicable.
Sherlock risk	High — dies at the next platform update.	Low — a platform update tends to make the product better.

How to make your MVP investable in 3 steps

You do not need the full moat before the first pitch. You need a credible architecture and at least one demonstrable tier. This sequence has proven itself:

Step 1 — Context isolation & vector database
Cleanly separate user context from the model. Put your own embeddings into a vector database: pgvector in your existing Postgres below ~5M vectors, Qdrant above or for filter-heavy queries. From here a data asset forms that is only reachable through your product — the basis of the Proprietary Context Engine.
Step 2 — Agentic orchestration instead of a single prompt
Replace the hardcoded system prompt with an orchestration layer that has a typed shared state, memory, a self-correction loop and checkpoints for recovery (e.g. with LangGraph). The logic moves out of the prompt into a swappable, testable layer — robust against model swaps.
Step 3 — Workflow integration & switching costs
Anchor the product in daily work: stored artifacts, connected third-party systems, state the user does not want to migrate. Make switching costs measurable — that is exactly the curve investors probe in tech DD.
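Step 1 can be illustrated with an in-memory stand-in for the vector database. The sketch below is an assumption-laden toy: `AccountContextStore` and `pick_vector_store` are hypothetical names, the vectors are hand-written rather than produced by an embedding model, and a real system would run the similarity search inside pgvector or Qdrant. The 5 million threshold is the article's rule of thumb, applied here as a hard cutoff for simplicity.

```python
import math
from collections import defaultdict


def pick_vector_store(n_vectors: int, needs_sub_10ms: bool = False,
                      filter_heavy: bool = False) -> str:
    """Rule of thumb from the text: pgvector below ~5M vectors, Qdrant otherwise."""
    if n_vectors >= 5_000_000 or needs_sub_10ms or filter_heavy:
        return "qdrant"
    return "pgvector"


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class AccountContextStore:
    """Keeps each account's context separate, so retrieval never crosses accounts."""

    def __init__(self) -> None:
        self._items: dict[str, list[tuple[list[float], str]]] = defaultdict(list)

    def add(self, account_id: str, vector: list[float], text: str) -> None:
        self._items[account_id].append((vector, text))

    def search(self, account_id: str, query: list[float], k: int = 3) -> list[tuple[float, str]]:
        """Top-k most similar snippets for one account, best match first."""
        scored = [
            (cosine(query, vector), text)
            for vector, text in self._items.get(account_id, [])
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = AccountContextStore()
store.add("acct-a", [1.0, 0.0], "invoice approval rules")
store.add("acct-a", [0.0, 1.0], "travel policy")
store.add("acct-b", [1.0, 0.0], "another tenant's notes")
hits = store.search("acct-a", [1.0, 0.0], k=1)
```

The retrieved snippets are then passed to whichever model is configured, which keeps the user context outside the model and inside the product.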

Figure: screenshot of a pgvector similarity query in an existing Supabase Postgres, with top hits and response time in the result panel.

Video: the wrapper-to-agent rebuild with a visible self-correction loop.

What the whole path from idea to launch looks like: Building an MVP with AI — the complete workflow.

The tech due diligence cheat sheet

Ten questions that come up in the investor call — and the direction of the best technical answer. Do not memorize, understand: each question probes exactly one tier of the architecture.

Question: What happens to your product if OpenAI ships exactly your feature natively tomorrow?
Best answer direction: "We would get better, because the foundation layer gets cheaper and stronger. Our value is in refined context and the workflow, not the model."
Question: What data do you have that a competitor cannot simply buy or scrape?
Best answer direction: proprietary data that only emerges through using your product, plus the refinement pipeline on top. Raw data alone does not count.
Question: Is your model swappable, or are you locked to one provider?
Best answer direction: the model is swappable behind an abstraction layer; a local open-source model is tested as a fallback.
Question: What does your RAG architecture look like and where do the embeddings live?
Best answer direction: a concrete vector database (pgvector below ~5M vectors, Qdrant above), your own embedding strategy, metadata filtering and re-ranking, all explainable.
Question: What is a single prompt and what is real agent orchestration in your system?
Best answer direction: draw a clear line: single prompt for trivial tasks, orchestrated agents with state and self-correction for multi-step workflows.
Question: How do you measure that your context improves over time?
Best answer direction: a defined quality metric (e.g. task success rate, human rating) tracked over cohort age, not over model releases.
Question: What are your inference costs per active user and how do they scale?
Best answer direction: a concrete number per action, a caching strategy, and the point where a smaller or local model takes over.
Question: How do you prevent prompt injection and data exfiltration through the model?
Best answer direction: input validation, tool allowlists, separation of user data and system context, never trusting the model as a security boundary.
Question: What exactly is your intellectual property here?
Best answer direction: the orchestrated workflow logic and the refinement pipeline, not the prompt. Honestly name what is patentable and what is not.
Question: If we double your team, what do you build first to deepen the moat?
Best answer direction: a prioritized roadmap along the three tiers, weakest tier first, with a measurable hypothesis per step.
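For the prompt-injection question, a tool allowlist is the easiest part to show in code. This is a minimal sketch, assuming the model returns tool calls as dictionaries with `name` and `arguments` keys; that shape, `make_dispatcher`, `ToolNotAllowed` and the `lookup_invoice` tool are all hypothetical. The point is that application code, not the model, decides which tools can run.

```python
from typing import Any, Callable


class ToolNotAllowed(Exception):
    """Raised when the model asks for a tool that is not on the allowlist."""


def make_dispatcher(allowlist: dict[str, Callable[..., Any]]) -> Callable[[dict], Any]:
    """Build a dispatcher that only executes explicitly allowlisted tools."""

    def dispatch(tool_call: dict) -> Any:
        name = tool_call.get("name")
        args = tool_call.get("arguments", {})
        if name not in allowlist:
            # The model's output is untrusted input: refuse anything unknown.
            raise ToolNotAllowed(f"tool {name!r} is not on the allowlist")
        if not isinstance(args, dict):
            raise ValueError("arguments must be a mapping of parameter names to values")
        return allowlist[name](**args)

    return dispatch


def lookup_invoice(invoice_id: str) -> str:
    # Hypothetical read-only tool used for illustration.
    return f"invoice {invoice_id}"


dispatch = make_dispatcher({"lookup_invoice": lookup_invoice})
result = dispatch({"name": "lookup_invoice", "arguments": {"invoice_id": "42"}})
```

An allowlist alone does not stop exfiltration through permitted tools, so it complements the input validation and context separation named in the answer rather than replacing them.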

How decivo helps build the moat

decivo is a Lean Software Studio, not an investor auditor and not an AI tool. We build clickable prototypes, validate with real users and build code prototypes — so you decide well before development gets expensive. That methodology maps exactly onto the moat question.

In the Innovation Workshop (€7,500 net, including a clickable prototype) we analyze the critical dependencies of your AI MVP, compute the Sherlock risk and lock in the moat architecture along the three tiers — including the concrete vector-database and orchestration decision. That is Clarity Before Code: first the architecture VCs sign off on, then the code.

If the technical proof actually has to stand — a working context engine and agent architecture that survives a tech DD — that is the Code Prototype (€12,500 net). Once the direction is clear, we continue the implementation or build the product together with you.

Innovation Workshop — €7,500 net · incl. clickable prototype · Sherlock risk + moat architecture along the 3 tiers.
Code Prototype — €12,500 net · a working context engine and agent architecture that survives a tech DD.
UX Validation Loop — €1,350 net per loop · real users test the workflow that carries the tier-1 moat.
Prototype Loop — €4,500 net per loop · iteratively sharpen toward an investable MVP.

A 15-minute intro call is enough to roughly classify your Sherlock risk — even if the honest answer is that you do not need a workshop yet. See all modules.

FAQ

Frequently asked questions on AI due diligence & tech moat

What does "getting Sherlocked" mean for AI startups?

"Sherlocking" describes a large platform (OpenAI, Google, Microsoft, Anthropic) natively integrating a feature a startup used to offer, wiping out the startup's reason to exist overnight. In 2025/2026 this mostly hit thin wrappers without a proprietary data asset or workflow depth. The term originates from the Apple world, where Apple's Sherlock search tool absorbed the features of the third-party app Watson, and is now the standard term for this platform risk.

What exactly is an AI moat in 2026?

An AI moat in 2026 consists of three layers: a System-of-Workflow Moat (deep embedding in daily work, high switching costs), a Proprietary Context Engine (your own RAG pipeline plus data refinement that compounds with usage) and Agentic Orchestration (autonomous, multi-step agents instead of a single prompt). A moat is not the model — models are swappable and constantly get cheaper.

Can you patent an AI wrapper?

A plain prompt or a UI on top of someone else's API is generally not patentable and barely defensible. A concrete, novel technical method can be patentable — for instance a specific orchestration or retrieval technique. That is a question for a patent attorney; this article is not legal advice. For most pre-seed startups the more robust protection is the workflow and context moat anyway, not a patent.

What does a technical audit or tech moat check cost?

At decivo the moat analysis is not an isolated audit but part of the Innovation Workshop (€7,500 net, including a clickable prototype). There we identify the critical dependencies and lock in the moat architecture. If the technical proof should actually be built — a working context engine and agent architecture — that is the Code Prototype (€12,500 net).

pgvector or Qdrant — what do we use for the context engine?

Rule of thumb for 2026: below roughly 5 million vectors, pgvector in your existing Postgres is usually the right call. It is one fewer component to operate, and with an HNSW index it can reach a p50 query latency under 20 ms at over 95% recall. Above roughly 5 million vectors, or with a sub-10 ms latency requirement or very filter-heavy queries, Qdrant plays to its strengths. The 2026 trend clearly points toward consolidation into the relational database.

Is a vibe-coding prototype from Lovable or Cursor enough for tech DD?

For customer conversations often yes, for technical due diligence usually no. Investors grill exactly the parts a fast prototype skips: data architecture, cost structure, model dependency, security. The prototype is a good starting point — it only becomes investable with a clean moat architecture behind it.

How early should a pre-seed startup build the moat?

You do not need the full moat before the first pitch. But you need a credible architecture story and at least one tier as a demonstrable start — usually the context engine or the workflow integration. VCs do not expect a finished moat at pre-seed; they expect proof that you understand where it forms and that your architecture does not block it.
