Design Substrate
The Design Studio composes an OpenAPI specification from matched standards, industry patterns, and your organisation’s conventions. This page describes where each of those inputs lives, and why each service has the responsibility it has.
The substrate is the moat. AI agents are amplifiers, not prerequisites. Direct API callers, scripts, CI pipelines, and humans get the same composition quality — because the platform’s intelligence lives in the data, not in any single consumer’s brain.
Single Responsibility — Five Services
| Service | Responsibility | Holds |
|---|---|---|
| Recommendations | Apiway’s design helper. Pure suggestions for how to make APIs better. Domain-neutral. Does not enforce. | Universal typed refs (Identifier, Timestamp, LocaleCode), universal compositions (link, paginationList, filter, defaultList), cross-domain shapes (Address, Money, PersonalIdentity), and pattern-level guidance (cursor pagination, conditional requests, audit fields, RFC 7807 errors, OAuth2 + ApiKey) |
| Insights | The tenant’s data lake + semantic graph. Both Apiway-seeded domain knowledge and the tenant’s accumulated reality. | Per-tenant: prospects, existing APIs, observed conventions, runtime telemetry. Apiway-seeded baseline: per-domain ontology of typical entities, fields, relationships. Derived: gap, coverage, duplication analysis |
| Architecture | The tenant’s technical inventory + decision record. | Technologies (Postgres, MongoDB, etc.), team skills, ADRs, principles, persona archetypes, infra topology, system diagrams |
| Organisations | The structure of the company + responsibility mapping. | Functions (HR, Finance, Engineering, …), capabilities, teams, business + technical ownership |
| Core | The primary entity store — authoritative rows for the platform. | APIs, products, deployments, prospects, subscriptions, programmes |
How the Substrate Composes a Design
When design-service produces an OAS from a thin request (or from a richer payload posted by an AI agent via MCP), it pulls from all five services and applies deterministic composition:
- Organisations: resolve the function + capability candidate set from the request’s context/domain
- Insights: pull the tenant’s existing APIs, observed conventions, telemetry, and (if available) the captured prospect for this function. Also pull the Apiway-seeded domain ontology for the matched function (Learning, Finance, etc.), which describes the entities + fields that are typical
- Architecture: pull the tenant’s principles, personas, and architectural decisions for this function
- Recommendations: pull universal pattern suggestions (typed refs, pagination shape, error envelopes, audit-field conventions)
- Compose: design-service merges these into a ProductDesignRequest, builds the OAS via specs-library, applies cross-cutting post-walks (versioned path prefix, schema naming, audit fields, soft-delete semantics, hypermedia links, OAuth2 scheme, pagination envelope), and uploads the result
No LLM lives inside design-service. The platform’s intelligence is the substrate itself.
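The post-walk stage of the Compose step can be pictured as a fixed sequence of pure functions over the OAS document. The sketch below is illustrative only: the function names, the `createdAt`/`updatedAt` audit field names, and the `v1` default are assumptions, not the actual specs-library API. It implements two of the listed post-walks (versioned path prefix, audit fields) to show why identical inputs always yield identical output.

```python
import copy


def add_version_prefix(oas: dict, version: str = "v1") -> dict:
    """Return a copy of the OAS with every path under /{version}."""
    out = copy.deepcopy(oas)
    prefix = f"/{version}"
    out["paths"] = {
        (path if path.startswith(prefix + "/") else prefix + path): item
        for path, item in out.get("paths", {}).items()
    }
    return out


def add_audit_fields(oas: dict) -> dict:
    """Return a copy of the OAS with audit timestamps on every component schema."""
    out = copy.deepcopy(oas)
    schemas = out.get("components", {}).get("schemas", {})
    for schema in schemas.values():
        props = schema.setdefault("properties", {})
        for name in ("createdAt", "updatedAt"):  # assumed field names
            props.setdefault(name, {"type": "string", "format": "date-time"})
    return out


# Fixed order, no randomness, no model call: composition stays deterministic.
POST_WALKS = [add_version_prefix, add_audit_fields]


def apply_post_walks(oas: dict) -> dict:
    """Run every post-walk in order and return the final document."""
    for walk in POST_WALKS:
        oas = walk(oas)
    return oas


draft = {
    "paths": {"/learning-paths": {"get": {"summary": "List learning paths"}}},
    "components": {
        "schemas": {
            "LearningPath": {"type": "object", "properties": {"id": {"type": "string"}}}
        }
    },
}
final = apply_post_walks(draft)
print(sorted(final["paths"]))  # ['/v1/learning-paths']
print(sorted(final["components"]["schemas"]["LearningPath"]["properties"]))
# ['createdAt', 'id', 'updatedAt']
```

Each walk copies its input and is idempotent, so re-running the pipeline on its own output changes nothing.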
Universal vs Domain Knowledge
This is the cleanest mental split in the architecture:
| Layer | Owner | Example |
|---|---|---|
| Universal patterns (domain-neutral) | Recommendations | Identifier shape, paginationList composition, RFC 7807 error envelope, “use cursor pagination” |
| Domain knowledge (per-domain) | Insights (Apiway-seeded ontology) | “Learning APIs typically have LearningPath, Module, Enrolment, Completion, Certification, Notification, …” |
| Tenant truth (per-tenant) | Insights (tenant overlay) | “This tenant has content-library-api with these schemas, uses snake_case fields, prefers offset pagination” |
A new design call layers all three: universal patterns apply uniformly, the domain ontology suggests entity shapes, the tenant overlay overrides where the tenant has its own observed convention or schema.
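The layering rule can be sketched as an ordered merge in which later layers win. This is a minimal illustration, assuming each layer is a flat key/value mapping; the function name and the keys (`pagination`, `errorEnvelope`, `fieldCase`) are hypothetical, and the real merge is presumably richer.

```python
def layer_design_inputs(universal: dict, domain: dict, tenant: dict) -> dict:
    """Merge three layers; later layers win, and each value records its origin."""
    merged = {}
    layers = (("universal", universal), ("domain", domain), ("tenant", tenant))
    for origin, values in layers:
        for key, value in values.items():
            merged[key] = {"value": value, "from": origin}
    return merged


universal = {"pagination": "cursor", "errorEnvelope": "rfc7807"}
domain = {"entities": ["LearningPath", "Module", "Enrolment"]}
tenant = {"pagination": "offset", "fieldCase": "snake_case"}

design = layer_design_inputs(universal, domain, tenant)
print(design["pagination"])     # {'value': 'offset', 'from': 'tenant'}
print(design["errorEnvelope"])  # {'value': 'rfc7807', 'from': 'universal'}
```

Recording the originating layer next to each value keeps the override auditable: a reader can see that offset pagination came from the tenant overlay rather than from the universal recommendation.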
Insights Is the Compounding Asset
Recommendations is curated and slow-changing. Architecture is set by the tenant. Organisations is set by the tenant. Insights compounds with every API shipped through the platform.
Every API deployed adds:
- A new existing-API snapshot (operation list, schema fields)
- New convention signals with provenance (“12/13 APIs use snake_case”)
- New telemetry signals once it serves traffic
Every captured prospect adds business intent the next design can reason against. Every governance decision reinforces or weakens an architectural principle.
The more tenants use Apiway, the more the Apiway-seeded domain ontology itself refines from aggregated anonymous observation. Cross-tenant insights become the substrate that no individual customer could build alone.
Gap Analysis Is a Natural Byproduct
Because insights holds both the domain ontology and the tenant’s existing APIs, gap analysis is a graph join:
- Missing entities — ontology minus tenant existing-API schemas. “You have LearningPath + Module + Instructor; you’re missing Enrolment, Completion, Notification, TestFeedback.”
- Coverage percent — how much of the typical domain shape the tenant has built
- Duplication — multiple existing APIs covering the same ontology entity
- Orphans — existing APIs that don’t map to any ontology entity (candidates for retirement or for extending the ontology)
This is what Design Studio’s gap-analysis surface returns. No LLM needed. Just the semantic graph.
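The four outputs above reduce to set operations once each existing API is mapped to the ontology entities it covers. The sketch below assumes that mapping already exists as plain sets; the function name, the result keys, and the `paths-api` and `legacy-api` names are hypothetical.

```python
def gap_analysis(ontology: set, apis: dict) -> dict:
    """Compare ontology entities with the entities each existing API covers.

    `apis` maps an API name to the set of entity names found in its schemas.
    """
    built = set().union(*apis.values())
    covered = built & ontology
    duplication = {}
    for entity in sorted(covered):
        owners = sorted(name for name, ents in apis.items() if entity in ents)
        if len(owners) > 1:
            duplication[entity] = owners
    coverage = round(100 * len(covered) / len(ontology), 1) if ontology else 0.0
    return {
        "missing": sorted(ontology - built),
        "coveragePercent": coverage,
        "duplication": duplication,
        # An orphan API shares no entity with the ontology at all.
        "orphans": sorted(name for name, ents in apis.items() if not ents & ontology),
    }


ontology = {"LearningPath", "Module", "Enrolment", "Completion"}
apis = {
    "content-library-api": {"LearningPath", "Module"},
    "paths-api": {"LearningPath"},
    "legacy-api": {"Widget"},
}
report = gap_analysis(ontology, apis)
print(report["missing"])          # ['Completion', 'Enrolment']
print(report["coveragePercent"])  # 50.0
print(report["duplication"])      # {'LearningPath': ['content-library-api', 'paths-api']}
print(report["orphans"])          # ['legacy-api']
```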
What Lives Where — Authoritative List
If you’re not sure where to put a piece of data, this is the deciding question: what kind of thing is this?
| Thing | Service |
|---|---|
| Pattern for how to express an ID | Recommendations (Identifier typed ref) |
| Pattern for how to paginate | Recommendations (paginationList + skip/take/orderBy parameters) |
| Pattern for HATEOAS links | Recommendations (link schema, links array) |
| Cross-domain shape (Address, Money) | Recommendations |
| Pattern-level guidance (“use cursor pagination”) | Recommendations |
| What entities typically exist in the Learning domain | Insights (domain ontology) |
| What entities typically exist in the Finance domain | Insights (domain ontology) |
| This tenant’s existing APIs + their schemas | Insights (tenant overlay) |
| This tenant’s observed naming conventions | Insights (tenant overlay) |
| This tenant’s runtime telemetry | Insights (tenant overlay) |
| Captured prospects for this tenant | Insights (tenant overlay) — also rows in Core |
| The tenant’s tech stack (Postgres, MongoDB) | Architecture |
| The tenant’s ADRs | Architecture |
| The tenant’s team skills | Architecture |
| The tenant’s persona archetypes | Architecture |
| The function/capability taxonomy (HR, Finance, …) | Organisations |
| Team + ownership mapping | Organisations |
| The authoritative API / product / deployment rows | Core |
| The OAS attached to an API | Core (via specifications) |
Why Not Put It All in One Service?
A recurring temptation is to consolidate: “insights could hold everything”, “core already has the data”, “why three knowledge services?”. The answer:
- Single responsibility makes scope provable. Recommendations is small + curated; insights is rich + observed; architecture is tenant-declared; organisations is structural. Each fits in one engineer’s head.
- Different change cadences. Recommendations updates when industry standards evolve (slow, careful). Insights updates every time the tenant ships anything (fast, frequent). Architecture updates when the tenant chooses new tech (deliberate). Putting them together couples cadences that should be independent.
- Different governance models. Apiway owns recommendations; the tenant owns architecture + organisations; insights is observed (Apiway curates the baseline, the tenant accumulates the overlay). Conflating governance roles is a security and audit problem.
- Single source of truth per concern. When everyone asks the same service for the same thing, there’s no “which copy is right?” question.
The five-service shape exists because each concern has a different shape, cadence, and owner.
Warm-start: Insights Begins Useful, Not Empty
A fresh non-personal tenant signs up. Without intervention, insights would stay empty for months: the tenant would get industry-typical output from the Apiway-seeded ontology but no tenant-specific overrides, because no tenant truth has been captured yet. The platform would feel generic for the first dozen designs.
The warm-start pipeline solves this. At non-personal tenant onboarding (and idempotently re-runnable when new connectors are wired), insights ingests from everything the platform already knows about this org:
| Source | What it seeds |
|---|---|
| Registration data | Tenant profile (industry, geography, size, stated domain focus) |
| Organisations-service onboarding | Functions + capabilities + teams as they’re declared |
| Architecture-service tech inventory | Stated tech stack (Postgres, MongoDB, etc.) — drives risk + compatibility |
| Connector wire-up (GitHub / Bitbucket / Azure Repos) | Existing OAS files → ExistingApiInsight with operations + schemas |
| Existing OAS uploads at setup | Tenant conventions, schema reuse signals |
| Connector wire-up (Linear / Jira / Confluence) | Drafted prospects + ADRs |
| Stated compliance posture | Initial ComplianceMandate set |
Day 1, that tenant’s first design call has real existing APIs to match conventions against, observed naming patterns with provenance, an initial captured prospect, and tech-inventory-aware risk signals. The platform feels personal immediately.
Setup-service owns the warm-start orchestration. Insights-service exposes a baseline-ingest write surface. Design-service is unchanged — richer insights at consumption time is automatic.
Data Quality: Quality Over Quantity
Warm-start ingestion and runtime auto-capture (web enrichment, gateway telemetry, OAS uploads) are powerful but dangerous. What separates a useful semantic graph from a noisy data dump nobody trusts is quality bars at ingestion, provenance on every stored item, and trust tiers on consumption.
The principle: never store anything that doesn’t pass the bar. Reject loudly, log the rejection, surface to the tenant for review if appropriate. False rejects are cheap; false accepts pollute the moat.
Quality bars per artefact
| Artefact | Stored only if | Rejected if |
|---|---|---|
| ExistingApiInsight | OAS validates 3.x; function/capability mapping; ≥1 operation with summary; production deployment; not test/poc/sandbox-tagged | Validation fails; orphan (no function); only stub operations; abandoned (no deployment activity in 90 days) |
| ApiSchemaSnapshot | Referenced from ≥1 operation; ≥3 fields; recognisable naming convention | Synthetic Create/Update/Response wrapper; orphan; single-field stub |
| ConventionSignal | Adoption ≥ 0.6 across ≥3 APIs; provenance documented | Single-API observation; contradictory observations; below threshold |
| TelemetrySignal | ≥30 days observation; ≥1000 requests; production traffic; statistical significance | Transient pattern; load-test traffic; insufficient volume |
| ProspectIntent | Intent ≥50 chars narrative; verified owner; ≥1 constraint or persona; function mapping; semantic dedup against existing | Stub intent; no owner; no function; duplicate within 30 days |
| DomainEntityTemplate (discovered via enrichment) | Source URL traceable; ≥3 fields; entity name matches recognisable noun; not already an alias | Web-search snippet without source; field-list extracted from prose; name conflict |
| Tech inventory entry | Stated in production manifest / IaC / CI; not deprecated; version ≥ supported floor | Sandbox-only; banned tech list; abandoned (no commits/deploys in 180 days) |
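As one concrete instance, the ConventionSignal row can be sketched as a gate that returns explicit rejection reasons, in keeping with the "reject loudly" principle. The sketch reads "adoption ≥ 0.6 across ≥3 APIs" as "at least three APIs observed, of which at least 60% follow the convention"; that reading, the class, and the field names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

MIN_ADOPTION = 0.6  # threshold from the table
MIN_APIS = 3        # threshold from the table


@dataclass
class ConventionObservation:
    convention: str
    apis_observed: int
    apis_matching: int
    provenance: Optional[str] = None


def check_convention_signal(obs: ConventionObservation) -> list:
    """Return the rejection reasons; an empty list means the bar is passed."""
    reasons = []
    if obs.apis_observed < MIN_APIS:
        reasons.append(f"observed across fewer than {MIN_APIS} APIs")
    adoption = obs.apis_matching / obs.apis_observed if obs.apis_observed else 0.0
    if adoption < MIN_ADOPTION:
        reasons.append(f"adoption {adoption:.2f} below {MIN_ADOPTION}")
    if not obs.provenance:
        reasons.append("provenance missing")
    return reasons


strong = ConventionObservation("snake_case", 13, 12, provenance="warm-start OAS scan")
weak = ConventionObservation("kebab-case", 1, 1)
print(check_convention_signal(strong))  # []
print(check_convention_signal(weak))
# ['observed across fewer than 3 APIs', 'provenance missing']
```

Returning reasons rather than a bare boolean gives the ingestion audit log and the tenant-facing review surface something concrete to display.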
Provenance metadata
Every artefact stored in insights carries provenance. Non-negotiable.

    {
      "source": "github.com/acme/learning-platform/blob/main/apis/learning-paths.yaml",
      "capturedBy": "setup-service:warm-start-flow",
      "capturedAt": "2026-05-20T14:30:00Z",
      "trustTier": "admin-approved",
      "confidence": 0.92,
      "validationLog": ["oas-3.1.0-valid", "has-function-mapping", "12-operations-with-summaries"]
    }

This metadata enables:
- Audit — “where did this LearningPath shape come from?”
- Confidence-weighted consumption — design-service applies only Tier 1–3 by default
- Cross-tenant aggregation safety — anonymised aggregation reads only Tier 1–2
- Tenant transparency — visible audit log of what was ingested, what was rejected, why
Trust tiers
| Tier | Source | Design-service consumes? | Cross-tenant aggregation feeds it? | Gap analysis includes? |
|---|---|---|---|---|
| 1 — Apiway curator-seeded | Apiway-owned baseline ontology | Always | n/a (this is the baseline) | Always |
| 2 — Admin-approved | Tenant admin explicitly approved | Always | Yes (anonymised) | Always |
| 3 — Auto-high-confidence | Passed all quality bars + provenance verifiable | Yes | Optional (default no) | Yes, with confidence marker |
| 4 — Pending review | Captured but failed one or more bars | No | No | Optional (“needs admin review”) |
| 5 — Rejected | Failed multiple bars or contradicted existing | No | No | Audit log only |
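The consumption columns of the tier table can be sketched as simple predicates over an ordered enum. The enum member names and function names below are hypothetical; the thresholds follow the table and the Tier 1–2 aggregation rule stated under provenance.

```python
from enum import IntEnum


class TrustTier(IntEnum):
    CURATOR_SEEDED = 1
    ADMIN_APPROVED = 2
    AUTO_HIGH_CONFIDENCE = 3
    PENDING_REVIEW = 4
    REJECTED = 5


def design_service_consumes(tier: TrustTier) -> bool:
    """Design-service applies Tier 1-3 only."""
    return tier <= TrustTier.AUTO_HIGH_CONFIDENCE


def aggregation_reads(tier: TrustTier, include_tier_3: bool = False) -> bool:
    """Cross-tenant aggregation reads Tier 1-2; Tier 3 is opt-in (default no)."""
    limit = TrustTier.AUTO_HIGH_CONFIDENCE if include_tier_3 else TrustTier.ADMIN_APPROVED
    return tier <= limit


artefacts = [
    ("LearningPath", TrustTier.CURATOR_SEEDED),
    ("Enrolment", TrustTier.ADMIN_APPROVED),
    ("snake_case", TrustTier.AUTO_HIGH_CONFIDENCE),
    ("Widget", TrustTier.PENDING_REVIEW),
]
for_design = [name for name, tier in artefacts if design_service_consumes(tier)]
for_aggregation = [name for name, tier in artefacts if aggregation_reads(tier)]
print(for_design)       # ['LearningPath', 'Enrolment', 'snake_case']
print(for_aggregation)  # ['LearningPath', 'Enrolment']
```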
Staging / review surface
The warm-start pipeline and runtime auto-capture don’t write directly to the canonical store. They write to a staging surface that runs the gates:
- Item arrives at insights ingestion endpoint
- Validators run (OAS validation, dedup, threshold, provenance)
- Item routed by tier:
  - Tier 1–3 → canonical store, immediately available to design-service
  - Tier 4 → review queue, surfaced to tenant admin
  - Tier 5 → rejected, audit log entry
- Tenant admin can promote Tier 4 → Tier 2 (explicit approval) or accept rejection
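The staging flow can be sketched as three small functions. The tier table leaves the boundary between Tier 4 ("failed one or more bars") and Tier 5 ("failed multiple bars") open, so this sketch assumes exactly one failed bar means Tier 4 and two or more mean Tier 5; that split, the function names, and the destination labels are all assumptions.

```python
def assign_tier(failed_bars: list, contradicts_existing: bool = False) -> int:
    """Assign a tier to an auto-captured item from its validator results."""
    if contradicts_existing or len(failed_bars) > 1:
        return 5
    if len(failed_bars) == 1:
        return 4
    return 3  # passed every bar, including the provenance check


def route(tier: int) -> str:
    """Pick the destination for an item of the given tier."""
    if tier <= 3:
        return "canonical-store"
    if tier == 4:
        return "review-queue"
    return "audit-log"


def promote(tier: int) -> int:
    """Explicit admin approval moves Tier 4 to Tier 2; other tiers are unchanged."""
    return 2 if tier == 4 else tier


pending = assign_tier(["no-function-mapping"])
print(pending, route(pending))                    # 4 review-queue
print(promote(pending), route(promote(pending)))  # 2 canonical-store
print(route(assign_tier(["stub-operations", "no-owner"])))  # audit-log
```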
Cross-tenant aggregation safety
The cross-tenant aggregation path carries the highest stakes. One tenant’s bad data leaking into the cross-tenant ontology pollutes the baseline for every tenant. So the aggregation feeder:
- Reads Tier 1–2 only
- Requires the same pattern observed across ≥N independent tenants
- Requires median confidence across observations above 0.8
- Runs anomaly detection against the existing baseline; contradictions block aggregation
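These four gates can be sketched as a single eligibility check. The tenant threshold N is not specified above, so the sketch takes it as a required `min_tenants` argument rather than guessing a default; the observation shape and the `contradicts_baseline` flag (standing in for the anomaly-detection result) are assumptions.

```python
from statistics import median


def eligible_for_aggregation(observations: list, min_tenants: int,
                             contradicts_baseline: bool) -> bool:
    """Decide whether one observed pattern may feed the cross-tenant baseline.

    Each observation is a dict with "tenant", "tier" and "confidence" keys.
    """
    trusted = [o for o in observations if o["tier"] <= 2]  # Tier 1-2 only
    if not trusted:
        return False
    tenants = {o["tenant"] for o in trusted}
    if len(tenants) < min_tenants:
        return False
    if median(o["confidence"] for o in trusted) <= 0.8:
        return False
    return not contradicts_baseline  # any contradiction blocks aggregation


observations = [
    {"tenant": "a", "tier": 2, "confidence": 0.92},
    {"tenant": "b", "tier": 2, "confidence": 0.85},
    {"tenant": "c", "tier": 2, "confidence": 0.81},
    {"tenant": "d", "tier": 3, "confidence": 0.99},  # ignored: not Tier 1-2
]
print(eligible_for_aggregation(observations, 3, False))  # True
print(eligible_for_aggregation(observations, 4, False))  # False
print(eligible_for_aggregation(observations, 3, True))   # False
```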
This is what protects the moat over time. Without these gates, the more tenants Apiway has, the noisier the baseline becomes — and the platform value degrades with scale. With them, the platform’s intelligence compounds as it grows.
Storage Discipline: Quality Over Quantity is a Cost Principle Too
Quality bars at ingestion aren’t only about consumer confidence and audit fidelity; they also keep the platform economically viable as tenant count grows. Unbounded data growth would kill unit economics: a tenant paying $1,000/month is worthless if their storage costs $1,000/month to keep.
Every new data type added to the substrate must answer three questions:
- What’s the retention policy? Is this kept forever, or does it expire?
- Aggregated or per-event? If per-event, what’s the volume at scale?
- What’s the per-tenant growth curve? Linear with usage, bounded by tier, or amortised across tenants?
Required answers per data type
| Data type | Retention | Storage shape | Per-tenant growth |
|---|---|---|---|
| Domain ontology (insights-platform.domains, domain-schemas) | Forever (authoritative) | Replaced on upsert, not appended | None (shared across all tenants) |
| Domain overlays (insights-{tenantId}.domain-overlays) | Forever | Sparse delta only; never duplicate the baseline | O(overrides), small |
| Review queue (insights-platform.domain-schemas-review) | TTL 60 days | Items reclaimed if not promoted | O(unreviewed submissions), bounded |
| Ingestion audit (insights-platform.ingestion-audit) | TTL 30 days (rejected), 90 days (accepted, configurable longer for regulated tenants) | Append-only, TTL-bounded | O(submissions/day × TTL), bounded |
| ProductInsight aggregates (insights-{tenantId}.product-insights) | Forever (these ARE the tenant’s view) | Aggregate updates, not per-event | O(products × APIs × subscriptions), small |
| Telemetry signals | Hourly/daily rollups, NOT per-event | Aggregated only | O(active APIs × rollup-granularity × TTL), bounded |
| Cross-tenant baseline aggregations | Derived on read or cached short-lived | NEVER materialised per-tenant | O(domains × patterns), platform-wide, small |
| Audit / event-history on entities | Lives in audit-log (TTL’d), not on the entity itself | Provenance.ValidationLog reflects CURRENT validation, not history | Constant per entity |
Storage modelling — realistic Fuse-scale numbers
A new non-personal tenant (e.g., Fuse) onboarding + designing + deploying their first API costs approximately:
- Onboarding: ~50 KB across organisations + architecture + insights ProductInsight
- First API design + deploy: ~85 KB across ProductEvents + Design metadata + OAS + Deployment + Subscription
- One-year active growth (10 APIs, ~50 events, ~20 prospects, ~100 captures): ~3–5 MB per tenant
At MongoDB Atlas M10 pricing this is under $1/year per tenant for storage. Storage is not the unit-economics constraint.
Where cost explodes without discipline
The danger isn’t the substrate today; it’s future capabilities done naively:
| Capability | Naive cost | Mandatory mitigation |
|---|---|---|
| Per-request telemetry capture | ~21 GB/month per API per tenant at 1,000 req/min | Aggregate to hourly/daily rollups; never store per-event |
| Cross-tenant baseline aggregation | Linear with tenants × depth of history | Derive on read; cache short-lived; never materialise per-event |
| Audit log without TTL | ~36 MB/year per tenant | TTL indexes mandatory at deploy |
| OAS version history without compression | ~5 MB per tenant after 100 versions | Compress at scale; reference deltas where useful |
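The per-request telemetry figure can be sanity-checked with a few lines of arithmetic. The per-event size is not stated above; the sketch assumes roughly 500 bytes per stored event, which lands close to the ~21 GB/month in the table, and assumes a rollup row is about the same size. Both sizes are assumptions.

```python
REQUESTS_PER_MINUTE = 1_000          # rate from the table
BYTES_PER_EVENT = 500                # assumption: not stated in the table
BYTES_PER_ROLLUP_ROW = 500           # assumption: same order of magnitude
MINUTES_PER_MONTH = 60 * 24 * 30     # 30-day month

events_per_month = REQUESTS_PER_MINUTE * MINUTES_PER_MONTH
raw_gb = events_per_month * BYTES_PER_EVENT / 1e9

hourly_rows_per_month = 24 * 30
rollup_kb = hourly_rows_per_month * BYTES_PER_ROLLUP_ROW / 1e3

print(events_per_month)  # 43200000
print(raw_gb)            # 21.6
print(rollup_kb)         # 360.0
```

Under these assumptions, hourly rollups shrink one API's monthly telemetry from about 21.6 GB to about 360 KB, which is why per-event storage is ruled out.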
Operational mandates
- TTL indexes are deployment manifests, not application code. Code documents the policy; manifests apply it. Reviewers verify the manifest matches the documented policy.
- Per-tenant databases multiply connection pool + index overhead. Reasonable at low-hundreds of tenants; reconsider at >1,000 (collapse to single-DB with a tenantId field).
- Compression on Mongo (zstd) for collections with large documents (OAS specs, schema snapshots); toggle when collection >100MB.
- Retention exceptions for regulated tenants (HIPAA: 6 years; SOC 2: as contracted; PCI-DSS: 1 year minimum) are configured per-tenant, not platform-wide.
This is the same discipline as quality bars — applied to cost instead of correctness. Both protect the platform’s long-term viability. Skipping either ends the platform.
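The reviewer check on TTL manifests can itself be automated. The sketch below compares a documented retention policy against manifest entries; `expireAfterSeconds` is the MongoDB TTL index option, while the manifest layout (a list of dicts with a `collection` key) and the function name are assumptions. The 60-day value is the review-queue retention from the table above.

```python
SECONDS_PER_DAY = 86_400

# Documented policy: collection name -> retention in days.
DOCUMENTED_TTL_DAYS = {
    "insights-platform.domain-schemas-review": 60,
}


def ttl_mismatches(policy_days: dict, manifest: list) -> list:
    """List collections whose manifest TTL is missing or differs from policy."""
    declared = {m["collection"]: m.get("expireAfterSeconds") for m in manifest}
    problems = []
    for collection, days in sorted(policy_days.items()):
        expected = days * SECONDS_PER_DAY
        actual = declared.get(collection)
        if actual != expected:
            problems.append(f"{collection}: expected {expected}, manifest has {actual}")
    return problems


good = [{"collection": "insights-platform.domain-schemas-review",
         "expireAfterSeconds": 5_184_000}]
stale = [{"collection": "insights-platform.domain-schemas-review",
          "expireAfterSeconds": 2_592_000}]
print(ttl_mismatches(DOCUMENTED_TTL_DAYS, good))   # []
print(ttl_mismatches(DOCUMENTED_TTL_DAYS, stale))
# ['insights-platform.domain-schemas-review: expected 5184000, manifest has 2592000']
```

Run in CI, a check like this would fail the build whenever the manifest drifts from the documented policy, instead of relying on a reviewer to spot it.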
For AI Agents
AI agents (Cursor, Claude, Copilot, etc.) interact with the platform through alpha-gateway’s MCP exposure. They call the same HTTP endpoints any other consumer does, including POST /v1/designs. Their value is in capturing richer prospect intent + shaping rich request payloads. They do not replace the substrate; they are amplifiers on top of it.
A direct API caller posting a thin DesignRequest gets the same composition quality as an AI agent posting a rich one — because the substrate is what produces quality, not the caller.