Case Study — Enterprise Design System · Governance & Architecture

bpCore

Architecting bp’s enterprise design system from first principles — building the token architecture, component model, and governance structure that enabled five independent product squads to ship consistently across web and mobile without central bottlenecks.

Role: Design System Lead · Architecture & Governance

Domain: Enterprise Design System · Web & Mobile

Period: 2022–Present

Scope: System architecture · Token taxonomy · Governance model · Cross-squad adoption across 5 product teams

01 — Industry Context

Why enterprise design systems fail, and what it costs when they do

A design system at enterprise scale is not a Figma library. It is a governance infrastructure — a set of technical decisions, process structures, and organisational agreements that determine whether independent product teams make consistent decisions without requiring a central authority to review every one. At the scale of a multi-product portfolio with teams operating on independent roadmaps, the limiting factor is almost never the quality of the components. It is whether teams actually use them — and whether the system survives the first year of production without fragmenting back into the state it was built to replace.

The failure mode most organisations encounter is predictable. A design system is built with genuine technical quality, shipped with internal fanfare, and quietly abandoned within 18 months. The components are too rigid to accommodate real product complexity. The contribution pathway is too slow or too opaque for squads to use. Governance sits with a small central team that becomes a bottleneck rather than an enabler. Teams learn to work around the system rather than through it, and the component library that was supposed to consolidate debt instead becomes another layer of debt sitting alongside the existing one.

What separates a design system that survives from one that doesn’t is rarely technical. The token architecture can be well-structured, the components well-built, and the documentation thorough, and the system can still fail if the organisational conditions for adoption are not deliberately designed. Adoption has to be clearly faster than non-adoption, or rational teams will find ways around the overhead. Contribution has to be lightweight enough to use under delivery pressure, or gaps in the system will be filled locally and never consolidated. Governance has to produce decisions quickly enough to be useful, or it will be routed around.

bp’s product portfolio had reached the scale where the cost of fragmented design was measurable: in handoff overhead, in onboarding time for new designers, in the rework that accumulated when inconsistent components were discovered late in delivery cycles. The problem was not that the individual squads were building poorly — it was that they were each building the same things independently, and no mechanism existed to consolidate those decisions into shared value.

Enterprise Design System — Maturity Levels & Failure Points

| Maturity Level | What Exists | What Typically Breaks |
| --- | --- | --- |
| Component Catalogue | Shared UI files, ad-hoc patterns, no semantic layer | Brand updates require manual find-and-replace across every team's codebase |
| Token Layer | Semantic naming, Figma variables, design/code mapping | Without governance, tokens drift between design and engineering over time |
| Governed System | Contribution model, cross-team review, documented decisions | Governance too heavyweight → teams build outside the system rather than through it |
| Platform | Cross-squad alignment, adoption tracking, measurable velocity impact | What bpCore was built to achieve and sustain |

02 — Problem Definition

Design debt compounding faster than any squad could address alone

bp’s product portfolio had grown to span multiple web and mobile products, each built and maintained by independent squads. Without a shared foundation, the same design problems were being solved independently across every team — producing subtle variations that compounded into a measurably incoherent experience across the suite.

Redundant Component Production

Every squad maintained its own version of the same primitives — buttons, form fields, data tables, status indicators. The same design decisions were being made five times, independently, producing divergent outputs with no mechanism to consolidate them.

Divergent Visual Language

Subtle variations in colour application, spacing, type weight, and interaction patterns had accumulated across squads. No single violation was egregious — the aggregate effect was a product suite that did not read as a coherent platform, even where individual products were well-executed.

Fragile Design-to-Dev Handoff

Without shared component definitions, every design handoff required engineers to interpret component intent from Figma files. Spec clarification cycles were routine. Rework after handoff — when engineers discovered ambiguities that had not been caught at review — was a consistent source of delivery overhead.

No Propagation Mechanism

Brand updates or spacing scale changes required manual search-and-replace across every squad's Figma files and codebase. There was no single source of truth from which changes could propagate. Every global decision was a coordination exercise across five independent teams.

Onboarding Without a Baseline

New designers joining any product squad faced months of reverse-engineering implicit conventions — learning the squad's specific patterns, component naming, and layout rules from existing work rather than from documented standards. Reaching production-ready output required a long implicit apprenticeship.

No Contribution Pathway

When a squad hit a UI requirement not covered by shared patterns, they built locally. There was no route for that solution to become shared value. Patterns that could have benefited every team were instead built, used once, and left to accumulate as unreachable squad-specific debt.

03 — Audit & Diagnosis

Mapping the actual state before deciding what to build

Before any system architecture decisions were made, a structured audit was conducted across all five product squads — cataloguing every component in active production use, mapping the variations that existed across teams, and identifying where debt was most severe. The audit was not a design exercise; it was an organisational diagnosis. The goal was to understand the actual state of the portfolio, not the stated state.

Component-level duplication was more severe than any single team had visibility into. Each squad could see their own debt clearly; no one had visibility across all five simultaneously. The audit made the aggregate visible — and made the case for a shared foundation in concrete operational terms rather than design principle arguments.

Interviews with designers across squads surfaced the workflow cost of the fragmented state: the time spent reverse-engineering other squads’ components when work required cross-product consistency, the overhead of maintaining squad-specific documentation for patterns that every team needed, and the recurring tension between delivery speed and visual quality when no shared component existed for the thing being built.

The audit also identified which components had converged naturally across squads — cases where teams had independently reached similar solutions — and which had diverged most severely. Convergent components became the first candidates for consolidation; divergent components revealed where the design problem was genuinely unresolved rather than merely under-communicated.

Component Audit — Duplication Across Five Product Squads

| Component | Squads with own version | Variants found | Design debt |
| --- | --- | --- | --- |
| Button variants | 5 of 5 | | Critical |
| Data table | 5 of 5 | | Critical |
| Form fields | 4 of 5 | | High |
| Status indicators | 5 of 5 | | Critical |
| Navigation patterns | 3 of 5 | | High |
| Modal / dialog | 4 of 5 | | High |
| Empty states | 3 of 5 | | Medium |
| Loading patterns | 4 of 5 | | High |

Finding 01

No component had a single agreed definition

Every shared UI primitive had at least two competing implementations across squads. There was no canonical version — only competing local conventions, none of which had enough authority to become the standard without a deliberate process to establish it.

Finding 02

Design debt was invisible at the squad level

Individual squads had no visibility into the aggregate. Each team knew their own debt. The cross-portfolio picture — five teams separately maintaining 6+ button variants — was only visible from the audit, and it changed the stakeholder conversation significantly.

Finding 03

Handoff cost was structural, not behavioural

The spec clarification overhead in design-to-dev handoff was not caused by unclear designers or inattentive engineers. It was caused by the absence of shared component contracts. The problem could not be solved by better communication — only by shared definitions.

Finding 04

Adoption would be voluntary or it would fail

The audit interviews surfaced a consistent signal: squads would not adopt a new system under delivery pressure unless it was demonstrably faster than the status quo. Mandate without value was a path to the same failure pattern the previous tools had followed.

04 — System Architecture

Token-first, composable by design

The architectural decisions made at the foundation level determined what was possible everywhere above it. Getting the token taxonomy right was not a design detail — it was the structural commitment that made propagation, platform adaptability, and design/code parity viable. Every subsequent decision was made in service of the same goal: a system that was genuinely faster to use than not using it.

Architectural Decision

Rationale

Layer

Semantic tokens as foundation

Named design decisions by their purpose, not their value. color.ink.primary is a semantic role; #1A1A1A is an implementation. A rebrand updates the implementation — nothing else changes. This was the decision that made global propagation viable.

Foundation

Three-tier token hierarchy

Global tokens (raw values) → Alias tokens (semantic roles) → Component tokens (contextual use). Components never reference global tokens directly. This isolation meant component-level changes could not inadvertently affect unrelated surfaces.

Architecture

Composition over enumeration

Structured the component library around composition rules rather than a catalogue of pre-built variants. Primitive components assembled into patterns; patterns into templates. Fewer total components expressing a wider range of UI — and a model that squads could extend without forking.

Components

Figma / code parity as an organisational commitment

Tokens exported from a single source via Style Dictionary — changes in the token file propagated to Figma and code simultaneously. Parity was not a state to be maintained manually; it was a structural property of the architecture. Squads could trust that what they saw in Figma was what engineering would produce.

Parity

Platform-adaptive components

Components designed to serve both web and mobile surfaces from shared token foundations, with platform-specific rendering logic handled at the component level. Product teams did not need to maintain separate web and mobile design files for the same component.

Platform

Documentation as a first-class deliverable

Every component shipped with usage guidelines, do/don’t examples, accessibility notes, and token references — authored alongside the component, not added retrospectively. Documentation that arrived after adoption had already happened was too late to influence decisions.

Enablement
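The parity decision described above can be illustrated with a minimal sketch in plain TypeScript. It is not Style Dictionary's API and not bpCore's actual build; it only shows the general idea of one token source feeding two platform outputs. The token name `color.ink.primary` and its value come from this section, while `space.inset.md`, the function names, and both output formats are assumptions made for illustration.

```typescript
// One flat token source. "color.ink.primary" comes from the text above;
// "space.inset.md" is a hypothetical name added for illustration.
const sourceTokens: Record<string, string> = {
  "color.ink.primary": "#1A1A1A",
  "space.inset.md": "16px",
};

// Web output: CSS custom properties such as --color-ink-primary.
function toCssCustomProperties(tokens: Record<string, string>): string {
  const declarations = Object.keys(tokens)
    .sort()
    .map((tokenName) => `  --${tokenName.split(".").join("-")}: ${tokens[tokenName]};`);
  return [":root {"].concat(declarations, ["}"]).join("\n");
}

// Mobile output: camelCase keys that a native theme file could consume.
function toMobileConstants(tokens: Record<string, string>): Record<string, string> {
  const constants: Record<string, string> = {};
  for (const tokenName of Object.keys(tokens)) {
    const value = tokens[tokenName];
    if (value === undefined) continue;
    const key = tokenName
      .split(".")
      .map((part, index) => (index === 0 ? part : part.charAt(0).toUpperCase() + part.slice(1)))
      .join("");
    constants[key] = value;
  }
  return constants;
}

console.log(toCssCustomProperties(sourceTokens));
console.log(toMobileConstants(sourceTokens));
```

Because both outputs are derived from the same object, a change to one value in `sourceTokens` appears in every generated format on the next build, which is the property the parity commitment relies on.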

Token Architecture — Three-Tier Hierarchy

| Tier | Type | Example | Architectural Purpose |
| --- | --- | --- | --- |
| Global | Raw values | #1A1A1A · 16px · 400 | Foundation; never referenced directly in components or product code |
| Alias | Semantic roles | color.ink.primary → #1A1A1A | Named by purpose, not value. A rebrand updates the alias target; nothing else changes |
| Component | Contextual use | button.label.color → color.ink.primary | Scoped to component usage. Components never reference global tokens directly |
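The three tiers in the table above can be sketched as a small registry with a resolver and a tier check. This is an illustrative model, not bpCore's implementation: `palette.neutral.900` is a hypothetical name for the global token (the source gives only its raw value), and the type and function names are assumptions.

```typescript
type TokenTier = "global" | "alias" | "component";

interface DesignToken {
  tier: TokenTier;
  // Global tokens hold a raw value; alias and component tokens hold the
  // name of the token they point to.
  value: string;
}

type TokenRegistry = Record<string, DesignToken>;

// "palette.neutral.900" is a hypothetical global token name; the other two
// names and the hex value come from the table above.
const tokenRegistry: TokenRegistry = {
  "palette.neutral.900": { tier: "global", value: "#1A1A1A" },
  "color.ink.primary": { tier: "alias", value: "palette.neutral.900" },
  "button.label.color": { tier: "component", value: "color.ink.primary" },
};

// Follow references until a raw global value is reached.
function resolveToken(tokenName: string, registry: TokenRegistry): string {
  const visited: string[] = [];
  let current = tokenName;
  for (;;) {
    if (visited.indexOf(current) !== -1) {
      throw new Error(`Circular token reference at "${current}"`);
    }
    visited.push(current);
    const token = registry[current];
    if (token === undefined) {
      throw new Error(`Unknown token "${current}"`);
    }
    if (token.tier === "global") {
      return token.value;
    }
    current = token.value;
  }
}

// Enforce the isolation rule: component -> alias -> global, with no skipping.
// Returns the names of tokens that break the rule.
function findTierViolations(registry: TokenRegistry): string[] {
  return Object.keys(registry).filter((tokenName) => {
    const token = registry[tokenName];
    if (token === undefined || token.tier === "global") return false;
    const target = registry[token.value];
    const expectedTier: TokenTier = token.tier === "component" ? "alias" : "global";
    return target === undefined || target.tier !== expectedTier;
  });
}

console.log(resolveToken("button.label.color", tokenRegistry)); // "#1A1A1A"
console.log(findTierViolations(tokenRegistry)); // []
```

In this model a rebrand is a change to a global value or to an alias target. Component tokens are untouched, and a check like `findTierViolations` could run in CI to catch a component token that points straight at a raw value.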

05 — Governance Model

Governance designed for adoption, not enforcement

The governance model was the product. The component library was evidence that the governance worked. A system without contribution workflows eventually becomes a snapshot — correct at launch, increasingly outdated as products evolve around it. The governance model was what made bpCore a living system rather than a well-documented starting point that squads forked on day one.

The core design decision was incentive over mandate. Teams were never required to adopt bpCore — they were given strong reasons to want to. The first reason was time: using a bpCore component was consistently faster than building an equivalent from scratch, because it came with tokens, accessibility, documentation, and a code implementation already resolved. The second reason was quality: the review process that governed contributions meant every component in the system had been validated to a standard individual squads rarely had time to apply under delivery pressure.

Contribution was treated as a first-class workflow, not an afterthought. When a squad hit a UI requirement not covered by the existing component library, they had a clear, lightweight path to propose it for inclusion. Proposals were reviewed against explicit criteria — semantic correctness, accessibility compliance, token alignment, usage generalisability — and decisions were documented publicly regardless of outcome. A rejected proposal with a documented rationale was more useful to the squads than an approved one that had been accepted silently.

Cross-squad visibility was built into the governance process from the start. Contribution proposals were visible to all squads before decisions were made — which surfaced cases where multiple teams had independently identified the same gap, and allowed combined input to produce a better-specified solution than any single squad would have produced alone. The process created cross-squad communication that had not previously existed.

Contribution Workflow — Before & After Governance

Before — Ungoverned Contribution

1. Squad hits a UI gap not covered by existing patterns
2. Builds a local solution in their own component file
3. Solution ships to production inside the squad's product
4. Pattern is never reviewed, never documented, never reusable
5. Same gap appears in another squad → rebuilt again independently

Debt compounded each sprint · no consolidation path · knowledge siloed per squad

After — Governed Contribution Model

1. Squad identifies gap → opens a contribution proposal with usage context
2. Core review: evaluated against existing patterns, accessibility, token compliance
3. Decision with documented rationale: merged to core or alternative recommended
4. If merged: component documented, tested, available to all squads immediately
5. Proposing squad gets credit for the contribution: incentive, not mandate

Debt eliminated at source · every gap becomes shared value · adoption self-reinforcing
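The governed workflow above can be modelled as a proposal record, a review function, and a public log. The four review criteria and the three outcomes are taken from this section. The decision rule itself (reject when no use case is given, defer when a criterion fails) is an assumption for illustration, as are all type, function, and example names.

```typescript
type ReviewOutcome = "approved" | "rejected" | "deferred";

// The four review criteria named in the governance text.
interface ReviewChecks {
  semanticCorrectness: boolean;
  accessibilityCompliance: boolean;
  tokenAlignment: boolean;
  usageGeneralisability: boolean;
}

interface ContributionProposal {
  component: string;
  proposingSquad: string;
  usageContext: string; // the real product need behind the proposal
  checks: ReviewChecks;
}

interface ReviewDecision {
  component: string;
  proposingSquad: string; // kept so the proposing squad is credited
  outcome: ReviewOutcome;
  rationale: string; // always recorded, whatever the outcome
}

// Assumed decision rule, for illustration only.
function reviewProposal(proposal: ContributionProposal): ReviewDecision {
  const failed = (Object.keys(proposal.checks) as (keyof ReviewChecks)[]).filter(
    (criterion) => !proposal.checks[criterion]
  );
  let outcome: ReviewOutcome;
  let rationale: string;
  if (proposal.usageContext.trim() === "") {
    outcome = "rejected";
    rationale = "No real product use case was provided.";
  } else if (failed.length === 0) {
    outcome = "approved";
    rationale = "Meets all review criteria.";
  } else {
    outcome = "deferred";
    rationale = "Needs rework on: " + failed.join(", ") + ".";
  }
  return { component: proposal.component, proposingSquad: proposal.proposingSquad, outcome, rationale };
}

// Public contribution log: every decision is appended, whatever the outcome.
const contributionLog: ReviewDecision[] = [];

function recordDecision(proposal: ContributionProposal): ReviewDecision {
  const decision = reviewProposal(proposal);
  contributionLog.push(decision);
  return decision;
}

// Hypothetical proposal that fails one criterion.
const exampleDecision = recordDecision({
  component: "date-range-picker",
  proposingSquad: "squad-a",
  usageContext: "Reporting filters need a bounded date range",
  checks: {
    semanticCorrectness: true,
    accessibilityCompliance: false,
    tokenAlignment: true,
    usageGeneralisability: true,
  },
});
console.log(exampleDecision.outcome + ": " + exampleDecision.rationale);
```

The point of the sketch is structural: a decision cannot be created without a rationale, and nothing reaches the log except through `recordDecision`, so rejected and deferred proposals are as visible as approved ones.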

Governance Principle

How it operated in practice

Purpose

Proposal-based contribution

Any squad could submit a contribution proposal. The bar was not perfection — it was a clear use case, token compliance, and evidence of real product need. Proposals without a real use case were declined with rationale.

Quality

Public review decisions

All review outcomes — approved, rejected, deferred — were documented in the contribution log, visible to all squads. Rationale was always provided. This made the governance process legible and predictable.

Trust

Adoption tracked, not assumed

Squad adoption was tracked quarterly — which components were in use, where the system had been extended locally, and which gaps remained unaddressed. Tracking made adoption visible to stakeholders and identified the next priority contributions.

Accountability

No breaking changes without migration support

Any change to a shipped component that would require squads to update their implementation was accompanied by a documented migration path and a deprecation window. Breaking changes without migration support destroyed trust faster than any benefit they could provide.

Safety
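The quarterly adoption tracking described above could be computed from per-squad snapshots along these lines. The snapshot shape, squad names, and component names are hypothetical, and the rule that a locally built component counts as a shared gap once two squads have it is an assumption rather than bpCore's documented threshold.

```typescript
interface SquadSnapshot {
  squad: string;
  systemComponents: string[]; // shared-system components in use
  localComponents: string[]; // components built outside the system (assumed unique per squad)
}

// Share of a squad's components that come from the shared system.
function adoptionRate(snapshot: SquadSnapshot): number {
  const total = snapshot.systemComponents.length + snapshot.localComponents.length;
  return total === 0 ? 0 : snapshot.systemComponents.length / total;
}

// Local components built by two or more squads: candidates for the next
// priority contributions.
function sharedGaps(snapshots: SquadSnapshot[]): string[] {
  const squadsPerComponent: Record<string, number> = {};
  for (const snapshot of snapshots) {
    for (const component of snapshot.localComponents) {
      squadsPerComponent[component] = (squadsPerComponent[component] ?? 0) + 1;
    }
  }
  return Object.keys(squadsPerComponent)
    .filter((component) => (squadsPerComponent[component] ?? 0) >= 2)
    .sort();
}

// Hypothetical quarterly data.
const quarterlySnapshots: SquadSnapshot[] = [
  {
    squad: "squad-a",
    systemComponents: ["button", "data-table", "modal"],
    localComponents: ["date-range-picker"],
  },
  {
    squad: "squad-b",
    systemComponents: ["button", "form-field"],
    localComponents: ["date-range-picker", "map-legend"],
  },
];

for (const snapshot of quarterlySnapshots) {
  console.log(snapshot.squad, adoptionRate(snapshot)); // squad-a 0.75, squad-b 0.5
}
console.log(sharedGaps(quarterlySnapshots)); // ["date-range-picker"]
```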

06 — Enterprise Constraints

Constraints that shaped every architectural decision

A design system serving a multi-product enterprise portfolio operates under constraints that a squad-level component library does not. The system had to be correct at scale, not just in isolation — and correctness at scale requires accounting for the organisational and technical complexity the system lives inside.

Multi-Platform Delivery

bpCore served both web and mobile product surfaces — different rendering environments, different interaction paradigms, different performance constraints. The token architecture provided a shared semantic foundation; platform-specific rendering was handled at the component implementation level. Squads building for mobile did not maintain a separate design system.

Independent Squad Release Cadences

Five product squads shipped on different cycles, with different engineering priorities and different product managers. The governance model could not assume synchronised releases or simultaneous adoption. Version management and deprecation windows had to accommodate squads in different states of adoption at the same time.
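One way to make deprecation windows tolerant of squads at different stages of adoption is to treat removal as blocked until every condition is met. The sketch below assumes a notice record with a migration guide and a window end date, plus a per-squad usage list. All names, dates, and the blocking rules are hypothetical and are not taken from bpCore.

```typescript
interface DeprecationNotice {
  component: string;
  migrationGuide: string; // documented migration path
  windowEndsOn: string; // ISO date (YYYY-MM-DD) that closes the deprecation window
}

interface SquadUsage {
  squad: string;
  componentsInUse: string[];
}

// Reasons a deprecated component cannot be removed yet.
// An empty list means removal would not break any squad.
function removalBlockers(notice: DeprecationNotice, squads: SquadUsage[], checkDate: string): string[] {
  const blockers: string[] = [];
  if (notice.migrationGuide.trim() === "") {
    blockers.push("No migration path documented");
  }
  // ISO dates in YYYY-MM-DD form sort correctly as plain strings.
  if (checkDate < notice.windowEndsOn) {
    blockers.push(`Deprecation window open until ${notice.windowEndsOn}`);
  }
  for (const usage of squads) {
    if (usage.componentsInUse.indexOf(notice.component) !== -1) {
      blockers.push(`${usage.squad} still uses ${notice.component}`);
    }
  }
  return blockers;
}

// Hypothetical data: one squad has migrated, one has not.
const modalNotice: DeprecationNotice = {
  component: "legacy-modal",
  migrationGuide: "Replace legacy-modal with dialog.",
  windowEndsOn: "2024-06-30",
};
const currentUsage: SquadUsage[] = [
  { squad: "squad-a", componentsInUse: ["dialog", "button"] },
  { squad: "squad-b", componentsInUse: ["legacy-modal", "button"] },
];

console.log(removalBlockers(modalNotice, currentUsage, "2024-05-01"));
// ["Deprecation window open until 2024-06-30", "squad-b still uses legacy-modal"]
```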

Accessibility — WCAG 2.1 AA

Enterprise software procurement in regulated industries requires demonstrable accessibility compliance. Colour contrast ratios, keyboard navigation, focus management, and screen reader compatibility were validated as part of the component review process — not assessed by individual squads during product delivery. Accessibility was a property of the system, not a squad responsibility.
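The colour contrast part of that review can be automated. The sketch below implements the WCAG 2.1 relative luminance and contrast ratio definitions and checks them against the AA thresholds (4.5:1 for normal text, 3:1 for large text). The function names are illustrative, and nothing here is claimed to be the tooling bpCore actually used.

```typescript
// Convert one 8-bit sRGB channel to its linear value, per the WCAG 2.1 definition.
function linearChannel(channel8Bit: number): number {
  const c = channel8Bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a #RRGGBB colour: 0 for black, 1 for white.
function relativeLuminance(hexColour: string): number {
  if (!/^#[0-9a-fA-F]{6}$/.test(hexColour)) {
    throw new Error(`Expected #RRGGBB, got "${hexColour}"`);
  }
  const r = parseInt(hexColour.slice(1, 3), 16);
  const g = parseInt(hexColour.slice(3, 5), 16);
  const b = parseInt(hexColour.slice(5, 7), 16);
  return 0.2126 * linearChannel(r) + 0.7152 * linearChannel(g) + 0.0722 * linearChannel(b);
}

// Contrast ratio between two colours, from 1 (identical) to 21 (black on white).
function contrastRatio(foreground: string, background: string): number {
  const first = relativeLuminance(foreground);
  const second = relativeLuminance(background);
  const lighter = Math.max(first, second);
  const darker = Math.min(first, second);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG 2.1 AA: 4.5:1 for normal text, 3:1 for large text.
function meetsAA(foreground: string, background: string, largeText: boolean = false): boolean {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}

// #1A1A1A is the ink colour used as the token example earlier in this document.
console.log(meetsAA("#1A1A1A", "#FFFFFF")); // true
console.log(meetsAA("#777777", "#FFFFFF")); // false: just under 4.5:1
```

Running a check like this against resolved token pairs during component review is one way contrast could be validated once at the system level instead of by each squad.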

Brand Governance vs. Product Flexibility

bp operates a global brand with defined visual identity standards. bpCore had to implement those standards precisely while giving product teams the flexibility to compose interfaces that served different functional contexts. The token layer was the mechanism that made this possible — brand values lived in the global token tier; product teams composed from the semantic tier above it.

Figma / Code Synchronisation Integrity

Token parity between Figma and code was a structural commitment, not a best-effort goal. Any state where the design environment and the engineering environment diverged — even temporarily — eroded designer confidence in the system and reintroduced the spec clarification overhead the system had been built to eliminate.

Onboarding New Designers Mid-Cycle

Product squads hired designers throughout the year, not only at system launch. The documentation and onboarding pathway had to be sufficient for a designer joining six months after launch to reach production-ready output without needing a walkthrough from a system team member. Documentation that required institutional knowledge to interpret was a system failure.

07 — Outcomes

Operational outcomes across design, engineering, and the product organisation

Outcomes were tracked against baselines established during the audit phase — component duplication counts, spec clarification frequency, onboarding time, and cross-squad consistency measures. The most significant outcomes were not the metrics; they were the organisational behaviours the system changed.

Area

Signal

Outcome

Handoff Friction

60% reduction in spec clarification cycles

Measured by reduction in design-to-engineering clarification tickets and rework rounds after handoff. Shared component contracts replaced per-handoff specification — engineers could implement against the system definition rather than interpreting intent from Figma files. The reduction was tracked over three release cycles after full adoption.

Designer Onboarding

Day-one production-ready baseline

New designers joining any product squad after bpCore adoption reached production-ready output from their first working day — onboarding against documented standards rather than reverse-engineering squad conventions. The previous average was measured in weeks, not days. This had a direct impact on squad delivery capacity during ramp periods.

Component Debt

Zero new component duplicates post-adoption

After all five squads had adopted bpCore, the audit process confirmed no new duplicate components had been introduced. Gaps in the system were handled through the contribution workflow — solutions became shared value rather than local debt. The compounding effect that had produced the pre-bpCore state was structurally interrupted.

Brand Propagation

Single-source updates across web and mobile

The first brand token update after full adoption required changes to a single token file — and propagated to both Figma and all production surfaces automatically. The equivalent update in the pre-bpCore state would have required coordinated manual changes across five squad codebases and Figma libraries, with a high probability of inconsistent application.

Cross-Squad Consistency

Unified visual language across 5 squads

Qualitative assessment confirmed by design review: the product suite read as a coherent platform following adoption. The subtle variations that had accumulated across squads — spacing inconsistencies, type weight deviations, status pattern divergence — were resolved not by correction but by structural elimination. Squads composing from the same token layer could not produce the divergence that had previously required ongoing review to identify.

Governance Sustainability

Contribution model operating without central bottleneck

Twelve months after launch, the contribution review process was still handling squad proposals consistently within the defined review window, without the system team having to scale headcount in proportion to the squad workload it served. The governance model had been designed to distribute the review load rather than concentrate it, and that design held under production conditions.

08 — Reflection

What this work taught me about design systems leadership

The most useful reframe I arrived at during this work was that the governance model was the product — the component library was evidence that the governance was working. A design system without a contribution workflow and a clear decision process is a snapshot. Snapshots age. The longevity of the system depended on whether the process structures we built alongside the components were robust enough to handle the real complexity of five independent teams with different priorities operating under delivery pressure.

The decision to incentivise adoption rather than mandate it was the right call, but it required discipline to maintain. Mandates are easier to enforce than incentives are to sustain, and there were moments when delivery pressure produced arguments for making adoption compulsory. The position we held to was that squads would circumvent any system they were required to use but did not find faster. The only durable adoption was adoption that made rational sense under the conditions squads actually operated in.

The audit before building was the most important investment in the project. The component duplication data changed the stakeholder conversation from a design principle argument — “a design system would be valuable” — to an operational cost argument — “we are paying this specific cost right now, and it is growing.” Those are different conversations with different levels of organisational traction.

If I were approaching this work again, I would involve engineers earlier in the token naming convention decisions — specifically before those conventions were finalised. The semantic layer we built was technically correct, but some of the naming choices that felt natural from a design perspective required translation effort from engineers accustomed to different naming conventions in their codebases. Earlier cross-functional input would have produced naming that was more immediately legible on both sides of the handoff.

The cross-squad visibility the contribution process created turned out to be a stronger outcome than expected. Squads that had not previously had structured communication about design decisions found themselves in productive conversations about shared problems because the contribution workflow surfaced them. Organisational alignment had been a deliberate goal of the governance design, but it emerged more strongly than anticipated once the process was running.

What this work confirmed above all: design systems problems are organisational problems that manifest technically. The component duplication, the handoff friction, the onboarding overhead — these were symptoms of teams that had never been given a viable path to shared decisions. Building that path was the design work. The components were what it produced.