Case Study — Enterprise Design System · Governance & Architecture
bpCore
Architecting bp’s enterprise design system from first principles — building the token architecture, component model, and governance structure that enabled five independent product squads to ship consistently across web and mobile without central bottlenecks.
Role
Design System Lead · Architecture & Governance
Domain
Enterprise Design System · Web & Mobile
Period
2022–Present
Scope
System architecture · Token taxonomy · Governance model · Cross-squad adoption across 5 product teams
01 — Industry Context
Why enterprise design systems fail, and what it costs when they do
A design system at enterprise scale is not a Figma library. It is a governance infrastructure — a set of technical decisions, process structures, and organisational agreements that determine whether independent product teams make consistent decisions without requiring a central authority to review every one. At the scale of a multi-product portfolio with teams operating on independent roadmaps, the limiting factor is almost never the quality of the components. It is whether teams actually use them — and whether the system survives the first year of production without fragmenting back into the state it was built to replace.
The failure mode most organisations encounter is predictable. A design system is built with genuine technical quality, shipped with internal fanfare, and quietly abandoned within 18 months. The components are too rigid to accommodate real product complexity. The contribution pathway is too slow or too opaque for squads to use. Governance sits with a small central team that becomes a bottleneck rather than an enabler. Teams learn to work around the system rather than through it, and the component library that was supposed to consolidate debt instead becomes another layer of debt sitting alongside the existing one.
What separates a design system that survives from one that doesn’t is rarely technical. The token architecture can be well-structured, the components well-built, the documentation thorough, and the system can still fail if the organisational conditions for adoption are not deliberately designed. Adoption has to be clearly faster than non-adoption, or rational teams will find ways around the overhead. Contribution has to be lightweight enough to use under delivery pressure, or gaps in the system will be filled locally and never consolidated. Governance has to produce decisions quickly enough to be useful, or it will be routed around.
bp’s product portfolio had reached the scale where the cost of fragmented design was measurable: in handoff overhead, in onboarding time for new designers, in the rework that accumulated when inconsistent components were discovered late in delivery cycles. The problem was not that the individual squads were building poorly — it was that they were each building the same things independently, and no mechanism existed to consolidate those decisions into shared value.
Enterprise Design System — Maturity Levels & Failure Points
Maturity Level | What Exists | What Typically Breaks
Component Catalogue | Shared UI files, ad-hoc patterns, no semantic layer | Brand updates require manual find-and-replace across every team's codebase
Token Layer | Semantic naming, Figma variables, design/code mapping | Without governance, tokens drift between design and engineering over time
Governed System | Contribution model, cross-team review, documented decisions | Governance too heavyweight → teams build outside the system rather than through it
Platform | Cross-squad alignment, adoption tracking, measurable velocity impact | What bpCore was built to achieve and sustain
02 — Problem Definition
Design debt compounding faster than any squad could address alone
bp’s product portfolio had grown to span multiple web and mobile products, each built and maintained by independent squads. Without a shared foundation, the same design problems were being solved independently across every team — producing subtle variations that compounded into a measurably incoherent experience across the suite.
01
Redundant Component Production
Squads maintained their own versions of the same primitives: buttons, form fields, data tables, status indicators. The same design decisions were being made up to five times, independently, producing divergent outputs with no mechanism to consolidate them.
02
Divergent Visual Language
Subtle variations in colour application, spacing, type weight, and interaction patterns had accumulated across squads. No single violation was egregious — the aggregate effect was a product suite that did not read as a coherent platform, even where individual products were well-executed.
03
Fragile Design-to-Dev Handoff
Without shared component definitions, every design handoff required engineers to interpret component intent from Figma files. Spec clarification cycles were routine. Rework after handoff — when engineers discovered ambiguities that had not been caught at review — was a consistent source of delivery overhead.
04
No Propagation Mechanism
Brand updates or spacing scale changes required manual search-and-replace across every squad's Figma files and codebase. There was no single source of truth from which changes could propagate. Every global decision was a coordination exercise across five independent teams.
05
Onboarding Without a Baseline
New designers joining any product squad faced months of reverse-engineering implicit conventions — learning the squad's specific patterns, component naming, and layout rules from existing work rather than from documented standards. Reaching production-ready output required a long implicit apprenticeship.
06
No Contribution Pathway
When a squad hit a UI requirement not covered by shared patterns, they built locally. There was no route for that solution to become shared value. Patterns that could have benefited every team were instead built, used once, and left to accumulate as unreachable squad-specific debt.
03 — Audit & Diagnosis
Mapping the actual state before deciding what to build
Before any system architecture decisions were made, a structured audit was conducted across all five product squads — cataloguing every component in active production use, mapping the variations that existed across teams, and identifying where debt was most severe. The audit was not a design exercise; it was an organisational diagnosis. The goal was to understand the actual state of the portfolio, not the stated state.
Component-level duplication was more severe than any single team had visibility into. Each squad could see their own debt clearly; no one had visibility across all five simultaneously. The audit made the aggregate visible — and made the case for a shared foundation in concrete operational terms rather than design principle arguments.
Interviews with designers across squads surfaced the workflow cost of the fragmented state: the time spent reverse-engineering other squads’ components when work required cross-product consistency, the overhead of maintaining squad-specific documentation for patterns that every team needed, and the recurring tension between delivery speed and visual quality when no shared component existed for the thing being built.
The audit also identified which components had converged naturally across squads — cases where teams had independently reached similar solutions — and which had diverged most severely. Convergent components became the first candidates for consolidation; divergent components revealed where the design problem was genuinely unresolved rather than merely under-communicated.
Component Audit — Duplication Across Five Product Squads
Component | Squads with own version | Variants found | Design debt
Button variants | 5 of 5 | 6+ | Critical
Data table | 5 of 5 | 5+ | Critical
Form fields | 4 of 5 | 4 | High
Status indicators | 5 of 5 | 6+ | Critical
Navigation patterns | 3 of 5 | 3 | High
Modal / dialog | 4 of 5 | 4 | High
Empty states | 3 of 5 | 3+ | Medium
Loading patterns | 4 of 5 | 5 | High
Finding 01
No component had a single agreed definition
Every shared UI primitive had at least two competing implementations across squads. There was no canonical version — only competing local conventions, none of which had enough authority to become the standard without a deliberate process to establish it.
Finding 02
Design debt was invisible at the squad level
Individual squads had no visibility into the aggregate. Each team knew their own debt. The cross-portfolio picture — five teams separately maintaining 6+ button variants — was only visible from the audit, and it changed the stakeholder conversation significantly.
Finding 03
Handoff cost was structural, not behavioural
The spec clarification overhead in design-to-dev handoff was not caused by unclear designers or inattentive engineers. It was caused by the absence of shared component contracts. The problem could not be solved by better communication — only by shared definitions.
Finding 04
Adoption would be voluntary or it would fail
The audit interviews surfaced a consistent signal: squads would not adopt a new system under delivery pressure unless it was demonstrably faster than the status quo. Mandate without value was a path to the same failure pattern the previous tools had followed.
04 — System Architecture
Token-first, composable by design
The architectural decisions made at the foundation level determined what was possible everywhere above it. Getting the token taxonomy right was not a design detail — it was the structural commitment that made propagation, platform adaptability, and design/code parity viable. Every subsequent decision was made in service of the same goal: a system that was genuinely faster to use than not using it.
Architectural Decision
Rationale
Layer
Semantic tokens as foundation
Named design decisions by their purpose, not their value. color.ink.primary is a semantic role; #1A1A1A is an implementation. A rebrand updates the implementation — nothing else changes. This was the decision that made global propagation viable.
Foundation
Three-tier token hierarchy
Global tokens (raw values) → Alias tokens (semantic roles) → Component tokens (contextual use). Components never reference global tokens directly. This isolation meant component-level changes could not inadvertently affect unrelated surfaces.
Architecture
Composition over enumeration
Structured the component library around composition rules rather than a catalogue of pre-built variants. Primitive components assembled into patterns; patterns into templates. Fewer total components expressing a wider range of UI — and a model that squads could extend without forking.
Components
Figma / code parity as an organisational commitment
Tokens exported from a single source via Style Dictionary — changes in the token file propagated to Figma and code simultaneously. Parity was not a state to be maintained manually; it was a structural property of the architecture. Squads could trust that what they saw in Figma was what engineering would produce.
Parity
Platform-adaptive components
Components designed to serve both web and mobile surfaces from shared token foundations, with platform-specific rendering logic handled at the component level. Product teams did not need to maintain separate web and mobile design files for the same component.
Platform
Documentation as a first-class deliverable
Every component shipped with usage guidelines, do/don’t examples, accessibility notes, and token references — authored alongside the component, not added retrospectively. Documentation that arrived after adoption had already happened was too late to influence decisions.
Enablement
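The parity decision above can be illustrated with a small sketch. It is not Style Dictionary's API and not bpCore's actual build; it only shows the underlying idea that one token source can feed both a code target and a design-tool target, so the two cannot drift independently. The token names other than color.ink.primary, and both output shapes, are assumptions made for illustration.

```typescript
// Hypothetical single token source: semantic name -> resolved value.
// Only color.ink.primary comes from the case study; the rest are illustrative.
type TokenMap = Readonly<Record<string, string>>;

const tokens: TokenMap = {
  "color.ink.primary": "#1A1A1A",
  "space.inset.md": "16px",
  "font.weight.body": "400",
};

// Target 1: CSS custom properties for web code.
function toCssVariables(source: TokenMap): string {
  const lines = Object.entries(source).map(
    ([name, value]) => `  --${name.replace(/\./g, "-")}: ${value};`,
  );
  return [":root {", ...lines, "}"].join("\n");
}

// Target 2: a flat name/value list that a design-tool sync step could consume.
// The slash-separated naming is an assumed convention, not a documented one.
interface DesignVariable {
  name: string;
  value: string;
}

function toDesignVariables(source: TokenMap): DesignVariable[] {
  return Object.entries(source).map(([name, value]) => ({
    name: name.replace(/\./g, "/"),
    value,
  }));
}

// Both outputs are derived from the same object, so changing a value in
// `tokens` changes both targets on the next build.
const css = toCssVariables(tokens);
const designVariables = toDesignVariables(tokens);
```

Because neither target is edited by hand, a mismatch between design and code can only come from a stale build, not from two people updating two files.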
Token Architecture — Three-Tier Hierarchy
Tier | Type | Example | Architectural Purpose
Global | Raw values | #1A1A1A · 16px · 400 | Foundation; never referenced directly in components or product code
Alias | Semantic roles | color.ink.primary → #1A1A1A | Named by purpose, not value. A rebrand updates the alias target and nothing else changes
Component | Contextual use | button.label.color → color.ink.primary | Scoped to component usage. Components never reference global tokens directly
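As a minimal sketch of the hierarchy above, the three tiers can be modelled as a registry in which component tokens point at alias tokens and alias tokens point at global tokens. The registry shape, the function names, and the global token name global.color.grey.900 are assumptions for illustration; only color.ink.primary, button.label.color, and #1A1A1A come from the table.

```typescript
// A global token holds a raw value; alias and component tokens hold a reference.
type Token =
  | { tier: "global"; value: string }
  | { tier: "alias" | "component"; ref: string };

type Registry = Readonly<Record<string, Token>>;

const registry: Registry = {
  "global.color.grey.900": { tier: "global", value: "#1A1A1A" }, // hypothetical name
  "color.ink.primary": { tier: "alias", ref: "global.color.grey.900" },
  "button.label.color": { tier: "component", ref: "color.ink.primary" },
};

// Follow references until a raw value is reached.
function resolveToken(reg: Registry, name: string, seen: string[] = []): string {
  const token = reg[name];
  if (token === undefined) throw new Error(`Unknown token: ${name}`);
  if (seen.includes(name)) {
    throw new Error(`Circular reference: ${[...seen, name].join(" -> ")}`);
  }
  if (token.tier === "global") return token.value;
  return resolveToken(reg, token.ref, [...seen, name]);
}

// Isolation rule from the table: component -> alias only, alias -> global only.
function tierViolations(reg: Registry): string[] {
  const violations: string[] = [];
  for (const [name, token] of Object.entries(reg)) {
    if (token.tier === "global") continue;
    const expected = token.tier === "component" ? "alias" : "global";
    const target = reg[token.ref];
    if (target === undefined || target.tier !== expected) {
      violations.push(`${name} (${token.tier}) must reference a ${expected} token`);
    }
  }
  return violations;
}
```

A check like tierViolations could run in a build step so that a component token pointing straight at a raw value is caught before it ships. A rebrand then amounts to changing one value or one alias target, and resolveToken returns the new value for every component token that depends on it.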
05 — Governance Model
Governance designed for adoption, not enforcement
The governance model was the product. The component library was evidence that the governance worked. A system without contribution workflows eventually becomes a snapshot — correct at launch, increasingly outdated as products evolve around it. The governance model was what made bpCore a living system rather than a well-documented starting point that squads forked on day one.
The core design decision was incentive over mandate. Teams were never required to adopt bpCore — they were given strong reasons to want to. The first reason was time: using a bpCore component was consistently faster than building an equivalent from scratch, because it came with tokens, accessibility, documentation, and a code implementation already resolved. The second reason was quality: the review process that governed contributions meant every component in the system had been validated to a standard individual squads rarely had time to apply under delivery pressure.
Contribution was treated as a first-class workflow, not an afterthought. When a squad hit a UI requirement not covered by the existing component library, they had a clear, lightweight path to propose it for inclusion. Proposals were reviewed against explicit criteria — semantic correctness, accessibility compliance, token alignment, usage generalisability — and decisions were documented publicly regardless of outcome. A rejected proposal with a documented rationale was more useful to the squads than an approved one that had been accepted silently.
Cross-squad visibility was built into the governance process from the start. Contribution proposals were visible to all squads before decisions were made — which surfaced cases where multiple teams had independently identified the same gap, and allowed combined input to produce a better-specified solution than any single squad would have produced alone. The process created cross-squad communication that had not previously existed.
Contribution Workflow — Before & After Governance
Before — Ungoverned Contribution
1. Squad hits a UI gap not covered by existing patterns
2. Builds a local solution in their own component file
3. Solution ships to production inside the squad's product
4. Pattern is never reviewed, never documented, never reusable
5. Same gap appears in another squad → rebuilt again independently
Debt compounded each sprint · no consolidation path · knowledge siloed per squad
After — Governed Contribution Model
1. Squad identifies gap → opens a contribution proposal with usage context
2. Core review: evaluated against existing patterns, accessibility, token compliance
3. Decision with documented rationale: merged to core or alternative recommended
4. If merged: component documented, tested, available to all squads immediately
5. Proposing squad gets credit for the contribution (incentive, not mandate)
Debt eliminated at source · every gap becomes shared value · adoption self-reinforcing
Governance Principle
How it operated in practice
Purpose
Proposal-based contribution
Any squad could submit a contribution proposal. The bar was not perfection — it was a clear use case, token compliance, and evidence of real product need. Proposals without a real use case were declined with rationale.
Quality
Public review decisions
All review outcomes — approved, rejected, deferred — were documented in the contribution log, visible to all squads. Rationale was always provided. This made the governance process legible and predictable.
Trust
Adoption tracked, not assumed
Squad adoption was tracked quarterly — which components were in use, where the system had been extended locally, and which gaps remained unaddressed. Tracking made adoption visible to stakeholders and identified the next priority contributions.
Accountability
No breaking changes without migration support
Any change to a shipped component that would require squads to update their implementation was accompanied by a documented migration path and a deprecation window. Breaking changes without migration support destroyed trust faster than any benefit they could provide.
Safety
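The proposal and public-decision principles above can be sketched as a small data model. This is an illustration only: in practice the review was a human judgement, and the boolean fields, type names, and rationale wording here are assumptions rather than bpCore's actual tooling. The point it shows is that every outcome, including a rejection, produces a log entry with a rationale.

```typescript
// "deferred" is kept as a possible outcome, but this sketch never assigns it:
// deferral is assumed to be a reviewer's call, not something a rule can decide.
type Outcome = "approved" | "rejected" | "deferred";

interface Proposal {
  component: string;
  squad: string;
  useCase: string; // evidence of real product need
  usesOnlySystemTokens: boolean; // token compliance
  passedAccessibilityReview: boolean;
}

interface LogEntry {
  component: string;
  squad: string; // the proposing squad keeps credit for the contribution
  outcome: Outcome;
  rationale: string; // always present, whatever the outcome
}

function reviewProposal(p: Proposal): LogEntry {
  const problems: string[] = [];
  if (p.useCase.trim() === "") problems.push("no documented product use case");
  if (!p.usesOnlySystemTokens) problems.push("references values outside the token layer");
  if (!p.passedAccessibilityReview) problems.push("accessibility review not passed");

  const outcome: Outcome = problems.length === 0 ? "approved" : "rejected";
  const rationale =
    problems.length === 0
      ? "Clear use case, token compliant, accessibility review passed."
      : `Declined: ${problems.join("; ")}.`;
  return { component: p.component, squad: p.squad, outcome, rationale };
}

// The log is append-only and visible to every squad.
const contributionLog: LogEntry[] = [];

function submitProposal(p: Proposal): LogEntry {
  const entry = reviewProposal(p);
  contributionLog.push(entry);
  return entry;
}
```

Keeping the rationale as a required field, rather than an optional note, is what makes a rejected proposal as legible to other squads as an approved one.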
06 — Enterprise Constraints
Constraints that shaped every architectural decision
A design system serving a multi-product enterprise portfolio operates under constraints that a squad-level component library does not. The system had to be correct at scale, not just in isolation — and correctness at scale requires accounting for the organisational and technical complexity the system lives inside.
Multi-Platform Delivery
bpCore served both web and mobile product surfaces — different rendering environments, different interaction paradigms, different performance constraints. The token architecture provided a shared semantic foundation; platform-specific rendering was handled at the component implementation level. Squads building for mobile did not maintain a separate design system.
Independent Squad Release Cadences
Five product squads shipped on different cycles, with different engineering priorities and different product managers. The governance model could not assume synchronised releases or simultaneous adoption. Version management and deprecation windows had to accommodate squads in different states of adoption at the same time.
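One way to picture deprecation windows across squads at different stages of adoption is the sketch below. It assumes a simple rule, namely that a deprecated component may be removed only once a migration path is documented and the window has elapsed, and it reports which squads are still migrating. The type names, fields, and the rule itself are illustrative assumptions, not a description of bpCore's release tooling.

```typescript
interface Deprecation {
  component: string;
  migrationGuide: string; // the documented migration path
  windowEnds: Date; // last day of the deprecation window
}

interface SquadUsage {
  squad: string;
  stillUsesDeprecated: boolean;
}

interface RemovalCheck {
  allowed: boolean;
  reason: string;
  stillMigrating: string[]; // squads that have not yet moved off the old version
}

function checkRemoval(d: Deprecation, usages: SquadUsage[], today: Date): RemovalCheck {
  const stillMigrating = usages.filter((u) => u.stillUsesDeprecated).map((u) => u.squad);

  if (d.migrationGuide.trim() === "") {
    return { allowed: false, reason: "no documented migration path", stillMigrating };
  }
  if (today.getTime() <= d.windowEnds.getTime()) {
    return { allowed: false, reason: "deprecation window still open", stillMigrating };
  }
  return { allowed: true, reason: "window elapsed and migration path documented", stillMigrating };
}
```

Returning stillMigrating alongside the decision keeps squads on slower release cycles visible, so the system team can offer support before the old version is removed.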
Accessibility — WCAG 2.1 AA
Enterprise software procurement in regulated industries requires demonstrable accessibility compliance. Colour contrast ratios, keyboard navigation, focus management, and screen reader compatibility were validated as part of the component review process — not assessed by individual squads during product delivery. Accessibility was a property of the system, not a squad responsibility.
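The colour contrast part of that review can be expressed as a short calculation. The sketch below follows the WCAG 2.1 definitions of relative luminance and contrast ratio and applies the 4.5:1 threshold that level AA sets for normal-size text. The function names are illustrative, and pairing #1A1A1A with a white background is an assumption made for the example, not a documented bpCore combination.

```typescript
// Convert an 8-bit sRGB channel to its linear value (WCAG 2.1 definition).
function channelToLinear(channel8Bit: number): number {
  const c = channel8Bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a #RRGGBB colour.
function relativeLuminance(hex: string): number {
  if (!/^#[0-9a-f]{6}$/i.test(hex)) throw new Error(`Expected #RRGGBB, got: ${hex}`);
  const n = parseInt(hex.slice(1), 16);
  const r = (n >> 16) & 255;
  const g = (n >> 8) & 255;
  const b = n & 255;
  return 0.2126 * channelToLinear(r) + 0.7152 * channelToLinear(g) + 0.0722 * channelToLinear(b);
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(hexA: string, hexB: string): number {
  const a = relativeLuminance(hexA);
  const b = relativeLuminance(hexB);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG 2.1 AA requires at least 4.5:1 for normal-size text.
function meetsAaForNormalText(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}

// Assumed pairing: primary ink on a white surface.
const inkOnWhite = meetsAaForNormalText("#1A1A1A", "#FFFFFF");
```

Running a check like this against token pairs during component review, rather than per product screen, is one way contrast can become a property of the system instead of a task for each squad.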
Brand Governance vs. Product Flexibility
bp operates a global brand with defined visual identity standards. bpCore had to implement those standards precisely while giving product teams the flexibility to compose interfaces that served different functional contexts. The token layer was the mechanism that made this possible — brand values lived in the global token tier; product teams composed from the semantic tier above it.
Figma / Code Synchronisation Integrity
Token parity between Figma and code was a structural commitment, not a best-effort goal. Any state where the design environment and the engineering environment diverged — even temporarily — eroded designer confidence in the system and reintroduced the spec clarification overhead the system had been built to eliminate.
Onboarding New Designers Mid-Cycle
Product squads hired designers throughout the year, not only at system launch. The documentation and onboarding pathway had to be sufficient for a designer joining six months after launch to reach production-ready output without needing a walkthrough from a system team member. Documentation that required institutional knowledge to interpret was a system failure.
07 — Outcomes
Operational outcomes across design, engineering, and the product organisation
Outcomes were tracked against baselines established during the audit phase — component duplication counts, spec clarification frequency, onboarding time, and cross-squad consistency measures. The most significant outcomes were not the metrics; they were the organisational behaviours the system changed.
Area
Signal
Outcome
Handoff Friction
60% reduction in spec clarification cycles
Measured by reduction in design-to-engineering clarification tickets and rework rounds after handoff. Shared component contracts replaced per-handoff specification — engineers could implement against the system definition rather than interpreting intent from Figma files. The reduction was tracked over three release cycles after full adoption.
Designer Onboarding
Day-one production-ready baseline
New designers joining any product squad after bpCore adoption reached production-ready output from their first working day — onboarding against documented standards rather than reverse-engineering squad conventions. The previous average was measured in weeks, not days. This had a direct impact on squad delivery capacity during ramp periods.
Component Debt
Zero new component duplicates post-adoption
After all five squads had adopted bpCore, the audit process confirmed no new duplicate components had been introduced. Gaps in the system were handled through the contribution workflow — solutions became shared value rather than local debt. The compounding effect that had produced the pre-bpCore state was structurally interrupted.
Brand Propagation
Single-source updates across web and mobile
The first brand token update after full adoption required changes to a single token file — and propagated to both Figma and all production surfaces automatically. The equivalent update in the pre-bpCore state would have required coordinated manual changes across five squad codebases and Figma libraries, with a high probability of inconsistent application.
Cross-Squad Consistency
Unified visual language across 5 squads
Qualitative assessment confirmed by design review: the product suite read as a coherent platform following adoption. The subtle variations that had accumulated across squads — spacing inconsistencies, type weight deviations, status pattern divergence — were resolved not by correction but by structural elimination. Squads composing from the same token layer could not produce the divergence that had previously required ongoing review to identify.
Governance Sustainability
Contribution model operating without central bottleneck
The contribution review process was handling squad proposals within the defined review window consistently twelve months after launch — without requiring the system team to scale headcount proportionally to the squad workload it served. The governance model had been designed to distribute the review load rather than concentrate it, and that design held under production conditions.
08 — Reflection
What this work taught me about design systems leadership
The most useful reframe I arrived at during this work was that the governance model was the product — the component library was evidence that the governance was working. A design system without a contribution workflow and a clear decision process is a snapshot. Snapshots age. The longevity of the system depended on whether the process structures we built alongside the components were robust enough to handle the real complexity of five independent teams with different priorities operating under delivery pressure.
The decision to incentivise adoption rather than mandate it was the right call, but it required discipline to maintain. Mandates are easier to enforce than incentives are to sustain — and there were moments when delivery pressure produced arguments for making adoption compulsory. The argument we held to was that a system teams were required to use but did not find faster was a system they would circumvent. The only durable adoption was adoption that made rational sense under the conditions squads actually operated in.
The audit before building was the most important investment in the project. The component duplication data changed the stakeholder conversation from a design principle argument — “a design system would be valuable” — to an operational cost argument — “we are paying this specific cost right now, and it is growing.” Those are different conversations with different levels of organisational traction.
If I were approaching this work again, I would involve engineers earlier in the token naming convention decisions — specifically before those conventions were finalised. The semantic layer we built was technically correct, but some of the naming choices that felt natural from a design perspective required translation effort from engineers accustomed to different naming conventions in their codebases. Earlier cross-functional input would have produced naming that was more immediately legible on both sides of the handoff.
The cross-squad visibility the contribution process created turned out to be a stronger secondary outcome than expected. Squads that had not previously had structured communication about design decisions found themselves in productive conversations about shared problems because the contribution workflow surfaced them. The system produced organisational alignment as a side effect of the governance process. That alignment had been a deliberate goal of the design, but it emerged more strongly than anticipated once the process was running.
What this work confirmed above all: design systems problems are organisational problems that manifest technically. The component duplication, the handoff friction, the onboarding overhead — these were symptoms of teams that had never been given a viable path to shared decisions. Building that path was the design work. The components were what it produced.