When an order lands in our system, we have less time than a single eye-blink to decide where it goes — and that decision involves checking the live capacity, SLA standing, geographic fit, and price across every PSP we're connected to. Here's how we built that.
Why routing speed matters
A storefront doesn't care which printer fulfills an order. It cares that the order is accepted, produced on time, and shipped to the right place. But behind the scenes, every millisecond between "order placed" and "PSP confirmed" is a millisecond your checkout flow is waiting on us.
If we're slow, we become the bottleneck. If we're flaky, we become the problem. The goal isn't just to route correctly — it's to make routing feel like it isn't happening at all.
The bar: 99.9% of routing decisions complete in < 200ms, including a real-time availability check against every candidate PSP.
The integration tax
Most teams that scale across multiple print providers run into the same wall. Each PSP has its own API, its own product catalog model, its own status vocabulary, and its own quirks. Connecting to one is straightforward. Connecting to fourteen is a maintenance nightmare.
We didn't want to build fourteen integrations. We wanted to build one interface that could speak fourteen languages. — Toby Dawson, Founder
The shape of the problem is roughly this: every order has to be translated into a canonical form, the router has to decide which PSP is the best fit, and then the canonical form has to be translated into whatever dialect that specific PSP expects.
Our approach
We split the system into three layers, each with a single responsibility:
- Intake. Normalize incoming orders into a canonical product + spec model. This is the only layer that knows about your storefront.
- Routing. Score every candidate PSP against the order, in parallel. This is the only layer that knows about both sides.
- Adapters. Translate canonical → PSP and PSP → canonical. This is the only layer that knows about a specific PSP.
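To make the layer boundaries concrete, here is a minimal sketch of how the three responsibilities might be separated in code. Every name in it (`normalizeOrder`, `acmeAdapter`, `dispatch`, the storefront payload fields, the "Acme" PSP and its status codes) is hypothetical and invented for illustration; the real canonical model is not described in this post.

```javascript
// Hypothetical sketch of the three layers; none of these names or shapes
// come from the real system.

// Intake: storefront payload -> canonical order. Only this function knows
// what the storefront's payload looks like.
function normalizeOrder(storefrontPayload) {
  return {
    orderId: String(storefrontPayload.id),
    product: storefrontPayload.sku.toLowerCase(),
    quantity: Number(storefrontPayload.qty),
    shipTo: { country: storefrontPayload.shipping.country_code },
  };
}

// Adapter: canonical <-> one PSP's dialect. Only this object knows how
// the (made-up) "Acme" PSP names its fields and statuses.
const acmeAdapter = {
  name: "acme",
  toPsp(order) {
    return {
      ref: order.orderId,
      item_code: order.product.toUpperCase(),
      copies: order.quantity,
      dest: order.shipTo.country,
    };
  },
  fromPsp(response) {
    const statusMap = { ACK: "accepted", REJ: "rejected" };
    return {
      orderId: response.ref,
      status: statusMap[response.state] ?? "unknown",
    };
  },
};

// Routing sits between the two and only ever sees canonical orders and
// adapters, never storefront payloads or PSP wire formats.
function dispatch(order, adapter) {
  return { psp: adapter.name, request: adapter.toPsp(order) };
}

const order = normalizeOrder({
  id: 1001,
  sku: "Poster-A3",
  qty: "2",
  shipping: { country_code: "DE" },
});
const dispatched = dispatch(order, acmeAdapter);
```

The point of the split is that a change to the storefront payload touches only `normalizeOrder`, and a change to one PSP's API touches only that PSP's adapter.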
Inside the routing engine
The routing engine itself is a stateless service that fans out to every PSP adapter in parallel, gathers a score per provider, and picks a winner. The scoring function takes into account capability fit, current SLA standing, cost, geography, and a configurable bias from the customer.
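This post does not give the scoring formula. One plausible shape, shown purely as an illustration, is a weighted sum over factors that have each been normalized to the range 0 to 1. The weights, the field names, the `scoreCandidate` helper, and the additive customer bias below are all assumptions.

```javascript
// Illustrative weighted-sum scorer. The weights and field names are
// assumptions for this sketch, not the production scoring function.
const WEIGHTS = { capability: 0.35, sla: 0.3, cost: 0.2, geography: 0.15 };

// Each factor is expected to be normalized to [0, 1], where 1 is best
// (for cost, 1 means cheapest).
function scoreCandidate(factors, customerBias = 0) {
  const base =
    WEIGHTS.capability * factors.capability +
    WEIGHTS.sla * factors.sla +
    WEIGHTS.cost * factors.cost +
    WEIGHTS.geography * factors.geography;
  // The customer bias nudges the score up or down for a preferred or
  // deprioritized PSP.
  return base + customerBias;
}

// Two hypothetical candidates: one close to the destination, one cheaper.
const nearby = scoreCandidate({ capability: 1, sla: 0.9, cost: 0.5, geography: 1 });
const cheap = scoreCandidate({ capability: 1, sla: 0.9, cost: 1, geography: 0.2 });
```

With these made-up weights the nearby PSP edges out the cheaper one, and a small positive bias toward the cheaper PSP would be enough to flip the result.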
A simplified version of the candidate fan-out looks like this:
    // Score every PSP in parallel; reject any that aren't viable.
    const candidates = await Promise.all(
      psps.map(async (psp) => {
        const caps = await psp.checkCapabilities(order);
        if (!caps.viable) return null;
        return { psp, score: score(order, caps, psp.sla, psp.region) };
      })
    );

    const winner = candidates
      .filter(Boolean)
      .sort((a, b) => b.score - a.score)[0];
The interesting part isn't the code — it's the budget. Every adapter gets a strict 120ms timeout. If a PSP doesn't respond in time, it's silently dropped from the candidate pool for that order. We absorb the cost of a slow PSP rather than passing it on to the storefront.
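One way to enforce a per-adapter budget like this is to race each adapter call against a timer. The sketch below assumes a hypothetical `withBudget` helper; it is not taken from the production code. It also maps a rejected call to `null`, so that a single failing PSP cannot reject the whole `Promise.all` fan-out.

```javascript
// Sketch: give one adapter call a fixed time budget and treat a timeout
// or an error as "not a candidate". withBudget is a hypothetical helper.
const ADAPTER_BUDGET_MS = 120; // the per-adapter timeout quoted above

function withBudget(promise, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(null), ms);
  });
  // Whichever settles first wins. A rejected adapter call resolves to
  // null instead of propagating the error.
  return Promise.race([promise.catch(() => null), timeout]).finally(() =>
    clearTimeout(timer)
  );
}

// Possible usage inside the fan-out (psp.checkCapabilities is from the
// example above):
//   const caps = await withBudget(psp.checkCapabilities(order), ADAPTER_BUDGET_MS);
//   if (!caps || !caps.viable) return null;
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying request. A fuller version would likely pass an abort signal down to the adapter's HTTP client as well.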
Designing for failure
Things go wrong. PSPs go down. APIs return 500s. Networks blip. The system has to keep working when individual components don't.
Three patterns earn their keep here:
- Bulkheads. Each adapter runs in an isolated pool. A slow or failing PSP can't starve the others.
- Circuit breakers. Repeated failures from a PSP take it out of the candidate pool until health probes recover.
- Idempotency keys. If we retry an order against a different PSP, we never accidentally double-print.
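As an illustration of the second pattern, here is a minimal circuit breaker sketch. The class name, the threshold, and the cooldown are assumptions, and a simple time-based cooldown stands in for the health probes mentioned above: once the cooldown has elapsed, the PSP is allowed back in for a trial request, and a single further failure takes it out again.

```javascript
// Minimal circuit breaker sketch. Thresholds and names are illustrative.
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, which keeps the sketch testable
    this.failures = 0;
    this.openedAt = null; // null means the circuit is closed (healthy)
  }

  // True if the PSP may be included in the candidate pool right now.
  allows() {
    if (this.openedAt === null) return true;
    // After the cooldown, let a trial request through ("half-open").
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures += 1;
    // At or past the threshold, (re)open the circuit and restart the cooldown.
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}
```

In a router, the fan-out would skip any PSP whose breaker returns `false` from `allows()`, and would call `recordSuccess()` or `recordFailure()` after each adapter call.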
What we learned
The architecture is now in production, routing live orders. The headline numbers are good: p50 routing latency is 84ms and p99 is 187ms, both under the 200ms bar. But the more interesting result is what we didn't have to build.
Once the abstraction was right, adding a new PSP turned into a one-week project — write the adapter, define the capability matrix, ship. The routing engine doesn't need to change. The storefront doesn't need to change. Everything in the middle just absorbs the new option.
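The post does not show what a capability matrix looks like. As a rough sketch only, it could be a static description of what a PSP can produce and where it ships, checked against the canonical order. The `acmeCapabilities` object, its fields, the order shape, and the `checkAgainstMatrix` helper are all hypothetical.

```javascript
// Hypothetical capability matrix for one PSP: which products it can
// produce, in what quantities, and where it ships.
const acmeCapabilities = {
  products: {
    "poster-a3": { minQty: 1, maxQty: 500 },
    "mug-11oz": { minQty: 1, maxQty: 100 },
  },
  shipsTo: new Set(["DE", "FR", "NL"]),
};

// Returns { viable } in the same spirit as checkCapabilities in the
// fan-out example. This static check ignores live capacity, which a
// real-time availability check would also have to consult.
function checkAgainstMatrix(matrix, order) {
  const spec = matrix.products[order.product];
  const viable =
    spec !== undefined &&
    order.quantity >= spec.minQty &&
    order.quantity <= spec.maxQty &&
    matrix.shipsTo.has(order.shipTo.country);
  return { viable };
}
```

Under this assumption, onboarding a PSP mostly means filling in data like `acmeCapabilities` and writing its adapter, while the routing code stays untouched.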
That's the bet behind Oruve, and it's the thing we'll keep writing about: what a control plane for print can be when you stop thinking integration-by-integration and start thinking system-wide.