BOON PIN
ERP · Microservices · NestJS · System Design

Designing ERP System Architecture for Scale

March 15, 2026 · 8 min read

Key considerations and pitfalls when transitioning from monolithic manufacturing systems to event-driven microservices.

Most manufacturing ERPs start as monoliths — and that's fine. A single deployable unit with shared state is simpler to build, test, and reason about in the early stages. The problems emerge when teams grow, domains multiply, and a change in the billing module breaks the shopfloor scheduler at 2am.

The transition to an event-driven architecture isn't about microservices for their own sake. It's about drawing explicit boundaries between business domains — Order Management, Manufacturing Execution, Warehouse, Finance — so that each can evolve independently. The key insight is that these domains don't actually need to talk to each other in real time. They need to react to what happened.

Domain events are the currency. When an order is confirmed, an OrderConfirmed event is emitted. The MES subscribes and schedules production. The WMS subscribes and reserves materials. Neither module calls the other directly. This decoupling means a WMS outage doesn't block order processing: as long as the bus persists events, they queue up and are processed when the service recovers.
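To make the flow concrete, here is a minimal in-process sketch of that publish/subscribe shape. It is an illustration under assumptions, not a production bus: the `EventBus` class, the subscriber names, and the `OrderConfirmed` field names are all hypothetical, and the in-memory queue stands in for whatever durable broker you would actually use.

```typescript
// Hypothetical event shape; field names are assumptions for illustration.
interface OrderConfirmed {
  type: "OrderConfirmed";
  orderId: string;
  lines: { sku: string; quantity: number }[];
}

type DomainEvent = OrderConfirmed; // widen this union as more events appear
type Handler = (event: DomainEvent) => void;

interface Subscription {
  handler: Handler;
  queue: DomainEvent[]; // per-subscriber backlog
  online: boolean;
}

// In-memory stand-in for a durable broker: each subscriber has its own queue.
class EventBus {
  private subs = new Map<string, Subscription>();

  subscribe(name: string, handler: Handler): void {
    this.subs.set(name, { handler, queue: [], online: true });
  }

  // Simulates a consumer going down or recovering; recovery drains the backlog.
  setOnline(name: string, online: boolean): void {
    const sub = this.subs.get(name);
    if (!sub) throw new Error(`unknown subscriber: ${name}`);
    sub.online = online;
    if (online) this.drain(sub);
  }

  publish(event: DomainEvent): void {
    this.subs.forEach((sub) => {
      sub.queue.push(event);
      if (sub.online) this.drain(sub);
    });
  }

  private drain(sub: Subscription): void {
    let next = sub.queue.shift();
    while (next !== undefined) {
      sub.handler(next);
      next = sub.queue.shift();
    }
  }
}

const bus = new EventBus();
const scheduledOrders: string[] = [];
const reservedBySku = new Map<string, number>();

// MES: schedules production for every confirmed order.
bus.subscribe("mes", (e) => {
  scheduledOrders.push(e.orderId);
});

// WMS: reserves materials for every order line.
bus.subscribe("wms", (e) => {
  for (const line of e.lines) {
    reservedBySku.set(line.sku, (reservedBySku.get(line.sku) ?? 0) + line.quantity);
  }
});

bus.setOnline("wms", false); // simulate a WMS outage
bus.publish({
  type: "OrderConfirmed",
  orderId: "SO-1001",
  lines: [{ sku: "STEEL-ROD", quantity: 40 }],
});
const reservedDuringOutage = reservedBySku.get("STEEL-ROD") ?? 0;
bus.setOnline("wms", true); // recovery: the queued event is processed now
const reservedAfterRecovery = reservedBySku.get("STEEL-ROD") ?? 0;
```

The order is scheduled by the MES while the WMS is down, and the reservation only appears after the WMS comes back. The publisher never needs to know which of the two consumers is healthy.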

The most common pitfall I've encountered: teams try to go microservices all at once and end up with a distributed monolith, tightly coupled services that just happen to be deployed separately. A better approach is modular monolith first. Enforce strict module boundaries inside a single deployment. Use NestJS modules with explicit import/export surfaces. Only extract a module into its own service when you have a clear operational reason: independent scaling, a different deployment cadence, or a separate team.
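The boundary idea can be sketched without any framework. The snippet below is plain TypeScript, not NestJS code: `WarehouseApi`, `createWarehouseModule`, and `createOrderModule` are hypothetical names, and the closure plays the role that a NestJS module's `exports` array plays, where only listed providers are visible to modules that import it.

```typescript
// Public surface of a hypothetical warehouse module. Other modules see only this.
interface WarehouseApi {
  reserve(sku: string, quantity: number): boolean;
  available(sku: string): number;
}

function createWarehouseModule(initialStock: Record<string, number>): WarehouseApi {
  // Internal state: unreachable from outside the module.
  const stock = new Map<string, number>();
  for (const sku of Object.keys(initialStock)) {
    stock.set(sku, initialStock[sku]);
  }

  return {
    available(sku: string): number {
      return stock.get(sku) ?? 0;
    },
    reserve(sku: string, quantity: number): boolean {
      const onHand = stock.get(sku) ?? 0;
      if (quantity <= 0 || quantity > onHand) return false;
      stock.set(sku, onHand - quantity);
      return true;
    },
  };
}

// The order module depends on the interface, never on warehouse internals.
function createOrderModule(warehouse: WarehouseApi) {
  return {
    confirm(orderId: string, sku: string, quantity: number): string {
      return warehouse.reserve(sku, quantity)
        ? `${orderId}: confirmed`
        : `${orderId}: backordered`;
    },
  };
}

const warehouse = createWarehouseModule({ "BOLT-M8": 50 });
const orders = createOrderModule(warehouse);

const firstOrder = orders.confirm("SO-1", "BOLT-M8", 30);
const secondOrder = orders.confirm("SO-2", "BOLT-M8", 100);
```

Because the order module only ever holds a `WarehouseApi`, extracting the warehouse into its own service later means swapping the implementation behind that interface rather than rewriting every caller.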

Schema versioning is underestimated. Events are contracts. Once consumers depend on an event shape, you can't make a breaking change to it without coordinating every subscriber. Build event versioning in from the start: version your event types, and use an event registry pattern so producers and consumers agree on schemas. This pays back a hundredfold during a domain refactor.
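One way to sketch a registry, assuming events travel in an envelope that carries a type and a version number. Everything here is hypothetical: the `EventRegistry` class, the envelope fields, the two `OrderConfirmed` payload shapes, and especially the "USD" default used when upgrading old events. A real setup would more likely use a schema language than hand-written validators.

```typescript
interface EventEnvelope {
  type: string;
  version: number;
  payload: unknown;
}

type Validator = (payload: unknown) => boolean;
type Upcaster = (payload: unknown) => unknown; // converts version N to N + 1

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

class EventRegistry {
  private validators = new Map<string, Validator>();
  private upcasters = new Map<string, Upcaster>();

  private key(type: string, version: number): string {
    return `${type}@v${version}`;
  }

  register(type: string, version: number, validator: Validator): void {
    this.validators.set(this.key(type, version), validator);
  }

  registerUpcaster(type: string, fromVersion: number, upcaster: Upcaster): void {
    this.upcasters.set(this.key(type, fromVersion), upcaster);
  }

  // Unknown type/version pairs are rejected rather than passed through.
  validate(event: EventEnvelope): boolean {
    const validator = this.validators.get(this.key(event.type, event.version));
    return validator !== undefined && validator(event.payload);
  }

  // Applies upcasters one version at a time until the target is reached.
  upcast(event: EventEnvelope, targetVersion: number): EventEnvelope {
    let current = event;
    while (current.version < targetVersion) {
      const step = this.upcasters.get(this.key(current.type, current.version));
      if (!step) {
        throw new Error(`no upcaster from ${this.key(current.type, current.version)}`);
      }
      current = {
        type: current.type,
        version: current.version + 1,
        payload: step(current.payload),
      };
    }
    return current;
  }
}

const registry = new EventRegistry();

// v1 (assumed shape): { orderId: string, total: number }
registry.register("OrderConfirmed", 1, (p) => {
  return isRecord(p) && typeof p.orderId === "string" && typeof p.total === "number";
});

// v2 (assumed shape): { orderId: string, total: { amount: number, currency: string } }
registry.register("OrderConfirmed", 2, (p) => {
  if (!isRecord(p) || typeof p.orderId !== "string") return false;
  const total = p.total;
  return isRecord(total) && typeof total.amount === "number" && typeof total.currency === "string";
});

// Assumption for illustration: v1 events had no currency, so default to "USD".
registry.registerUpcaster("OrderConfirmed", 1, (p) => {
  const v1 = p as { orderId: string; total: number };
  return { orderId: v1.orderId, total: { amount: v1.total, currency: "USD" } };
});

const oldEvent: EventEnvelope = {
  type: "OrderConfirmed",
  version: 1,
  payload: { orderId: "SO-1", total: 250 },
};
const upgradedEvent = registry.upcast(oldEvent, 2);
```

With upcasters in place, a consumer written against v2 can still read v1 events from the backlog, so producers and consumers don't have to be redeployed in lockstep.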

For teams inheriting a legacy ERP, the strangler fig pattern works well. Introduce a new event bus alongside the existing system. New modules emit and consume events. Legacy code continues calling stored procedures. Over time, each domain gets migrated. You never have a big-bang rewrite, and you never go dark.
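A rough sketch of the routing piece, assuming all commands pass through a single facade that knows which domains have been migrated. The `StranglerFacade` class, the domain names, and the `sp_` stored-procedure naming are hypothetical; the two route functions are stubs standing in for a real database call and a real event bus.

```typescript
type ErpDomain = "orders" | "warehouse" | "finance";

interface ErpCommand {
  domain: ErpDomain;
  name: string;
}

type CommandRoute = (command: ErpCommand) => string;

class StranglerFacade {
  private migrated = new Set<ErpDomain>();
  private legacy: CommandRoute;
  private modern: CommandRoute;

  constructor(legacy: CommandRoute, modern: CommandRoute) {
    this.legacy = legacy;
    this.modern = modern;
  }

  // Flip one domain at a time; everything else keeps hitting the legacy path.
  migrate(domain: ErpDomain): void {
    this.migrated.add(domain);
  }

  handle(command: ErpCommand): string {
    const route = this.migrated.has(command.domain) ? this.modern : this.legacy;
    return route(command);
  }
}

const emittedEvents: string[] = [];

const facade = new StranglerFacade(
  // Legacy stub: pretend to call a stored procedure.
  (c) => `legacy: EXEC sp_${c.name}`,
  // New-path stub: pretend to emit an event onto the bus.
  (c) => {
    emittedEvents.push(`${c.domain}.${c.name}`);
    return `event: ${c.domain}.${c.name}`;
  },
);

const beforeMigration = facade.handle({ domain: "warehouse", name: "ReserveStock" });
facade.migrate("warehouse");
const afterMigration = facade.handle({ domain: "warehouse", name: "ReserveStock" });
const untouchedDomain = facade.handle({ domain: "finance", name: "PostInvoice" });
```

The migration becomes a per-domain switch in this sketch: warehouse traffic moves to the event bus while finance keeps calling stored procedures.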