# Design Tip: Enforcing Constraints Leads to Simpler, More Powerful Systems
The conventional wisdom in event-driven systems is that more capability means more power. Give every component the ability to publish, subscribe, transform, and route, and developers can build anything. This is true. It is also the source of most complexity in event-driven architectures: when everything can do everything, nothing about the system is predictable from its structure alone. You must read every component's implementation to understand the topology. You must trace message flows at runtime because the configuration cannot tell you what connects to what.

Rust developers already understand this problem. The language itself is a case study in how constraints create power. The borrow checker constrains memory access and eliminates data races. `Option` constrains null handling and eliminates null pointer exceptions. `Result` constrains error handling and eliminates silently ignored errors. In each case, the constraint feels restrictive at first and then reveals itself as the source of the language's reliability guarantees.

What if the same principle applies to event-driven systems? Not more capability, but less. Constrain each component to one of three roles, enforce those constraints at the system boundary, and see what happens. That is the hypothesis behind Emergent, an event-driven workflow engine written in Rust. The name is deliberate.

## Three Roles, No Exceptions

Every component in an Emergent pipeline is exactly one of three types.

| Primitive | Can Publish | Can Subscribe | Role |
|-----------|-------------|---------------|------|
| Source    | Yes         | No            | Brings data into the system |
| Handler   | Yes         | Yes           | Transforms data |
| Sink      | No          | Yes           | Sends data out of the system |

Sources cannot receive messages. Sinks cannot publish messages. Handlers do both. That is the entire design.

This looks restrictive, and it is. Sources are blind: they emit events without knowing who receives them. Sinks are terminal: they consume events without producing new ones. Only Handlers sit in the middle, subscribing and publishing.
If you need conditional logic that decides whether to emit a downstream event, that logic lives in a Handler. If you need to write to a database, that is a Sink. If you need to ingest from a webhook, that is a Source.

The Rust SDK enforces this constraint through separate types. `EmergentSource` has a publish method but no subscribe method. `EmergentSink` has a subscribe method but no publish method. `EmergentHandler` has both. These are not different configurations of the same generic type. They are distinct structs with distinct capabilities, and the Rust compiler enforces the boundaries. If you write a Sink that tries to publish, it will not compile. The constraint is not a convention documented in a README. It is a property of the type system.

This is the same pattern Rust developers use everywhere: make invalid states unrepresentable. You cannot create a `String` that is not valid UTF-8. You cannot use a `MutexGuard` after dropping it. You cannot publish from a Sink. The compiler catches the mistake before your code runs.

The constraint eliminates choices that cause problems. Sources never depend on other components' output. Sinks never produce events that trigger unexpected downstream behavior. When a system misbehaves, the three-primitive constraint narrows the search space: data enters through Sources, transforms through Handlers, and exits through Sinks. If something is wrong, the primitive type tells you where to look.

## What Emerges from the Constraint

The useful properties of the system are not features that were designed in. They are consequences that emerge from the constraint.

**Configuration becomes architecture.** When each component can only publish, only subscribe, or do both, a configuration file that declares these relationships is a complete topology. Here is a pipeline: a timer emits ticks, a filter passes every fifth tick, and a console prints the result.
```toml
[[sources]]
name = "timer"
path = "./target/release/timer"
publishes = ["timer.tick"]

[[handlers]]
name = "filter"
path = "./target/release/filter"
subscribes = ["timer.tick"]
publishes = ["timer.filtered"]

[[sinks]]
name = "console"
path = "./target/release/console"
subscribes = ["timer.filtered"]
```

This is not a documentation artifact. It is the executable specification. The engine parses this TOML, spawns each process, and enforces the declared contracts at runtime: a Source cannot subscribe, a Sink cannot publish, and messages route only to declared subscriptions.

The implication goes deeper than "the config file is readable." In conventional event-driven systems, the architecture lives in the code. The config file tells you what processes to run, but you have to read each process's source to understand what it publishes, what it subscribes to, and how it connects to other processes. The architecture is distributed across N codebases, and any component can change its behavior without the configuration reflecting the change.

When components are constrained to three primitives, the configuration captures the full topology because the constraints limit what each component can do. A Source listed with `publishes = ["timer.tick"]` can only publish. You do not need to open the source code to verify that it is not also subscribing to something. The TOML file is not an approximation of the architecture. It is the architecture.

This changes how teams work. A new team member reads one file and understands the entire system's data flow. Architecture reviews happen over a configuration file, not a codebase tour. Refactoring a pipeline means editing TOML and swapping binaries. If two teams need to integrate, they share a TOML snippet and a binary, and the topology is explicit. The constraint that components can only do one of three things is what makes this possible. Without it, the config file would be lying by omission.
**Lifecycle ordering follows from the constraint.** Because the three roles have a natural data-flow direction (Sources produce, Handlers transform, Sinks consume), the engine can derive the correct startup and shutdown order automatically. Sinks must be ready before Handlers emit, and Handlers must be ready before Sources produce. The engine starts Sinks first, then Handlers, then Sources. Shutdown reverses the order: Sources stop producing, Handlers drain in-flight messages, Sinks finish consuming.

This matters because incorrect lifecycle ordering is one of the most common sources of bugs in event-driven systems. If a Source starts before its downstream Sink is ready, messages get dropped or queued in an unbounded buffer. If a Sink shuts down before its upstream Handler finishes draining, events vanish. In conventional systems, you solve this with explicit health checks, readiness probes, or startup ordering configuration. All of that is manual coordination that someone must get right, and keep right as the topology evolves.

Emergent's three-phase lifecycle needs no configuration. The engine inspects each primitive's role, builds the dependency graph from the publish/subscribe relationships in the TOML file, and starts components in the correct order with a brief settling delay between tiers. Concretely, the `ProcessManager` starts Sinks first, then Handlers, then Sources, and `graceful_shutdown` runs the reverse: stop Sources via SIGTERM, broadcast `system.shutdown` to Handlers and wait for them to drain, then broadcast to Sinks and wait for them to finish. The three-phase lifecycle is a direct consequence of the three roles. You do not configure it because there is nothing to configure. The constraint determines the order.

**Language independence follows from process isolation.** Because each primitive is a standalone process communicating over Unix sockets, the engine does not care what language it is written in.
A Rust Source can feed a Python Handler that publishes to a TypeScript Sink. Language choice becomes a per-component decision: use Python where pandas matters, TypeScript for web integrations, Rust for performance-critical paths.

The real-world configuration file for Emergent's example pipeline demonstrates this. A Rust timer Source, a Rust filter Handler, a Deno-based colored console Sink, and a Python webhook Sink all run in the same pipeline. Each primitive is a separate process spawned by the engine, connecting over the same Unix socket using MessagePack serialization. The engine manages all of them identically because the three-primitive constraint means "Source, Handler, Sink" is the only taxonomy it needs. This polyglot capability was not a design goal. It fell out of the constraint that primitives are isolated processes with constrained interfaces.

**Event sourcing follows from centralized routing.** Every message passes through the engine for routing. Once messages flow through a central point, persisting them is trivial: the engine appends to both a JSON log file and a SQLite database automatically. Full event history, causation chains, and replay capability, without the developer configuring anything.

What the event store actually gives you is more interesting than "a log." Every message in Emergent carries a unique `MessageId` (a TypeID with a `msg_` prefix, based on UUIDv7, so IDs are time-sortable by default). When a Handler produces an output message, it can link that output to the input that caused it using `with_causation_from_message`. The engine persists both the causation ID and an optional correlation ID for every message.

This means you can query the SQLite event store to reconstruct the full processing history of any event: "this alert was produced by handler X at time T, caused by message Y from source Z." You can query by time range to see everything that happened in a window.
You can query by correlation ID to trace an entire request-response chain across multiple handlers. You can query by message type to see every instance of a particular event.

```rust
// Querying causation chains from the event store
let events = store.query_by_correlation(&correlation_id)?;
let timeline = store.query_by_time_range(start_ms, end_ms)?;
let ticks = store.query_by_type("timer.tick")?;
```

The event store is a free consequence of the routing architecture, which itself follows from the constraint that primitives cannot communicate directly. If primitives could talk to each other over direct channels, the engine would not see the messages, and you would need to instrument each component individually for observability. The constraint that all communication routes through the engine is what makes event sourcing automatic.

## What the Constraint Hides

The three-primitive constraint is the visible design. Underneath, it rests on acton-reactive, a Rust actor framework I built that applies the same philosophy one layer down: constrain the runtime, and reliability emerges.

Each primitive runs as an isolated actor with its own bounded inbox. If a Handler panics, acton-reactive catches the failure without affecting other actors in the system. No cascade. The engine applies Erlang-style supervision strategies (restart just the failed primitive, restart all primitives, or restart everything downstream of the failure) with exponential backoff to prevent restart storms. A developer writing a Sink never configures any of this. The constraint (one primitive = one actor = one process) makes isolation and supervision automatic.

Why does this matter beyond "it handles crashes"? Because supervision changes the failure model. Without supervision, a failing component means a failing pipeline. Operators get paged. Someone restarts the process manually or adds a systemd unit with restart logic. With actor supervision, a failing primitive is a transient event.
The engine detects the failure (the child monitoring task sends a `ChildExited` message to the actor), applies the configured restart policy, and the primitive comes back. If the failure persists, exponential backoff prevents restart storms. If the failure is permanent, the primitive enters a `Failed` state and the rest of the pipeline keeps running. The system degrades gracefully instead of failing atomically.

Backpressure follows from the same architecture. Every actor's inbox is bounded. When a slow Sink cannot keep up with a fast Source, the bounded channel applies natural flow control rather than dropping messages silently. The IPC layer adds per-connection rate limiting on top. None of this requires configuration because the constraint that all messages route through actors with bounded inboxes makes backpressure a structural property, not an opt-in feature.

The engine itself uses acton-reactive's type-state pattern, and this is where the design gets deeply Rustic. An actor in the `Idle` state can register message handlers. An actor in the `Started` state can process messages. These are not runtime flags. They are generic type parameters:

```rust
// A ManagedActor in the Idle state has register methods
let mut actor = runtime.new_actor::<PrimitiveActorState>(name);
actor.mutate_on::<ChildSpawned>(|actor, envelope| { /* ... */ });

// .start() consumes the Idle actor and returns a Started handle
let handle = actor.start().await;
// actor.mutate_on(...) would not compile here: `actor` was moved
```

The Rust compiler prevents registering handlers after an actor starts because `start()` consumes `self` by move. The `Idle` type is gone. If you try to call a registration method on the started handle, you get a compile error, not a runtime panic. This is the type-state pattern that Rust developers use for builders, connection states, and protocol sequences. In acton-reactive, it enforces the actor lifecycle at compile time.

The entire framework is built with zero `unsafe` code.
For Rust developers, this is a meaningful guarantee. It means the actor isolation, message passing, and lifecycle management are all built on safe Rust's ownership and borrowing rules. There are no FFI boundaries where safety guarantees disappear, no raw pointer manipulation that could invalidate the memory model. The safety of the primitives' runtime infrastructure is verified by the same compiler that checks the primitives' business logic.

The helper pattern reduces each primitive to its essential shape:

```rust
run_sink(Some("console"), &["timer.filtered"], |msg| async move {
    println!("{}", msg.payload());
    Ok(())
})
.await?;
```

That is a complete Sink. Connection, backpressure, fault isolation, signal handling, graceful shutdown, and IPC are handled by the framework. Look at the type signature of `run_sink`:

```rust
pub async fn run_sink<F, Fut>(
    name: Option<&str>,
    subscriptions: &[&str],
    consume_fn: F,
) -> HelperResult<()>
where
    F: Fn(EmergentMessage) -> Fut + Send + Sync,
    Fut: Future<Output = Result<(), String>> + Send,
```

The `consume_fn` takes an `EmergentMessage` and returns a future that resolves to a `Result`. The `Fn` trait bound (not `FnOnce`) means the closure is called repeatedly, once per message. The `Send + Sync` bounds mean it can safely cross thread boundaries, which the Tokio runtime requires. The `Future` output is `Send` so it can be polled on any worker thread. Every constraint in this signature is enforced by the compiler, and together they guarantee that your Sink callback is safe to run in a concurrent, multi-threaded async runtime.

The TypeScript and Python SDKs expose the same pattern:

```typescript
await runSink("console", ["timer.filtered"], async (msg) => {
  console.log(msg.payloadAs<{ sequence: number }>());
});
```

Because a Sink can only subscribe, the helper knows exactly what to set up: connect, subscribe to the declared types, run the callback for each message, disconnect on shutdown.
No configuration matrix. No mode selection. The primitive type determines everything.

A Handler helper is the same pattern with one addition: it also publishes.

```rust
run_handler(Some("filter"), &["timer.tick"], |msg, handler| async move {
    let output = EmergentMessage::new("timer.filtered")
        .with_causation_from_message(msg.id())
        .with_payload(json!({ "filtered": true }));
    handler.publish(output).await.map_err(|e| e.to_string())
})
.await?;
```

The `with_causation_from_message` call links output to input, creating traceable event chains. Every message carries a TypeID (a self-describing, time-sortable identifier), and derived messages link back to their parent. The causation ID is not a stringly-typed field. It is a `CausationId` newtype that the type system distinguishes from `MessageId` and `CorrelationId`, preventing accidental misuse. You cannot pass a `CorrelationId` where a `CausationId` is expected. This is Rust's newtype pattern applied to event tracing, giving you the same protection that separating `Meters` from `Feet` would give a physics calculation.

You can reconstruct the full processing history of any event from the event store. This traceability was not designed as a feature; it follows from the constraint that all messages route through the engine.

## Where the Constraint Breaks

Emergent runs on a single node. If you need distributed event processing across machines, you need Kafka or NATS. The three-primitive model describes process coordination, not network coordination.

For problems with high coupling between components, where every change to one component invalidates the others, simple composition through constrained primitives is not sufficient. You need explicit coordination.

The three-primitive model works for the broad class of problems where data flows through a pipeline: ingest, transform, output. That class is larger than most people assume. Log aggregation is ingest-transform-output.
Webhook processing is ingest-transform-output. So are sensor data collection, ETL pipelines, notification fanout, CI/CD event routing, and monitoring and alerting. All of these decompose naturally into Sources that produce, Handlers that decide, and Sinks that act. Once you start looking for the pattern, most event-driven problems that teams solve with message brokers or hand-rolled channel coordination turn out to be pipeline-shaped. The problems that genuinely require unconstrained topologies exist, but they are rarer than the tooling landscape implies.

## Agentic AI Is Pipeline-Shaped

Agentic AI systems look like they need unconstrained topologies. Agents reason, plan, use tools, delegate to other agents, and loop until satisfied. The dominant pattern is an orchestrator: a manager agent that assigns work to specialist agents and coordinates their outputs.

But look at what each agent actually does. It receives context, reasons about it, and produces a decision or action request. That is a Handler. A planning agent subscribes to tasks and publishes sub-tasks. A tool-calling agent subscribes to action requests and publishes results. A routing agent subscribes to user input and publishes to the right specialist. Each agent consumes and produces. That is the Handler constraint.

The triggers that start a chain of reasoning (user prompts, scheduled tasks, webhook events) are Sources. The effects on the world (sending an email, updating a database, responding to a user) are Sinks. The agents are Handlers in between.
Here is what a multi-agent code review pipeline looks like as an Emergent configuration:

```toml
# Code review agentic pipeline

[[sources]]
name = "github-webhook"
path = "uv"
args = ["run", "--project", "./agents/webhook", "python", "main.py"]
publishes = ["pr.opened"]

[[handlers]]
name = "code-analyzer"
path = "./target/release/code-analyzer"
subscribes = ["pr.opened"]
publishes = ["analysis.complete"]

[[handlers]]
name = "review-agent"
path = "uv"
args = ["run", "--project", "./agents/reviewer", "python", "main.py"]
subscribes = ["analysis.complete"]
publishes = ["review.ready"]

[[handlers]]
name = "approval-gate"
path = "./target/release/approval-gate"
subscribes = ["review.ready"]
publishes = ["review.approved", "review.changes-requested"]

[[sinks]]
name = "github-commenter"
path = "uv"
args = ["run", "--project", "./agents/commenter", "python", "main.py"]
subscribes = ["review.approved", "review.changes-requested"]

[[sinks]]
name = "slack-notifier"
path = "deno"
args = ["run", "--allow-env", "--allow-net", "./agents/notifier/main.ts"]
subscribes = ["review.approved"]
```

A GitHub webhook Source emits pull request events. A Rust code-analyzer Handler performs fast static analysis. A Python review-agent Handler runs the LLM inference (because that is where the ML ecosystem lives). A Rust approval-gate Handler applies deterministic policy rules. A Python Sink posts comments back to GitHub. A Deno Sink notifies Slack.

Each agent is an isolated process. Each agent can be written in whatever language makes sense for its job. The entire pipeline's data flow is visible in the configuration.

The constraint is especially powerful here because agentic systems are where observability matters most. When an AI agent makes a decision, you need to trace why: what input triggered it, what context it had, what it decided. Causation chains give you that.
Every agent decision links back through the event store to the event that caused it. When the review-agent publishes `review.ready`, that message carries a causation ID pointing to the `analysis.complete` message that triggered it, which in turn points to the `pr.opened` event from the webhook. You can follow the chain from the Slack notification all the way back to the pull request that started everything.

The polyglot angle lands here too. Your inference Handler runs in Python because that is where the ML ecosystem lives. Your routing logic runs in Rust because it needs to be fast. Your Slack notification Sink runs in TypeScript. Each agent is an isolated process. A hallucinating or failing agent gets caught by supervision and restarted without cascading through the rest of the pipeline.

The alternative to an orchestrator is the same principle that makes ant colonies work: constrain each agent to local decisions (subscribe to what you understand, publish what you produce) and let coordination emerge from the topology. No manager agent. The configuration file is the coordination.

## The Design Question

Every system I have built that worked well shared a property: the right behavior was cheap and the wrong behavior was expensive or impossible. Not through documentation. Not through code review. Through structural constraints that made the system's shape enforce its correctness.

Rust developers already live this philosophy. The borrow checker makes data races impossible. `Option` makes null dereferencing impossible. `Result` makes ignored errors impossible. Emergent applies the same principle one level up: the three-primitive constraint makes invalid topologies impossible. A Sink cannot create feedback loops by publishing. A Source cannot create hidden dependencies by subscribing. The configuration file cannot describe a system that violates these invariants because the type system and the engine both enforce them.
The question that matters in event-driven systems is not "what capabilities should I give my components?" It is "what constraints would cause the right behavior to emerge?"

Three primitives. A TOML file. A binary. Complex behavior from simple composition.

The code is at github.com/Govcraft/emergent.

*What Happens When You Constrain an Event-Driven System to Three Primitives* - Roland Rodriguez