RFC: Hardening bounded contexts and dependency boundaries in Tuist Server

Summary

This RFC proposes a more disciplined internal architecture for the Tuist server monolith. The goal is not to split the server into services, nor to introduce a new global workflow layer, nor to decompose the codebase into a large number of small units. The goal is to make the existing monolith easier to understand and safer to evolve by strengthening the boundaries that already exist in the code.

The server is already organized around recognizable business areas such as Accounts, Projects, Builds, Tests, Bundles, AppBuilds, Billing, Automations, Registry, and Slack. That existing structure is a strong foundation. What is missing is a system of explicit dependency rules, narrower public entrypoints, clearer ownership, and gradual enforcement in continuous integration. This RFC proposes that we keep the monolith, keep the broad contexts, and make the dependency graph between them more intentional over time.

This direction is informed by Shopify’s long-running work on modularizing its Rails monolith. The most useful lesson from Shopify is not that every system should invent an application or workflow layer, but that large monoliths become manageable when they are organized into business components with explicit ownership, declared dependencies, public entrypoints, and tooling that prevents new architectural violations from accumulating.

Motivation

The Tuist server is already much closer to a modular monolith than to an undifferentiated ball of mud. The code under server/lib/tuist is grouped by business capability rather than only by technical layer, and the web layer lives separately under server/lib/tuist_web. This is the right overall shape. The problem is that the boundaries between those areas are still mostly social boundaries rather than enforced ones.

Several current patterns point to the same architectural weakness. The web layer still reaches into internals that should be owned by the business contexts. Controllers call workers directly. LiveViews and helpers preload associations through Tuist.Repo. Templates and controllers depend on domain structs and helper functions that live below a context’s intended public surface. The result is not catastrophic, but it makes the code harder to change because behavior leaks through many seams at once.

Another issue is that the current Boundary setup is too broad to carry much weight on its own. The boundary compiler is enabled and TuistWeb depends on Tuist, but the top-level Tuist boundary exports a large surface that includes not only context facades but also schemas, worker modules, analytics modules, and even Repo. In practice, this means the structure is visible, but not very restrictive.
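
To make the direction concrete, a tightened top-level declaration could look roughly like the following. This is a sketch rather than the current configuration, and the export list is abbreviated for illustration:

```elixir
defmodule Tuist do
  # A sketch, not the current configuration: export only the context
  # facades that other boundaries are meant to call, and stop exporting
  # schemas, worker modules, analytics modules, and Repo.
  use Boundary,
    deps: [],
    exports: [Accounts, Projects, Builds, Tests, Bundles, AppBuilds]
end
```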

This matters because the main pain of a large monolith is usually not that everything lives in one repository or deploys together. The pain comes from hidden dependencies. Once cross-context calls, preloads, and job orchestration can happen from anywhere, the amount of context required to make a safe change rises steadily. Review becomes harder, onboarding becomes slower, and changes in one area start to have surprising consequences elsewhere.

Shopify’s own writing reinforces this point. Their public articles repeatedly describe dense and circular dependency graphs as the real obstacle to making a monolith feel smaller. They also note that simply adding a public interface can become pointless indirection if the dependency graph underneath remains tangled. That is the most relevant lesson for Tuist. We do not need more layers for their own sake. We need a cleaner graph and clearer seams.

Current Architecture

The current server architecture can be understood as having three visible strata. The first is server/lib/tuist_web, which contains Phoenix controllers, LiveViews, plugs, and API-facing schemas. The second is server/lib/tuist, which contains context modules, schemas, workers, and most of the business logic. The third is the infrastructure and runtime substrate, including repositories, external service integrations, telemetry, storage, configuration, and background processing.

This is not a bad structure. In fact, it is already more intentional than many Phoenix applications of similar age. The problem is that the edges between these strata are loose enough that they do not reliably communicate ownership.

At the moment, the interface layer can reach past a context facade and talk to worker modules directly. It can also preload records through Tuist.Repo, which effectively lets the web layer make data-shaping decisions on behalf of any context. Several contexts expose root facades, but the broader boundary configuration also exports their underlying structs and helper modules, which makes it easy for calling code to skip the facade when convenient.

The result is a system that is organized by domain on disk, but not yet fully organized by domain in terms of change control. Files are in roughly the right places. Dependencies are not always flowing through the right doors.

Architecture Inventory

This proposal is based on a lightweight static inventory of the current server/ codebase rather than on an abstract target state. The inventory is intentionally simple. It looks at the current top-level contexts under server/lib/tuist, counts their implementation and test files, approximates cross-context references, and highlights the most visible interface-layer leaks. The numbers do not claim to be a perfect architectural model, but they are good enough to identify where risk is concentrated and where low-risk migration work can begin.

At the time of writing, server/lib/tuist contains 46 top-level context directories. A few of those contexts are clearly much larger or more central than the rest. MCP is the largest by implementation file count, with 44 library files and 7 test files. Tests is also large and relatively well covered, with 24 library files and 13 test files. Marketing has 23 library files but only 2 test files. Accounts has 18 library files and 10 test files. Automations has 10 library files and 7 test files. Builds and ResultBundle each have 9 library files, but only Builds currently has a moderate test surface. This tells us immediately that the server is not flat. Some contexts are clearly broad ownership units, while others are smaller and more bounded.

The dependency picture is even more informative. Looking only at direct static references between top-level contexts, the most heavily depended-on contexts are Accounts, Projects, and Tests, followed by CommandEvents, Ingestion, and Builds. In other words, those areas sit near the center of the current graph. They are exactly the kinds of contexts where an incautious cleanup could create regressions or force broad rewrites. On the other side of the graph, the contexts with the most outgoing dependencies are MCP, Tests, Automations, Accounts, Ops, Slack, Authorization, Builds, ResultBundle, and Projects. This is another sign that some areas act as orchestration or fan-out hubs and should not be the first migration targets.

A few dependency edges stand out as especially strong. MCP depends heavily on Tests, Builds, and CommandEvents. ResultBundle depends heavily on CommandEvents. Tests depends heavily on Ingestion and also reaches into Accounts. Automations depends materially on Tests. These are not necessarily bad relationships in themselves, but they do mean that some areas are already tightly intertwined and therefore riskier to reshape early.

The interface layer confirms the same story. Within server/lib/tuist_web, the heaviest domain-facing references are to Projects, Accounts, Marketing, Builds, Docs, Tests, and AppBuilds. A special case is Utilities, which is referenced very frequently from the web layer but behaves more like shared support code than like a business context, so it should not drive the migration order on its own. What does matter more is that TuistWeb still reaches into internals through several concrete patterns. It calls worker modules directly in the API controllers for cache cleaning, test processing, build processing, and bundle threshold evaluation. It also performs many direct Tuist.Repo.preload calls in controllers and LiveViews, which means the interface layer is still making domain-owned data-shaping decisions in many places. Templates and helpers also reference internals such as Tuist.Projects.Project and Tuist.Accounts.User directly instead of always going through deliberate public contracts.

The inventory also helps identify where the first safe pilot should be. AppBuilds is small, but it currently has no dedicated test files, which makes it a weaker first migration candidate. Cache is even smaller, but it also has very little test surface and a narrow public footprint, which makes it less useful as a proving ground. Alerts is small and has tests, but it has little interface-layer surface, so it would not exercise the main architectural problems that the RFC is trying to solve. Bundles, by contrast, sits in a better middle ground. It has a small implementation surface, a matching set of focused tests, limited but real web-layer exposure, and only moderate coupling to the rest of the graph. That makes it a strong pilot candidate because it is large enough to be representative and small enough to be migrated safely.

From this inventory, the server can be grouped into three practical migration tiers. The first tier contains low-risk contexts that are relatively small, have some test coverage, and are not central dependency hubs. Bundles, Alerts, Billing, Cache, and possibly AppBuilds fit that profile, though AppBuilds likely needs test strengthening before it becomes a good candidate. The second tier contains medium-risk contexts that are still manageable but already more connected or more behaviorally rich, such as Slack, Gradle, Automations, Shards, and parts of Builds. The third tier contains high-risk hubs such as Accounts, Projects, Tests, CommandEvents, Ingestion, and MCP. Those should come last, after the migration pattern, reporting, and CI enforcement are already proven on safer ground.

Proposal

The proposal is to strengthen the server around broad bounded contexts and explicit dependency direction. Rather than introducing a new universal workflow layer, we should make the existing contexts the primary architectural units and make their public surfaces much narrower and more intentional.

Each major business area should be treated as a bounded context with three characteristics. First, it owns its data structures, jobs, and business rules. Second, it exposes a small public surface through one or a few entrypoint modules. Third, it declares which other contexts it is allowed to depend on. This is much closer to Shopify’s component model than to a service-object-heavy approach. Their system centers on components, package dependencies, and public entrypoints, not on a mandatory global layer for orchestration.

In practical terms, the root context modules such as Tuist.Accounts, Tuist.Projects, Tuist.Builds, Tuist.Tests, Tuist.Bundles, and Tuist.AppBuilds should become the primary doors through which other parts of the system interact with those areas. Internal structs, workers, and helper modules should remain inside the context unless there is a very deliberate reason to expose them. The question should stop being “can this code technically reference that module” and start being “is this one of the modules this context intends other areas to use.”
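
As a sketch of what a narrow door can look like, a context facade might expose a handful of deliberate functions while keeping everything else internal. The internal module and function names below are hypothetical:

```elixir
defmodule Tuist.Bundles do
  @moduledoc """
  Public entrypoint for the Bundles context. Everything else under the
  Tuist.Bundles namespace is an internal detail.
  """

  # Hypothetical internal modules, shown only to illustrate the shape.
  alias Tuist.Bundles.{CreateBundle, Queries}

  # One deliberate action per public function, delegated inward.
  def create_bundle(attrs), do: CreateBundle.call(attrs)

  # Callers receive a fully shaped read model and never choose preloads.
  def list_project_bundles(project_id), do: Queries.list_for_project(project_id)
end
```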

This proposal also assumes that broad contexts are still the right level of ownership for Tuist today. We should resist the temptation to over-fragment the server into many tiny internal packages before the current broader seams are healthy. Shopify itself later reflected that package privacy can become too seductive because it feels easy, while the deeper work is really about managing the dependency graph. That lesson applies here. If we introduce too many small units too early, we risk spending energy on graph purity while leaving the higher-value leaks in place.

The intended end state is therefore not a proliferation of miniature architectural concepts. The intended end state is a monolith in which each context is easier to reason about because other parts of the codebase must interact with it through a deliberate surface and along declared dependency lines.

Public Entry Points and Dependency Direction

The central design choice in this RFC is to treat dependency direction as the main architectural constraint. Shopify’s Packwerk work began as a way to declare packages and make their dependencies explicit. That is the spirit this RFC recommends following.

For Tuist, this means that contexts should depend only on a small, documented set of peer contexts. If Projects depends on Accounts, that should be explicit. If Builds needs to collaborate with Projects, that should also be explicit. If two areas form a cycle, we should treat that as architectural debt to be reduced rather than normal system behavior.
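
Because the Boundary compiler is already in place, one possible encoding of those declarations is nested boundaries per context. The dependency lists below are chosen purely for illustration, not as a statement of today’s actual graph:

```elixir
defmodule Tuist.Projects do
  # Projects may call into Accounts and nothing else; Project is its one
  # consciously exported struct module.
  use Boundary, deps: [Tuist.Accounts], exports: [Project]
end

defmodule Tuist.Builds do
  # Builds may collaborate with Projects and Accounts. A Builds <-> Tests
  # cycle would surface as a compile-time error instead of accumulating.
  use Boundary, deps: [Tuist.Projects, Tuist.Accounts], exports: []
end
```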

The interface layer should be allowed to talk to context entrypoints, but it should not bypass those entrypoints to reach into context-owned workers or persistence details. In other words, controllers and LiveViews should ask a context to perform work, not manually reconstruct the internals of how that context works. A controller should not enqueue a worker owned by another area any more than it should reach into a foreign schema and start preloading records through Repo.
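
The following before-and-after sketch shows the rule for a single controller action. The worker module and the context function are hypothetical names, not existing code:

```elixir
defmodule TuistWeb.API.BuildController do
  use TuistWeb, :controller

  # Before (illustrative): the controller knows which worker performs
  # the job and enqueues it itself.
  #
  #   {:ok, _job} =
  #     %{build_id: build_id}
  #     |> Tuist.Builds.Workers.ProcessBuild.new()
  #     |> Oban.insert()

  # After: the controller asks the Builds context to perform the work;
  # whether that enqueues a worker is an internal detail of Builds.
  def create(conn, %{"id" => build_id}) do
    :ok = Tuist.Builds.enqueue_build_processing(build_id)
    send_resp(conn, 202, "")
  end
end
```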

This does not mean every interaction must collapse into a single facade function with a giant argument map. The point is not to hide all structure behind generic commands. The point is to make each context clearly responsible for its own actions and read models. For some simple use cases, that may still mean exposing a small query module or a focused public helper. What should change is that these become explicit public contracts rather than incidental implementation details.

Orchestration and Dependency Direction

Shopify’s writing emphasizes components, ownership, dependency declarations, public APIs, Rails engines, and inversion of control where necessary. It does not present a central workflow layer as the universal answer to cross-component behavior. More importantly, Shopify later warned that public interfaces can become a harmful extra layer of indirection if the underlying dependency graph remains dense. That is a useful caution for Tuist.

For Tuist, a better approach is to let orchestration live in one of three places, depending on the nature of the use case. If the behavior clearly belongs to one context, that context should own it through its public API. This should be the default for synchronous business operations and anything that participates in core invariants. If the behavior is fundamentally reactive and one-way, inversion of control can be used with mechanisms such as PubSub, telemetry events, or background jobs to reverse the dependency direction. If the behavior truly spans multiple contexts in a way that does not belong naturally to any one of them, a small dedicated coordination module may still exist, but it should be introduced as a specific exception, not as the primary architecture pattern for the whole server.

This is a more conservative and, I believe, more Shopify-like position. It preserves room for use-case modules where needed, but it does not turn them into a required intermediate layer for every action in the system.

Inversion of Control and Events

One of the most practical lessons from Shopify is their use of inversion of control to fix bad dependency direction. In their 2020 monolith writeup, they explicitly describe using publish-subscribe mechanisms such as ActiveSupport::Notifications to invert dependencies when straightforward direct calls would point in the wrong direction.

Tuist already has some suitable primitives for this style of design. The system uses background jobs through Oban and also has internal messaging and telemetry mechanisms available. Those tools should be used intentionally when one domain needs to react to another domain’s behavior without introducing a direct dependency that makes the graph denser or more circular. They should not become the default contract for business logic that must complete synchronously or transactionally.

In practice, direct context APIs should remain the default integration mechanism. If Builds needs information or behavior from Projects in order to complete a business operation correctly, a direct and explicit context-level contract is usually the right choice. If a caller needs an immediate answer, if failure must be visible right away, or if correctness depends on the callee having run successfully, an event is usually the wrong abstraction.

Events or jobs are better suited to side effects that are naturally asynchronous, retryable, and eventually consistent. If Builds needs to trigger downstream notification behavior, it may be more appropriate to emit an event or enqueue a domain-owned job than to add a direct dependency on Slack or another integration-focused context. If Projects needs follow-up cleanup behavior that does not determine whether the initiating action succeeds, the cleanup can also be modeled asynchronously rather than assembled manually in the controller layer.
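
A minimal sketch of this inversion, assuming a Phoenix.PubSub instance named Tuist.PubSub; the module, topic, and event names are all illustrative:

```elixir
defmodule Tuist.Builds.Notifier do
  # Builds announces what happened without knowing who listens.
  def build_completed(build_id) do
    Phoenix.PubSub.broadcast(Tuist.PubSub, "builds", {:build_completed, build_id})
  end
end

defmodule Tuist.Slack.BuildListener do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(state) do
    # Slack subscribes to Builds' event, reversing the dependency arrow:
    # Slack -> Builds instead of Builds -> Slack.
    :ok = Phoenix.PubSub.subscribe(Tuist.PubSub, "builds")
    {:ok, state}
  end

  @impl true
  def handle_info({:build_completed, _build_id}, state) do
    # Deliver the Slack notification; a failure here cannot affect Builds.
    {:noreply, state}
  end
end
```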

Even in those cases, the contract should not be loose. GitLab’s public EventStore guidance is helpful here: events are treated as public interfaces, their payloads are validated against explicit schemas, and they are evolved through compatibility-minded multi-rollout changes. That is a good default bar. If Tuist uses events between contexts, they should be named domain events with owned schemas, explicit publishers and subscribers, idempotent handlers, and tests that cover both publication and consumption. The system should also distinguish between side effects that can tolerate eventual consistency and behavior that must complete as part of the main business transaction. If the work has to be part of the main transaction, it should remain a direct call rather than an asynchronous event.
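
A sketch of what that bar could mean for a single event follows. Every module and field name is hypothetical; the point is the owned, validated payload:

```elixir
defmodule Tuist.Builds.Events.BuildCompleted do
  # The payload is an owned, explicit schema: a handler can rely on these
  # keys, and changes to them are treated as public interface changes.
  @enforce_keys [:build_id, :project_id, :completed_at]
  defstruct [:build_id, :project_id, :completed_at]

  def new(%{build_id: build_id, project_id: project_id, completed_at: %DateTime{} = at})
      when is_integer(build_id) and is_integer(project_id) do
    {:ok, %__MODULE__{build_id: build_id, project_id: project_id, completed_at: at}}
  end

  def new(_other), do: {:error, :invalid_payload}
end
```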

This style of architecture does not remove orchestration. It places orchestration where the dependency graph remains healthiest.

Boundary Enforcement

This RFC proposes a smaller and more pragmatic first enforcement target than “full component isolation.” The first task should be to stop the most expensive leaks.

The most important initial rule is that TuistWeb should stop using Tuist.Repo directly for domain-owned data shaping. Once the web layer can decide its own preloads and record composition for any context, it becomes very difficult to reason about where a domain’s contract really lives. Some exceptions may still be needed during migration, but the direction should be clear.
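
As a hedged illustration, a preload that a LiveView performs today could move behind a context-owned read function. The association and function names are assumptions:

```elixir
# Before (illustrative): a LiveView shapes the data itself.
#
#   project = Tuist.Repo.preload(project, [:account])
#
# After: the context owns the read model it is willing to support.
defmodule Tuist.Projects do
  # Hypothetical function name; only this context decides which
  # associations callers may rely on, and Repo stays behind the facade.
  def project_with_account(project) do
    Tuist.Repo.preload(project, [:account])
  end
end
```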

The second rule is that TuistWeb should not call context-owned workers directly. Worker enqueueing is part of a context’s behavior and lifecycle, and it should be exposed through a context entrypoint or another explicit contract owned by that context. A controller should not have to know which worker performs the job, just as it should not have to know how the underlying tables are wired.

The third rule is that cross-context access to structs and helper modules should become more selective. A context should not automatically export all of its types just because other code happens to use them today. Some public data structures may remain necessary, especially around API serialization, but they should be consciously designated rather than passively inherited from the entire namespace being open.

These rules are modest compared with a full package-privacy system, but they attack the most consequential leaks first. They also fit Shopify’s own later conclusion that dependency direction deserves more attention than privacy theater.

Ownership

Shopify’s Packwerk work also highlights the value of package-level ownership metadata. Their 2020 Packwerk article notes that package-specific ownership can make cross-team collaboration much easier than ownership assigned only at a larger domain level.

Tuist should adopt the same spirit. Each broad bounded context should have an explicit owner or owning team, even if that team is small today. At a minimum, that ownership should be reflected in CODEOWNERS and in a small architecture map that lists each context, its public entrypoints, and its declared dependencies.

This matters not because ownership alone solves architecture, but because architecture becomes much easier to maintain when each area has an obvious steward. A clean dependency graph is not just a technical asset. It is also a coordination asset.

Rollout Plan

The rollout should be structured so that each phase is independently executable, independently reviewable, and safe to stop after. No phase should require us to land a large architectural rewrite before the system is back in a stable state. Each phase should ideally fit into one pull request or a very small set of tightly related pull requests. An agent should be able to take one phase as input, complete only that phase, and leave the codebase in a healthy condition without needing work from a later phase to make the result coherent.

The governing principle throughout the rollout should be to prefer visibility before restriction, tests before refactors, one context at a time before multi-context rewrites, and additive enforcement before breaking enforcement. Whenever there is a choice between a faster cleanup and a safer cleanup, this plan prefers the safer path. The phase order below is based on the inventory above and is intentionally designed to start away from the graph’s central hubs.

Phase 1: Add Report-Only Architecture Checks

Goal:

  • Add architecture checks that report current violations without failing the build.
  • Make dependency movement measurable in CI.

Checks to add (a sketch of the first two checks follows this list):

  • Direct TuistWeb -> Repo usage.
  • Direct TuistWeb -> Workers usage.
  • Direct TuistWeb -> context internals that bypass declared entrypoints.
  • Optional: top-level context dependency edge reporting.
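
A minimal report-only sketch of the first two checks, written as a plain Mix task; the paths and regular expressions encode assumptions about the current layout and are a starting point, not a finished tool:

```elixir
defmodule Mix.Tasks.Arch.Report do
  use Mix.Task

  @shortdoc "Reports web-layer boundary leaks without failing the build"

  @checks [
    {"TuistWeb -> Repo", ~r/\bTuist\.Repo\./},
    {"TuistWeb -> Workers", ~r/\bTuist\.\w+\.Workers\./}
  ]

  @impl Mix.Task
  def run(_args) do
    for path <- Path.wildcard("lib/tuist_web/**/*.{ex,heex}"),
        source = File.read!(path),
        {label, pattern} <- @checks,
        Regex.match?(pattern, source) do
      # Report only: print each violating file, never raise.
      Mix.shell().info("#{label}: #{path}")
    end

    :ok
  end
end
```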

Allowed changes:

  • Add CI jobs.
  • Add local commands or Mix tasks.
  • Add report output or generated artifacts.

Not allowed:

  • No enforcement that blocks merges.
  • No production behavior changes.
  • No refactors outside what is strictly needed to wire the checks.

Deliverables:

  • A report-only CI job.
  • A local command that reproduces the same checks when possible.
  • Output that is easy to read and easy to diff over time.

Done when:

  • CI surfaces the targeted patterns reliably.
  • The build stays green unless the new tooling itself is broken.

Phase 2: Select and Prepare the First Low-Risk Pilot

Goal:

  • Select the first pilot from the low-risk tier identified in the inventory.
  • Make that pilot safe to refactor.
  • Define what counts as its public surface.

Selection rule:

  • Choose from the low-risk tier identified in the inventory.
  • Prefer the context with the best balance of real web-layer exposure, focused tests, and limited coupling.
  • Current recommendation: Bundles.

Tasks:

  • Select the pilot context explicitly in the repository or implementation notes.
  • Add or strengthen characterization tests around the selected context (a sketch follows this list).
  • Document or encode the intended public entrypoints for the selected context.
  • List current TuistWeb usages of the selected context’s internals.
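
A characterization test for the pilot could be as simple as the following sketch, which pins current behavior before any refactor. The case template, function, and fixture names are assumptions rather than existing code:

```elixir
defmodule Tuist.BundlesTest do
  use Tuist.DataCase, async: true

  describe "create_bundle/1" do
    test "persists a bundle and returns it" do
      project = project_fixture()

      assert {:ok, bundle} =
               Tuist.Bundles.create_bundle(%{name: "App.ipa", project_id: project.id})

      assert bundle.name == "App.ipa"
      assert bundle.project_id == project.id
    end
  end
end
```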

Allowed changes:

  • Add tests.
  • Add metadata.
  • Add minimal non-behavioral cleanup if needed to make tests possible.

Not allowed:

  • No broad API redesign.
  • No multi-context refactor.
  • No behavior changes unless fixing an already failing or obviously broken test.

Deliverables:

  • A selected pilot context.
  • A protected test surface for that context.
  • A declared public API for that context.
  • A short list of web-layer call sites that later phases must migrate.

Done when:

  • The selected pilot has enough tests to refactor with confidence.
  • The intended entrypoints are explicit.

Phase 3: Migrate the First Low-Risk Pilot

Goal:

  • Route the selected pilot’s usage through its intended public API.
  • Remove that pilot’s boundary leaks from the web layer.

Tasks:

  • Replace direct TuistWeb -> <Pilot>.Workers.* calls with explicit pilot entrypoints.
  • Replace direct TuistWeb -> <Pilot> internals with declared public modules.
  • Keep behavior identical from the caller’s perspective.

Allowed changes:

  • Refactor the selected pilot’s internals.
  • Add concrete public entrypoint functions.
  • Add compatibility adapters if another context is touched indirectly.

Not allowed:

  • No migration of other contexts as part of the same phase.
  • No generic abstraction layer just to hide the old flow.
  • No broad cleanup outside the selected pilot’s call sites.

Deliverables:

  • Pilot call sites in TuistWeb go through declared entrypoints.
  • Tests cover the migrated behavior.

Done when:

  • No known web-layer call site reaches into the selected pilot’s internals.
  • The selected pilot’s test surface stays green.

Phase 4: Protect the First Low-Risk Pilot with Narrow Enforcement

Goal:

  • Prevent regressions in the migrated pilot area without blocking unrelated legacy debt.

Tasks:

  • Turn the pilot-specific checks from report-only into enforcement.
  • Fail only on reintroduced or new violations in the migrated pilot (see the sketch after this list).
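
One possible shape for that enforcement is a baseline file committed to the repository, so that only new violations fail the build. Everything in this sketch, including the task name, baseline path, and leak pattern, is hypothetical:

```elixir
defmodule Mix.Tasks.Arch.EnforcePilot do
  use Mix.Task

  @baseline ".architecture/pilot_baseline.txt"
  @leak ~r/\bTuist\.Bundles\.(Workers|Internal)\./

  @impl Mix.Task
  def run(_args) do
    known = @baseline |> File.read!() |> String.split("\n", trim: true)

    current =
      for path <- Path.wildcard("lib/tuist_web/**/*.{ex,heex}"),
          File.read!(path) =~ @leak,
          do: path

    case current -- known do
      [] -> Mix.shell().info("No new pilot boundary leaks.")
      new -> Mix.raise("New pilot boundary leaks:\n" <> Enum.join(new, "\n"))
    end
  end
end
```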

Allowed changes:

  • Tighten CI for the selected pilot.
  • Add targeted validation around the selected pilot’s entrypoints and call sites.

Not allowed:

  • No repo-wide enforcement.
  • No unrelated cleanup bundled into the same change.

Deliverables:

  • CI that blocks pilot backsliding.
  • Error output that explains what violated the rule.

Done when:

  • A reintroduced pilot boundary leak fails CI.
  • Existing unrelated debt elsewhere does not block merges.

Phase 5: Expand to the Remaining Low-Risk Tier

Goal:

  • Repeat the pilot pattern on the low-risk tier.

Target order:

  • Continue with the remaining low-risk contexts after the first pilot.
  • Current candidates: Alerts, Billing, Cache.
  • AppBuilds only after test strengthening.

Execution rule:

  • Treat each target context as its own sub-phase.
  • Run the same sequence for each one: tests and surface definition, call-site migration, narrow enforcement.

Not allowed:

  • No batching multiple low-risk contexts into one large PR.
  • No skipping the test-prep step for weakly tested contexts.

Done when:

  • At least two additional low-risk contexts are migrated and protected.
  • The migration pattern has been proven repeatable beyond the first pilot.

Phase 6: Expand to the Medium-Risk Tier

Goal:

  • Apply the same pattern to more connected contexts without touching the central hubs yet.

Target candidates:

  • Slack
  • Gradle
  • Automations
  • Shards
  • Selected seams within Builds

Execution rule:

  • One context or one seam per PR.
  • Reuse the same migration pattern as the low-risk tier.
  • Prefer a smaller, behavior-preserving slice over a conceptually complete cleanup.

Not allowed:

  • No full Builds rewrite in a single phase.
  • No simultaneous migration of multiple medium-risk contexts.

Done when:

  • Several medium-risk contexts have been migrated successfully.
  • The approach still works once fan-out and behavior complexity increase.

Phase 7: Tackle High-Coupling Hubs Deliberately

Goal:

  • Start hardening the graph’s central hubs only after the migration pattern is proven elsewhere.

Target candidates:

  • Accounts
  • Projects
  • Tests
  • CommandEvents
  • Ingestion
  • Builds
  • MCP

Execution rule:

  • One seam at a time, rather than a whole context at once, whenever the context is too large to migrate safely in one step.
  • Require tests before each seam change.
  • Add enforcement only after each migrated seam is stable.

Not allowed:

  • No “fix the whole hub” PR.
  • No cross-hub mega-refactor.
  • No new abstraction layer introduced just to make the diff look uniform.

Done when:

  • The main hubs no longer function as uncontrolled shortcuts.
  • Their highest-value seams participate in the same explicit entrypoint and dependency model as the rest of the monolith.

Scope

This RFC is intentionally narrow in its ambition. It is about hardening the internal boundaries of the existing server/ monolith. It is not about changing deployment topology. It is not about forcing the entire server into an umbrella architecture. It is not about creating dozens of tiny internal packages. It is not about introducing a brand new universal application layer. It is about making the current domain-oriented structure more truthful in practice.

That means the expected outcomes are straightforward. We should end up with clearer public entrypoints for contexts, a less permissive top-level export surface, fewer direct leaks from TuistWeb into domain internals, an explicit dependency map, and CI support that helps contributors preserve those properties over time.

Trade-offs

The main advantage of this approach is that it targets the highest-value problems without forcing the codebase into a more elaborate structure than it needs. It accepts the reality that a monolith can remain a single deployable unit and still become much easier to change if the internal dependency graph is made clearer. It also matches the current shape of the server closely enough that adoption can be incremental rather than disruptive.

The main disadvantage is that it does not provide the immediate neatness of a fully formalized packaging system. Some ambiguity will remain while legacy code is being migrated. The proposal also relies on discipline in identifying real public entrypoints, which is harder than creating a generic layer and routing everything through it mechanically. In a few cases, introducing inversion of control may also make runtime behavior less obvious at first glance, even while improving architectural direction overall.

Still, these trade-offs appear preferable to the alternatives. A global workflow layer would risk adding new indirection before the core dependency problems are solved. A highly fragmented packaging strategy would raise the cognitive cost of contributing before the broader seams have been made reliable. And doing nothing would preserve the convenience of the current system at the cost of continued architectural erosion.

Alternatives Considered

One alternative is to keep the current context-only structure and continue relying on review discipline. This is appealing because it avoids structural churn, but it does not address the fact that the current export surface and interface-layer leaks make architectural regression easy. The current boundaries are not holding on their own.

Another alternative is to adopt a formal workflow layer as the main architectural move. That approach has some merit, especially in systems with many cross-domain user flows, but it is not the most faithful reading of Shopify’s path. Their emphasis is on component boundaries, dependency graphs, ownership, and inversion of control. More importantly, a workflow layer could become exactly the kind of extra indirection Shopify later warned about if it is introduced before the dependency graph itself is improved.

A third alternative is to pursue a much more aggressive internal package decomposition from the start. This would be closer to a full Packwerk-style ambition, with many small architectural units and strict privacy rules. The problem is that Shopify’s own retrospective warns that privacy enforcement can distract from the more important work of understanding and improving dependency relationships. For Tuist, it would be better to first make the broad contexts real and trustworthy before deciding whether some of them should later be split into smaller internal packages.

Open Questions

  • How should public entrypoints be designated: root context modules by default, explicit allowlist, or both?
  • Which read-oriented helper modules, if any, should be public without exposing persistence details?
  • What should the first enforcement mechanism be: custom Mix task, lightweight static check, tighter Boundary configuration, or a combination?
  • Where should the repository store the architecture map and migration tier metadata?
  • How should small cross-context coordination modules be named and placed when they are truly necessary?

References