Server-Side xcactivitylog Processing

Summary

Move xcactivitylog parsing from the CLI (client-side) to a dedicated server-side processing pipeline. Instead of parsing logs locally and uploading structured data, the CLI would upload raw .xcactivitylog files to S3, and a server-side service would parse and analyze them asynchronously.

Motivation

We’ve received multiple complaints from large-scale users about tuist inspect build being too slow in CI environments. Client-side parsing of xcactivitylogs can take up to 45 seconds on large builds — up to 5–7.5% of a CI job that runs 10–15 minutes. This has led some teams to disable build metrics collection in CI entirely.

The current client-side approach has served us well, but we’re hitting its limits:

  1. Performance overhead on client machines. Parsing is CPU-intensive and blocks CI pipelines. As we add more analytics, this will only get worse.
  2. Debugging requires user cooperation. When issues arise, we need to ask users to share their .xcactivitylog files. With server-side storage, we’d have direct access for debugging.
  3. Bug fixes require CLI upgrades. Any fix to parsing logic requires users to upgrade the CLI, creating friction. Server-side fixes deploy instantly.
  4. Constant performance pressure. Every new metric or analysis we add to the CLI increases parse time, making it a constant fight to keep the command fast enough.

Current Architecture

Today, tuist inspect build:

  1. Locates the most recent .xcactivitylog in derived data
  2. Parses it locally using XCLogParser (TuistXCActivityLog/XCActivityLogController.swift)
  3. Extracts build targets, files, issues, cache operations, CAS outputs, duration, and category
  4. Collects environment metadata (git info, CI info, Xcode version, machine info, custom tags)
  5. Uploads the structured result to the server via POST /builds

Similar flows exist for tuist inspect test (parsing .xcresult bundles) and tuist inspect bundle (analyzing app bundles).

Proposed Architecture

CLI Changes

The CLI would:

  1. Locate the .xcactivitylog (same as today)
  2. Collect lightweight metadata: git info, CI info, Xcode version, machine info, custom tags/values
  3. Zip the .xcactivitylog together with the CAS metadata directories (see CAS Metadata below) into a single archive
  4. Upload the archive to S3 via a presigned URL
  5. POST metadata + S3 reference to the server
  6. Return immediately — no local parsing

A --mode local|remote flag would control whether parsing happens client-side or server-side. The default is remote only when the CLI is connected to a known Tuist-hosted server — for self-hosted servers, the default remains local so processing stays on the client and self-hosting teams don’t need to run the additional processing infrastructure.

Since .xcactivitylog files are gzip-compressed, they typically range from a few KB to single-digit MBs even for large projects. A simple presigned S3 PUT (which supports up to 5GB) is sufficient — we’ll skip multipart upload initially and revisit if we encounter files large enough to warrant it.

This reduces the CLI’s job to file upload + metadata collection, which should complete in seconds rather than tens of seconds.

Server Changes

New upload endpoint:

  • POST /accounts/{account_handle}/projects/{project_handle}/builds/upload — returns a presigned S3 URL for the xcactivitylog upload and creates a build record in “processing” state
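A minimal sketch of what that endpoint could look like, assuming a Phoenix controller and ExAws for presigning; the module name, context functions, and bucket name are illustrative, not the final API:

defmodule TuistWeb.API.BuildUploadController do
  use TuistWeb, :controller

  # POST /accounts/:account_handle/projects/:project_handle/builds/upload
  def create(conn, _params) do
    project = conn.assigns.selected_project

    # Create the build record up front in "processing" state so the
    # dashboard can render it before any analytics exist.
    {:ok, build} = Tuist.Builds.create_build(project, status: :processing)

    # Presign a PUT for the zipped xcactivitylog + CAS metadata archive.
    {:ok, upload_url} =
      :s3
      |> ExAws.Config.new()
      |> ExAws.S3.presigned_url(:put, "tuist-build-logs", "#{build.id}/archive.zip",
        expires_in: 3_600
      )

    json(conn, %{build_id: build.id, upload_url: upload_url})
  end
end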

Processing pipeline (Oban worker):

When the upload endpoint receives metadata and confirms the S3 upload, it enqueues a ProcessBuildWorker Oban job:

  1. ProcessBuildWorker.perform/1 generates a short-lived, scoped token for the specific project/build
  2. SSHs into the Hetzner processing machine (using the existing Tuist.SSHClient abstraction, following the TestWorker pattern from QA)
  3. Runs tuist inspect build --mode local with the token and server URL — since tuist runs on Linux, we reuse the exact same parsing and upload logic with zero code duplication
  4. The CLI downloads the archive from S3, unpacks it, parses it, and uploads results to the server — the same flow as when a user runs tuist inspect build locally, just happening on the processing machine with the scoped token
  5. The raw archive remains in S3 with a 7–30 day retention policy for debugging

The worker would use max_attempts: 3 for retries on transient failures, and unique constraints keyed on the build ID to prevent duplicate processing.
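A sketch of that worker, assuming Tuist.SSHClient exposes a run-command helper; the helper names and any CLI flags beyond --mode are assumptions for illustration:

defmodule Tuist.Builds.ProcessBuildWorker do
  use Oban.Worker,
    queue: :build_processing,
    max_attempts: 3,
    # Dedupe on the build ID so the same build is never processed twice.
    unique: [keys: [:build_id]]

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"build_id" => build_id}}) do
    build = Tuist.Builds.get_build!(build_id)

    # Short-lived token scoped to this project/build (see Authentication).
    token = Tuist.Authentication.issue_scoped_token(build, ttl: :timer.minutes(15))

    # Run the pinned CLI on the processing machine; it downloads the archive
    # from S3, parses it, and uploads results back with the scoped token.
    Tuist.SSHClient.run(
      processing_host(),
      "tuist inspect build --mode local --url #{server_url()} --token #{token}"
    )
  end

  defp processing_host, do: Application.fetch_env!(:tuist, :processing_host)
  defp server_url, do: TuistWeb.Endpoint.url()
end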

Dashboard changes:

  • Show a “Processing your build…” state for builds that haven’t been analyzed yet
  • Once processing completes, display the full analytics as today

Processing Machine

A dedicated Hetzner auction server with strong single-thread CPU performance and ample RAM — xcactivitylog parsing is CPU-bound.

Unlike the QA TestWorker pattern, which creates and destroys ephemeral Namespace instances, the processing machine is long-lived — the Oban worker simply SSHs in, runs the command, and reads the output. Multiple jobs can run concurrently on the same machine since each operates on its own unpacked archive in a temporary directory.

If a single machine becomes a bottleneck, we can add more machines behind a simple round-robin or least-loaded selection in the worker.
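For illustration, selection could be as simple as a lock-free round-robin over a configured host list (the module and config keys are hypothetical):

defmodule Tuist.Builds.ProcessingPool do
  # Rotate through the configured processing hosts using an :atomics
  # counter stored in :persistent_term, so selection needs no process state.
  def next_host do
    hosts = Application.fetch_env!(:tuist, :processing_hosts)
    index = :atomics.add_get(counter(), 1, 1)
    Enum.at(hosts, rem(index, length(hosts)))
  end

  defp counter do
    with nil <- :persistent_term.get(__MODULE__, nil) do
      ref = :atomics.new(1, signed: false)
      :persistent_term.put(__MODULE__, ref)
      ref
    end
  end
end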

CLI Version Management

The processing machine is configured with NixOS, following the same pattern as the cache nodes in cache/platform/. Since tuist already publishes fully static musl-based Linux binaries on GitHub Releases (for both x86_64 and aarch64), we’d write a simple Nix derivation that fetches the tarball and pins it by version + hash. Bumping the version is just updating those two values in the derivation.

The CLI release workflow would be extended to also bump the tuist version in the processing machine’s Nix configuration, similar to how it already triggers the Homebrew formula update. This ensures the processing machine always runs a known, tested version of the parsing logic and avoids drift.

Authentication

The processing machine needs to write build analytics for arbitrary projects. The Oban worker handles this by generating a short-lived, scoped token for each job and passing it to the Hetzner machine as part of the command. The flow is:

  1. The Oban worker creates a short-lived token scoped to the specific project and build
  2. The Oban worker SSHs into the Hetzner machine and runs tuist inspect build --mode local with the token and server URL
  3. The CLI parses the archive and uploads the results — the exact same flow as when a user runs tuist inspect build locally, just authenticated with the scoped token

This keeps the Oban worker lightweight — it only orchestrates the job and generates credentials — while the processing machine handles both parsing and uploading using the standard CLI flow. The short-lived, narrowly scoped tokens ensure the processing machine never has broad access: each token is only valid for a single project/build and expires shortly after issuance.
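One way to mint such a token is Phoenix.Token, which gives us signing and expiry for free; this is a sketch under that assumption, and the final implementation could equally reuse our existing token infrastructure:

defmodule Tuist.Authentication.ScopedTokens do
  @salt "build-processing"
  # A token is only useful for one build, so keep the TTL tight.
  @max_age_seconds 15 * 60

  def issue(project_id, build_id) do
    Phoenix.Token.sign(TuistWeb.Endpoint, @salt, %{
      project_id: project_id,
      build_id: build_id,
      scope: "build:write"
    })
  end

  # The builds endpoint verifies the token and checks that the claimed
  # project/build matches the resource being written.
  def verify(token) do
    Phoenix.Token.verify(TuistWeb.Endpoint, @salt, token, max_age: @max_age_seconds)
  end
end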

CAS Metadata

Build analytics depend heavily on CAS (Content Addressable Storage) metadata that Xcode’s caching system writes to disk during compilation. This data lives outside the .xcactivitylog — in ~/.local/state/tuist/ (or $XDG_STATE_HOME/tuist/) — and is required for cache efficiency metrics, artifact size distributions, and download/upload performance analysis.

Directory structure:

~/.local/state/tuist/
├── nodes/          # Node ID → checksum mappings (~64 bytes each)
├── cas/            # Checksum → size/duration/compressedSize JSON (~200 bytes each)
└── keyvalue/       # Cache key operation timings
    ├── read/
    └── write/

A small-to-medium build might have 30–100 CAS outputs (~8KB of metadata), but large projects can have up to 15k cacheable tasks and 40k CAS outputs. At ~264 bytes per output (node mapping + metadata JSON), that’s roughly 10MB uncompressed — still modest, and it compresses well since the files are small repetitive JSON.
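Back-of-envelope for that worst case:

40,000 outputs × 264 bytes/output = 10,560,000 bytes ≈ 10.6 MB uncompressed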

Upload approach: The CLI bundles the .xcactivitylog and the relevant nodes/, cas/, and keyvalue/ directories into a single zip archive, then uploads it to S3 via a presigned URL. The processing service unpacks the archive and runs the analysis with the full context available — exactly as the CLI does today locally.
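For illustration, the archive would mirror the on-disk layout (the log’s file name inside the archive is arbitrary):

archive.zip
├── build.xcactivitylog
├── nodes/
├── cas/
└── keyvalue/
    ├── read/
    └── write/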

This avoids any changes to how CAS metadata is structured or read, and the processing service can reuse the same CASNodeStore / CASOutputMetadataStore / KeyValueMetadataStore code paths that exist in the CLI.

Backward Compatibility

The existing client-side parsing mode would remain available:

  • Local development: Users who want instant feedback can continue using client-side parsing
  • On-premise deployments: Self-hosted users may prefer client-side parsing to avoid the additional infrastructure
  • A flag or configuration option (--mode local|remote) could control the behavior, with remote as the default for projects set up against the Tuist-hosted server.

Scope

In scope (Phase 1): tuist inspect build

Build log parsing is the most impactful case — it’s where users are hitting performance issues today.

Future phases: tuist inspect test and tuist inspect bundle

  • tuist inspect test parses .xcresult bundles, which depend on Xcode command-line tools (xcresulttool). This makes server-side processing harder since we’d need macOS or a compatible toolchain on the server.
  • tuist inspect bundle analyzes .app/.ipa/.xcarchive/.aab/.apk files. Some analysis (e.g., parsing .xcassets) depends on tools that may not be available on Linux.

These commands haven’t had performance complaints yet, but the same architecture could be extended to them if Linux-compatible parsing is feasible.

Trade-offs

Advantages

  • Eliminates client-side performance overhead. CI jobs are no longer blocked by log parsing.
  • Debugging access. Raw logs stored in S3 — no need to ask users for files.
  • Instant bug fixes. Server-side parsing fixes deploy without CLI upgrades.
  • Decouples analytics evolution from CLI releases. New metrics can be added server-side without client changes.
  • Scales independently. Processing capacity can be scaled without affecting CLI or main server.

Disadvantages

  • Increased infrastructure cost. A dedicated processing machine (though Hetzner auction servers are cost-effective).
  • Delayed results. Builds won’t have analytics immediately — requires a “processing” state in the dashboard.
  • More data leaves the client. Raw .xcactivitylog files must be uploaded, which may be large. Some users may have concerns about uploading raw build logs.
  • Additional infrastructure complexity. A new service to deploy, monitor, and maintain.
  • On-premise complexity. Self-hosted users would need to run the processing service or stick with client-side parsing.

Open Questions

  1. Remote execution mechanism: The current proposal uses SSH (reusing the existing SSHClient from QA) for the Oban worker to trigger processing on the Hetzner machine. Alternatives worth considering: a lightweight HTTP API on the processing machine, a shared Oban queue (the processing machine running its own Oban consumer against the same database), or a container job. Is SSH the right trade-off between simplicity and robustness, or would one of these alternatives scale better?

References

  • XCLogParser — compiles and runs on Linux
  • XCMetrics — prior art for server-side xcactivitylog processing
  • Current CLI implementation: cli/Sources/TuistXCActivityLog/XCActivityLogController.swift
  • Current upload service: cli/Sources/TuistServer/Services/CreateBuildService.swift

How is the upload attributed to the build? I was wondering if it’s a good time to embrace Stripe’s API pattern for file uploads, where you have endpoints dedicated to uploads that take a purpose attribute. In this case, it could be xcactivity_log, and the ID could then be passed by the client to connect the build with the activity log upload.

Have you compared Hetzner’s approach with Daytona from a pricing perspective? The way we own the machine comes with better cost control, but we need better monitoring and a multi-node pool. Considering we’ll eventually have environments ourselves :), it might not be a bad idea to introduce the interface of sandbox_provider, one of which is Daytona, and down the line, your own account’s pool of environments, which we’ll manage.

How do you attribute the CAS metadata to the build you’re processing the data for, if we persist everything into ~/.local/state/tuist/, including state from previous builds?

I have mixed feelings about the system and the release complexity that comes with it, but I’m also fine with testing it out and iterating.

One pattern I’ve tinkered with, and that I believe makes systems like this much simpler, is the combination of ephemeral environments, which are getting commoditized and whose prices are going down, with Deno. Instead of having to deploy CLI updates on release and scale the pool as our runtime demands increase, we can have an environment that spawns a Deno process, which resolves the ES module graph via the HTTP transport (the server serves the JS modules).

For on-prem customers, we can ship the Docker image with Deno in it and mount a volume for downloads and processing, all on the same machine. With this approach:

  • We don’t need to figure out scale (Daytona and in the future us will do it)
  • We don’t need to figure out how to release the CLI into environments
  • We don’t need to exclude on-premise from our design

The one caveat is that we need to move the logic to another language and introduce a runtime, but I’m not strongly against that.

We can do that; I’m not super opinionated about this one. The /upload endpoint follows our existing conventions, and the build_id would be passed through the body, so in a lot of ways it’s similar to what you’re proposing. I’m not sure I’m fully on board with having a single endpoint for all file uploads; I can see that becoming messy. But if we do need to make multipart upload work, it would make some things easier.

Here’s what Claude thinks:

  • Daytona: 20k builds (taken from last 24 hours) × 1 min = ~333 hours/day × $0.08/hour = ~$27/day = ~$800/month (and growing)
  • Hetzner: A beefy auction server at ~€40-50/month can easily handle many concurrent parsing jobs. Even if we need 2-3 machines, that’s ~€150/month flat, regardless of volume

The 1 min/build is pessimistic, although in a sandbox there’s also a setup and teardown step that wouldn’t be immediate. Regardless, I don’t think you can beat Hetzner on price, and we already have the pieces we need from the cache nodes. It feels wasteful to use sandboxes for something that doesn’t benefit much from them, since we’d be repeatedly running a single command predefined by us. Curious to get @cschmatzler’s take on this, since they have more experience dealing with Hetzner and our caching nodes.

But I think I’d start with Hetzner, and if the maintenance turns out to be a pain, we can always pivot, rather than starting with an option we know will always be more expensive.

You can’t associate the CAS data with builds without pre-processing the .xcactivitylog, mostly defeating the purpose of all of this. I think the answer will have to be pruning the CAS metadata often enough that we don’t upload too much unrelated data. Also, compressing the CAS files will be quite efficient given that their content repeats quite a bit.

I don’t think I follow how Deno makes things much better. In the sandbox environment, you can put anything in the Docker image already prebuilt; why take a dependency on a specific runtime? CLI releases (and node releases, for that matter) are all automated anyway.

Additionally, I feel quite strongly that we don’t want to maintain XCLogParser in a new language unless we have a really good reason. And since the library builds on Linux, I don’t think we do.

We’re not excluding on-premise from our design. The same way you can already self-host caching nodes, you could self-host processing nodes. And I’d argue that running a single machine that you point to is a simpler approach than the sandbox one. And there’s always --mode local for companies where the current tuist inspect build performance is fine.

Let’s stick to what we have for now.

With those numbers, the maths are clear, so let’s go with VPSs.

My point was about whether we can simplify the system so that on-prem doesn’t feel excluded by its complexity. Our cache nodes are a different story: low latency is one of their most critical traits, so we need to host the node close to the compute. But here, what they need is compute, which, coincidentally, is what the machine that runs the web server has. The two sources of complexity that I’d look into simplifying for them are:

  1. Can we run this system in the same environment where they are deploying the app? I don’t see why we can’t do that, with the option to outsource to a pool of machines in case they need to scale (our case); they can go a long way just by scaling the number and speed of the cores in that server.
  2. Can we design the system such that the version of the server (i.e., commit) is bound to the version of the business logic for parsing, processing, and upload, so that if I’m self-hosting, I don’t need additional deployment automation?

If we extract the inspection logic into a small executable, include it in the Docker image, and consume it from the VPS servers (or the web app instance), we have a model that works for us (we can scale it), for our on-premise users (low complexity), and we don’t need to put them in the position of “if you don’t feel like hosting this, trade pipeline speed for --mode local.”

Also, they wouldn’t be able to avoid the extra node if the inspection requires macOS, where I might understand having to decide between “local parsing” and “hosting a macOS node”, but I’d avoid that for Linux if possible.

Note on Deno: I missed the need for the native library. Deno can pull the program from the server as an ES module graph, so we could skip building and bundling the executable when building the server image. But since we need the Swift library, this idea falls apart.

Have you compared Hetzner’s approach with Daytona from a pricing perspective? The way we own the machine comes with better cost control, but we need better monitoring and a multi-node pool.

I think sandboxes mostly make sense when running untrusted code, which isn’t the case here, so the engineering and compute overhead of sandboxing doesn’t actually bring any value. Agree on the monitoring part of owning a virtual/dedicated server, but we’ve already built quite a good pipeline for that for the cache nodes, and assuming we’ll use the same deployment mechanism for the… processing nodes (this is now the official working name), it’s mostly copy and paste while adjusting which metrics we collect.

This is something we’re already getting better at operationally (I adjusted a bunch of alerts this week because things broke) so we can reuse that knowledge.


  1. Remote execution mechanism: The current proposal uses SSH (reusing the existing SSHClient from QA) for the Oban worker to trigger processing on the Hetzner machine. Alternatives worth considering: a lightweight HTTP API on the processing machine, a shared Oban queue (the processing machine running its own Oban consumer against the same database), or a container job. Is SSH the right trade-off between simplicity and robustness, or would one of these alternatives scale better?

I think SSH is actually the wrong choice here. It’s stateful, breaks during network blips or deployments, and from what I can see, we don’t really have anything in the process that would benefit from statefulness. Maybe progress reporting, but there are other ways to do that.

I think I see two possible architectures here:

  1. Processing node polls the S3 bucket for unprocessed uploads and writes processed data directly to Postgres/ClickHouse. No communication at all between server and processing nodes. This has the benefit of completely decoupling the two services; processed data shows up on the dashboard through LiveView without the server having active knowledge of it. It also has the downside of completely decoupling the two pieces: the CLI uploads small initial data to the server and the artifact to S3, and things need to reconcile nicely from both sides.
  2. Server tells processing node to process something, processing node pings back “I’m done!”; either through webhooks or message passing through PG2. This has the big upside of everything being ordered and not dealing with weird out-of-order database writes, and the downside of things being less parallelised and having to deal with networking and a bigger API surface.

Both are significantly more robust than SSH, in my view.


  • Instant bug fixes. Server-side parsing fixes deploy without CLI upgrades.

This is actually one of the most underrated points here 🙃: being able to add support for new Xcode weirdness without asking people to upgrade 40 CLI versions with a bunch of other unrelated changes is huge.

Yes to both. We can build a library/executable dedicated to this that will be part of the Docker image, instead of relying on the released CLI version, and then we should be able to execute it with Swift NIFs instead of shelling out. I agree that depending on the released CLI is not a good idea; thanks for pointing that out.

And since the processing node will be an Elixir node (another reason why sandboxes are not a good idea for this), the server instance could share the same code and, instead of delegating the work to a separate node, run things itself.

I agree SSH is not the right fit and was definitely the piece I was most unsure about when writing the RFC originally.

I think I’m for doing a combination of these two options. The server delegates work to processing nodes through Oban, so we have direct visibility into the work from the server, including retries.

But the processing nodes would then write directly to ClickHouse, so we don’t have to pass around potentially large payloads; that has worked fine so far, but it can become a problem long-term as we track more build data. It does mean the code for writing builds would need to move to the processing node in this scenario, but since we’re in a monorepo, we can share the code between the two to ensure we don’t break on-prem setups.
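To make that shape concrete, a sketch of the worker running on a processing node; the storage, parsing, and ClickHouse modules are placeholders for the shared monorepo code:

defmodule Tuist.Processing.ProcessBuildWorker do
  # Runs on the processing node, which consumes the shared Oban queue backed
  # by the same Postgres database as the server.
  use Oban.Worker, queue: :build_processing, max_attempts: 3

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"build_id" => build_id, "archive_key" => key}}) do
    with {:ok, archive_path} <- Tuist.Processing.Storage.download(key),
         {:ok, parsed} <- Tuist.Processing.Inspector.parse(archive_path) do
      # Write the analytics straight to ClickHouse instead of shipping a
      # large payload back through the server.
      Tuist.Processing.Analytics.insert_build(build_id, parsed)
    end
  end
end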