Summary
Move xcactivitylog parsing from the CLI (client-side) to a dedicated server-side processing pipeline. Instead of parsing logs locally and uploading structured data, the CLI would upload raw .xcactivitylog files to S3, and a server-side service would parse and analyze them asynchronously.
Motivation
We’ve received multiple complaints from large-scale users about tuist inspect build being too slow in CI environments. Client-side parsing of xcactivitylogs can take up to 45 seconds on large builds — a 3–5% overhead on CI jobs that run 10–15 minutes. This has led some teams to disable build metrics collection in CI entirely.
The current client-side approach has served us well, but we’re hitting its limits:
- Performance overhead on client machines. Parsing is CPU-intensive and blocks CI pipelines. As we add more analytics, this will only get worse.
- Debugging requires user cooperation. When issues arise, we need to ask users to share their .xcactivitylog files. With server-side storage, we'd have direct access for debugging.
- Bug fixes require CLI upgrades. Any fix to parsing logic requires users to upgrade the CLI, creating friction. Server-side fixes deploy instantly.
- Constant performance pressure. Every new metric or analysis we add to the CLI increases parse time, making it a constant fight to keep the command fast enough.
Current Architecture
Today, tuist inspect build:
- Locates the most recent .xcactivitylog in derived data
- Parses it locally using XCLogParser (TuistXCActivityLog/XCActivityLogController.swift)
- Extracts build targets, files, issues, cache operations, CAS outputs, duration, and category
- Collects environment metadata (git info, CI info, Xcode version, machine info, custom tags)
- Uploads the structured result to the server via POST /builds
Similar flows exist for tuist inspect test (parsing .xcresult bundles) and tuist inspect bundle (analyzing app bundles).
Proposed Architecture
CLI Changes
The CLI would:
- Locate the .xcactivitylog (same as today)
- Collect lightweight metadata: git info, CI info, Xcode version, machine info, custom tags/values
- Zip the .xcactivitylog together with the CAS metadata directories (see CAS Metadata below) into a single archive
- Upload the archive to S3 via a presigned URL
- POST metadata + S3 reference to the server
- Return immediately — no local parsing
A --mode local|remote flag would control whether parsing happens client-side or server-side. The default is remote only when the CLI is connected to a known Tuist-hosted server — for self-hosted servers, the default remains local so processing stays on the client and self-hosting teams don’t need to run the additional processing infrastructure.
Since .xcactivitylog files are gzip-compressed, they typically range from a few KB to single-digit MBs even for large projects. A simple presigned S3 PUT (which supports up to 5GB) is sufficient — we’ll skip multipart upload initially and revisit if we encounter files large enough to warrant it.
This reduces the CLI’s job to file upload + metadata collection, which should complete in seconds rather than tens of seconds.
Server Changes
New upload endpoint:
POST /accounts/{account_handle}/projects/{project_handle}/builds/upload — creates a build record in a “processing” state and returns a presigned S3 URL for the archive upload
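For illustration, the handler could be as small as the following sketch, assuming a Phoenix controller and ExAws for presigned URL generation. The module names, context function, bucket, and response shape are placeholders rather than the actual implementation:

```elixir
# Illustrative only: module names, the context function, and the bucket/key
# layout are assumptions, not the real Tuist server code.
defmodule TuistWeb.API.BuildUploadController do
  use TuistWeb, :controller

  def create(conn, %{"project_handle" => project_handle} = params) do
    # Create the build up front in a "processing" state so the dashboard can
    # show it immediately, before any analytics exist.
    {:ok, build} = Tuist.Builds.create_pending_build(project_handle, params)

    # Presigned PUT URL the CLI uses to upload the zipped archive straight to S3.
    key = "builds/#{build.id}/archive.zip"

    {:ok, upload_url} =
      :s3
      |> ExAws.Config.new()
      |> ExAws.S3.presigned_url(:put, "tuist-build-archives", key, expires_in: 3600)

    json(conn, %{build_id: build.id, upload_url: upload_url})
  end
end
```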
Processing pipeline (Oban worker):
When the upload endpoint receives metadata and confirms the S3 upload, it enqueues a ProcessBuildWorker Oban job:
- ProcessBuildWorker.perform/1 generates a short-lived, scoped token for the specific project/build
- SSHs into the Hetzner processing machine (using the existing Tuist.SSHClient abstraction, following the TestWorker pattern from QA)
- Runs tuist inspect build --mode local with the token and server URL — since tuist runs on Linux, we reuse the exact same parsing and upload logic with zero code duplication
- The CLI downloads the archive from S3, unpacks it, parses it, and uploads results to the server — the same flow as when a user runs tuist inspect build locally, just happening on the processing machine with the scoped token
- The raw archive remains in S3 with a 7–30 day retention policy for debugging
The worker would use max_attempts: 3 for retries on transient failures, and unique constraints keyed on the build ID to prevent duplicate processing.
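A sketch of what this worker could look like follows. The token helper, config keys, Tuist.SSHClient call shape, and CLI flag names are illustrative assumptions, not a final API:

```elixir
# Sketch only: the SSH call shape, token helper, and CLI flags are assumptions.
defmodule Tuist.Builds.ProcessBuildWorker do
  use Oban.Worker,
    queue: :build_processing,
    max_attempts: 3,
    # Keyed on the build ID so the same build is never processed twice.
    unique: [keys: [:build_id]]

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"build_id" => build_id}}) do
    # Short-lived credential valid only for this project/build (hypothetical helper).
    {:ok, token} = Tuist.Builds.create_scoped_token(build_id)

    host = Application.fetch_env!(:tuist, :processing_machine_host)
    server_url = Application.fetch_env!(:tuist, :server_url)

    # Reuse the CLI's local parsing path on the Linux processing machine. The flag
    # names below are placeholders for however the CLI ends up accepting the
    # build reference, token, and server URL.
    command =
      "tuist inspect build --mode local --build-id #{build_id} " <>
        "--token #{token} --server-url #{server_url}"

    case Tuist.SSHClient.run(host, command) do
      {:ok, _output} -> :ok
      # Anything other than :ok makes Oban retry, up to max_attempts.
      {:error, reason} -> {:error, reason}
    end
  end
end
```

Once the S3 upload is confirmed, the endpoint (or a small context function) would enqueue the job with %{build_id: build.id} |> ProcessBuildWorker.new() |> Oban.insert().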
Dashboard changes:
- Show a “Processing your build…” state for builds that haven’t been analyzed yet
- Once processing completes, display the full analytics as today
Processing Machine
A dedicated Hetzner server auction machine with strong single-thread CPU performance and ample RAM — xcactivitylog parsing is CPU-bound.
Unlike the QA TestWorker pattern which creates and destroys ephemeral Namespace instances, the processing machine is long-lived — the Oban worker simply SSHs in, runs the command, and reads the output. Multiple jobs can run concurrently on the same machine since each operates on its own unpacked archive in a temporary directory.
If a single machine becomes a bottleneck, we can add more machines behind a simple round-robin or least-loaded selection in the worker.
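If we get there, host selection inside the worker can stay trivial. For example, a stateless hash-based spread over a configured host list (a simpler stand-in for true round-robin or least-loaded selection, sketched with a placeholder config key):

```elixir
# Inside the worker: pick a host deterministically from a configured list.
# :processing_machine_hosts is a placeholder config key.
defp pick_processing_host(build_id) do
  hosts = Application.fetch_env!(:tuist, :processing_machine_hosts)
  Enum.at(hosts, :erlang.phash2(build_id, length(hosts)))
end
```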
CLI Version Management
The processing machine is configured with NixOS, following the same pattern as the cache nodes in cache/platform/. Since tuist already publishes fully static musl-based Linux binaries on GitHub Releases (for both x86_64 and aarch64), we’d write a simple Nix derivation that fetches the tarball and pins it by version + hash. Bumping the version is just updating those two values in the derivation.
The CLI release workflow would be extended to also bump the tuist version in the processing machine’s Nix configuration, similar to how it already triggers the Homebrew formula update. This ensures the processing machine always runs a known, tested version of the parsing logic and avoids drift.
Authentication
The processing machine needs to write build analytics for arbitrary projects. The Oban worker handles this by generating a short-lived, scoped token for each job and passing it to the Hetzner machine as part of the command. The flow is:
- The Oban worker creates a short-lived token scoped to the specific project and build
- The Oban worker SSHs into the Hetzner machine and runs tuist inspect build --mode local with the token and server URL
- The CLI parses the archive and uploads the results — the exact same flow as when a user runs tuist inspect build locally, just authenticated with the scoped token
This keeps the Oban worker lightweight — it only orchestrates the job and generates credentials — while the processing machine handles both parsing and uploading using the standard CLI flow. The short-lived, narrowly scoped tokens ensure the processing machine never has broad access: each token is only valid for a single project/build and expires shortly after issuance.
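As an illustration of the token shape, using Phoenix.Token purely as an example mechanism (the real server may use a different scheme; TuistWeb.Endpoint and the salt are placeholder names), scope and expiry could be enforced like this:

```elixir
# Illustration only: a signed, short-lived token carrying the exact scope it is
# valid for. TuistWeb.Endpoint and the "build-processing" salt are placeholders.
token =
  Phoenix.Token.sign(TuistWeb.Endpoint, "build-processing", %{
    project_id: build.project_id,
    build_id: build.id
  })

# On every request from the processing machine, verify the signature, reject
# tokens older than 30 minutes, and check that the request targets exactly the
# project/build encoded in the token.
{:ok, %{project_id: _project_id, build_id: _build_id}} =
  Phoenix.Token.verify(TuistWeb.Endpoint, "build-processing", token, max_age: 1800)
```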
CAS Metadata
Build analytics depend heavily on CAS (Content Addressable Storage) metadata that Xcode’s caching system writes to disk during compilation. This data lives outside the .xcactivitylog — in ~/.local/state/tuist/ (or $XDG_STATE_HOME/tuist/) — and is required for cache efficiency metrics, artifact size distributions, and download/upload performance analysis.
Directory structure:
~/.local/state/tuist/
├── nodes/ # Node ID → checksum mappings (~64 bytes each)
├── cas/ # Checksum → size/duration/compressedSize JSON (~200 bytes each)
└── keyvalue/ # Cache key operation timings
├── read/
└── write/
A small-to-medium build might have 30–100 CAS outputs (~8KB of metadata), but large projects can have up to 15k cacheable tasks and 40k CAS outputs. At ~264 bytes per output (node mapping + metadata JSON), that’s roughly 10MB uncompressed — still modest, and it compresses well since the files are small repetitive JSON.
Upload approach: The CLI bundles the .xcactivitylog and the relevant nodes/, cas/, and keyvalue/ directories into a single zip archive, then uploads it to S3 via a presigned URL. The processing service unpacks the archive and runs the analysis with the full context available — exactly as the CLI does today locally.
This avoids any changes to how CAS metadata is structured or read, and the processing service can reuse the same CASNodeStore / CASOutputMetadataStore / KeyValueMetadataStore code paths that exist in the CLI.
Backward Compatibility
The existing client-side parsing mode would remain available:
- Local development: Users who want instant feedback can continue using client-side parsing
- On-premise deployments: Self-hosted users may prefer client-side parsing to avoid the additional infrastructure
- A flag or configuration option (--mode local|remote) could control the behavior, with remote as the default for projects set up against the Tuist-hosted server.
Scope
In scope (Phase 1): tuist inspect build
Build log parsing is the most impactful case — it’s where users are hitting performance issues today.
Future phases: tuist inspect test and tuist inspect bundle
- tuist inspect test parses .xcresult bundles, which depend on Xcode command-line tools (xcresulttool). This makes server-side processing harder since we'd need macOS or a compatible toolchain on the server.
- tuist inspect bundle analyzes .app/.ipa/.xcarchive/.aab/.apk files. Some analysis (e.g., parsing .xcassets) depends on tools that may not be available on Linux.
These commands haven’t had performance complaints yet, but the same architecture could be extended to them if Linux-compatible parsing is feasible.
Trade-offs
Advantages
- Eliminates client-side performance overhead. CI jobs are no longer blocked by log parsing.
- Debugging access. Raw logs stored in S3 — no need to ask users for files.
- Instant bug fixes. Server-side parsing fixes deploy without CLI upgrades.
- Decouples analytics evolution from CLI releases. New metrics can be added server-side without client changes.
- Scales independently. Processing capacity can be scaled without affecting CLI or main server.
Disadvantages
- Increased infrastructure cost. A dedicated processing machine (though Hetzner auction servers are cost-effective).
- Delayed results. Builds won’t have analytics immediately — requires a “processing” state in the dashboard.
- More data leaves the client. Raw .xcactivitylog files must be uploaded, which may be large. Some users may have concerns about uploading raw build logs.
- Additional infrastructure complexity. A new service to deploy, monitor, and maintain.
- On-premise complexity. Self-hosted users would need to run the processing service or stick with client-side parsing.
Open Questions
- Remote execution mechanism: The current proposal uses SSH (reusing the existing SSHClient from QA) for the Oban worker to trigger processing on the Hetzner machine. Alternatives worth considering: a lightweight HTTP API on the processing machine, a shared Oban queue (the processing machine running its own Oban consumer against the same database), or a container job. Is SSH the right trade-off between simplicity and robustness, or would one of these alternatives scale better?
References
- XCLogParser — compiles and runs on Linux
- XCMetrics — prior art for server-side xcactivitylog processing
- Current CLI implementation: cli/Sources/TuistXCActivityLog/XCActivityLogController.swift
- Current upload service: cli/Sources/TuistServer/Services/CreateBuildService.swift