Summary
This RFC proposes adding test sharding to Tuist, allowing users to split their test suites across multiple CI runners for faster feedback loops. The system will distribute tests across shards using timing data collected by the Tuist server, with fallback strategies when no historical data is available. The splitting granularity differs by build system: module-level for Xcode (test targets) and suite-level for Gradle (test suites, i.e. Gradle test classes), reflecting the conventions and tooling capabilities of each ecosystem. While the initial focus is on GitHub Actions integration, the design is CI-agnostic.
Motivation
As projects grow, test suites become the bottleneck in CI pipelines. A monorepo with dozens of test targets can take 30+ minutes to run sequentially on a single machine. Teams work around this by manually splitting tests across CI jobs, but this approach is fragile, hard to maintain, and leads to unbalanced shards where one job takes 20 minutes while others finish in 5.
Tuist is uniquely positioned to solve this because:
- Tuist already knows the project graph – it understands which test targets exist and their dependencies.
- The Tuist server already collects test timing data – per-module and per-test-case durations from tuist test result uploads, stored in ClickHouse with recent_durations and avg_duration fields.
- Tuist already detects CI environments – GitHub Actions, GitLab CI, CircleCI, Buildkite, Bitrise, and Codemagic.
The missing piece is an orchestration layer that uses this data to produce balanced shard assignments and integrates with CI matrix strategies.
Prior Art
Buildkite Test Engine Client (bktec)
Buildkite uses a bin-packing algorithm with historical timing data to distribute tests so all parallel workers finish at roughly the same time. It supports file-level and example-level splitting, marks files exceeding 70% of a worker’s estimated time for finer-grained splitting, and suggests parallelism counts that keep all workers within a ~2-minute completion window. New tests default to an estimated 1000ms until real data is available.
CircleCI circleci tests split
CircleCI offers three strategies: by name (round-robin alphabetically), by timing (historical execution times from store_test_results), and by file size. The parallelism: N key spins up N containers, each aware of its index via $CIRCLE_NODE_INDEX / $CIRCLE_NODE_TOTAL. Timing data accumulates automatically from uploaded test results.
Bazel shard_count
Bazel sets TEST_TOTAL_SHARDS and TEST_SHARD_INDEX environment variables. The test runner selects tests via index % total_shards == shard_index. Purely count-based with no timing optimization.
Gradle / Develocity
Gradle’s built-in maxParallelForks uses round-robin across JVM forks (single machine). Develocity Test Distribution (commercial) uses timing-based partitioning across remote agents with real-time work-stealing.
Proposed Solution
Overview
The sharding workflow has three phases:
- Plan – The CLI (or Gradle plugin) queries the server for test timing data and computes a shard assignment.
- Execute – Each CI runner receives its shard index and runs only its assigned tests.
- Report – Each shard uploads its test results to the server as it does today.
Shard Configuration
Sharding is configured via CLI flags on tuist test --build-only or tuist xcodebuild build-for-testing (for Xcode) or environment variables (for Gradle). This allows users to experiment with sharding in feature branches before rolling it out — the CI workflow file is branch-specific.
See the sharding flags table in Section 1 (Xcode projects) below for the full list of options.
Running a Specific Shard
There are two execution paths depending on the build system.
1. Xcode projects
Sharding is built into two command layers:
- tuist test (recommended for Tuist-generated projects): the existing tuist test command gains --shard-* flags. When --build-only is combined with sharding flags, it generates the project, builds for testing, computes shards, and outputs the matrix. When --without-building is used with shard environment variables, it pulls the filtered .xctestrun and runs the assigned tests. This is the recommended path for Tuist-generated projects because it handles project generation, selective testing, and sharding in a single command.
- tuist xcodebuild build-for-testing / test-without-building (for non-generated projects): the same sharding behavior, but without project generation or selective testing. This is the path for projects that manage their own .xcodeproj / .xcworkspace.
Under the hood, both paths use the same .xctestproducts bundle mechanism — the only difference is that tuist test also handles project generation and selective testing.
Test module discovery via .xctestrun: The .xctestrun plist file (embedded inside the .xctestproducts bundle produced by xcodebuild build-for-testing -testProductsPath) contains an entry for every test target in the scheme. This is the authoritative source of “what test modules exist in this project right now” — it works regardless of whether the project uses Tuist manifests.
This solves three problems:
- New modules: A newly added test target appears in the .xctestrun file immediately. The server won’t have timing data for it, so it gets a default duration estimate. It will still be included in a shard and tested.
- Removed modules: A deleted test target disappears from the .xctestrun file. The server may still have historical timing data, but since the module isn’t in the discovered set, it’s excluded from shard computation. Stale server data is harmlessly ignored.
- First run: No bootstrapping problem. The .xctestrun file provides the full module list even when the server has no historical data at all. All modules get default estimates, producing a round-robin-like distribution.
Build-once, test-many pattern across machines: In CI, the build step and shard test steps typically run on different machines. The shard runners cannot reference local files from the build agent. To handle this:
- The plan step (tuist test --build-only or tuist xcodebuild build-for-testing) auto-injects the -testProductsPath flag, producing a .xctestproducts bundle: a self-contained, portable artifact that packages the .xctestrun file alongside the compiled .xctest bundles.
- When sharding flags are present, the bundle is uploaded to the Tuist server as part of the shard session. Each shard runner downloads the bundle with a filtered .xctestrun containing only its assigned test targets. No CI-provider-specific artifact sharing is needed.
The .xctestproducts bundle format (validated experimentally):
MyApp.xctestproducts/
├── Info.plist                                 # Maps test plans to xctestrun file paths
├── Tests/
│   └── 0/
│       ├── MyApp.xctestrun                    # The xctestrun file
│       └── Debug -> ../../Binaries/0/Debug    # Symlink to binaries
└── Binaries/
    └── 0/
        └── Debug/
            ├── AppTests.xctest/               # Compiled test bundles
            ├── CoreTests.xctest/
            └── ...
Key properties:
- Self-contained and portable: The bundle contains everything needed to run tests on another machine — no source code, intermediate build artifacts, or DerivedData.
- __TESTROOT__ resolves automatically: The .xctestrun file uses __TESTROOT__ placeholders. Inside the bundle, Tests/0/Debug symlinks to ../../Binaries/0/Debug, so __TESTROOT__/Debug/*.xctest resolves correctly.
- xcodebuild test-without-building -testProductsPath consumes this bundle directly. Filtering works by modifying the .xctestrun inside the bundle to remove test target entries; only the targets present in the .xctestrun are executed.
Step 1: Plan job:
# Tuist-generated projects (recommended):
# Generates the project, builds for testing, computes shards, outputs the matrix.
tuist test --build-only --shard-max 6
# Non-generated projects:
# Build, compute shards, push to server, and output the shard matrix — all in one command.
tuist xcodebuild build-for-testing \
-workspace MyApp.xcworkspace \
-scheme MyApp \
-destination 'platform=iOS Simulator,name=iPhone 16' \
--shard-max 6
Sharding flags (available on both tuist test and tuist xcodebuild build-for-testing):
| Flag | Description | Default |
|---|---|---|
| --shard-max N | Maximum number of shards | Number of test modules |
| --shard-min N | Minimum number of shards | 1 |
| --shard-total N | Exact number of shards (overrides min/max) | Auto-determined |
| --shard-max-duration N | Target max shard duration (seconds) | None |
These are Tuist-specific flags (not passed through to xcodebuild). The presence of any --shard-* flag activates sharding.
Automatic -testProductsPath injection: When sharding is active, the CLI auto-injects -testProductsPath (e.g., .tuist/test-products/<scheme>.xctestproducts) so the bundle is produced in a known location. For tuist xcodebuild, users can override this by passing their own -testProductsPath.
When sharding flags are present, the plan step (either tuist test --build-only or tuist xcodebuild build-for-testing) extends its normal behavior with:
- Auto-injects -testProductsPath if not already present.
- Runs xcodebuild build-for-testing with the passthrough arguments.
- Locates the .xctestrun file inside the produced .xctestproducts bundle.
- Parses the .xctestrun plist to discover test targets (each entry in TestConfigurations[0].TestTargets is a test module with a BlueprintName).
- Sends the module list, .xctestrun file, and shard configuration (min/max/total/max-duration) to the server. The server fetches timing data, computes shard assignments via bin-packing, and stores the .xctestrun + assignments tagged with a shard session ID.
- Receives the shard assignments back from the server.
- Outputs the shard matrix to the CI provider:
  - GitHub Actions: Writes matrix={"shard":[0,1,2,...]} directly to $GITHUB_OUTPUT (detected via the GITHUB_OUTPUT environment variable).
  - Other CI providers: Writes a tuist-shard-matrix.json file. Future integrations can add native output for other providers.
Without sharding flags, the command behaves exactly as it does today.
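The matrix output step can be illustrated with a small Python sketch. This is not the CLI's implementation: the function name write_shard_matrix and its parameters are hypothetical, and the fallback file contents are assumed from the abbreviated tuist-shard-matrix.json example elsewhere in this RFC. What it relies on from GitHub Actions is only that step outputs are name=value lines appended to the file named by GITHUB_OUTPUT.

```python
import json
import os
from pathlib import Path


def write_shard_matrix(shard_count, env=os.environ, fallback_dir=Path(".")):
    """Write the shard matrix for the current CI provider; return the file written."""
    indices = list(range(shard_count))
    github_output = env.get("GITHUB_OUTPUT")
    if github_output:
        # GitHub Actions: append a `matrix=<json>` line to the step output file.
        target = Path(github_output)
        matrix = json.dumps({"shard": indices}, separators=(",", ":"))
        with target.open("a", encoding="utf-8") as handle:
            handle.write(f"matrix={matrix}\n")
        return target
    # Any other provider: fall back to a JSON file (field names are assumed).
    target = Path(fallback_dir) / "tuist-shard-matrix.json"
    payload = {"shard_count": shard_count, "shards": [{"index": i} for i in indices]}
    target.write_text(json.dumps(payload), encoding="utf-8")
    return target
```

Appending rather than overwriting matters on GitHub Actions, because other commands in the same step may already have written outputs to the same file.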
The .xctestproducts bundle is uploaded to the Tuist server as part of the shard session. Shard runners download it from the server alongside the filtered .xctestrun, so no CI-provider-specific artifact sharing is needed.
Step 2: Shard jobs:
# Tuist-generated projects (recommended):
# Downloads filtered .xctestrun, runs assigned tests, uploads results.
tuist test --without-building
# Non-generated projects:
# Same behavior, but without project generation or selective testing.
tuist xcodebuild test-without-building \
-destination 'platform=iOS Simulator,name=iPhone 16'
When the TUIST_SHARD_INDEX environment variable is set, the shard step (either tuist test --without-building or tuist xcodebuild test-without-building) extends its normal behavior with:
- Downloads the .xctestproducts bundle for this shard from the Tuist server (session ID auto-detected from CI environment). The bundle contains a filtered .xctestrun with only the test targets assigned to this shard.
- Places the bundle at the known location (.tuist/test-products/<scheme>.xctestproducts) and auto-injects -testProductsPath.
- Runs xcodebuild test-without-building -testProductsPath <bundle-path> with the passthrough arguments.
- After tests complete, uploads test results to the server (as today), with shard metadata attached.
Without TUIST_SHARD_INDEX, the command behaves exactly as it does today — all tests run.
The server stores the original .xctestproducts bundle and the shard assignments. When a shard runner requests its bundle, the server removes the TestTargets entries that don’t belong to that shard from the .xctestrun’s TestConfigurations and returns the modified bundle.
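As a rough illustration of that filtering step, the sketch below drops unassigned targets from an .xctestrun plist using Python's standard plistlib. The function name filter_xctestrun is hypothetical, and the only structure it assumes is the one described in this RFC: a TestConfigurations array whose entries hold TestTargets, each with a BlueprintName. The real server-side implementation may differ.

```python
import plistlib


def filter_xctestrun(xctestrun_bytes: bytes, assigned_targets: set[str]) -> bytes:
    """Return a copy of the .xctestrun plist that keeps only the assigned test targets."""
    plist = plistlib.loads(xctestrun_bytes)
    for configuration in plist.get("TestConfigurations", []):
        # Keep a target only if its BlueprintName was assigned to this shard.
        configuration["TestTargets"] = [
            target
            for target in configuration.get("TestTargets", [])
            if target.get("BlueprintName") in assigned_targets
        ]
    return plistlib.dumps(plist)
```

All other keys in the plist are passed through untouched, so the filtered file should remain usable by xcodebuild test-without-building as long as the original was.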
Shard detection for Xcode shard runners:
| Env Var | Description |
|---|---|
| TUIST_SHARD_INDEX | The index of this shard (0-based) |
This is set in the CI workflow (e.g., from GitHub Actions matrix.shard). The total number of shards is already stored in the shard session on the server — the runner only needs to know its own index.
Coupling plan and shard jobs — shard session ID: The build and test steps need a shared identifier so shard runners can find the correct .xctestrun on the server. This is handled via a shard session ID derived from the CI environment:
| CI Provider | Session ID derived from |
|---|---|
| GitHub Actions | github-{GITHUB_RUN_ID}-{GITHUB_RUN_ATTEMPT} |
| CircleCI | circleci-{CIRCLE_WORKFLOW_ID} |
| Buildkite | buildkite-{BUILDKITE_BUILD_ID} |
| GitLab CI | gitlab-{CI_PIPELINE_ID} |
| Other / local | Explicit --session <id> flag required |
Since the Tuist CLI already detects CI environments, the session ID is auto-detected in most cases. The plan job and shard jobs within the same CI run share the same environment variables, so they produce the same session ID without any manual passing.
For retries: GITHUB_RUN_ATTEMPT is included so a retried workflow run gets a fresh session, avoiding stale shard assignments from a previous attempt.
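The derivation in the table above could be sketched as follows. The function name shard_session_id is hypothetical, and two behaviors are assumptions of this sketch rather than decisions of the RFC: that an explicit --session value takes precedence over auto-detection, and that providers are probed in the order listed.

```python
import os

# (prefix, environment variables) per provider, in the order of the table above.
_PROVIDERS = [
    ("github", ("GITHUB_RUN_ID", "GITHUB_RUN_ATTEMPT")),
    ("circleci", ("CIRCLE_WORKFLOW_ID",)),
    ("buildkite", ("BUILDKITE_BUILD_ID",)),
    ("gitlab", ("CI_PIPELINE_ID",)),
]


def shard_session_id(env=os.environ, explicit=None):
    """Derive the shard session ID from the CI environment or an explicit value."""
    if explicit:
        # Assumption: an explicit --session value wins over auto-detection.
        return explicit
    for prefix, variables in _PROVIDERS:
        values = [env.get(name) for name in variables]
        if all(values):
            return "-".join([prefix, *values])
    raise ValueError("No supported CI environment detected; pass --session <id>.")
```

Because the plan job and every shard job see the same run-level variables, each of them computes the same string independently, which is what lets the jobs find one another without passing state.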
2. Gradle projects: Tuist Gradle plugin
For Gradle projects, sharding is integrated into the Tuist Gradle plugin (dev.tuist:tuist-gradle-plugin). The plugin already hooks into Gradle’s test lifecycle for test insights and quarantine; sharding extends this with a prepareTestShards task and a test filtering step. To align with the Xcode workflow, shard configuration is passed as flags to the prepareTestShards task rather than being declared in the Gradle DSL.
Why suite-level splitting for Gradle: Gradle projects vary widely in modularization. Many Gradle projects follow a multi-module architecture (:feature:home, :core:network, etc.), but it’s equally common to see projects with a handful of modules or even a single monolithic :app module containing all tests. Module-level sharding would be useless in the latter case. Since the Tuist Gradle plugin already collects per-suite timing data (Gradle test classes map to Tuist test suites) via its TestListener, and Gradle’s filter.includeTestsMatching() API natively supports suite-level filtering (already used for test quarantine), suite-level splitting is both more practical and more effective.
Configuration: Sharding is configured via flags on the prepareTestShards Gradle task, mirroring the --shard-* flags used by tuist test and tuist xcodebuild for Xcode:
| Flag | Description | Default |
|---|---|---|
| --shard-max <n> | Maximum number of shards | Required |
| --shard-min <n> | Minimum number of shards | 1 |
| --shard-max-duration <s> | Target max shard duration (seconds) | None |
The TUIST_SHARD_INDEX environment variable tells the plugin which shard this runner is. When absent and prepareTestShards is invoked, the plugin runs the plan step — it discovers test suites, sends them to the server, and outputs the shard matrix (same CI integration as Xcode: GitHub Actions $GITHUB_OUTPUT, Buildkite buildkite-agent pipeline upload, or tuist-shard-matrix.json). When TUIST_SHARD_INDEX is set, the plugin runs in shard mode — it pulls its assigned test suites from the server and filters accordingly.
How it works:
Plan step (./gradlew prepareTestShards --shard-max <n>):
- The plugin compiles test sources and scans the test classpath to discover all current test suites. This is the source of truth for what exists now (same principle as .xctestrun for Xcode).
- The plugin packages the compiled test runtime classpath (compiled classes, application classes, and dependencies).
- The plugin calls the Tuist server’s shard session endpoint, uploading the packaged classpath along with the discovered test suites and shard configuration from the task flags.
- The server fetches per-suite timing data from test_suite_runs, computes shard assignments via bin-packing, and stores the session alongside the classpath.
- The plugin receives the shard assignments and outputs the shard matrix to the CI provider.
- Tests do not run in this step; the plan step only compiles, uploads, and computes the matrix.
Shard step (TUIST_SHARD_INDEX set):
- The plugin downloads the compiled test classpath from the Tuist server (session ID auto-detected from CI environment, same as Xcode).
- The plugin pulls the assigned test suites for this shard from the server.
- The plugin uses Gradle’s filter.includeTestsMatching() API (the same mechanism used for test quarantine today) to include only the assigned test suites, and configures the test task to use the downloaded classpath, skipping compilation entirely.
- Tests run and results are uploaded as usual, with shard metadata included.
# Plan step — compiles, uploads test classpath to server, computes shards, outputs matrix
./gradlew prepareTestShards --shard-max 6
# Shard step — downloads compiled test classpath from server, runs assigned tests
TUIST_SHARD_INDEX=${{ matrix.shard }} ./gradlew test
Build-once, test-many for Gradle: Like Xcode’s .xctestproducts bundle, the plan step packages and uploads the compiled test runtime classpath (compiled classes, application classes, and dependencies) to the Tuist server. Shard runners download it and run tests without recompilation — the same pattern Develocity Test Distribution uses when transferring compiled test binaries to remote agents.
Partitioning Strategy
The initial implementation uses a single strategy: timing-based bin-packing. The algorithm is the same for both build systems; what differs is the unit of distribution:
- Xcode: test modules (targets), with data from test_module_runs
- Gradle: test suites, with data from test_suite_runs
timing (default and only strategy)
Uses historical test durations from the Tuist server to create balanced shards via a greedy bin-packing algorithm (Longest Processing Time first, or LPT). The algorithm runs server-side.
The core idea is simple: when packing items of different sizes into a fixed number of bins, placing the largest remaining item into the emptiest bin, then repeating, produces a near-even distribution. Applied to test sharding, each “item” is a test unit (module or suite) with a known duration, and each “bin” is a shard. The goal is to minimize the longest shard’s total duration, which is what determines overall CI wall-clock time. LPT is a greedy heuristic rather than an exact solver: it does not always find the optimal partition, but its longest shard is guaranteed to be at most 4/3 of the optimum.
Steps:
- Fetch avg_duration for each unit (module or class) from ClickHouse (scoped to the project and default branch).
- Sort units by duration descending (longest first).
- For each unit, assign it to the shard with the lowest total estimated duration so far.
Example: Given 5 modules with durations [30s, 25s, 20s, 15s, 10s] and 3 shards:
- Shard 0 ← 30s → total: 30s
- Shard 1 ← 25s → total: 25s
- Shard 2 ← 20s → total: 20s
- Shard 2 ← 15s → total: 35s (was lowest at 20s)
- Shard 1 ← 10s → total: 35s (was lowest at 25s)
- Result: shards of 30s, 35s, 35s — well-balanced despite uneven module sizes.
Units with no timing data are assigned an estimated duration equal to the median of known units (or a default of 30 seconds for modules / 5 seconds for classes if no data exists at all). When no timing data is available at all (e.g., a project that hasn’t uploaded test results yet), all units are assigned equal estimated durations, which effectively produces a round-robin distribution.
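The steps and fallback rules above can be sketched in Python as follows. This is a minimal illustration rather than the server's implementation: the function name assign_shards, its parameters, and the name-based tie-breaking are assumptions made for the example.

```python
from statistics import median


def assign_shards(
    units: list[str],
    known_durations: dict[str, float],
    shard_count: int,
    default_duration: float = 30.0,
) -> tuple[list[list[str]], list[float]]:
    """Greedy LPT: longest unit first, always into the currently lightest shard."""
    known = [known_durations[unit] for unit in units if unit in known_durations]
    # Units without history get the median of known units, or the default if none exist.
    fallback = median(known) if known else default_duration
    estimates = {unit: known_durations.get(unit, fallback) for unit in units}

    shards: list[list[str]] = [[] for _ in range(shard_count)]
    totals = [0.0] * shard_count
    # Sort by duration descending; ties are broken by name to keep results deterministic.
    for unit in sorted(units, key=lambda u: (-estimates[u], u)):
        lightest = min(range(shard_count), key=totals.__getitem__)
        shards[lightest].append(unit)
        totals[lightest] += estimates[unit]
    return shards, totals


modules = {"A": 30, "B": 25, "C": 20, "D": 15, "E": 10}
shards, totals = assign_shards(list(modules), modules, shard_count=3)
print(shards)  # [['A'], ['B', 'E'], ['C', 'D']]
print(totals)  # [30.0, 35.0, 35.0]
```

The printed totals match the worked example above. When known_durations is empty, every unit gets the same estimate and the loop degenerates into a round-robin over the name-sorted units.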
Future strategies
The --strategy flag is reserved for future extensibility. If the need arises, we could add strategies such as:
- round-robin – Distributes test modules alphabetically in round-robin order. No server communication needed. Could serve as an explicit offline fallback for projects not connected to the Tuist server.
- uniform – Distributes test modules to produce an equal count per shard (±1), ignoring timing data. Useful when test modules have roughly similar execution times.
- dynamic – A queue-based approach (similar to Knapsack Pro) where runners pull work from a server-side queue at runtime, enabling real-time load balancing. This would require significant server-side infrastructure but would produce optimal shard balance.
We intentionally start with a single strategy to keep the initial implementation focused and gather real-world usage data before investing in alternatives.
Auto-Determining Shard Count
When --shard-total is not specified and --shard-min/--shard-max bounds are set, the timing strategy auto-determines the optimal shard count:
- Compute total estimated test duration from server data.
- Set target shard duration to total / N for candidate N values within [min, max].
- Select the smallest N where the longest shard (after bin-packing) is within 20% of the target duration.
If --shard-max-duration is set, start from ceil(total_duration / max_duration) and clamp to [min, max].
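The selection rule can be sketched as below. The function names are hypothetical, and two points are this sketch's reading of the text rather than something the RFC specifies: the clamped ceil(total / max_duration) value is used as the first candidate N, and shard_max is returned when no candidate meets the 20% tolerance.

```python
import math


def lpt_makespan(durations, shard_count):
    """Duration of the longest shard after greedy LPT packing."""
    totals = [0.0] * shard_count
    for duration in sorted(durations, reverse=True):
        totals[totals.index(min(totals))] += duration
    return max(totals)


def choose_shard_count(durations, shard_min, shard_max, max_duration=None, tolerance=0.20):
    """Smallest N in range whose longest shard is within `tolerance` of total / N."""
    total = sum(durations)
    lower = shard_min
    if max_duration:
        # Assumed reading of "start from": begin the search at the clamped value.
        lower = max(shard_min, min(shard_max, math.ceil(total / max_duration)))
    for candidate in range(lower, shard_max + 1):
        target = total / candidate
        if lpt_makespan(durations, candidate) <= target * (1 + tolerance):
            return candidate
    # Assumed fallback: no candidate is balanced enough, so use the maximum.
    return shard_max


durations = [30, 25, 20, 15, 10]
print(choose_shard_count(durations, 2, 6))                   # 2
print(choose_shard_count(durations, 1, 6, max_duration=40))  # 3
```

One property of the rule as literally stated is worth noting: with a single shard the longest shard equals the target by definition, so a search starting at N = 1 always stops there. Under this reading, auto-determination only has an effect when --shard-min is above 1 or --shard-max-duration is set.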
Server API
Shard session creation endpoint (for Xcode)
The CLI sends the discovered module list and shard configuration. The server fetches timing data, computes shard assignments, and returns the result along with an upload URL for the test artifacts.
Step 1: Create shard session
POST /api/projects/:project_handle/tests/shards
Request:
{
"session_id": "github-12345-1",
"modules": ["AppTests", "CoreTests", "NetworkTests", "NewFeatureTests"],
"shard_min": 1,
"shard_max": 6,
"shard_max_duration": null
}
The server:
- Queries the test_module_runs ClickHouse table for timing data (filtered to CI runs on the default branch).
- Modules with no timing data get a default estimated duration (median of known modules, or 30 seconds if no data exists at all).
- Modules in the server’s history but not in the request (removed modules) are ignored.
- Determines the optimal shard count based on the configuration (min/max/total/max-duration).
- Computes shard assignments via the bin-packing algorithm.
- Stores the shard assignments and returns an S3 upload URL for the .xctestproducts bundle.
Response:
{
"session_id": "github-12345-1",
"shard_count": 2,
"shards": [
{ "index": 0, "test_targets": ["AppTests", "CoreTests"], "estimated_duration_ms": 45000 },
{ "index": 1, "test_targets": ["NetworkTests", "NewFeatureTests"], "estimated_duration_ms": 43000 }
],
"upload_url": "https://storage.tuist.dev/..."
}
Step 2: Upload test artifacts
After receiving the response, the CLI uploads the .xctestproducts bundle (compressed) directly to S3 via the presigned upload_url. The server uses a conventional path based on the project and session ID, so shard runners can retrieve it later.
Step 3: Download shard (called by each shard runner):
GET /api/projects/:project_handle/tests/shards/:session_id/:shard_index
Response: A JSON with the shard assignment and a download URL for the .xctestproducts bundle. The server returns the bundle with a pre-filtered .xctestrun — test targets not assigned to this shard are already stripped from the TestConfigurations array. The shard runner downloads the bundle and runs tests directly, with no client-side filtering needed.
{
"test_targets": ["AppTests", "CoreTests"],
"download_url": "https://storage.tuist.dev/..."
}
Shard sessions are ephemeral — the server can garbage-collect them after a configurable TTL (e.g., 24 hours).
Shard session creation endpoint (for Gradle plugin)
The Gradle plugin uses the same session-based approach and the same POST /api/projects/:project_handle/tests/shards endpoint as Xcode, but sends test_suites instead of modules. The plan step creates a session; shard runners pull their assignments by index.
Request:
{
"session_id": "github-12345-1",
"test_suites": [
"com.example.auth.LoginTest",
"com.example.auth.SignupTest",
"com.example.core.UtilsTest",
"com.example.core.DatabaseTest"
],
"shard_max": 4
}
The response includes an upload_url for the compiled test classpath (same pattern as .xctestproducts for Xcode). The plugin uploads the packaged classpath to S3 after creating the session.
Shard runners call GET /api/projects/:project_handle/tests/shards/:session_id/:shard_index and receive their assigned test suites plus a download_url for the compiled classpath:
{
"test_suites": [
"com.example.auth.LoginTest",
"com.example.core.DatabaseTest"
],
"download_url": "https://storage.tuist.dev/..."
}
This follows the same pattern as Xcode: the plan step provides the source of truth for what exists, the server provides timing data and computes balanced assignments, and shard runners only need their index to pull their assignments and artifacts.
Shard Computation Location
The shard computation happens on the server. The CLI sends the discovered module list (or suite list for Gradle), the .xctestrun file, and the shard configuration. The server fetches timing data from ClickHouse, runs the bin-packing algorithm, stores the .xctestrun and shard assignments, and returns the result to the CLI. This keeps the algorithm centralized (one implementation shared across Xcode and Gradle paths), allows the server to evolve the algorithm without CLI updates, and ensures the computation has direct access to timing data without an extra round-trip.
Dashboard Integration
Sharded test runs should appear as a single test run in the dashboard, not as separate entries per shard. This preserves the user’s mental model: “I ran my tests” produces one result, regardless of how many shards executed in parallel.
Each shard uploads its test results independently (as it does today), but tagged with the shard session ID and shard index. The server merges results into the single parent test run.
Dashboard UI Changes
The test run detail page (test_run_live) needs adjustments to surface shard information:
- Overview tab: Show shard metadata when the run is sharded — total shard count, per-shard durations (e.g., a bar showing how balanced the shards were), and which shard was the bottleneck.
- Test Cases / Test Suites / Test Modules tabs: Add a “Shard” column or filter, so users can see which shard ran which tests. A shard filter dropdown lets users drill into a specific shard’s results.
- Failures tab: Each failure should show which shard it came from, helping users reproduce failures on the right shard.
- Shard balance visualization: A simple bar chart or breakdown showing per-shard duration and test count. This helps users understand whether sharding is well-balanced and whether they should adjust --shard-max.
GitHub Actions Integration
Xcode — Tuist-generated projects (using tuist test)
This is the recommended path for projects that use Tuist manifests. tuist test handles project generation, selective testing, and sharding in a single command.
name: Tests
on: [pull_request]
jobs:
  plan:
    runs-on: macos-15
    outputs:
      matrix: ${{ steps.build.outputs.matrix }}
    steps:
      - uses: actions/checkout@v4
      - name: Install Tuist
        run: mise install
      # Generates the project, builds for testing, uploads the .xctestproducts
      # bundle to the Tuist server, computes shards, and writes the matrix
      # to $GITHUB_OUTPUT, all in one step.
      - name: Build and prepare shards
        id: build
        run: tuist test --build-only --shard-max 6
  test:
    runs-on: macos-15
    needs: plan
    strategy:
      fail-fast: false
      matrix: ${{ fromJson(needs.plan.outputs.matrix) }}
    steps:
      - uses: actions/checkout@v4
      - name: Install Tuist
        run: mise install
      # Downloads the .xctestproducts bundle and filtered .xctestrun from
      # the Tuist server, runs the assigned tests, and uploads results.
      - name: Run shard tests
        env:
          TUIST_SHARD_INDEX: ${{ matrix.shard }}
        run: tuist test --without-building
Xcode — non-generated projects (using tuist xcodebuild)
For projects that manage their own .xcodeproj / .xcworkspace without Tuist manifests.
name: Tests
on: [pull_request]
jobs:
  plan:
    runs-on: macos-15
    outputs:
      matrix: ${{ steps.build.outputs.matrix }}
    steps:
      - uses: actions/checkout@v4
      - name: Install Tuist
        run: mise install
      - name: Build and prepare shards
        id: build
        run: |
          tuist xcodebuild build-for-testing \
            -workspace MyApp.xcworkspace \
            -scheme MyApp \
            -destination 'platform=iOS Simulator,name=iPhone 16' \
            --shard-max 6
  test:
    runs-on: macos-15
    needs: plan
    strategy:
      fail-fast: false
      matrix: ${{ fromJson(needs.plan.outputs.matrix) }}
    steps:
      - uses: actions/checkout@v4
      - name: Install Tuist
        run: mise install
      - name: Run shard tests
        env:
          TUIST_SHARD_INDEX: ${{ matrix.shard }}
        run: |
          tuist xcodebuild test-without-building \
            -destination 'platform=iOS Simulator,name=iPhone 16'
Gradle (plugin-driven sharding)
Sharding is configured via flags on the prepareTestShards task. Shard runners use TUIST_SHARD_INDEX to pull their assignments.
name: Tests
on: [pull_request]
jobs:
  plan:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.plan.outputs.matrix }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up JDK
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
      # Compiles test sources, discovers test suites, computes shards,
      # and writes matrix to $GITHUB_OUTPUT, all via the Gradle plugin.
      - name: Plan shards
        id: plan
        run: ./gradlew prepareTestShards --shard-max 6
  test:
    runs-on: ubuntu-latest
    needs: plan
    strategy:
      fail-fast: false
      matrix: ${{ fromJson(needs.plan.outputs.matrix) }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up JDK
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
      - name: Run tests
        env:
          TUIST_SHARD_INDEX: ${{ matrix.shard }}
        run: ./gradlew test
Integration with Other CI Providers
The system works with any CI provider that supports parallel jobs. The key contract is:
- A plan job runs the plan step (tuist test --build-only, tuist xcodebuild build-for-testing, or ./gradlew prepareTestShards) and produces a shard matrix.
- Shard jobs set TUIST_SHARD_INDEX and run their assigned subset (tuist test --without-building, tuist xcodebuild test-without-building, or ./gradlew test).
The CLI outputs the matrix in a CI-native format when possible (GitHub Actions $GITHUB_OUTPUT, Buildkite buildkite-agent pipeline upload) and falls back to writing a tuist-shard-matrix.json file for other providers.
CI providers with native parallelism (CircleCI parallelism, GitLab CI parallel) provide their own index/total environment variables. CircleCI’s CIRCLE_NODE_INDEX is 0-based and maps directly to TUIST_SHARD_INDEX. GitLab’s CI_NODE_INDEX is 1-based, so subtract 1 to get the 0-based shard index:
# CircleCI: Tuist-generated project (Xcode)
TUIST_SHARD_INDEX=$CIRCLE_NODE_INDEX tuist test --without-building
# CircleCI: non-generated project (Xcode)
TUIST_SHARD_INDEX=$CIRCLE_NODE_INDEX \
tuist xcodebuild test-without-building -destination 'platform=iOS Simulator,name=iPhone 16'
# GitLab CI: Gradle (plugin-driven). CI_NODE_INDEX starts at 1.
TUIST_SHARD_INDEX=$((CI_NODE_INDEX - 1)) ./gradlew test
Unsupported or custom CI providers can use tuist-shard-matrix.json directly. After the build step, the file contains the full shard assignment. Users read it to spawn parallel jobs in whatever way their CI supports:
# Plan step: build and write tuist-shard-matrix.json (use tuist test --build-only for generated projects)
tuist xcodebuild build-for-testing -scheme MyApp -destination '...' --shard-max 6
# Read the matrix
cat tuist-shard-matrix.json
# {"shard_count":4,"shards":[{"index":0,...},{"index":1,...},...]}
# Shard steps: set the shard index and run (how you spawn these depends on your CI)
TUIST_SHARD_INDEX=0 tuist test --without-building
# or: TUIST_SHARD_INDEX=0 tuist xcodebuild test-without-building -destination '...'
The tuist-shard-matrix.json format is stable and documented, so it can be consumed by any scripting or CI orchestration layer.
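For instance, an orchestration script might turn the file into one command per shard. The sketch below assumes only the two fields visible in the abbreviated example above (shard_count and a shards array whose entries carry an index); the helper name shard_commands is hypothetical.

```python
import json
from pathlib import Path


def shard_commands(matrix_path, command="tuist test --without-building"):
    """Build one shell command per shard listed in tuist-shard-matrix.json."""
    matrix = json.loads(Path(matrix_path).read_text(encoding="utf-8"))
    return [
        f"TUIST_SHARD_INDEX={shard['index']} {command}"
        for shard in matrix["shards"]
    ]
```

How the resulting commands are dispatched (separate jobs, containers, or machines) is left to the CI system.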
Alternatives Considered
Test-case-level splitting for Xcode
Splitting individual test methods (or suites) across shards for Xcode projects would give the most granular control and best balance. However, it requires enumerating all test cases before running (expensive for large suites), creates complex -only-testing argument lists, and breaks test fixtures that assume suite-level setup/teardown. Module-level splitting avoids these issues while still providing meaningful parallelism for Xcode, where projects managed by Tuist tend to be well-modularized. (Note: for Gradle, we do use suite-level splitting because Gradle’s filtering API handles it cleanly and Gradle projects are often not well-modularized.)
Future Direction: Tuist-Managed Runners
This RFC focuses on static shard assignment: the plan step computes a fixed partition, and each CI job runs its assigned subset independently. This requires users to configure CI matrix strategies themselves.
A natural evolution is Tuist-managed test distribution, where a single tuist test or ./gradlew test invocation provisions remote runners, distributes tests across them, and streams results back in real time — similar to Develocity Test Distribution. This would eliminate the need for CI matrix configuration entirely: users would just run tuist test and Tuist would handle parallelism transparently.
This capability is explicitly out of scope for the current RFC. The static sharding design proposed here lays the groundwork (server-side timing data, bin-packing algorithm, session management) that a future dynamic distribution system would build on.
Open Questions
- Should we support test-suite-level splitting for Xcode? Some Xcode projects have a single monolithic test target. Module-level sharding would not help here. We could support an opt-in --granularity suite mode in a future phase (for Gradle, suite-level is already the default).
- Should sharding activation be explicit? Currently, the presence of any --shard-* flag implicitly activates sharding. An alternative would be an explicit --shard flag (or similar) to opt in, with --shard-max, --shard-min, etc. as configuration. The implicit approach is more concise, but an explicit flag would make intent clearer in CI workflows.
- Should Gradle shard configuration live in settings.gradle.kts instead of task flags? The current proposal uses flags on prepareTestShards (e.g., --shard-max 6) to align with the Xcode CLI approach. An alternative is a DSL block in settings.gradle.kts, closer to how Develocity configures test distribution:

      // settings.gradle.kts
      tuist {
          testSharding {
              maxShards = 6
              // minShards = 2
              // maxDuration = 300
              // isEnabled = System.getenv("CI") != null
          }
      }

  The DSL approach might be more idiomatic for Gradle users and allows configuration to be checked in once rather than repeated in CI workflow files, but I’m not sure how common that would be. The task flags approach is simpler and more consistent across build systems.