Based on some extra feedback, we decided that suite-level splitting should be supported from day one also for Xcode. Here’s how it would work technically.
Filtering mechanism
The .xctestrun plist already supports class-level filtering natively. Each TestTarget entry accepts:
OnlyTestIdentifiers: array of identifiers to includeSkipTestIdentifiers: array of identifiers to exclude
The identifier format is ClassName or ClassName/testMethodName. So the server can inject OnlyTestIdentifiers per test target to restrict each shard to its assigned classes, the same way it currently strips entire TestTarget entries for module-level sharding, but one level deeper:
<key>OnlyTestIdentifiers</key>
<array>
<string>CalculatorTests</string>
<string>NetworkClientTests</string>
</array>
No -only-testing flags needed. The filtered .xctestrun is self-contained.
Test suite discovery
For module-level sharding, the .xctestrun plist is the source of truth. Each TestTarget entry’s BlueprintName gives the complete module list. For suite-level sharding, the .xctestrun doesn’t list individual classes inside each target, so we need an additional enumeration step.
The plan step already runs xcodebuild build-for-testing. After building, xcodebuild test-without-building -enumerate-tests (Xcode 16+) enumerates all test targets, classes, and methods from the built products without executing them. The client sends this class list to the server the same way it sends the module list for module-level sharding. The server does the same bin-packing either way.
Performance: enumerate-tests doesn’t execute any tests, but it does load the test bundles into a simulator or test host process to reflect on XCTestCase subclasses. This can add 10-30 seconds on top of the build depending on project size and whether the simulator is already booted. Since the plan step already runs build-for-testing (which typically boots a simulator), the incremental cost should be modest. An alternative would be parsing the .xctest Mach-O binaries directly with nm to extract test* symbols — this is instant but fragile across Swift name mangling changes and wouldn’t catch dynamically generated tests. enumerate-tests is the safer default. If it turns out to be too slow for large projects, we could revisit this.
How it fits into the existing design
The change is minimal. It’s the same .xctestrun filtering mechanism, just at a finer granularity:
- Module-level (current default): Server removes
TestTargetentries not assigned to the shard. - Suite-level (opt-in, e.g.,
--granularity suite): Server keeps allTestTargetentries but addsOnlyTestIdentifiersto each, filtering to the assigned classes. - Timing data: Uses
test_suite_runs(avg_duration per class) instead oftest_module_runs. - Bin-packing: Same LPT algorithm, just operating on classes instead of modules.
In a very similar way, we could also do sharding at the individual test case level.