Add support for running all tests and uploading their results

marekfort · November 15, 2024, 4:17pm

Need/problem

Some organizations want to use the selective testing as follows:

on main, selective test results should not be fetched, but they should be uploaded
on PRs, selective test results should be fetched, but should not be uploaded

The PR flow was solved by the implementation of this RFC, which introduced the --no-upload flag.

However, as of now, there’s no way to skip fetching selective test results while still uploading them after the tests have run.

To skip fetching selective test results, developers can run tuist test --no-selective-testing – but that will also skip the upload.

Proposed solution

I’m proposing to add a new flag, tuist test --run-all-tests, that forces all tests to run regardless of the state of the remote and local cache. At the end of a successful run, the command would still store the results in the local cache and upload them to the remote cache (unless --no-upload is passed as well).
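To make the two flows concrete, here is a minimal sketch of how a CI script could choose the flag per branch. The helper name selective_testing_flags and the BRANCH_NAME variable are assumptions for illustration only, and --run-all-tests is the flag proposed here, not one that exists yet. The script only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical CI helper: picks the selective-testing flag for a branch.
# --run-all-tests is the flag proposed in this RFC; it is not implemented yet.
selective_testing_flags() {
  branch="$1"
  if [ "$branch" = "main" ]; then
    # main: do not fetch results, run everything, then store and upload
    echo "--run-all-tests"
  else
    # PRs: fetch results and run selectively, but do not upload
    echo "--no-upload"
  fi
}

# BRANCH_NAME is an assumed CI variable; it falls back to "main" when unset.
echo "tuist test $(selective_testing_flags "${BRANCH_NAME:-main}")"
```

Any branch other than main is treated as a PR branch in this sketch; a real pipeline would likely rely on whatever PR detection its CI provider offers.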

Drawbacks

We’re adding complexity to the tuist test command. There would now be three flags for adjusting the selective testing behavior:

--no-upload → skips uploading test results, but stores them locally
--no-selective-testing → fully disables selective testing feature
--run-all-tests → runs all tests, but still stores selective test results locally and remotely
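The difference between the three flags can be summarized as a small lookup, based only on the behavior described in this thread. The function name and the fetch/store_local/upload output format are made up for illustration, and the sketch covers single flags only, not combinations such as --run-all-tests together with --no-upload.

```shell
#!/bin/sh
# Prints which selective-testing steps happen for a given flag.
# The semantics follow the descriptions in this thread; the output
# format is an assumption made for this sketch.
selective_testing_steps() {
  case "$1" in
    "")                     echo "fetch=yes store_local=yes upload=yes" ;;
    --no-upload)            echo "fetch=yes store_local=yes upload=no" ;;
    --no-selective-testing) echo "fetch=no store_local=no upload=no" ;;
    --run-all-tests)        echo "fetch=no store_local=yes upload=yes" ;;
    *)                      echo "unknown flag: $1" >&2; return 1 ;;
  esac
}

# Print one row per flag, with "(none)" standing for a plain `tuist test`.
for flag in "" --no-upload --no-selective-testing --run-all-tests; do
  printf '%-24s %s\n' "${flag:-(none)}" "$(selective_testing_steps "$flag")"
done
```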

Alternatives

I have considered changing the behavior of the --no-selective-testing flag to be the same as the proposed --run-all-tests. But if we did that, there would be no way of skipping storing selective test results locally – and skipping hashing the graph for selective results purposes. So, I’m currently more in favor of keeping the current behavior where --no-selective-testing completely disables the selective testing feature and all the logic associated with it.

Unresolved questions

What do you think about the proposed flag name and its behavior? Any other alternatives that we should consider?

pepicrft · November 15, 2024, 4:34pm

I wonder if we can model this problem differently to avoid the complexity we are introducing at the flag level. If I understood the scenarios correctly, there are three variables:

Whether to run selectively
Whether to persist results
Where to persist the results

What about having two flags, --selective-testing, --selective-testing-store:

# Current behavior: it runs selectively and persists the results
# locally and remotely
tuist test

# Don't run selectively but persist results locally 
tuist test --no-selective-testing --selective-testing-store local

# Don't run selectively but persist results locally and remotely
tuist test --no-selective-testing --selective-testing-store local,remote

# Run selectively but don't persist the results
tuist test --no-selective-testing-store
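As a rough illustration of what a list-valued --selective-testing-store would involve, the comma-separated value could be split into its targets as sketched below. The parse_store function and its output format are hypothetical; the proposal above only names the local and remote values.

```shell
#!/bin/sh
# Hypothetical parser for the proposed --selective-testing-store value.
# An empty value stands in for --no-selective-testing-store.
parse_store() {
  store_local=no
  store_remote=no
  old_ifs="$IFS"
  IFS=','
  # Unquoted on purpose: with IFS set to a comma, $1 splits into targets.
  for target in $1; do
    case "$target" in
      local)  store_local=yes ;;
      remote) store_remote=yes ;;
      *) IFS="$old_ifs"; echo "unknown store: $target" >&2; return 1 ;;
    esac
  done
  IFS="$old_ifs"
  echo "local=$store_local remote=$store_remote"
}

parse_store "local,remote"
parse_store "local"
```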

marekfort · November 15, 2024, 4:49pm

As reasoned here, --no-upload was introduced so that we have a common flag for disabling any uploads. Introducing --selective-testing-store would mean we’d either duplicate the behavior or have a breaking change in the CLI.

And with your proposal, we wouldn’t technically end up with fewer flags. And there’s the (albeit small) additional overhead of --selective-testing-store being an array.

I’m leaning toward keeping --no-upload and --no-selective-testing to make this change purely additive. The alternative introduces a breaking change and a behavioral change to an existing flag for what are, in my opinion, very minor wins in CLI flag complexity (if any, that’s debatable).

pepicrft · November 18, 2024, 10:23am

I forgot to mention it with the proposal, but with the introduction of the flag I’d then deprecate --no-upload in favor of --selective-testing-store.

Naming is subjective and difficult to get right on the first try when we don’t know the use cases that will show up down the road. I lean toward evolving our flags as we iterate on the feature, even if that means nudging users through deprecation warnings and a bit of complexity on our end. In the long term, that’s better than adding more flags.

We can defer this discussion and have it if we see that we keep adding more flags, but I see these as opportunities to reflect on and challenge what we’ve decided in the past, and correct course if needed. Feel free to make the call here.

marekfort · November 18, 2024, 10:42am

We can go with your proposal if users raise a need for the extra flexibility.

Changing the behavior of --no-selective-testing would be breaking as well, so I’m in favor of introducing --run-all-tests which is imho a better name than --no-selective-testing and the flag will be an additive change.

And if we end up implementing --selective-testing-store and --no-selective-testing-store down the line, we can deprecate the --no-selective-testing flag.