Continuous Integration for Swift

Table of Contents

Introduction
Pull Request Testing
Cross Repository Testing
ci.swift.org bots

Introduction

FIXME: FILL ME IN!

Pull Request Testing

In order for the Swift project to advance quickly, it is important that we maintain a green build [1]. To help maintain it, the Swift project relies heavily on pull request (PR) testing. As a general rule, every non-trivial check-in to any Swift project repository should at least run a smoke test if simulators will not be impacted, or a full validation test if simulators may be impacted. In addition, anyone making a source-breaking change across multiple repositories should follow the workflow described under Cross Repository Testing. The rest of this section describes the Swift system for pull request testing, @swift-ci.

@swift-ci

@swift-ci pull request testing is triggered by writing a comment on the PR addressed to the GitHub user @swift-ci. Different tests run depending on the specific comment used. The current test types are:

Smoke Testing
Validation Testing
Benchmarking
Lint Testing

We describe each in detail below:

Smoke Testing

    Platform                | Comment
    ----------------------- | ------------------------------------------
    All supported platforms | @swift-ci Please smoke test
    All supported platforms | @swift-ci Please smoke test and merge
    OS X platform           | @swift-ci Please smoke test OS X platform
    Linux platform          | @swift-ci Please smoke test Linux platform

A smoke test on macOS does the following:

Builds LLVM/Clang incrementally.
Builds Swift clean.
Builds the standard library clean only for macOS. Simulator standard libraries and device standard libraries are not built.
lldb is not built.
The test and validation-test targets are run only for macOS. The optimized versions of these tests are not run.

A smoke test on Linux does the following:

Builds LLVM/Clang incrementally.
Builds Swift clean.
Builds the standard library clean.
lldb is built incrementally.
Foundation, SwiftPM, LLBuild, XCTest are built.
The swift test and validation-test targets are run. The optimized versions of these tests are not run.
lldb is tested.
Foundation, SwiftPM, LLBuild, XCTest are tested.

Validation Testing

    Platform                | Comment
    ----------------------- | ------------------------------------
    All supported platforms | @swift-ci Please test
    All supported platforms | @swift-ci Please test and merge
    OS X platform           | @swift-ci Please test OS X platform
    Linux platform          | @swift-ci Please test Linux platform

The core principles of validation testing are:

A validation test should build and run tests for /all/ platforms and all architectures supported by the CI.
A validation test should not be incremental. A validation test should be definitive: once one has passed, there should be no nook or cranny in the code base that has not been tested.

With that in mind, a validation test on macOS does the following:

Removes the workspace.
Builds the compiler.
Builds the standard library for macOS and the simulators for all platforms.
lldb is /not/ built or tested [2].
The test and validation-test targets are run for macOS and all simulators, both with and without optimizations enabled.

A validation test on Linux does the following:

Removes the workspace.
Builds the compiler.
Builds the standard library.
lldb is built.
Builds Foundation, SwiftPM, LLBuild, and XCTest.
Runs the swift test and validation-test targets with and without optimization.
lldb is tested.
Foundation, SwiftPM, LLBuild, XCTest are tested.
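
The differences between smoke testing and validation testing can be summarized as data. The following Python sketch is purely illustrative: the `PlanSummary` class, its field names, and the `differences` helper are invented here and do not correspond to any @swift-ci configuration format. The values only restate the step lists above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PlanSummary:
    """Hypothetical summary of one PR test configuration."""
    llvm_incremental: bool          # LLVM/Clang built incrementally (workspace kept)
    builds_simulator_stdlibs: bool  # simulator standard libraries are built
    lldb_built_and_tested: bool     # lldb is built and its tests are run
    runs_optimized_tests: bool      # optimized test variants are run


# Values restate the step lists in this document. No simulator step is
# listed for Linux, so that field is False there.
PLANS = {
    ("smoke", "macOS"): PlanSummary(True, False, False, False),
    ("smoke", "Linux"): PlanSummary(True, False, True, False),
    ("validation", "macOS"): PlanSummary(False, True, False, True),
    ("validation", "Linux"): PlanSummary(False, False, True, True),
}


def differences(a, b):
    """Return {field name: (value in a, value in b)} for fields that differ."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }
```

For example, `differences(PLANS[("smoke", "Linux")], PLANS[("validation", "Linux")])` reports only `llvm_incremental` and `runs_optimized_tests`, which matches the two core principles: a validation test is not incremental and covers more test configurations.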

Benchmarking

    Platform       | Comment
    ------------   | -------------
    OS X platform  | @swift-ci Please benchmark

Lint Testing

    Language     | Comment
    ------------ | -------------
    Python       | @swift-ci Please Python lint
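
The comment tables in this section can be collected into a small lookup. The Python sketch below is illustrative only: the function and dictionary names are hypothetical and not part of any Swift tooling, and the code only builds the comment text (posting it on a PR is left to the reader). The comment strings themselves are copied from the tables above.

```python
# Hypothetical helper: maps a test type and platform to the @swift-ci
# trigger comment listed in the tables of this document.
TRIGGERS = {
    "smoke": "smoke test",
    "validation": "test",
    "benchmark": "benchmark",
    "python-lint": "Python lint",
}

# Platform suffixes are documented for smoke and validation testing only.
PLATFORM_SUFFIXES = {
    "all": "",
    "osx": " OS X platform",
    "linux": " Linux platform",
}


def trigger_comment(kind, platform="all", merge=False):
    """Return the PR comment that requests the given kind of test."""
    if kind not in TRIGGERS:
        raise ValueError(f"unknown test type: {kind!r}")
    if platform not in PLATFORM_SUFFIXES:
        raise ValueError(f"unknown platform: {platform!r}")
    per_platform = kind in ("smoke", "validation")
    if platform != "all" and not per_platform:
        raise ValueError("platform comments are only listed for smoke and validation tests")
    if merge and (not per_platform or platform != "all"):
        raise ValueError("'and merge' is only listed for all-platform smoke and validation tests")
    comment = f"@swift-ci Please {TRIGGERS[kind]}"
    if merge:
        comment += " and merge"
    return comment + PLATFORM_SUFFIXES[platform]


def minimum_test_kind(simulators_impacted):
    """Apply the general rule: smoke test unless simulators may be impacted."""
    return "validation" if simulators_impacted else "smoke"
```

Calling `trigger_comment(minimum_test_kind(False))` builds the all-platform smoke test comment, and passing `merge=True` appends the "and merge" variant. Combinations that the tables do not list raise `ValueError` rather than guessing at a comment the bot may not recognize.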

Cross Repository Testing

Currently @swift-ci pull request testing only supports testing changes against individual repositories. This will most likely be addressed in the future. In the meantime, please use the following workflow for cross repository testing:

Make sure that all repos have been checked out:
./swift/utils/update-checkout --clone
On Darwin and Linux run:
./swift/utils/build-toolchain local.swift

If everything passes, a .tar.gz package file will be produced in the current directory (.).

Create a separate PR for each repository that needs to be changed. Each PR should reference the main Swift PR, and the main PR should reference all of the others.
Gate all commits on @swift-ci smoke test and merge. As stated above, it is important that all check-ins go through PR testing, since PR testing becomes less effective once breakage enters the tree. If you have tested locally (using build-toolchain) and have made the appropriate changes to the other repositories, then a smoke test and merge should be sufficient. This step is not meant to check the correctness of your commits, but rather to make sure that no one has landed changes in the other repositories or in swift that cause your PR to no longer be correct. If you were unable to make workarounds to the other repositories, this smoke test will break after Swift has built. Check the log to make sure that the failure is the expected one for that platform/repository, that is, the failure your PR is supposed to fix.
Merge all of the pull requests simultaneously.
Watch the public incremental build on ci.swift.org to make sure that you did not make any mistakes. It should complete within 30-40 minutes, depending on what else was committed in the meantime.
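
The local part of this workflow (the checkout and toolchain build steps) can be scripted. The following is a minimal Python sketch, assuming it is run from the directory that contains the swift checkout. The function name and the dry-run behavior are hypothetical conveniences; the two commands are exactly the ones shown above.

```python
import subprocess

# The two local commands from the workflow above, as argument lists.
LOCAL_STEPS = [
    ["./swift/utils/update-checkout", "--clone"],
    ["./swift/utils/build-toolchain", "local.swift"],
]


def run_local_steps(dry_run=True):
    """Print each step, then run it unless dry_run is set.

    Returns the list of steps that were processed.
    """
    steps = []
    for argv in LOCAL_STEPS:
        print("+ " + " ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status,
            # so build-toolchain never runs after a failed checkout.
            subprocess.run(argv, check=True)
        steps.append(argv)
    return steps
```

Calling `run_local_steps()` only prints the commands; `run_local_steps(dry_run=False)` executes them in order and stops at the first failure. The remaining steps (opening PRs, gating on @swift-ci, merging, and watching ci.swift.org) are manual and are not covered by this sketch.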

ci.swift.org bots

FIXME: FILL ME IN!

[1] Even though it should go without saying, a green build is important because:

A full build break can prevent other developers from testing their work.
A test break can make it difficult for developers to know whether or not their specific commit has broken a test, requiring them to perform an initial clean build, wasting time.
@swift-ci pull request testing becomes less effective, since one cannot perform a test and merge and must instead reason about the source of a given failure.

[2] This is due to unrelated issues relating to running lldb tests on macOS.