Automate testing of Infusion PRs

Description

It would be great if Infusion PRs could be tested by CI. GPII's CI automation (and automated PR testing) hasn't been replicated for Fluid due to Jenkins limitations, namely:

  • An unmaintained Jenkins PR builder plugin

  • Buggy multijob plugin that doesn't honour timeout values

  • A vast world of pain that is Jenkins plugins management

We've looked into GitLab CI, which addresses the issues above and also provides nice features such as pipelines, simpler YAML job configuration files, pull-based CI agents, and a more pleasant UI. The main disadvantage of this option is that GitLab's CI is packaged as part of a larger offering meant to replace GitHub. GitLab provides only very limited GitHub repository mirroring, which is triggered on an hourly basis. GitHub PRs can't be tested using just their software – more information is logged in the GPII-2473 issue.

Recently I came across Buildkite, a CI service that addresses all of the Jenkins and GitLab shortcomings. They provide free accounts to open source projects so I set up a test Fluid organization. You will need to create an account and be invited to the org to see CI results. Here is an example of an Infusion fork's PRs and branch merges being tested:

The config file used to test the above:

The disadvantages of using Buildkite are:

  • Proprietary service which can't be self-hosted – although hosting Jenkins or GitLab involves considerable ongoing maintenance costs in terms of security issues, testing new releases, support, etc.

  • Job failures and successes are shown in the GitHub UI but Buildkite accounts are required to view detailed CI results

  • I don't think PR jobs can be manually triggered using a confirmation string such as "ok to test" but I've contacted their support team to verify

  • They don't provide a contributor whitelist feature so anyone can open a PR, potentially containing malicious code, and trigger CI jobs

There might be a workaround for the last point. When Michelle sent a test PR, a BUILDKITE_BUILD_CREATOR="michelled" environment variable was available to the Buildkite agent. A list of trusted GitHub account names could be maintained and checked against before running any jobs on the agent.
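A minimal sketch of that check in Python, assuming the agent exposes BUILDKITE_BUILD_CREATOR as described above. The TRUSTED_CREATORS set and the function names are hypothetical; only "michelled" comes from the test PR mentioned in this issue.

```python
import os

# Hypothetical trusted list; only "michelled" is taken from this issue.
TRUSTED_CREATORS = frozenset({"michelled"})


def is_trusted_creator(environ, trusted=TRUSTED_CREATORS):
    """Return True when BUILDKITE_BUILD_CREATOR names a trusted account."""
    creator = environ.get("BUILDKITE_BUILD_CREATOR", "").strip()
    # An unset or empty variable is treated as untrusted.
    return creator != "" and creator in trusted


def exit_code_for(environ, trusted=TRUSTED_CREATORS):
    """Map the check to a process exit code: 0 to proceed, 1 to refuse."""
    return 0 if is_trusted_creator(environ, trusted) else 1
```

A hook or wrapper script on the agent could call `sys.exit(exit_code_for(os.environ))` before any job commands run, so that a non-zero exit stops the job for unrecognized accounts.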

Environment

None

Activity

Justin Obara December 6, 2017 at 4:12 PM

For Buildkite to report the build results, the fluid-bot account needed to be given write access to the repository. This was done by creating a new "services" team with the correct permission and adding fluid-bot to it.

Colin Clark December 5, 2017 at 10:54 PM

I merged the pull request at 3d20742aa38ce60cd9a0167960d1f4cad43841d8 and have added the GitHub webhook for Buildkite. Anything else you need?

Avtar Gill December 4, 2017 at 9:03 PM

Here's a workaround that tackles the above:

https://github.com/avtar/unblock-buildkite-pr-job

Avtar Gill November 13, 2017 at 8:55 PM
Edited

I created a separate issue requesting a feature that will allow for the blocking of unrecognized contributors' pull requests.

Here's a summary of what I tried last week. A block step for pull request branches will pause CI jobs. Buildkite fires webhook events whenever builds take place for any project in an organization, and the webhook can be configured to fire only build.finished events. Then have the following handy:

  • Buildkite API token with read_builds and write_builds privileges

  • fluid-bot account name

  • fluid-bot's email address

Verify the incoming payload has the following properties:

  • event: build.finished

  • build.blocked: true (due to the fact a block step was used for the PR branch)

  • build.url: <whatever url is sent>
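The payload check above could be sketched roughly like this in Python. It assumes the webhook body has already been parsed from JSON into a dict with a top-level "event" key and a nested "build" object, matching the property names listed above; the function name is hypothetical.

```python
def blocked_build_url(payload):
    """Return build.url for a finished, blocked build, otherwise None."""
    if payload.get("event") != "build.finished":
        return None
    build = payload.get("build") or {}
    # Only builds paused by a block step are of interest.
    if build.get("blocked") is not True:
        return None
    return build.get("url") or None
```

Returning None for anything that doesn't match keeps the webhook handler simple: it can acknowledge the event and do nothing.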

Make a GET request using the build.url value and obtain the job that has the "state": "blocked" and "unblock_url" properties. Issuing a request such as the following, using the unblock_url value, will resume the CI job:
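As a rough sketch only, the two steps could look like this in Python using the standard library. It assumes the build JSON contains a "jobs" list, that the unblock endpoint accepts a PUT with a bearer token, and that an empty "fields" object is an acceptable body; the function names are hypothetical and should be checked against Buildkite's API documentation.

```python
import json
import urllib.request


def find_unblock_url(build):
    """Return the unblock_url of the first blocked job in a build, or None."""
    for job in build.get("jobs", []):
        if job.get("state") == "blocked" and job.get("unblock_url"):
            return job["unblock_url"]
    return None


def make_unblock_request(unblock_url, api_token):
    """Build (but do not send) a PUT request that would unblock the job."""
    # Assumption: an empty "fields" object is enough for a plain block step.
    body = json.dumps({"fields": {}}).encode("utf-8")
    return urllib.request.Request(
        unblock_url,
        data=body,
        method="PUT",
        headers={
            "Authorization": "Bearer " + api_token,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; building and sending are kept separate here so the request can be inspected first.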
 

It's still a workaround until Buildkite implements a feature that makes processing pull request CI jobs less risky, but at least the advantage of doing it this way for now is that the webhook only needs to be set up in one location, in the Buildkite organization's settings area.

Avtar Gill November 9, 2017 at 3:04 PM
Edited

> In the meantime is there anything else we could do?

I would add your +1s to that issue

> I wonder if we have access to the pull request event and if we can check for anything ourselves?

We could have a proxy service but it will start looking like GPII-2473 again, albeit this GitHub -> Buildkite proxy would need to do less work than what's mentioned in the GPII issue. Two webhook URLs per repository would need to be configured – the first one provided by Buildkite (for non-PR branches) and the second pointing to the proxy. The proxy would handle the GitHub account verification. If a recognized account was used to issue the PR then it would start a build on the Buildkite side. If an unrecognized account was used then it would ask for confirmation using PR comments before starting a build. Seems clunky and brings us back to the scenario where we need to maintain this proxy and host additional infrastructure but so far all the CI options on the table seem to have their own clunky attributes.
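To make the proxy's decision concrete, here is a minimal Python sketch. It assumes the proxy receives GitHub pull_request webhook payloads, where the author's account name is at pull_request.user.login; the function name and the returned action strings are hypothetical.

```python
def decide_action(payload, trusted_accounts):
    """Decide what the proxy should do with a GitHub webhook payload."""
    pull_request = payload.get("pull_request")
    if not isinstance(pull_request, dict):
        # Not a pull request event; the proxy has nothing to do.
        return "ignore"
    login = (pull_request.get("user") or {}).get("login")
    if login in trusted_accounts:
        return "start_build"
    # Unrecognized account: wait for a confirmation comment on the PR.
    return "ask_for_confirmation"
```

Starting the build and posting the confirmation request would still need to be implemented and hosted, which is the maintenance cost described above.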

Fixed

Created November 2, 2017 at 11:25 PM
Updated July 22, 2024 at 2:35 PM
Resolved December 5, 2017 at 10:54 PM