Pipeline as a Service

Site: EOSC-Synergy
Course: Software Quality Assurance as a Service
Book: Pipeline as a Service
Printed by: Guest user
Date: Thursday, 21 November 2024, 5:14 PM

1. Intro

The Pipeline as a Service module offers developers a graphical means to compose workflows, also known as CI/CD pipelines, that carry out software verification and validation (V&V) processes in an automated fashion. CI/CD pipelines are therefore associated with code repositories, since the code or documentation (docs-as-code) is the main target of the quality checks defined in them.

Bringing over good practices to research

Through the web-based composition of CI/CD pipelines, the SQAaaS aims at bringing their proven benefits to researchers who develop their own software solutions, without the need to delve into the nitty-gritty of current technology offerings, such as Jenkins, GitHub Actions or Travis CI (to mention just a few), which can often be a daunting task.

With this approach, individuals and communities within the research ecosystem will be more aware of the positive impact that relying on CI/CD pipelines brings in the short and long term when it comes to managing the software development life cycle.

What can the Pipeline as a Service do for you?

1. Pipeline composition & sharing

The main feature and objective of the Pipeline as a Service module is to compose on-demand, ready-to-use CI/CD pipelines based on the user's input. The web interface then provides two means to distribute the resultant pipeline so it can be successfully added to the target code repository:

  1. Downloading the pipeline as a ZIP file, which the developer needs to manually extract and push to the desired repository.
  2. Through a GitHub pull request. This option automatically creates a pull request against a given target repository; by simply merging it, the pipeline will be added to your code repository.

2. Pipeline testing & refactoring

Being able to test the brand-new pipeline is a helpful feature to secure its operation once it is added to the code repository. Any failure or unintended behaviour can be worked out by refactoring the stages that the pipeline is broken down into.

The process of making up a CI/CD pipeline

The Pipeline as a Service guides you through a series of steps where the settings for the main actors in a CI/CD pipeline can be filled in, in particular:

  1. The repositories, where the code and/or documentation lies.
  2. The services that will take part in the software V&V.
  3. The criteria, which groups the checks into specific software quality-related criteria. Each criterion can have multiple checks, and one check represents a different stage in the pipeline.

    There are constraints with regard to the supported technologies and standards for the pipeline actors above (check out the pipeline characteristics section for additional details).

The last step (step 4) provides a summary of the resultant pipeline and access to the testing-refactoring-sharing features described in the previous section.

Characteristics of the pipelines

  1. CI/CD pipelines can be placed either in the same repository as the code or documentation, or in a separate repository. The first option is the recommended approach since the pipeline will react promptly to the repository events (e.g. push & pull operations) and be triggered automatically without the need of additional customization.
  2. CI/CD pipelines are technology-specific1:
    • Git is the de facto tool for source code and documentation version control, and thus the solution adopted by the SQAaaS. This means that only Git-based code repositories are supported.
    • Jenkins Pipeline as Code is the underlying technology of the CI/CD pipelines composed through the current SQAaaS module. This means that a Jenkins CI service is always required for the CI/CD pipelines to be executed.
    • Docker Compose is the container orchestration engine used to deploy the services that take part in the CI/CD work. This means that the Jenkins CI service shall be configured to support Docker Compose agents.

  1. For certain features, such as pipeline execution and sharing (via pull requests), the current version of the Pipeline as a Service module only supports the GitHub platform, through the GitHub API. This is not an issue from the end user's perspective, as the resultant CI/CD pipelines can be used in any Git environment or social coding platform other than GitHub, such as GitLab.↩︎

2. The 2-step process

Composing your CI/CD pipeline

The Pipeline as a Service will guide you through a 2-step process that will allow you to create ready-to-be-used CI/CD pipelines.

Naming your pipeline

As an initial requirement, the web interface will ask you to provide a name for the pipeline. It is mainly required for the internal operation of the SQAaaS (e.g. to create the associated code repository, or to be used as the job name in the CI system), so it is not crucial for the pipeline itself, but it is highly recommended to provide a meaningful name.

Once set, click on Create pipeline in order to start the process.

The 2-step process

  1. The Repositories
  2. The Quality Criteria

The Summary view

After completing the previous steps, the web interface will redirect you to the Summary view and show a popup message with the exit status of the pipeline creation process.

The Pipeline as a Service offers a number of features to visualize, share and try out your brand-new CI/CD pipeline. We explain them in the Summary section.

3. The Repositories

Changes in the code repositories are the source of all the work performed by the CI/CD pipelines. By reacting in an automated fashion to those changes, pipelines help developers (of code, documentation and/or any other plain-text based data) in the maintenance tasks.

There are two main approaches when it comes to linking CI/CD pipelines with code:

  1. Place your pipeline next to the code, i.e. in the same repository where the source code is handled.
  2. Place your pipeline in a separate repository.

Handling both code and the CI/CD pipeline in the same repository is the recommended approach (unless there are specific constraints within the software project), and it is indeed the default behavior within most CI solutions.

As a matter of fact, if your pipeline only needs to work with the existing code, this step can be skipped (this is the reason behind marking it as optional) since the contents will be fetched automatically.

Defining repositories

The Repository view lets you define any "external" repository (i.e. not the repo that contains the CI/CD pipeline) that shall be accessible during the pipeline execution. There are a number of reasons why you might want to do that, such as when the documentation (docs-as-code) or deployment files (Infrastructure as Code) are maintained outside the main repository.

First, enable the repository definition by clicking on the Yes checkbox. The + Add repository form will open up so that you can specify the i) URL and ii) branch of the repository to be fetched. You can add as many external repositories as you want.

Advanced options

The Repository view offers additional features for more complex scenarios: credentials and environment customization.

Credentials

Credentials (Add Credentials section) are used whenever the pipeline is required to access an external service that enforces authentication, such as private code repositories or push-permissions to container registries.

Since the current implementation of the SQAaaS uses the JePL library for the pipeline definition, which in turn relies on Jenkins CI, the [type of credentials are those supported by this latter technology](https://www.jenkins.io/doc/book/using/using-credentials/). As a result, the credential identifier has to be defined in Jenkins before being used in this section. This is a limitation we expect to solve in future versions.

Once the credentials form has been filled out, click on Add Credentials for the SQAaaS to track them.

Environment

The Customize Environment section allows you to set environment variables that will be accessible at runtime. You can add as many variables as you like by clicking on Add Env Var button.
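Variables defined in this section end up in the generated JePL configuration. As an illustrative sketch (the variable names are made up, and the exact placement within config.yml may vary across JePL versions):

```yaml
# Hypothetical excerpt of a generated config.yml:
# environment variables made available to the pipeline at runtime
environment:
  LANG: C.UTF-8
  MY_APP_PROFILE: testing
```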

4. The Criteria

This step is the essential part of the pipeline composition, since it is at this stage where the work to be carried out by the pipeline is defined. Hence, it is a mandatory step in order to successfully create a CI/CD pipeline with the SQAaaS platform.

The pipeline work is represented by quality criteria, which means that at least one criterion has to be defined. The SQAaaS is aligned with quality standards for software and services, as you will notice from the codes or IDs used in the main Criteria dropdown list.

A description of each criterion appears immediately after it is selected, helping you choose the most appropriate one for the planned work.

The SQAaaS does not validate that the work defined by the user corresponds to the criterion it has been associated with. Thus, the user is responsible for selecting the appropriate criterion for the task at hand.

A brief overview of the existing quality criteria

You will notice that the web interface takes some seconds to load the available criteria. This is because the corresponding metadata is loaded dynamically from a remote repository every time the Criteria view is accessed.

In the current version, only software-related and data FAIRness criteria are available. The following table summarizes the available criteria:

QC.Acc: Promotes the accessibility of the source code as a public resource
QC.Doc: Formulates the good-to-have documents associated with code, both in terms of covering the target audience and enabling external contributions
QC.FAIR: Catch-all criterion for data FAIR assessment. It encompasses all the potential FAIR indicators that can be validated by the associated tools
QC.Lic: Resolves the legal aspects of source code reuse through the presence of license files
QC.Met: Promotes the identification and credit of software in research publications
QC.Sec: Sets out the path for detecting insecure patterns when writing code
QC.Sty: Fosters readability of the code by following a style standard
QC.Uni: Refers to the type of tests to be performed to verify the code

Adding the tools

Based on the criterion selected, the SQAaaS supports a set of popular tools that can help you cover its purpose. These tools are accessible through the Tool selection section, in particular by clicking on the Choose a Tool dropdown. Each tool has its own set of arguments, both positional and optional, that can be used to refine the tool's work. When ready, click on the Add tool button in order to add it as part of the criterion validation.

You can add more than one tool per criterion, but all of them must be added when defining the given criterion: you cannot modify or edit the work of criteria that have already been defined.

Once all the required tools are added, be sure to click on Add Criterion so that the defined work is added to the current CI/CD pipeline.
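Under the hood, each added criterion becomes an entry in the generated config.yml. A minimal illustrative sketch follows (the repository name, container name and command are hypothetical, and the exact JePL schema may differ; check the generated files and the JePL documentation):

```yaml
# Hypothetical excerpt of a generated config.yml
sqa_criteria:
  qc_style:                 # criterion ID, mapped to one pipeline stage
    repos:
      my-app:               # repository defined in the config's repositories section
        container: python3-service   # tooling service from docker-compose.yml
        commands:
          - flake8 src/    # example style check run by the tool
```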

After adding a given criterion, you will notice that it is still selectable in the criteria dropdown list. If you redefine an existing criterion, the new definition takes precedence, overriding the previous one.

Running the tools with your own services

The supported tools are available in pre-defined container images, so there is no need to create them and/or define the service or container that will run them when the pipeline gets executed.

This feature eases the composition of the CI/CD pipeline, but may not be optimal for all use cases. For those, the SQAaaS platform supports the definition of custom services to execute the tools. This might be handy when you require a version of the tool that is incompatible with the default one, or when you also rely on other applications not available in the default container image, so that a custom environment must be created.

Do I need a custom service?

The main features of the pre-defined images are described in the form of labels that are displayed once the tool has been selected. They provide valuable information to make a decision about whether a new service (with a new image) is needed. Information includes the version of the tool and the name of the container image.

There are two fundamental ways to create a container or service definition, either using an existing container image from a container registry (pull), or building a custom image at runtime (build).

Pulling an image from a container registry

The image has to be already available in a Docker registry for this feature to work. By default, the pipeline will use Docker Hub. Follow the steps on "Docker image name syntax" to use a different registry.

Docker image name syntax

The syntax for the Image Name field follows the Docker syntax for image names. Note that:

  • In order to use a registry other than Docker Hub (registry-1.docker.io), you need to prefix the image name with that registry's hostname, such as myregistryhost:5000/fedora/httpd:version1.0.
  • You can use the values of the available environment variables to compose the image name. This includes the ones defined in Step 1 and also the ones exposed by the Jenkins plugins, such as the GIT_* variables from the Git plugin.

The following figure showcases the process of defining a tooling service:

When clicking on Add Service, the python3-service in the example will become available when defining the pipeline work in the subsequent Criteria step.

Building the image from a Dockerfile description

In some cases, a custom Docker image needs to be built from a Dockerfile present in the code repository. The required parameters differ slightly from the ones used when pulling an existing image from an external registry:

  • Dockerfile Location (required) shall contain the relative path (from the root of the repository) to the Dockerfile. No default value is set.
  • Build arguments (optional) contains a list of key-value items that will be provided to Docker at build time. It follows the Docker convention for build arguments.
  • Would you like to push the Docker image to a registry? (optional): if required, the built image can be pushed to a Docker registry. Credentials are needed to successfully perform this operation. As when accessing private repositories in Step 1, for the time being we only support credentials defined in the Jenkins service. However, there is a workaround to save time if you just want to test the push process out: using a catch-all credential that will push the resultant image to the EOSC-Synergy organization at Docker Hub.

If the Dockerfile is present at the root of the repository, you still need to set a value in Dockerfile Location, since both the context (directory name) and dockerfile (file name) values are derived from it.
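For instance, assuming a Dockerfile Location of docker/Dockerfile, the generated docker-compose.yml service might look like the following sketch (the service name, image tag and build argument are hypothetical):

```yaml
# Hypothetical excerpt of a generated docker-compose.yml
services:
  custom-service:
    image: "eoscsynergy/custom-service:latest"  # tag pushed if the push option is enabled
    build:
      context: ./docker        # directory part of Dockerfile Location
      dockerfile: Dockerfile   # file part of Dockerfile Location
      args:
        BASE_IMAGE: "python:3.9"   # example build argument
```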

Advanced options

Through the Advanced options section, the Tooling view offers a more detailed configuration of some of the Docker parameters:

  • Hostname is the equivalent of Docker Compose's hostname property, which sets the container hostname so it can be reached from other containers.
  • Volumes:
    • Volume Type refers to the type of the volume. Currently, only bind is supported.
    • Volume Source specifies the source path of the volume (for named volumes, the name of the volume shall be used).
    • Volume Target points to the destination path where the directory will be mounted in the container.

You can add as many volumes as you need. Remember to click on the Add Volume button for each defined volume.
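As an illustration, the hostname and a bind volume map onto Docker Compose properties along these lines (the service name and paths are hypothetical):

```yaml
# Hypothetical excerpt of a generated docker-compose.yml
services:
  tool-service:
    hostname: tool-service   # reachable under this name from other containers
    volumes:
      - type: bind           # currently the only supported volume type
        source: ./local-data # for named volumes, the volume name goes here
        target: /srv/data    # mount point inside the container
```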

5. The Summary

The Summary view provides a set of features that you can use once the CI/CD pipeline has been successfully created. They are described in the following sections.

Visualize

When first landing on the Summary view, the Config parameters section is displayed. It provides a simple view of the main configuration values filled in during the pipeline composition.

The pipeline itself can also be visualized by clicking on the JePL files button. As already mentioned, the SQAaaS platform uses the JePL library to compose a ready-to-be-used Jenkins pipeline, and thus the three main configuration files required by JePL are generated:

  • Jenkinsfile: this is the only configuration file required by Jenkins, but when using JePL its content is actually quite simple: it loads the required version of the JePL library and defines a unique stage, which will dynamically create at runtime the stages defined in config.yml.
  • config.yml: the main configuration file of JePL. It provides a more readable approach to configuring the pipeline (compared with the Jenkinsfile DSL). The main section is sqa_criteria, where the pipeline work is broken down into several stages mapped to criteria IDs. Check the JePL documentation for detailed info about the capabilities of the library.
  • docker-compose.yml: JePL leverages Docker Compose to deploy the required tooling services.
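As a reference, a JePL-based Jenkinsfile typically boils down to something like the following sketch (the library version shown is illustrative; check the generated file for the exact content):

```groovy
// Hypothetical JePL-based Jenkinsfile
@Library(['github.com/indigo-dc/jenkins-pipeline-library@release/2.1.0']) _

def projectConfig

pipeline {
    agent any
    stages {
        stage('SQA baseline dynamic stages') {
            steps {
                script {
                    // loads config.yml and creates one stage per defined criterion
                    projectConfig = pipelineConfig()
                    buildStages(projectConfig)
                }
            }
        }
    }
}
```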

Additionally, a series of bash scripts are generated when performing work on external repositories since the workspace changes.

Share

The ultimate goal of the Pipeline as a Service module is to generate working pipelines that can be readily used in the code repositories. The Summary view offers two ways to add your pipeline to the target repository:

Download, commit & push

The initial message that notifies the pipeline's creation success contains a Download button, which returns a ZIP file with the aforementioned JePL files. Hence, you just need to extract and add (commit & push) these files to your code repository, and the pipeline will be ready to be executed.

The pipeline will be automatically triggered with no further action only if an existing Jenkins CI service is already configured to scan this repository (e.g. when the repository belongs to a GitHub organisation that is monitored by a given Jenkins instance). Otherwise, you will need to set up a Jenkins service (or configure an existing one) for the pipeline to work.

Via GitHub's pull request

In the event that the target repository is hosted on GitHub, the most straightforward way to add your pipeline is the Pull request feature. You just need to provide the URL and target branch, and the pull request will be created automatically. Heading to GitHub and merging the pull request are the only remaining steps to make the new pipeline available in your repository.

Execute

It is good practice to run the CI/CD pipeline before adding it to the target repository, in order to detect any defect or misbehaviour. The Try out button will execute the pipeline in the default Jenkins instance used by the SQAaaS platform and, once completed, provide the pointers so you can review the outputs.

The Try out feature might ask you for the URL and branch of the target repository. This is required whenever the target repository has not been defined as part of the first step.

Making changes

The previous Execute feature provides important information about how our CI/CD pipeline will behave when running in our own code repositories. Therefore, making changes is a common task when creating a pipeline.

The most obvious way to modify the current pipeline is to go back to the Repositories and/or Criteria steps, change the introduced data, re-run the pipeline, and repeat this process until we get the expected behaviour or results.

Maintaining the "definitive" pipeline

Once you have a base pipeline created through the SQAaaS portal, it is quite easy to make small adjustments by editing the pipeline code directly, which, as already mentioned, uses the JePL library syntax. When more complex changes are needed, our recommendation is to create a new pipeline from scratch using the portal.

Pipelines are not preserved in the SQAaaS portal

If you do not download your pipeline or create a pull request for it before your session in the SQAaaS portal is closed, you are at risk of losing it. This is because the SQAaaS does not yet support loading existing pipelines.

However, there might be a chance to get it back. Try to access the generated pipeline repository.

Disclaimer: the alternative presented here might not work, since it depends on the presence of temporary code repositories.