Automated OS upgrades

This tutorial describes how to configure the system upgrade mechanism to trigger the OS upgrade on a designated day and time of the week and to orchestrate the upgrade among different hosts. In this tutorial we use an Ubuntu/Debian site as an example. Avassa provides OS upgrade agents for a number of distributions. For the latest list, see the README at https://gitlab.com/avassa-public/os-upgrade.

A process for performing OS upgrades by hand can be found in the Manual OS upgrade document.

To get an understanding of the OS upgrade architecture and mechanisms, read our OS upgrade fundamentals.

The overall procedure for using the Avassa OS upgrade functions is:

  • build the OS upgrade application for your system (we provide different upgrade applications for different OSes)
  • deploy the OS upgrade application to the corresponding hosts
  • configure OS upgrade windows
  • monitor the automatic OS upgrades

Building the OS upgrade application implemented by Avassa

To build the application for Debian/Ubuntu:

  1. Check out the git repository https://gitlab.com/avassa-public/os-upgrade or download the files mentioned below as needed.

  2. Create an approle and a corresponding approle policy. An approle is the authentication mechanism used by the applications to be able to use the Avassa APIs locally on the host they are running on.

    cat yaml/approle.yaml | supctl create strongbox authentication approles
    cat yaml/approle-policy.yaml | supctl create policy policies


  3. By default, an approle authentication mechanism requires two secret parts: one part that is built into the image and one part that is provided at runtime. The secret part to build into the image is called role-id and is generated when the approle is created. To look up the role-id, run:

    supctl show strongbox authentication approles os-upgrade --fields role-id
    Example output
    role-id: 4c6eb93b-4c3e-43ff-884c-273ff920ac24

    This value (without the role-id: prefix) can then be used to build a container image using the Dockerfile in the repository above, as follows. Make sure to replace the placeholder in the first command with the actual role-id obtained from the previous command.

    Also replace the <VERSION> value with the target version to build. latest is fine for development purposes, but locking the application specification to a specific version enables more controlled application upgrades.

    The list of published versions can be found at https://gitlab.com/avassa-public/os-upgrade/container_registry.

    # Store role-id for docker build
    echo '<replace me with role-id from above>' > role-id

    # Build image with role-id
    docker build \
      -t registry.environment.name.avassa.net/avassa/os-upgrade:<VERSION> \
      --secret id=approle,src=role-id \
      --build-arg IMAGE=registry.gitlab.com/avassa-public/os-upgrade \
      --build-arg VSN=<VERSION> \
      .
    docker push registry.environment.name.avassa.net/avassa/os-upgrade:<VERSION>
    note

    An alternative to this step is to edit the approle to set a fixed role-id (see developing-with-strongbox-edge#fixed-role-id). This parameter removes the requirement for the secret part built into the image, so the container images published by Avassa can be used directly without building your own image.

    This is not recommended: the OS upgrade workers have access to the operating system, so it is important to make sure you are running a validated image.

    By default, the application deployment matches the label os-type with value debian or rhel to determine the sites to which the application is deployed. Make sure to either label the sites accordingly or modify the application deployment to indicate the sites to which the worker application should be deployed.

    Create the application specification:

    name: os-upgrade-debian
    version: "<VERSION>"
    services:
      - name: worker
        mode: one-per-matching-host
        containers:
          - name: main
            image: avassa/os-upgrade:<VERSION>
            cmd: [ "os_upgrade.debian" ]
            approle: os-upgrade
            container-layer-size: 0B
            env:
              API_CA_CERT: ${SYS_API_CA_CERT}
              HOST: ${SYS_HOST}
              APPROLE_SECRET_ID: ${SYS_APPROLE_SECRET_ID}
              LOG_LEVEL: INFO
            mounts:
              - volume-name: systemd-socket
                mount-path: /var/run/dbus/system_bus_socket
              - volume-name: apt-conf-d
                mount-path: /data/apt.conf.d
            user-namespace:
              host: true
            security:
              apparmor:
                disabled: true ## apparmor prevents systemd socket access
        volumes:
          - name: systemd-socket
            system-volume:
              reference: systemd-socket
          - name: apt-conf-d
            config-map:
              items:
                - name: 90avassa
                  data-verbatim: |
                    // the reboot is handled by the os-upgrade handler
                    Unattended-Upgrade::Automatic-Reboot "false";
                    APT::Periodic::Update-Package-Lists "always";
                    APT::Periodic::Unattended-Upgrade "always";
                    Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
                    Unattended-Upgrade::Remove-New-Unused-Dependencies "true";

    The specification can also be generated from the template in the repository using sed:

    sed -e 's|%VERSION%|2024.4.0|' \
        -e 's|%IMAGE%|avassa/os-upgrade:2024.4.0|' \
        yaml/application.debian.yaml.in \
        > application.debian.yaml
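The sed substitution above can also be scripted. The following is a minimal Python sketch, not part of the Avassa tooling: the `render_template` helper and the two-line template fragment are hypothetical stand-ins for `yaml/application.debian.yaml.in`, and only the `%VERSION%` and `%IMAGE%` placeholders seen in the sed command are assumed. Unlike sed without the `g` flag, `str.replace` substitutes every occurrence.

```python
def render_template(text: str, values: dict[str, str]) -> str:
    """Replace each %KEY% placeholder in text with its value."""
    for key, value in values.items():
        text = text.replace(f"%{key}%", value)
    return text


# Hypothetical fragment standing in for yaml/application.debian.yaml.in.
template = 'version: "%VERSION%"\nimage: %IMAGE%\n'

version = "2024.4.0"
rendered = render_template(
    template,
    {"VERSION": version, "IMAGE": f"avassa/os-upgrade:{version}"},
)
print(rendered)
```

Deriving the image tag from the same `version` variable keeps the two values from drifting apart when the version is bumped.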

Deploy the application to corresponding hosts

With the above application specification, you can now deploy the application to the sites where you have Debian running. Assuming you have labelled those sites with os-type, you can use the deployment shown below:

name: os-upgrade-debian
application: os-upgrade-debian
placement:
  match-site-labels: os-type = debian
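To illustrate what the label match means, here is a minimal Python sketch that evaluates a single `key = value` expression against site labels. It is only an illustration under stated assumptions: the `matches` helper and the site names are hypothetical, and the real label expression language may support more than simple equality.

```python
def matches(expression: str, labels: dict[str, str]) -> bool:
    """Evaluate one `key = value` expression against a site's labels."""
    key, _, value = expression.partition("=")
    return labels.get(key.strip()) == value.strip()


# Hypothetical sites and their labels.
sites = {
    "stockholm": {"os-type": "debian"},
    "gothenburg": {"os-type": "rhel"},
    "lab": {},  # unlabelled site, never selected
}

selected = [
    name for name, labels in sites.items()
    if matches("os-type = debian", labels)
]
print(selected)
```

In this sketch an unlabelled site is never selected, which is why the sites need the os-type label before the deployment can place the worker application on them.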

OS upgrade configuration

Now, everything is in place to configure a maintenance window that will control when the OS is upgraded. This configuration must be created by a site provider tenant.

OS Upgrade Configuration
supctl create os-upgrade <<EOF
worker-applications:
  - name: os-upgrade-debian
maintenance-windows:
  - days-of-week: Friday, Saturday
    start-time: 01:00
    timezone: site-local
    duration: 4h
EOF

This configuration tells the system to run OS upgrades on any site that has the os-upgrade-debian application deployed to at least one host. Upgrades are initiated each week on Friday and Saturday at 01:00 site-local time and must not take longer than 4 hours. The controller assumes that every host running a service instance belonging to the os-upgrade-debian application will receive the commands issued as part of the OS upgrade process.
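The window semantics described above can be sketched in Python. This is an illustration only and not how the controller is implemented: `next_window_start` is a hypothetical helper, and `now` is assumed to be a naive datetime already expressed in site-local time.

```python
from datetime import datetime, time, timedelta

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def next_window_start(now: datetime, days_of_week: str, start_time: str) -> datetime:
    """Return the first window start at or after now (site-local time)."""
    wanted = {DAYS.index(day.strip()) for day in days_of_week.split(",")}
    start = time.fromisoformat(start_time)
    # Eight candidates cover the case where today's window has already started.
    for offset in range(8):
        candidate = datetime.combine(now.date() + timedelta(days=offset), start)
        if candidate.weekday() in wanted and candidate >= now:
            return candidate
    raise ValueError("no matching day of week")


now = datetime(2023, 4, 20, 12, 0)  # a Thursday, site-local
start = next_window_start(now, "Friday, Saturday", "01:00")
deadline = start + timedelta(hours=4)  # duration: 4h
print(start, deadline)
```

With the values from the configuration, a Thursday noon timestamp yields a window that opens on Friday at 01:00 and must be finished by 05:00.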

Inspect the OS upgrade status on a specific site

To check whether the OS upgrade mechanism is configured and running as expected on a site named site-name, use the following command:

supctl show -s site-name os-upgrade

Example output:

worker-applications:
  - name: os-upgrade-debian
  - name: os-upgrade-rhel
maintenance-windows:
  - days-of-week: Friday, Saturday
    start-time: 01:00
    timezone: site-local
    duration: 4h
status: idle
next-upgrade-in: 1d4h18s
scheduled-workers:
  - host: host01
    application: os-upgrade-debian
  - host: host02
    application: os-upgrade-debian
  - host: host03
    application: os-upgrade-debian

This tells us that the OS upgrade is currently idle (no upgrade in progress) and that the next OS upgrade is scheduled to start in 1 day, 4 hours and 18 seconds. If the upgrade were to start now, hosts host01, host02 and host03 would be included, because each of them is running an instance of a service from the os-upgrade-debian worker application. A host that is not in this list is assumed to have its OS upgrades managed externally, so an upgrade can still succeed even if not all hosts on the site are managed.
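When post-processing this output in a script, duration values such as next-upgrade-in need to be converted into something comparable. The sketch below assumes the unit letters d, h, m and s, which is inferred only from the examples in this tutorial (1d4h18s, 3h40m17s, 4h); `parse_duration` is a hypothetical helper, not an Avassa API.

```python
import re
from datetime import timedelta

# Unit letters assumed from the example outputs in this tutorial.
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '1d4h18s' or '3h40m17s' into a timedelta."""
    parts = re.findall(r"(\d+)([dhms])", text)
    # Reject input with characters the pattern did not consume.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return timedelta(**{_UNITS[unit]: int(number) for number, unit in parts})


next_upgrade_in = parse_duration("1d4h18s")
print(next_upgrade_in.total_seconds())
```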

A different example output shows the upgrade in progress:

worker-applications:
  - name: os-upgrade-debian
  - name: os-upgrade-rhel
maintenance-windows:
  - days-of-week: Friday, Saturday
    start-time: 01:00
    timezone: site-local
    duration: 4h
status: in-progress
scheduled-workers:
  - host: host01
    application: os-upgrade-debian
  - host: host02
    application: os-upgrade-debian
  - host: host03
    application: os-upgrade-debian
last-upgrade-info:
  start-time: 2023-04-20T01:00:00Z
  timeout-in: 3h40m17s
  hosts:
    - hostname: host01
      status: scheduled
    - hostname: host02
      status: prepared
    - hostname: host03
      status: upgrading

This example shows an upgrade in progress. It started at the specified start-time and will time out if it is not completed within the next 3 hours, 40 minutes and 17 seconds (counted from the time the output was generated). Three hosts are part of the upgrade: host01 has not yet replied to the prepare command; host02 has completed the prepare phase and is awaiting its turn to upgrade; host03 has completed the prepare phase and has been issued the upgrade command, to which it has not yet replied, so the upgrade is ongoing on this host.

When the upgrade is completed (or aborted due to timeout or failure), the last-upgrade-info shows the status of the latest upgrade.
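The per-host statuses described above can be grouped in a script for a quick overview. The sketch below is a hand-written illustration: the `hosts` list copies the example output, `group_by_status` is a hypothetical helper, and only the three status values that appear in the example are covered.

```python
# Hand-written copy of the hosts list from the in-progress example output.
hosts = [
    {"hostname": "host01", "status": "scheduled"},
    {"hostname": "host02", "status": "prepared"},
    {"hostname": "host03", "status": "upgrading"},
]

# Meaning of each status as explained in the text; other values may exist.
MEANING = {
    "scheduled": "has not yet replied to the prepare command",
    "prepared": "prepare phase done, awaiting its turn to upgrade",
    "upgrading": "upgrade command issued, no reply yet",
}


def group_by_status(hosts: list[dict[str, str]]) -> dict[str, list[str]]:
    """Group hostnames by their reported upgrade status."""
    grouped: dict[str, list[str]] = {}
    for host in hosts:
        grouped.setdefault(host["status"], []).append(host["hostname"])
    return grouped


grouped = group_by_status(hosts)
for status, names in grouped.items():
    print(f"{status}: {', '.join(names)} ({MEANING.get(status, 'unknown status')})")
```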

Inspect the software versions as reported by the worker applications

The worker application may detect the version of the OS or the versions of the packages running on the host. The versions are reported each time the worker notices a change and are stored by the controller. To inspect the latest versions reported by the workers on a specific site:

supctl show -s site-name os-upgrade hosts

Example output:

- hostname: host01
  timestamp: 2023-04-17T01:12:24Z
  versions:
    linux-image: 5.15.83-ubuntu0
    docker: 20.10.1
- hostname: host02
  timestamp: 2023-04-22T10:14:42Z
  versions:
    linux-image: 5.15.94-ubuntu0
    docker: 20.10.4
- hostname: host03
  timestamp: 2023-04-22T10:16:33Z
  versions:
    linux-image: 5.15.94-ubuntu0
    docker: 20.10.4
note

The versions field is a key-value mapping published by each worker, so the actual package or other key names and the corresponding reported versions are defined by the worker implementation.

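For example, the kernel version reported for a host can be seen in the UI before and after an upgrade.

[Screenshot: kernel version in the UI before upgrade]

[Screenshot: kernel version in the UI after upgrade]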
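The reported versions can also be compared across hosts in a script, for instance to spot a host that missed an upgrade. This is a minimal sketch: the `reports` list is a hand-written copy of the example output above, and `hosts_differing` is a hypothetical helper that treats the most common value as the reference, which is an assumption and not something the controller does.

```python
from collections import Counter

# Hand-written copy of the example output of `supctl show -s site-name os-upgrade hosts`.
reports = [
    {"hostname": "host01",
     "versions": {"linux-image": "5.15.83-ubuntu0", "docker": "20.10.1"}},
    {"hostname": "host02",
     "versions": {"linux-image": "5.15.94-ubuntu0", "docker": "20.10.4"}},
    {"hostname": "host03",
     "versions": {"linux-image": "5.15.94-ubuntu0", "docker": "20.10.4"}},
]


def hosts_differing(reports: list[dict], key: str) -> list[str]:
    """Return hostnames whose value for key differs from the most common one."""
    if not reports:
        return []
    counts = Counter(report["versions"].get(key) for report in reports)
    reference, _ = counts.most_common(1)[0]
    return [
        report["hostname"]
        for report in reports
        if report["versions"].get(key) != reference
    ]


lagging = hosts_differing(reports, "linux-image")
print(lagging)
```

Because the keys in versions are defined by the worker implementation, the key name is passed as a parameter instead of being hard-coded.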