Planning for disconnected sites
Many edge deployments must handle scenarios where sites are disconnected from the Control Tower for varying durations, whether planned or unplanned. As long as at least one host in a site has a connection to the Control Tower, the entire site can be fully managed, and is therefore considered connected. Consequently, a disconnected site is one in which no host has a connection to the Control Tower. Avassa is designed to support intermittent connectivity. The Edge Enforcer operates autonomously, with all required artifacts stored and replicated locally at the site. This ensures that control actions can be executed at the edge, independent of the Control Tower.
A key architectural feature is the loose coupling between Control Tower and Edge Enforcer. Communication is handled via a bidirectional pub/sub bus, allowing messages to be buffered both centrally and at the edge during disconnections. Upon reconnection, these messages are synchronized seamlessly.
The platform also supports deployment and upgrade workflows during disconnected periods, including image housekeeping and configuration of pending upgrades.
This guide helps you configure and plan your edge deployment for relevant disconnected scenarios, including:
- Configuring site disconnected behavior
- Managing upgrade sequences
- Certificate management
- User management
- Monitoring disconnected sites
- Local operations at edge sites
- Local unseal at restart
Note that these features are available to the site provider role. An application owner cannot access these settings or see site-disconnected alerts. (An application owner can still inspect the system:events topic to see site-connected and site-disconnected events.)
Configuring site disconnected behavior
Edge sites may have different connectivity profiles. Some may have stable connections where any disconnect is an incident, while others may experience frequent, expected disconnects.
You can configure each site’s behavior when disconnected using the when-disconnected property. Supported values:
- treat-as-normal: Temporary disconnects are acceptable; the site is expected to be connected during normal operation. This is the default behavior.
- treat-as-expected: Disconnects are normal and not alerted; the site may be non-operational for periods, but deployments continue.
- treat-as-error: Connectivity is required; any disconnect triggers an alert.
When an application, or a new version of an application, is deployed to a site that is offline, the deployment by default waits for the site to come back online and then deploys the application. However, if a site is known to be offline for longer periods, this behavior is not ideal. By setting the site's when-disconnected property to treat-as-expected, the deployment continues even if the site is offline.
The following table describes what happens when a site is disconnected. It shows if an alert is generated when a site disconnects, and what happens if a new application version is deployed when the site is disconnected.
| when-disconnected | Alert | Application deployment status |
|---|---|---|
| treat-as-normal | no | Deployment remains in the deploying state. The UI reflects this in the deployment status (example: one site disconnected). |
| treat-as-expected | no | Deployment continues and the disconnected site is (temporarily) skipped. If there are no other issues, the deployment reaches the deployed state. When the site reconnects, deployment of the currently deployed version is triggered automatically. No queue of outstanding application version deployments is built up in the Control Tower. The UI reflects this in the deployment status (example: one site disconnected). |
| treat-as-error | yes | Deployment remains in the deploying state. Any subsequent deployments are queued in the Control Tower and processed in order at reconnect. The UI reflects this in the deployment status (example: one site disconnected). |
Consider:
- Should disconnected sites trigger alerts?
- Should disconnected sites be considered an issue for deployments?
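To make the table easier to reason about when planning, the three modes can be written down as a small lookup. This is only a sketch of the documented behavior: `Outcome`, `OUTCOMES`, and `outcome_for` are illustrative names, not part of any Avassa API.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    alert: bool            # is an alert raised when the site disconnects?
    deployment_state: str  # state of a new deployment while the site is offline


# Mirrors the table above. "deployed" for treat-as-expected assumes there are
# no other issues with the deployment.
OUTCOMES = {
    "treat-as-normal": Outcome(alert=False, deployment_state="deploying"),
    "treat-as-expected": Outcome(alert=False, deployment_state="deployed"),
    "treat-as-error": Outcome(alert=True, deployment_state="deploying"),
}


def outcome_for(when_disconnected: str = "treat-as-normal") -> Outcome:
    """Look up the documented behavior; treat-as-normal is the default."""
    try:
        return OUTCOMES[when_disconnected]
    except KeyError:
        raise ValueError(
            f"unsupported when-disconnected value: {when_disconnected!r}"
        ) from None
```

For example, `outcome_for("treat-as-error").alert` is `True`, which answers the first question above for sites where connectivity is required.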
Monitoring disconnected sites
Sites that remain disconnected for extended periods can be overlooked. Use the Control Tower UI to filter and sort sites by connection status and duration. For troubleshooting, inspect connect/disconnect events in the site view.
You can also analyze the Volga topic system:events for site-connected and site-disconnected events.
To list disconnected sites using supctl:
supctl show system sites --where="connection-state/connected='false'" --fields=name
If when-disconnected is set to treat-as-error, alerts will be generated for disconnected sites.
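If you export the site-connected and site-disconnected events for offline analysis, a small script can report which sites are currently disconnected and for how long. The sketch below assumes each event has already been reduced to a `(timestamp, site, kind)` tuple; that record shape, the site names, and the function name are hypothetical, and the actual event payload on system:events may look different.

```python
from datetime import datetime


def disconnected_since(events):
    """Return {site: time of latest disconnect} for sites whose most
    recent event is site-disconnected."""
    last = {}
    # Process events in timestamp order so the last one per site wins.
    for ts, site, kind in sorted(events):
        last[site] = (kind, ts)
    return {
        site: ts
        for site, (kind, ts) in last.items()
        if kind == "site-disconnected"
    }


# Hypothetical sample data.
events = [
    (datetime(2024, 1, 1, 8, 0), "stockholm", "site-disconnected"),
    (datetime(2024, 1, 1, 9, 0), "stockholm", "site-connected"),
    (datetime(2024, 1, 1, 10, 0), "gothenburg", "site-disconnected"),
]
now = datetime(2024, 1, 1, 12, 0)
for site, since in disconnected_since(events).items():
    print(site, "disconnected for", now - since)
```

Only the most recent event per site matters here, so a site that reconnected after an earlier outage is not reported.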
Configuring upgrade sequences
If your deployment pipeline releases updates regularly (e.g., bi-weekly), sites disconnected for extended periods may accumulate multiple pending upgrades. By default, these upgrades are applied sequentially upon reconnection.
You can optimize this process by allowing the deployment to skip versions. Configure this using the upgrade-from field.
Here's an example of a simple application that supports such upgrades:
name: theater-room-manager
version: "2.2"
services:
  - name: theater-operations
    mode: replicated
    replicas: 3
    containers:
      - name: digital-assets-manager
        image: ...
upgrade-from:
  - version-regexp: "."
    method: per-service
    services:
      - name: theater-operations
        instances-in-parallel: 1
        healthy-time: 30s
See also application upgrades.
Best practice: Ensure your application supports upgrades from any previous version. This makes an upgrade after a long disconnect faster, since intermediate versions can be skipped.
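The effect of skipping versions can be illustrated with a short planning sketch. It is a simplified model, not Avassa's implementation: `plan_upgrades` is a hypothetical helper, and matching a version-regexp with `re.search` is an assumption about the matching semantics.

```python
import re


def plan_upgrades(current, pending, upgrade_from_regexps):
    """Return the versions to apply after reconnect.

    current: version running at the site.
    pending: versions released while the site was offline, oldest first.
    upgrade_from_regexps: version-regexp values declared by the newest
        pending version (assumed to be matched with re.search).
    """
    if not pending:
        return []
    # If the newest version accepts an upgrade from the running version,
    # intermediate versions can be skipped.
    if any(re.search(rx, current) for rx in upgrade_from_regexps):
        return [pending[-1]]
    # Otherwise fall back to applying every pending version in order.
    return list(pending)
```

With the example specification above, `plan_upgrades("2.0", ["2.1", "2.2"], ["."])` yields only `["2.2"]`, whereas an empty list of regexps yields both versions in sequence.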
Certificate management
Avassa uses site-local certificates for security. Certificates have a default TTL and are auto-rotated. Sites should not remain disconnected longer than the certificate TTL or the scheduled rotation period.
Set the offline-grace-period in the system settings to match your expected maximum disconnected duration. Certificate defaults are derived from this value.
Monitor certificate expiration using supctl or the Control Tower UI, which provides alerts for certificates nearing expiration.
If a site misses certificate renewal due to extended disconnection, Avassa provides a built-in recovery mechanism.
See certificate management for disconnected scenarios for detailed instructions on these topics.
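As a planning aid, you can flag sites whose time offline is approaching the configured grace period before certificates become a problem. The following is a minimal sketch under stated assumptions: the input mapping, the `warn_fraction` threshold, and the function name are hypothetical, and the sketch does not model how certificate TTLs are derived from offline-grace-period.

```python
from datetime import datetime, timedelta


def sites_at_risk(disconnected_since, now, offline_grace_period,
                  warn_fraction=0.8):
    """Return sorted names of sites that have been offline for at least
    warn_fraction of the grace period."""
    threshold = offline_grace_period * warn_fraction
    return sorted(
        site
        for site, since in disconnected_since.items()
        if now - since >= threshold
    )
```

For instance, with a 10-day grace period and the default threshold, a site that has been offline for 9 days is reported while one that has been offline for 2 days is not.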
User management
When operating sites via Control Tower, authentication is handled centrally (OIDC or local users in Strongbox). For local operations during outages, ensure site-local authentication is enabled.
Create Strongbox users and distribute them to the relevant sites:

This allows local users to authenticate and use supctl and APIs even when disconnected.
Edge site local operations
While disconnected, users can perform local operations using supctl and APIs.
Consider deploying a site-local custom web UI for simplified management, such as the Site Admin UI example.
Local configuration changes made during disconnection are flagged. Upon reconnection, the central operations team can review and choose to keep local changes or overwrite with central configuration.
Read more: disconnected site operations
Site local unseal
If sites may restart without connectivity to the Control Tower, consider allowing local unseal.
Note that this setting must be configured before the first host in a site is connected. After that, it cannot be changed.