Avassa is an application management platform for distributed container applications. The platform targets use cases where application instances are deployed in many locations and where the user requires precise control of application placement.
This document contains:
- An explanation of why and how this distributed computational model – often referred to as edge computing or edge clouds – differs from today's mainstream public clouds
- An overview of the Avassa architecture
- A summary of the functionality in Avassa, mainly from a developer and DevOps point of view
The importance of geographically distributed computations is a consequence of the recent snowballing growth of decentralized data and its implications for security, privacy, performance, and scalability. Data processing often needs to be performed locally, rather than first moving the data to centralized applications. Local data processing often includes data filtering or transformation.
We call this distributed model the distributed edge cloud, in contrast to the traditional public cloud, which is centralized. The distributed edge cloud is not a replacement for the public cloud – it is an extension because the public cloud will integrate with data and computed insights from applications located on the distributed edge.
We can summarize the differences between public cloud and distributed edge cloud as follows:
The public cloud model assumes:
- Many computers in a few very large datacenters
- Fast and cheap networking inside the datacenters
- Tools designed for optimizing resource utilization in the datacenters
- Location is not essential and is generally abstracted from the user
The distributed edge cloud assumes:
- Few computers in many locations
- Applications and data distributed across the locations
- Intermittent networking between the locations
- The user explicitly controls the location and placement of edge computations
Public clouds are managed services that provide virtual and elastic resources, secure networking, and scalable infrastructure. The main reasons for using public clouds are simplified operations and application development.
The main reason for using distributed edge clouds, in contrast, is the rapid growth of decentralized data and the associated security, privacy, performance, and scalability requirements that do not exist in public clouds.
The Avassa platform contains two software components:
The Avassa Control Tower provides central management of distributed edge resources and containerized applications through user interfaces and APIs. It is available as a service or can be installed in the user's private datacenter.
The Avassa Edge Enforcer is a software agent installed on all hosts in all edge sites. It provides zero-touch host registration functions, local cluster management, application placement and scheduling, and a local container registry. In addition, the Avassa Edge Enforcer provides local APIs for secrets management and distributed event streaming for applications that require such services.
Deep multi-tenancy is the fundamental enabler for allowing tenants to securely deploy applications across shared resources and make sure the applications run in isolation. Tenants can be provided with both dedicated and shared resources to form their unique view of the edge environment. Avassa is a multi-tenant system, where all components support strong isolation between tenants. No data is shared between tenants, and tenant-specific network traffic is always encrypted. The environment delivers fine-grained resource governance and enforces limits at the container level for CPU, memory, and storage, for both shared and tenant-specific deployment models.
The Edge Enforcer provides a high degree of autonomous execution during control plane and connectivity outages. Control plane communication between Control Tower and Edge Enforcer is buffered until communication is restored, and the same behaviour is available for applications using site-local APIs. In these situations, the Edge Enforcer supports local management such as installing applications, reading and searching logs, managing secrets locally (rotating keys, updating secrets, etc.), and performing a manual unseal if necessary. The Edge Enforcer also continues to manage and govern each site-local cluster, including re-scheduling applications to new hosts in the event of host failures.
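The buffering behaviour described above can be illustrated with a minimal store-and-forward sketch. This is not Avassa code: the class name `StoreAndForwardBuffer`, the `send` callback, and the bounded queue size are all assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque, List


class StoreAndForwardBuffer:
    """Hypothetical outbound buffer: queue messages while the link is down."""

    def __init__(self, send: Callable[[str], None], max_pending: int = 1000) -> None:
        self._send = send
        # Assumption: the queue is bounded, so the oldest messages are
        # dropped once max_pending is reached.
        self._pending: Deque[str] = deque(maxlen=max_pending)
        self.connected = False

    def publish(self, message: str) -> None:
        """Send right away when connected, otherwise keep the message for later."""
        if self.connected:
            self._send(message)
        else:
            self._pending.append(message)

    def on_reconnect(self) -> None:
        """Flush buffered messages in their original order."""
        self.connected = True
        while self._pending:
            self._send(self._pending.popleft())


delivered: List[str] = []
buffer = StoreAndForwardBuffer(send=delivered.append)
buffer.publish("app-state: running")   # link down: buffered
buffer.publish("host-2: failed")       # link down: buffered
buffer.on_reconnect()                  # link restored: both flushed in order
buffer.publish("host-3: joined")       # link up: sent immediately
```

The point of the sketch is ordering: messages produced during an outage reach the other side in the order they were written once the link returns.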
The network topology of an Avassa deployment has so far been described as a hub-and-spoke design, with the Control Tower as the central hub. It is also possible to extend this design with one or more intermediary aggregation layers, with nodes between the edge sites and the Control Tower. For example, one could have an aggregation node per country and another aggregation node per self-governing local authority in the country. One of the more important benefits of this design is that it makes it possible to implement privacy zones, where a privacy zone is the set of edge sites connected to an aggregation node. Associating aggregation nodes with privacy zones makes it possible to implement data privacy policies and regulations for each local authority, such as a federal state.
From a developer / DevOps point of view, we can divide the functionality in the Control Tower and Edge Enforcer into four categories:
- Distributed Application Management
- Monitoring
- Event streaming
- Secrets management
The rest of this document contains an overview of these categories.
Distributed Application Management includes the following main features:
Application lifecycle management
This includes deploying an application to selected sites, rolling back an application to a particular version at selected sites, and viewing the deployment status of an application across a set of sites, based on logs and monitors.
Configuration as code
This includes versioning of configurations, and rolling out and rolling back configurations at specific sites. Configurations are defined in application specifications and deployment specifications. Application specifications declare how application instances should be scheduled across hosts within a given site. Deployment specifications describe where applications should be deployed, that is, the set of sites where applications should be scheduled.
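To make the idea of a deployment specification concrete, here is a minimal sketch of how a specification could resolve to a set of sites. The `Site` and `DeploymentSpec` classes and the label-matching rule are hypothetical and chosen for illustration; they do not reflect the actual Avassa specification format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Site:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class DeploymentSpec:
    """Hypothetical model: which application version goes to which sites."""
    application: str
    version: str
    match_labels: Dict[str, str]

    def target_sites(self, sites: List[Site]) -> List[str]:
        """Return the names of sites whose labels contain every required label."""
        return [
            s.name for s in sites
            if all(s.labels.get(k) == v for k, v in self.match_labels.items())
        ]


sites = [
    Site("stockholm-store-1", {"country": "se", "type": "store"}),
    Site("berlin-store-7", {"country": "de", "type": "store"}),
    Site("malmo-factory-2", {"country": "se", "type": "factory"}),
]
spec = DeploymentSpec("checkout", "1.4.0", {"country": "se", "type": "store"})
print(spec.target_sites(sites))
```

Because the specification is plain data, it can be versioned and diffed like any other file, which is what makes rollout and rollback of configurations tractable.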
When application instances are scheduled onto hosts, quota thresholds for CPU, memory, and disk volumes are configured. Quotas can be defined on multiple levels in the multi-tenancy hierarchy as well as for individual applications.
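One way to reason about quotas defined on several levels is to compute an effective limit per resource. The sketch below assumes that the most restrictive limit wins; that rule, the function name `effective_quota`, and the resource names are assumptions for illustration, not documented Avassa behaviour.

```python
from typing import Dict, List

Quota = Dict[str, float]  # resource name -> limit, e.g. {"cpu": 2.0}


def effective_quota(levels: List[Quota]) -> Quota:
    """Combine quotas from the outermost to the innermost level.

    Assumption: for each resource, the most restrictive limit wins.
    """
    result: Quota = {}
    for level in levels:
        for resource, limit in level.items():
            result[resource] = min(limit, result.get(resource, limit))
    return result


tenant = {"cpu": 8.0, "memory_mib": 16384.0}
sub_tenant = {"cpu": 4.0}
application = {"cpu": 6.0, "memory_mib": 2048.0}

quota = effective_quota([tenant, sub_tenant, application])
```

Here the application asks for more CPU than its sub-tenant allows, so the sub-tenant limit applies, while the application's own memory limit is the tightest one.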
- Do a canary deployment of a specific application version on a set of sites
- Roll back the canary deployment
- Do a rolling upgrade to a specific application version, ten sites at a time
- View deployment status across all sites
- Show configuration diffs between two different sites
- Restrict quotas in all sites in a site group
- Run a test-suite in a set of sites, collect results, import to cloud
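The rolling upgrade in the list above can be sketched as a simple batching loop. The helper names, the `upgrade` callback, and the stop-on-first-failed-batch policy are assumptions made for illustration; a real rollout would call the platform's deployment API instead.

```python
from typing import Callable, Iterator, List


def batches(sites: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` sites."""
    for start in range(0, len(sites), size):
        yield sites[start:start + size]


def rolling_upgrade(
    sites: List[str],
    upgrade: Callable[[str], bool],
    batch_size: int = 10,
) -> List[str]:
    """Upgrade sites batch by batch; stop after the first batch with a failure.

    Returns the sites that were upgraded successfully.
    """
    done: List[str] = []
    for group in batches(sites, batch_size):
        results = [(site, upgrade(site)) for site in group]
        done.extend(site for site, ok in results if ok)
        if not all(ok for _, ok in results):
            break  # leave the remaining sites on the old version
    return done


all_sites = [f"site-{n:02d}" for n in range(25)]
# Hypothetical upgrade function that fails at exactly one site.
upgraded = rolling_upgrade(all_sites, upgrade=lambda site: site != "site-12")
```

Stopping at the first failed batch limits the blast radius: sites in later batches keep running the previous version until the failure has been investigated.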
The operator needs to oversee all relevant edge sites without importing all log data and metrics to a centralized location such as the Control Tower. Avassa provides distributed logging at each site and real-time search from the Control Tower into the site-local logs and metrics. This distributed approach is highly efficient and scales to thousands of sites. Monitoring includes both logging and metrics:
- Metrics are used for SLA and performance monitoring. They are based on different types of statistical calculations over large amounts of data, where minor data loss does not matter. Metrics are also used for isolation and error monitoring (SLA-related). Metrics are typed data (integers or floats).
- Logs are used for errors, events, audits, tracing. Log data is untyped.
Key monitoring features include:
- Logging of system and application data in edge sites
- Monitor local metrics in edge sites
- Distributed search of log data and metrics stored in edge sites
- Deployment monitoring, where progress is tracked through UI, API, and notifications
- Locally consume, process and filter log data and metrics in distributed sites and stream relevant subsets northbound for further analysis and post-processing in centralized external data lakes
- Query all sites for errors in the logs of a specific application
- Get a notification when an application was restarted in a specific site after a crash
- Verify that an application is running with the correct version on all relevant sites
- Get a notification when Control Tower loses connection to a specific site
- Get a notification when tenant resource consumption exceeds a quota
- View aggregated resource consumption for an application across all sites
- View resource utilization in sites in a region
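The "query all sites" use case can be pictured as a fan-out search in which each site filters its own logs and returns only the matches. The sketch below uses an in-memory dictionary as a stand-in for site-local log storage; `SITE_LOGS`, `search_site`, and `search_all_sites` are hypothetical names, and real queries would travel over the network rather than through function calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-in for logs that stay at each edge site.
SITE_LOGS: Dict[str, List[str]] = {
    "site-a": ["checkout INFO started", "checkout ERROR payment timeout"],
    "site-b": ["checkout INFO started"],
    "site-c": ["inventory ERROR disk full", "checkout ERROR card reader offline"],
}


def search_site(site: str, application: str, needle: str) -> List[Tuple[str, str]]:
    """Runs 'at the site': filter local log lines and return only the matches."""
    return [
        (site, line) for line in SITE_LOGS[site]
        if line.startswith(application + " ") and needle in line
    ]


def search_all_sites(application: str, needle: str) -> List[Tuple[str, str]]:
    """Fan the query out to every site in parallel and merge the small results."""
    sites = sorted(SITE_LOGS)
    with ThreadPoolExecutor(max_workers=8) as pool:
        per_site = pool.map(lambda s: search_site(s, application, needle), sites)
        return [hit for hits in per_site for hit in hits]


hits = search_all_sites("checkout", "ERROR")
```

Only matching lines cross the network in this model, which is why the approach avoids shipping full logs to a central location.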
Event streaming solutions have proved to be solid building blocks of loosely coupled distributed systems. Volga is a high-performance event streaming service. It is an integral part of the Avassa system, which is a heavy user of Volga internally. External applications can also use Volga to produce and consume messages to and from event streams.
Key Volga features include:
- Volga supports replicated, persistent, and volatile event queues with arbitrarily many producers and consumers to any given topic
- Volga has full support for Avassa's multi-tenancy model
- Most edge sites are located behind NAT firewalls. Volga includes support for producing to and consuming from topics at such edge sites
- Volga is used as a core building block to coordinate certificates, jobs, configuration changes, etc., in a loosely coupled set of Avassa sites
- Each Volga topic is replicated on compute nodes on one Avassa site. Applications typically consume and produce messages locally on that site, but it is also possible to consume and produce from other sites remotely
- Volga includes a system-wide distributed service, built on top of Volga topics, which distributes the event stream and lets applications communicate across the entire system without direct network connectivity. It guarantees that events posted to the streams will eventually be distributed to all consuming sites
- There is a distributed log query service in Volga. This service facilitates very fast queries over application logs for a large number of sites
- Volga has a very small footprint, allowing event streaming even at small single-host edge sites
- Volga provides built-in strong encryption and user authentication. User data on Volga is always passed over a WebSocket API, ensuring strict isolation between users on the same Volga implementation
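The topic model with many producers and consumers can be illustrated with a small in-memory sketch. This is not the Volga API: the `Topic` class and its `produce` and `consume` methods are hypothetical, and replication, persistence, and authentication are deliberately left out.

```python
from typing import Dict, List


class Topic:
    """Hypothetical in-memory topic: an append-only log with per-consumer offsets."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._events: List[str] = []
        self._offsets: Dict[str, int] = {}

    def produce(self, event: str) -> int:
        """Append an event and return its sequence number."""
        self._events.append(event)
        return len(self._events) - 1

    def consume(self, consumer: str) -> List[str]:
        """Return events this consumer has not seen yet and advance its offset."""
        start = self._offsets.get(consumer, 0)
        self._offsets[consumer] = len(self._events)
        return self._events[start:]


topic = Topic("config-changes")
topic.produce("log-level=debug")
first = topic.consume("site-a")    # site-a sees the first event
topic.produce("log-level=info")
second = topic.consume("site-a")   # site-a sees only the new event
late = topic.consume("site-b")     # site-b joins late and still sees both
```

Keeping a separate offset per consumer is what lets producers and consumers stay loosely coupled: a consumer that was offline simply resumes from where it left off.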
Event streaming is used in many loosely coupled distributed systems. The three examples below illustrate how powerful this technology is in a distributed edge cloud environment.
Example: An application consisting of code at the Avassa Control Tower and code at many edge sites wants to propagate a configuration change to all sites. This is best done by:
- Posting the change into Volga Infra at the Control Tower
- Consuming the change at the edge sites
This is a good way to organize loosely coupled distributed applications.
Example: An application runs at many edge sites, each of which produces large amounts of data. Due to cost or bandwidth constraints, the data must be aggregated and refined before it is pushed further up in the network.
Such an application would publish data on one or several Volga topics locally at each site. Data would typically then be consumed and refined locally and either re-produced to an aggregation topic locally or produced remotely to an aggregation topic further up in the network.
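As a rough illustration of the refinement step, the sketch below reduces a window of raw readings to one summary record that could then be produced to an aggregation topic. The `summarize` function and the record fields are assumptions chosen for illustration.

```python
from statistics import mean
from typing import Dict, List


def summarize(site: str, readings: List[float]) -> Dict[str, object]:
    """Reduce a non-empty window of raw readings to one compact summary record."""
    return {
        "site": site,
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


raw_window = [20.0, 22.0, 21.0, 25.0]
summary = summarize("site-a", raw_window)
```

Sending one summary per window instead of every reading is the kind of reduction that keeps upstream bandwidth and cost under control.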
Example: Metrics generated at edge sites are best produced to site-local Volga topics. To query and search such metrics, a search request can be posted to Volga at the Control Tower. This keeps all metrics data distributed without ever accumulating it centrally. If the number of edge sites and the amount of metrics data are both large, this is a good way of avoiding central bottlenecks.
When compute and applications are distributed across hundreds or thousands of sites, data must be protected from security breaches both when in storage and when transported over the network. Physical theft is a real possibility in environments with little or no physical security.
Secrets such as crypto keys, certificates, and third-party access credentials should not be bundled into application images. Secrets should instead be accessed through fully authenticated and audited APIs that allow secrets to be updated and crypto keys to remain hidden without updating software images.
An intrusion at one site must not compromise data at other sites, and when a security breach occurs, it must be easily isolated and remediated.
Strongbox is a system-wide distributed service for managing secrets and policies in the Avassa platform. Secrets are automatically shared to sites where they are required, and policies are applied across the entire system. Sharing only occurs in one direction, from the management point and outward. Local secrets are not propagated upward to avoid poisoning from a potential breach.
Key Strongbox features include:
- Cryptographic isolation of secrets between tenants and sites, separate keys for each tenant and site
- A one-step operation to block a tenant, a site, or a host
- Fine-grained control of how secrets are distributed to sites
- Local secrets storage is sealed until remotely unsealed
- Fully audit-logged access to secrets
- Centralized key management: key rotation, revocation, access
- Encrypt and decrypt services that allow use without access to actual crypto keys
- API for format-preserving encryption/decryption as well as masking of data before logging, storage, and transport
- Fully encrypted and authenticated communication both between sites (inter-site) and between hosts at a site (inter-host)
- Mount secrets as files in the file system of a container
- Store access credentials to AWS
- On a specific site, encrypt data with the key named "userdata" before sending it in a Volga message to Control Tower
- Post mortem analysis of accesses to secrets after detected intrusion on a specific site
- Block a site after a security breach was detected
- In a specific application, mask the social security number of the patient before logging
- FPE (Format Preserving Encryption) of credit card number before storing in database for customer registration in a specific application on a specific site
- Rotate a customer certificate
- Sign an x-ray image using a certificate before storing it in the local file system on a specific site
- Store secret configuration parameters for application
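The masking use case in the list above can be sketched in a few lines. This example does not call Strongbox: the `mask_ssn` helper, the US-style number format, and the choice to keep the last four digits are assumptions made for illustration.

```python
import re

# Assumption: US-style social security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def mask_ssn(text: str) -> str:
    """Replace all but the last four digits of each SSN-like value."""
    return SSN_PATTERN.sub(r"***-**-\1", text)


line = mask_ssn("patient=Jane Doe ssn=123-45-6789 admitted")
```

Applying the mask before the line is written means the full number never reaches log storage, which matters at sites with little physical security.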
The Avassa platform is an easy-to-use, secure, and scalable solution to the challenges of running container-based applications in a distributed environment. We would love to show it to you and tell you more about how others are using it. Please contact us to schedule a demo or join our trial program and get access to a running system.