Six Key Design Considerations for Stateful Applications at the Edge
According to Gartner, by 2025, 75% of enterprise-generated data will be created and processed outside of a traditional centralized data center or cloud. This means data will come from everywhere, with applications, devices, and users all geographically distributed. Is your enterprise prepared for this new data environment?
Edge computing distributes computation and data storage closer to where the data is produced and consumed. New edge applications exploit devices and network gateways to perform tasks and provide services locally on behalf of the cloud.
Stateful edge applications demand a new data architecture that takes into account the scale, latency, availability, and security needs of applications. We will look at key design considerations for enterprises to take into account when building a data architecture for edge applications.
The Lifecycle of Data
The first step in figuring out the data architecture is to understand the lifecycle of data: where the data is produced, what needs to be done with it (e.g., analysis, store and forward, and long-term storage), and where it is consumed. Moving data across regions incurs high latency, while placing data close to where it is produced and consumed allows for lower latency and higher throughput. On the other hand, as you move from the cloud towards the edge, you have fewer resources to process the data and have to account for a greater risk of network partitions and downtime.
What the applications are trying to do with the data is an important factor in deciding what the data architecture needs to support. For example, retail applications that analyze data from point-of-sale devices in store locations will need to replicate data from the edges to the cloud for a couple of reasons. First, the analysis may be useful only when data from all the different stores is looked at in aggregate. Second, the cloud has the resources needed to run advanced machine learning algorithms, which an edge location may lack.
But if the retail organization wants to deploy consistent software configurations to all its edge locations, it would create a master configuration at the core and then replicate it to all edge locations. In this case, the flow of data is from the core to many edge locations.
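The two flow directions in the retail example can be illustrated with a minimal Python sketch. Everything here is hypothetical (the `EdgeStore` class, the function names, and the sample data are assumptions made for illustration), and a real deployment would rely on the database's own replication rather than in-memory copies.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EdgeStore:
    """A hypothetical store location holding local sales and a config copy."""
    name: str
    sales: list = field(default_factory=list)   # (sku, amount) tuples recorded locally
    config: dict = field(default_factory=dict)  # local copy of the master configuration


def aggregate_at_core(stores):
    """Edge-to-core flow: combine sales from every store into per-SKU totals."""
    totals = defaultdict(float)
    for store in stores:
        for sku, amount in store.sales:
            totals[sku] += amount
    return dict(totals)


def push_config(master_config, stores):
    """Core-to-edge flow: copy the master configuration to every store."""
    for store in stores:
        store.config = dict(master_config)


stores = [
    EdgeStore("store-a", sales=[("sku-1", 10.0), ("sku-2", 5.0)]),
    EdgeStore("store-b", sales=[("sku-1", 2.5)]),
]
totals = aggregate_at_core(stores)           # many edges -> one core
push_config({"pos_version": "1.4"}, stores)  # one core -> many edges
```

The point of the sketch is only that the same architecture may need to support both directions at once, with different volume and freshness requirements for each.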
The following are key considerations when designing a data architecture for edge environments:
1. Scale
How fast is the data growing? How many users and devices will generate data? How much compute power is needed to process the data?
Edge locations typically do not have the compute and storage resources to run deep analytics on vast amounts of data. Also, OLTP databases at the edge may need to scale throughput to handle massive write volumes from devices.
2. Latency and Throughput
How much data will be written or read? Will the data come in bursts or as individual data points? How quickly does it need to be available to users and applications?
For example, with real-time applications such as connected vehicles and credit card fraud detection, it is not feasible to send telemetry or transaction data back to a cloud application to determine a course of action. In these cases, real-time analytics is applied to raw data in edge locations to generate alerts.
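As a rough illustration of analytics applied locally to raw data, the following Python sketch flags a reading that deviates sharply from a short rolling average, without any round trip to the cloud. The `EdgeAlerter` class, the window size, and the deviation limit are assumptions chosen for the example; production fraud or telemetry detection would use far richer models.

```python
from collections import deque


class EdgeAlerter:
    """Hypothetical local detector: compares each reading to a rolling average."""

    def __init__(self, window=5, max_deviation=20.0):
        self.readings = deque(maxlen=window)  # most recent readings only
        self.max_deviation = max_deviation

    def observe(self, value):
        """Return True if value deviates from the recent average by more than the limit."""
        alert = False
        # Only judge a reading once the window is full, so the baseline is meaningful.
        if len(self.readings) == self.readings.maxlen:
            baseline = sum(self.readings) / len(self.readings)
            alert = abs(value - baseline) > self.max_deviation
        self.readings.append(value)
        return alert


alerter = EdgeAlerter(window=3, max_deviation=20.0)
speeds = [60.0, 62.0, 61.0, 95.0, 63.0]  # made-up vehicle speed telemetry
alerts = [alerter.observe(s) for s in speeds]
```

Because the decision uses only state held at the edge, the alert latency does not depend on the quality of the link to the cloud.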
3. Network Partitions
Poor network connectivity is a reality for many near and far edge locations. Applications should take into account how to deal with network partitions. Depending on network quality between the edge and the cloud, different operating modes can occur:
- Mostly connected: The applications can connect to a remote location to perform an API call (e.g., to look up data) most of the time. A small fraction of these API calls may fail because of network partitions (e.g., a few seconds of partitions over a timespan of several hours).
- Semi-connected: In this scenario, there could be an extended network partition lasting several hours. Applications would need to be able to identify changes that occurred during the partition window, and synchronize their state with the remote applications once the partition heals.
- Disconnected: The predominant operating pattern in this case is that the applications run independently of any external site. There may be occasional connectivity, but that is the exception rather than the norm.
Applications and databases running in the far edge should be designed for disconnected or semi-connected operation. Near edge applications should assume semi-connected or mostly connected operation. The cloud operates in the mostly connected mode. That said, when a public cloud service experiences an outage, the impact is severe and can last many hours.
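One common way to handle the semi-connected case is to buffer local changes and forward them once the link returns. The sketch below is a minimal, in-memory version of that idea; `EdgeOutbox`, `FakeLink`, and the sequence-number scheme are assumptions for illustration, not a description of any particular database's synchronization protocol.

```python
from collections import deque


class EdgeOutbox:
    """Buffers local changes while the link to the remote site is down."""

    def __init__(self, send):
        self.send = send        # callable(seq, change); raises ConnectionError on failure
        self.pending = deque()  # changes not yet delivered to the remote site
        self.next_seq = 1

    def record(self, change):
        """Apply-locally-then-queue: tag each change with an increasing sequence number."""
        self.pending.append((self.next_seq, change))
        self.next_seq += 1

    def flush(self):
        """Deliver pending changes in order; stop at the first failure. Returns count delivered."""
        delivered = 0
        while self.pending:
            seq, change = self.pending[0]
            try:
                self.send(seq, change)
            except ConnectionError:
                break  # partition still in effect; keep the change for the next attempt
            self.pending.popleft()
            delivered += 1
        return delivered


class FakeLink:
    """Stand-in for the network: fails while `up` is False."""

    def __init__(self):
        self.up = False
        self.received = []

    def __call__(self, seq, change):
        if not self.up:
            raise ConnectionError("link down")
        self.received.append((seq, change))


link = FakeLink()
outbox = EdgeOutbox(link)
outbox.record({"sku": "sku-1", "qty": -1})
outbox.record({"sku": "sku-2", "qty": -3})
during_partition = outbox.flush()  # link is down, nothing is delivered
link.up = True                     # partition heals
after_heal = outbox.flush()        # both changes are forwarded in order
```

In practice the outbox would be persisted to disk so it survives restarts, and the remote side would likely need to deduplicate by sequence number, since a change can be delivered but its acknowledgment lost.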
4. Other Failures
In addition to network partitions, infrastructure outages can be quite common in various locations. At the far edge, everything from node and pod failures to a complete regional outage can be common. At the near edge and in the cloud, while node or pod outages are common, applications can use racks and zones for higher resilience. But even with this fault isolation in place, region-level outages can occur.
But not all failures need to be outages. Other types of failures include resource contention and resource exhaustion.
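Resource exhaustion can often be contained by bounding the work an edge node accepts. The following Python sketch sheds new items once a fixed-size buffer is full rather than letting memory grow without limit; the `BoundedIngest` class and its capacity are hypothetical, and whether dropping, blocking, or sampling is the right response depends on the application.

```python
import queue


class BoundedIngest:
    """Hypothetical ingest buffer that sheds load instead of exhausting memory."""

    def __init__(self, capacity):
        self.buffer = queue.Queue(maxsize=capacity)
        self.dropped = 0  # count of shed items, useful as a health metric

    def submit(self, item):
        """Accept the item if there is room; otherwise drop it and count the drop."""
        try:
            self.buffer.put_nowait(item)
            return True
        except queue.Full:
            self.dropped += 1
            return False


ingest = BoundedIngest(capacity=2)
results = [ingest.submit(n) for n in range(4)]  # third and fourth items are shed
```

Tracking the drop count makes the degradation visible, so the failure shows up as a measurable signal rather than an outage.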
5. Software Stack
It is important to consider agility and ease of use when picking components for the software stack. Business services involve a suite of applications, so engineering teams need to design for rapid iteration on applications. One way to achieve this is to use well-known frameworks that make developers productive quickly, along with a feature-rich, open source database that developers already know.
6. Security
For applications running at the edge, security is paramount, especially because the inherently distributed nature of the architecture creates a large attack surface. It is important to think about least privilege everywhere, zero trust, and zero touch provisioning for all services and components.
Some of the other specific security aspects that come up are listed below:
- Encryption in transit
- Encryption at rest
- Multi-tenancy support at the database layer and per-tenant encryption
- Regional locality of data to ensure compliance and thinking about any geographic access controls that go with it
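The regional locality item can be illustrated with a small default-deny check, in keeping with the least privilege principle mentioned above. The tenant names, region names, and `can_store` function are all hypothetical; real compliance rules are usually enforced by the database's placement policies and access controls rather than by application code alone.

```python
# Hypothetical policy: which regions may hold each tenant's data.
RESIDENCY_POLICY = {
    "tenant-eu": {"eu-west", "eu-central"},
    "tenant-us": {"us-east"},
}


def can_store(tenant, region, policy=RESIDENCY_POLICY):
    """Return True only if the policy explicitly allows this tenant's data in the region.

    Unknown tenants and unlisted regions are denied by default.
    """
    return region in policy.get(tenant, set())


eu_in_eu = can_store("tenant-eu", "eu-west")     # allowed by policy
eu_in_us = can_store("tenant-eu", "us-east")     # region not listed for this tenant
unknown = can_store("tenant-new", "eu-west")     # tenant has no policy entry
```

Denying by default means a newly onboarded tenant cannot have data placed anywhere until a policy is explicitly written for it.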
The rise of edge computing is a paradigm shift in the way applications are built and deployed to cater to the needs of an increasingly decentralized world. There is no one-size-fits-all database reference architecture that works for all applications in this environment. Instead, depending on the requirements of the application and the tradeoffs involved, enterprises will make different design choices to meet their needs, and adapt those choices when needs change.
About the author: Karthik Ranganathan is the co-founder and CTO at Yugabyte, the company behind YugabyteDB, a transactional distributed SQL database for cloud-native applications. Ranganathan received his BS and MS in CS from IIT-M and UT Austin. Ranganathan was one of the original database engineers at Facebook responsible for building distributed databases such as Cassandra and HBase. He is an Apache HBase committer and was an early contributor to Cassandra before it was open-sourced by Facebook.