Data Consistency Storage in the Cloud

Data helps organizations make business decisions. Good decision-making requires data that is accurate, up to date, available, consistent, and securely accessible. Data consistency means that the results of a database transaction are visible to all parties at the same time: once a transaction has completed (committed or rolled back), everyone accessing that data sees the same result. Consistency protects the integrity of data and helps prevent data corruption, which makes it a highly desirable property of a database system.

Object storage

Non-relational databases, also known as NoSQL databases, have emerged in recent years as an alternative to relational databases. They give developers a simplified way to store and access data, and their flexible schema maps naturally to object-oriented languages. Non-relational databases also provide several features that relational databases are not optimally suited to provide, including high performance at very large scale and high reliability.

Most cloud architectures combine vertical partitioning, horizontal partitioning (sharding), and replication to improve response times, scalability, availability, and fault tolerance. Object storage is a common storage model built on these techniques. Objects are accessible through APIs or through a web interface. Object storage systems store files in a flat organization of containers and use unique IDs to retrieve them.
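The flat container-and-ID layout described above can be illustrated with a minimal in-memory sketch. The class name FlatObjectStore and its put and get methods are hypothetical names chosen for illustration; they do not correspond to any particular vendor's API.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store: containers hold objects keyed by unique ID."""

    def __init__(self):
        # container name -> {object ID -> (data bytes, metadata dict)}
        self._containers = {}

    def put(self, container, data, metadata=None):
        """Store data in a container and return the generated object ID."""
        object_id = str(uuid.uuid4())
        objects = self._containers.setdefault(container, {})
        objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, container, object_id):
        """Return the stored bytes, or raise KeyError if the ID is unknown."""
        data, _metadata = self._containers[container][object_id]
        return data


store = FlatObjectStore()
oid = store.put("backups", b"nightly dump", {"content-type": "application/octet-stream"})
print(store.get("backups", oid))
```

There is no directory hierarchy here: an object is located only by its container and its ID, which is what lets real systems spread objects across many nodes.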

Object storage introduces new challenges for secure data consistency that a traditional relational database running on a single server does not face. One of the more fundamental decisions cloud professionals need to make is whether to choose eventual consistency or strong consistency for data in corporate systems.

Strong consistency

In a cloud-based system, multiple copies of any piece of data exist across microservices and containers. For example, let’s consider how mobile banking works. We have access to our bank account through our mobile phone or through our browser. There is also a value in the main database of the bank’s back-end systems.

When software engineers design banking apps, they need to decide how much effort they will put into making figures across platforms consistent. If they decide the account balance must be the same everywhere, that system is based on secure, strong data consistency.

Figure 1: Strong Data Consistency. Source: Google.

Strong data consistency ensures that your data is always consistent and accurate. Consistency and accuracy are valuable quality attributes. But what about system performance and user experience? The downside of strong consistency is that overall system performance is degraded, harming the user experience. For example, what happens when your mobile phone loses access to the bank’s back-end servers? In a strongly consistent system, you will not be able to see your account balance, even if it was updated only five minutes ago and is 99.99% likely to be accurate.

These limitations may be tolerable in a distributed database of two to four nodes, but in a cloud computing environment with a dozen or more data nodes they become unacceptable. The issue is more serious in cloud architectures, where network latency and reliability may further impact performance. This gravely affects response times, scalability, and availability.
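The availability cost of strong consistency can be shown with a small sketch. It assumes a deliberately simple rule, chosen for illustration only: every copy must be reachable before a read or write is allowed. The names Replica, strong_write, and strong_read are hypothetical.

```python
class ReplicaUnavailable(Exception):
    """Raised when a copy of the data cannot be reached."""


class Replica:
    def __init__(self, name):
        self.name = name
        self.reachable = True
        self.balance = 0


def _require_all_reachable(replicas):
    # Strong consistency in this sketch: no operation unless every copy responds
    down = [r.name for r in replicas if not r.reachable]
    if down:
        raise ReplicaUnavailable("cannot reach: " + ", ".join(down))


def strong_write(replicas, new_balance):
    """Apply the write to every replica, or to none if any is unreachable."""
    _require_all_reachable(replicas)
    for r in replicas:
        r.balance = new_balance


def strong_read(replicas):
    """Return the balance only when all copies are reachable."""
    _require_all_reachable(replicas)
    return replicas[0].balance


copies = [Replica("mobile"), Replica("web"), Replica("core-db")]
strong_write(copies, 100)
copies[0].reachable = False  # simulate a lost connection
try:
    strong_read(copies)
except ReplicaUnavailable as err:
    print("read refused:", err)
```

Every added node is one more copy that must respond before any operation succeeds, which is why this approach becomes harder to sustain as the number of nodes grows.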

Eventual consistency

The solution to the performance and usability problems of strong consistency is called eventual consistency. Google defines it as follows: “Eventual consistency is a theoretical guarantee that, provided no new updates to an entity are made, all reads of the entity will eventually return the last updated value.”

Figure 2: Eventual Data Consistency. Source: Google.

Under eventual consistency, copies of data do not have to be identical at every moment, as long as they are designed to become consistent once all current operations have been processed. The Internet Domain Name System (DNS) is a well-known example of a system with an eventual consistency model.
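A minimal sketch of this convergence, assuming a last-write-wins rule with a logical version number (one of several possible reconciliation strategies, and not necessarily what any given product uses). The names EventualReplica and sync are hypothetical.

```python
class EventualReplica:
    def __init__(self):
        self.value = None
        self.version = 0  # logical version; higher means newer

    def write(self, value, version):
        # Last-write-wins: keep the update only if it is newer than what we hold
        if version > self.version:
            self.value, self.version = value, version


def sync(replicas):
    """One propagation round: copy the newest version to every replica."""
    newest = max(replicas, key=lambda r: r.version)
    for r in replicas:
        r.write(newest.value, newest.version)


replicas = [EventualReplica() for _ in range(3)]
replicas[0].write("balance=100", version=1)  # update lands on one replica only
stale = [r.value for r in replicas]          # the other replicas have not seen it yet
sync(replicas)                               # background propagation
fresh = [r.value for r in replicas]          # all replicas now agree
print(stale, fresh)
```

Between the write and the sync, different replicas answer differently; once no new updates arrive and propagation completes, every read returns the last written value.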

DNS servers do not always reflect the latest status but, rather, the values are cached and replicated across many directories over the Internet. It takes a certain amount of time to replicate modified values to all DNS clients and servers. However, the DNS system is a very successful system that has become one of the foundations of the Internet. It is highly available and has proven to be extremely scalable, enabling name lookups to over a hundred million devices across the entire Internet.

Going back to our banking example, in an eventual consistency data model the banking app on your smartphone will display the most recently retrieved account balance, as long as it is fresh enough to be reasonably reliable, even if the bank’s server is not available to query at this very moment.
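The banking behavior described above can be sketched as a read that falls back to a cached value when the server is unreachable, provided the cached value is not too old. The class CachedBalance, the 300-second freshness limit, and the fetch callback are all illustrative assumptions rather than part of any real banking API.

```python
import time


class CachedBalance:
    """Serve the last known balance when the server cannot be reached."""

    def __init__(self, fetch, max_age_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch              # callable that queries the server
        self._max_age = max_age_seconds  # how stale a cached value may be
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def read(self):
        """Return (balance, source) where source is 'fresh' or 'cached'."""
        try:
            self._value = self._fetch()
            self._fetched_at = self._clock()
            return self._value, "fresh"
        except ConnectionError:
            # Server unreachable: fall back to the cache if it is recent enough
            if self._fetched_at is not None:
                age = self._clock() - self._fetched_at
                if age <= self._max_age:
                    return self._value, "cached"
            raise


server_up = [True]


def fetch_from_bank():
    # Stand-in for a network call to the bank's back end
    if not server_up[0]:
        raise ConnectionError("bank unreachable")
    return 250


app = CachedBalance(fetch_from_bank)
print(app.read())
server_up[0] = False
print(app.read())
```

The freshness limit is the design knob: a larger value keeps the app usable through longer outages at the cost of showing older figures.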

Considerations for selecting a data consistency model

Selecting either a strong or an eventual data consistency model relates to the CAP theorem, which states that a distributed data store cannot simultaneously provide more than two of the following three guarantees:

Consistency
Availability
Partition tolerance

In other words, when a network partition occurs, which must be expected in distributed databases and cloud infrastructure, the solution architect must choose between data availability and data consistency. Each of the two consistency models favors one over the other. Eventual consistency does not abandon consistency; it relaxes data consistency for a short period of time in favor of data availability.
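The trade-off can be made concrete with a small sketch of a single node that is configured to prefer either consistency or availability during a partition. The names Mode and Node are hypothetical, and the model is simplified to one stored value.

```python
from enum import Enum


class Mode(Enum):
    CP = "prefer consistency"
    AP = "prefer availability"


class Node:
    def __init__(self, mode):
        self.mode = mode
        self.local_value = "v1"   # last value replicated to this node
        self.partitioned = False  # True when the node cannot reach its peers

    def read(self):
        if self.partitioned and self.mode is Mode.CP:
            # Cannot confirm the value is current, so refuse to answer
            raise RuntimeError("unavailable during partition")
        # No partition, or AP mode: answer from the local copy,
        # which may be stale while the partition lasts
        return self.local_value


for mode in (Mode.CP, Mode.AP):
    node = Node(mode)
    node.partitioned = True
    try:
        print(mode.name, "->", node.read())
    except RuntimeError as err:
        print(mode.name, "->", err)
```

Without a partition both modes behave identically; the choice only matters while the node is cut off from its peers.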

How a certified cloud security professional can help you

Selecting a data consistency model is a decision that cloud security professionals must make. Both models are useful tools in the hands of cloud professionals, who must determine the best one for each situation. It is important to understand that eventual consistency is not a drop-in replacement for strong consistency; each model addresses different business use cases.

Eventual consistency may be the best model for use cases with very large numbers of entities. If there are a very large number of results in a query, then the user experience may not be affected by the inclusion or exclusion of specific entities. On the other hand, use cases with a small number of entities and a narrow context suggest that strong consistency is required. The user experience will be affected because the context will make users aware of which entities should be included or excluded.

Similarly, if data changes frequently, eventual consistency may not be suitable, because users may have to wait a considerable amount of time for changes to propagate to all replicas before requests return the latest version. Hence, eventual consistency might be more appropriate for use cases where data does not change much, such as backups, archives, video and audio files, and VM images.

How the CCSP Certification Can Help You to Succeed

The ISC2 Certified Cloud Security Professional (CCSP) certification covers the knowledge needed to address concerns about data consistency, availability, and overall system performance. CCSP is the benchmark of cloud security certifications and is repeatedly recognized as the most valued and well-rounded cloud security certification.

CCSP is a vendor-agnostic certification that ensures that certified practitioners have the security knowledge to successfully secure and protect data in any cloud environment. CCSP’s unique criteria have elevated it to a standard recognized as the premier cloud security certification, providing an advantage in an increasingly competitive corporate landscape.

Attaining CCSP certification shows you have the advanced technical skills and knowledge to design, manage and secure data, applications, and infrastructure in the cloud using best practices, policies, and procedures established by the cybersecurity experts at ISC2.