Apache Cassandra, first developed at Facebook, is an open source distributed NoSQL database designed to handle large amounts of data across commodity servers. Beyond Facebook, it is used by organizations known for their large-scale operations, such as CERN and Instagram, and it has become one of the go-to databases for some of the world’s largest applications.
As a key part of many microservice-based applications, Cassandra can benefit from being containerized. It is one of the most popular images on Docker Hub, with over 5 million pulls.
However, running Cassandra in a container raises some special challenges. If you can solve them, you’ll be most of the way to a successful Cassandra deployment in containers.
Achieving top read/write performance with hyper-convergence
Chances are if you are running Cassandra, read and write performance is important to you. Cassandra was designed to run in bare metal environments where each server offers its own disk as the storage media. The idea of running an app on the same machine as its storage is called “hyper-convergence”. Cassandra will always get the best performance using this setup because of its heavy use of disk during core operations like writes, reads and bootstrap operations. Shared storage puts pressure on these operations.
A starting point for a successful Cassandra container deployment is therefore to let Cassandra run on bare metal or VMs with direct-attached storage, without sacrificing the automated scheduling that an orchestration framework like Kubernetes, Mesosphere DC/OS or Swarm provides.
Simplifying deployments using the Network Topology placement strategy
Running Cassandra with direct attached storage will give you the best performance, but you also need to think about reliability. To ensure reliability, you should make sure to deploy your Cassandra ring using the so-called Network Topology strategy so it is resistant to highly correlated failure modes like rack or availability zone outages, cooling failures or network partitions.
The problem with the Network Topology strategy is that it is cumbersome to implement manually. Additionally, if you hand-place your containers in the optimal network topology, you can’t take advantage of automated scheduling.
The second issue to overcome for your containerized Cassandra cluster is therefore automated replica placement according to the Network Topology strategy.
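To make the idea of rack-aware placement concrete, here is a minimal Python sketch. It is a simplified illustration of the principle, not Cassandra's actual implementation: the ring layout, node names, rack names and the function `place_replicas` are all hypothetical. It walks the ring clockwise from a key's token and prefers nodes on racks that do not yet hold a replica.

```python
from typing import List, Tuple

# Hypothetical ring description: (token, node name, rack name).
Ring = List[Tuple[int, str, str]]


def place_replicas(ring: Ring, key_token: int, replication_factor: int) -> List[str]:
    """Walk the ring clockwise from the key's token, preferring unseen racks."""
    ordered = sorted(ring)
    # Index of the first node whose token is >= the key's token (wrap to 0).
    start = next((i for i, (tok, _, _) in enumerate(ordered) if tok >= key_token), 0)
    walk = [ordered[(start + i) % len(ordered)] for i in range(len(ordered))]

    replicas: List[str] = []
    seen_racks = set()
    skipped: List[str] = []
    for _, node, rack in walk:
        if len(replicas) == replication_factor:
            break
        if rack not in seen_racks:
            replicas.append(node)
            seen_racks.add(rack)
        else:
            # Same rack as an existing replica: only use it as a fallback.
            skipped.append(node)
    # Fewer racks than replicas: fill up from the nodes skipped earlier.
    for node in skipped:
        if len(replicas) == replication_factor:
            break
        replicas.append(node)
    return replicas


ring = [(0, "n1", "rack1"), (25, "n2", "rack1"), (50, "n3", "rack2"), (75, "n4", "rack3")]
print(place_replicas(ring, key_token=10, replication_factor=3))
```

With this example ring, a key with token 10 lands on three nodes in three different racks, so losing any single rack leaves two replicas available. An orchestrator that understands this rule can apply it automatically instead of relying on hand placement.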
Improving cluster recovery time after node failure
Another consideration when running Cassandra in production is cluster recovery time in the event of outage or failure.
Cassandra itself is capable of replicating data. If a node dies, a new node brought into the cluster will be populated with data from other healthy nodes – a process known as bootstrapping.
As noted above, the bootstrap process puts load on your cluster because cluster resources are used to stream data to new nodes. This load reduces the read/write performance of Cassandra, which slows your application down. The key to bootstrapping is to finish as quickly as possible, but that becomes harder when the cluster is already under stress.
So, the third issue you need to overcome to run Cassandra in containers is fast bootstrap operations.
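A rough back-of-the-envelope calculation shows why bootstrap time matters. The Python sketch below assumes that streaming throughput is the only bottleneck and ignores compaction, index rebuilds and competing traffic, so it gives an optimistic lower bound. The function name and the figures used are hypothetical.

```python
def bootstrap_hours(data_gb: float, stream_mbit_per_s: float) -> float:
    """Rough lower bound on the time to stream data_gb at a sustained rate."""
    megabits = data_gb * 8 * 1000  # GB -> megabits (decimal units)
    return megabits / stream_mbit_per_s / 3600


# Hypothetical node holding 500 GB, streamed at a sustained 200 Mbit/s.
print(round(bootstrap_hours(500, 200), 1))
```

Under these assumptions, rebuilding a 500 GB node takes about 5.6 hours, and the cluster runs with reduced performance for that whole period.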
Increasing container density by safely running multiple rings on the same hosts
The above operational best practices have been concerned with reliability and performance. Now we can look at efficiency. Most organizations run multiple Cassandra rings. However, since Cassandra is a resource intensive application, the costs of operating multiple rings can be considerable. It would be ideal if multiple rings could be run on the same hosts.
Lightweight containers are a great way to do this, but you need to give each container its own volume and ensure that you don’t colocate data for two Cassandra instances that belong to the same ring on the same host. Instead, you want each of your rings to be spread across the cluster, maximizing resiliency in the face of hardware failure.
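The placement rule described here can be sketched in a few lines of Python. This is an illustrative toy scheduler, not the behavior of any particular orchestrator: the host names, ring names and the function `schedule_rings` are hypothetical. It never puts two instances of the same ring on one host, while letting different rings share hosts.

```python
from collections import defaultdict
from typing import Dict, List


def schedule_rings(hosts: List[str], rings: Dict[str, int]) -> Dict[str, List[str]]:
    """Assign each ring's instances to distinct hosts, balancing load across hosts."""
    load = defaultdict(int)  # number of containers already placed per host
    placement: Dict[str, List[str]] = {}
    for ring, instances in rings.items():
        if instances > len(hosts):
            raise ValueError(
                f"{ring}: needs {instances} hosts, only {len(hosts)} available"
            )
        # Least-loaded hosts first; ties broken by host name for determinism.
        chosen = sorted(hosts, key=lambda h: (load[h], h))[:instances]
        for host in chosen:
            load[host] += 1
        placement[ring] = chosen
    return placement


hosts = ["h1", "h2", "h3", "h4"]
placement = schedule_rings(hosts, {"ring-a": 3, "ring-b": 3})
print(placement)
```

Because each ring takes at most one slot per host, a single host failure costs each ring at most one instance, even though both rings share the same four machines.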
Contributed by: Gou Rao, Co-founder and CTO at Portworx. Gou was previously CTO of Dell’s Data Protection division and of Citrix Systems’ ASG; co-founder and CTO of Ocarina Networks and of Net6; and a key architect at Intel and Lockheed Martin. He holds computer science bachelor’s (Bangalore University) and master’s (University of Pennsylvania) degrees. Portworx solves the operational challenges required to run Cassandra in Docker containers along with any other stateful service.