I recently caught up with Ben Bromhead, CTO and Co-founder at Instaclustr, to discuss the departure of DataStax from the Apache Cassandra open source project, and how there’s now a void with regard to Cassandra dev community health, database feature upgrades, and overall commits. Instaclustr is looking to step into this space, fill the vacuum, get commits back up, and replace DataStax on these initiatives. Ben is also active in the Apache Cassandra community. Prior to Instaclustr, Ben worked as an independent consultant developing NoSQL solutions for enterprises, and he ran a high-tech cryptographic and cybersecurity testing lab at BAE Systems and Stratsec.
Daniel D. Gutierrez – Managing Editor, insideAI News
insideAI News: Cassandra has had a long and illustrious history. How do you feel changes in DataStax’s relationship with the community will affect this open source project long term?
Ben Bromhead: There is no doubt that after the DataStax departure, it took some time for the community to reflect and regroup with the reduction of resources. What I believe has emerged in that wake, though, is a more balanced ecosystem with more contributions from some of the largest users of this technology – such as Apple, Uber, Instagram and Netflix, to name a few.
DataStax certainly helped to bootstrap the Apache Cassandra project, but over time a rich ecosystem of other vendors and operators started to grow around the community. These users have quickly filled the void and now there is a strong community comprised of those who run open source Apache Cassandra in production (and rely on this technology for some of their most important revenue-generating applications).
insideAI News: What’s the word out about making Cassandra into a new commercialized version? What would that do to the open source version? How well would it be accepted in the marketplace?
Ben Bromhead: There are a number of new vendors emerging that have taken some of the baseline of the technology and have developed their own spin on it. For many organizations choosing to deploy Apache Cassandra – and then deciding on either commercial or open source – it really comes down to a strategic technology decision. But we see that so many of the technology leaders and the largest users of this database will only ever deploy the open source version of Apache Cassandra. No technology lock-in, more transparency, and a large and vibrant community are the key reasons why. These large users need to have the decision-making in their own hands, rather than in the hands of a vendor promoting its own product.
insideAI News: Can you give a few words about any upcoming features for Cassandra?
Ben Bromhead: There’s a lot on the horizon that I think will be very interesting – and useful – to the Cassandra community. Among them:
- Pluggable storage with RocksDB will deliver substantial improvements to performance.
- Virtual tables. Some great work being done by Cassandra committer Jeff Jirsa on this front, and this will allow API developers to create virtual tables in Cassandra.
- Change Data Capture (CDC) improvements. Uber is putting a chunk of time into improving CDC performance since they have built some in-process CDC mechanisms – which means this code path is going to be better tested.
- Decoupling redundancy from availability. Both Instagram and Apple have ongoing work allowing Cassandra to have nodes that act as hint stores or lightweight replicas in specific situations using different approaches.
insideAI News: What role will Instaclustr play in all of this?
Ben Bromhead: Within Instaclustr we have now established a team of dedicated developers to work on community-related Apache Cassandra activities. This includes writing code, participating in project activities, fixing bugs, and writing new features that the community can take advantage of. Our intent here is to build a team of contributors that are actively involved and can provide some real operational experience to the community. We have well over 15 million node hours of Apache Cassandra management experience, and have seen this amazing database deployed in very effective (and very ineffective) ways – and both at very large scale and for smaller projects. We believe that our unique operational experience is a real positive for the Apache Cassandra community, and we feel compelled to step up and do our part to make sure that experience is readily available.
This article is factually incorrect. DataStax was not consulted in advance or asked to provide comment, so we will leave this important correction for readers here in the Comments Section.
DataStax has not departed the Apache Cassandra open source project. Some may not be aware that DataStax has built most of Apache Cassandra with more than 85% of code commits to date. At the request of the Apache Foundation, our co-founder and CTO stepped down as chair of the project. And we applaud the objective, to encourage broader community participation. Still, we employ the lion’s share of committers, and we will likely be the biggest contributor of Apache Cassandra 4.0 when it releases. As a strong open source advocate, we are also driving Apache TinkerPop with nearly 100% of code contributions. Plus, developers and other technical distributed database professionals and those working with DataStax Enterprise and Apache Cassandra are fully supported by DataStax’s Developer Relations team. We are additionally driving Apache Cassandra events at various cities around the world. Check us out here: https://academy.datastax.com/
Andrew Lampitt
Sr. Director, Product Marketing
DataStax
Thanks for the feedback Andrew.
I’m happy to correct my statement from “departure” to “moved all full-time Apache Cassandra™ developers to the DSE team” as per https://www.datastax.com/2016/11/serving-customers-serving-the-community.
Cheers
Ben