DataStax releases enterprise platform making graph fully native
With today’s release of the next version of its enterprise platform, DataStax is fulfilling a promise made last year to fully integrate the graph engine into the core platform. DataStax Enterprise 6.8 also adds support for the Cassandra Kubernetes operator that was announced last week, and a host of features designed to smooth operations.
As we stated almost a year ago, one of DataStax’s goals was making graph a first-class citizen in its core platform. The new release accomplishes that by allowing use of the same API and the querying of graph data from CQL, the native query language of the Cassandra platform. Data scientists and other practitioners who are knowledgeable about graph databases will still be able to use the Gremlin API.
Before this, DataStax Enterprise was, in effect, a dual-model database that let you work in Cassandra or in graph, but not both on the same core engine. In that sense, it resembled Microsoft Azure Cosmos DB, a multi-model database that allows you to choose which data model to work in, and then use that API exclusively for that data set.
With graph fully integrated, data can be ingested and modeled as it normally would be for Cassandra, with graph views based on the partition keys that are central to the Cassandra data model. So you don’t have to load data one way or another, or model it differently, and the graph view can now take advantage of the same multi-master, scale-out capability of the underlying Cassandra platform.
Adding Kubernetes support is part of DataStax’s roadmap both to make Cassandra suited for cloud-native operation via containers and microservices and to provide another means of getting back into close alignment with the Apache Cassandra open source community. In this case, DataStax contributed its Kubernetes operator to the Apache project, where it could be converged with operators developed by other members of the open source community.
Operational simplification is another key theme of the 6.8 release, and here the Kubernetes operator plays directly into that by enabling Cassandra to be deployed and elastically scaled more easily.
Other operational features include new guardrails that codify best practices in deploying Cassandra. These guardrails can provide warnings when releasing code into operation, triggering alerts when deployment specs such as column sizing or number of indexes might compromise operations; they allow operators or developers to reconfigure before problems hit the fan.
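To make the guardrail idea concrete, here is a minimal sketch of a pre-deployment check that warns when a table specification exceeds configured limits. It is illustrative only: the TableSpec class, the check_guardrails function, and both threshold values are hypothetical and do not reflect DataStax's actual guardrail settings or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableSpec:
    """Hypothetical description of a table about to be deployed."""
    name: str
    max_column_bytes: int
    secondary_indexes: int


# Illustrative limits only (assumptions, not DataStax's real defaults).
MAX_COLUMN_BYTES = 5 * 1024 * 1024
MAX_SECONDARY_INDEXES = 1


def check_guardrails(spec: TableSpec) -> List[str]:
    """Return one warning per limit that the table spec exceeds."""
    warnings = []
    if spec.max_column_bytes > MAX_COLUMN_BYTES:
        warnings.append(
            f"{spec.name}: column size {spec.max_column_bytes} bytes "
            f"exceeds limit of {MAX_COLUMN_BYTES} bytes"
        )
    if spec.secondary_indexes > MAX_SECONDARY_INDEXES:
        warnings.append(
            f"{spec.name}: {spec.secondary_indexes} secondary indexes "
            f"exceeds limit of {MAX_SECONDARY_INDEXES}"
        )
    return warnings
```

An empty list means the spec passes; any entries give the operator or developer a chance to reconfigure before the table goes into operation.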
Incremental node sync is another new 6.8 feature aimed at streamlining operations. The guiding notion was to reduce the overhead of re-syncing data when a node or network connection goes down. Previously, you would have had to re-synchronize the entire table and take a node down, but with the new incremental feature, only the specific data affected must be re-synced. And a new feature, Zero Copy Streaming, speeds the addition or removal of nodes for business continuity tasks.
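The difference between re-syncing a whole table and re-syncing only the affected data can be sketched as a digest comparison per data segment. This is a simplified illustration, not DataStax's implementation: the segment names, the digest and segments_to_resync functions, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List


def digest(rows: List[str]) -> str:
    """Hash a segment's rows in order so two replicas can be compared cheaply."""
    h = hashlib.sha256()
    for row in rows:
        h.update(row.encode("utf-8"))
        h.update(b"\x00")  # separator so ["ab"] and ["a", "b"] hash differently
    return h.hexdigest()


def segments_to_resync(
    healthy: Dict[str, List[str]],
    recovering: Dict[str, List[str]],
) -> List[str]:
    """Return only the segments whose contents differ between the two replicas."""
    stale = []
    for segment, rows in healthy.items():
        if digest(recovering.get(segment, [])) != digest(rows):
            stale.append(segment)
    return sorted(stale)
```

If a recovering replica missed writes to a single segment, only that segment is returned for repair, and segments that already match are left alone.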
Another of the promises from last year, simplifying the developer experience, will wait for another release cycle. For now, the emphasis is on finalizing Astra, the managed Apache Cassandra cloud service that DataStax has been promising for over a year. Astra is currently in beta on Google Cloud, and the pressure is on now that AWS is previewing its own Managed Cassandra Service counterpart.
The other major objective for DataStax is tying up the loose ends in reconnecting to the Apache Cassandra community and realigning its core platform accordingly. A key milestone will be the release of Apache Cassandra 4.0, which will focus more on tweaking and updating (e.g., Java 11 support) than on dramatic extensions of the platform. Getting releases aligned so that the enterprise platform is a modular superset is a work in progress.