DataStax review: Cassandra made faster and easier
As I discussed in my review of Google Cloud Bigtable in 2016, Google’s 2006 Bigtable paper inspired several large-scale distributed open source NoSQL databases, including Apache HBase and Apache Cassandra. I went on to explain that Cassandra was born at Facebook using ideas from Bigtable and the key-value store Amazon Dynamo. I also noted that while Cassandra is a bit more popular than HBase, has a SQL-like query language (CQL), and is easier to get up and running, it is still complicated and has a significant learning curve.
Google Cloud Bigtable is one good managed alternative to running your own Cassandra clusters. Others include DataStax Enterprise (DSE) and DataStax Managed Cloud.
Why DataStax? Essentially, DataStax is the supported enterprise version of Cassandra, with improved performance and security, vastly improved management, advanced replication, in-memory OLTP, a bulk loader, tiered storage, search, analytics, and a developer studio. Not coincidentally, DataStax employees have contributed roughly 85 percent of the code in the Apache Cassandra project.
Like Bigtable and Cassandra, DataStax is best suited for large databases, from terabytes to petabytes, and is best used with a denormalized schema that has many columns per row. Both DataStax and Cassandra tend to be used for very large-scale applications. For example, eBay uses DataStax Enterprise to store 250 TB of auction data with 6 billion writes and 5 billion reads daily. Apple has (or had) more than 75,000 Cassandra nodes storing more than 10 PB of data.
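To put the eBay numbers in perspective, here is a rough back-of-the-envelope sketch in Python that converts the daily totals quoted above into average per-second rates. It assumes the load is spread evenly across a 24-hour day, which real traffic almost certainly is not, so treat the results as averages rather than peak figures. The function and variable names are my own, chosen for illustration.

```python
# Back-of-the-envelope conversion of the daily totals quoted above
# into average per-second rates. Assumes an even load across the day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_count: float) -> float:
    """Return the average number of events per second for a daily total."""
    return daily_count / SECONDS_PER_DAY


# Figures quoted in the article for eBay's DataStax Enterprise deployment.
writes_per_sec = per_second(6_000_000_000)
reads_per_sec = per_second(5_000_000_000)

print(f"average writes per second: {writes_per_sec:,.0f}")
print(f"average reads per second:  {reads_per_sec:,.0f}")
```

In other words, the cluster averages somewhere around 69,000 writes and 58,000 reads every second, sustained around the clock, and peaks are presumably higher.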