Cassandra

repo: Anant/awesome-cassandra
category: Databases related: Mysql · Mongodb · Postgresql


Awesome Cassandra

<a href="http://cassandra.apache.org/"><img src="https://upload.wikimedia.org/wikipedia/commons/5/5e/Cassandra_logo.svg" align="right" width="140"></a>

Cassandra is a free and open-source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra is supported by the Apache Software Foundation and is also known as Apache Cassandra.

This is a curated list of awesome Cassandra packages and resources, maintained by Rahul Singh of Anant. Feel free to contact me if you'd like to collaborate on this and other awesome lists: Awesome Cassandra, Awesome Solr, Awesome Lucene. This list powers the Resources section of Cassandra.Link, a rich collection of blog feeds and curated links presented as a searchable knowledge base.

Contents

General

Cassandra

  • Apache Cassandra - Manage massive amounts of data, fast, without losing sleep.

Cassandra History

Cassandra Use Cases

  • Datastax Academy: What is Cassandra? - Introduction to what Cassandra is, where it came from, and some of its benefits.
  • [Kaa application based on Raspberry Pi and DHT11 sensor](https://github.com/pyroalf/kaa-cassandra-sample) - Cassandra IoT use case with a Raspberry Pi and a DHT11 sensor.
  • [Simple Node.js Express 4 Cassandra Application](https://github.com/bradtraversy/mysubscribers) - MySubscribers is a very simple application (Start of an application) which allows you to create, read, update and delete users/subscribers. This application was only created to aid the YouTube course.

Cassandra Distributions

Cassandra Compliant Databases on JVM

  • DataStax Enterprise - Most widely used commercial distribution of Cassandra, integrated with Apache Spark (for SparkSQL, analytics), Apache Solr (for secondary index), Apache TinkerPop based Graph stored in Cassandra, and OpsCenter.
  • DDAC/Luna - Datastax Distribution of Cassandra, a production-ready distribution with a bulk loader, supported by Datastax. DDAC is now deprecated, but Datastax still supports Cassandra through its new Luna service.

Cassandra Compliant Databases on C++

  • ScyllaDB - NoSQL data store built on the Seastar framework, compatible with Cassandra.
  • YugaByte Database - YugaByteDB is a transactional, high-performance database for building distributed cloud services. It supports Cassandra-compatible and Redis-compatible APIs, with PostgreSQL in Beta.

Cassandra as a Service / Managed Cassandra Based on Open Source Cassandra

Cassandra as a Service / Managed Cassandra Based on Proprietary Technology

Using Cassandra

Cassandra from Relational

Cassandra Data Modeling

Cassandra Architecture

Cassandra Monitoring

Cassandra Maintenance

Cassandra Performance Tuning

Cassandra Security

Cassandra Deployment

Cassandra Deployment on Docker / Containerized Cassandra

Cassandra Deployment on Kubernetes / Kubernetized Cassandra

  • [K8ssandra.io - Kubernetes + Cassandra](https://k8ssandra.io/) - K8ssandra provides a production-ready platform for running Cassandra on Kubernetes. This includes automation for operational tasks such as repairs, backups, and monitoring.
  • [Datastax - Cassandra Kubernetes Operator](https://github.com/datastax/cass-operator) - Datastax's Cassandra Kubernetes Operator which supports Datastax as well as open source Cassandra containers on Kubernetes.
  • [Instaclustr - Kubernetes Operator for Cassandra](https://github.com/instaclustr/cassandra-operator) - The Cassandra operator manages Cassandra clusters deployed to Kubernetes and automates tasks related to operating a Cassandra cluster.
  • [Sky UK - Cassandra Kubernetes Operator](https://github.com/sky-uk/cassandra-operator) - Kubernetes operator that manages Cassandra clusters inside Kubernetes. Well designed and organized.
  • CassKop - Cassandra operator for Kubernetes - Kubernetes operator that automates Cassandra operations such as deploying a new rack-aware cluster, adding/removing nodes, configuring Cassandra and JVM parameters, and upgrading JVM and Cassandra versions. Written in Go.
  • Strapdata - Elassandra Operator for Kubernetes - The Elassandra Kubernetes Operator automates the deployment and management of Elassandra clusters deployed in multiple Kubernetes clusters.
  • Rook.io - Cassandra on Kubernetes - Rook is an open source cloud-native storage orchestrator, providing the platform, framework, and support for a diverse set of storage solutions to natively integrate with cloud-native environments. They have a special operator for Cassandra amongst other providers.
  • KUDO Cassandra Operator - The KUDO Cassandra Operator makes it easy to deploy and manage Cassandra on Kubernetes.

Integrating with Cassandra

  • [Building a Streaming Data Hub with Elasticsearch, Kafka and Cassandra](http://thenewstack.io/building-streaming-data-hub-elasticsearch-kafka-cassandra/) - Blog post detailing how a streaming analytics system on top of open source, big data components can be done.
  • [Docker container for Kafka - Spark streaming - Cassandra](https://github.com/Yannael/kafka-sparkstreaming-cassandra) - Dockerfile that sets up a complete streaming environment for experimenting with Kafka, Spark streaming (PySpark), and Cassandra.
  • sample KafkaSparkCassandra - Introductory sample Scala app using Apache Spark Streaming to accept data from Kafka and write a summary to Cassandra.
  • sample Spark Cassandra with SSL - Simple sample job illustrating how to run Apache Spark analytics against Cassandra over an SSL connection.

.NET and Cassandra

  • Cassandra API with .NET - Quickstart guide on how to use .NET and the Azure Cosmos DB Cassandra API to build a profile app.
  • DataStax C# Driver - C# Driver for Cassandra from DataStax.
  • DataStax C# Driver Documentation - Documentation on the C# Driver for Cassandra from DataStax.
  • CQL data types to C# types - Documentation on CQL data types to C# types.
  • Connect to Cassandra with C# - Instaclustr article on how to connect to Cassandra with C#.
  • [Access Amazon Keyspaces with a Cassandra .NET Core Driver](https://docs.aws.amazon.com/keyspaces/latest/devguide/using_dotnetcore_driver.html) - Article shows how to connect to Amazon Keyspaces by using a .NET Core client driver.
  • [Cassandra ADO.NET Driver](https://www.cdata.com/drivers/cassandra/ado/) - Cassandra ADO.NET Data Provider enables users to easily connect to Cassandra data from .NET applications.
  • [Cassandra Pagination with ASP.NET Core C#](https://bhonemyintkyaw777.medium.com/cassandra-pagination-with-asp-net-core-c-a85fd58f6b2b) - Article covering how to create infinite scroll pagination with Cassandra and ASP.NET Core C#.

Spark

Search / Secondary Indexes

Databases

Timeseries Databases

Monitoring / Metrics

Custom Time Series

Graph

Miscellaneous

  • Cassandra vs MongoDB - Article comparing the two popular NoSQL databases.
  • Stargate - Stargate is an open-source data gateway that provides REST, GraphQL and schemaless JSON interfaces to Cassandra.
  • [Meet Stargate, DataStax's GraphQL for databases. First stop - Cassandra](https://www.zdnet.com/article/meet-stargate-datastaxs-graphql-for-databases-first-stop-cassandra/) - Introduction and high-level overview of Stargate.
  • Apache/Usergrid - Open source Backend as a Service (BaaS) on Cassandra, Elasticsearch with client SDKs for iOS/Android/.NET/Java.
  • [Building Your Own BaaS With Apache Usergrid & Docker: Lessons Learned At Scale](http://events17.linuxfoundation.org/sites/events/files/slides/Building%20Your%20Own%20BaaS%20With%20Apache%20Usergrid%20%26%20Docker.pdf) - Introductory presentation to Apache UserGrid.
  • Scalar-labs/Scalardl - Tamper-evident and scalable distributed ledger platform.
  • Wikimedia/Restbase - Distributed storage with REST API & dispatcher for backend services.
  • Wikimedia/restbase-mod-table-spec - Shared spec and tests for RESTBase table storage.

Packages

Libraries

  • express-cassandra - Cassandra ORM/ODM/OGM for Node.js with optional support for Elassandra & JanusGraph.
  • [DataStax Java Driver](https://github.com/datastax/java-driver) - Java client driver for Cassandra.
  • DataStax C++ Driver - Modern, feature-rich, and highly tunable C/C++ client library for Cassandra (1.2+) and DataStax Enterprise (3.1+) using exclusively Cassandra's native protocol and Cassandra Query Language v3.
  • [DataStax Python Driver](https://github.com/datastax/python-driver) - Modern, feature-rich and highly-tunable Python client library for Cassandra (2.1+) using exclusively Cassandra's binary protocol and Cassandra Query Language v3.
  • [DataStax Ruby Driver](https://github.com/datastax/ruby-driver) - Ruby client driver for Cassandra. This driver works exclusively with the Cassandra Query Language version 3 (CQL3) and Cassandra's native protocol.
  • [DataStax Node.js Driver](https://github.com/datastax/nodejs-driver) - Modern, feature-rich and highly tunable Node.js client library for Cassandra (1.2+) and DataStax Enterprise (3.1+) using exclusively Cassandra's binary protocol and Cassandra Query Language v3.
  • DataStax C# Driver - Modern, feature-rich and highly tunable C# client library for Cassandra (1.2+) and DataStax Enterprise (3.1+) using exclusively Cassandra's binary protocol and Cassandra Query Language v3.
  • DataStax PHP Driver - DataStax PHP Driver for Cassandra.
  • Achilles - Achilles is an open source persistence manager for Cassandra, with features such as advanced bean mapping (compound primary key, composite partition key, timeUUID, etc.) and native collection and map support.
  • phpcassa - PHP client library for Cassandra.
  • Caffinitas - Caffinitas is an advanced object mapper for Cassandra which has been especially designed to work with Datastax Java Driver 2.1+ against Cassandra 2.1, 2.0 or 1.2.
  • Spring Data for Cassandra - Spring Data for Cassandra offers a familiar interface to those who have used other Spring Data modules in the past.
  • gocql - Package gocql implements a fast and robust Cassandra client for the Go programming language.

Tools

  • Hackolade - Visual data modeling tool for NoSQL databases and structures like Cassandra, ElasticSearch, Graph DBs, JSON, APIs.
  • JetBrains Datagrip DB IDE - The Cross-Platform IDE for Databases & SQL by JetBrains, with support for Cassandra.
  • Datastax - Management API for Cassandra - The Management API is a sidecar service layer that attempts to build a well supported set of operational actions on Cassandra® nodes that can be administered centrally.
  • DataStax OpsCenter - Simplified management for DataStax Enterprise and Cassandra database clusters.
  • CassandraCAS - Compare-and-swap tool for Cassandra created by Datomic.
  • Peloton - Unified resource scheduler created by Uber. This tool can handle many nodes and clusters through resource management and scalability.
  • Ansible-Galaxy: Cassandra GitHub - Collection called cassandra that aims to provide all the Ansible modules for interacting with Cassandra.
  • Ansible-Galaxy: Cassandra - Documentation for Ansible-Galaxy: Cassandra.
  • Ansible-dse - Set of Ansible playbooks that will build a Datastax Enterprise cluster.
  • dseansible - DSE Installation and Upgrade Ansible Playbooks/Roles for Ubuntu Linux.
  • DbSchema - Cassandra Designer - DbSchema: Cassandra Diagram Designer & GUI Admin Tool that supports Cassandra among other databases.
  • [DBeaver - Free Universal Database Tool](https://dbeaver.io/) - Third party tool for dealing with all sorts of databases including Cassandra.
  • RazorSQL - Multi DB Manager Tool - Multi-db tool for Linux, Mac, and Windows that works with Cassandra.
  • Cassandra Reaper - Automated repairs for Cassandra. Supports all versions.
  • cstar perf - Cassandra performance testing platform.
  • Spark Cassandra Stress - Tool for testing the DataStax Spark Connector against Cassandra or DSE.
  • cqlmigrate - Cassandra CQL migration tool. cqlmigrate is a library for performing schema migrations on a Cassandra cluster.
  • cassandra-migration-tool-java - Cassandra migration tool for Java is a lightweight tool used to execute schema and data migrations on a Cassandra database.
  • Cassalog - Cassalog is a schema change management library and tool for Cassandra that can be used with applications running on the JVM.
  • cdeploy - Cdeploy is a simple tool to manage your Cassandra schema migrations in the style of dbdeploy.
  • Web: Cassandra Calculator - Simple calculator to see how size / replication factor affect the system's consistency.
  • Cassandra-web - Web interface for Cassandra.
  • CassandraRestfulAPI - The CassandraRestfulAPI project exposes Cassandra data tables through a RESTful API.
  • Netflix: Staash - Language-agnostic and storage-agnostic web interface for storing data in persistent storage systems. The metadata layer abstracts many storage details, and the pattern automation APIs automate common data access patterns.
  • cql-vim - Cassandra CQL Syntax Highlighter for Vim.
  • Presto - Distributed SQL Query Engine for Big Data. Presto allows querying data where it lives, including Hive, Cassandra, relational databases or even proprietary data stores.
  • SSTable Tools - Toolkit for parsing, creating and doing other fun stuff with Cassandra 3.x SSTables.
  • Cassandra-Exporter - Simple Tool to Export / Import Cassandra Tables into JSON.
  • Cassandra SStable Tools - Multiple different tools combined into one that helps admins get summaries, metadata, partition info, cell info.
  • Cassandra-Client - Simple gui tool for browsing tables and data in Cassandra.
  • CQL Data Modeler - Very useful tool to test out a CQL schema and visualize what the partition would look like in relation to the columns and rows.
  • Cassandra Snapshot Backup - Quick and easy way to snapshot files in a Cassandra database and back them up using Ansible.
  • Slothsandra - Integration for Cassandra with the Slack app that stores old messages Slack itself no longer retains.
  • sandraREST - Cassandra manager with a web UI for RESTful APIs.
  • Cassandra Leadership - Library to help elect leaders using Cassandra. Uses Paxos to build a leadership election module.
  • Terraform Cassandra - Terraform module that creates a Cassandra cluster.
  • Datadog - Third party tool that allows monitoring and metrics for Cassandra nodes and clusters.
  • tlp-cluster - Provisioning tool for Cassandra designed for developers looking to benchmark and test Cassandra. It assists with builds and starting instances on AWS.
  • Helenos - Free web-based environment that simplifies data exploration & schema management with the Cassandra database.
  • ValuStor - ValuStor is a key-value pair database solution.
  • Cassandra-Migration - Cassandra / DataStax Enterprise database migration (schema evolution) library.
  • JanusGraph-Utils - Tool to develop a graph database app.
  • Scylla-Migrator - Migrates data to Scylla using Spark, normally from Cassandra.
  • Cassandra CA Manager - Create and sign Java keystores.
  • Zipkin - Distributed tracing system.
  • Instaclustr Kerberos plugin - GSSAPI authentication provider for Cassandra.
  • [Instaclustr Java Driver for Kerberos](https://github.com/instaclustr/cassandra-java-driver-kerberos) - GSSAPI authentication provider for the Cassandra Java driver.
  • Instaclustr Minotaur - Command line tool for consistent rebuilding of a Cassandra cluster.
  • Instaclustr TTL Remover - Command line tool for rewriting SSTables to remove TTLs.
  • Instaclustr SSTable Generator - CLI tool for programmatic generation of Cassandra SSTables.
  • Instaclustr Exporter - Java agent that exports Cassandra metrics to Prometheus.
  • Instaclustr Go Client for Instaclustr Icarus - Go client for Instaclustr Icarus sidecar.
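The Cassandra Calculator entry above refers to how cluster size and replication factor affect consistency. As a rough illustration of the arithmetic involved (not the linked calculator's implementation), the sketch below assumes the commonly cited rule that a read overlaps the latest write when read replicas plus write replicas exceed the replication factor. The function names are hypothetical and chosen only for this example.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for a quorum: a strict majority of the replicas."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    return replication_factor // 2 + 1


def is_strongly_consistent(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


def tolerated_failures(replication_factor: int, required_replicas: int) -> int:
    """Replicas that can be down while a request needing
    `required_replicas` acknowledgements can still succeed."""
    return replication_factor - required_replicas


if __name__ == "__main__":
    rf = 3  # illustrative value, not taken from the list above
    q = quorum(rf)
    print(f"RF={rf}: QUORUM needs {q} replicas")
    print("QUORUM write + QUORUM read overlap:", is_strongly_consistent(rf, q, q))
    print("ONE write + ONE read overlap:", is_strongly_consistent(rf, 1, 1))
    print("Replicas that may be down at QUORUM:", tolerated_failures(rf, q))
```

With a replication factor of 3, a quorum is 2 replicas, so quorum reads and writes overlap while single-replica reads and writes do not.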

Open Source Applications

  • Twissandra - Twissandra is an example project, created to learn and demonstrate how to use Cassandra. Running the project will present a website that has similar functionality to Twitter.
  • ChronoServer - Test server for sampling how long it takes mobile & web clients to make various types of requests to a server doing common request patterns.
  • Cassandra Cluster Admin - Cassandra Cluster Admin is a GUI tool to help people administrate their Cassandra cluster.
  • Cassandra-Tools - Python Fabric scripts to help automate the launching and managing of cluster testing on AWS.
  • Cassandra Opstools - Generic scripts to review and monitor Cassandra, from Spotify.
  • CCM (Cassandra Cluster Manager) - Script/library to create, launch and remove a Cassandra cluster on localhost.
  • Netflix-Priam - Co-Process for backup/recovery, Token Management, and Centralized Configuration management for Cassandra.
  • CStar - Cassandra cluster orchestration tool for the command line.
  • CMB - Highly available, horizontally scalable queuing and notification service compatible with AWS SQS and SNS.
  • CassieQ - Distributed queue built off of Cassandra.
  • Cherami - Distributed, scalable, durable, and highly available message queue system.
  • Scheduler - Scala library for scheduling arbitrary code to run at an arbitrary time.

Logging / Metrics

Resources

Documentation

Books

Courses

Communities

Blogs

  • Datastax - DataStax, Inc. is a data management company that provides commercial support, software, and cloud database-as-a-service based on Cassandra.
  • Codecentric: Cassandra - Codecentric is an IT consulting company; these are their blog posts on Cassandra.
  • Pythian: Cassandra - Pythian provides data and cloud-related services. The company provides services for Oracle, SQL Server, MySQL, Hadoop, Cassandra and other databases and their supporting infrastructure.
  • Instaclustr - Managed and supported open source solutions for Cassandra, Kafka, Elasticsearch & Redis.
  • OpenCredo:Cassandra - OpenCredo is a consulting company that helps clients make informed decisions around cloud native and open source technologies, as well as public cloud services.
  • DOAN DuyHai's Blog: Cassandra - Duyhai Doan is a freelance big data and cloud architect who values sharing knowledge and contributing to the technology community.
  • Amy Tobert - Amy Tobert is a full-stack engineer & leader with a passion for sustainable systems and people-centered leadership. Her blog details different Cassandra deployments among other topics.
  • Christopher Batey: Cassandra - Christopher Batey is a software engineer of over 15 years and is a primary contributor to Akka and occasional contributor to Cassandra.
  • Distributed Bytes: Cassandra - Tim Ojo is the creator of Distributed Bytes and a software engineer at Capital One. This is a collection of his posts on Cassandra.
  • The Netflix Tech Blog - Learn about Netflix’s world class engineering efforts, company culture, product developments and more.
  • Spotify R&D / Engineering Blog : Cassandra - Cassandra related posts on Spotify's official technology blog.
  • Ryan Svilha - Ryan Svilha is a principal engineer at DataStax. His blog posts cover topics surrounding Cassandra and associated tools.
  • Anant - Anant builds and manages business platforms that connect customer experiences and information systems with real-time data platforms.

Videos

Slides
