How to Set Up a Cassandra Docker Container Using Docker Compose

Any application under development needs a database, and different use cases call for different databases. There is no point in installing all of them on your machine. With containers, you can provision the database of your choice for testing without installing it locally.

In this article, we will provision an Apache Cassandra Docker container and go through the steps to use it.

What is Cassandra?

Organizations usually have a love-hate relationship with data. Poorly managed data leads to unguided decision making and lost market insights. Moreover, large and active datasets that serve thousands of requests become highly arduous to maintain.

This is where Cassandra comes in. This distributed database management system is designed to handle large amounts of data across several data centers and cloud environments.

Therefore, large businesses like Apple, Instagram, Facebook, Spotify, Cisco, Twitter, eBay, Netflix, etc., are using this software.

Advantages Of Cassandra

Besides handling large volumes of data, Cassandra offers the following advantages:

Highly Scalable

One of the main advantages of Cassandra is that it is highly scalable. You can add or remove servers to scale the cluster up or down as your needs change, without any downtime or pause in your applications.

High Performance

Cassandra's architectural choices make it a strong solution for processing data. One way it achieves its speed is by allowing nodes to make data storage decisions. This way, there is no centralized “master node” that must be consulted on storage decisions.

Open Source

Cassandra is an open source Apache project, which means that it is completely free. It has a large community where people share views, questions, and suggestions related to big data. Moreover, it can be integrated with other Apache open source projects such as Hadoop, Apache Pig, and Apache Hive.

High Availability

Cassandra provides high availability by replicating data across different locations and data centers. Its peer-to-peer architecture lets every node perform read and write operations, so data can replicate swiftly across data centers and geographies.

Peer To Peer Architecture

Cassandra follows a peer-to-peer architecture, so the cluster has no single point of failure. You can add as many servers as your business requires to the data centers that make up a Cassandra cluster.

Fault Tolerance

Businesses usually worry about whether their stored data is safe. With Cassandra, the data is stored in several locations, so even if one server fails or is compromised, the data can still be retrieved from another location. The number of replicas is chosen by the user through the replication factor, and Cassandra's backup and recovery capabilities complement this replication.
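
Replication is configured per keyspace. As a rough sketch, the helper below prints a `CREATE KEYSPACE` statement with a chosen replication factor. The function name `keyspace_cql` and the keyspace name `demo` are illustrative assumptions, and the sketch assumes the `SimpleStrategy` replication class, which suits a single data center or a test setup.

```shell
#!/bin/sh
# Print a CREATE KEYSPACE statement for a keyspace name and replication factor.
# keyspace_cql is a hypothetical helper name, not part of Cassandra.
keyspace_cql() {
    name=$1
    rf=$2
    # Reject anything that is not a positive integer.
    case $rf in
        ''|*[!0-9]*|0)
            echo "replication factor must be a positive integer" >&2
            return 1
            ;;
    esac
    printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor': %s};\n" "$name" "$rf"
}

# Three copies of every row in the (hypothetical) demo keyspace.
keyspace_cql demo 3
```

The printed statement can be pasted into a cqlsh session. On a single-node test cluster, a replication factor of 1 is the sensible value, since there is only one node to hold the data.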

Multi Data Center And Hybrid Cloud Support

Cassandra also supports multiple data centers and hybrid cloud deployments, because it is designed as a distributed system that runs large numbers of nodes across several data centers.

Column Oriented

Cassandra is column oriented (more precisely, a wide-column store), which gives it a rich data model. Data is stored by column, which allows fast slicing. Its column names can also hold actual data instead of only metadata.

Great Analytics Possibilities

Cassandra provides the following four methods to carry out analytics:

  • Solr based integrated search.
  • Spark based near real time analytics.
  • Batch analytics integrating Hadoop with Cassandra.
  • External Batch analytics powered by Hadoop and Cloudera/Hortonworks.

This way, the range and usage of analytics significantly expand in Cassandra.

Next in this post, we will show you how to set up a Cassandra Docker container using Docker Compose.

How to Set Up a Cassandra Docker Container Using Docker Compose

Prerequisites

  • A server running Ubuntu 20.04.
  • A root user or a user with sudo privileges.

Add Docker CE Repository

First, you will need to install some dependencies on your server. You can install them using the following command:

				
					apt install apt-transport-https ca-certificates curl software-properties-common -y
				
			

Once all the required dependencies are installed, download and add the Docker GPG key with the following command:

				
					curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
				
			

Next, add the Docker repository to APT:

				
					add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
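
The repository line above hard-codes the `focal` codename and the `amd64` architecture. As a small sketch, the helper below builds the same line from two arguments, so it can be adapted to another release or CPU. The function name `docker_repo_line` is a hypothetical choice for illustration.

```shell
#!/bin/sh
# Build the APT source line for Docker's Ubuntu repository.
# docker_repo_line is a hypothetical helper; its arguments are the Ubuntu
# codename (for example focal) and the CPU architecture (for example amd64).
docker_repo_line() {
    printf 'deb [arch=%s] https://download.docker.com/linux/ubuntu %s stable\n' "$2" "$1"
}

# Example with the values used in this guide:
docker_repo_line focal amd64

# On a live system the values could be detected instead of hard-coded:
#   add-apt-repository "$(docker_repo_line "$(lsb_release -cs)" "$(dpkg --print-architecture)")"
```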
				
			

Once the repository is added, update the repository cache using the following command:

				
					apt update -y
				
			

Install Docker CE

Now install Docker CE by running the following command:

				
					apt install docker-ce -y
				
			

After the installation, you can view the Docker information with the following command:

				
					docker info
				
			

You should see output similar to the following (version numbers may differ on your system):

				
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.2-0-ga916309
 init version: de40ad0
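
When the full report is more than you need, a single field can be pulled out of it. The sketch below reads `docker info` output on standard input and prints only the server version. The helper name `server_version` is an assumption made for this example.

```shell
#!/bin/sh
# Read `docker info` output on stdin and print the "Server Version" value.
# server_version is a hypothetical helper name.
server_version() {
    awk -F': ' '/^ *Server Version:/ { print $2; exit }'
}

# Example using a fragment of the output shown above:
printf 'Server:\n Containers: 0\n Server Version: 20.10.17\n' | server_version

# Against a running daemon:
#   docker info | server_version
```

Docker can also print this field directly with `docker info --format '{{.ServerVersion}}'`, which avoids the text parsing.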
				
			

Install Docker Compose

You will also need to install Docker Compose on your server. First, visit the Docker Compose GitHub page, pick the latest Docker Compose version, and download it with the following command:

				
					curl -L "https://github.com/docker/compose/releases/download/v2.6.1/docker-compose-linux-x86_64" -o /usr/local/bin/docker-compose
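
The URL above pins one version and one platform. As a sketch, the helper below assembles the URL from a release tag, an OS name, and an architecture, assuming that release assets follow the naming pattern seen in the command above. The function name `compose_url` is hypothetical.

```shell
#!/bin/sh
# Build the download URL for a Docker Compose release binary.
# compose_url is a hypothetical helper; its arguments are the release tag,
# the OS name, and the machine architecture.
compose_url() {
    # Asset names use a lowercase OS name, while `uname -s` prints "Linux".
    os=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    printf 'https://github.com/docker/compose/releases/download/%s/docker-compose-%s-%s\n' "$1" "$os" "$3"
}

# Print the URL for the current machine and the version used in this guide.
compose_url v2.6.1 "$(uname -s)" "$(uname -m)"

# The result could then be passed to curl:
#   curl -L "$(compose_url v2.6.1 "$(uname -s)" "$(uname -m)")" -o /usr/local/bin/docker-compose
```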
				
			

Next, set the execute permission on the downloaded binary file with the following command:

				
					chmod +x /usr/local/bin/docker-compose
				
			

Now verify the Docker Compose version using the following command:

				
					docker-compose --version
				
			

You should see the following output:

				
					Docker Compose version v2.6.1
				
			

Create a Docker Compose YAML File

You will also need to create a YAML file that defines the Cassandra container. First, create a directory for Cassandra with the following command:

				
					mkdir Cassandra
				
			

Next, create a docker-compose.yaml file inside the Cassandra directory:

				
					nano Cassandra/docker-compose.yaml
				
			

Add the following configurations:

				
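version: '3.9'

services:
  cassandra:
    # Official Cassandra image, pinned to the 4.0 release line
    image: cassandra:4.0
    ports:
      # Publish the CQL native transport port on the host
      - 9042:9042
    volumes:
      # Keep Cassandra data on the host so it survives container removal
      - ~/apps/cassandra:/var/lib/cassandra
    environment:
      # Cluster name reported by cqlsh when you connect
      - CASSANDRA_CLUSTER_NAME=cloudinfra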
				
			

Save and close the file when you are finished.
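
If you prefer not to use an editor, the same file can be written from the shell. The sketch below wraps this in a helper that creates the directory and writes the configuration with a here-document. The function name `write_compose` is a hypothetical choice.

```shell
#!/bin/sh
# Write the Compose file from this guide into the given directory.
# write_compose is a hypothetical helper name.
write_compose() {
    dir=$1
    mkdir -p "$dir"
    # The quoted 'EOF' keeps the shell from expanding anything in the body.
    cat > "$dir/docker-compose.yaml" <<'EOF'
version: '3.9'

services:
  cassandra:
    image: cassandra:4.0
    ports:
      - 9042:9042
    volumes:
      - ~/apps/cassandra:/var/lib/cassandra
    environment:
      - CASSANDRA_CLUSTER_NAME=cloudinfra
EOF
}

write_compose Cassandra
```

Running `docker-compose config -q` inside the directory afterwards checks the file for errors without starting any container.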

Launch Cassandra Container

At this point, the Docker Compose file is ready. Now change to the Cassandra directory and create the Cassandra container with the following commands:

				
cd Cassandra
docker-compose up -d
				
			

You should see the following output:

				
[+] Running 10/10
 ⠿ cassandra Pulled                                                                                                                     13.7s
   ⠿ d7bfe07ed847 Pull complete                                                                                                          6.1s
   ⠿ caca7a4a00fe Pull complete                                                                                                          7.8s
   ⠿ b669251e1903 Pull complete                                                                                                         10.1s
   ⠿ cf45eff8a02b Pull complete                                                                                                         10.2s
   ⠿ 7cc26c186df9 Pull complete                                                                                                         10.3s
   ⠿ 48f949015080 Pull complete                                                                                                         11.6s
   ⠿ 46441dece7d1 Pull complete                                                                                                         11.8s
   ⠿ aa89acd6f103 Pull complete                                                                                                         13.1s
   ⠿ 2760e583af58 Pull complete                                                                                                         13.2s
[+] Running 2/2
 ⠿ Network cassandra_default        Created                                                                                              0.1s
 ⠿ Container cassandra-cassandra-1  Started                                                                                              0.9s
				
			

Now you can verify the Cassandra container using the following command:

				
					docker-compose ps
				
			

You should see the running container in the following output:

				
NAME                    COMMAND                  SERVICE             STATUS              PORTS
cassandra-cassandra-1   "docker-entrypoint.s…"   cassandra           running             7000-7001/tcp, 7199/tcp, 9160/tcp, 0.0.0.0:9042->9042/tcp, :::9042->9042/tcp
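
A status of `running` only means the container process has started. Cassandra itself can need some time before it accepts CQL connections. One way to wait for it is a small retry loop, sketched below. The function name `retry` and the attempt and delay values are assumptions for illustration.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds or the attempts run out.
# retry is a hypothetical helper: retry <attempts> <delay-seconds> <command...>
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Example: try a CQL query every 10 seconds, up to 30 times.
#   retry 30 10 docker-compose exec -T cassandra cqlsh -e 'DESCRIBE KEYSPACES'
```

The `-T` flag disables pseudo-TTY allocation, which is what you want when the command runs from a script rather than an interactive terminal.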
				
			

To see the downloaded Cassandra image, use the following command:

				
					docker images
				
			

You should see the following output:

				
REPOSITORY   TAG       IMAGE ID       CREATED      SIZE
cassandra    4.0       efe0d8614dc7   7 days ago   343MB
				
			

Access Cassandra Container

You can now access the running Cassandra container using the following command:

				
					docker-compose exec cassandra /bin/bash
				
			

Once you are connected to the Cassandra container, you will get the following shell prompt:

				
					root@9ca7c49f47b0:/# 
				
			

Connect to the Cassandra shell using the CQLSH utility:

				
					cqlsh
				
			

You should see the following shell:

				
Connected to cloudinfra at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.5 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh> 
				
			

Verify the Cassandra version with the following command:

				
					show version
				
			

You should see the Cassandra version in the following output:

				
					[cqlsh 6.0.0 | Cassandra 4.0.5 | CQL spec 3.4.5 | Native protocol v5]
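
The same check can be scripted from the host without opening an interactive shell. The sketch below extracts the Cassandra version from a banner line like the one above. The helper name `cassandra_version` is hypothetical, and the pattern assumes the banner keeps the format shown in this guide.

```shell
#!/bin/sh
# Read a cqlsh version banner on stdin and print the Cassandra version.
# cassandra_version is a hypothetical helper name.
cassandra_version() {
    sed -n 's/.*| Cassandra \([0-9][0-9.]*\) |.*/\1/p'
}

# Example using the banner shown above:
echo '[cqlsh 6.0.0 | Cassandra 4.0.5 | CQL spec 3.4.5 | Native protocol v5]' | cassandra_version

# From the host, against the running container:
#   docker-compose exec -T cassandra cqlsh -e 'SHOW VERSION' | cassandra_version
```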
				
			

Exit from the Cassandra shell using the following command:

				
					exit
				
			

Run the exit command again to exit from the Cassandra container:

				
					exit
				
			

To see the Cassandra container log, run the following command:

				
					docker-compose logs
				
			

You should see the following output:

				
cassandra-cassandra-1  | INFO  [main] 2022-07-28 07:04:57,408 PipelineConfigurator.java:125 - Starting listening for CQL clients on /0.0.0.0:9042 (unencrypted)...
cassandra-cassandra-1  | INFO  [NonPeriodicTasks:1] 2022-07-28 07:04:57,415 SSTable.java:111 - Deleting sstable: /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-2-big
cassandra-cassandra-1  | INFO  [NonPeriodicTasks:1] 2022-07-28 07:04:57,420 SSTable.java:111 - Deleting sstable: /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb-1-big
cassandra-cassandra-1  | INFO  [main] 2022-07-28 07:04:57,432 CassandraDaemon.java:782 - Startup complete
cassandra-cassandra-1  | INFO  [OptionalTasks:1] 2022-07-28 07:05:07,297 CassandraRoleManager.java:339 - Created default superuser role 'cassandra'
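
The `Startup complete` line in this log is a convenient marker that the node has finished starting. As a sketch, the helper below reads log text on standard input and succeeds only when that marker is present. The function name `startup_complete` is an assumption for this example.

```shell
#!/bin/sh
# Read container logs on stdin and succeed if Cassandra reported
# that startup finished. startup_complete is a hypothetical helper name.
startup_complete() {
    grep -q 'CassandraDaemon.*Startup complete'
}

# Example using a line from the output shown above:
printf '%s\n' 'cassandra-cassandra-1  | INFO  [main] 2022-07-28 07:04:57,432 CassandraDaemon.java:782 - Startup complete' \
    | startup_complete && echo "Cassandra has started"

# Against the running container:
#   docker-compose logs cassandra | startup_complete
```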
				
			

That's it! Thank you for reading How to Set Up a Cassandra Docker Container Using Docker Compose.

How to Set Up a Cassandra Docker Container Using Docker Compose Conclusion

In this guide, we explained how to set up a Cassandra Docker container using Docker Compose on Ubuntu 20.04. Cassandra is widely used by app development and data management companies, from start-ups to long-established enterprises. Written in Java, this NoSQL database differs considerably from other NoSQL and relational databases. Because it can handle high data volumes, it is well suited to large organizations.


Hitesh Jethva

I am a fan of open source technology and have more than 10 years of experience working with Linux and Open Source technologies. I am one of the Linux technical writers for Cloud Infrastructure Services.
