How to Install Apache Cassandra on Debian 11 Tutorial (Step by Step)

What is Apache Cassandra

Apache Cassandra is an open source, distributed NoSQL database that delivers fast read and write performance and can store large volumes of data across multiple data centers. It is highly scalable, fault tolerant and written in Java. Many companies choose Apache Cassandra over other database systems for its advanced features and its ability to handle large data sets.

Today, high profile companies such as Spotify, Twitter, Cisco, eBay, Facebook, Instagram and Uber use Cassandra. It was designed to manage big data workloads across multiple servers with no single point of failure. Its architecture supports horizontal scaling across multiple nodes, which lets users address performance intensive use cases.

As a wide-column store, Cassandra can hold data from varied sources such as PDFs, emails, social media posts and server logs. As a result, organizations can make better informed decisions with the help of Cassandra.

Over time, the distributed database became popular, and today it is known as one of the most trusted databases. Every node in a Cassandra cluster is independent, and the remaining nodes take over if one goes down. As a result, the chance of an outage is lower with Cassandra.

Cassandra supports various features, such as fast writes, effortless replication across data centers, zero downtime, linear scalability and more.

Also, every node in the cluster can accept read and write requests, wherever it is located. It is a popular schema free, API supported NoSQL database that offers easy replication with no single point of failure.

Benefits of Apache Cassandra

Unlike relational databases, Cassandra allows for unstructured data. Its simple design, horizontal scaling and more features enable developers to handle large data sets easily across multiple servers. Here are a few other benefits of using Cassandra:

Open Source

Apache Cassandra is an open source distributed database. Adopting open source technologies results in a higher speed of innovation. Many software development organizations look for software and tools built on open source technologies because they are affordable, extensible and flexible, which helps in avoiding vendor lock in.

Familiar Interface

Another benefit of choosing Apache Cassandra over other distributed databases is that the Cassandra Query Language (CQL) is similar to SQL. As a result, developers find it relatively easy to use the language and perform operations.

Scalability

Earlier, applications were scaled vertically using expensive machines, which was a time consuming and costly procedure. Today, with a distributed database like Cassandra, one can scale horizontally by adding nodes to the cluster. One can also scale across many geographical sites and take on more data or users as requirements grow.

Quick Response Time

Cassandra’s performance scales close to linearly, so users can add nodes to a cluster without reworking the application. Each added node helps improve throughput while maintaining a quick response time.

Seamless Replication

Many enterprises are moving to hybrid cloud and multi data center deployments to improve business performance and meet new challenges. Leading enterprises also leverage the strengths of each ecosystem rather than restricting themselves to a single provider. To get the most from these environments, it is best to have a database that replicates data across them while providing security and scalability.

High Level Performance

In Cassandra, if a node goes down due to a technical or other issue, another node can take its place and serve the data. Each node in the cluster is equal and plays the same role in read and write operations. This fault tolerance lets applications keep running without interruption. Further, one can add nodes to the cluster without hurting performance.

Follow this post to learn how to install Apache Cassandra on Debian 11.

Install Apache Cassandra on Debian 11

Install Java OpenJDK

Cassandra runs on the Java Virtual Machine, so you need to install OpenJDK first. Java is included in the default Debian 11 repository. The commands in this tutorial assume a root shell; prefix them with sudo otherwise. You can install Java by running the following command:

				
					apt-get install default-jdk -y
				
			

Once Java is installed, you can verify the installed version of Java with the following command:

				
					java -version
				
			

You will get the Java version in the following output:

				
					openjdk version "11.0.14" 2022-01-18
					OpenJDK Runtime Environment (build 11.0.14+9-post-Debian-1deb11u1)
					OpenJDK 64-Bit Server VM (build 11.0.14+9-post-Debian-1deb11u1, mixed mode, sharing)
				
			

At this point, Java is installed on your system. You can now proceed to the next step.
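
Cassandra 4.0 is documented as supporting Java 8 and Java 11, so it can be worth checking the major version in a script before continuing. The following is a minimal sketch; the helper name java_major_version is hypothetical and not part of any package.

```shell
# Hypothetical helper: print the major Java version found in
# `java -version` output supplied on standard input.
java_major_version() {
    awk -F'"' '/version/ {
        split($2, parts, "[.]")
        # Legacy scheme: "1.8.0_312" means Java 8.
        if (parts[1] == "1") print parts[2]; else print parts[1]
        exit
    }'
}

# `java -version` writes to stderr, hence the 2>&1 redirect.
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | java_major_version
fi
```

With the output shown above, the helper would print 11.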

Install Apache Cassandra on Linux

Before starting, you will need to install some dependencies on your system. You can install them by running the following command:

				
					apt-get install gnupg2 wget curl unzip apt-transport-https -y
				
			

Once all the dependencies are installed, add the Cassandra GPG key with the following command (apt-key still works on Debian 11 but prints a deprecation warning):

				
					curl https://downloads.apache.org/cassandra/KEYS | apt-key add -
				
			

Next, add the Cassandra repository to APT using the following command:

				
					echo "deb https://downloads.apache.org/cassandra/debian 40x main" | tee -a /etc/apt/sources.list.d/cassandra.sources.list
				
			
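
Because apt-key is deprecated, one alternative is to store the key in its own file and reference it with the signed-by option of the source entry. The sketch below illustrates that approach under stated assumptions: the key file path, the function names and the choice to overwrite (rather than append to) the list file are hypothetical, while the repository URL and the 40x series are taken from the commands above.

```shell
# Assumed location for the key file; the directory is created if missing.
KEYRING=/etc/apt/keyrings/apache-cassandra.asc

# Print an APT source entry that trusts only the given key file.
# $1 = key file path, $2 = release series (40x in this tutorial).
cassandra_source_entry() {
    printf 'deb [signed-by=%s] https://downloads.apache.org/cassandra/debian %s main\n' "$1" "$2"
}

# Run as root. Defined as a function so nothing changes until it is called.
setup_cassandra_repo() {
    install -d -m 0755 "$(dirname "$KEYRING")" &&
    curl -fsSL -o "$KEYRING" https://downloads.apache.org/cassandra/KEYS &&
    cassandra_source_entry "$KEYRING" 40x > /etc/apt/sources.list.d/cassandra.sources.list
}
```

Calling setup_cassandra_repo as root would take the place of both the apt-key step and the echo step.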

Once the repository is added, update the package index and install Apache Cassandra with the following commands:

				
					apt-get update -y
					apt-get install cassandra -y
				
			

Verify Apache Cassandra Installation

After the Apache Cassandra installation, its service starts automatically. You can check the status of Apache Cassandra using the following command:

				
					systemctl status cassandra
				
			

You will get the following output:

				
					● cassandra.service - LSB: distributed storage system for structured data
					     Loaded: loaded (/etc/init.d/cassandra; generated)
					     Active: active (running) since Fri 2022-03-25 05:29:27 UTC; 14s ago
					       Docs: man:systemd-sysv-generator(8)
					    Process: 7197 ExecStart=/etc/init.d/cassandra start (code=exited, status=0/SUCCESS)
					      Tasks: 32 (limit: 4679)
					     Memory: 1.1G
					        CPU: 20.348s
					     CGroup: /system.slice/cassandra.service
					             └─7290 /usr/bin/java -ea -da:net.openhft... -XX:+UseThreadPriorities -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:+AlwaysPreTouc>

					Mar 25 05:29:27 debian11 systemd[1]: Starting LSB: distributed storage system for structured data...
					Mar 25 05:29:27 debian11 systemd[1]: Started LSB: distributed storage system for structured data.

				
			

You can also use the nodetool command-line tool to check the Apache Cassandra status:

				
					nodetool status
				
			

You will get the following output:

				
					Datacenter: datacenter1
					=======================
					Status=Up/Down
					|/ State=Normal/Leaving/Joining/Moving
					--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
					UN  127.0.0.1  69.09 KiB  16      100.0%            22efe79d-a286-470c-92a4-31b61bc18983  rack1

				
			
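
In scripts, the same output can be checked automatically. The first column combines status (U or D) and state (N, L, J or M), so a healthy node shows UN. The sketch below counts such nodes; the function name is hypothetical.

```shell
# Hypothetical helper: count nodes marked UN (Up/Normal) in
# `nodetool status` output supplied on standard input.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Only query a live node when nodetool is available.
if command -v nodetool >/dev/null 2>&1; then
    nodetool status | count_up_normal
fi
```

For the single-node output shown above, the count would be 1.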

Change Apache Cassandra Cluster Name

Apache Cassandra provides the cqlsh command-line utility to interact with Cassandra via CQL. Run the following command to connect to Cassandra:

				
					cqlsh
				
			

Once you are connected, you should see the following output:

				
					Connected to Test Cluster at 127.0.0.1:9042
					[cqlsh 6.0.0 | Cassandra 4.0.3 | CQL spec 3.4.5 | Native protocol v5]
					Use HELP for help.
					cqlsh>
				
			
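
Because CQL resembles SQL, a first session tends to look familiar. The sketch below only prints a few statements and does not contact the database; the keyspace demo, the table users and the function name are hypothetical, and SimpleStrategy with a replication factor of 1 is assumed to suit a single-node test setup only.

```shell
# Hypothetical helper: print a few CQL statements on standard output.
# The quoted heredoc delimiter keeps the shell from expanding anything.
demo_cql() {
    cat <<'EOF'
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE IF NOT EXISTS demo.users (id int PRIMARY KEY, name text);
INSERT INTO demo.users (id, name) VALUES (1, 'alice');
SELECT name FROM demo.users WHERE id = 1;
EOF
}
```

On a node where cqlsh is installed, the statements could be saved with demo_cql > demo.cql and then run with cqlsh -f demo.cql.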

Cassandra Cluster

By default, the Cassandra cluster name is set to “Test Cluster”. You can change it with the cqlsh utility.

If you have left the Cassandra shell, log in again with the following command:

				
					cqlsh
				
			

Once you are logged in, change the Cassandra cluster name to “Cluster1” using the following command:

				
					UPDATE system.local SET cluster_name = 'Cluster1' WHERE KEY = 'local';
				
			

Run the following command to exit from the Cassandra shell:

				
					exit
				
			

Next, you will also need to edit the Cassandra configuration file so that it defines the same cluster name. Open it with:

				
					nano /etc/cassandra/cassandra.yaml
				
			

Find the cluster_name line and change it to:

				
					cluster_name: 'Cluster1'
				
			
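
The same edit can be scripted, for example when preparing several nodes. This sketch assumes the file contains a single line beginning with cluster_name: and that the new name has no characters special to sed (such as / or &); the function name is hypothetical.

```shell
# Hypothetical helper: set cluster_name in a cassandra.yaml-style file.
# $1 = path to the file, $2 = new cluster name.
# A copy of the original is kept next to it with a .bak suffix.
set_cluster_name() {
    sed -i.bak "s/^cluster_name:.*/cluster_name: '$2'/" "$1"
}
```

Run as root, set_cluster_name /etc/cassandra/cassandra.yaml Cluster1 would make the change described above and leave the previous version in cassandra.yaml.bak.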

Save and close the file, then flush the system keyspace to disk so that the updated name is persisted:

				
					nodetool flush system
				
			

Restart the Cassandra service to apply the changes:

				
					systemctl restart cassandra
				
			
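
Cassandra can take some time to accept CQL connections after a restart, so a connection attempt made immediately may be refused. A small retry loop is one way to wait; the function name and the attempt and delay values below are hypothetical.

```shell
# Hypothetical helper: run a command until it succeeds.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command.
# Returns 0 on the first success, 1 once the attempts are used up.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while ! "$@"; do
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
}
```

For example, retry 30 2 cqlsh -e 'SELECT cluster_name FROM system.local;' would try for roughly a minute before giving up.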

Now, verify the cluster name by reconnecting with cqlsh. Cassandra may need a little time after the restart before it accepts connections:

				
					cqlsh
				
			

You will get the following output:

				
					Connected to Cluster1 at 127.0.0.1:9042
					[cqlsh 6.0.0 | Cassandra 4.0.3 | CQL spec 3.4.5 | Native protocol v5]
					Use HELP for help.
				
			
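
The name can also be read without an interactive session. nodetool describecluster prints a Name: line, and the sketch below extracts it; the helper name is hypothetical, and the line layout noted in the comment is an assumption about the output format.

```shell
# Hypothetical helper: extract the cluster name from
# `nodetool describecluster` output supplied on standard input.
# Assumed layout: an indented line such as "Name: Cluster1".
cluster_name_from_describe() {
    sed -n 's/^[[:space:]]*Name:[[:space:]]*//p' | head -n 1
}

# Only query a live node when nodetool is available.
if command -v nodetool >/dev/null 2>&1; then
    nodetool describecluster | cluster_name_from_describe
fi
```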

Install Apache Cassandra on Debian 11 Conclusion

Congratulations! You have successfully installed Apache Cassandra on Debian 11. You can now use Cassandra on any server where you need to store large volumes of data. For more information, visit the official Apache Cassandra documentation.

Hitesh Jethva

I am a fan of open source technology and have more than 10 years of experience working with Linux and Open Source technologies. I am one of the Linux technical writers for Cloud Infrastructure Services.
