How to Install Apache Cassandra on CentOS 8
In this article, we introduce Apache Cassandra and its main features, then move on to the installation guide. Let's get started.
What is Cassandra
Apache Cassandra is an open source distributed database maintained by the Apache Software Foundation that lets users store and manage large data volumes across multiple data centers. This wide-column database has a peer-to-peer architecture and is highly available, fault tolerant, and scalable, with tunable consistency. It is written in Java and is an efficient NoSQL database with advanced features.
The database was designed to help companies handle big data workloads across multiple servers and data centers without failure. It provides high availability and supports multi-node clusters with no single point of failure. Each node in a Cassandra cluster is independent, interconnected with the others, and plays the same role.
Any node in the cluster can accept read and write requests, regardless of where the data is located. As a result, if a node goes down due to a technical issue, other nodes can serve the read and write requests. Facebook, Rackspace, eBay, Twitter, Cisco, Adobe, Netflix, etc., are a few high profile companies that use Apache Cassandra.
Apache Cassandra Features
Cassandra is one of the popular distributed databases available because of its technical features. Here are some of the features of Cassandra that make it an attractive option for enterprises:
Fast Writes: Cassandra runs on inexpensive commodity hardware and can handle large data volumes. Its write path is designed for high throughput, and it can store hundreds of terabytes of data without degrading read efficiency.
Fault Tolerant: Data is replicated across multiple nodes, and every node plays the same role, so if a node goes down, another replica can serve its requests. You can also add nodes to the cluster as needed, which reduces the chance of performance being affected.
High Scalability: The design allows users to easily add extra nodes to the Cassandra cluster at any given time as the demand or need grows. Cassandra grows horizontally rather than going vertical. With Cassandra, you can extend or scale across many geographical sites and add more data or consumers as needed.
Supports a Wide Range of Data Structures: Cassandra allows users to store structured, semi structured, and unstructured data. The open source distributed database supports all kinds of data structures and their dynamic changes to reflect the changing needs and demands.
Quick Response Time: Cassandra is linearly scalable and allows users to increase the count of nodes in the cluster. Users can add extra nodes in a linear fashion without thinking much about the complexities. As a result, you can increase the throughput and maintain a quick response time. Thus, Cassandra offers fast linear scale performance.
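Transaction Support: ACID stands for Atomicity, Consistency, Isolation, and Durability. Cassandra does not provide full ACID transactions in the way relational databases do. Instead, it offers atomic, isolated, and durable writes within a single partition, tunable consistency, and lightweight transactions for conditional updates.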
Easy Data Distribution: The column oriented database allows distribution of data in a seamless manner. Data distribution and replication perform together in Cassandra. Data distribution in Cassandra is a quick and simple process because it provides the flexibility to transfer information by replicating data across different data centers and commodity servers.
High Reliability: All the nodes in the cluster are interconnected, so Cassandra has no single point of failure, and the failure of one node does not take the cluster down. It was designed to tolerate node failures, a vital feature for mission critical applications.
Follow the post below to learn how to install Apache Cassandra on CentOS 8.
How to Install Apache Cassandra on CentOS 8
Apache Cassandra is based on Java, and Cassandra 3.11, the version installed in this guide, requires Java 8. So you will need to install Java 8 on your server. You can install it by running the following command:
dnf install java-1.8.0-openjdk-devel -y
Once Java is installed, verify the Java version with the following command:
java -version
You will get the Java version in the following output:
openjdk version "1.8.0_322"
OpenJDK Runtime Environment (build 1.8.0_322-b06)
OpenJDK 64-Bit Server VM (build 25.322-b06, mixed mode)
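The version check above can also be scripted. The following is a minimal sketch, assuming that a Java 8 runtime reports itself as 1.8.x in its version banner, as in the output shown; the function name is_java8 is hypothetical and not part of Java or Cassandra.

```shell
# Hypothetical helper: succeeds when a `java -version` banner reports Java 8.
is_java8() {
    case "$1" in
        # Java 8 identifies itself as version "1.8.x".
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# java -version prints to stderr, so redirect it to stdout before capturing.
if command -v java >/dev/null 2>&1; then
    banner=$(java -version 2>&1)
    if is_java8 "$banner"; then
        echo "Java 8 detected"
    else
        echo "Java 8 not detected; check which Java is on your PATH" >&2
    fi
fi
```

If another Java version is reported, a different JDK is probably first on your PATH.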
Install Apache Cassandra
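By default, Apache Cassandra is not included in the CentOS 8 repositories, so you will need to create a Cassandra repo file on your system. Create it with a text editor, for example (the file name cassandra.repo is a suggestion; any .repo file in /etc/yum.repos.d works):
nano /etc/yum.repos.d/cassandra.repo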
Add the following lines:
[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
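If you prefer not to use an editor, the same repo definition can be written from a script. This is a sketch only: the function name write_cassandra_repo and the target file name are assumptions, while the repo contents are copied from the lines above.

```shell
# Hypothetical helper: write the Cassandra repo definition to the given path.
write_cassandra_repo() {
    # The quoted 'EOF' delimiter keeps the shell from expanding anything inside.
    cat > "$1" <<'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
EOF
}

# Example call (run as root); the file name is an assumption:
# write_cassandra_repo /etc/yum.repos.d/cassandra.repo
```

Passing the path as an argument lets you write the file to a scratch location first and inspect it before copying it into /etc/yum.repos.d.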
Save and close the file, then install Apache Cassandra with the following command:
dnf install cassandra -y
Once Apache Cassandra is installed, you can proceed to the next step.
Create a Systemd Service File for Apache Cassandra
The next step is to create a systemd service file to manage the Apache Cassandra service. Create it with a text editor, for example:
nano /etc/systemd/system/cassandra.service
Add the following lines:
[Unit]
Description=Apache Cassandra
After=network.target

[Service]
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always

[Install]
WantedBy=multi-user.target
Save and close the file, then reload the systemd daemon with the following command:
systemctl daemon-reload
Next, start the Cassandra service and enable it to start at system reboot:
systemctl start cassandra
systemctl enable cassandra
You can now check the status of Apache Cassandra with the following command:
systemctl status cassandra
You will get the following output:
● cassandra.service - Apache Cassandra
   Loaded: loaded (/etc/systemd/system/cassandra.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-02-24 14:08:04 UTC; 4s ago
 Main PID: 5629 (java)
    Tasks: 27 (limit: 11412)
   Memory: 1.0G
   CGroup: /system.slice/cassandra.service
           └─5629 /usr/bin/java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOut>

Feb 24 14:08:07 centos8 cassandra: INFO [main] 2022-02-24 14:08:07,045 CassandraDaemon.java:505 - Classpath: /etc/cassandra/conf:/usr/>
Feb 24 14:08:07 centos8 cassandra: INFO [main] 2022-02-24 14:08:07,045 CassandraDaemon.java:507 - JVM Arguments: [-Xloggc:/var/log/cas>
Feb 24 14:08:07 centos8 cassandra: WARN [main] 2022-02-24 14:08:07,149 NativeLibrary.java:189 - Unable to lock JVM memory (ENOMEM). Th>
Feb 24 14:08:07 centos8 cassandra: WARN [main] 2022-02-24 14:08:07,150 StartupChecks.java:136 - jemalloc shared library could not be p>
Feb 24 14:08:07 centos8 cassandra: WARN [main] 2022-02-24 14:08:07,150 StartupChecks.java:169 - JMX is not enabled to receive remote c>
Feb 24 14:08:07 centos8 cassandra: INFO [main] 2022-02-24 14:08:07,151 SigarLibrary.java:44 - Initializing SIGAR library
Feb 24 14:08:07 centos8 cassandra: WARN [main] 2022-02-24 14:08:07,170 SigarLibrary.java:174 - Cassandra server running in degraded mo>
Feb 24 14:08:07 centos8 cassandra: WARN [main] 2022-02-24 14:08:07,171 StartupChecks.java:311 - Maximum number of memory map areas per>
Feb 24 14:08:07 centos8 cassandra: INFO [main] 2022-02-24 14:08:07,263 QueryProcessor.java:121 - Initialized prepared statement caches>
Feb 24 14:08:07 centos8 cassandra: INFO [main] 2022-02-24 14:08:07,898 ColumnFamilyStore.java:432 - Initializing system.IndexInfo
You can also check the Cassandra status using the nodetool utility:
nodetool status
If everything is fine, you will get the following output:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  70.88 KiB  256     100.0%            ba8c91a0-bf8f-4ae1-973e-e6d178ca624a  rack1
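Cassandra can take a little while to finish starting, so nodetool may fail or show no UN (Up/Normal) node right after the service starts. The sketch below polls until the node reports UN. All three function names are hypothetical helpers, and the retry count and delay are arbitrary values you should adjust.

```shell
# Hypothetical helper: count nodes reported as UN (Up/Normal) in
# `nodetool status` output read from standard input.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Only sleep when another attempt is coming.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Succeeds once at least one node shows up as UN.
cassandra_ready() {
    [ "$(nodetool status 2>/dev/null | count_up_normal)" -ge 1 ]
}

# Example: poll every 5 seconds, up to 30 times.
# retry 30 5 cassandra_ready
```

On a single-node install like this one, a count of 1 means the local node is up and serving requests.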
Change Default Cluster Name
First, you will need to install Python2 to use the Cassandra cqlsh utility. You can install it with the following command:
dnf install python2
alternatives --set python /usr/bin/python2
Next, connect to Cassandra using the cqlsh utility:
cqlsh
Once you are connected, you will get the following shell:
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.12 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
As you can see, the default cluster name is set to Test Cluster. To change the default cluster name, run the following command:
cqlsh> UPDATE system.local SET cluster_name = 'New Cluster' WHERE KEY = 'local';
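Next, exit from the Cassandra shell with the following command:
cqlsh> exit
Next, you will also need to define the new cluster name in the Cassandra configuration file, cassandra.yaml. Open it with a text editor, for example (the path below assumes the RPM package layout):
nano /etc/cassandra/conf/cassandra.yaml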
Change the cluster name as shown below:
cluster_name: 'New Cluster'
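The same edit can be made from a script instead of an editor. This is a minimal sketch: set_cluster_name is a hypothetical helper, it assumes the file contains a line starting with cluster_name: at the beginning of the line, and the configuration path in the example call is an assumption about your installation.

```shell
# Hypothetical helper: replace the cluster_name line in a cassandra.yaml file.
# Usage: set_cluster_name FILE NEW_NAME
set_cluster_name() {
    file=$1
    name=$2
    tmp=$(mktemp) || return 1
    # \047 is a single quote; the name is passed in as an awk variable.
    awk -v name="$name" '
        /^cluster_name:/ { print "cluster_name: \047" name "\047"; next }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example call (run as root); the path is an assumption:
# set_cluster_name /etc/cassandra/conf/cassandra.yaml 'New Cluster'
```

The helper writes the result back with cat rather than mv so the original file keeps its ownership and permissions. Names containing backslashes or single quotes are not handled.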
Save and close the file, then run the following command to flush the system keyspace's in-memory data to disk:
nodetool flush system
Next, restart the Apache Cassandra service to apply the changes:
systemctl restart cassandra
Now, verify the new cluster name by connecting with cqlsh again:
cqlsh
You should see the new cluster name in the following output:
Connected to New Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.12 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
Great, you have learned how to install Apache Cassandra on CentOS 8!
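The cluster name can also be checked without opening an interactive shell. The sketch below assumes that nodetool describecluster prints a line of the form "Name: <cluster name>"; the helper name cluster_name_from_describe is hypothetical.

```shell
# Hypothetical helper: extract the cluster name from `nodetool describecluster`
# output read from standard input.
cluster_name_from_describe() {
    # Print only the text that follows "Name:" on the matching line.
    sed -n 's/^[[:space:]]*Name:[[:space:]]*//p'
}

# Example (on a running node):
# nodetool describecluster | cluster_name_from_describe
```

This form is convenient in scripts that need to compare the reported name against an expected value.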
How to Install Apache Cassandra on CentOS 8 Conclusion
Cassandra is a NoSQL database that can store large amounts of data and distribute it across many nodes and data centers. It is a good fit for companies that need to process a lot of data quickly and reliably.
In this guide, we explained what Apache Cassandra is, covered its main features, and showed how to install it on CentOS 8. We also explained how to change the default cluster name. For more information, visit the Cassandra documentation.