Configure Free!! Cassandra Cluster part-2

In the previous post, we launched 3 “Free” EC2 machines. In this post we will configure Cassandra on those nodes and make them work as a single cluster.

1. First we need to install Java on each node. Run the commands below on each of the nodes:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update && sudo apt-get install oracle-java8-installer

Once completed, check the installed Java version by running the command below:

java -version


 

2. (for all nodes) Next, we need to download Cassandra. Open the Apache Cassandra website and go to Downloads:
http://cassandra.apache.org/download/
Then choose the appropriate version. We will take the latest one available, which is 3.7 as of 01-July-2016.


Here we get the download link:
http://apache.communilink.net/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz

 

3. (for all nodes) To download Cassandra, use the “wget” command followed by the download URL:

wget http://apache.communilink.net/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz

We need to run the above command on each node.


 

4. (for all nodes) Once downloaded, we need to extract (untar) the downloaded file. For that, run the command below:

tar -xvzf *cassandra*tar*

The tar options xvzf have the following meanings:
x: extract files from the archive
v: verbose output, listing each file as it is extracted
z: decompress the gzip-compressed archive
f: must come last, because the next parameter is the path to the archive file
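
If you would like to see how these options behave before touching the real download, here is a small sketch on a throwaway archive. The file names (demo-1.0, demo.tar.gz) are made up for illustration; the extra options are c (create), t (list contents without extracting) and -C (change directory first).

```shell
# Demo on a throwaway archive (hypothetical names), not the Cassandra download.
workdir=$(mktemp -d)
mkdir -p "$workdir/demo-1.0/bin"
echo "hello" > "$workdir/demo-1.0/bin/tool"

# c: create, z: gzip, f: archive file name; -C switches directory first
tar -czf "$workdir/demo.tar.gz" -C "$workdir" demo-1.0

# t: list the contents without extracting anything
listing=$(tar -tzf "$workdir/demo.tar.gz")
echo "$listing"

# x: extract into a separate directory so the result is easy to inspect
mkdir "$workdir/out"
tar -xzf "$workdir/demo.tar.gz" -C "$workdir/out"
```

Listing with t first is a cheap way to check which top-level folder an archive will create.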

 

Once completed, it will create a folder containing the Cassandra files. To confirm, run the “ls” command:


The listing should show a newly created folder named “apache-cassandra-3.7”.

 

5. (for all nodes) We will move this folder to /usr/local/cassandra, which will become our Cassandra home. You may choose another location as per your needs.

sudo mv apache-cassandra-3.7 /usr/local/cassandra

 

6. (for all nodes) Now we need to set the cassandra home path in our .bashrc file

vi $HOME/.bashrc

Add the below lines to this file

export CASSANDRA_HOME=/usr/local/cassandra
export PATH=$PATH:$CASSANDRA_HOME/bin
export CASSANDRA_HOME
export PATH

To save the file in the vi editor, press ESC, type “:wq”, and press Enter.

Finally, run the command below to apply the settings to the current session:
source ~/.bashrc
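
If you prefer not to edit .bashrc by hand on three machines, the same lines can be appended from the shell. This is only a sketch: append_once is a hypothetical helper name, and it adds a line only when that exact line is not already in the file, so running it twice does no harm.

```shell
# append_once LINE FILE: add LINE to FILE unless the exact line already exists.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage for this guide (uncomment to apply to your own ~/.bashrc):
# append_once 'export CASSANDRA_HOME=/usr/local/cassandra' "$HOME/.bashrc"
# append_once 'export PATH=$PATH:$CASSANDRA_HOME/bin' "$HOME/.bashrc"
```

The single quotes matter here: they keep $PATH and $CASSANDRA_HOME unexpanded, so the variables are resolved each time .bashrc is read.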

 

7. (for all nodes) At this point we can run each node as an individual standalone Cassandra node. However, to run them as one cluster, we need to configure some properties in cassandra.yaml:

sudo vi /usr/local/cassandra/conf/cassandra.yaml


The cluster name defaults to “Test Cluster”; we will leave it as is for the purpose of this demo. Whatever name you use, it must be the same on every node.

 

8. (for all nodes) Search for the property “seed_provider”


and modify the following

- seeds: "127.0.0.1"

to

- seeds: "172.31.46.162"

This makes node1 the only seed in this cluster. Please note that “172.31.46.162” is the private IP address of node1.

This setting is exactly the same on node2 and node3, because they must also use the same seed, which is node1 here.
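
Since the seed change is identical on all three nodes, it can also be scripted instead of edited in vi. Below is a minimal sketch, assuming the seeds entry still has the stock form shown above; set_seeds is a hypothetical helper name, not something shipped with Cassandra.

```shell
# set_seeds FILE SEED_LIST: rewrite the '- seeds: "..."' entry of a cassandra.yaml.
set_seeds() {
    file=$1
    seeds=$2
    # -i.bak edits the file in place and keeps the original as FILE.bak.
    # Only the matched part is replaced, so the YAML indentation is preserved.
    sed -i.bak "s/- seeds: \".*\"/- seeds: \"$seeds\"/" "$file"
}

# Usage for this guide (same command on node1, node2 and node3):
# set_seeds /usr/local/cassandra/conf/cassandra.yaml "172.31.46.162"
```

If the line has been changed so that it no longer matches the pattern, the command leaves the file untouched, so it is worth checking the result with grep afterwards.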

 

9. Set the properties below on each node. Please note that each node will have different values for these properties.

listen_address
broadcast_address
broadcast_rpc_address

All of these need to be set to the private IP address of the respective node.

On Node1
listen_address: 172.31.46.162
broadcast_address: 172.31.46.162
broadcast_rpc_address: 172.31.46.162

On Node2
listen_address: 172.31.46.163
broadcast_address: 172.31.46.163
broadcast_rpc_address: 172.31.46.163


On Node3
listen_address: 172.31.46.161
broadcast_address: 172.31.46.161
broadcast_rpc_address: 172.31.46.161
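
The three per-node settings can be applied with one small script as well. This is a sketch under a few assumptions: the keys appear in cassandra.yaml either as “key: value” or commented out as “# key: value” at the start of a line, and set_yaml_key and configure_node are hypothetical helper names introduced here.

```shell
# set_yaml_key FILE KEY VALUE: set a top-level "key: value" line,
# uncommenting it if it is present as "# key: ...".
set_yaml_key() {
    file=$1
    key=$2
    value=$3
    # -E enables extended regular expressions; -i.bak keeps a backup copy.
    sed -E -i.bak "s|^#? ?${key}:.*|${key}: ${value}|" "$file"
}

# configure_node FILE PRIVATE_IP: apply the three settings from step 9.
configure_node() {
    for k in listen_address broadcast_address broadcast_rpc_address; do
        set_yaml_key "$1" "$k" "$2"
    done
}

# Usage on node2 of this guide:
# configure_node /usr/local/cassandra/conf/cassandra.yaml 172.31.46.163
```

A key that is missing from the file altogether is not added by this sketch, so a quick grep for the three keys afterwards is a good sanity check.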

 

10. Once the settings are completed on all the nodes, we can start Cassandra on each node one by one, but we need to start with the seed nodes.

So we start Cassandra on node1 first, which is our seed.

To start Cassandra, just run the “cassandra” command.

Once it has started, start Cassandra on the other two nodes.
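
To know when a node has actually joined, you can look at “nodetool status”, where each node line starts with a two-letter state and UN means Up/Normal. Here is a rough sketch of a helper that counts such lines; count_up_nodes is a hypothetical name, and the polling loop in the comment assumes the 3-node cluster of this guide.

```shell
# count_up_nodes: read `nodetool status` output on stdin and print how many
# nodes are reported as UN (Up / Normal).
count_up_nodes() {
    # grep -c prints the number of matching lines; it exits non-zero when the
    # count is 0, so "|| true" keeps the function's exit status clean.
    grep -c '^UN ' || true
}

# Usage: wait until all 3 nodes have joined, polling every 5 seconds.
# until [ "$(nodetool status | count_up_nodes)" -eq 3 ]; do sleep 5; done
```

Waiting for the seed to show up as UN before starting the next node is one way to follow the "seed first" order described above.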


Congrats, the cluster is up and running. Hurray!
—————————————————————————————
Now we will do some very basic testing on this cluster.

Check whether the nodes are gossiping

Run the command below on any of the nodes that are not seeds:

nodetool netstats


 

In our run, the output showed that the node had already completed 1125 gossip messages. We can also confirm that gossip is active using the “nodetool info” command:


Now let's do some real testing. First we will create a keyspace with a replication factor of 2, so that each row is stored on 2 of the 3 nodes, and then confirm that the data can be read from the other nodes.

For that, we start the cqlsh prompt by typing cqlsh on node2:


Then create a keyspace:

cqlsh> create keyspace myks WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : '2' };

cqlsh> use myks;


 

 

 

Now let's create a table from another node (node3), then insert data from node1, and finally check the data from node2, where we created the keyspace. On each node, run “use myks;” first so that the unqualified table name resolves to our keyspace.

cqlsh> create table employee(id INT, name VARCHAR, age INT, salary INT, Primary Key(id, name));


From node1, we insert data into the table

cqlsh> insert into employee(id, name, age, salary)
values (1, 'TOM', 25, 10000);

cqlsh> insert into employee(id, name, age, salary)
values (2, 'JIM', 35, 20000);

cqlsh> insert into employee(id, name, age, salary)
values (1, 'JOHN', 45, 100000);


 

From node2, where we created the keyspace, we will check that the data has been inserted:

cqlsh> select * from myks.employee;

However, as our terminal background is coloured, the coloured output of cqlsh is hard to read, so we disable it.

To turn off coloured output, start cqlsh with:

cqlsh --no-color
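
To see the replication factor of 2 at work, one option is “nodetool getendpoints”, which prints the address of each node holding a given partition key, one per line. The sketch below simply counts those lines; replica_count is a hypothetical helper name, and the commented usage assumes the myks.employee table and the id value 1 from the inserts above.

```shell
# replica_count: read `nodetool getendpoints` output (one address per line)
# on stdin and print the number of non-empty lines.
replica_count() {
    # "." matches any line with at least one character; "|| true" keeps the
    # exit status clean when the count is 0.
    grep -c . || true
}

# Usage: how many nodes hold the partition with id = 1?
# nodetool getendpoints myks employee 1 | replica_count

# The same query as above can also be run without the interactive prompt:
# cqlsh --no-color -e "select * from myks.employee;"
```

With the keyspace defined as above, the count for any id should match the replication factor.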


So our Cassandra cluster is doing what it is supposed to do. Hurray! This completes the guide on how to get a Cassandra cluster up and running for free!!!