These instructions explain how to install DataStax Enterprise on a set of Digital Ocean droplets. DataStax provides instructions for Installing on cloud providers, but currently only Amazon EC2 and HP Cloud are described specifically.
The steps below can be used for Digital Ocean, or more generally for any other cloud provider. We’ll create a set of Ubuntu droplets and install DataStax Enterprise (DSE) on them to create a Cassandra cluster.
Update: Scroll to the bottom for a video demo of these install steps.
These are the relevant DataStax documentation pages if you want to learn more details behind each step:
- Installing DataStax Enterprise using Deb repositories
- Single data center deployment
- Multiple data center deployment
Prerequisites
- Register for DataStax Enterprise (free, allows use of DataStax Enterprise in dev/test environments)
- An active Digital Ocean account (referral link if you don’t have an account yet)
Creating Digital Ocean Droplets
- Log in to Digital Ocean and navigate to the control panel
- On your local system create an SSH key and store it in the Digital Ocean control panel (help)
- On the control panel click Create with settings like:
  - Hostname: node0
  - Size: 4 GB / 2 CPU
  - Region: default
  - Image: Linux Ubuntu 14.04 64-bit
  - SSH key: Select the one created previously
  - Settings: default
- Repeat for node1, node2, etc. (as many nodes as desired)
- As the nodes come up, make a note of their IP addresses
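The SSH key step above can be done from a terminal. This is a minimal sketch, assuming OpenSSH's ssh-keygen is installed on your local system. The function name make_droplet_key, the default key path, and the key comment are hypothetical, and the empty passphrase (-N "") is only there to keep the sketch non-interactive; drop that option to be prompted for a passphrase instead.

```shell
# Sketch: create an RSA key pair for the droplets and print the public half.
# make_droplet_key and the default path are illustrative names, not from the original post.
make_droplet_key() {
  keyfile=${1:-"$HOME/.ssh/id_rsa_digitalocean"}

  # Refuse to overwrite an existing key.
  if [ -e "$keyfile" ]; then
    echo "refusing to overwrite $keyfile" >&2
    return 1
  fi

  # Make sure the target directory exists before ssh-keygen writes to it.
  mkdir -p "$(dirname "$keyfile")" || return 1

  # -N "" sets an empty passphrase (assumption, for a non-interactive demo only).
  ssh-keygen -q -t rsa -b 4096 -N "" -C "digital-ocean-demo" -f "$keyfile" || return 1

  # Paste this output into the SSH key form in the control panel.
  cat "$keyfile.pub"
}
```

Calling make_droplet_key with no arguments writes the key pair under ~/.ssh and prints the public key, which is the part to store in the control panel.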
Installing DataStax Enterprise
For a faster install, see Parallel Installs below.
- SSH into the first node
- Confirm whether Java is already installed (it may be, depending on the Linux image); if not, install either OpenJDK Java or Oracle Java
- Add the DataStax repository using the username and password from your registration:
  echo "deb http://username:password@debian.datastax.com/enterprise stable main" | sudo tee -a /etc/apt/sources.list.d/datastax.sources.list
- Add the DataStax repository key:
  curl -L --silent https://debian.datastax.com/debian/repo_key | sudo apt-key add -
- Update the local package cache:
  sudo apt-get update
  (if you see any “403 Not Authorized” errors here, stop and make sure your username and password are correct)
- Install DataStax Enterprise:
  sudo apt-get install dse-full
- Edit /etc/hosts and add an entry for the host with its public IP address (replacing the 127.0.1.1 entry if it exists)
- Edit /etc/dse/cassandra/cassandra.yaml and change the following settings:
  - Set cluster_name as desired
  - In the seeds field, list the IP address of node0 (the first server will be the seed for the cluster)
  - Set listen_address to blank
  - Set num_tokens to 256
- Start the DSE service:
  sudo service dse start
- Repeat the above steps for the remaining nodes (node1, node2, etc.)
- SSH to any of the nodes and check the status of the DSE cluster:
  nodetool status
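If you would rather not edit cassandra.yaml by hand on every node, the four setting changes listed above could be scripted. The following is a rough sketch, not part of the DataStax procedure. The function name configure_cassandra_yaml is hypothetical. It assumes the file contains top-level cluster_name and listen_address lines, a "- seeds:" line, and a num_tokens line that may be commented out. It also assumes the cluster name contains no "/" or "&" characters, which would confuse sed.

```shell
# Sketch: apply the cluster_name, seeds, listen_address and num_tokens edits with sed.
# configure_cassandra_yaml is an illustrative name, not a DataStax tool.
configure_cassandra_yaml() {
  yaml=$1          # e.g. /etc/dse/cassandra/cassandra.yaml
  cluster_name=$2  # desired cluster name
  seed_ip=$3       # IP address of node0, the seed node

  # Keep a backup copy, then rewrite the original file from it.
  cp "$yaml" "$yaml.bak" || return 1

  # 1. replace the cluster name
  # 2. point the seeds entry at node0, keeping its indentation
  # 3. blank out listen_address
  # 4. set num_tokens to 256, uncommenting the line if needed
  sed \
    -e "s/^cluster_name:.*/cluster_name: '$cluster_name'/" \
    -e "s/^\([[:space:]]*- seeds:\).*/\1 \"$seed_ip\"/" \
    -e "s/^listen_address:.*/listen_address:/" \
    -e "s/^#\{0,1\}[[:space:]]*num_tokens:.*/num_tokens: 256/" \
    "$yaml.bak" > "$yaml"
}

# Example call (run as root on each node; the values are placeholders):
# configure_cassandra_yaml /etc/dse/cassandra/cassandra.yaml digdemo 54.176.126.209
```

It is worth comparing the edited file with the .bak copy before starting the DSE service, since the patterns are guesses about the file layout.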
Parallel Installs
You can run the above install commands in parallel for a much faster setup. On a Mac, I use i2cssh, which drives several iTerm2 consoles in parallel.
This technique is borrowed from Jake Luciani’s video How to set up a 4 node Cassandra cluster in under 2 minutes.
Steps:
- Install i2cssh and iTerm2
- Create a file ~/.i2csshrc with the server IP addresses. For example, this file defines 3 servers in a cluster named ‘digdemo’:
  version: 2
  iterm2: false
  clusters:
    digdemo:
      login: root
      hosts:
        - 54.176.126.209
        - 54.176.91.139
        - 50.18.136.76
- Launch parallel terminal sessions:
  i2cssh -c digdemo
- Enable broadcast mode in iTerm2 with Cmd-Opt-I
- Type commands from the install procedure above; they should be echoed on all sessions in parallel
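If you rebuild clusters often, the ~/.i2csshrc file described above can be generated from a list of IP addresses. This is a small sketch under that assumption. The function name write_i2csshrc is hypothetical, and the keys it writes (version, iterm2, clusters, login, hosts) are simply copied from the example file in this post.

```shell
# Sketch: write an i2cssh config for one cluster from a list of node IPs.
# write_i2csshrc is an illustrative name; the YAML keys mirror the example in this post.
write_i2csshrc() {
  outfile=$1   # e.g. "$HOME/.i2csshrc" (any existing file is overwritten)
  cluster=$2   # cluster name passed later to: i2cssh -c <cluster>
  shift 2      # the remaining arguments are the node IP addresses

  {
    printf 'version: 2\n'
    printf 'iterm2: false\n'
    printf 'clusters:\n'
    printf '  %s:\n' "$cluster"
    printf '    login: root\n'
    printf '    hosts:\n'
    for ip in "$@"; do
      printf '      - %s\n' "$ip"
    done
  } > "$outfile"
}

# Example call with the addresses used in this post:
# write_i2csshrc "$HOME/.i2csshrc" digdemo 54.176.126.209 54.176.91.139 50.18.136.76
```

Because the function overwrites the target file, write to a scratch path first if you already keep other clusters in ~/.i2csshrc.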
Video Demo
Notes
- The cluster created above should not be used for production because we have not set up any security.
- To use Red Hat or other YUM-based environments, you can follow the same steps above using Installing DataStax Enterprise using Yum repositories as a guide.