Large Scale Machine Learning and Other Animals: Deploying a GraphLab Cluster using MPI

Wednesday, October 24, 2012

Deploying a GraphLab Cluster using MPI

Note: the MPI section of this tutorial is based on this excellent tutorial.

Preliminaries:

MPI should be installed on all cluster machines. The commands below use mpd, the process manager that ships with MPICH2.
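Before going further, it can help to confirm that the MPI commands used in this tutorial are on your PATH. The following is a minimal sketch: the helper name missing_tools is made up for this illustration, and the tool list simply mirrors the commands that appear in the steps below.

```shell
# Hypothetical helper (not part of MPI or GraphLab): print every command
# from the argument list that cannot be found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# The MPI commands this tutorial relies on.
missing=$(missing_tools mpd mpdtrace mpdallexit mpiexec)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All MPI commands found."
fi
```

Run it on each machine you plan to use. An empty list means the commands were found, not that MPI is configured correctly.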

Step 0: Install GraphLab on one of your cluster nodes.

Follow the instructions here on your master node (one of your cluster machines).

Step 1: Start MPI

a) Create a file called .mpd.conf in your home directory (only once)
This file should contain a secret password using the format:
secretword=some_password_as_you_wish
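The file from step a) can also be created from the command line. Here is a minimal sketch, assuming a POSIX shell. The helper name write_mpd_conf is hypothetical. The chmod 600 is there because, as far as I know, mpd refuses to start when .mpd.conf is accessible by other users.

```shell
# Hypothetical helper: write an mpd password file at the given path.
# $1 = path of the file, $2 = secret word (pick your own).
write_mpd_conf() {
  conf_path=$1
  secret=$2
  # Create the file with owner-only permissions before writing the secret.
  ( umask 077 && printf 'secretword=%s\n' "$secret" > "$conf_path" )
  # Also tighten permissions in case the file already existed.
  chmod 600 "$conf_path"
}
```

A typical call would be write_mpd_conf "$HOME/.mpd.conf" some_password_as_you_wish, with the password replaced by your own.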

b) Verify that MPI is working by running the daemon (only once)
$ mpd & # run the MPI daemon
$ mpdtrace # lists all active MPI nodes
$ mpdallexit # kill MPI

c) Spawn the cluster. Create a file named machines listing the machines you would like to deploy on.
Run:
$ mpdboot -f machines -n XX # where XX is the number of MPI nodes you would like to spawn.
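If you want XX in step c) to match the number of machines listed, you can count them instead of typing the number by hand. This is a sketch that assumes the machines file holds one hostname per line. The helper name count_hosts and the hostnames node1 to node3 are made up for the example.

```shell
# Hypothetical helper: count the hosts in a machines file,
# assuming one hostname per line; blank lines are ignored.
count_hosts() {
  grep -c -v '^[[:space:]]*$' "$1"
}

# Example with made-up hostnames, written to a temporary file.
machines_file=$(mktemp)
printf 'node1\nnode2\n\nnode3\n' > "$machines_file"
echo "XX = $(count_hosts "$machines_file")"
rm -f "$machines_file"
```

With your real file you would call count_hosts machines and pass the result as the -n value.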

d) Copy GraphLab files to all machines.
On the node you installed GraphLab on, run the following commands to copy GraphLab files to the rest of the machines:

d1) Verify you have the machines file from step 1c) in your home folder.

d2) Copy the GraphLab files
cd ~/graphlabapi/release/toolkits; ~/graphlabapi/scripts/mpirsync
cd ~/graphlabapi/deps/local; ~/graphlabapi/scripts/mpirsync
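To see what a manual copy of the same folders could look like, here is a dry-run sketch that only prints rsync commands, one per host. It is not necessarily what mpirsync does internally. It assumes rsync and passwordless ssh are available, that the machines file has one hostname per line, and that the same path is valid on every machine. The helper name print_rsync_commands is hypothetical.

```shell
# Hypothetical helper: print (do not run) one rsync command per host in the
# machines file, mirroring a local directory to the same path on each host.
# $1 = machines file (one hostname per line), $2 = directory to copy.
print_rsync_commands() {
  while IFS= read -r host || [ -n "$host" ]; do
    [ -n "$host" ] || continue        # skip blank lines
    echo "rsync -az $2/ $host:$2/"
  done < "$1"
}
```

For example, print_rsync_commands ~/machines ~/graphlabapi/release/toolkits lists the commands so you can review them before running anything.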

Step 2: Run GraphLab ALS

This step runs ALS (alternating least squares) on a cluster using a small Netflix subset.
It first downloads the data from the web: http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train and http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate, and then runs alternating least squares for the number of iterations set by --max_iter (3 in the command below). After the run completes, you can log in to any of the nodes and view the output files in the folder ~/graphlabapi/release/toolkits/collaborative_filtering/

The algorithm is explained in detail here.

cd /some/ns/folder/
mkdir smallnetflix
cd smallnetflix/
wget http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train
wget http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate
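After downloading, a quick sanity check can catch a truncated or failed download. The sketch below assumes the _mm suffix means the files are in Matrix Market format, whose first line starts with the banner %%MatrixMarket. The helper name has_mm_banner is hypothetical.

```shell
# Hypothetical helper: succeed if the file's first line starts with the
# Matrix Market banner "%%MatrixMarket".
has_mm_banner() {
  head -n 1 "$1" | grep -q '^%%MatrixMarket'
}

# Check both downloaded files in the current folder.
for f in smallnetflix_mm.train smallnetflix_mm.validate; do
  if [ -f "$f" ] && has_mm_banner "$f"; then
    echo "$f: looks like a Matrix Market file"
  else
    echo "$f: missing or no Matrix Market banner"
  fi
done
```

Run it from inside the smallnetflix folder. It only inspects the first line, so it does not validate the rest of the file.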

Now run GraphLab:

mpiexec -n 2 /path/to/als --graph /some/ns/folder/smallnetflix/ --max_iter=3 --ncpus=1

Here -n is the number of MPI nodes, and --ncpus is the number of cores used on each MPI node.

Note: this section assumes you have a network storage (ns) folder where the input can be stored.
Alternatively, you can split the input into several disjoint files, and store the subsets on the cluster machines.

Note: Don't forget to change /path/to/als and /some/ns/folder to your actual folder path!
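Since the placeholder paths are easy to forget, a small pre-flight check before launching can save a failed run. This is a sketch only. The function preflight is hypothetical and not part of GraphLab. It checks that the als binary is an executable file and that the input folder exists and is not empty.

```shell
# Hypothetical pre-flight check, not part of GraphLab.
# $1 = path to the als binary, $2 = input folder passed to --graph.
# Prints one line per problem found and returns non-zero if there is any.
preflight() {
  status=0
  if [ ! -f "$1" ] || [ ! -x "$1" ]; then
    echo "not an executable file: $1"
    status=1
  fi
  if [ ! -d "$2" ]; then
    echo "not a folder: $2"
    status=1
  elif [ -z "$(ls -A "$2")" ]; then
    echo "folder is empty: $2"
    status=1
  fi
  return $status
}
```

Call it with the same binary path and input folder you pass to mpiexec, and launch the job only if it returns successfully.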
