Preliminaries:
- Mpi should be installed
Step 0: Install GraphLab on one of your cluster nodes.
Using the instructions here on your master node (one of your cluster machines)
Step 1: start MPI
a) Create a file called .mpd.conf in your home directory (only once)This file should contain a secret password using the format:
secretword=some_password_as_you_wish
b) Verify that MPI is working by running the daemon (only once)
$ mpd & # run the MPI daemon
$ mpdtrace # lists all active MPI nodes
$ mpdallexit # kill MPI
c) Spawn the cluster. Create a file named machines with the list of machines you like to deploy.
Run:
$ mpd -f machines -n XX # where XX is the number of MPI nodes you like to spawn.
d) Copy GraphLab files to all machines.
On the node you installed GraphLab on, run the following commands to copy GraphLab files to the rest of the machines:
d1) Verify you have the machines files from section 1c) in your root folder.
d2) Copy the GraphLab files
cd ~/graphlabapi/release/toolkits; ~/graphlabapi/scripts/mpirsync
cd ~/graphlabapi/deps/local; ~/graphlabapi/scripts/mpirsync
Step 2: Run GraphLab ALS
This step runs ALS (alternating least squares) in a cluster using small netflix susbset.
It first downloads the data from the web: http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train and http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate, and runs 5 alternating least squares iterations. After the run is completed, you can login into any of the nodes and view the output files in the folder ~/graphlabapi/release/toolkits/collaborative_filtering/
It first downloads the data from the web: http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train and http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate, and runs 5 alternating least squares iterations. After the run is completed, you can login into any of the nodes and view the output files in the folder ~/graphlabapi/release/toolkits/collaborative_filtering/
mkdir smallnetflix
cd smallnetflix/
wget http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train
wget http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate
Now run GraphLab:
mpiexec -n 2 /path/to/als --graph /some/ns/folder/ --max_iter=3 --ncpus=1
Where -n is the number of MPI nodes, and --ncpus is the number of deployed cores on each MPI node.
Note: this section assumes you have a network storage (ns) folder where the input can be stored.
Alternatively, you can split the input into several disjoint files, and store the subsets on the cluster machines.
Note: Don't forget to change /path/to/als and /some/ns/folder to your actual folder path!
No comments:
Post a Comment