Initial results are encouraging. Mahout's Alternating Least Squares (ALS) implementation by Sebastian Schelter was tested on Amazon EC2, using two m2.2xlarge nodes (13x2 virtual cores).
Running 10 iterations with 20 features and lambda=0.065, Mahout takes 39,272 seconds, while the GraphLab implementation in C++ takes only 714 seconds (on a machine with 8 cores).
The running times should be taken with a grain of salt, since the two runs did not use identical machines, but the magnitude of the difference will certainly hold even if I run GraphLab on EC2 (which I plan to do soon).
Regarding accuracy, Mahout ALS reaches a test RMSE of 0.9310, while GraphLab obtains a slightly better 0.9279.
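For readers unfamiliar with the algorithm being benchmarked, here is a minimal single-machine sketch of ALS with weighted-lambda regularization, the method both systems implement. The parameter values (D=20, lambda=0.065, 10 iterations) mirror the runs above, but this is illustrative NumPy code of my own, not the Mahout or GraphLab implementation, and the matrices it is meant to run on are hypothetical toy data.

```python
import numpy as np

def als(R, mask, D=20, lam=0.065, iters=10, seed=0):
    """Factor R ~= U @ V.T by alternating ridge regressions.
    mask[i, j] is True where rating R[i, j] is observed.
    (Illustrative sketch; not the Mahout or GraphLab code.)"""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.standard_normal((n_users, D)) * 0.1
    V = rng.standard_normal((n_items, D)) * 0.1
    I = np.eye(D)
    for _ in range(iters):
        # Fix V; each user's factors solve a small D x D ridge system,
        # with lambda weighted by the user's number of ratings (ALS-WR).
        for u in range(n_users):
            idx = mask[u]
            Vu = V[idx]
            U[u] = np.linalg.solve(Vu.T @ Vu + lam * idx.sum() * I,
                                   Vu.T @ R[u, idx])
        # Fix U; symmetric update for each item's factors.
        for v in range(n_items):
            idx = mask[:, v]
            Uv = U[idx]
            V[v] = np.linalg.solve(Uv.T @ Uv + lam * idx.sum() * I,
                                   Uv.T @ R[idx, v])
    return U, V

def rmse(R, mask, U, V):
    """Root mean squared error over the observed entries."""
    err = (U @ V.T - R)[mask]
    return np.sqrt(np.mean(err ** 2))
```

The key point for the timings above is that each user (and item) update is an independent small linear solve, which is what makes the algorithm easy to parallelize in both Mahout (map-reduce) and GraphLab (shared-memory/graph-parallel).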
Here is the final output of Mahout ALS's RMSE computation:
ubuntu@ip-10-115-27-222:/mnt$ /usr/local/mahout-0.4/bin/mahout evaluateALS --probes /user/ubuntu/myout/probeSet/ --userFeatures /tmp/als/out/U/ --itemFeatures /tmp/als/out/M/ | grep RMSE
11/02/17 12:31:42 WARN driver.MahoutDriver: No evaluateALS.props found on classpath, will use command-line arguments only
11/02/17 12:31:42 INFO common.AbstractJob: Command line arguments: {--endPhase=2147483647, --itemFeatures=/tmp/als/out/M/, --probes=/user/ubuntu/myout/probeSet/, --startPhase=0, --tempDir=temp, --userFeatures=/tmp/als/out/U/}
RMSE: 0.9310729597725026, MAE: 0.7298745910296568
11/02/17 12:31:55 INFO driver.MahoutDriver: Program took 12437 ms
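For clarity, the two metrics Mahout reports above (RMSE and MAE) are just the root mean squared error and mean absolute error over the held-out probe ratings. A tiny sketch, with a made-up example rather than Mahout's actual evaluation code:

```python
import numpy as np

def rmse_mae(actual, predicted):
    """Return (RMSE, MAE) over a held-out set of ratings.
    (Illustrative only; not Mahout's evaluateALS implementation.)"""
    err = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return np.sqrt(np.mean(err ** 2)), np.mean(np.abs(err))

# Hypothetical probe ratings and model predictions:
print(rmse_mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

RMSE penalizes large errors more heavily than MAE, which is why the two numbers in the Mahout log differ.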
Here is the GraphLab output:
bickson@biggerbro:~/newgraphlab/graphlabapi/debug/apps/pmf$ ./PMF netflix-r 10 0 --D=20 --max_iter=10 --lambda=0.065 --ncpus=8
setting run mode 0
INFO :pmf.cpp(main:1121): PMF starting
loading data file netflix-r
Loading netflix-r train
Creating 99072112 edges...
loading data file netflix-re
Loading netflix-re test
Creating 1408395 edges...
setting regularization weight to 0.065
PTF_ALS for matrix (480189, 17770, 27):99072112. D=20
pU=0.065, pV=0.065, pT=1, muT=1, D=20
nuAlpha=1, Walpha=1, mu=0, muT=1, nu=20, beta=1, W=1, WT=1 BURN_IN=10
complete. Obj=6.83664e+08, TEST RMSE=3.7946.
INFO :asynchronous_engine.hpp(run:56): Worker 0 started.
...
INFO :asynchronous_engine.hpp(run:56): Worker 7 started.
Entering last iter with 1 228.524) Iter ALS 1 Obj=2.60675e+08, TRAIN RMSE=2.2904 TEST RMSE=0.9948.
Entering last iter with 2 289.594) Iter ALS 2 Obj=6.48921e+07, TRAIN RMSE=1.1400 TEST RMSE=0.9573.
Entering last iter with 3 350.487) Iter ALS 3 Obj=4.75073e+07, TRAIN RMSE=0.9754 TEST RMSE=0.9444.
Entering last iter with 4 411.551) Iter ALS 4 Obj=4.09914e+07, TRAIN RMSE=0.9063 TEST RMSE=0.9381.
Entering last iter with 5 472.615) Iter ALS 5 Obj=3.79096e+07, TRAIN RMSE=0.8718 TEST RMSE=0.9348.
Entering last iter with 6 533.039) Iter ALS 6 Obj=3.61298e+07, TRAIN RMSE=0.8513 TEST RMSE=0.9324.
Entering last iter with 7 594.177) Iter ALS 7 Obj=3.50076e+07, TRAIN RMSE=0.8382 TEST RMSE=0.9305.
Entering last iter with 8 654.41) Iter ALS 8 Obj=3.42655e+07, TRAIN RMSE=0.8294 TEST RMSE=0.9290.
Entering last iter with 9 714.095) Iter ALS 9 Obj=3.37535e+07, TRAIN RMSE=0.8234 TEST RMSE=0.9279.
INFO :asynchronous_engine.hpp(run:66): Worker 6 finished.
...
INFO :asynchronous_engine.hpp(run:66): Worker 2 finished.
A question I got from Alexandre Rodriguez (FEUP):
Being distributed, .. , I'm considering using GraphLab to write an ALSWR factorizer (distributed) and some other functions (I must plan how I should use the GraphLab paradigm to do so).
Can you tell me if there's any kind of SVD implementation or similar approaches using GraphLab?
My answer:
Of course! There is a GraphLab implementation of the exact ALSWR algorithm.
Documentation is found here: http://www.graphlab.ml.cmu.edu/pmf.html
Installation instructions for Linux: http://bickson.blogspot.com/2011/02/graphlab-large-scale-machine-learning.html
Installation instructions for MAC OS: http://bickson.blogspot.com/2011/02/graphlab-large-scale-machine-learning_28.html
You will need to install itpp/lapack. Installation instructions are found here:
http://bickson.blogspot.com/2011/02/installing-blaslapackitpp-on-amaon-ec2.html
This post discusses the performance of GraphLab compared to Mahout:
http://bickson.blogspot.com/2011/02/large-scale-matrix-factorization-using.html
If anyone is trying this out, let me know if you need any assistance with installation and setup. I am quite excited, since the code was just released and I am already in touch with several people who are trying it. Any feedback is appreciated.