Monday, July 22, 2013

2nd GraphLab Workshop Talks Now Online!

We have just released most of the workshop talks. If you missed the workshop, you are encouraged to catch up! Here is the GraphLab Keynote talk by Prof. Carlos Guestrin, CEO of GraphLab & Prof. at University of Washington:


The rest of the talk videos are here.

Patrick Durusau sent me a link to his blog post where he linked all speakers and presenters to their DBLP publications. Thanks Patrick!

Benchmarking Machine Learning Frameworks

I am often contacted by researchers (from both universities and companies) who are trying to benchmark and compare different machine learning frameworks. I try to introduce them to each other, since basically everyone is compiling the same benchmark tests, and it would be a good idea to create uniform measures and practices for comparing systems. One such example is the Intel Labs report I wrote about on this blog a couple of months ago.

A few days ago my collaborator Aapo Kyrola sent me a related paper: Li, Kevin; Gibson, Charles; Ho, David; Zhou, Qi; Kim, Jason; Buhisi, Omar; Brown, Donald E.; Gerber, Matthew, "Assessment of machine learning algorithms in cloud computing frameworks", Systems and Information Engineering Design Symposium (SIEDS), 2013 IEEE, pp. 98-103, 26 April 2013. IEEExplore

The paper runs comparison tests on Amazon EC2, using the same hardware and similar algorithms and datasets. Here is the bottom line:
GraphLab is significantly faster on the two tasks compared: collaborative filtering (ALS) and text analysis (LDA). The paper claims that Mahout is slightly more accurate.

Hopefully the experimental setup in the paper is described in enough detail for others to reproduce the results.

Sunday, July 7, 2013

Large Scale Recommender Systems Workshop - LSRS 2013 - Call for papers


As part of RecSys 2013, we (Tao Ye from Pandora & Quan Yuan from Taobao) are organizing a large-scale recommender systems workshop.

Anyone working in this domain is encouraged to submit an abstract describing their work.

The submission deadline, originally July 21, 2013, has been extended to July 25, 2013.

We have two confirmed keynotes:

  • Aapo Kyrola, CMU will talk about GraphChi, an out-of-core graph computation framework
  • Justin Basilico, Netflix will talk about collaborative filtering @ Netflix




Saturday, July 6, 2013

Still a chance to win a Google Chromebook, iPad and Kindle!!

Only two days left to fill out our online survey.

We look forward to hearing more about what you are working on, which tools you are using, and what data sizes you handle.

We will announce the winners in a couple of days, both by emailing them directly and on this blog.


Additionally, we have four local GraphLab user meetup groups:

San Francisco: http://www.meetup.com/San-Francisco-GraphLab-Users/
Atlanta: http://www.meetup.com/Atlanta-GraphLAB-Users-Group/
Seattle: http://www.meetup.com/Seattle-GraphLab-Users/

New York: http://www.meetup.com/New-York-GraphLab-Users/

We plan to hold occasional meetups with tutorials, demos, and new feature releases of GraphLab.

Everyone is welcome to join!

GraphLab Workshop Reflection

A nice Gigaom article.

My collaborator Chris DuBois sent me the following blog post about the GraphLab workshop, written by the OpenDNS security research team.

Here are some highlights:
A few hundred researchers from academia and industry gathered on Monday, July 1 for the 2nd annual Graphlab Workshop at the Nikko hotel in downtown San Francisco. The event was a great success in acting as a venue to discuss challenges and opportunities the emerging large scale graph analytics community currently faces.
The first talk was about a product we have been excited about for quite some time called GraphLab.
....

Umbrella Security Labs tried GraphLab 2 a couple months ago on our research 10-node cluster, and were impressed by the results. Algorithms running at high speed allowed us to quickly build new models and check their output on a complete data set.
Furthermore, a solid set of algorithms have already been implemented on top of this incredibly fast engine. They address a wide range of problems, from the domains of graph mining, to machine learning and linear algebra. 

Read more.. 

Tuesday, July 2, 2013

2nd GraphLab workshop is over!

  • Thanks to the 570 data scientists who joined us, to the great speakers, and of course to our sponsors. We hope to post the talk videos online soon.



    Looking forward to next year's workshop!

    And here are some tweets about the workshop:

    "GraphLab is a graph-based, high performance, distributed computation framework written in C++."
  • Today in San Francisco: 'Graphlab "Big Learning" Workshop' @ 7:00 AM
  • It was mostly about systems such as new version of graphlab, graph builder, naiad, grappa, etc. Not many papers tho
  • Who best addresses Graph isomorphism. or , discussed this riddle with .
  • RT : I just posted my slides about Naiad from the workshop
  • It just occurred to me that wants to be another -- it asks questions we answered in Seattle in mid-2000s...
  • the evolution of technology platforms posibility -> scalability -> usability - Carlos Guestrin
  • Thanks to all that attended the 2nd Annual GraphLab Workshop! What a great success!!
  • My favorite talk: Pankaj's on Cassovary. We use and can pick it up tmrw! Just stick it all into RAM.
  • graphlab-code/graphlab .. Distributed graph computation framework .. via
  • I just posted my slides about Naiad from the workshop via
  • Walmart Labs recommendation beyond similarity. You do NOT want to recommend a similar TV right after a user purchased TYVM!
  • Great to see former twitter intern presenting right after me (on some of the work inspired by his twitter internship)
  • Predicting drug-target interactions using Latent Factor models by Murat Can Conbanoglu, Balestra, CMU 45% hit
  • Interesting, 40% as many unfollows as follows daily
  • listening to at discussing the who to follow system at twitter 400 million tweets/day
  • Naiad by . IMHO MSFT research has terrific NoSQL technologies. Shame Redmond bigwigs keep pushing SQLServer for revenue.
  • Graph overload for me. But still Hash to min algo for graphs in is the best.
  • Beyond Pregel - Disclaimer: Focus on algorithmic methods. By looking for large-scale over lapping clusters using MR and ASYMP
  • RT : Heheeheh Facebook presenter - cheeky! Love it. workshop
  • workshop talk by Dr. Avery Ching, Facebook – Graph Processing at Facebook Scale.rocking it Trillion graph edges!
  • Time for graph talk by in workshop
  • Ted Willke: is fast but getting appropriate data into shape for it is hard/slow. Nodding.
  • Any / BDAS people at today? Would love to chat about an integrated solution for ETL, exploratory, OLTP, graph processing
  • at Workshop ( ) today, stop by and talk to us during the poster session
  • Intel GraphBuilder talk, committed to open source
  • My stochastic eyeballs regressed 500 heads of hair trend black. Blonds, redheads, gray beards are outliers. Lunch!
  • Support for interactive data analysis in version 3 is an exciting direction for the startup to take. Can't wait to try it myself.
  • RT : *mind melted* Stoachastic approximation - holy smokes
  • workshop. Never occurred to me that ML can be domain blind and data schematic ignorant.
  • Python journal for : can't wait to play with it.
  • Dr. and Dan LaRocque are at 2013. Stop by to talk about the Titan/Faunus
  • 570 tickets sold! The 2nd Annual GraphLab Workshop is a great success!!
  • Lot of graph engine frameworks seem 2b taking route of parallel iterators n complex but transparent data structure
  • GL v1 Possibility | GL v2 Scalability | GL v3 Usability. opening the workshop to 500 Graphistas
  • Are you at the GraphLab workshop? I'd be glad to chat with you after this morning's talks.
  • Met buncha people from smart startups at GraphLab. Not a single person from stupid ones!
  • 'The number of people predicting the demise of Moore's law doubles every year.'
  • Be sure to check out our poster presentation today at in San Francisco! Click link for event info:
  • Graphlab 3 looks like it Is going to be awesome...
  • Python console for 3 to play around. So cool!
  • Scalability + Usability: GraphLab 2.2 is out Loving the live demo - python console, streaming graphs, & more
  • accessed using IPython notebook. Nearly seamless transition between distributed computing and numpy/matplotlib.
  • new site has an interface to spin up a EC2 cluster and access it via ipython notebook. getting a live demo now at the workshop :)
  • Graphlab 2.2 makes it extremely easy to get started using web interface. No installation needed! Will make it fun to play around
  • GraphLab 2.2 will be released today. Including a new awesome programming model (WarpGraph).
  • Not at , but hello anyway :) Hope you are well!
  • Looking forward to talking at workshop this afternoon. If you are there, please say hello.
  • I am giving short GraphChi talk at the great workshop today at 4pm.
  • Looking forward to talking about Naiad at the workshop this afternoon!
  • speaking about Adaptive User Segmentation for Recommendation at the Workshop () in Mon 7/1!
  • Time came for workshop in SF
  • At the workshop. Interesting mix of people here
  • Will be late for workshop for sure, no BART, :(
  • Beautiful day in SF! Checking into .
  • We're giving away GraphLab tee shirts away at registration.
  • I just checked in our first event attendee. The 2nd Annual GraphLab Workshop begins!! !