As if MLers were not busy enough with the Yahoo! KDD Cup, today I found out about another interesting contest.
This time the prize is much higher: up to $1M. See http://overstockreclabprize.com/
Jake Mannix (from Twitter) dug into the details and here is his impression:
It's actually a pretty interesting challenge, once you get past the
restrictions of their API: you're optimizing explicitly for revenue per session, and you take
past sessions as input. These include the kinds of practical things you'd like: each
session belongs to a userId, so repeat customers show up naturally,
products have prices, and there are categoryId labels already.
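
To make the objective concrete, here is a minimal sketch of how revenue per session could be computed from past sessions. Only userId, categoryId and price come from the description above; the class names, the product_id field and the idea that a session holds a list of purchases are my own assumptions, not the contest's actual API or data format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Purchase:
    product_id: int   # hypothetical field, not from the contest description
    category_id: int  # corresponds to the categoryId label
    price: float      # product price


@dataclass
class Session:
    user_id: int  # corresponds to userId; repeat customers share the same id
    purchases: List[Purchase] = field(default_factory=list)


def revenue_per_session(sessions: List[Session]) -> float:
    """Total purchase revenue divided by the number of sessions.

    Sessions without purchases still count in the denominator.
    """
    if not sessions:
        return 0.0
    total = sum(p.price for s in sessions for p in s.purchases)
    return total / len(sessions)


# Assumed toy data: user 1 comes back for a second session and buys nothing.
example_sessions = [
    Session(user_id=1, purchases=[Purchase(10, 3, 20.0)]),
    Session(user_id=1),
    Session(user_id=2, purchases=[Purchase(11, 3, 10.0)]),
]
```

With this toy data the total revenue is 30.0 over three sessions, so the function returns 10.0.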
Because of the whole Netflix data lawsuit, the training data is synthetic, which
puts the contestants at a disadvantage. Another interesting fact is that
runtime performance matters: your code will be run *live*, with your model being
used to produce recommendations under a hard timeout of 50ms. If you
miss this more than 20% of the time, you fail to progress to the end of
the semi-final round.
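
A rough way to check the timeout rule locally is to time each recommendation call and look at the fraction of calls over budget. This is only a sketch: the function names are made up, and I am assuming a call counts as a miss when it takes strictly longer than 50ms and that the 20% is measured over all calls. The contest's real harness may measure this differently.

```python
import time
from typing import Callable, List

TIMEOUT_SECONDS = 0.050   # the 50ms hard timeout
MAX_MISS_FRACTION = 0.20  # missing more often than this means elimination


def time_call(fn: Callable, *args) -> float:
    """Return the wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def miss_fraction(latencies: List[float], timeout: float = TIMEOUT_SECONDS) -> float:
    """Fraction of calls whose latency exceeded the timeout (assumed strict)."""
    if not latencies:
        return 0.0
    misses = sum(1 for t in latencies if t > timeout)
    return misses / len(latencies)


def passes_timeout_rule(latencies: List[float]) -> bool:
    """True if the timeout is missed at most 20% of the time."""
    return miss_fraction(latencies) <= MAX_MISS_FRACTION
```

In practice you would collect the latencies with time_call over a replay of past sessions and run passes_timeout_rule before submitting.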
You're allowed to use open-source Apache licensed code (and are in fact
*required* to license your code according to the ASL to compete), but
their APIs, while extraordinarily similar to Hadoop and Mahout/Taste,
are fixed, so you can't just do a drop-in replacement.