Comments on Large Scale Machine Learning and Other Animals: KDD CUP 2012 Track 1 using GraphLab
Blog author: Danny Bickson. 9 comments, newest first.

Ali Mashhoori (2012-05-18 15:01 -07:00):
Thank you both very much for the response!

Danny Bickson (2012-05-18 05:58 -07:00):
Also, please ask your questions here: https://groups.google.com/group/graphlab-kdd
Thanks!!

Danny Bickson (2012-05-18 05:37 -07:00):
Hi,
Can you be more specific: which of the programs mentioned above gives you the error?

Philips (2012-05-18 04:28 -07:00):
Hi Danny,

I tried to follow this blog post, but I keep getting an out-of-memory error. I am running on 12GB RAM.
How much memory is needed to run this script?

Thank you

Philips

Danny Bickson (2012-05-17 22:52 -07:00):
Here is the answer I got from Michael Jahrer from the commendo team to your question:

Wow, 0.38234 is really good for one model.
I assume that this is more than plain SVD.
Yes, the user overlap in train and test is very bad, so you have to use other user information sources in order to solve the issue.

My best factor model (SVD) has a 0.38591 leaderboard score, but this integrates additional parts: asymmetric info, user sns, user action, user keywords, user tags, user genre and user birthYear info. It has 20 features; more seems to hurt. And I trained it to minimize the rank (like in our 2011 KDD papers); ranking improves the MAP by approx. 0.005. By using all these parts the train/test user coverage is much better, hence the better score. Plain SVD gives me about 0.346 on the leaderboard. An ASVD gives me 0.352, and ASVD + the user sns part gives me a 0.371 leaderboard score.

Thanks Michael!

Ali Mashhoori (2012-05-17 06:42 -07:00):
Hi. I'm using matrix factorization too (I have not tried GraphLab yet). My current MAP is 0.38234.
I have a question, and it would be very kind of you to give your opinion on it.
As ehtsham mentioned, users in the test file who have no data in the training file (81% of users in the test file!) are a big problem for MF models.
But this is not the only problem. The most prevalent items in the test file are extremely rare in the training period.
Do you think there is a way to alleviate this problem?

Danny Bickson (2012-05-15 00:36 -07:00):
Hi!
I answered a week ago but somehow the answer did not appear.. :-(
Anyway, I suggest using bias-SGD, where the bias of the item is used when a user is missing from the training data. In ALS we don't have a good answer on how to predict cold-start users.
Maybe try to predict the average for that item.

ehtsham (2012-05-07 22:40 -07:00):
I meant specifically using the alternating least squares based approach, which, as I understand it, requires for prediction that you take the dot product of the corresponding user and item vectors that you learned in the training phase.

ehtsham (2012-05-07 22:27 -07:00):
Hi Danny, nice article. I have a question: how were you able to generate a prediction for each user-item pair in the test file? A lot of the users/items in the test file do not appear in the training file.
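
The cold-start fallback suggested in this thread (with bias-SGD, use the item's bias when the user is missing from the training data) can be sketched roughly as follows. This is only an illustration in plain Python, not GraphLab's code or API: the function name `predict_score`, the dictionary layout, and the choice to add a global mean are all assumptions made for this example. For a known user-item pair it uses the dot product of the learned user and item vectors plus the biases, as ehtsham describes; otherwise it falls back to whatever biases are available.

```python
from typing import Dict, Hashable, List


def predict_score(
    user: Hashable,
    item: Hashable,
    global_mean: float,
    user_bias: Dict[Hashable, float],
    item_bias: Dict[Hashable, float],
    user_factors: Dict[Hashable, List[float]],
    item_factors: Dict[Hashable, List[float]],
) -> float:
    """Biased matrix-factorization prediction with a cold-start fallback.

    Hypothetical helper for illustration only (assumed names and layout).
    """
    # Start from the global mean (an assumption of this sketch).
    score = global_mean

    # The item bias is usable even when the user was never seen in training.
    score += item_bias.get(item, 0.0)

    # The user bias and the factor dot product need a trained user.
    if user in user_factors:
        score += user_bias.get(user, 0.0)
        if item in item_factors:
            score += sum(
                p * q for p, q in zip(user_factors[user], item_factors[item])
            )
    return score


# Toy parameters, invented for the example (not KDD Cup data).
toy_user_bias = {"u1": 0.05}
toy_item_bias = {"i1": 0.2}
toy_user_factors = {"u1": [1.0, 2.0]}
toy_item_factors = {"i1": [0.5, 0.25]}

warm = predict_score("u1", "i1", 0.1, toy_user_bias, toy_item_bias,
                     toy_user_factors, toy_item_factors)
cold_user = predict_score("u_new", "i1", 0.1, toy_user_bias, toy_item_bias,
                          toy_user_factors, toy_item_factors)
cold_both = predict_score("u_new", "i_new", 0.1, toy_user_bias, toy_item_bias,
                          toy_user_factors, toy_item_factors)
```

Here `warm` uses the full model, `cold_user` reduces to the global mean plus the item bias, and `cold_both` reduces to the global mean alone. A pure ALS model without bias terms has no such fallback, which matches the remark above that ALS has no good answer for cold-start users.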