Monday, October 26, 2009

Francisco Martin #RecSys09 Industry Keynote Summary

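ACM Recommender Systems 2009 (October 22-25), one of the major annual events in the recommender systems community, has just wrapped up. Francisco J Martin, founder of Strands, was invited as an industry speaker to give the Industry Keynote, "Top 10 Lessons Learned Developing, Deploying, and Operating Real-World Recommender Systems," and it contains a lot worth thinking about. Neal Lathia (MobBlog) condensed Dr. Martin's talk, Twitter-style, into ten concise summaries: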

Lesson 1 – Make sure a recommender is really needed! Do you have lots of recommendable items? Many diverse customers?… also think Return-on-Investment… a more sophisticated recommender may not deliver a better ROI.


Lesson 2 – Make sure the recommendations make strategic sense. Is the best recommendation for the customer also the best for the business? What is the difference between a good and a useful recommendation? Good recommendations vs. useful recs; obvious recommendations may not be useful; risky recs may deliver better long-term value. (Every system exists to serve business needs. Never forget that.)


Lesson 3 – Choose the right partner! Select the right rec vendor vs hire some #recsys09 students. If you are a big company the best you can do is to organize a contest. (Why not just say Netflix outright? LOL)


Lesson 4 – Forget about cold-start problems (!) …. just be creative. The internet has the data you need (somewhere…) (Remember the old saying: we are limited only by our imagination.)

Lesson 5 – Get the right balance between data and algorithms. 70% of the success of a #recsys is on the data, the other 30% on the algorithm. (We have discussed this many times already: worry about the data before you worry about the algorithm.)


Lesson 6 – Finding correlated items is easy but deciding what, how, and when to present to the user is hard… or don't just recommend for the sake of it. Remember user attention is a scarce and valuable resource. Use it wisely! … don't make recommendations to a customer who is just about to pay for items at the checkout! User interface should get at least 50% of your attention.


Lesson 7 – Don't waste time computing nearest neighbours (use social connections)… just mine the social graph. Might miss useful connections??
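
To make Lesson 7 concrete, here is a minimal sketch of recommending straight from social connections instead of computing nearest neighbours. It is an illustration only, not anything shown in the keynote: the toy data, the names `friends` and `liked`, and the function `recommend_from_friends` are all hypothetical. Items are ranked by how many of a user's direct connections liked them.

```python
from collections import Counter

# Hypothetical toy data: each user's social connections and liked items.
friends = {
    "ana": {"ben", "cho"},
    "ben": {"ana"},
    "cho": {"ana", "ben"},
    "dov": set(),  # a user with no connections at all
}
liked = {
    "ana": {"song_a"},
    "ben": {"song_a", "song_b", "song_c"},
    "cho": {"song_b", "song_d"},
    "dov": {"song_d"},
}

def recommend_from_friends(user, friends, liked, k=3):
    """Rank items liked by the user's direct connections,
    skipping anything the user has already liked."""
    seen = liked.get(user, set())
    counts = Counter()
    for friend in friends.get(user, set()):
        for item in liked.get(friend, set()):
            if item not in seen:
                counts[item] += 1
    # Most friends first; ties broken by item name for a stable order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]

print(recommend_from_friends("ana", friends, liked))
print(recommend_from_friends("dov", friends, liked))
```

The second call returns an empty list because "dov" has no connections, which is one way to read the "Might miss useful connections??" caveat: a user whose tastes match people outside their social graph gets nothing from this approach.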


Lesson 8 – Don't wait to scale. (Lessons 6, 7, 8, and 9 are clearly drawn from hands-on practical experience.)


Lesson 9 – Choose the right feedback mechanism. Stars vs thumbs …. the YouTube problem. More research on implicit and other feedback mechanisms is needed. The perfect rating system is no rating system! … focus on the interface. Seems to me this is one of the gaps in current research… algorithms > data > interface


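Lesson 10 – Measure Everything! … business control and analytics is a big opportunity here. (Measure not only whether the predictions are accurate; every stage of the business process needs its own evaluation mechanism. This is the insight of someone who has actually founded and run a company.)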


Keynote Takeaway – Think about application context; Focus on interface as much as algorithms; Be creative with start-up data. … the UI needs to get the lion’s share of the effort (50%) compared to algorithms (5%), knowledge (20%), analytics (25%)


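Every reader may have their own view of the final takeaway. After all, quantifying how much each factor weighs in building a system is hard, and in the end the speaker can only be forced to offer a set of numbers that expresses his own rule-of-thumb experience. The importance of the UI is of course beyond doubt, but why should the UI count for ten times as much as the algorithm? Clever reader, you surely have ideas of your own!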

