[詩戀] Borrowing another's wine cup to drown the sorrows pent up in my own chest


In the 25th year of the Jiaqing reign (1820 CE), 龔自珍 wrote 《又懺心一首》:
佛言劫火遇皆銷 何物千年怒若潮

經濟文章磨白晝 幽光狂慧復中宵

來何洶湧須揮劍 去尚纏綿可付簫

心藥心靈總心病 寓言決欲就燈燒


How do recommender researchers test their algorithms and make the system smarter?

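FastCompany today profiles Darren Vengroff, Chief Scientist at RichRelevance, who has devised a way for researchers to test recommender-system algorithms. The idea is simple: wrap real-world data in a black box, let researchers upload their programs, and use that mechanism to judge how well an algorithm performs.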
There are many holy grails in online commerce, but one that has frustrated C-level executives and engineers alike is how to produce better recommendation algorithms. Produce better recommendations, and you’ll sell more stuff.
Historically, however, there has been one major structural impediment to making significant breakthroughs on this front. But the Chief Scientist at RichRelevance, which provides personalization solutions for the likes of Walmart, Sears, and, may just have fixed that. 
First, however, the impediment: The people who are likely to produce breakthroughs--the really smart smarty-pants in the math departments of the world’s universities--don’t have access to large bodies of real-world data. And without real-world data, they can come up with as many hypotheses and new types of math as they like, but they’ll never really know if it actually works in the real world. It’s like trying to learn how to serve without tennis balls. You can swing as much as you like, but until you actually hit a real-live ball, you can never be sure if your swing would actually place a ball in the serve box. 
For their part, the people who have real-world data--the Amazons and eBays of the world--can’t share it with the researchers for reasons of customer privacy. “Even if we anonymize it, we’re handcuffed because we can’t give out data that can reasonably be used to reconstruct who someone really is,” the Chief Scientist, Darren Vengroff, tells Fast Company.
Vengroff, however, has come up with a novel solution: He’s created a “black box” of sorts with real-world data that researchers can use to run experiments on. Researchers won’t be able to look at the data, but they will be able to dump their algorithms in and have the box spit out results, which the researchers can then use to refine their hypotheses. 
It’s a simple idea, but it wasn’t really possible to execute until the advent of the cloud. Now researchers from any part of the globe will be able to use the system to run experiments. (In principle, of course--in practice, a committee will vet proposals and choose which ones will actually run.) 
Vengroff, who once worked as a Principal Engineer at Amazon, says he got the idea for the project while attending a computing conference last fall. “In one of the sessions, there were three consecutive papers in a row where about two-thirds of the way through, I was really excited about what was being presented, and then they went down a different path than I thought they were going to go,” he says. “I realized if they only knew what the real-live data, that I look at every day, says, they wouldn’t have gone that way. They would have gone the right way and gotten to a much better solution.”
“Seeing these brilliant ideas get misapplied because of a very reasonable assumption about how shoppers might shop, but happens not to be true in the empirical data--I realized we’ve got to find a way to have this not happen anymore.” 
The system, which is in beta right now, will launch next month at a conference in Palo Alto. Says Vengroff: “We’ve got a significant new path that we think is really going to change things.”
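The "black box" idea in the article can be sketched in a few lines. This is only an illustration, not RichRelevance's actual system: the `BlackBox` class, the `most_popular` baseline, the hit-rate metric, and the toy data are all hypothetical. The box keeps the interaction data private, runs a submitted algorithm against it, and hands back nothing but an aggregate score.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Interaction = Tuple[str, str]  # (user_id, item_id); hypothetical schema
# A submitted algorithm takes training interactions and returns a function
# that maps a user_id to a ranked list of item_ids.
Algorithm = Callable[[Sequence[Interaction]], Callable[[str], List[str]]]


class BlackBox:
    """Holds private data and exposes only an aggregate score (a sketch)."""

    def __init__(self, train: Sequence[Interaction], held_out: Sequence[Interaction]):
        self._train = list(train)
        self._held_out = list(held_out)
        if not self._held_out:
            raise ValueError("held_out must not be empty")

    def evaluate(self, algorithm: Algorithm, k: int = 2) -> float:
        """Return hit rate@k: the share of held-out purchases found in the top k."""
        recommend = algorithm(self._train)
        hits = sum(1 for user, item in self._held_out if item in recommend(user)[:k])
        return hits / len(self._held_out)


def most_popular(train: Sequence[Interaction]) -> Callable[[str], List[str]]:
    """Baseline submission: recommend the most frequently bought items to everyone."""
    counts = Counter(item for _, item in train)
    ranked = [item for item, _ in counts.most_common()]
    return lambda user: ranked


# Toy data standing in for the real-world logs the researcher never sees.
box = BlackBox(
    train=[("u1", "a"), ("u2", "a"), ("u3", "b"), ("u1", "b"), ("u2", "c")],
    held_out=[("u3", "a"), ("u4", "c"), ("u5", "b")],
)
score = box.evaluate(most_popular, k=2)
print(f"hit rate@2 = {score:.3f}")
```

In this sketch the submitted function still runs in the same process as the data, so a real service would presumably also need sandboxing and the proposal vetting that the article mentions.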
[Video] To the life that used to be.

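As early as last August, a fellow netizen had pointed out that the Les Misérables 25th anniversary concert would be staged in London in early October last year. I had no chance to be there in person, but through the internet we still took part in the event "indirectly".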

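Perhaps my expectations were too high, or perhaps the 10th anniversary concert had left me with a preconceived "bias", but frankly I was a little disappointed with this performance. The classic One Day More, for instance, carried no passion at all. When it was sung again as the encore before the curtain call, the veteran singers certainly set the whole house alight, but the lack of stage presence of Nick Jonas, who played Marius this time, was laid bare.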

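In fairness, although Marius was a disappointment, the other roles were handled well enough, and the arrangements, staging, and showpiece moments surpassed those of fifteen years ago. Some numbers did have highlights: Grantaire's emotional outburst in Drink with Me, for example, is truer to the spirit of the novel than the restrained slow tempo of the 10th anniversary version, and it works better on stage. It is a pity that when Marius asks with deep sorrow at the end of the song, "Will you weep, Cosette, for me?", Nick Jonas cannot come close to the emotional range Michael Ball conveyed.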

Drink with me to days gone by
To the life that used to be 


25th anniversary version: Drink with Me

10th anniversary version: Drink with Me

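At Japan's National Institute for Materials Science, a team researching superconductors held a small celebration party. After the drink had gone to their heads, the researchers, who had had a bit too much, decided to soak their research samples in "many, many" liquors (the original wording was "many many liquor").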

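Tests afterwards showed that the samples soaked in the party drinks conducted far better than in the usual experiments (question: what were these people celebrating in the first place?). Further comparison found that samples soaked in Japanese shochu scored 23% better than the usual samples, while those soaked in red wine improved by 62%. According to the report, however, the party also served whisky and beer; could it be that those two did not perform so well!?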


Either way, plenty of people must love the report's conclusion: "So, a little sip of something turns out to make potential superconductors much better at their jobs. And, perhaps, scientists better at their jobs as well."



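I had thought that Vodafone Australia leaking the records of four million customers earlier this month was shocking enough, but the exploits of the good people at Statistics Canada turn out to be the truly outrageous pinnacle.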

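On the 10th of this month, the Toronto Sun ran a report cataloguing Stats Can's offences over the past five years. In 2007 they sold a filing cabinet holding sensitive information as surplus furniture, and on another occasion a Statistics Canada employee left one company's survey data in the office of a different survey respondent. The Sun politely calls these just "some examples of breaches".....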

OCT. 2010: Purolator envelope containing 11 unencrypted, non-password-protected CDs for the Vital Statistics Program in Alberta addressed to Ottawa head office sent July 9, 2010 is discovered missing. It contains more than 21,000 electronic images of confidential information about individual birth, death, stillbirth and marriage registrations. It is found Nov. 30, 2010 locked in a rarely-used filing cabinet.
SEPT. 2009: Stats Can library's password access protocol constitutes "major security breach."
DEC. 2008: A briefcase with documents and personal notes is stolen from the car of an interviewer from Quebec. Confidential addresses of respondents were included.
JULY 2008: An error in transmission meant e-mails of 108 subscribers of Health Reports notifications were "inadvertently revealed" to all recipients of message - constituting a breach of Privacy Act and Stats Can policy.
JUNE 2008: Stats Can is informed that on Feb. 12, 2008 Surrey RCMP and Canada Post recovered completed 2006 census questionnaires from a private residence in a bust of a major identity theft ring. Other items included equipment related to credit card/ID theft, drivers' licences, 3,000 pieces of stolen mail, government-issued cheques, fake currency and more than 100 CDs with thousands of personal data profiles. Census questionnaires were not in the hands of census staff - it is believed they were obtained by tipping mailboxes or break-ins to homes and cars.
AUG. 2007: A laptop containing personal information about individuals who participated in the Labour Force Survey or Canadian Community Health Survey is stolen from the residence of an employee in Abbotsford, BC. Password was written on a sticky note stored in laptop case. Police called, affected people are informed and interviewer receives verbal reprimand.
JUNE 2007: Laptop with three completed household spending surveys stolen in home break-in in Delta, B.C.
MARCH 2007: Edmonton regional office reports two laptop thefts from field interviewers' vehicles. Staff are reminded about protocol for securing material.
MARCH 2007: Privacy Commissioner's office advised of inadvertent disclosure and loss of personal info after surplus filing cabinets with Records of Employment about 66 2006 census workers were sold at a Crown Assets Auction in Edmonton. Affected individuals are contacted and Stats Can implements more stringent procedures to avoid a recurrence.
JULY 2006: Enumerator leaves completed questionnaire instead of blank at Scarborough, Ont. respondent's home.
APRIL 2005: Blank forms faxed to a business include additional pages of confidential information related to two other businesses. Staff receive retraining and posters/notices are displayed as reminders.
FEB. 2005: Marketing information collected for one user is reviewed by another user and possibly four other unknown individuals in a Corporations Returns Act survey.
FEB. 2005: Laptop being shipped from Williams Lake, B.C. to Edmonton containing 23 Survey of Household Spending cases - including 11 completed ones - goes missing. A flurry of e-mails ensues among senior managers at Stats Can and officials "pester" Canada Post to find the lost item. Confidential statistical info is encrypted. Laptop is found two weeks later.

[Video] How to succeed? Get more sleep

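I went to bed early last night, and when I opened my eyes this morning and looked at the clock, I was startled to find I had slept a full eleven hours. Still bleary-eyed, I opened Google Reader and saw that the TED China fan group was featuring a talk by Arianna Huffington, co-founder and editor-in-chief of The Huffington Post, in which she reflects on sleep and how she came to see it anew after a health crisis: How to succeed? Get more sleep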

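Database guru @Fenng (dbanote) discussed five lessons from how Facebook manages its engineering teams, but I could not find the original source in his tweets. I later discovered that they come from articles by Yishan Wong, who was an engineering lead at Facebook (Director of Engineering/Software Engineer) from 2005 to 2010. I find them more reliable than that piece from a few days ago about asking stupid questions.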

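@fenng sums up the articles concisely: (1) hiring top talent is the most important job; (2) process should be set by the people who actually do the work; (3) promote from within rather than parachuting people in; (4) use tools to raise productivity; (5) let people who understand the technology lead, rather than having outsiders lead insiders.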


1. Hiring is number one
2. Let process be implemented by those who practice it
3. Promotion from within
4. Tools are top priority
5. Technical Leaders

[詩戀] Try Brewing a Cup of Coffee



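Idea Champions, a consultancy whose core business is cultivating creativity, published a post on its company blog today about 100 excuses for putting things off. The familiar ones, such as no money, no time, "I'm not cut out for this", and "the boss will never agree", unsurprisingly rank in the top ten.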

1. I don't have the time.
2. I can't get the funding.
3. My boss will never go for it.
4. We're not in the kind of business likely to innovate.
5. We won't be able to get it past legal.
6. I've got too much on my plate.
7. I'll be punished if I fail.
8. I'm just not the creative type.
9. I'm juggling way too many projects.
10. I'm too new around here.

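Idea Champions is "professional" indeed, having managed to dig up a full 100 excuses. Excuse number 97, "that's R&D's job", leaves me speechless: as someone who considers himself R&D through and through, which department am I supposed to pass it on to?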

96. That's my boss's job.
97. That's R&D's job.
98. I would if I could, but I can't, so I won't.
99. First, we need to benchmark the competition.
100. It's against my religion.

[詩戀] Bring in the Wine (艾農)



Ask Stupid Questions

Curtis Franklin Jr., a veteran IT journalist, writes at Enterprise Efficiency about how IT managers can improve their management skills. He lists the following six principles:
  • Hire Well
  • Get Out of Their Way
  • Run Interference
  • Hold Them Responsible
  • Listen
  • Ask Stupid Questions

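A junior from my old lab wrote to me about a new research direction built on concept drift. The proposal he attached was far better than what I remembered of his earlier work, and it already shows a real air of command. In the year before I got my degree, I still found him very 「」 and some distance from working independently. Half a year after leaving school, I suddenly realize that everyone is growing except my own stagnant self.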

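The new topic is very interesting. It is close to the research direction I had wanted to push while I was still at school, and closer to the angle that practitioners in the internet industry care about. But since I returned to industry three months ago, all of this has moved further and further away from me. On the timeline of life, I have drifted onto a track I had never considered.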

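On the last day of 2010, I got talking with colleagues about topics I used to follow closely and could not help saying a few words too many, ending with a sigh that it all felt like another lifetime. That evening my juniors teased me about it online. Then again, these are bygone tales from "a hundred years ago", so is it not exactly another lifetime?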




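  • Because of some unhappy experiences in my teens, my mood sinks at the end of every year. After I grumbled about wanting to find someone to drink with, someone to fight with, someone to #$%&*!....., a classmate generously offered me his iPad to vent on. Jing, I will not let your kindness go to waste.
  • 文棟, 向量, next time I travel to Beijing on business, I will be sure to let you know in advance.
  • I am getting more and more used to writing with 黑豹 (Black Panther) playing in the background XD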

[Video] 7Billion, Living on Earth

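By the end of 2011, Earth will have roughly two hundred countries and 7 billion people, communicating with one another in more than seven thousand languages.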


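~ 林徽因 · 馬雁散文集 · 蓮燈 ~ In her essay 《高貴一種,有詩為證》, 馬雁 recalls that "more than ten years ago, before I knew anything of Ms. Lin's gossip or achievements, I read 《蓮燈》 quoted by someone else in a journal" and liked it very much; compared with 卞之琳 and 徐志摩, it is not merely their equal but a cut above. The handling of rhyme and tonal pattern in the opening is clearly superior to 戴...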