Monday, 22 August 2011

#sigir2011

It's more than a bit ironic that a premier conference on information retrieval took place behind the Great Firewall and consequently without discussions on Twitter. But the Chinese have also become great scientists and are no longer just cheap labor, so I guess they have well deserved to host this event.

The 34th SIGIR conference, held in Beijing, was attended by more than 800 people from around the world (China 400, USA 250, Europe 100, ...). The acceptance rate for papers was only 20%, which makes this conference one of the more competitive ones. A nice surprise this year was that the presentation level was substantially better than last year, with almost all speakers giving their talks in comprehensible English and with good rhetorical skills.
Bruce Croft (program chair) presenting basic facts about the conference
What makes the field of IR different from other scientific fields is the influence of industry and its research labs. Almost 50% of the papers had at least one author from Microsoft, Google, Yahoo, Facebook, Yandex, Baidu or some other company. So while SIGIR is a scientific conference, I got the feeling that it is very much oriented towards the real problems of the industry. If this assumption is correct, then we could perhaps deduce the problems of the industry by examining the share of papers in different areas.

Top 5 areas for accepted papers
The main stress of SIGIR2011 could be summed up as "find data that solve the problem". Here are a couple of examples of this approach in action:
  • The best paper award went to Mikhail Ageev from Russia, who devised a simple game that enabled the collection of data for measuring search success. They collected search trails from approximately 150 users using Mechanical Turk, and that was sufficient to learn a model that predicts whether or not the user found the information they were searching for. This technique enables Google et al. to automatically evaluate the quality of their search.
  • PICASSO is a system by Aleksandar Stupar that, given an image, recommends related music. The main idea behind this system is to use movies and their soundtracks to learn the relation between images and music.
  • Guys from Microsoft presented a clever way to identify the geographical relevance of a web site: just track where its readers come from.
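
To make the last idea a bit more concrete, here is a rough sketch of what "track where the readers come from" could look like. This is only an illustration and not the method from the Microsoft paper: I am assuming that every visit already carries a region label (for example from IP geolocation), and the function name `geographic_relevance` and the sample data are made up.

```python
from collections import Counter


def geographic_relevance(visitor_regions):
    """Return (region, share of visits) pairs, largest share first.

    Hypothetical helper: assumes each visit is already labeled with a region.
    """
    counts = Counter(visitor_regions)
    total = sum(counts.values())
    if total == 0:
        return []  # no visits, so nothing can be said about the site
    return [(region, n / total) for region, n in counts.most_common()]


# Made-up visit log for a single web site.
visits = ["Ljubljana", "Ljubljana", "Maribor", "Ljubljana", "Vienna"]
shares = geographic_relevance(visits)

for region, share in shares:
    print(f"{region}: {share:.0%}")
```

A site whose readers are concentrated in one region is probably mostly relevant there, while a site with an even spread is more likely of general interest. A real system would also have to correct for how many internet users each region has in the first place.
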
Overall (and excluding censorship) I liked SIGIR2011 in Beijing more than last year's conference in Geneva. Last year too much stress was put on rigorous evaluation, while this year the program committee allowed for bolder thinking. I got many good ideas while attending SIGIR2011 and you can expect many of them to be implemented in Zemanta soon.


Looking forward to SIGIR2012!