Monday, February 25, 2008

[Reading] Lecture 02 - Image Retrieval: Ideas, Influences, and Trends of the New Age

I did research on video retrieval when I was in Eric's small group, so I thought this paper would not be too hard for me to read. In fact, there were many things in it that I had never heard of before. Besides, I only understand how some of the newer techniques can be used in CBIR, not why they were brought into this area. I'm such a frog in a well...

Although this paper focuses only on the history of CBIR after 2000, we can still see how the mainstream of the area has evolved over the last decade. In the very beginning, the research topics in CBIR were quite limited: people usually took only two steps, feature extraction and similarity calculation, to retrieve what they wanted. After several years, more and more new topics such as signature extraction, clustering (metric learning), classification (SVM), and relevance feedback began to thrive. The authors ran some experiments with Google Scholar's search tool to support this observation.
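To make the early two-step idea concrete for myself, here is a minimal sketch of it. It is not taken from the paper: the coarse RGB histogram, the L1 distance, and the names `color_histogram`, `l1_distance`, and `retrieve` are my own illustrative choices, and an image is assumed to be just a list of (r, g, b) tuples.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Step 1, feature extraction: a normalized, coarsely quantized RGB histogram.

    Assumes 8-bit channels and that bins_per_channel divides 256.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}

def l1_distance(h1, h2):
    """Step 2, similarity calculation: L1 distance between two histograms."""
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in set(h1) | set(h2))

def retrieve(query_pixels, database, top_k=3):
    """Rank database images (name -> pixel list) by distance to the query."""
    q = color_histogram(query_pixels)
    scored = [(l1_distance(q, color_histogram(px)), name)
              for name, px in database.items()]
    return [name for _, name in sorted(scored)[:top_k]]

# Toy data: three tiny "images" and a mostly red query.
database = {
    "red": [(250, 10, 10)] * 10,
    "blue": [(10, 10, 250)] * 10,
    "mixed": [(250, 10, 10)] * 5 + [(10, 10, 250)] * 5,
}
query = [(250, 10, 10)] * 9 + [(10, 10, 250)]
ranking = retrieve(query, database)
```

Everything beyond this (signatures, learning, relevance feedback) can be seen as replacing or refining one of these two steps.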

Then the authors discuss several aspects of designing a CBIR system. Because CBIR applications are interactive, the authors usually think from the user's point of view. Constraints such as search speed are added to ensure that a CBIR system is practical. Those constraints make the design even harder, since we cannot solve the problem by brute force.

There is always a trade-off between performance and complexity. Take feature/signature extraction for example: to obtain a thorough description of an image, we should also consider spatial information (including shape). However, getting more concise and meaningful spatial information often requires heavier computation (e.g., segmentation). The same trade-off appears in similarity calculation. The Mallows distance, which performs well, is defined as the solution to a minimization problem. Since computing it is so expensive, some people would rather use integrated region matching (IRM) as their metric, which can be computed significantly faster and is not much inferior to the Mallows distance.
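A toy sketch helped me see this trade-off. It is only an illustration under my own assumptions, not the formulation in the paper: each image is a list of region feature vectors, the exact side is restricted to equally weighted regions so that the minimization can be done by brute force over permutations, and the greedy side follows the "closest pair gets matched first" idea behind IRM. The function names are hypothetical.

```python
from itertools import permutations

def dist(a, b):
    """Euclidean distance between two region feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def exact_matching_distance(regions_a, regions_b):
    """Optimal matching cost for equally weighted regions.

    Brute force over all permutations, so the cost grows factorially
    with the number of regions.
    """
    n = len(regions_a)
    assert n == len(regions_b)
    return min(
        sum(dist(regions_a[i], regions_b[perm[i]]) for i in range(n)) / n
        for perm in permutations(range(n))
    )

def greedy_irm_distance(regions_a, weights_a, regions_b, weights_b):
    """IRM-style greedy matching: the closest region pairs are matched first."""
    pairs = sorted(
        (dist(a, b), i, j)
        for i, a in enumerate(regions_a)
        for j, b in enumerate(regions_b)
    )
    left_a, left_b = list(weights_a), list(weights_b)
    total = 0.0
    for d, i, j in pairs:
        s = min(left_a[i], left_b[j])  # weight still unmatched on both sides
        if s > 0:
            total += s * d
            left_a[i] -= s
            left_b[j] -= s
    return total

# Two images with two regions each, described by a 1-D feature.
a = [(0.0,), (3.0,)]
b = [(2.0,), (5.0,)]
exact = exact_matching_distance(a, b)
greedy = greedy_irm_distance(a, [0.5, 0.5], b, [0.5, 0.5])
```

On this toy input the greedy value is larger than the exact one. That is expected, because the greedy matching is just one feasible solution of the minimization, but it only needs one sort over the region pairs instead of a search over all matchings.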

Another important issue that deserves mention is evaluation strategy. It is hard to judge whether the results of a CBIR system are good or bad because the judgment is very subjective, not to mention comparing two different CBIR systems. Therefore, researchers have started to agree on certain evaluation datasets, benchmarks, and forums.

Overall, this paper gives a comprehensive survey and also points out some possible future offshoots. However, we won't get much out of it if we only skim through it. The authors try to cover so many things that they sacrifice some of the details. Of course, if we really want to understand a certain topic well, reading the references listed by the authors would be a good start.

Reference:
R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: Ideas, influences, and trends of the new age," ACM Computing Surveys, 39(65), 2007.
