This is pioneering work that lets a machine learn to annotate image segments (blobs) automatically. I say "auto-annotate" because the method does not need a precisely annotated database; all it needs is images with words associated with them. The annotation task is then modeled as a machine translation problem, which can be solved with the EM algorithm. After detailing how EM is used to estimate the model, the paper presents several experimental results and a discussion, but I did not find them very convincing.
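To make the "annotation as translation" idea concrete for myself, here is a minimal sketch of EM for a blob-to-word lexicon. It assumes the image regions have already been quantized into discrete blob tokens, and it follows the simplest translation model I know (IBM Model 1 style, with no null word), which is not necessarily the exact model in the paper. The names `train_lexicon`, `annotate`, and the toy blob labels are all hypothetical.

```python
from collections import defaultdict


def train_lexicon(pairs, iterations=20):
    """Estimate t[blob][word] = P(word | blob) with EM.

    pairs: list of (blobs, words), one entry per image. Nothing says
    which blob goes with which word; EM has to work that out.
    """
    word_vocab = sorted({w for _, words in pairs for w in words})
    blob_vocab = sorted({b for blobs, _ in pairs for b in blobs})

    # Start from a uniform translation table.
    t = {b: {w: 1.0 / len(word_vocab) for w in word_vocab} for b in blob_vocab}

    for _ in range(iterations):
        # E-step: split each word's count over the blobs in the same
        # image, in proportion to the current P(word | blob).
        counts = {b: defaultdict(float) for b in blob_vocab}
        for blobs, words in pairs:
            for w in words:
                norm = sum(t[b][w] for b in blobs)
                for b in blobs:
                    counts[b][w] += t[b][w] / norm

        # M-step: renormalize the expected counts for each blob.
        for b in blob_vocab:
            total = sum(counts[b].values())
            if total > 0:
                t[b] = {w: counts[b][w] / total for w in word_vocab}
    return t


def annotate(t, blob):
    """Return the most probable word for a blob token."""
    return max(t[blob], key=t[blob].get)


if __name__ == "__main__":
    # Toy data (made up): three images, each with two blobs and two words.
    toy = [
        (["b_sky", "b_sun"], ["sky", "sun"]),
        (["b_sky", "b_sea"], ["sky", "sea"]),
        (["b_sun", "b_sea"], ["sun", "sea"]),
    ]
    table = train_lexicon(toy)
    for blob in ["b_sky", "b_sun", "b_sea"]:
        print(blob, annotate(table, blob))
```

On this toy data, each blob ends up paired with the word it consistently co-occurs with, even though no image says which blob matches which word. Real data is far noisier, which may be part of why the paper's results are harder to judge.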
Before this week, I had no idea how to classify, recognize, or label an object with an unsupervised method. Now I know at least one or two such approaches. This is very interesting, since in many realistic cases only data without precise annotations is available.
Reference:
P. Duygulu, K. Barnard, J. F. G. de Freitas, and D. A. Forsyth, "Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary," European Conference on Computer Vision, p. IV: 97 ff., 2002.