Abstract: This paper investigates a general framework to discover categories of unlabeled scene images according to their appearances (i.e., textures and structures). We jointly solve the two coupled tasks in an unsupervised manner: (i) classifying images without pre-determining the number of categories, and (ii) pursuing generative model for each category. In our method, each image is represented by two types of image descriptors that are effective to capture image appearances from different aspects. By treating each image as a graph vertex, we build up an graph, and pose the image categorization as a graph partition process. Specifically, a partitioned sub-graph can be regarded as a category of scenes, and we define the probabilistic model of graph partition by accumulating the generative models of all separated categories. For efficient inference with the graph, we employ a stochastic cluster sampling algorithm, which is designed based on the Metropolis-Hasting mechanism. During the iterations of inference, the model of each category is analytically updated by a generative learning algorithm. In the experiments, our approach is validated on several challenging databases, and it outperforms other popular state-of-the-art methods. The implementation details and empirical analysis are presented as well.
Comments: | 11 pages, 7 figures |
Subjects: | Computer Vision and Pattern Recognition (cs.CV) |
MSC classes: | 68U01 |
Cite as: | arXiv:1502.00374 [cs.CV] |
(or arXiv:1502.00374v1 [cs.CV] for this version) | |
https://doi.org/10.48550/arXiv.1502.00374 |