Related Publications:
- "Characterization of 3-d Volumetric Probabilistic Scenes for Object Recognition", IEEE Journal of Selected Topics in Signal Processing, vol. 6, issue on Emerging Techniques in 3-D, pp. 522-537, 09/2012.
- "An Evaluation of 3-d Local Descriptors in Probabilistic Volumetric Scenes", BMVC, Guildford, UK, 09/2012.
- "Object Recognition in Probabilistic 3D Volumetric Scenes", 1st International Conference on Pattern Recognition Applications and Methods, Vilamoura, Algarve, Portugal, 02/2012. Oral presentation (30 min), Best Paper Finalist.
People Involved:
Project Details:
The Probabilistic Volume Model
Pollard and Mundy (2007) proposed a probabilistic volume model that can represent the ambiguity and uncertainty in 3-d models derived from multiple image views. In Pollard's model, a region of three-dimensional space is decomposed into a regular 3-d grid of cells, called voxels. A voxel stores two kinds of state information: (i) the probability that the voxel contains a surface element and (ii) a mixture of Gaussians that models the surface appearance of the voxel as learned from a sequence of images. The surface probability is updated by incremental Bayesian learning: the probability that a voxel contains a surface element increases after image N+1 if the Gaussian mixture at that voxel explains the intensity observed in image N+1 better than the mixtures of the other voxels along the projection ray.
In a fixed-grid voxel representation, most of the voxels may correspond to empty areas of a scene, making storage of large, high-resolution scenes prohibitively expensive. Crispell (2010) proposed a continuously varying probabilistic scene model that generalizes the discrete model of Pollard and Mundy. Crispell's model allows non-uniform sampling of the volume, leading to an octree representation that is more space-efficient and can handle the finer resolution required near 3-d surfaces. More recently, Miller et al. (2010) implemented Crispell's model on the GPU. Training times decrease by several orders of magnitude, making it feasible to train the large number of objects required for multi-class object recognition tasks. The following figure summarizes the probabilistic volume model.
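To make the incremental Bayesian update concrete, here is a minimal Python sketch of one update along a single projection ray. It is an illustration under stated assumptions, not the implementation used in the project. It assumes voxels are ordered from the camera outward, that their occupancies are independent, and that the appearance likelihood of the observed intensity under each voxel's Gaussian mixture has already been evaluated. The function name `update_ray` and the `background_lik` term (the likelihood used when the ray leaves the grid without hitting a surface) are hypothetical.

```python
import numpy as np

def update_ray(surface_prob, appearance_lik, background_lik=1.0):
    """One Bayesian update of the surface probabilities along a single ray.

    surface_prob   : current P(voxel contains a surface), camera outward.
    appearance_lik : likelihood of the observed intensity under each voxel's
                     appearance model (assumed to be precomputed).
    background_lik : likelihood used if the ray hits no surface (assumption).
    """
    p = np.asarray(surface_prob, dtype=float)
    lik = np.asarray(appearance_lik, dtype=float)

    # vis[i]: probability that no voxel in front of voxel i is a surface.
    vis = np.concatenate(([1.0], np.cumprod(1.0 - p)[:-1]))
    # Likelihood mass contributed by each voxel being the visible surface.
    contrib = vis * p * lik
    # pre[i]: mass explained by voxels in front of voxel i.
    pre = np.concatenate(([0.0], np.cumsum(contrib)[:-1]))
    # Marginal likelihood of the observation over the whole ray.
    marginal = contrib.sum() + np.prod(1.0 - p) * background_lik

    # Bayes rule: P(surface | I) = P(surface) * p(I | surface) / p(I).
    return p * (pre + vis * lik) / marginal
```

With equal priors along the ray, a voxel whose appearance model explains the observed intensity better than its neighbours receives a higher surface probability after the update, while poorly matching voxels in front of it are pushed down.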
The local information in the probabilistic scenes is used to build representations of objects as bags of volumetric words. Local neighborhoods are described using principal component analysis or a Taylor series approximation of the surface and appearance attributes. K-means-type clustering is used to form a common vocabulary across categories. Finally, feature descriptors are assigned to the most similar vocabulary entry, and the quantized descriptors are used to learn word distributions for the different object classes. A Bayesian classifier is used during the testing phase to assign each object its most probable class label. The workflow just described and the classification results are presented below.
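The bag-of-volumetric-words workflow can be sketched in a few lines of Python. This is a simplified illustration, not the project's code. It assumes the local descriptors (for example PCA or Taylor coefficients) are already available as rows of a NumPy array, uses plain Lloyd-style k-means, and models each class as a smoothed multinomial over vocabulary words. All function names and the smoothing parameter `alpha` are hypothetical.

```python
import numpy as np

def quantize(descriptors, vocabulary):
    """Index of the nearest vocabulary entry for each descriptor."""
    diff = descriptors[:, None, :] - vocabulary[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd iterations; returns k cluster centers (the vocabulary)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = quantize(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave a center unchanged if it has no members
                centers[j] = members.mean(axis=0)
    return centers

def word_distribution(descriptor_sets, vocabulary, alpha=1.0):
    """Smoothed word frequencies over all training objects of one class."""
    counts = np.full(len(vocabulary), alpha)
    for descriptors in descriptor_sets:
        words = quantize(descriptors, vocabulary)
        counts += np.bincount(words, minlength=len(vocabulary))
    return counts / counts.sum()

def classify(descriptors, vocabulary, class_word_probs, class_priors):
    """Most probable class label under a multinomial naive Bayes model."""
    words = quantize(descriptors, vocabulary)
    counts = np.bincount(words, minlength=len(vocabulary))
    log_post = {
        label: np.log(class_priors[label]) + counts @ np.log(probs)
        for label, probs in class_word_probs.items()
    }
    return max(log_post, key=log_post.get)
```

In use, the vocabulary would be learned once from descriptors pooled across all categories, `word_distribution` would be called per class on the training objects, and `classify` would be applied to the descriptors of each test object.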