Pollard and Mundy (2007) proposed a probabilistic volume model that can represent the ambiguity and uncertainty in 3-d models derived from multiple image views. In Pollard's model, a region of three-dimensional space is decomposed into a regular 3-d grid of cells, called voxels. A voxel stores two kinds of state information: (i) the probability that the voxel contains a surface element and (ii) a mixture of Gaussians that models the surface appearance of the voxel as learned from a sequence of images. The surface probability is updated by incremental Bayesian learning , where the probability of a voxel containing a surface element after N+1 images increases if the Gaussian mixture at that voxel explains the intensity observed in the N+1 image better than any other voxelalong the projection ray. In a fixed-grid voxel representation, most of the voxels may correspond to empty areas of a scene, making storage of large, high-resolution scenes prohibitively expensive. Crispell (2010) proposed a continuously varying probabilistic scene model that generalizes the discrete model proposed by Pollard and Mundy. Crispell's model allows non-uniform sampling of the volume leading to an octree representation that is more space-efficient and can handle finer resolution required near 3-d surfaces. More recently a GPU implementation of Crispell's model has been implemented by Miller et al. (2010). Training times decrease by several orders of magnitudes making it feasible to train large number of objects requiered for multi-class object recognition tasks. The following figure sumarizas the probabilisti volume model.