Sparse B-spline polynomial descriptors for human activity recognition


Oikonomopoulos, Antonios and Pantic, Maja and Patras, Ioannis (2009) Sparse B-spline polynomial descriptors for human activity recognition. Image and Vision Computing, 27 (12). pp. 1814-1825. ISSN 0262-8856

Abstract: The extraction and quantization of local image and video descriptors for the subsequent creation of visual codebooks is a technique that has proved very effective for image and video retrieval applications. In this paper we build on this concept and propose a new set of visual descriptors that provide a local space-time description of the visual activity. The proposed descriptors are extracted at spatiotemporal salient points detected on the estimated optical flow field for a given image sequence and are based on geometrical properties of three-dimensional piecewise polynomials, namely B-splines. The B-splines are fitted on the spatiotemporal locations of salient points that fall within a given spatiotemporal neighborhood. Our descriptors are invariant to translation and scaling in space-time. Scale invariance is ensured by coupling the neighborhood dimensions to the scale at which the corresponding spatiotemporal salient points are detected. In addition, to provide robustness against camera motion (e.g. global translation due to camera panning), we subtract the motion component that is estimated by applying local median filters on the optical flow field. The descriptors that are extracted across the whole dataset are clustered to create a codebook of ‘visual verbs’, where each verb corresponds to a cluster center. We use the resulting codebook in a ‘bag of verbs’ approach to represent the motion of the subjects within small temporal windows. Finally, we use a boosting algorithm to select the most discriminative temporal windows of each class and Relevance Vector Machines (RVM) for classification. The presented results on three different databases of human actions verify the effectiveness of our method.
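As a rough illustration of the codebook and ‘bag of verbs’ steps described in the abstract, the following Python sketch clusters descriptor vectors and turns the descriptors of one temporal window into a normalized histogram over the cluster centers. The function names, the choice of plain k-means with Euclidean distance, and the synthetic data are assumptions made for illustration only; the abstract does not state which clustering algorithm is used, and the B-spline descriptor extraction, boosting, and RVM stages are not shown.

```python
import numpy as np


def build_codebook(descriptors, k, iters=20, seed=0):
    """Cluster descriptors with plain k-means (an assumed choice).

    Returns an array of shape (k, dim); each row plays the role of
    one 'visual verb', i.e. a cluster center.
    """
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct descriptors.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance of every descriptor to every center.
        dist = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def bag_of_verbs(window_descriptors, centers):
    """Histogram of nearest-center assignments for one temporal window."""
    dist = ((window_descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist


if __name__ == "__main__":
    # Synthetic stand-in for descriptors pooled over a whole dataset.
    rng = np.random.default_rng(1)
    all_descriptors = rng.normal(size=(200, 8))
    codebook = build_codebook(all_descriptors, k=5)
    # Descriptors falling inside one (hypothetical) temporal window.
    window = all_descriptors[:30]
    print(bag_of_verbs(window, codebook))
```

Each temporal window is thus represented by a fixed-length vector whose length equals the codebook size, which is the form a window-level classifier would take as input.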
Item Type: Article
Copyright: © 2009 Elsevier
Faculty: Electrical Engineering, Mathematics and Computer Science (EEMCS)
Research Group:
Link to this item: http://purl.utwente.nl/publications/69476
Official URL: http://dx.doi.org/10.1016/j.imavis.2009.05.010