research

Take your eyes off the ball: tracking the invisible in team sports

Accurate video-based ball tracking in team sports is important for automated game analysis, but it has proven very difficult because the ball is often occluded by the players. We propose a novel approach that addresses this issue by formulating tracking as the problem of deciding which player, if any, owns the ball at any given time. This is very different from standard approaches that first attempt to track the ball and only afterwards assign ownership. We show that our method achieves a significant increase in accuracy over such approaches on long basketball and soccer sequences. [CVIU 2014] [example videos]

Missed our CVPR'13 demo? Play basketball roulette here!

Learning parameterized histogram kernels on the simplex manifold for image and action classification

State-of-the-art image and action classification systems often employ vocabulary-based representations. The classification accuracy achieved with such vocabulary-based representations depends significantly on the chosen histogram distance. In particular, when the decision function is a support-vector-machine (SVM), the classification accuracy depends on the chosen histogram kernel. We learn parameters of histogram kernels so that the SVM accuracy is improved. This is accomplished by simultaneously maximizing the SVM's geometric margin and minimizing an estimate of its generalization error. [ICCV 2011 paper][code]
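
To make the notion of a parameterized histogram kernel concrete, here is a minimal sketch using the exponential chi-square kernel, a common choice for vocabulary histograms. It is an illustration only and is not necessarily the kernel family studied in the paper; the function names and the single bandwidth parameter `gamma` are assumptions. In the setting described above, `gamma` is the kind of parameter that would be learned rather than fixed by hand.

```python
import math


def chi2_distance(p, q):
    """Chi-square distance between two histograms with the same bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(p, q):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def chi2_kernel(p, q, gamma=1.0):
    """Exponential chi-square kernel; gamma is the tunable parameter."""
    return math.exp(-gamma * chi2_distance(p, q))


# Two toy 3-bin histograms (hypothetical data).
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]

# A larger gamma makes the kernel decay faster with histogram distance.
k_narrow = chi2_kernel(p, q, gamma=4.0)
k_wide = chi2_kernel(p, q, gamma=0.5)
```

Changing `gamma` changes the kernel value for the same pair of histograms, which in turn changes the SVM decision function. This is why the choice of kernel parameters affects classification accuracy.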

Layers of graphical models for tracking partially-occluded objects

We propose a representation for scenes containing relocatable objects that can cause partial occlusions of people in a camera's field of view. In this representation, called a graphical model layer, a person's motion in the ground plane is defined as a first-order Markov process on activity zones, while image evidence is aggregated in 2D observation regions that are depth-ordered with respect to the occlusion mask of the relocatable object. The effectiveness of our scene representation is demonstrated on challenging parking-lot surveillance scenarios. [T-PAMI 2011 paper] [datasets] [CVPR 2008 paper]
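
The first-order Markov process on activity zones can be illustrated with a single predict-and-update step of a discrete Bayes filter. This is a generic sketch, not the model from the papers: the three zones, the transition matrix, and the per-zone image likelihoods are all hypothetical values chosen for the example.

```python
def forward_step(belief, transition, likelihood):
    """One step of a discrete first-order Markov filter over zones.

    belief[i]        -- probability the person is in zone i at time t-1
    transition[i][j] -- probability of moving from zone i to zone j
    likelihood[j]    -- image evidence for zone j at time t
    """
    n = len(belief)
    # Predict: propagate the previous belief through the zone transitions.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Update: weight each zone by how well it explains the image evidence.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("evidence is incompatible with every reachable zone")
    return [u / total for u in unnormalized]


# Hypothetical three-zone layout: the person starts in zone 0.
belief = [1.0, 0.0, 0.0]
transition = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
# Hypothetical evidence: zone 0 is weakly supported, for example because
# the person is partly hidden there, while zone 1 is strongly supported.
likelihood = [0.1, 0.9, 0.5]

posterior = forward_step(belief, transition, likelihood)
```

Because the process is first-order, the belief at time t depends only on the belief at time t-1 and the current evidence. Zone 2 stays at zero probability here because it cannot be reached from zone 0 in one step.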

Learning a family of detectors via multiplicative kernels

Object detection is challenging when the object class exhibits large within-class variations. In this work, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly learned in a multiplicative form of two kernel functions. Model training is accomplished via standard SVM learning. Our approach compares favorably to existing methods on hand and vehicle detection tasks. [T-PAMI 2011 paper] [CVPR 2008 paper] [CVPR 2007 paper]
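
The multiplicative form can be sketched as a product of a within-class (pose) kernel and a foreground-background (appearance) kernel. The sketch below assumes RBF kernels for both factors and uses hypothetical names and toy inputs; the kernels used in the papers may differ.

```python
import math


def rbf(u, v, gamma):
    """Gaussian RBF kernel between two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def multiplicative_kernel(x1, theta1, x2, theta2, gamma_x=1.0, gamma_theta=1.0):
    """Product of a pose kernel on theta and an appearance kernel on x."""
    return rbf(theta1, theta2, gamma_theta) * rbf(x1, x2, gamma_x)


# Toy example: identical image features, different pose parameters.
x_a, theta_a = [0.2, 0.7], [0.0]
x_b, theta_b = [0.2, 0.7], [1.0]

k_same_pose = multiplicative_kernel(x_a, theta_a, x_a, theta_a)
k_diff_pose = multiplicative_kernel(x_a, theta_a, x_b, theta_b)
```

Since a product of two valid kernels is itself a valid kernel, the combined function can be passed to a standard SVM solver, which is consistent with training via standard SVM learning as described above.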

Document image analysis and enhancement for multi-lingual OCR

Modern optical character recognition (OCR) engines achieve remarkable accuracy on clean document images but tend to perform poorly when presented with degraded documents or documents captured with hand-held devices. The problem is exacerbated for multilingual OCR engines. We propose an approach for automated script identification in degraded documents and for automatic correction of perspective warp. [ICDAR 2005 paper] [ICDAR 2003 paper]

Tracking small vessels in littoral zones

In water-based scenarios, waves caused by wind or by moving vessels (wakes) form highly correlated moving patterns that confuse traditional background analysis models. In this work we introduce a framework that explicitly models this type of background variation. The framework combines the output of a statistical background model with localized optical flow analysis to produce two motion maps. In the final stage we apply object-level fusion to filter out moving regions that are most likely caused by wave clutter. The resulting set of objects can now be handled by a tracking algorithm. [ICIP 2003 paper]