Tracking an object in an image sequence (video) is the process of dynamically computing a moving object's location over time. Tracking has important applications spanning several disciplines, such as human-computer interaction, security and surveillance, augmented reality, and video editing. In most tracking systems, a set of parameters related to the object's location (the state) is dynamically computed (e.g., translation parameters, affine transform parameters, deformation parameters). The lab has considerable experience in developing robust tracking algorithms and possesses a number of tools:
Over the past decade, particle filtering has been the dominant paradigm for tracking the state of an object given a set of noisy observations. The lab has proposed and developed various tracking algorithms within a particle filter framework ([1], [2]).
These algorithms have been successfully applied to the tracking of multiple independent objects, such as facial features (e.g., mouth and eye corners).
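To make the recursion concrete, below is a minimal, illustrative sketch of a single predict-weight-resample cycle of a generic bootstrap particle filter. The random-walk dynamics, Gaussian observation likelihood, particle count and resampling threshold are assumptions chosen for the example, not the specific models proposed in [1], [2].

```python
import numpy as np

def particle_filter_step(particles, weights, transition, likelihood, rng):
    """One predict-weight-resample cycle of a generic particle filter (illustrative sketch)."""
    # Predict: propagate every state hypothesis through the (stochastic) dynamic model.
    particles = transition(particles, rng)
    # Update: reweight each particle by how well it explains the current observation.
    weights = weights * likelihood(particles)
    weights /= weights.sum()
    # Resample when the effective sample size collapses, to avoid weight degeneracy.
    n_eff = 1.0 / np.sum(weights ** 2)
    if n_eff < 0.5 * len(particles):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights

# Toy example: a 2-D translation state with random-walk dynamics and a Gaussian likelihood
# around a detected feature position (all values are illustrative assumptions).
rng = np.random.default_rng(0)
particles = rng.normal(0.0, 5.0, size=(500, 2))
weights = np.full(500, 1.0 / 500)
observation = np.array([1.0, -2.0])

transition = lambda p, rng: p + rng.normal(0.0, 1.0, p.shape)
likelihood = lambda p: np.exp(-0.5 * np.sum((p - observation) ** 2, axis=1))

particles, weights = particle_filter_step(particles, weights, transition, likelihood, rng)
estimate = np.average(particles, axis=0, weights=weights)  # posterior mean of the state
```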
Hierarchical face and gaze tracking is a combination of appearance-based trackers capable of estimating predefined facial features in monocular video sequences. The tracker describes the non-rigid facial movements of the eyelids, irises, eyebrows and lips, while the rigid head motion is simultaneously estimated in 3D (three rotation angles, image-plane translation and depth scaling). The method is person-independent and requires no prior training with facial textures, shapes or facial actions. Instead, the tracker learns the facial textures and smooth illumination changes online over time, which makes it robust to short-term occlusions, drifting and out-of-plane movements. Consequently, unusual faces can also be tracked, either to recover the 3D head pose and location or to extract the visual features suitable for cognitive interpretation.
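The online learning of the facial texture can be understood, in simplified form, as a recursively updated appearance template. The sketch below shows one plausible form of such an update (a per-pixel Gaussian appearance model with exponential forgetting and illumination normalisation); the function names, forgetting factor and normalisation are illustrative assumptions, not the tracker's actual implementation.

```python
import numpy as np

def update_appearance(mean, var, observed_patch, alpha=0.05, min_var=1e-3):
    """Exponentially forgetting update of a per-pixel Gaussian appearance template."""
    # Normalise the warped patch to absorb smooth illumination changes.
    patch = observed_patch - observed_patch.mean()
    patch = patch / (np.linalg.norm(patch) + 1e-8)
    # Blend the new evidence into the running per-pixel statistics.
    mean = (1.0 - alpha) * mean + alpha * patch
    var = (1.0 - alpha) * var + alpha * (patch - mean) ** 2
    return mean, np.maximum(var, min_var)

def match_score(mean, var, candidate_patch):
    """Log-likelihood of a candidate warp under the learned appearance model."""
    patch = candidate_patch - candidate_patch.mean()
    patch = patch / (np.linalg.norm(patch) + 1e-8)
    return -0.5 * np.sum((patch - mean) ** 2 / var + np.log(var))

# Toy usage: bootstrap the model from the first warped face texture, then update per frame.
first_texture = np.random.rand(32, 32)                 # stands in for a warped facial texture
mean, var = np.zeros_like(first_texture), np.ones_like(first_texture)
mean, var = update_appearance(mean, var, first_texture, alpha=1.0)
mean, var = update_appearance(mean, var, np.random.rand(32, 32))
```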
Currently, we are developing tracking algorithms based on an online learning approach to kernel principal component analysis, which incrementally updates the eigenspace of the tracked features. We devise kernels that are robust to illumination changes and occlusion [3].
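As an illustration of the incremental eigenspace idea, the sketch below implements a standard incremental SVD-based eigenspace update for the plain linear case; the kernelised, illumination- and occlusion-robust formulation of [3] is more involved, and all names, dimensions and the retained rank below are assumptions made for the example.

```python
import numpy as np

def update_eigenspace(U, s, mean, n_seen, batch, k=10):
    """Incrementally update a PCA eigenspace with a new batch of samples (linear case).

    U      : (d, r) current orthonormal eigenbasis
    s      : (r,)   current singular values
    mean   : (d,)   current sample mean
    n_seen : number of samples summarised so far
    batch  : (d, m) new samples stored column-wise (e.g., vectorised image patches)
    k      : number of components to retain
    """
    m = batch.shape[1]
    batch_mean = batch.mean(axis=1)
    total = n_seen + m
    new_mean = (n_seen * mean + m * batch_mean) / total

    # Centre the batch and append a column correcting for the shift of the mean.
    B = np.hstack([batch - batch_mean[:, None],
                   (np.sqrt(n_seen * m / total) * (batch_mean - mean))[:, None]])

    # Split B into the part explained by the current basis and its orthogonal residual.
    proj = U.T @ B
    residual = B - U @ proj
    Q, R = np.linalg.qr(residual)

    # The SVD of a small augmented matrix yields the rotated, enlarged eigenspace.
    K = np.block([[np.diag(s), proj],
                  [np.zeros((R.shape[0], s.size)), R]])
    Uk, sk, _ = np.linalg.svd(K, full_matrices=False)

    U_new = np.hstack([U, Q]) @ Uk
    return U_new[:, :k], sk[:k], new_mean, total

# Toy usage: summarise an initial buffer of 20-dimensional features, then fold in a new batch.
rng = np.random.default_rng(0)
init = rng.normal(size=(20, 15))                       # 15 samples stored column-wise
mean0 = init.mean(axis=1)
U0, s0, _ = np.linalg.svd(init - mean0[:, None], full_matrices=False)
U, s, mean, n_seen = update_eigenspace(U0, s0, mean0, 15, rng.normal(size=(20, 5)), k=5)
```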
Maja Pantic, Joan Alabort-i-Medina, Epameinondas Antonakos, Akshay Asthana, James Booth, Shiyang Cheng, Stephan Liwicki, Javier Orozco, Christos Sagonas, Patrick Snape, Georgios Tzimiropoulos
S. Liwicki, G. Tzimiropoulos, S. Zafeiriou, M. Pantic. International Journal of Computer Vision. 101(3): pp. 498 - 518, 2013.
S. Liwicki, G. Tzimiropoulos, S. Zafeiriou, M. Pantic. IEEE Transactions on Neural Networks and Learning Systems. 23: pp. 1624 - 1636, October 2012.
S. Zafeiriou, G. Tzimiropoulos, M. Pantic. Proceedings of IEEE Int’l Conf. Computer Vision and Pattern Recognition (CVPR-W’11), Workshop on Computer Vision for Computer Games. Colorado Springs, USA, pp. 37 - 42, June 2011.
G. Tzimiropoulos, S. Zafeiriou, M. Pantic. Proceedings of IEEE Int’l Conf. Computer Vision and Pattern Recognition (CVPR-W’11), Workshop on CVPR for Human Behaviour Analysis. Colorado Springs, USA, pp. 26 - 33, June 2011.
S. Liwicki, S. Zafeiriou, G. Tzimiropoulos, M. Pantic. Proceedings of IEEE International Conference on Automatic Face and Gesture Recognition (FG'11). Santa Barbara, CA, USA, pp. 507 - 513, March 2011.
I. Patras, M. Pantic. Proceedings of IEEE Int'l Conf. Systems, Man and Cybernetics (SMC'05). Waikoloa, Hawaii, pp. 1066 - 1071, October 2005.
I. Patras, M. Pantic. Proceedings of IEEE Int'l Conf. Face and Gesture Recognition (FG'04). Seoul, Korea, pp. 97 - 102, May 2004.