During my postdoc, my research focused on applying computer vision approaches to robotic scenarios. Topics included place recognition, action recognition on a mobile robot platform, unsupervised action detection, and action anticipation in a human-robot collaboration scenario.
Place recognition: We presented novel methods that exploit the properties of mid- and high-level ConvNet features to achieve robustness to the two biggest challenges in visual place recognition: appearance and viewpoint changes.
Action recognition (on a mobile robot platform): We addressed the challenges of transitioning from computer vision datasets to robotics applications, in particular the sensitivity of many traditional approaches to background cues and the effect of camera motion. To this end, we developed methods for selecting motion-salient action region proposals that are more likely to contain the action, regardless of background and camera motion.
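The idea of scoring region proposals by motion saliency can be illustrated with a minimal sketch. All names here are illustrative, and the median-flow camera-motion compensation is a simplifying assumption, not the published method:

```python
import numpy as np

def score_proposals(flow, boxes):
    """Score candidate boxes by residual motion saliency.

    flow : (H, W, 2) optical-flow field (dx, dy per pixel)
    boxes: list of (x0, y0, x1, y1) region proposals
    Camera motion is approximated by the median flow over the
    whole frame and subtracted before scoring (an assumption
    made for this sketch only).
    """
    camera_motion = np.median(flow.reshape(-1, 2), axis=0)
    residual = np.linalg.norm(flow - camera_motion, axis=2)  # (H, W)
    return [float(residual[y0:y1, x0:x1].mean()) for x0, y0, x1, y1 in boxes]

# Toy example: uniform background flow (camera pan) plus a moving region.
flow = np.tile(np.array([2.0, 0.0]), (64, 64, 1))   # global pan
flow[20:40, 20:40] += np.array([0.0, 3.0])          # independently moving actor
boxes = [(20, 20, 40, 40), (0, 0, 15, 15)]
scores = score_proposals(flow, boxes)
# The box over the moving actor scores higher than the background box.
```

Subtracting a global motion estimate before scoring is what makes the ranking insensitive to camera motion: a static background under a pan has near-zero residual flow.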
Unsupervised action detection: We proposed the new task of unsupervised action detection by action matching: given two long videos, the objective is to temporally detect all pairs of matching video segments. The task is category independent (it does not matter which action is being performed), and no supervision is used to discover the matching segments.
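The matching objective can be sketched as follows. Mean-pooled window descriptors and cosine similarity are simplifying assumptions for this sketch; the actual method and its features are not reproduced here:

```python
import numpy as np

def match_segments(feats_a, feats_b, win=4, thresh=0.95):
    """Unsupervised matching of temporal segments between two videos.

    feats_a, feats_b : (T, D) per-frame feature matrices
    win              : segment length in frames
    Returns index pairs (i, j) whose windows [i, i+win) and
    [j, j+win) have cosine similarity above `thresh`.
    """
    def windows(f):
        w = np.stack([f[t:t + win].mean(axis=0)
                      for t in range(len(f) - win + 1)])
        return w / np.linalg.norm(w, axis=1, keepdims=True)

    wa, wb = windows(feats_a), windows(feats_b)
    sim = wa @ wb.T                      # pairwise cosine similarities
    return [(i, j) for i, j in zip(*np.nonzero(sim > thresh))]
```

Note that no labels enter at any point: the only signal is the similarity between the two videos' segment descriptors, which is what makes the formulation category independent.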
Action anticipation in a human-robot collaboration scenario: Anticipating future actions is a key component of intelligence, especially for real-time systems such as robots or autonomous cars. While recent works have addressed the prediction of raw RGB pixel values, we focused on anticipating the motion evolution in future video frames.
During my PhD, I worked on a range of computer vision topics, including face, object and action recognition, action clustering, and face/object tracking, using the non-Euclidean geometry of Riemannian manifolds.
Visual recognition via graph embedding discriminant analysis on Grassmann manifolds: Proposing a novel discriminant analysis (DA) method on Grassmann manifolds (as well as SPD manifolds) based on the graph-embedding framework, with applications to face/object/action recognition, texture classification and person re-identification.
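As a rough illustration of the graph-embedding DA idea, here is a simplified Euclidean sketch operating on vectorised projection embeddings of subspaces; the actual formulation works intrinsically on the manifold, and every name and parameter below is an assumption for illustration:

```python
import numpy as np

def graph_embedding_da(X, y, dim=1, eps=1e-6):
    """Graph-embedding discriminant analysis (Euclidean sketch).

    X : (N, D) samples, e.g. flattened projection matrices
        U @ U.T of Grassmann points; y : (N,) class labels.
    A within-class affinity graph W and a between-class penalty
    graph B define graph Laplacians; maximising between-class
    over within-class scatter is a generalised eigenproblem.
    """
    X, y = np.asarray(X, float), np.asarray(y)
    same = (y[:, None] == y[None, :]).astype(float)
    W = same - np.eye(len(y))          # within-class graph (no self-loops)
    B = 1.0 - same                     # between-class penalty graph
    Lw = np.diag(W.sum(1)) - W
    Lb = np.diag(B.sum(1)) - B
    Sw = X.T @ Lw @ X + eps * np.eye(X.shape[1])   # ridge for stability
    Sb = X.T @ Lb @ X
    # whiten by Sw^(-1/2), then a plain symmetric eigendecomposition
    d, Q = np.linalg.eigh(Sw)
    Wh = Q @ np.diag(d ** -0.5) @ Q.T
    _, vecs = np.linalg.eigh(Wh @ Sb @ Wh)
    return Wh @ vecs[:, ::-1][:, :dim]             # top `dim` directions
```

The graph-embedding view generalises classical LDA: different choices of W and B recover different discriminant criteria within one eigenproblem template.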
Clustering on Grassmann manifolds: Reformulating the action clustering task over the Grassmann manifold (an extrinsic approach) and proposing a new measure of clustering distortion, with applications to human action clustering, handwritten digit clustering, face clustering and social behaviour clustering.
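The extrinsic reformulation can be sketched as follows: each subspace is mapped to its projection matrix, after which ordinary Euclidean clustering applies. The plain k-means loop below is a simplification for illustration, not the paper's clustering-distortion measure:

```python
import numpy as np

def grassmann_kmeans(subspaces, k, iters=20):
    """Extrinsic k-means clustering on the Grassmann manifold.

    Each subspace is an (n, p) matrix with orthonormal columns,
    embedded as its projection matrix U @ U.T; Euclidean k-means
    in that embedding corresponds to the chordal
    (projection-metric) distance between subspaces.
    """
    X = np.stack([(U @ U.T).ravel() for U in subspaces])
    # deterministic farthest-point initialisation
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.stack(centers)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        centers = np.stack([X[labels == c].mean(0) if (labels == c).any()
                            else centers[c] for c in range(k)])
    return labels
```

The projection embedding is sign- and basis-invariant (U and UR give the same U @ U.T for orthogonal R), which is exactly why it is a well-defined representation of a point on the Grassmannian.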
Visual tracking via Grassmann geometry: Reformulating the visual tracking task over the Grassmann manifold and developing a novel object appearance model based on affine subspaces, together with a novel approach to measuring the distance between affine subspaces.
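A core primitive behind these Grassmann formulations is measuring the distance between linear subspaces via principal angles; the affine case additionally accounts for the subspaces' origins. A minimal sketch, where the geodesic arc-length metric is one standard choice rather than necessarily the paper's exact measure:

```python
import numpy as np

def principal_angles(U, V):
    """Principal angles between subspaces span(U) and span(V).

    U, V : (n, p) matrices with orthonormal columns.
    The singular values of U.T @ V are the cosines of the
    principal angles (clipped for numerical safety).
    """
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

def geodesic_distance(U, V):
    """Geodesic (arc-length) distance on the Grassmann manifold."""
    return float(np.linalg.norm(principal_angles(U, V)))
```

For example, two planes in R^3 sharing one direction have principal angles (0, pi/2), so their geodesic distance is pi/2.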