Special Lecture: Complex Image Representations and Similarities for Multimedia Understanding.
Prof. Dr. Nicolas Thome of UPMC/LIP6, in the 2011 Graduate Seminar Series, on 08/06/2011 at 14:00, IC Auditorium, Room 85 - IC 2.
| What | Lecture |
|---|---|
| When | 08/06/2011 from 14:00 to 15:00 |
| Where | IC Auditorium - Room 85 - IC 2 |
In this talk, I present our current work on designing and learning representations and similarities. I start with a specific, challenging application: text detection in urban contexts. We develop a complete system (SnooperText) consisting of character segmentation, classification and grouping. These bottom-up steps are combined with a top-down validation that analyzes each text region globally. Successful experiments have been carried out on the ICDAR database and on a specific urban dataset in line with the French Itowns project. We also propose to extend the method to videos by combining text detection and tracking.
Regarding image representation, we propose an approach that goes beyond the popular Bag of Words (BoW) methodology. The novel representation, BOSSA (Bag Of Statistical Sampling Analysis), enriches the standard pooling step by capturing a histogram of distances to each visual word. We show promising results by evaluating the method on both image and video classification tasks. In kernel methods, we tackle the problem of intermediate fusion strategies for feature combination, i.e. kernel combination. We propose a new kernel combination algorithm (CVl1MKL) able to output a non-sparse kernel combination well suited to combining complementary image modalities. In deep networks, we are working on unsupervised learning of deep representations. In the context of Restricted Boltzmann Machines (RBM), we propose an algorithm that enforces both selectivity and sparsity in the learned representations. We are also investigating deep biologically inspired methods. We improve state-of-the-art architectures by providing multi-scale S2 prototypes, improving the similarity coding and incorporating sparse priors at the coding and pooling steps.

=====================================================================

Nicolas THOME has been an assistant professor at UPMC/LIP6 - DAPA MALIRE Multimedia Group since 2008. He has a B.Sc. from the Ecole Nationale Supérieure de Physique de Strasbourg (ENSPS, 2002; http://www-ensps.u-strasbg.fr/) and a Ph.D. from the Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS, 2007). He is interested in the design and learning of complex image representations and similarities, with applications to image/video understanding. He participates in a CAPES/COFECUB project involving several universities in Brazil and France, among them UNICAMP and UPMC - Paris 6.

=====================================================================

Organizer: Prof. Dr. Maria Beatriz Felgar de Toledo (beatriz@ic.unicamp.br)
IC / Unicamp
Phone: (019) 3521-5869

=====================================================================
