Analysis and Understanding of Document Images in Network Media

The amount of multimedia data on social network media is growing rapidly. With such large collections of weakly structured content, current information retrieval methods have difficulty mining the data. The objective of the proposed project is to develop a system for mining and retrieving heterogeneous documents, focusing mainly on weakly structured documents such as born-digital documents and scene images with embedded text. Analyzing the contents of such documents is very challenging because of complex backgrounds, complex layouts, perspective distortion, lighting variation, defocus, variation in font type, size and color, mixed graphics and text, multiple languages within the same text, and sometimes low resolution.

The research plan is organized into complementary parts that together form the pipeline of a complete system. First, images of different types are received as input and classified by the “fast image categorization” part. Then, scene images will be analyzed by the “scene text detection and extraction” part, whereas born-digital documents will be analyzed by the “layout analysis and page segmentation” part. The text extracted from the different image types by these two parts will be analyzed by the “multi-lingual text recognition” part. Finally, the “contextual interpretation and information integration” part will combine the information produced by the previous parts and integrate it to reach a meaningful representation of the document database.
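The routing between the parts described above can be illustrated with a minimal sketch. It only shows the order in which the parts are applied and how the two image types branch. Every name in it (RawImage, ImageType, run_pipeline, the is_scene flag, and the stage functions) is hypothetical and chosen for illustration; the stage bodies are placeholders that record their own name and do not reflect the project's actual methods.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class ImageType(Enum):
    SCENE = "scene"
    BORN_DIGITAL = "born_digital"


@dataclass
class RawImage:
    image_id: str
    is_scene: bool  # hypothetical stand-in for real image content
    stages: List[str] = field(default_factory=list)  # parts applied so far, in order


def categorize(img: RawImage) -> ImageType:
    # Placeholder for "fast image categorization": reads the flag instead of pixels.
    img.stages.append("fast image categorization")
    return ImageType.SCENE if img.is_scene else ImageType.BORN_DIGITAL


def scene_text_detection(img: RawImage) -> RawImage:
    img.stages.append("scene text detection and extraction")
    return img


def layout_analysis(img: RawImage) -> RawImage:
    img.stages.append("layout analysis and page segmentation")
    return img


def text_recognition(img: RawImage) -> RawImage:
    img.stages.append("multi-lingual text recognition")
    return img


# Each image type is routed to its own text extraction part.
EXTRACTORS: Dict[ImageType, Callable[[RawImage], RawImage]] = {
    ImageType.SCENE: scene_text_detection,
    ImageType.BORN_DIGITAL: layout_analysis,
}


def run_pipeline(images: List[RawImage]) -> Dict[str, List[str]]:
    processed = []
    for img in images:
        extractor = EXTRACTORS[categorize(img)]
        processed.append(text_recognition(extractor(img)))
    # Placeholder for "contextual interpretation and information integration":
    # here the per-image results are simply gathered into one mapping.
    return {img.image_id: img.stages for img in processed}


if __name__ == "__main__":
    for image_id, stages in run_pipeline(
        [RawImage("street_sign", True), RawImage("web_banner", False)]
    ).items():
        print(image_id, "->", " | ".join(stages))
```

The dictionary lookup keeps the branch between scene images and born-digital documents explicit, and the two branches rejoin at the shared recognition step, as in the description above.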

The two project partners will collaborate on solving the different problems in accordance with their respective expertise. The expected research achievements in analyzing the contents of document images in network media will provide research experience and visibility to the partners, and will be very useful for social applications such as interactive tourist guidance, as well as for cyber security and commercial data mining.


NLPR Institute of Automation of Chinese Academy of Sciences

L3i Laboratoire Informatique, Image, Interactions

ANR grant: 244 296 euros
Beginning and duration: October 2014 - 48 months


ANR Programme: Interactions des mondes physiques, de l'humain et du monde numérique (DS0707) 2014

Project ID: ANR-14-CE24-0031

Project coordinator:
Mr. Jean-Marc OGIER (Laboratoire Informatique, Image, Interactions)


