The French National Research Agency Projects for science

ANR funded project

Contenus et Interactions (CONTINT) 2010
DIGIDOC project

Document Image diGitisation with Interactive DescriptiOn Capability

The DIGIDOC project addresses document digitization, and more precisely the digitization of old and precious documents. In a global context where many large projects are devoted to preserving cultural heritage, DIGIDOC aims to improve one specific point: the image acquisition step. We focus on this first step in order to improve and simplify later use of the digital documents (storage, text recognition, document retrieval, etc.). Our approach is to bring a priori knowledge about the documents to be digitized, and about how they will be used, into the image acquisition step.
To reach this objective, we propose to insert an additional module into scanners that provides a set of intermediate-level descriptors computed from the digitized image. This metadata will then be used to better acquire, store, analyze and index the digitized documents. In particular, it should make it possible to quantify how well a given digitization of a document matches its intended use. Defining such a set of features and integrating it into a new digitized-document format is the main objective of the project. This new format will be the basis for new procedures for interacting with scanners and for new document analysis tools. A first application will aim to simplify the choice of scanner parameters by semi-automatically adapting them to the document characteristics and to the needs of the end users. A second application will be to quantify the quality of existing document images.

These objectives fall squarely within the topics of the « Contenus et Interactions » call, as they contribute to defining a new file format dedicated to describing the contents of digitized documents. This description can be used to ease and improve the storage, processing, comparison and indexing of document images. The project brings together research laboratories (LaBRI Bordeaux, LI Tours, L3I La Rochelle, LITIS Rouen), industry partners (i2S Bordeaux, Arkhênum Bordeaux) and end users (BNF).

Partners

Arkhênum ARKHENUM

BNF BIBLIOTHEQUE NATIONALE DE FRANCE

i2S i2S

L3I UNIVERSITE DE LA ROCHELLE

LaBRI UNIVERSITE BORDEAUX I

LITIS UNIVERSITE DE ROUEN [HAUTE-NORMANDIE]

LI UNIVERSITE DE TOURS [FRANCOIS RABELAIS]

ANR grant: 866 159 euros
Beginning and duration: - 42 months

 

ANR Programme: Contenus et Interactions (CONTINT) 2010

Project ID: ANR-10-CORD-0020

Project coordinator:
Mr Jean-Philippe DOMENGER (UNIVERSITE BORDEAUX I)
domenger@nulllabri.fr

 

The project coordinator is the author of this abstract and is therefore responsible for the content of the summary. The ANR disclaims all responsibility in connection with its content.