CONTINT - Contenus et Interactions

Document Image diGitisation with Interactive DescriptiOn Capability – DIGIDOC

Submission summary

The DIGIDOC project belongs to the field of document digitization, and more precisely the digitization of old and valuable documents. In a global context where many large projects are devoted to preserving cultural heritage, DIGIDOC aims to improve one specific step: image acquisition. We focus on this first step in order to improve and simplify later uses of the digital documents (storage, text recognition, document retrieval, etc.). Our approach is to take into account, during image acquisition, a priori knowledge about the documents to be digitized and about how they will be used.
To reach this objective, we propose to add a module to scanners that provides a set of intermediate-level descriptors computed from the digitized image. This metadata will then be used to better acquire, store, analyze and index the digitized documents. In particular, it should make it possible to quantify how well a given digitization suits its intended use. Defining such a set of features and integrating it into a new format for digitized documents is the main objective of the project. This new format will be the basis for new procedures for interacting with scanners and for new document analysis tools. A first application will aim to simplify the choice of scanner parameters by semi-automatically adapting them to the characteristics of the document and to the needs of the end users. A second application will be to quantify the quality of existing document images.

These objectives fall squarely within the scope of the « Contenu et interaction » call, as they contribute to defining a new file format dedicated to describing the contents of digitized documents. This description can be used to ease and improve the storage, processing, comparison and indexing of document images. The project brings together research laboratories (LaBRI Bordeaux, LI Tours, L3I La Rochelle, LITIS Rouen), industry partners (i2S Bordeaux, Arkhênum Bordeaux) and end users (BNF).

Project coordination

Jean-Philippe DOMENGER (UNIVERSITE BORDEAUX I) – domenger@labri.fr

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its contents.

Partners

i2S i2S
Arkhênum ARKHENUM
LaBRI UNIVERSITE BORDEAUX I
LI UNIVERSITE DE TOURS [FRANCOIS RABELAIS]
LITIS UNIVERSITE DE ROUEN [HAUTE-NORMANDIE]
L3I UNIVERSITE DE LA ROCHELLE
BNF BIBLIOTHEQUE NATIONALE DE FRANCE

ANR funding: 866,159 euros
Beginning and duration of the scientific project: - 42 Months
