The French National Research Agency Projects for science


ANR funded project

Corpus, données et outils de la recherche en sciences humaines et sociales (Corpus) 2012

Ontology Research, Image Features, Letterform Analysis on Multilingual Medieval Scripts

The ORIFLAMMS project (Ontology Research, Image Features, Letterform Analysis on Multilingual Medieval Scripts) brings together three public research units in the Humanities, three research units in Engineering, Information Sciences and Technologies, and an industrial company in order to enhance our knowledge of medieval scripts and multilingualism through a new, interdisciplinary approach.
Combining scientific, technological, industrial and societal issues, ORIFLAMMS aims to analyze the evolution of writing systems and graphical forms over a long period (the Middle Ages) and according to their production contexts (informal, documentary, book scripts) and languages (Latin or vernacular). It aims to establish an ontology of forms, to analyze the graphical structures of scripts, and to enrich a linear, textual approach with a visual, two- or three-dimensional one. This will yield new knowledge for linguistics and the history of scripts (palaeography, epigraphy, diplomatics).
ORIFLAMMS will first gather and harmonize several research corpora, then expand and enhance them, in order to create a new Reference Corpus covering the diversity of medieval scripts: from handwriting to print, from informal drafts to monumental inscriptions, from Carolingian times to the eve of the Renaissance, from theology and liturgy to chancery rolls and accounts.
This Reference Corpus, unique in the breadth of its content, will be freely accessible and will provide not only images but also graphically analyzed transcriptions (allographetic transcriptions). The text will also be aligned with the image (through pixel coordinates on the image). All data will be stored in an interoperable XML-TEI file, for long-term preservation of and access to the digital information.
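As a rough illustration of what text-image alignment in XML-TEI can look like, the sketch below reads a tiny, invented TEI fragment and pairs each transcribed word with its pixel box. It assumes the standard TEI facsimile mechanism (`zone` elements carrying `ulx`, `uly`, `lrx`, `lry` coordinates, referenced from the text through `facs`); the abstract does not describe the project's actual encoding, so the sample data, the file name `page1.jpg` and the helper `word_boxes` are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Invented sample: two words, each linked to a rectangular zone on the image.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <surface>
      <graphic url="page1.jpg"/>
      <zone xml:id="z1" ulx="120" uly="80" lrx="210" lry="130"/>
      <zone xml:id="z2" ulx="220" uly="82" lrx="330" lry="131"/>
    </surface>
  </facsimile>
  <text><body><p>
    <w facs="#z1">In</w> <w facs="#z2">nomine</w>
  </p></body></text>
</TEI>"""


def word_boxes(tei_xml):
    """Return (word, (ulx, uly, lrx, lry)) pairs for every aligned word."""
    root = ET.fromstring(tei_xml)

    # Index every zone by its xml:id, storing its pixel coordinates.
    zones = {}
    for zone in root.iter(f"{{{TEI_NS}}}zone"):
        zone_id = zone.get(f"{{{XML_NS}}}id")
        zones[zone_id] = tuple(
            int(zone.get(key)) for key in ("ulx", "uly", "lrx", "lry")
        )

    # Resolve each word's facs pointer ("#z1") to the matching zone.
    pairs = []
    for word in root.iter(f"{{{TEI_NS}}}w"):
        ref = word.get("facs", "").lstrip("#")
        if ref in zones:
            pairs.append((word.text, zones[ref]))
    return pairs


for text, box in word_boxes(SAMPLE):
    print(text, box)
```

Keeping the coordinates in the facsimile section and the pointers in the transcription means the same image regions can be reused by several layers of annotation, which is one reason this layout is common in TEI-based corpora.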
The Reference Corpus will create a concordance of all written forms in the Middle Ages. To create this concordance and move to a large-scale Humanities computing project, ORIFLAMMS will develop innovative image analysis tools: it will improve the methods for aligning image and text and create a computer-aided transcription tool for medieval scripts. This software will be open source and documented.
Because working with large-scale, richly encoded data and getting the research community to adopt new tools are both challenging, the new software will be developed by its end users (Humanities researchers) in a consortium with a private company, to ensure that it can be offered to a wider audience and meets ergonomic and usability standards.
The innovative tool is not only about producing new encoded data: it is also part of exploiting them. ORIFLAMMS plans a new method for the study of scripts: analyzing graphical variability. This variability will be studied through image analysis on a two-dimensional level, and through computational linguistics for the variability of morphosyntactic and graphical codes in Latin and the vernacular.
The open source computational linguistics software will be upgraded and documented as part of this project.
ORIFLAMMS plans the creation of a Reference Corpus, with images of scripts and transcribed, graphically analyzed texts from representative places and dates of medieval culture, in interoperable formats. It will create innovative, open source tools for image processing and analysis as well as for statistics and text analysis. It will create new knowledge about the evolution of writing in the multilingual Middle Ages. It will offer new technologies and approaches for analyzing handwritten texts in a digital context. It will enhance the understanding of scribal processes for anthropologists, educators and neurocognitive scientists.


A2iA A2iA : Analyse d'Image & Intelligence Artificielle

CESCM - UMR7302 Centre d'études supérieures de civilisation médiévale (CESCM)

IRHT Délégation Régionale Ouest et Nord

ENC - EA3624 École Nationale des Chartes - Centre Jean Mabillon

IRHT - UPR841 Institut de Recherche et d'Histoire des Textes

INSA DE LYON - LIRIS Institut National des Sciences Appliquées de Lyon Laboratoire d'Informatique en Images et Systèmes d'Information (LIRIS)

ICAR - UMR5191 Interactions, Corpus, Apprentissage, Représentations (ICAR)

LIPADE-EA2517 Laboratoire d'Informatique Paris Descartes (LIPADE)

ANR grant: 240 993 euros
Beginning and duration: February 2013 - 36 months


ANR Programme: Corpus, données et outils de la recherche en sciences humaines et sociales (Corpus) 2012

Project ID: ANR-12-CORP-0010

Project coordinator:
Mr Dominique Stutzmann (Délégation Régionale Ouest et Nord)




The project coordinator is the author of this abstract and is therefore responsible for the content of the summary. The ANR disclaims all responsibility in connection with its content.