Corpus - Corpus, données et outils de la recherche en sciences humaines et sociales (Corpora, data and tools for research in the humanities and social sciences)

Ontology Research, Image Features, Letterform Analysis on Multilingual Medieval Scripts – ORIFLAMMS

Submission summary

The ORIFLAMMS project (Ontology Research, Image Features, Letterform Analysis on Multilingual Medieval Scripts) brings together three public research units in the humanities, three research units in engineering and information sciences and technologies, and an industrial company, in order to advance our knowledge of medieval scripts and multilingualism through a new, interdisciplinary approach.
Combining scientific, technological, industrial and societal issues, ORIFLAMMS aims to analyze the evolution of writing systems and graphical forms over a long period (the Middle Ages) and according to their production contexts (informal, documentary, book scripts) and languages (Latin or vernacular). It aims to establish an ontology of forms, to analyze the graphical structures of scripts, and to complement a linear, textual approach with a visual, two- or three-dimensional one. This will yield new knowledge for linguistics and the history of scripts (palaeography, epigraphy, diplomatics).
ORIFLAMMS will first gather and harmonize several research corpora, then expand and enrich them, in order to create a new Reference Corpus covering the diversity of medieval scripts: from handwriting to print, from informal drafts to monumental inscriptions, from Carolingian times to the eve of the Renaissance, and from theology and liturgy to chancery rolls and accounts.
This Reference Corpus, unique in the breadth of its content, will be freely accessible and will provide not only images but also graphically analyzed transcriptions (allographetic transcriptions). The text will also be aligned with the image (through pixel coordinates on the image). All data will be stored in an interoperable XML-TEI file for long-term digital preservation and access.
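As an illustration of what text-image alignment in XML-TEI can look like, here is a minimal sketch in Python. It assumes the standard TEI P5 facsimile mechanism (zone elements carrying the pixel attributes ulx, uly, lrx and lry, referenced from words through the facs attribute); the summary does not specify the project's actual encoding, and the sample fragment, the file name page1.jpg and the function name word_boxes are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical, simplified fragment (no teiHeader): two image zones given in
# pixel coordinates, and two transcribed words that point at those zones.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <surface>
      <graphic url="page1.jpg"/>
      <zone xml:id="z1" ulx="120" uly="80" lrx="210" lry="130"/>
      <zone xml:id="z2" ulx="220" uly="82" lrx="340" lry="131"/>
    </surface>
  </facsimile>
  <text>
    <body>
      <p><w facs="#z1">In</w> <w facs="#z2">nomine</w></p>
    </body>
  </text>
</TEI>"""


def word_boxes(tei_xml):
    """Return (word, (ulx, uly, lrx, lry)) pairs for every aligned word."""
    root = ET.fromstring(tei_xml)

    # Index each zone by its xml:id, storing its bounding box in pixels.
    zones = {}
    for zone in root.iter(f"{{{TEI_NS}}}zone"):
        zone_id = zone.get(f"{{{XML_NS}}}id")
        box = tuple(int(zone.get(k)) for k in ("ulx", "uly", "lrx", "lry"))
        zones[zone_id] = box

    # Resolve each word's facs pointer ("#z1") to the matching zone.
    pairs = []
    for word in root.iter(f"{{{TEI_NS}}}w"):
        target = word.get("facs", "").lstrip("#")
        if target in zones:
            pairs.append((word.text, zones[target]))
    return pairs


if __name__ == "__main__":
    for text, box in word_boxes(SAMPLE):
        print(text, box)
```

Keeping the coordinates in the facsimile section and the pointers in the transcription lets the same file serve both as a readable text and as an index into the image, which is one way to obtain the interoperability described above.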
The Reference Corpus will provide a concordance of all written forms in the Middle Ages. To create this concordance and move to a large-scale humanities computing project, ORIFLAMMS will develop innovative image analysis tools: it will improve the methods for aligning image and text and create a computer-aided transcription tool for medieval scripts. This software will be open source and documented.
Because working with large-scale, richly encoded data and adopting new tools are both challenges for the research community, the new software will be developed by its end users (humanities researchers) in a consortium with a private company, to ensure that it can be offered to a wider audience and meets ergonomic and usability standards.
The new tool will not only produce new encoded data; it is also part of how those data are exploited. ORIFLAMMS plans a new method for the study of scripts: analyzing graphical variability. This variability will be examined through two-dimensional image analysis and, for the morphosyntactic and graphical codes of Latin and the vernacular languages, through computational linguistics.
The open-source computational linguistics software will be upgraded and documented as part of this project.
ORIFLAMMS plans to create a Reference Corpus containing images of scripts and transcribed, graphically analyzed texts from representative places and dates of medieval culture, in interoperable formats. It will create innovative, open-source tools for image processing and analysis as well as for statistics and text analysis. It will create new knowledge about the evolution of writing in the multilingual Middle Ages. It will offer new technologies and approaches for analyzing handwritten texts in a digital context. It will improve the understanding of scribal processes for anthropologists, educators and neurocognitive scientists.

Project coordination

Dominique STUTZMANN (Délégation Régionale Ouest et Nord)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

INSA DE LYON - LIRIS Institut National des Sciences Appliquées de Lyon Laboratoire d'Informatique en Images et Systèmes d'Information (LIRIS)
A2iA A2iA : Analyse d'Image & Intelligence Artificielle
ENC - EA3624 École Nationale des Chartes - Centre Jean Mabillon
LIPADE-EA2517 Laboratoire d'Informatique Paris Descartes (LIPADE)
IRHT Délégation Régionale Ouest et Nord
CESCM - UMR7302 Centre d'études supérieures de civilisation médiévale (CESCM)
ICAR - UMR5191 Interactions, Corpus, Apprentissage, Représentations (ICAR)
IRHT - UPR841 Institut de Recherche et d'Histoire des Textes

ANR funding: 240,993 euros
Start and duration of the scientific project: January 2013, 36 months
