The French National Research Agency Projects for science


ANR funded project

Réseaux, technologies logicielles, cybersécurité et Sécurité globale (CE28)
Edition 2014


An Operational Automatic Framework for Identity Document Fraud Detection and Profiling


Objectives, originality and novelty of the project
The core idea of the IDFRAud project is to provide an automatic verification system for identity documents that replaces existing manual verification processes. The components of ID analysis and verification in IDFRAud are driven by a set of control rules. To guarantee interpretable and adaptive behavior at each ID analysis step, the identity document descriptions are organized by a knowledge management module. This module makes it easy to add new document descriptions and to complete the descriptions of existing documents, which keeps the entire system flexible and evolving and enriches the set of rules the module generates. Moreover, the IDFRAud system includes a data analyzer that detects forensic links in datasets of fake IDs.
Thus, the IDFRAud project aims to propose an operational and automatic framework for the detection and analysis of fraudulent identity documents. Its novelty lies in being an evolving system with increasingly exhaustive automatic controls. This capacity to evolve relies on the interconnection of three tasks: ID verification, ID knowledge management, and ID fraud analysis.

Scientific approaches
A substantial evaluation of automatic identity document classification was conducted in collaboration with the LinkMedia team. Several feature extraction methods were evaluated. The main difficulty is the scarcity of document images, which makes classical machine learning methods inadequate for this problem. A new method was proposed that learns classification models from a single reference image. Both model creation and classification are based on the extraction of local image descriptors (SURF).
Modelling as many identity documents as possible is one of the most important goals of the project. A preliminary knowledge base has been built from identity document models. This was achieved during an internship in the LIS team, where Semantic Web technologies were adopted to generate a document knowledge base from the PRADO database. Not all knowledge can be generated automatically; some of it must be entered by domain experts. Two cases arise within this project: (1) the creation or completion of document models and (2) the description of fake documents with a view to their analysis.
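As a rough illustration of classifying from a single reference image per class, the sketch below matches the local descriptors of a query image against one reference descriptor set per document class and keeps the class with the most unambiguous matches. It is not the project's method: SURF extraction is not shown, descriptors are assumed to be precomputed fixed-length vectors (here synthetic), and the ratio test, the `min_matches` threshold and all function names are assumptions made for the example.

```python
import math
import random

def ratio_test_matches(query, reference, ratio=0.75):
    """Count query descriptors whose nearest reference descriptor is
    clearly closer than the second nearest one (ratio test)."""
    count = 0
    for q in query:
        dists = sorted(math.dist(q, r) for r in reference)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            count += 1
    return count

def classify(query, models, min_matches=5):
    """Return the class with the most matches, or None if even the
    best class has fewer than min_matches (hypothetical threshold)."""
    scores = {name: ratio_test_matches(query, desc) for name, desc in models.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_matches else None

def random_descriptors(rng, n, dim=16):
    """Synthetic stand-in for descriptors extracted from a reference image."""
    return [tuple(rng.random() for _ in range(dim)) for _ in range(n)]

def add_noise(rng, descriptors, sigma=0.01):
    """Simulate a new capture of the same document model."""
    return [tuple(v + rng.gauss(0, sigma) for v in d) for d in descriptors]

rng = random.Random(0)
# One reference descriptor set per class: a single reference image each.
models = {
    "id_card": random_descriptors(rng, 40),
    "passport": random_descriptors(rng, 40),
}
query = add_noise(rng, models["passport"][:25])
label = classify(query, models)
print(label)
```

Because each class needs only one reference descriptor set, a new document model can be added without collecting a training corpus, which is the constraint the paragraph above describes.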
In addition, many clustering and data analysis methods were explored to analyse the grouping of fake documents.
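The source does not say which clustering methods were explored. As one simple illustration of grouping fake documents, the sketch below links two fakes when they share at least `min_shared` attribute values and reads the connected components as candidate groups. The attribute names (`paper`, `ink`, `font`), their values, and the threshold are hypothetical.

```python
from itertools import combinations

def shared_traits(a, b):
    """Number of attributes on which two fake documents agree."""
    return sum(1 for k in a if k in b and a[k] == b[k])

def link_groups(fakes, min_shared=2):
    """Group fakes into connected components of the 'shares enough traits' graph."""
    ids = list(fakes)
    parent = {i: i for i in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(ids, 2):
        if shared_traits(fakes[a], fakes[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical descriptions of four seized fake documents.
fakes = {
    "F1": {"paper": "p1", "ink": "i1", "font": "f1"},
    "F2": {"paper": "p1", "ink": "i1", "font": "f2"},
    "F3": {"paper": "p2", "ink": "i1", "font": "f2"},
    "F4": {"paper": "p3", "ink": "i3", "font": "f3"},
}
groups = link_groups(fakes)
print(groups)
```

In this toy data F1 and F3 share only one trait, yet they end up in the same group through F2, which is the kind of indirect link a case-by-case review can miss.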


A first version of the proposed classifier is already in use. It discriminates between ten classes (French identity documents) with a recognition rate of 98%. New security checks have also been added with the help of the DCPAF and through the analysis of real cases of fake documents.
At this stage, a functional prototype of 'Formulis' exists and has been tested by police officers of the IRCGN to describe fake Portuguese identity documents. In this context, AriadNEXT created a new framework, 'AutoRDF', which makes data in RDF formats easier to manipulate by automatically generating the corresponding C++ code from an OWL ontology.


The classifier's coverage is expected to widen in the coming months to include more document models from countries bordering France. Additional security checks will be added as new models enter the knowledge base. This will be coupled with a substantial extension of the document model descriptions in the knowledge base.
LIS and AriadNEXT have started to reflect and collaborate on generating workflows that coordinate the different document analysis modules. The main objective is to improve flexibility and maintainability compared to the existing solution, which is manual. Flexibility is needed to adapt to different sets of documents, to different devices, and to different levels of analysis.

Scientific outputs and patents

International journals:
1. Sébastien Ferré, Sparklis: An Expressive Query Builder for SPARQL Endpoints with Guidance in Natural Language. Semantic Web: Interoperability, Usability, Applicability, 2016. IOS Press.
International conferences:
1. F. Chevalier. AutoRDF - Using OWL as an Object Graph Mapping (OGM) specification language, Extended Semantic Web Conference, demo, (2016)
2. Sébastien Ferré, Peggy Cellier. Graph-FCA in Practice. Int. Conf. Conceptual Structures, 2016: 107-121. Springer.
3. Sébastien Ferré. Bridging the Gap Between Formal Languages and Natural Languages with Zippers. Extended Semantic Web Conference, 2016: 269-284. Springer.
4. Sébastien Ferré. A Proposal for Extending Formal Concept Analysis to Knowledge Graphs. Int. Conf. Formal Concept Analysis (ICFCA), LNCS 9113, pages 271-286, 2015. Springer.
National conferences:
1. Ahmad Montaser Awal and Abdullah Almaksour. Classification et extraction des documents complexes à partir des images issues d’un périphérique mobile : Application aux documents d’identité, Colloque International Francophone sur l’Ecrit et le Document, 575-588 (2016)
2. Sébastien Ferré. Conception interactive d'ontologies par élimination de mondes possibles. In Ingénierie des connaissances (IC), 2015.



ENSP Ecole Nationale Supérieure de Police

Pôle Judiciaire de la Gendarmerie Nationale

Université de Rennes 1 / IRISA Université de Rennes 1 / Institut de Recherche en Informatique et Systèmes Aléatoires

ANR grant: 905 433 euros
Beginning and duration: October 2014 - 36 months

Submission abstract

Identity-related fraud represents a major risk to the safety of society given its serious consequences. These consequences range from small but very frequent frauds (telecom contracts, small credits, etc.) to transnational organized crime and terrorist actions. An increasing number of false identity documents have been detected during the last few years, according to several official studies around the globe. The fast development of such criminal activities can be explained by easy, public access to advanced technologies. Several studies have reported the organized nature of ID fraud activities and the growth of this black market.
To fight identity-related fraud, traditional investigation methods applied to identity documents (ID) rely on the presence of an expert, which significantly limits the spread of such important verification in many administrative and commercial entities. In addition, existing ID control tools have shown several limitations, such as high false positive rates (rejection of valid documents), partial controls, or no capacity to evolve (new ID models, new control rules). Because of these shortcomings, automatic verification tools have not been widely used and their role has been reduced to data storage and simple assistance tasks.
When ID frauds are detected, it is important to discover forensic links in order to identify the source of those frauds. Current investigation methods do not sufficiently address this problem and are still based on case-by-case approaches with no global analysis. However, an efficient automatic fraud profiling system that enables ID fraud link detection will certainly be of great benefit to anti-fraud authorities and will help to uncover many forgery networks worldwide.
The core idea of IDFRAud, our proposal for an industrial research project, is to establish a virtuous circle between two processes: (1) the automatic verification of ID documents, and (2) the automatic profiling of ID frauds. The first process applies control rules to ID documents in order to check their validity, and sends detected ID frauds to the second process, which analyzes them in order to discover forensic links (fraud profiling) and to enhance the ID control rules. Control rules are stored and maintained in a knowledge base in order to facilitate the evolution of the system. The knowledge base is also fed with existing repositories of ID document models (like PRADO) and ID frauds. Adding new control rules enables more robust future ID controls, which in turn enable the detection of more ID frauds and forensic links.
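The loop between verification and profiling can be sketched as a small rule store keyed by document model. This is only an illustration under stated assumptions: the class names, the model identifier `MODEL_A`, the document fields and both rules are hypothetical, and the profiling step that proposes the new rule is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class ControlRule:
    name: str
    check: Callable[[dict], bool]  # returns True when the document passes

@dataclass
class KnowledgeBase:
    # Control rules stored per document model (hypothetical structure).
    rules: Dict[str, List[ControlRule]] = field(default_factory=dict)

    def add_rule(self, model: str, rule: ControlRule) -> None:
        self.rules.setdefault(model, []).append(rule)

    def verify(self, model: str, document: dict) -> List[str]:
        """Return the names of the control rules the document fails."""
        return [r.name for r in self.rules.get(model, []) if not r.check(document)]

kb = KnowledgeBase()
kb.add_rule("MODEL_A", ControlRule(
    "expiry_after_issue", lambda d: d["expiry_year"] > d["issue_year"]))

suspect = {"issue_year": 2012, "expiry_year": 2022, "font": "font_x"}
before = kb.verify("MODEL_A", suspect)  # no rule covers the font yet

# Assumption: fraud profiling found that "font_x" only appears on fakes,
# so a new control rule is fed back into the knowledge base.
kb.add_rule("MODEL_A", ControlRule(
    "font_is_genuine", lambda d: d["font"] != "font_x"))
after = kb.verify("MODEL_A", suspect)

print(before, after)
```

The same document passes before the rule is added and fails afterwards, which is the feedback effect the paragraph describes: each rule learned from profiled frauds tightens later controls.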
The first originality of IDFRAud is to propose an automatic solution for ID verification that can handle documents issued by a large set of countries. The solution will be able to execute specific controls according to the ID model (type, country, generation, etc.) thanks to a knowledge base. Modeling ID content and rules is one of the main originalities of IDFRAud. To the best of our knowledge, there is no existing formal description of ID documents, and existing public and industrial ID knowledge bases cannot be directly used for automatic reading and verification.
Automatic ID fraud profiling represents a major ambition of IDFRAud. Experts from national security authorities, along with academic and industrial partners, will work side by side to propose the first data analysis solution dedicated to ID forensic link detection. Such an intelligent solution aims to replace the tedious manual analysis that can hardly cope with high-dimensional, evolving datasets of false IDs.


ANR Programme: Réseaux, technologies logicielles, cybersécurité et Sécurité globale (CE28) 2014

Project ID: ANR-14-CE28-0012

Project coordinator:
Mr. Ahmad Montaser AWAL (AriadNEXT)

Project web site:




The project coordinator is the author of this abstract and is therefore responsible for the content of the summary. The ANR disclaims all responsibility in connection with its content.