The French National Research Agency Projects for science


ANR funded project

Human-machine interaction, connected objects, digital content, big data and knowledge (DS0707) 2015
Project GoAsQ

Generating and Answering Ontological Queries over Semi-structured Medical Data

More and more information on individuals (e.g., persons, events, biological objects) is available electronically in structured or semi-structured form. However, manually selecting individuals that satisfy certain constraints based on such data is a complex, error-prone, and time- and personnel-intensive effort. For this reason, tools that can automatically or semi-automatically answer questions based on the available data need to be developed. While simple questions can be expressed and answered directly using keywords in natural language, complex questions that refer to type and relational information increase the precision of the retrieved results and thus reduce the effort needed for subsequent manual verification. One example of this situation is the setting where electronic patient records are used to find patients satisfying non-trivial combinations of certain properties, such as eligibility criteria for clinical trials. Another example, which will also be considered as a use case in this project, is the setting where a student asks the examination office questions about study and examination regulations. In both cases, the original question is formulated in natural language.

In the GoAsQ project, we will investigate, compare, and finally combine two different approaches for answering questions formulated in natural language over textual, semi-structured, and structured data. The first approach is text-based question answering, which answers natural language questions directly using natural language processing and information extraction techniques. The second translates the natural language questions into formal, database-like queries and then answers these formal queries with respect to a domain-dependent ontology using database techniques. The automatic translation is required because it would be quite hard for the people asking the questions (e.g., medical doctors, students) to formulate them as formal queries. The ontology makes it possible to overcome the potential semantic mismatch between the person producing the source data (e.g., the GPs writing the clinical notes) and the person formulating the question (e.g., the researcher formulating the trial criteria). GoAsQ can thus leverage recent advances from the ontology community on accessing data through ontologies, known as ontology-based query answering (OBQA). More precisely, in Task 1 of the project we will investigate the two use cases mentioned above (eligibility criteria; study regulations). In Task 2 we will introduce and analyze the extensions to existing formal query languages that these use cases require. Task 3 will develop techniques for extracting formal queries from textual queries. Task 4 will evaluate the resulting approach, compare it with approaches for text-based question answering, and develop a hybrid approach that combines the advantages of both.
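
To illustrate how an ontology can bridge the semantic mismatch described above, here is a minimal sketch in Python. It is not the project's method: the concept names, the patient identifiers, and the functions `ancestors` and `answer` are all hypothetical, and the "ontology" is reduced to a plain subclass hierarchy. The point is only that a query for a general concept can retrieve records that mention more specific ones.

```python
# Hypothetical toy ontology: each concept maps to its direct super-concepts.
SUBCLASS_OF = {
    "MyocardialInfarction": {"HeartDisease"},
    "Arrhythmia": {"HeartDisease"},
    "HeartDisease": {"CardiovascularDisease"},
}

# Hypothetical facts extracted from patient records: (patient, diagnosed concept).
DIAGNOSES = [
    ("patient1", "MyocardialInfarction"),
    ("patient2", "Arrhythmia"),
    ("patient3", "Asthma"),
]


def ancestors(concept):
    """Return the concept together with all its direct and indirect super-concepts."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def answer(query_concept):
    """Return the patients whose diagnosis falls under query_concept in the ontology."""
    return sorted({p for p, c in DIAGNOSES if query_concept in ancestors(c)})


# The clinical notes never mention "HeartDisease" literally, yet the
# subclass axioms let the query retrieve the matching patients.
print(answer("HeartDisease"))  # ['patient1', 'patient2']
```

A plain keyword match on "HeartDisease" would return nothing for this data, whereas the ontology-aware lookup finds both patients. Realistic OBQA systems rely on much richer ontology languages and query rewriting techniques than this sketch shows.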


LIMSI Laboratoire d'informatique pour la mécanique et les sciences de l'ingénieur


UPSUD/LRI Université Paris Sud/Laboratoire de Recherche en Informatique

ANR grant: 271 134 euros
Beginning and duration: December 2015 - 36 months


ANR Programme: Human-machine interaction, connected objects, digital content, big data and knowledge (DS0707) 2015

Project ID: ANR-15-CE23-0022

Project coordinator:
Ms. Yue Ma (Université Paris Sud/Laboratoire de Recherche en Informatique)




The project coordinator is the author of this abstract and is therefore responsible for the content of the summary. The ANR disclaims all responsibility in connection with its content.