The French National Research Agency - Projects for science


ANR funded project

Contenus numériques et interactions (CONTINT) 2013

Semantic visual analysis and 3D reconstruction of urban environments

The goal of the SEMAPOLIS project is to develop advanced large-scale image analysis and learning techniques to semantize city images and produce semantized 3D reconstructions of urban environments, including proper rendering.

Geometric 3D models of existing cities have a wide range of applications, such as navigation in virtual environments and realistic scenery for video games and movies. A number of players (Google, Microsoft, Apple) have started to produce such data. However, these models feature only plain surfaces, textured from available pictures. This limits their use in urban studies and in the construction industry, and in practice rules out applications to diagnosis and simulation. Moreover, geometry and texturing are often wrong where parts are invisible or discontinuous, e.g., behind occluding foreground objects such as trees, cars or lampposts, which are pervasive in urban scenes.

We aim to go beyond this by producing semantized 3D models, i.e., models that are not bare surfaces but that identify architectural elements such as windows, walls, roofs and doors. The semantic priors we use to analyze images will also let us reconstruct plausible geometry and rendering for invisible parts. Semantic information is useful in a larger number of scenarios, including diagnosis and simulation for building renovation projects, accurate shadow impact assessment that takes actual window locations into account, and more general urban planning and studies such as solar cell deployment. Another line of applications concerns improved virtual cities for navigation, with object-specific rendering, e.g., specular surfaces for windows. Models can also be made more compact by encoding object repetition (e.g., windows) rather than individual instances and by replacing actual textures with more generic ones according to semantics; this enables cheap and fast transmission over low-bandwidth mobile phone networks, and efficient storage in GPS navigation devices.
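As a rough illustration of the compaction idea, the following Python sketch stores one prototype per repeated element together with the list of its placements, instead of one full record per instance. The `Element` class, its fields and the `compact` function are hypothetical names introduced for this example only; the representation actually used by the project is not described in this abstract.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical data model: names and fields are illustrative assumptions,
# not part of the SEMAPOLIS project.
@dataclass(frozen=True)
class Element:
    label: str      # semantic class, e.g. "window" or "door"
    width: float    # metres
    height: float   # metres
    x: float        # position on the facade plane, metres
    y: float


def compact(elements):
    """Group elements sharing a label and size into one prototype plus its placements."""
    groups = defaultdict(list)
    for e in elements:
        groups[(e.label, e.width, e.height)].append((e.x, e.y))
    return dict(groups)


# A facade with a 3 x 4 grid of identical windows and a single door.
facade = [Element("window", 1.2, 1.8, 2.0 * col, 3.0 * row + 4.0)
          for row in range(3) for col in range(4)]
facade.append(Element("door", 1.0, 2.2, 3.0, 0.0))

model = compact(facade)
print(len(facade), "instances ->", len(model), "prototypes")
```

In this toy facade, 13 element instances reduce to 2 prototypes; each prototype's geometry and generic texture would then need to be stored or transmitted only once.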

The primary goal of the project is to make significant contributions and advance the state of the art in the following areas:

- Learning for visual recognition: Novel large-scale machine learning algorithms will be developed to recognize various types of architectural elements and styles in images. These methods will be able to fully exploit very large amounts of image data while at the same time requiring a minimum amount of user annotation (weakly supervised learning).

- Shape grammar learning: Techniques will be developed to learn stochastic shape grammars from examples, along with the corresponding architectural style. Learnt grammars will be able to rapidly adapt to a wide variety of specific building types without the cost of manual expert design. Learnt grammar parameters will also lead to better parsing: faster, more accurate and more robust.

- Grammar-based inference: Innovative energy minimization approaches will be developed, leveraging bottom-up cues, to efficiently cope with the exponential number of grammar interpretations, in particular in the context of grammars featuring rich architectural elements. A principled aggregation of the statistical visual properties will be designed to accurately score parsing trials.

- Semantized 3D reconstruction: Robust original techniques will be developed to synchronize multiple-view 3D reconstruction with the semantic analysis, preventing inconsistencies such as misaligned roofs and windows at facade corners.

- Semantic-aware rendering: Image-based rendering techniques that benefit from semantic classification will be developed to greatly improve visual quality through better depth synthesis, adaptive warping and blending, hole filling and region completion.

To validate our research, we will run experiments based on various kinds of data concerning Paris (large-scale panoramas, smaller scale but denser and geo-referenced terrestrial and aerial images, cadastral maps, construction date database), reconstructing and rendering an entire neighborhood.


Acute3D

GREYC Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen

Inria Paris - Rocquencourt Institut national de recherche en informatique et automatique

Inria Sophia-Antipolis Institut National de la Recherche en Informatique et en Automatique - Centre de Recherche Sophia Antipolis-Méditerranée - REVES

LIGM Laboratoire d'Informatique Gaspard Monge

ANR grant: 791 399 euros
Beginning and duration: October 2013 - 42 months


ANR Programme: Contenus numériques et interactions (CONTINT) 2013

Project ID: ANR-13-CORD-0003

Project coordinator:
Mr Renaud Marlet (Laboratoire d'Informatique Gaspard Monge)




The project coordinator is the author of this abstract and is therefore responsible for the content of the summary. The ANR disclaims all responsibility in connection with its content.