COSINUS - Conception et Simulation

Large-scale simulation-based probabilistic inference, optimization, and discriminative learning with applications in experimental physics – SIMINOLE

Submission summary

Simulation lies at the heart of most of today's large-scale experiments. Since the appearance of heavy computational machinery, simulation has become the third pillar of scientific discovery beside experimentation and theoretical model building. Its most important role is to connect models at different levels of resolution. Simulation can complement or, in certain cases, replace expensive experimentation; it can be used to validate high-level models using low-level experimental data; it can serve as an engineering aid for designing tools, machines, or detectors. On the other hand, simulation has also become the bottleneck of these applications, so a lot of research has been devoted to finding ways to carry out simulations more efficiently. Most of the time these studies follow one of two approaches: they either delve into the inner workings of the simulator and try to improve it algorithmically, or they attack the problem by implementing simulators on various high-end computing devices. In this project we follow a third approach: we propose to use simulators more efficiently by treating them as black boxes and minimizing the number of simulator calls needed to accomplish a given task.

Simulators can be used in different ways for solving particular problems. In this project we identified three common scenarios. In probabilistic inference, the goal is to find values for some input parameters that generate simulations similar to observed data. Our goal is to formalize a data-driven simulation setup, and to replace the sub-optimal naive exhaustive search by an approach based on Markov chain Monte Carlo (MCMC) techniques. In the second scenario, simulation is used in an optimization loop. When designing complex instruments, tools, or machines, it is common that the simulated instrument is assigned a utility (or cost), and the goal of the procedure is to find regions of the parameter space where the utility is high (or the cost is low). As in the previous scenario, exhaustive search is highly sub-optimal. In this task of the project we will formalize the problem as utility-driven simulation in a stochastic optimization setup, and apply powerful adaptive techniques developed recently for optimizing expensive black-box functions. In the third scenario, a large set of simulations is used to "discover" interesting features, for example, features that predict certain generating parameters well. These "observables" are then used on real data to estimate or reconstruct generative parameters. The goal of this task is to optimize the use of simulations by replacing the "manual" discovery of observables with machine learning algorithms.
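To make the first scenario concrete, the following is a minimal sketch of simulation-based inference with a random-walk Metropolis sampler. It is an illustration under stated assumptions, not the project's method: `toy_simulator` is a hypothetical stand-in for an expensive black-box simulator, the Gaussian kernel in `log_kernel` (in the spirit of approximate Bayesian computation) is an assumed way of scoring similarity to observed data, and the flat prior bounds, step size, and tolerance `eps` are arbitrary choices. The sketch also counts simulator calls, since that is the cost the project aims to minimize.

```python
import math
import random


def toy_simulator(theta, rng, n_events=50):
    """Hypothetical black-box simulator: mean of noisy events centred on theta."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n_events)) / n_events


def log_kernel(simulated, observed, eps=0.2):
    """Assumed Gaussian kernel: scores how close a simulation is to the data."""
    return -0.5 * ((simulated - observed) / eps) ** 2


def metropolis(observed, n_steps, rng, step=0.5, theta0=0.0, lo=-10.0, hi=10.0):
    """Random-walk Metropolis over theta with a flat prior on [lo, hi].

    The score of the current state is stored and reused, so each step
    costs at most one simulator call.
    """
    theta = theta0
    logp = log_kernel(toy_simulator(theta, rng), observed)
    calls = 1
    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        if lo <= proposal <= hi:  # proposals outside the prior are rejected for free
            logp_new = log_kernel(toy_simulator(proposal, rng), observed)
            calls += 1
            # 1 - random() lies in (0, 1], so the logarithm is always defined
            if math.log(1.0 - rng.random()) < logp_new - logp:
                theta, logp = proposal, logp_new
        chain.append(theta)
    return chain, calls


rng = random.Random(0)
observed = 3.0  # made-up "observed" summary for the toy example
chain, calls = metropolis(observed, n_steps=2000, rng=rng)
burned = chain[500:]  # discard the initial transient
posterior_mean = sum(burned) / len(burned)
print(f"posterior mean ~ {posterior_mean:.2f} using {calls} simulator calls")
```

An exhaustive grid over the same interval would spend most of its simulator calls in regions that are incompatible with the data, whereas the chain concentrates its calls where simulations resemble the observation.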

The research outlined above is directly motivated by the design and inference problems we are facing in two major astroparticle physics experiments, the Pierre Auger experiment and the JEM-EUSO experiment. Their goal is the same: to study the properties of ultra-high energy cosmic ray particles by observing the particle cascade generated by the collision of the cosmic ray particle with atmospheric particles. The Auger experiment employs two independent terrestrial detectors covering 3000 square kilometers on the Argentinian pampas, whereas the JEM-EUSO telescope will be in orbit on the Japanese Experiment Module of the International Space Station starting in 2015. The techniques outlined in the previous paragraph will be directly usable for the statistical data analysis in both experiments and for the design of the on-board software of the JEM-EUSO experiment. The methodological development is motivated directly by these two concrete applications, but the proposed techniques will be generally usable in other simulation-heavy application domains.

Balázs KÉGL (CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE - DELEGATION REGIONALE ILE-DE-FRANCE SECTEUR SUD) – balazs.kegl@gmail.com

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

LTCI CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE - DELEGATION REGIONALE ILE-DE-FRANCE SECTEUR PARIS A
INRIA Saclay - Île-de-France - Equipe-Projet TAO INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE - (INRIA Siège)
LAL CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE - DELEGATION REGIONALE ILE-DE-FRANCE SECTEUR SUD

ANR funding: 1,042,903 euros
Beginning and duration of the scientific project: - 48 Months
