JCJC SVSE 6 - JCJC: Life, health and ecosystem sciences: genomics, functional genomics, bioinformatics, systems biology

Genetic Dating in the Post-Genomic Era – DATGEN

The making of human population structure

The objective of the project is to propose statistical methods and software to characterize population genetic structure and processes of biological adaptation. The methods are applied mainly to the massive genetic data now available for the human species.

Develop cutting-edge software that can scale with the dimension of new genetic data

The advent of high-throughput genotyping technologies has revolutionized evolutionary biology and genetics. These fields are now inundated with massive quantities of data, as determining dense sets of genotypes from individual genomes is no longer an overly expensive and difficult task. In the next few years we will see complete genome sequences for multiple individuals from a number of species. These genome-wide population-genetics data are being produced for a multitude of purposes, such as disease-gene mapping, quantitative trait locus mapping and inference of evolutionary history.

Software to analyze these genetic data has been developed mainly by English and American teams. To give a better understanding of the evolutionary history of populations of living organisms, these programs infer (1) population structure and (2) processes of biological adaptation. Our project aims to develop software that can address these questions and cope with the size of the data.

Several issues arise when analyzing genetic data at the scale of populations.
1. The dimension of the data keeps growing, so we propose methods that reduce it in order to provide interpretable results.
2. Biological parameters are subject to large uncertainty, which is why we adopt a Bayesian perspective that accounts for this uncertainty explicitly.
3. We need to present the software to biologists carefully so that its results are interpreted correctly.
4. We need to provide proper visualization tools with the software we develop.
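
Point 1 above can be illustrated with a small sketch. The summary does not say which dimension-reduction method is meant; principal component analysis is used here only as a familiar stand-in, and the toy genotype matrix, the function name `pc1_scores` and the power-iteration shortcut are all assumptions made for illustration.

```python
import math

def pc1_scores(genotypes, n_iter=100):
    """Project individuals onto the first principal component.

    genotypes: list of rows, one per individual, each a list of 0/1/2
    allele counts (hypothetical toy data, not taken from the project).
    """
    n, m = len(genotypes), len(genotypes[0])
    # Center each SNP (column) on its mean allele count.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    X = [[row[j] - means[j] for j in range(m)] for row in genotypes]

    # Power iteration on X^T X. The start vector is an arbitrary choice;
    # it assumes the first SNP is not constant across individuals.
    v = [1.0] + [0.0] * (m - 1)
    for _ in range(n_iter):
        proj = [sum(x * vj for x, vj in zip(row, v)) for row in X]        # X v
        w = [sum(X[i][j] * proj[i] for i in range(n)) for j in range(m)]  # X^T (X v)
        norm = math.sqrt(sum(wj * wj for wj in w))
        v = [wj / norm for wj in w]
    # Coordinates of each individual along the leading axis.
    return [sum(x * vj for x, vj in zip(row, v)) for row in X]

toy = [
    [2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],  # three individuals from group A
    [0, 0, 2, 2], [0, 0, 2, 1], [0, 1, 2, 2],  # three individuals from group B
]
scores = pc1_scores(toy)
```

On this toy matrix the two groups fall on opposite sides of zero along the first axis, which is the kind of low-dimensional summary that makes population structure readable. The sign of the axis itself is arbitrary.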

Our statistical analyses have already helped to answer central questions concerning the history of human populations

1. For each continent, we characterized the main orientations of genetic differentiation: North-South in Europe and Africa, East-West in Asia, and no preferred orientation for Native Americans.
2. We have shown that so-called modern humans did not emerge from a population bottleneck that would have occurred 150,000 years ago during the penultimate ice age.

Software development is not yet finalized and we have not yet received feedback from the biological community. However, these methodological developments have brought us closer to the cellular genetics laboratory at INRA Toulouse and the statistics group at the University of New South Wales in Sydney (Australia), and have strengthened an existing collaboration with the department of evolutionary biology at Uppsala University (Sweden).

A new difficulty we face is the size of the data when the number of molecular markers (SNPs) exceeds half a million. We plan to release new versions of our software that can handle data of this size.

1. Blum M.G.B., M.A. Nunes, D. Prangle, S.A. Sisson. A comparative review of dimension reduction methods in approximate Bayesian computation. Statistical Science, 28: 189-208 (2013)
A statistical analysis comparing variants of approximate Bayesian computation, a method widely used to infer the evolutionary history of populations of living organisms.

2. Sjödin P., A.E. Sjöstrand, M. Jakobsson, M.G.B Blum. Resequencing data provide no evidence for a human bottleneck in Africa during the penultimate glacial period. Molecular Biology and Evolution 29:1851-1860 (2012)
Based on cutting-edge statistical analysis, we have shown that the origin of modern humans is not due to a demographic bottleneck that would have arisen during the penultimate ice age some 150,000 years ago. Our paper provides evidence against the "bottleneck" theory that was popular among anthropologists.

3. Jay F, P Sjödin, M Jakobsson, MGB Blum. Anisotropic isolation by distance: the main orientations of human genetic differentiation. Molecular Biology and Evolution 30: 513-525 (2013)

4. Gattepaille LM, M Jakobsson, MGB Blum. Inferring population size changes with sequence and SNP data: lessons from human bottlenecks. Heredity 110: 409-419 (2013)

5. Software LocalDiff (http://membres-timc.imag.fr/Michael.Blum/LocalDiff.html), which provides friction maps based on molecular data. This software characterizes genetic population structure in a spatial context.

6. Software PCAdapt to detect genes involved in local adaptation (http://membres-timc.imag.fr/Nicolas.Duforet-Frebourg/PCAdapt.html)
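
Approximate Bayesian computation, the method reviewed in publication 1 above, can be sketched in its simplest rejection form. This is a generic textbook version, not the project's software: the binomial model, the uniform prior, the tolerance and the function name `abc_rejection` are assumptions chosen for illustration.

```python
import random

def abc_rejection(observed_count, n_chromosomes, n_draws=20000, tolerance=1, seed=0):
    """Approximate the posterior of an allele frequency by rejection ABC."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        # Draw a candidate frequency from a Uniform(0, 1) prior.
        p = rng.random()
        # Simulate the summary statistic: number of derived alleles sampled.
        simulated = sum(rng.random() < p for _ in range(n_chromosomes))
        # Keep the candidate when the simulation is close enough to the data.
        if abs(simulated - observed_count) <= tolerance:
            accepted.append(p)
    return accepted

# Hypothetical data: 15 derived alleles observed among 50 chromosomes.
posterior = abc_rejection(observed_count=15, n_chromosomes=50)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted values approximate the posterior distribution of the frequency. Shrinking `tolerance` sharpens the approximation at the cost of fewer accepted draws. Realistic applications replace the binomial simulator with a demographic model and the single count with several summary statistics, which is where dimension reduction becomes relevant.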

The advent of high-throughput genotyping technologies has revolutionized evolutionary biology and genetics. These fields are now inundated with massive quantities of data, as determining dense sets of genotypes from individual genomes is no longer an overly expensive and difficult task. In the next few years we will see complete genome sequences for multiple individuals from a number of species. In humans, an international consortium is currently sequencing the genomes of 1000 human beings (1000 Genomes Project, www.1000genomes.org/). These genome-wide population-genetics data are being produced for a multitude of purposes, such as disease-gene mapping, quantitative trait locus mapping and inference of evolutionary history. One appealing opportunity created by population genomic data is the dating of past demographic events. Genetic dating is an emerging technique and has already been applied to date the emergence of the H1N1 influenza strain and to provide a timescale for the human Paleolithic journey.
Our project proposes to develop computational methods for genetic dating that will be implemented in user-friendly software. We will provide the ages of the different common ancestors from which modern humans inherited stretches of DNA. We will apply our dating method to the forthcoming 1000 human genomes in order to refine our understanding of human origins. Different scenarios of human origins are at the forefront of the debate between paleoanthropologists. Our analysis of 1000 entire genomes will be a unique opportunity to unravel the process by which modern humans colonized the globe. The analysis of the 1000 Genomes Project data will be reinforced by a joint analysis of microsatellite data from the Human Genome Diversity Panel. Outside the field of human genetics, our genetic dating software has strong potential, since large population sequence data, if not whole-genome data, will soon be available for a large number of species.
In addition to the dating of neutral markers, we view the 1000 Genomes Project as a unique opportunity to date the occurrence of adaptive events in humans. To date mutations that have been involved in adaptive events, we will propose an original method that accounts for human spatial expansion. Our large-scale dating of adaptive events will shed light on the current controversy concerning the magnitude of biological adaptation following the rise of agriculture.


Project coordination

Michael Blum (CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE - DELEGATION REGIONALE RHONE-ALPES SECTEUR ALPES) – michael.blum@imag.fr

The author of this summary is the project coordinator, who is responsible for its content. The ANR accepts no responsibility for its content.

Partner

UMR UJF/CNRS 5525 CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE - DELEGATION REGIONALE RHONE-ALPES SECTEUR ALPES

ANR funding: 165,000 euros
Beginning and duration of the scientific project: - 36 Months
