Detection of regulatory patterns and variants by modelling covariation of regulatory sequences and expression across genes and individuals

Grant recipient

Lasse Maretty

Institution

University of Copenhagen

Amount

DKK 1,134,000

Year

2017

Grant type

Reintegration Fellowships

What?

Modern DNA sequencing technology makes it possible to determine an individual's genome accurately and affordably. However, predicting the effect (if any) of individual genomic variants on phenotypic traits (e.g. disease risk) remains difficult, especially when the variants are carried by few individuals. In this project, I will develop a new statistical method for identifying variants that affect gene expression. The method is based on a new modelling approach that shares information across variants and is thereby intended to increase the power to detect rare-variant associations. I will apply the model to large, publicly available datasets to identify new mutations that affect gene expression, and use these signals to further our understanding of gene expression regulation.

Why?

A key approach to studying biological systems in both basic and applied research is to search for associations between genomic variation and phenotypic traits. However, while the genomic revolution has provided the data foundation for such studies, it remains difficult to identify phenotypic associations with rare variants using current statistical methods. Because every individual carries a large number of rare variants, and because basic evolutionary theory tells us that such variants are more likely than common variants to have significant phenotypic effects, new statistical approaches are needed.

How?

The model will be developed and applied iteratively: starting from a simple model, I will alternate between modifying the model and applying it to a real dataset reserved for testing. By continuously studying the model's performance on real data, I expect to be able both to guide the modelling process and to determine when the model is accurate enough for large-scale application to real datasets to identify new variants. This approach will ensure that new biological results are obtained as early as possible in the project.
