

Tuesday, October 1st, 2019
From 14:00 to 15:30
Centre de Recherche - Paris - Amphithéâtre Constant-Burg - 12 rue Lhomond, Paris 5e

Pinpointing disease-causing regulatory genetic variants by multi-omics and machine learning

Determining the genetic cause of rare disorders is crucial for the affected families, enabling genetic testing among relatives and providing a rationale for therapies. However, for most rare disease patients who undergo DNA sequencing, it remains unclear which variant is pathogenic. I will present a blend of multi-omics and machine learning approaches to address this problem.

We and others have shown that sequencing patients' RNA in addition to their DNA boosts the diagnosis rate of rare diseases by revealing pathogenic gene regulatory defects that still cannot be predicted from genotype. Novel algorithms are needed to realize the potential of RNA sequencing and other omics in revealing the causes of rare diseases. We formalize this problem as an outlier detection task, with the twist that here, outliers are the signal of interest and not artifacts to exclude from the data. I will present OUTRIDER [1], a method based on a denoising autoencoder that detects expression outliers while controlling for technical and biological confounding effects.
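To make the outlier-detection framing concrete, here is a minimal sketch in Python. It is not OUTRIDER: a truncated SVD stands in for the denoising autoencoder, and plain z-scores on log-transformed counts replace the statistical model described in [1]. The function name, the simulated data, the number of factors and the cutoff are all assumptions chosen for illustration.

```python
import numpy as np


def expression_outliers(counts, n_factors=1, z_cutoff=4.0):
    """Flag (sample, gene) entries that deviate from a low-rank fit.

    Hypothetical helper: a linear stand-in for an autoencoder-based
    method, not the OUTRIDER implementation.
    """
    # Work on log counts so that fold changes become additive shifts.
    x = np.log2(counts + 1.0)
    centered = x - x.mean(axis=0)

    # The leading singular vectors capture covariation shared across
    # genes (e.g. batch effects); subtracting them controls for it.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    fitted = (u[:, :n_factors] * s[:n_factors]) @ vt[:n_factors]
    residual = centered - fitted

    # Per-gene z-scores of what the low-rank fit cannot explain.
    z = (residual - residual.mean(axis=0)) / residual.std(axis=0, ddof=1)
    return z, np.abs(z) > z_cutoff


# Simulated cohort (assumed): 60 samples, 200 genes, one batch confounder.
rng = np.random.default_rng(0)
n_samples, n_genes = 60, 200
batch = rng.integers(0, 2, size=(n_samples, 1)).astype(float)
loadings = rng.normal(0.0, 0.5, size=(1, n_genes))
noise = rng.normal(0.0, 0.1, size=(n_samples, n_genes))
counts = np.round(2.0 ** (8.0 + batch @ loadings + noise))

# Inject one aberrant event: gene 17 drops about 4-fold in sample 5.
counts[5, 17] = np.round(counts[5, 17] / 4.0)

z, flagged = expression_outliers(counts)
```

In this toy setting the injected entry is the quantity of interest: `flagged[5, 17]` marks it, whereas a conventional quality-control pipeline might discard such a value as noise. The number of factors matters, since fitting too many components can absorb a genuine outlier into the fit.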

Having identified a pathogenic gene regulatory defect, the last piece of the puzzle is the genetic variant causing it. Machine learning applied to high-throughput genomics technologies is making rapid progress in unraveling how every step of gene expression is genetically encoded. However, the lack of standardization of such models has hampered their impact in medical research. We are co-developing Kipoi, a collaborative initiative to define standards and foster the sharing and re-use of trained machine learning models in genomics [2]. Our repository (kipoi.org) contains over 2,000 trained models of transcriptional and post-transcriptional mechanisms. Using a modular modeling approach leveraging Kipoi, we built MMSplice [3], the top-ranked splicing effect predictor at the CAGI5 challenge (Critical Assessment of Genome Interpretation).


[1] Brechtmann et al. OUTRIDER: A Statistical Method for Detecting Aberrantly Expressed Genes in RNA Sequencing Data, American Journal of Human Genetics, 2018

[2] Avsec et al. The Kipoi repository accelerates community exchange and reuse of predictive models for genomics, Nature Biotechnology, 2019

[3] Cheng et al. MMSplice: modular modeling improves the predictions of genetic variant effects on splicing, Genome Biology, 2019


Prof. Julien GAGNEUR
Assistant Professor for Computational Biology at TUM (Technische Universität München)

Technische Universität München - Dpt Informatics

Invited by

Prof. Thomas WALTER
Director of CBIO, Faculty Member at Mines ParisTech
Domain 3 - U900 - CBIO - Bioinformatics, Biostatistics Epidemiology and Computational Systems

Institut Curie

