Speeding up autism gene discovery via deep multitask learning of data from cohorts of comorbid neurodevelopmental and neuropsychiatric disorders

Awarded: 2019
Award Type: Pilot
Award #: 640935

A. Ercüment Çiçek, Ph.D.
Bilkent University

Computational gene risk prediction methods and network-based analyses are major tools for analyzing large autism spectrum disorder (ASD) datasets from whole-exome and whole-genome sequencing studies. They are used to (i) impute insufficient statistical signal and provide a genome-wide risk ranking and (ii) identify the affected cellular circuitry, such as pathways and gene networks. While these methods have proven useful, larger cohorts would make their predictions more precise and sensitive.

One way to increase the cohort size is to use data from cohorts of comorbid neurodevelopmental and psychiatric disorders; this approach exploits the genetic architecture that these disorders share. Some studies treat all such conditions as a single disorder and pool ("bag") mutation counts, but this is a crude way to analyze the data, and it loses the disorder-specific genetic and phenotypic components.

In the current project, Ercüment Çiçek and his team propose a novel cross-disorder gene discovery algorithm that will analyze whole-exome sequencing data from related conditions (including ASD, intellectual disability, schizophrenia and epilepsy) simultaneously and explicitly learn both shared and disorder-specific genetic components. Learning the shared architecture will (i) increase prediction power for ASD, and for each of the other disorders, by drawing on samples from the related cohorts and (ii) advance our understanding of ASD in terms of its comorbidity with other conditions. Moreover, because this method will not treat all disorders as one (i.e., it will not bag the mutations) and will explicitly learn disorder-specific components, it will not lose information relevant to each individual disorder.
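
The idea of learning shared and disorder-specific components jointly can be illustrated with a small sketch. This is not the project's algorithm, which is described as deep multitask learning; it is a much simpler linear stand-in (logistic regression with one shared weight vector plus a penalized per-disorder offset), and every name in it (`fit_shared_plus_specific`, `predict_risk`, `lam_specific`, the feature matrix `X`, the label matrix `Y`) is a hypothetical choice made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_shared_plus_specific(X, Y, lam_specific=1.0, lr=0.1, n_iter=500):
    """Illustrative multitask logistic regression (assumed setup, not the project's model).

    X: (n_genes, n_features) per-gene features.
    Y: (n_genes, n_disorders) binary risk labels, one column per disorder.
    Each disorder t uses weights w_shared + W_specific[:, t].
    """
    n_genes, n_features = X.shape
    n_disorders = Y.shape[1]
    w_shared = np.zeros(n_features)
    W_specific = np.zeros((n_features, n_disorders))
    for _ in range(n_iter):
        W = w_shared[:, None] + W_specific      # effective weights per disorder
        P = sigmoid(X @ W)                      # predicted risk per gene and disorder
        G = X.T @ (P - Y) / n_genes             # gradient of mean log loss, per disorder
        # The shared component receives gradient from every disorder's data.
        w_shared -= lr * G.sum(axis=1)
        # Each specific component sees only its own disorder and is shrunk
        # toward zero by an L2 penalty, so it captures only what is not shared.
        W_specific -= lr * (G + lam_specific * W_specific)
    return w_shared, W_specific

def predict_risk(X, w_shared, W_specific):
    """Return an (n_genes, n_disorders) matrix of predicted risk scores."""
    return sigmoid(X @ (w_shared[:, None] + W_specific))
```

In this sketch, bagging would correspond to forcing `W_specific` to zero and merging all label columns into one, whereas analyzing each disorder in isolation would correspond to dropping `w_shared`. Keeping both terms lets every cohort inform the shared weights without erasing per-disorder differences.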

SFARI