Autism spectrum disorder

Autism spectrum disorder (ASD) is a common neurodevelopmental disorder with high heritability and both phenotypic and genetic heterogeneity. A practical way to constrain this heterogeneity and develop effective treatments is to study monogenic disorders that are highly penetrant for ASD, and then to translate those results to a broader group of patients with genetically diverse but mechanistically related etiologies. Phelan-McDermid syndrome (PMS) is a common monogenic cause of ASD, accounting for ca. 1% of ASD diagnoses, and is caused by heterozygous 22q13.3 deletions or point mutations leading to haploinsufficiency of the SHANK3 gene. Our work therefore aims to better understand the cellular and molecular mechanisms affected by SHANK3 mutations. We apply a combination of genetic and transcriptomic techniques to ex vivo peripheral blood and iPSC-derived NPCs from PMS patients and their unaffected siblings. We are also actively integrating these human data with brain-based transcriptomics from Shank3-knockout animals. These projects have translational potential: the goal is to identify the mechanisms underpinning SHANK3 deficiency in PMS, thereby accelerating the development of new mechanism-based treatments for PMS as well as idiopathic ASD (iASD).

Posttraumatic stress disorder, anxiety and depression

Art by Jessica Johnson

I am involved in a number of functional genomic studies (i.e., RNA-sequencing, 450K methylation arrays) utilizing peripheral tissues (i.e., whole blood, umbilical cord blood, serum, plasma) to study immune mechanisms and biomarkers of major depression and posttraumatic stress disorder. These projects investigate: a) how prenatal exposure to maternal depression and/or posttraumatic stress affects immune and metabolic function as well as cognitive and affective behaviour in early childhood; b) genes that differ in their expression or sequence between individuals who are vulnerable or resilient to stress; and c) genes that differ in their expression or sequence between individuals with and without depression. We employ a combination of unbiased, data-driven genetic, transcriptomic and proteomic approaches. These projects have translational potential, with the goal of identifying objective diagnostic and prognostic blood biomarkers for stress reactivity/resilience and depression. Molecular data are often integrated with cognitive, electrophysiological, and neuroimaging measures using supervised multivariate machine-learning algorithms to identify unique combinations of biosignatures able to distinguish between finer gradients of health and disease.

Schizophrenia


Another tier of my work seeks to identify key genes and molecular pathways contributing to the development of schizophrenia (SCZ). To this end, I am involved in the analysis of large-scale genetic and RNA-sequencing datasets in collaboration with the CommonMind Consortium, which span multiple brain regions from hundreds of post-mortem brain samples from schizophrenia cases. In parallel, and in collaboration with colleagues at the University of Cape Town, we are actively researching pharmacological/environmental models of SCZ, namely methamphetamine-associated psychosis (MAP). MAP is considered a model of SCZ because of similarities in clinical course, response to treatment and presumed neuromechanisms. However, several challenges remain in accurately identifying and diagnosing MAP before the model can be used to accelerate biomarker discovery for SCZ. Our work applies a combination of blood-based transcriptomics, serum-based proteomics and neuroimaging to better understand the neurocognitive and molecular deficits underlying the development and progression of MAP, in order to develop improved diagnostics and treatment approaches.

Algorithm development

A critical component of biomarker discovery and validation is rigorous statistical analysis. I am especially interested in solving emerging biological and algorithmic problems in computational and molecular biology, such as: a) comparative transcriptomics and proteomics; b) computing the genetically regulated component of gene/protein expression; c) predicting cellular frequencies from heterogeneous biological tissue; d) multi-modal integrative deep machine-learning applications; e) modelling RNA editing from RNA-sequencing data; and f) gene network reconstruction and multi-modal data integration. In addition to generating new data in support of these aims, I also use just about any high-throughput data I can get my hands on in the public domain, which can ultimately be translated into a better understanding of the molecular mechanisms underlying neurodevelopmental and neuropsychiatric disorders.

Download computational code for quality control and analysis of RNA-sequencing gene expression data from my evolving GitHub account, here.