


  Jinko Graham

  Professor
  Statistics and Actuarial Science
  Simon Fraser University
  Burnaby, BC V5A 1S6

  Office TLX-10553
  IRMACS Lab
  Tel: (778) 782-3155

A core activity in science is to collect data to develop and evaluate theories about the way things work. To address our scientific questions, we design studies and then look at the resulting data to find meaning in the context of randomness. Statistics provides a principled framework and powerful set of methods to help in these efforts.

As a statistical geneticist, my interests lie at the interface of statistics and genomic science. A central theme in my research is that genomic data on the DNA sequence variation of individuals reflect their underlying genealogical relationships. These relationships can tell us about individual susceptibility to traits that run in families or populations, and so are of use in mapping disease genes. Another theme is that, in genomic studies, high-throughput measurement technologies produce large data sets of different types. Often, these data sets can be integrated for additional scientific insights into individual disease susceptibility and how genetic and environmental variation modulate the expression of phenotypes. However, with big data come issues in statistical interpretation. Does the way the data have been collected allow us to answer our research questions? What are the relevant patterns, and what are blind alleys that result from unrecognized biases in sampling? What are the sources of variation in our data? Are we seeing merely chance patterns in random data that have no systematic component?

In my research, I try to incorporate fundamental principles of genetics and statistical thinking into models and methods for analysing genomic data, with the aim of understanding individual disease susceptibility. Recent developments in statistical computing and Bayesian modelling of data structures with complex dependencies have enabled and enriched this effort. Additionally, in recent collaborations with neuroimaging and pediatrics experts, advances in high-dimensional data analysis have allowed genomic data to be integrated with imaging and clinical data. Broadly, my focus is on developing and evaluating analytic tools to understand trait susceptibility, while accounting for random variation, complex data dependencies, and the way the data have been sampled.