A tiny SNP can greatly influence disease
Personalised Medicine and Health
I’m a statistical geneticist with a background in genome-wide association studies (GWAS) and computational analysis, and it’s a good thing others love being in the lab; otherwise, I wouldn’t have any data. After tissue samples have been collected and processed in the lab and sequenced, it’s my turn to figure out how to interpret all of the data.
Our genome is complex. It has coding regions that map directly to genes and traits, and non-coding regions that were previously thought to have no biological function and were therefore called junk DNA.
SNPs make us different from one another
What’s fascinating about our genome is that we can have many naturally occurring individual variations within our DNA and still be healthy. It’s estimated that we each have about 10 million of these single genetic variants, called SNPs (single nucleotide polymorphisms, pronounced "snips"). Most of them are harmless, since they don’t directly alter a gene and therefore are not the cause of a disease. Yet, with advances in methods like GWAS, we’re discovering that many SNPs are associated with disease. In fact, it appears that more than 80% of SNPs associated with disease are located in non-coding regions of DNA, where they can influence genes and affect the severity of illness in a patient.
SNPs are a powerful tool
SNPs may help trace the inheritance of certain diseases within families and populations. They can also help estimate an individual’s chances of developing a disease, and predict a patient’s response to environmental risk factors or to drug treatment. Conversely, if certain SNPs are known to be associated with a disease, we can look at which genes they associate with in order to determine the cause.
Is positioning important for a SNP?
It’s the associations between SNPs and disease that captivate me. Reading a sequence of DNA is like reading sentences in a book: we can encounter a SNP on page 10 and have to wait until page 325 to read about the gene it associates with. Our genome is clever, though, and does not lie flat like the pages of a book. In reality, it twists and turns and folds back on itself in a variety of ways, and we know that this folding is very important for regulating genes. Thus, our SNP on page 10 may end up right next to, or very close to, the gene it associates with. We want to understand how the distance between a SNP and its associated gene affects molecular signaling, and whether location is important at all. This will help us gain a better understanding of genetic signals and the underlying biology of disease.
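To make the book analogy concrete, here is a minimal Python sketch of the difference between "nearest along the sequence" and "brought close by folding". Everything in it is an illustrative assumption rather than a description of our actual methods: the gene names, coordinates, the `Gene` class, the `contacts` list of folded-together positions, and the 5,000 base-pair matching window are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int  # transcription start site, in base pairs (toy value)


def nearest_gene(chrom, pos, genes):
    """Gene whose start site is closest to the SNP along the linear sequence."""
    same_chrom = [g for g in genes if g.chrom == chrom]
    return min(same_chrom, key=lambda g: abs(g.tss - pos))


def contacted_genes(chrom, pos, genes, contacts, window=5_000):
    """Genes whose start site sits at the other end of a fold touching the SNP.

    `contacts` is a hypothetical list of (chrom, pos_a, pos_b) tuples, each
    meaning that the two positions are physically close in the folded genome.
    """
    hits = []
    for c_chrom, a, b in contacts:
        if c_chrom != chrom:
            continue
        # A contact can touch the SNP at either of its two ends.
        for near, far in ((a, b), (b, a)):
            if abs(near - pos) <= window:
                hits.extend(
                    g for g in genes
                    if g.chrom == chrom and abs(g.tss - far) <= window
                )
    return hits


# Toy data: one SNP, two genes, and one fold linking the SNP to the far gene.
genes = [Gene("GENE_A", "chr1", 1_050_000), Gene("GENE_B", "chr1", 4_250_000)]
contacts = [("chr1", 1_001_000, 4_249_000)]
snp_chrom, snp_pos = "chr1", 1_000_000

closest = nearest_gene(snp_chrom, snp_pos, genes)
folded = contacted_genes(snp_chrom, snp_pos, genes, contacts)
print("Nearest along the sequence:", closest.name)
print("Brought close by folding:", [g.name for g in folded])
```

In this toy example the gene nearest to the SNP on the page is not the one the fold brings next to it, which is exactly why folding data is worth combining with SNP associations.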
SNPping together causal relationships
Currently, I’m in a machine learning lab where we are trying to develop computer algorithms to merge various sources of biological information (like genome folding) together with SNPs that associate with common diseases in order to generate statistically convincing causal relationships between SNPs and disease.
This research area has a high degree of variety and uncertainty, which I enjoy, and I think I’ve developed an intuition for how to navigate this. I often meet with researchers in different disciplines, and what we have in common are computational and analytical methods. Everyone has their own particular topic, their own traits or diseases that they’re studying, yet we can easily discuss how to handle and share our data. For now, based on my calculations and “local optimization,” here is where I belong, both personally and professionally.
Sara Pulit, PhD
Post-doctoral researcher
Center for Molecular Medicine
UMC Utrecht