I’m a bioinformatics scientist and the head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center in Berlin. I have been developing computational methods for analyzing and integrating large-scale genomics data sets since 2002. I mainly use machine learning and statistics to uncover patterns related to important biological variables such as disease state and type. I have spent time in the USA, Norway, Turkey, Japan, and Switzerland pursuing research and education in statistics, machine learning and bioinformatics.
The underlying aim of my current work is to use complex molecular signatures to build decision support systems for disease diagnostics and biomarker discovery. In addition to doing research and managing a scientific lab, I have been organizing and teaching computational genomics courses in Berlin since 2015, with participants from across the world.
PhD in Genomics & Bioinformatics, 2010
University of Bergen, Norway
BSc in Bioengineering (Bioinformatics track), 2005
Sabanci University, Istanbul, Turkey
Cancer is a disease of the genome that is characterized by abnormal cell growth and invasion of other body parts. It affected 19m people in 2020 and was the cause of 9.5m deaths that year alone. The underlying cause of the abnormal phenotype of cancer cells is, in most cases, acquired genetic defects that cause other molecular changes. These changes in turn help cells circumvent safeguarding mechanisms and become cancerous. Therefore, the genetic defects and related molecular changes of the tumor play an important role in discovering disease mechanisms, uncovering potential drug targets and matching patients to targeted therapies.
The most commonly used molecular markers for all these different but related tasks are mutations, which are obtained via panel sequencing. This technique examines genes that are frequently mutated in cancer to identify mutations and copy-number variations in those genes. Especially in diagnostics, the approved methods for targeted drugs usually rely on the presence or absence of mutations. However, as mentioned, the mutations and the co-occurring or consequential abnormalities in gene expression, proteomics and epigenomics are jointly responsible for the abnormal phenotype of cancer cells. Relying only on panel sequencing data ignores this fact, especially in diagnostics, where panel sequencing is the main molecular diagnostic employed by molecular pathology labs. We believe that in order to better understand cancer and develop better drugs and diagnostics, we need to make use of all the molecular features by integrating different omics data sets: transcriptome, genome, epigenome, etc. Using machine learning techniques, we show that multi-omics data indeed gives superior performance on different clinical variable modeling tasks in cancer. In addition, sophisticated data integration techniques such as variational autoencoders can help when dealing with multi-layered datasets such as multi-omics. They provide a faster, more flexible and more accurate approach to data integration. Lastly, in the single-cell setting as well, the predictive modeling performance for cancer cell types can be enhanced by using multi-omics data rather than a single data type.
All in all, we show that using multi-omics together with machine-learning-based data integration techniques improves clinical variable modeling in cancer and should be preferred over panel sequencing or single-omics datasets.