Vlado Menkovski

Assistant Professor at Eindhoven University of Technology.

The goal of this project is dimensionality reduction using deep autoencoders. The use case is epigenetic data from human cancerous tumors, but the approach should be transferable to other datasets and domains. The use case comes from Philips’ Genetics department in New York, where three months of the internship will be spent on site. The problem with this epigenetic data is its dimensionality: a human cell has about 22,000 genes, and fewer than 1,000 samples are available per cancer type. This leads to a dataset with more than 22,000 features and fewer than 1,000 data points. A dataset of this shape is hard to use directly for statistical analysis or machine learning, because models overfit easily when there are far more features than samples, so the dimensionality has to be reduced first. Many statistical approaches to dimensionality reduction have been developed in the past, but they do not suffice for this use case. Some work has been done with autoencoders and deep learning, but the field is not yet mature. We currently expect deep autoencoders to work well on this problem, but other methods will be explored as well. Together with Philips’ bioinformaticians, I hope to develop a method that tackles this problem for the epigenetic data and that can eventually become a general-purpose method for any dataset with similar dimensionality problems.
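
To make the idea concrete, here is a minimal sketch of an autoencoder compressing a dataset with far more features than samples. It is an illustration only, not the method this project will arrive at. Everything in it is an assumption made for the example: the data is synthetic and scaled down to 100 samples by 2,000 features, the network has a single tanh hidden layer rather than a deep stack, a crude accept-or-halve step-size rule stands in for a proper optimizer, and all function and variable names are hypothetical.

```python
import numpy as np

def init_params(n_features, n_code, rng):
    """Small random weights for a one-hidden-layer autoencoder."""
    return {
        "W_enc": rng.normal(scale=0.01, size=(n_features, n_code)),
        "b_enc": np.zeros(n_code),
        "W_dec": rng.normal(scale=0.01, size=(n_code, n_features)),
        "b_dec": np.zeros(n_features),
    }

def encode(params, X):
    """Map each sample to its low-dimensional code."""
    return np.tanh(X @ params["W_enc"] + params["b_enc"])

def reconstruct(params, X):
    """Decode the codes back into feature space (linear decoder)."""
    return encode(params, X) @ params["W_dec"] + params["b_dec"]

def loss(params, X):
    """Mean squared reconstruction error over all entries of X."""
    return float(np.mean((reconstruct(params, X) - X) ** 2))

def gradients(params, X):
    """Backpropagate the mean squared error through both layers."""
    H = encode(params, X)
    err = 2.0 * (H @ params["W_dec"] + params["b_dec"] - X) / X.size
    dZ = (err @ params["W_dec"].T) * (1.0 - H ** 2)  # tanh derivative
    return {
        "W_enc": X.T @ dZ,
        "b_enc": dZ.sum(axis=0),
        "W_dec": H.T @ err,
        "b_dec": err.sum(axis=0),
    }

def train(X, n_code, steps=200, lr=1.0, seed=0):
    """Gradient descent that only accepts steps which lower the loss."""
    rng = np.random.default_rng(seed)
    params = init_params(X.shape[1], n_code, rng)
    history = [loss(params, X)]
    for _ in range(steps):
        grads = gradients(params, X)
        trial = {k: params[k] - lr * grads[k] for k in params}
        trial_loss = loss(trial, X)
        if trial_loss < history[-1]:
            params = trial
            history.append(trial_loss)
            lr *= 1.1  # step worked; try a slightly larger one next
        else:
            lr *= 0.5  # step was too large; retry with a smaller one
    return params, history

# Synthetic stand-in for the real data (assumption): 100 samples,
# 2,000 features, generated from 8 hidden factors plus noise.
rng = np.random.default_rng(0)
n_samples, n_features, n_code = 100, 2000, 8
factors = rng.normal(size=(n_samples, n_code))
X = np.tanh(factors @ rng.normal(size=(n_code, n_features)))
X += 0.1 * rng.normal(size=X.shape)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each feature

params, history = train(X, n_code)
codes = encode(params, X)  # reduced representation, one row per sample
print(codes.shape, history[0], history[-1])
```

The array `codes` is what would be passed on to downstream analysis in place of the original features. A deep autoencoder follows the same pattern with more layers in the encoder and decoder, and in practice it would be built with a deep learning framework and evaluated on held-out samples, since reconstruction error on the training data alone says little when samples are this scarce.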