Unsupervised Ordering of Independent Components in High-Dimensional Biological Data – Niels Bohr Institute - University of Copenhagen


Niels Bohr Institute > Calendar > 2007 > Speciale by Lykke Pede...

Unsupervised Ordering of Independent Components in High-Dimensional Biological Data

The rapid development of new technologies in biology has resulted in high-dimensional data. Such data require analysis methods capable of handling a large amount of complexity and extracting the relevant information.

In my thesis I focus on the analysis of microarray data. Microarrays are increasingly used to analyze gene expression influenced by different biological factors, e.g., cancer or treatment. The problem is to pick out the genes that are most strongly influenced by these factors.

Independent Component Analysis (ICA) represents data as a linear combination of independent components (ICs). It has recently been shown that the ICs relate to different biological processes.
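In standard ICA notation, the linear representation can be written as below. The symbols are assumptions chosen for illustration and are not taken from the thesis: x is the vector of observed measurements, s the vector of statistically independent components, A the unknown mixing matrix, and W the estimated unmixing matrix.

```latex
\mathbf{x} = A\,\mathbf{s},
\qquad
\hat{\mathbf{s}} = W\,\mathbf{x},
\quad W \approx A^{-1}
```

ICA estimates W so that the entries of the recovered vector are as statistically independent as possible. The components are recovered only up to order and scale, which is why a separate ordering step is useful.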

I propose a new method for the unsupervised ordering of ICs. It is based on a weighted sum of individual scores that reflect different characteristics of the ICs. With this method, the number of ICs representing biologically coherent groups of genes is estimated by visual inspection of the weighted sums.
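The idea of ordering by a weighted sum of scores can be sketched as follows. This is a minimal illustration, not the thesis implementation: the score names (`non_gaussianity`, `sparsity`), the weights, and the numbers are all hypothetical, since the abstract does not say which scores are used.

```python
def weighted_sums(scores, weights):
    """Combine per-component scores into one weighted sum per component.

    scores:  dict mapping a score name to a list with one value per IC
    weights: dict mapping the same score names to a numeric weight
    """
    n_components = len(next(iter(scores.values())))
    totals = [0.0] * n_components
    for name, values in scores.items():
        if len(values) != n_components:
            raise ValueError(f"score {name!r} has the wrong length")
        for i, value in enumerate(values):
            totals[i] += weights[name] * value
    return totals


def order_components(scores, weights):
    """Return IC indices sorted by decreasing weighted sum, plus the sums."""
    totals = weighted_sums(scores, weights)
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    return order, totals


# Hypothetical scores for four ICs (made-up numbers, not thesis results).
example_scores = {
    "non_gaussianity": [0.9, 0.2, 0.6, 0.1],
    "sparsity": [0.8, 0.1, 0.7, 0.3],
}
# Hypothetical weights; the abstract says these change with the type of data.
example_weights = {"non_gaussianity": 0.5, "sparsity": 0.5}

order, totals = order_components(example_scores, example_weights)
print(order)  # IC indices, highest weighted sum first
```

Plotting the sorted weighted sums and looking for a clear drop would correspond to the visual inspection described above.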

The method is not limited to the analysis of microarray data. The weights of the individual scores are changed depending on the type of data. An example with data measured on an interval scale is given, in which the unsupervised ordering finds the ICs that represent predefined synthetic signals. These signals are mixed and noise is added before the ICs are ordered.
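The data-generation step of such a synthetic test can be sketched as follows. The abstract does not specify the signals, the mixing, or the noise level, so the three waveforms, the matrix `MIXING`, and `noise_std` below are assumptions made purely for illustration. The ICA step itself is not shown.

```python
import math
import random


def make_sources(n_samples):
    """Build three illustrative source signals: sine, square wave, sawtooth."""
    t = [i / n_samples for i in range(n_samples)]
    sine = [math.sin(2 * math.pi * 5 * x) for x in t]
    square = [1.0 if math.sin(2 * math.pi * 3 * x) >= 0 else -1.0 for x in t]
    sawtooth = [2 * ((7 * x) % 1.0) - 1 for x in t]
    return [sine, square, sawtooth]


def mix_with_noise(sources, mixing, noise_std, seed=0):
    """Return observed signals: linear mixes of the sources plus Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    n_samples = len(sources[0])
    observed = []
    for row in mixing:
        signal = [
            sum(a * s[k] for a, s in zip(row, sources)) + rng.gauss(0.0, noise_std)
            for k in range(n_samples)
        ]
        observed.append(signal)
    return observed


# Assumed 3x3 mixing matrix; any invertible matrix would serve.
MIXING = [
    [1.0, 0.5, 0.3],
    [0.4, 1.0, 0.6],
    [0.2, 0.7, 1.0],
]

sources = make_sources(200)
observed = mix_with_noise(sources, MIXING, noise_std=0.1)
```

An ICA routine would then be run on `observed`, and a successful ordering would rank the components resembling the three sources ahead of those dominated by noise.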

Supervisor: Mogens Høgh Jensen