Minutes 23/09/05

Project Rearrangements in genomes with unequal content
Date 23/09/05
Version 1.0
Purpose of Meeting Report on results of prototype
Supervisor present Leong Hon Wai

Current Task/Sub-Task

  • Implementation of second iteration

Reported On

  • Results of prototype

Discussed

Analysis of prototype

Our initial approach to the problem of unequal content was to keep the entire genome but, when doing a pairwise distance comparison, reduce the two genomes under comparison to their set of common markers. This has the advantage of being relatively easy to incorporate into the current program. However, due to the inherent complexity of the program, I have only managed to modify the procedure for finding the median of 3 genomes to use this approach.
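The reduction step can be sketched roughly as follows. This is only an illustration and not the code in the current program: it assumes a genome is stored as a list of signed integers (the sign giving the marker's orientation), and the function name reduce_to_common is made up for this example.

```python
def reduce_to_common(g1, g2):
    """Restrict two signed genomes to the markers they have in common.

    Assumption: a genome is a list of non-zero signed integers with no
    duplicated markers. The order and sign of surviving markers are kept.
    """
    common = {abs(m) for m in g1} & {abs(m) for m in g2}
    reduced1 = [m for m in g1 if abs(m) in common]
    reduced2 = [m for m in g2 if abs(m) in common]
    return reduced1, reduced2


if __name__ == "__main__":
    a = [1, -3, 2, 5]
    b = [2, 4, -1, 5]
    # Markers 3 and 4 are each present in only one genome and are dropped.
    print(reduce_to_common(a, b))  # ([1, 2, 5], [2, -1, 5])
```

The full genomes are left untouched; only the pair handed to the distance routine is reduced.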

In order to evaluate the effectiveness of this idea, I created several sets of test data. Starting from the ancestral genome, which is the identity permutation of length n, I randomly applied r1, r2 and r3 reversals to obtain G1, G2 and G3. Then d1, d2 and d3 deletions were applied to generate the final set of genomes with unequal content. Finally, the prototype was used to recreate the ancestral median.
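A minimal sketch of how such a test set could be generated is given below. The function names, the seed parameter and the way reversals and deletions are sampled (cut points and deleted positions chosen uniformly at random) are assumptions made for this illustration; the minutes do not record how the actual generator picks them.

```python
import random


def apply_reversals(genome, r, rng):
    """Apply r random signed reversals: reverse a segment and flip its signs."""
    g = list(genome)
    for _ in range(r):
        # Two distinct cut points, so the reversed segment is never empty.
        i, j = sorted(rng.sample(range(len(g) + 1), 2))
        g[i:j] = [-m for m in reversed(g[i:j])]
    return g


def apply_deletions(genome, d, rng):
    """Delete d markers at random positions, keeping the rest in order."""
    doomed = set(rng.sample(range(len(genome)), d))
    return [m for k, m in enumerate(genome) if k not in doomed]


def make_test_set(n, reversals, deletions, seed=0):
    """Return the identity ancestor and one derived genome per (r, d) pair."""
    rng = random.Random(seed)
    ancestor = list(range(1, n + 1))
    genomes = [
        apply_deletions(apply_reversals(ancestor, r, rng), d, rng)
        for r, d in zip(reversals, deletions)
    ]
    return ancestor, genomes


if __name__ == "__main__":
    # Deletions only in G1, the case that gave the prototype trouble.
    ancestor, (g1, g2, g3) = make_test_set(20, [3, 3, 3], [8, 0, 0], seed=1)
    print(len(g1), len(g2), len(g3))  # 12 20 20
```

Fixing the seed makes a data set reproducible, so the same genomes can be fed to later versions of the prototype.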

The results of the experiments showed that when the number of deletions was about the same for all three genomes, the prototype did a relatively good job of recreating the correct ancestral median. However, for data sets in which deletions occurred in only one genome, say G1, the algorithm first merges G1 with another genome, and this results in an ancestral median that is more similar to G1 and hence further from the identity. The main reason is that when the number of common genes between two genomes is small, the resulting reversal distance will also be small. This means that two genomes with fewer genes in common will have a smaller reversal distance compared to two genomes with more genes in common! However, in reality the evolutionary distance between two genomes with few genes in common should be larger than that between two genomes with many genes in common. This anomaly is a result of the fact that the current distance does not take into account the similarity (in terms of gene content) of the two genomes being compared.
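The anomaly can be reproduced on a small example. The sketch below is an illustration only: it uses the number of breakpoints as a simple stand-in for the reversal distance (the prototype computes the actual reversal distance), and the function names and the example genomes are made up for this purpose.

```python
def reduce_to_common(g1, g2):
    """Restrict two signed genomes to the markers they share."""
    common = {abs(m) for m in g1} & {abs(m) for m in g2}
    return ([m for m in g1 if abs(m) in common],
            [m for m in g2 if abs(m) in common])


def breakpoints(g1, g2):
    """Count adjacencies of g1 that are not conserved in g2.

    Both genomes must contain the same markers. An adjacency (a, b) is
    conserved if g2 contains a, b or -b, -a consecutively.
    """
    cap = max((abs(m) for m in g1), default=0) + 1
    e1 = [0] + g1 + [cap]  # frame both genomes with fixed end markers
    e2 = [0] + g2 + [cap]
    conserved = set()
    for a, b in zip(e2, e2[1:]):
        conserved.add((a, b))
        conserved.add((-b, -a))
    return sum((a, b) not in conserved for a, b in zip(e1, e1[1:]))


if __name__ == "__main__":
    identity = [1, 2, 3, 4, 5, 6, 7, 8]
    rearranged = [1, -3, -2, 4, -6, -5, 7, 8]  # two reversals applied
    with_deletions = [1, 4, 7, 8]              # markers 2, 3, 5, 6 then lost

    print(breakpoints(*reduce_to_common(identity, rearranged)))      # 4
    print(breakpoints(*reduce_to_common(identity, with_deletions)))  # 0
```

The genome that has undergone both reversals and deletions ends up looking identical to the ancestor once the pair is reduced to common markers, which is the effect described above.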

Improvements to the prototype

Prof Leong suggested that we modify the distance computation to take into account the gene content of the two genomes. Modifying the distance computation is fairly straightforward; however, the current algorithm does not take into account operations that modify the gene content of the genomes. Including additional operations in the program will require extensive changes to the current implementation.
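One possible form of such a distance is sketched below. The exact formula was not fixed in the meeting, so everything here is an assumption for illustration: the rearrangement distance on the common markers is combined with a penalty for every marker present in only one of the two genomes. The names content_aware_distance, base_distance and penalty are hypothetical, and the mismatches function is only a placeholder for the real reversal distance.

```python
def content_aware_distance(g1, g2, base_distance, penalty=1.0):
    """Distance on the common markers plus a penalty per unshared marker.

    base_distance is any function taking two genomes over the same marker
    set (for instance a reversal distance routine). penalty is a tunable
    weight; its value here is an arbitrary choice.
    """
    s1 = {abs(m) for m in g1}
    s2 = {abs(m) for m in g2}
    common = s1 & s2
    r1 = [m for m in g1 if abs(m) in common]
    r2 = [m for m in g2 if abs(m) in common]
    unshared = len(s1 ^ s2)  # markers found in exactly one genome
    return base_distance(r1, r2) + penalty * unshared


def mismatches(a, b):
    """Placeholder distance: number of positions where the genomes differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    identity = [1, 2, 3, 4, 5, 6, 7, 8]
    sparse = [1, 4, 7, 8]  # four markers deleted
    # The common part is identical (distance 0), but 4 markers are unshared.
    print(content_aware_distance(identity, sparse, mismatches))  # 4.0
```

With a penalty of this kind, a genome that has lost many markers is no longer treated as close to the others merely because little is left to compare. It does not by itself make the median procedure handle operations that change the gene content.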

Problems Raised

  • None

Things to report on in next meeting

  • Alternative approaches
 
mgr/mtg_0012.txt · Last modified: 2007/12/29 09:37 (external edit)
 