Experimental results

Simulated data

The following two sections shows some of the initial results we have obtained with the MGR-BR (best rearrangements) and MGR-GT (grow tree) algorithm. These two algorithms are extensions of the basic MGR algorithm proposed in Bourque and Pevzner, 2002 to handle genomes with unequal content.

The main idea behind both algorithms is to retain all the genes but when merging genomes to form an ancestral genome, the set of genes in the ancestral genome is restricted to the set of genes which are common to all the genomes.

The MGR-BR algorithm is a generalization of the first phase of the MGR algorithm which applies “good rearrangements” to reconstruct the phylogeny. In MGR-BR we define a scoring function for a rearrangement on a genome and at each step apply the rearrangement with the highest score. This is repeated until all the genomes converge.

The MGR-GT algorithm makes use of the idea in the second phase of the MGR algorithm which iteratively adds genomes to a partially constucted tree. The genome to be added in each step is the one which is closest to the genomes already in the tree, the addition of the new genomes splits an existing edge in the tree. We can compute the new edges by solving the median problem. The edge to be chosen is one such that the increase in the score of the tree is minimum.

Topological accuracy

Setup

We generate model trees with m leaf nodes using the Beta-splitting model (Aldous 1995) 1) with beta = -1. The length of each edge in the tree is equal to the parameter k. After we have the topology of the tree, we simulate the process of evolution by starting with the a genome with n genes at the root (gene order of the root is that of the identity permutation) and for each edge perform k rearrangements. In order to create genomes with unequal content, we introduce deletions to simulate gene loss. Out of the k rearrangements for each edge, d% of them will be deletions and the rest are reversals. The gene order at the n leaf nodes are then used as input to the algorithms. Since the original MGR algorithm can only deal with genomes with equal content, we will first remove genes which are not present in all genomes from the input data before running MGR.

A tree edge defines a bipartition on the set of leaves, two edges are equivalent if they define the same bipartition. We denote the percentage of false positives as the percentage of edges which are found in the recovered tree but not in the model tree. In this experiment we are interested in how the percentage of false positives for MGR-BR, MGR-GT and MGR are affected by the percentage of deletions.

Results

Comments

  • The false positive rate increases with higher percentage of deletions, this is expected since there is a loss of gene order information
  • MGR-GT is more accurate than MGR-BR and MGR

Scaling up the number of genomes

Results

Accuracy of recovered root

Setup

In this experiment, we are interested in comparing the accuracy of the rearrangement scenario recovered by the algorithms. This is done by comparing the reversal distance of the recovered root with the actual gene order of the root of the phylogeny. Since all the algorithm produce unrooted trees we created datasets with outgroups and make the parent of the outgroup the root of the phylogeny. In order to develop such datasets we first use the method described in the previous experiment to create a binary tree of m - 1 leaf nodes then we perform another k' rearrangements on the root to obtain the outgroup, where k' = average number of rearrangements of the m - 1 leaf nodes from the root.

Results

Comments

  • The accuracy of the recovered root decreases with higher percentage of deletions for a fixed number of rearrangements/edge
  • The accuracy of the recovered root decreases rapidly as the number of rearrangements/edge increases, this is expected since the parimony assumption starts to fail for high rearrangement rate
  • MGR-BR produces a more accurate root than MGR-BR and MGR
1) A short explaination of the Beta-splitting model can be found at Experimentation plan
 
mgr/experimental_results.txt · Last modified: 2007/12/29 09:37 (external edit)
 
Recent changes RSS feed Creative Commons License Donate Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki