Abstract
Genome rearrangement problems have been studied extensively for more than two decades, with the aim of understanding species' evolutionary relationships in terms of long-range genetic mutations at the genome level. While most earlier studies focus on simplified genomes without gene duplicates, thousands of whole-genome sequencing projects have revealed that a genome typically carries multiple gene duplicates distributed in various ways along the genome. Given a source genome and a target genome such that one is a re-ordering of the genes in the other, we measure the evolutionary distance by the minimum number of reversals applied to the source genome to recover all the gene adjacencies in the target genome. We define this optimization problem as sorting by reversals to recover all adjacencies, or SBR2RA for short. We show that SBR2RA is APX-hard and uncover some similarities to and differences from its classic counterpart, the sorting by reversals problem. From the approximability perspective, we present a 2α-approximation algorithm, where α ∈ [1, 2] is the best approximation ratio for a related optimization problem that is suspected to be NP-hard.
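To make the distance defined in the abstract concrete, the following is a minimal brute-force sketch, not the approximation algorithm of the paper. It rests on assumptions the abstract does not spell out: genomes are modeled as unsigned linear sequences of gene labels, an adjacency is an unordered pair of consecutive genes, and "recovering all adjacencies" means that the multiset of adjacencies matches that of the target. The function names `adjacencies`, `reverse`, and `min_reversals` are hypothetical, and the exhaustive search is only feasible for very short genomes.

```python
from collections import Counter, deque


def adjacencies(genome):
    """Multiset of unordered pairs of consecutive genes (assumed definition)."""
    return Counter(tuple(sorted(pair)) for pair in zip(genome, genome[1:]))


def reverse(genome, i, j):
    """Reverse the segment genome[i..j], both ends inclusive."""
    return genome[:i] + genome[i:j + 1][::-1] + genome[j + 1:]


def min_reversals(source, target):
    """Breadth-first search for the fewest reversals after which the source
    has the same adjacency multiset as the target. Exponential in the genome
    length; intended for toy instances only."""
    goal = adjacencies(target)
    start = tuple(source)
    if adjacencies(start) == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        genome, dist = queue.popleft()
        n = len(genome)
        for i in range(n):
            for j in range(i + 1, n):
                nxt = reverse(genome, i, j)
                if nxt in seen:
                    continue
                if adjacencies(nxt) == goal:
                    return dist + 1
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # not reached when source is a re-ordering of target


# One reversal of the middle segment turns 1 3 2 4 into 1 2 3 4.
print(min_reversals(("1", "3", "2", "4"), ("1", "2", "3", "4")))

# With a duplicated gene, two different orderings can already share
# every adjacency, so no reversal is needed under this definition.
print(min_reversals(("a", "b", "c", "a", "d"), ("a", "c", "b", "a", "d")))
```

Under these assumptions the first call yields 1 and the second yields 0. The second case hints at how duplicates can separate this problem from classic sorting by reversals, where the source must be transformed into the target ordering itself.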
Original language | English |
---|---|
Pages (from-to) | 1170-1190 |
Number of pages | 21 |
Journal | Journal of Combinatorial Optimization |
Volume | 37 |
Issue number | 4 |
State | Published - May 1 2019 |
Scopus Subject Areas
- Computer Science Applications
- Discrete Mathematics and Combinatorics
- Control and Optimization
- Computational Theory and Mathematics
- Applied Mathematics
Keywords
- Alternating cycle
- Gene adjacency
- Genome rearrangement
- Maximum matching
- Sorting by reversals