05 Dec 2022
14:00 Master's Defense IC3 Auditorium
Title
Extensions and Applications of Genomic Rank Distance
Student
Lucas Peres Oliveira
Advisor
João Meidanis
Brief summary
From a computational perspective, a genome can be modeled as a collection of oriented segments, commonly called syntenic blocks, which represent regions conserved throughout evolution. These regions are subject to large-scale mutations, known as genome rearrangements, that shuffle the syntenic blocks into different configurations. Over the years, several rearrangement-based distance models have been developed to efficiently compute the evolutionary distance between genomes. Among them, the rank distance models genomes as matrices and uses matrix rank as the distance metric. It is the successor to the algebraic distance, a model that represents genomes as permutations and is based on permutation group theory. Recently, the rank distance has been extended to encompass insertion and deletion events (indels). Although there are efficient algorithms to compute the rank in this context, many results of the matrix theory for genome rearrangements still rely on notions from permutation group theory. Moreover, the results are largely theoretical, and little is known about the biological applicability of this extension of the rank distance. In this work, we consolidate and expand recent results on the extension of the rank distance that considers indel events. In particular, we introduce a data structure called a column graph to develop simpler formulas for computing the rank in linear time. This tool allows the matrix theory for genome rearrangements, and the algorithms derived from it, to be simplified considerably. In addition, we perform phylogenetic inference experiments on simulated data and real genomes to investigate the biological applicability of the rank distance. Our results indicate that the rank distance is competitive with the DCJ-Indel distance, a state-of-the-art method in genome rearrangement. Finally, we present a contribution to the study of the enumeration of sorting scenarios under the rank distance.
Examination Board
Members:
João Meidanis IC / UNICAMP
Guilherme Pimentel Telles IC / UNICAMP
Fábio Henrique Viduani Martinez FACOM / UFMS
Alternates:
Ulisses Martins Dias FT / UNICAMP
Carla Negri Lintzmayer CMCC / UFABC