From Best Matches to Gene Families: How to use paralogs in phylogenomics

Speaker: Peter F. Stadler
Institution: University of Leipzig
Event type: Research, Outreach
When: 06/08/2019
from 17:00 to 18:00
Where: Room A2, Centro Académico Cultural (CAC)

Best match graphs (BMGs) arise naturally as the first processing intermediate in
algorithms for orthology detection. Let T be a phylogenetic (gene) tree and
sigma an assignment of the leaves of T to species. The best match graph (G, sigma)
is a digraph that contains an arc from x to y if the genes x and y reside in
different species and y is one of possibly many (evolutionary) closest
relatives of x compared to all other genes contained in the species sigma(y). I
will give two alternative characterizations of BMGs and show that a minimally
resolved tree that explains a BMG can be reconstructed in cubic time. The
symmetric part of a BMG represents the empirical estimate for the orthology
relation on the gene set as inferred from a reciprocal best match heuristic.
BMGs are therefore close relatives of cographs, which describe perfect
duplication/speciation scenarios. Whenever a BMG deviates from a cograph
structure, this implies that the reciprocal best match heuristic has produced
incorrect orthology assignments. A reasonable approach, therefore, is to correct
the data by editing the BMG into its nearest cograph. Cographs, in turn, are
equivalent to event-labeled gene trees that identify duplication and speciation
events. These trees also impose constraints on the species tree and the
possible reconciliation maps. Taken together, therefore, it is possible to
start from reciprocal best matches of the proteomes of a set of species and
eventually arrive at the phylogenetic tree of these taxa without the use of a
conventional tree reconstruction method. In fact, an analysis of the workflow
shows that it only makes use of gene duplication events, while sets of 1-1
orthologs do not contribute at all. In this sense the approach is
orthogonal to classical phylogenetic methods.
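
To make the definition of a best match concrete, the following minimal Python
sketch derives the arcs of a BMG from a known gene tree, and then its symmetric
part (the reciprocal best matches). It is an illustration only, not the
algorithm presented in the talk: it goes from tree to graph by brute force,
whereas the abstract concerns the reverse direction. The toy tree, the
parent-pointer encoding, and the names ancestors, lca_depth, best_match_arcs
and reciprocal_best_matches are all assumptions made for this example.
"Evolutionarily closest" is read here as "the last common ancestor with x lies
deepest in the tree".

```python
def ancestors(parent, node):
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lca_depth(parent, x, y):
    """Depth (edges below the root) of the last common ancestor of x and y."""
    anc_x = ancestors(parent, x)
    anc_y = set(ancestors(parent, y))
    for i, node in enumerate(anc_x):
        if node in anc_y:
            # anc_x ends at the root, so the root has depth 0.
            return len(anc_x) - 1 - i
    raise ValueError("nodes do not belong to the same tree")


def best_match_arcs(parent, sigma):
    """Arcs (x, y) such that y is a best match of x in species sigma[y]."""
    arcs = set()
    for x in sigma:
        deepest = {}  # species -> (depth of LCA with x, genes attaining it)
        for y in sigma:
            if sigma[y] == sigma[x]:
                continue  # best matches are only defined across species
            d = lca_depth(parent, x, y)
            s = sigma[y]
            if s not in deepest or d > deepest[s][0]:
                deepest[s] = (d, {y})
            elif d == deepest[s][0]:
                deepest[s][1].add(y)  # ties: several equally close relatives
        for _, genes in deepest.values():
            arcs.update((x, y) for y in genes)
    return arcs


def reciprocal_best_matches(arcs):
    """Symmetric part of the graph: unordered pairs with arcs in both directions."""
    return {frozenset((x, y)) for (x, y) in arcs if (y, x) in arcs}


# Hypothetical toy gene tree, given as child -> parent.
# r: speciation separating species C from A and B
# u: duplication (followed by loss of one copy in species B)
# v: speciation separating A and B
parent = {"a1": "v", "b1": "v", "v": "u", "a2": "u", "u": "r", "c1": "r"}
sigma = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}

arcs = best_match_arcs(parent, sigma)
rbm = reciprocal_best_matches(arcs)

print(sorted(arcs))
print(sorted(tuple(sorted(pair)) for pair in rbm))
```

In this toy example the arc from a2 to b1 has no reverse arc, because b1 has a
closer relative (a1) in species A. The pair a2, b1 is therefore not a reciprocal
best match, which agrees with the tree: their last common ancestor is the
duplication u, so they are paralogs rather than orthologs.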