Special Lecture: Assembly Strategies for Large Genomes.
Prof. Dr. Mario Caccamo, The Genome Analysis Centre, Norwich, UK, in the 2010 Graduate Seminar Series, on 07/12/2010 at 10:00, Auditório do IC, Sala 85 - IC 2.
| What | Lecture |
|---|---|
| When | 07/12/2010 from 16:00 to 17:00 |
| Where | Auditório do IC - Sala 85 - IC 2 |
Next-generation sequencing (NGS) technologies are characterised by the capacity to generate data at very high rates (up to 20 Gb per day). The sequence reads, however, are short. The availability of high-quality reference genomes for model organisms such as human and mouse has been central in establishing these technologies as the tool of choice for population genetics studies based on re-sequencing. The ability to generate de novo assemblies from short reads for large eukaryote genomes, however, remains a challenge. Most current assembly tools struggle to deal with the massive datasets generated by these technologies. The latest assemblers, such as SOAPdenovo, have been designed to represent these datasets efficiently in main memory, but in general they generate a large number of relatively short contigs, which makes ordering and orienting them difficult. One approach to resolving the architecture of the underlying genome is to generate paired-end reads from libraries with varied insert sizes. This has proven to be an important resource when trying to build scaffolds from smaller contigs across repeats. A similar approach can be used to explore the longer reads promised by the emerging single-molecule technologies. In this talk I will introduce Cortex, a memory-efficient de Bruijn graph implementation that can work as a de novo assembly tool and also provides a novel approach for variant discovery and genotyping.

====================================================================

Dr Mario Caccamo is the Head of Bioinformatics at The Genome Analysis Centre (TGAC). His research interests focus on the development of efficient algorithms and software tools for the assembly and annotation of genomic sequences. Mario is an MSc graduate of IC-UNICAMP, where he worked under the supervision of Prof Tomasz Kowaltowski on the implementation of software tools for the analysis of natural language data.
Mario obtained his PhD in Theoretical Computer Science in 2003 from the University of Aarhus (Denmark) for his work on the formalisation of a calculus for category theory. Mario's first appointment after his PhD was at the Wellcome Trust Sanger Institute (Cambridge, UK), where he worked as a bioinformatician on the genome projects for the model organism Danio rerio (zebrafish) and for Sus scrofa (pig). Mario joined the European Bioinformatics Institute in July 2007 to work on the development of the European Genome-phenome Archive (EGA). The aim of this project was to implement a public repository for clinical data that is subject to consent agreements. Mario has also been the EBI delegate to the Genome Reference Consortium (GRC). This group's activities are centred on implementing the tools and data standards required to maintain high-quality genome references, including the human and mouse sequences. Mario joined TGAC to take up his current position in July 2009. Mario is also an honorary lecturer at the University of East Anglia (UEA) and holds a faculty position at the John Innes Centre; both institutions are in Norwich, UK.

====================================================================

Organizer: Prof. Ariadne Maria Brito Rizzoni Carvalho (ariadne@ic.unicamp.br)
IC -- Unicamp
Phone: (019) 3521-5864

====================================================================
