2013

National,  COM (talks)

GDR 3546 'Elements Génétiques Mobiles', December 9-10, 2013, Paris, France.

10 Dec 2013   TEs Annotation strategies for large genomes

V Jamilloux, S Arnoux, T Chaumier, M Moissette, O Inizan, H Quesneville

Recent advances in sequencing technologies now make it possible to sequence increasingly large genomes at reduced cost. Transposable elements (TEs) are the most structurally dynamic components of these large genomes and make up the largest portion of their nuclear sequences, e.g. 85% of the maize genome (Schnable et al. 2009) and 88% of the wheat genome (Choulet et al. 2010).

However, TE annotation in large genomes is a major challenge: it must be as automated as possible, and its results must be compared with a reference annotation or an intra-specific annotation.

To this end, we designed several strategies using the REPET package (Flutre et al. 2011; currently v2.2), which provides two pipelines: TEdenovo, which builds a de novo consensus library classified according to the Wicker et al. classification, and TEannot, which annotates the copies of these consensus sequences in the genome.

These strategies rest on the idea that large plant genomes are mostly made up of a few TE families that are easy to find because of their high copy number. We present three strategies and evaluate them against a reference annotation.

Case A: reference annotation. The genome is annotated with a reference library.

Case B: iterative approach. Take the copies annotated in Case A and splice them out of the genome to obtain a reduced genome, build another de novo library on the spliced genome, and annotate the initial genome with these two libraries.

Case C: de novo iterative approach. Build a de novo consensus library (LTR), annotate its copies, splice them out to reduce the genome, build another de novo library on the spliced genome, and annotate the initial genome with these two libraries.

Case D: build a de novo consensus library on a partial genome (the longest contigs) and annotate the initial genome with this library.
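
The two genome-reduction steps used in these cases (splicing annotated copies out of a sequence, and keeping only the longest contigs) can be illustrated with a minimal Python sketch. This is not REPET code and does not reflect how REPET does it internally. The function names, the 0-based half-open coordinates, and the `target_bp` cutoff are all assumptions made for illustration. Real annotation coordinates (for example GFF3, which is 1-based and inclusive) would need to be converted first.

```python
from typing import Dict, List, Tuple

# Assumption: copy coordinates are 0-based, half-open [start, end).
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def splice_copies(sequence: str, copies: List[Interval]) -> str:
    """Return the sequence with every annotated copy removed (hypothetical helper)."""
    kept = []
    cursor = 0
    for start, end in merge_intervals(copies):
        kept.append(sequence[cursor:start])  # keep what lies before the copy
        cursor = end                         # skip the copy itself
    kept.append(sequence[cursor:])           # keep the tail after the last copy
    return "".join(kept)


def longest_contigs(contigs: Dict[str, str], target_bp: int) -> List[str]:
    """Pick contig names, longest first, until at least target_bp bases are collected."""
    selected: List[str] = []
    total = 0
    for name in sorted(contigs, key=lambda n: len(contigs[n]), reverse=True):
        if total >= target_bp:
            break
        selected.append(name)
        total += len(contigs[name])
    return selected
```

In Cases B and C, the output of a function like `splice_copies` would stand in for the reduced genome on which the second de novo library is built. In Case D, the contigs returned by `longest_contigs` would form the partial genome. The cutoff criterion actually used to choose the longest contigs is not stated here, so `target_bp` is only a placeholder.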

For this methodological development, we chose wheat, an allohexaploid with three homoeologous genomes. It is one of the largest plant genomes, at ~17 Gbp with 88% TEs (Choulet et al. 2010). We started with chromosome 3B, the first to be fully sequenced.


Keywords: detection, wheat, genome, annotation, transposable element, REPET

Creation date: 12 Dec 2013