3rd International Conference GIETE 2012, February 24-28 2012, Asilomar Pacific Grove, California, USA
24 Feb 2012
An iterative process for TEs annotation in large genomes
V Jamilloux, S Arnoux, O Inizan, H Quesneville
The recent successes of new sequencing technologies now allow very large genomes to be sequenced at reasonable cost. Transposable elements (TEs) are the most structurally dynamic components of these nuclear genomes and make up their largest portion, e.g. 85% of the maize genome (Schnable et al. 2009) and 88% of the wheat genome (Choulet et al. 2010). TE annotation should therefore be considered a major task in genome projects, but it currently cannot be obtained automatically. This crucial step is now a bottleneck for many genome analyses.
In this context, we have improved the REPET package (Flutre et al. 2011) in its v2.0 release; see the poster “REPET v2” (Arnoux et al. 2012) at this conference. REPET gathers two pipelines: TEdenovo builds a TE library, and TEannot annotates TE copies in the genome. We also test a new strategy dedicated to annotating very large genomes: the iterative approach.
The rationale is that these large genomes consist mostly of a few TE families that invaded recently. These families will be detected in the first step, which will allow the genome to be reduced by a large factor. We tested this approach on A. thaliana and will present the benchmarks obtained. We will also present preliminary results on chromosome 3B of wheat (293890 contigs, 975 Mbp), the first fully sequenced chromosome of an ~17 Gbp allohexaploid genome.
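The rationale above can be illustrated with a toy sketch. It is not the REPET implementation and does not call TEdenovo or TEannot: exact k-mer counting stands in for de novo family detection, string removal stands in for annotating and setting aside TE copies, and the function names, `k`, and `min_copies` are hypothetical choices made only for this illustration. It shows one possible reading of the iterative idea: detect the most abundant family, remove its copies, then repeat on the reduced genome.

```python
from collections import Counter


def most_repeated_kmer(genome: str, k: int) -> tuple[str, int]:
    """Toy stand-in for de novo detection (NOT TEdenovo).

    Returns the most frequent k-mer and its non-overlapping copy count.
    """
    if len(genome) < k:
        return "", 0
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    kmer, _ = counts.most_common(1)[0]
    return kmer, genome.count(kmer)


def iterative_annotation(genome: str, k: int, min_copies: int):
    """Hypothetical iterative loop: detect, remove copies, repeat.

    Returns the toy "TE library" and the genome size after each round.
    """
    library = []
    sizes = [len(genome)]
    while True:
        kmer, copies = most_repeated_kmer(genome, k)
        if not kmer or copies < min_copies:
            break  # nothing abundant enough is left
        library.append(kmer)
        # Toy stand-in for annotation (NOT TEannot): drop every exact copy.
        genome = genome.replace(kmer, "")
        sizes.append(len(genome))
    return library, sizes


# Made-up genome: family A in 4 copies, family B in 3, short unique spacers.
te_a, te_b = "ACGTTGCA", "GGATCCTT"
toy_genome = (te_a + te_b + "C" + te_a + "T" + te_b + "GA"
              + te_a + "CC" + te_b + "TG" + te_a)

library, sizes = iterative_annotation(toy_genome, k=8, min_copies=3)
print(library)  # ['ACGTTGCA', 'GGATCCTT']
print(sizes)    # [64, 32, 8]
```

In this made-up example the most abundant family is found first and its removal halves the sequence, so the second round works on a much smaller input. Real TE copies are divergent, fragmented, and nested, so exact string matching is only a placeholder for the similarity-based detection and annotation that real pipelines perform.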