2011

National,  COM (talks)

CNET, 17th edition, 4-6 July 2011, Lyon, France

06 Jul 2011   Use cases and improvements of REPET package: pipeline and tools to identify and annotate TEs in genomic sequences

V. Jamilloux, O. Inizan, S. Arnoux, T. Flutre, C. Hoede, N. Choisne, H. Quesneville

Transposable elements (TEs) are now known to play a major role in genome structure, function and evolution. This role stems from their particular dynamics, which can be described as follows: after a chromosomal rearrangement, a TE copy duplicates itself and invades the host genome during a ‘burst’, followed by a degenerative phase in which each copy accumulates indels and substitutions over time. Their nature and mobility make TEs very particular biological elements, so they must be identified and annotated rigorously with tools that take this model into account.
Furthermore, existing detection tools are each dedicated to one TE category, whereas a genome contains a large diversity of TEs.
In addition, given the sharp increase in genome sequencing brought by NGS, the TE annotation process needs to be automated and optimized, and biologists need support for manual curation.
To cover these needs, we develop the REPET package, which includes two pipelines and a set of tools. The first pipeline, TEdenovo, builds a TE library from the genomic sequence under study. This de novo detection proceeds in successive steps: it self-aligns the genome to obtain high-scoring segment pairs (HSPs), clusters them and defines a consensus for each cluster, detects TE features in each consensus and classifies it, and finally filters the consensus sequences and defines TE families. The second pipeline, TEannot, annotates TE copies across the whole genome by mapping the TE library obtained with TEdenovo, filtering the HSPs and connecting the fragments, detecting microsatellites, and combining the results into files that biologists can use to visualize the annotation in genome browsers.
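
As a rough illustration of the cluster-to-consensus idea in TEdenovo, the toy Python sketch below builds a majority-rule consensus for each cluster of already aligned copies and drops clusters with too few copies. It is not REPET code: the function names, the gap-padded input format and the `min_copies` threshold are assumptions made for this example only, and the real pipeline relies on dedicated alignment and clustering programs.

```python
from collections import Counter


def majority_consensus(aligned_copies):
    """Majority-rule consensus of equal-length, gap-padded sequences.

    Columns whose most frequent symbol is a gap ('-') are dropped.
    """
    if not aligned_copies:
        raise ValueError("need at least one sequence")
    length = len(aligned_copies[0])
    if any(len(seq) != length for seq in aligned_copies):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_copies):
        # Most frequent symbol in this alignment column.
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != "-":
            consensus.append(symbol)
    return "".join(consensus)


def build_library(clusters, min_copies=3):
    """Return {cluster name: consensus}, skipping small clusters.

    min_copies is a hypothetical filter chosen for this sketch.
    """
    return {
        name: majority_consensus(copies)
        for name, copies in clusters.items()
        if len(copies) >= min_copies
    }


# Toy input: two clusters of pre-aligned copies (made-up sequences).
toy_clusters = {
    "famA": ["ACGT-A", "ACGTTA", "ACCT-A"],
    "famB": ["GG", "GG"],
}
toy_library = build_library(toy_clusters)
```

Here `famB` is discarded because it has fewer than `min_copies` members, which mimics in a very simplified way the filtering of poorly supported consensus sequences.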

These pipelines were developed and validated on model organisms: D. melanogaster, A. thaliana and Oryza sativa. To run the TEdenovo and TEannot pipelines on large genomes such as V. vinifera, we have significantly improved and optimized their resource usage (e.g. computation time), and we continue to automate some processing steps to help biologists with manual curation.

To demonstrate the value and efficiency of the REPET pipelines, we present use cases illustrating the analysis of genome assemblies from different species (plants, fungi and pests), with genome sizes ranging from 40 Mb to 700 Mb, sequenced with different technologies (Sanger or NGS) and by different approaches (“BAC to BAC” or whole-genome shotgun).


Keywords: REPET, transposable element, annotation
Update: 20 Dec 2013
Creation date: 08 Jul 2011