Publications

International, COM (communication), 07 Jul 2023 [hal-04156193] REPET novelties: a versatile and modular package

The detection and annotation of transposable elements (TEs) are now considered mandatory for any genome sequencing project. To this end, the REPET package integrates bioinformatics pipelines dedicated to detecting, annotating and analysing TEs in genomic sequences. The two main pipelines are (i) TEdenovo, which searches for interspersed repeats, builds consensus sequences and classifies them according to TE features, and (ii) TEannot, which mines a genome with a library of TE sequences, for instance the one produced by the TEdenovo pipeline, to provide TE annotations. The REPET package is under continuous improvement. Several implementations and algorithms for reducing the time required to analyse large genomes have been tested. With our new speed improvements and tuned annotation strategies, REPET can now readily annotate and analyse large genomes of up to 3 Gb. We now chain all required steps through a process called "Repet-Factory". This process uses parameters optimized for specificity and computing time. It can annotate several genomes successively in batches, with all the traceability required for reproducibility. We have also simplified the distribution of REPET by developing a Docker image. For HPC usage, REPET is currently being developed in Snakemake with dependencies in Apptainer.

In ProdINRA
