
Data set from the paper
"Traces of transposable elements in genome dark matter co-opted by flowering gene regulation networks." By Agnès Baud, Mariène Wan, Danielle Nouaud, Dominique Anxolabéhère, Hadi Quesneville
==================================================================

*AlyrCrubTparAalpBrapAtha_TEannot2.gff3.fa.cleaned:
Genomic TE copies from Arabidopsis thaliana (Col-0 ecotype), Arabidopsis lyrata, Capsella rubella, Arabis alpina, Brassica rapa and Schrenkiella parvula.

Annotations were produced with the REPET package v2.5, using its two pipelines, TEdenovo and TEannot. On each genome, the similarity branch of TEdenovo was run with default parameters, followed by a TEannot run with default parameters (sensitivity 2). From this first annotation, we selected the consensus sequences that have at least one full-length copy (i.e. a copy aligned over more than 95% of the consensus length) and used them to run a second TEannot pass.
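The full-length selection rule above can be illustrated with a minimal Python sketch. It is not part of REPET: the `Copy` record, the function name and the input structures are hypothetical, and the only thing taken from the description is the "more than 95% of the consensus length" criterion.

```python
from collections import namedtuple

# Hypothetical record: one annotated TE copy and the number of consensus
# bases it aligns to. This is NOT a REPET data structure.
Copy = namedtuple("Copy", ["consensus", "aligned_length"])


def consensus_with_full_length_copy(copies, consensus_lengths, min_fraction=0.95):
    """Return the names of consensus sequences that have at least one copy
    aligned over more than `min_fraction` of the consensus length."""
    kept = set()
    for copy in copies:
        length = consensus_lengths[copy.consensus]
        if copy.aligned_length > min_fraction * length:
            kept.add(copy.consensus)
    return kept


# Toy data, for illustration only.
copies = [Copy("consA", 960), Copy("consA", 100), Copy("consB", 900)]
consensus_lengths = {"consA": 1000, "consB": 1000}
selected = consensus_with_full_length_copy(copies, consensus_lengths)
```

With this toy input only `consA` is selected, because `consB` has no copy covering more than 95% of its length.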

*Duster.gff
Duster TE copy annotation produced with AlyrCrubTparAalpBrapAtha_TEannot2.gff3.fa.cleaned

*Duster-specific_filtered.gff
Duster TE copy annotations that do not overlap any gene, TAIR10 TE, Brassicaceae, or A. thaliana REPET annotation.

*TAIR10TEs-specific_filtered.gff
TAIR10 TE copy annotations that do not overlap any gene, Duster, Brassicaceae, or A. thaliana REPET annotation.

*Brassicaceae.gff3
“Brassicaceae” TE annotation obtained in a previously published study: Maumus F, Quesneville H. Ancestral repeats have shaped epigenome and genome composition for millions of years in Arabidopsis thaliana. Nat Commun. 2014;5: 4104. doi:10.1038/ncomms5104

*Brassicaceae-specific_filtered.gff
Brassicaceae TE annotations that do not overlap any gene, TAIR10 TE, Duster, or A. thaliana REPET annotation.
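
The "-specific_filtered" files listed above keep only the features of one annotation that overlap nothing in the other sets. The Python sketch below shows one way to express such a filter. It is not the procedure used to build these files: the function names are hypothetical, and it assumes that any overlap of at least one base on the same sequence is enough to discard a feature (the text does not state a threshold or the tool used).

```python
def parse_gff_line(line):
    """Return (seqid, start, end) from one tab-separated GFF feature line.
    GFF coordinates are 1-based and inclusive."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[3]), int(fields[4])


def overlaps(a, b):
    """True if two (seqid, start, end) features share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def keep_specific(candidates, others):
    """Keep the candidate features that overlap none of the `others`.
    Assumption: a single shared base counts as an overlap."""
    return [c for c in candidates if not any(overlaps(c, o) for o in others)]


# Toy data, for illustration only.
candidates = [("Chr1", 100, 200), ("Chr1", 500, 600), ("Chr2", 100, 200)]
others = [("Chr1", 150, 300), ("Chr2", 300, 400)]
specific = keep_specific(candidates, others)
```

This pairwise comparison is quadratic in the number of features and is only meant to make the rule explicit. On whole-genome annotations, an interval tool such as `bedtools intersect -v` would be a more practical choice.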

