Study : Gossypium arboreum cultivar:Shixiya-1 Genome sequencing


Because the G. arboreum genome has a high content of repetitive sequences, assemblies based on short-read sequencing technology remain problematic. In this study, we structurally improved the assembly of the A-genome species G. arboreum by integrating long PacBio reads and Hi-C sequencing data. A total of 142.54 Gb of raw PacBio reads and 78.13 Gb of clean Hi-C data were generated, corresponding to approximately 77.6-fold and 45-fold genome coverage, respectively. Illumina paired-end reads were also used to correct base calls in the assembly and increase its accuracy. We then used the Hi-C reads to anchor the contigs onto 13 pseudo-chromosomes for a chromosome-level assembly. The resulting PacBio-based assembly has minimized sequence gaps, and its contigs have been correctly reordered and reoriented, significantly decreasing the number of incongruities.
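The coverage figures above follow the usual relation: fold coverage = total sequenced bases / genome size. As a rough illustration, the short sketch below inverts that relation to show the genome size each data set implies. The function name `implied_genome_size_gb` is hypothetical, and the study text does not state the genome size used for its coverage estimates, so the results are only back-of-the-envelope values derived from the reported numbers.

```python
def implied_genome_size_gb(total_gb: float, fold_coverage: float) -> float:
    """Return the genome size (Gb) implied by coverage = total bases / genome size."""
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_gb / fold_coverage


# Figures reported in the study description.
pacbio_size = implied_genome_size_gb(142.54, 77.6)  # raw PacBio reads
hic_size = implied_genome_size_gb(78.13, 45.0)      # clean Hi-C data

print(f"PacBio-implied genome size: {pacbio_size:.2f} Gb")  # 1.84 Gb
print(f"Hi-C-implied genome size:   {hic_size:.2f} Gb")     # 1.74 Gb
```

The two estimates differ slightly, which is consistent with the coverage values being reported as approximate.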


