The URGI newsletter #1

The URGI newsletter #1 Jun 2011

Editorial

In the early 21st century, communication has become strategic, and scientific platforms are no exception. Besides conference talks and the organization of training sessions on URGI tools, our first step in 2010 was to improve our web site. Its design and ergonomics were reviewed to make it more attractive. Our "private" and "public" web sites were merged into a single site to simplify user navigation.

 

Today, URGI launches a second web medium: a newsletter. By diversifying our communication channels, we expect to reach more people, particularly those who do not usually visit our web site. Each newsletter will summarize the most important news since the previous issue, with articles of particular interest that we wish to share with our users.

 

For this first newsletter, we highlight our exceptional participation in several genome projects in 2010, which led to key publications in scientific journals with high impact factors. This illustrates URGI's skills in gene and repeat annotation and in the evolutionary analysis of transposable elements. These rare skills make us appealing partners for genome projects.

 

We hope that you will find this newsletter interesting and useful for your work, and that it keeps you up to date on URGI activities.

 

Hadi Quesneville

Publications

News

Events

Ready for Next-Gen Sequencing!

High-throughput sequencing is becoming increasingly popular. For less than 10 k€, a lab can sequence more than 25 million reads. This new and affordable tool has many applications: de novo sequencing, resequencing (for detecting genomic diversity), RNA-Seq (detecting mRNAs or short non-coding RNAs), ChIP-Seq (chromatin immunoprecipitation followed by sequencing), RIP-Seq (the same principle, except that the RNA bound to the protein of interest is sequenced), Hi-C (for chromosome conformation capture), etc.

Although the technology is appealing (see how many sequencers there are around the world), computers are now indispensable for analyzing the sequencing data. The sheer size of the data produced by a sequencing run rules out any manual analysis. As a consequence, URGI is increasingly asked to analyze high-throughput sequencing data.
We are currently hosting four projects involving high-throughput sequencing. The first two, GrapeReSeq and Muscares, investigate genomic variability in Vitis vinifera and close relatives. The third uses RNA-Seq data to assess the role of transposable elements in transcription in Drosophila melanogaster. The fourth explores the epigenetic diversity of Arabidopsis thaliana with RNA-Seq. A further project has recently started, in which RNA-Seq, ChIP-Seq and RIP-Seq will help us understand epigenetic regulation in Arabidopsis thaliana.
Since URGI is a "dry lab" (there is no bench whatsoever here), we mainly work in collaboration with other labs, which produce the material and help interpret the data. The first two projects, which are the largest, are international collaborations involving Spain, Italy, Germany and France, in which our lab is in charge of SNP detection. In these projects, more than 30 lanes of Solexa Genome Analyzer data will be produced by the EPGV and analyzed by our lab. The other projects are carried out in collaboration with other labs, mainly IJPB.
The data we currently handle are quite diverse: DNA sequencing and (long and short) RNA sequencing, with ChIP-Seq and RIP-Seq data arriving soon. So far, we have analyzed Solexa Genome Analyzer and Roche Genome Sequencer reads, single-end or paired-end, normalized or not, 5' capped or not.
URGI is actively developing tools to analyze the deluge of data we are facing. Two heavily used tools are already available to the community: MapHits and S-MART. MapHits is a pipeline that finds and selects reliable SNPs, then selects those that can be used to design a chip. The pipeline is highly flexible and can be launched through the Galaxy web interface hosted by URGI (it is currently available in restricted access only). S-MART is a toolbox for the analysis of RNA-Seq data. It can be used on your personal computer, or through Galaxy (work in progress). Do not hesitate to contact us if you think that you may need MapHits or S-MART!
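To give a flavor of what "selecting reliable SNPs for a chip" can involve, here is a minimal Python sketch. It is not MapHits code and does not describe its actual algorithm: the `Snp` record, the depth and allele-fraction thresholds, and the 50 bp flanking rule are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """Hypothetical SNP call; the field names are assumptions for this sketch."""
    chrom: str
    pos: int        # 1-based position on the reference
    depth: int      # number of reads covering the position
    alt_reads: int  # reads supporting the alternative allele


def reliable(snp, min_depth=10, min_alt_fraction=0.2):
    """Keep a SNP only if it is well covered and the alternative allele is
    seen in a sufficient fraction of the reads (assumed thresholds)."""
    if snp.depth < max(min_depth, 1):  # also guards against division by zero
        return False
    return snp.alt_reads / snp.depth >= min_alt_fraction


def chip_candidates(snps, flank=50, **thresholds):
    """Among reliable SNPs, keep those with no other reliable SNP within
    `flank` bases on the same chromosome, so that a probe could be placed
    on invariant flanking sequence (an assumed design rule)."""
    kept = sorted((s for s in snps if reliable(s, **thresholds)),
                  key=lambda s: (s.chrom, s.pos))
    result = []
    for i, s in enumerate(kept):
        prev_ok = (i == 0 or kept[i - 1].chrom != s.chrom
                   or s.pos - kept[i - 1].pos > flank)
        next_ok = (i == len(kept) - 1 or kept[i + 1].chrom != s.chrom
                   or kept[i + 1].pos - s.pos > flank)
        if prev_ok and next_ok:
            result.append(s)
    return result
```

A real pipeline would start from read alignments and take base and mapping qualities into account; the sketch only shows the two-stage idea of first filtering for reliability, then for suitability for chip design.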
Another project is to develop a new module within our information system GnpIS, codenamed GnpSeqNGS, dedicated to storing high-throughput sequencing data. This new petal of the GnpIS flower would be a repository for high-throughput sequencing data, closely connected with the other modules, especially GnpSNP (for resequencing data). This new repository will host the data from our collaborations and make them available to the community through simple query interfaces.

Although the future is hard to predict, we expect the number of high-throughput sequencing projects hosted by URGI to increase. This raises two main questions: Will we have the computer hardware to do the work? Will we have enough human resources to do the analysis?
To handle this large amount of data, the lab has invested massively in computer hardware. We currently host a cluster of more than 700 Intel Xeon, with more than 60 terabytes available, a backup system and a high-throughput network between the nodes. And this capacity could grow very soon!
URGI actively collaborates with the other platforms of the APLIBIO network, which gathers 8 platforms of the Île-de-France region. Within this network, we share our experience and our tools to stay up to date with the latest technologies. Relying on this network of experts, we will be able to face the future of next-generation sequencing.

Matthias Zytnicki

© 2024 http://urgi.versailles.inra.fr/