2020
Hello, we are offering an M2 (second-year Master's) internship on bioinformatics approaches for discovering sequences of viral origin in eukaryotic genomes.
Bioinformatics protocols to reveal the genetic footprints of host-virus interactions
Context
Our knowledge of viruses and host-virus relationships is important from both a fundamental and a strategic standpoint. Endogenous viral elements (EVEs) are viral genomes, or fragments thereof, that have integrated into the genome of their host. They represent the genetic footprints of viral infections. Although largely neglected in the annotation of eukaryotic genomes, EVEs are of prime interest for studying host-virus interactions and virus evolution, a field known as paleovirology (Aiewsakun and Katzourakis, 2015). EVEs provide access to hitherto unknown and/or extinct viral sequences in different hosts. These sequences help determine viral host range and, reciprocally, the host virome. For instance, while giant viruses have never been shown to infect plant hosts, we found that the genome of the moss Physcomitrella patens contains clusters of genes with high phylogenetic affinity to homologues from nucleocytoplasmic large DNA viruses (NCLDVs) (Maumus et al., 2014). This surprising finding suggests that extant NCLDVs are, or their relatives were, capable of infecting plants, thereby expanding both the known host range of this virus group and the diversity of the plant virome. In addition, EVEs corresponding to novel viral sequences provide valuable support for studying the macroevolution of viruses. In a recent phylogenomics study, we determined that EVEs from several Caulimoviridae genera are found in virtually all vascular plant genomes, including ferns, gymnosperms and all grades of angiosperms, often at high copy number (Diop et al., 2018). Macroevolutionary analysis of our data led us to propose a working scenario in which the Caulimoviridae would have emerged during the Carboniferous period, about 320 million years ago. Furthermore, EVEs can occasionally reactivate and generate episomal viruses with potential infectivity. Identifying latent EVEs is therefore important for monitoring disease emergence and spread between wild and cultivated plants (Takahashi et al., 2019).
On the other hand, while most EVEs probably lack any function, some may provide a genetic basis for resistance or may have been domesticated to serve other cellular functions. Despite their importance in biology and evolution, EVEs are rarely annotated in eukaryotic genomes and remain challenging to identify.
Objectives
This project aims to explore new bioinformatics protocols for searching for EVEs across eukaryotic genomes. These protocols will take into account two essential concepts so as to allow the greatest flexibility in EVE discovery: the extreme modularity of eukaryotic viruses (Koonin et al., 2015) and our incomplete knowledge of the modern and ancient virosphere.
During the first weeks, the prospective student will review the methods and tools currently available to detect EVEs and horizontal gene transfer events. The goal will be to understand the approaches employed and to evaluate their advantages and limitations from both evolutionary and bioinformatics perspectives. The second part of the internship will be devoted to experimenting with alternative approaches and assessing their relevance as a basis for novel protocols for the automated, large-scale detection of EVEs. Approaches using HMM profiles of viral proteins will be considered more closely. Finally, a workflow will be applied to test genomes, and potential EVEs will be assessed in more depth in order to establish a proof of concept.
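To give a flavour of the kind of post-processing such a workflow might involve, here is a minimal Python sketch. It is an illustration under stated assumptions, not the protocol to be developed: it assumes that viral-protein HMM hits have already been mapped to genomic coordinates by some upstream tool, and the names `Hit`, `Locus` and `candidate_loci`, as well as the e-value and gap thresholds, are hypothetical choices made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Hit:
    """One profile-HMM hit placed on a genome (hypothetical record layout)."""
    contig: str
    start: int
    end: int
    profile: str   # name of the viral protein profile that matched
    evalue: float


@dataclass
class Locus:
    """A genomic region grouping neighbouring hits: a candidate EVE."""
    contig: str
    start: int
    end: int
    profiles: set[str]


def candidate_loci(hits: Iterable[Hit],
                   max_evalue: float = 1e-5,   # assumed threshold
                   max_gap: int = 5000) -> list[Locus]:  # assumed distance in bp
    """Keep significant hits and merge those lying close together on a contig."""
    kept = sorted((h for h in hits if h.evalue <= max_evalue),
                  key=lambda h: (h.contig, h.start))
    loci: list[Locus] = []
    for h in kept:
        last = loci[-1] if loci else None
        if last and last.contig == h.contig and h.start - last.end <= max_gap:
            # Extend the current locus and record the additional profile.
            last.end = max(last.end, h.end)
            last.profiles.add(h.profile)
        else:
            loci.append(Locus(h.contig, h.start, h.end, {h.profile}))
    return loci


if __name__ == "__main__":
    demo = [
        Hit("chr1", 100, 400, "RT", 1e-20),
        Hit("chr1", 900, 1500, "CP", 1e-10),
        Hit("chr1", 50000, 50300, "RT", 1e-8),
        Hit("chr2", 10, 200, "MP", 0.5),   # discarded: e-value too high
    ]
    for locus in candidate_loci(demo):
        print(locus.contig, locus.start, locus.end, sorted(locus.profiles))
```

Loci that combine hits to several distinct profiles could then be prioritised for closer inspection, although, given the modularity of eukaryotic viruses, the right grouping criteria are precisely one of the questions the internship is meant to explore.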
Environment
URGI at INRAE in Versailles is a transdisciplinary unit dedicated to genome analysis and data integration. It comprises about 20 permanent members, including several developers, engineers and researchers. It is internationally recognized for its expertise in the annotation and analysis of selfish genetic elements, including transposable elements and endogenous viruses.
Skills
The prospective student should be enrolled in an M2 (second-year Master's) program or equivalent in bioinformatics. He/She should have a strong interest in evolutionary biology and should be proactive. This project is expected to lead to a PhD thesis project. The working language will be French or English.
Contact: florian.maumus@inrae.fr
Bibliography
Aiewsakun, P., and Katzourakis, A. (2015). Endogenous viruses: Connecting recent and ancient viral evolution. Virology 479-480, 26-37.
Diop, S.I., Geering, A.D.W., Alfama-Depauw, F., Loaec, M., Teycheney, P.Y., and Maumus, F. (2018). Tracheophyte genomes keep track of the deep evolution of the Caulimoviridae. Sci Rep 8, 572.
Koonin, E.V., Dolja, V.V., and Krupovic, M. (2015). Origins and evolution of viruses of eukaryotes: The ultimate modularity. Virology 479-480, 2-25.
Maumus, F., Epert, A., Nogue, F., and Blanc, G. (2014). Plant genomes enclose footprints of past infections by giant virus relatives. Nat Commun 5, 4268.
Takahashi, H., Fukuhara, T., Kitazawa, H., and Kormelink, R. (2019). Virus Latency and the Impact on Plants. Front Microbiol 10, 2764.