Manuela Lovinella

Developing a 3rd-generation genome pipeline to uncover novel metabolic activities in Galdieria: a focus on secreted enzyme activities

My project

Novelty and timeliness

To exploit Galdieria, we must understand the diversity in metabolic capacity it displays. This is significant because 10% of its genome is horizontally acquired [1] and massive genome differences are seen between strains, reflecting whole metabolic islands of encoded enzymes. We surveyed a collection of around 200 Galdieria isolates for growth rate and, alongside the O74W type strain (whose genome is sequenced and publicly available), assembled a sub-collection of ~40 endemic isolates that differ strikingly in metabolic capacity. We have been investigating the potential of Oxford Nanopore’s MinION technology to provide rapid, affordable long-read capacity, and generated 17,800 single-molecule DNA reads from a Galdieria DNA preparation in just 2.5 hours, with a modal read length of 30 kb and many thousands of reads above 80 kb in length. Our longest single-molecule Galdieria read was 330 kb. In total, this run generated 1.2 Gb in only 36 hours. We will develop this sequencing platform, which in itself holds great IB potential, to define the strains with the greatest IB utility. This is the overarching theme of this consortium and builds the network’s foundation.


Collaborations are ongoing with the two supervisors of this subproject to generate >10X genome coverage for all Galdieria isolates in York using the HiSeq3000 system; we have also already sequenced and assembled one genome using the new 3rd-generation MinION. In phase 1 of the project, the student will use these short-read HiSeq3000 runs as a starting point for genome discovery. Development of the MinION platform is ongoing, and the student will generate ~20X coverage of between 12 and 24 genomes, selected according to the genome diversity seen in the HiSeq3000 runs. From there, assembly and annotation will lead to metabolic predictions using the innovative pipeline that exploits genome-scale metabolic networks [2]. In phase 2, special attention will be given to producing informatic descriptions of secreted enzymes that can function at very low pH and elevated temperature. From the large diversity of annotated and classified enzymes, homology searches will be used to identify candidates such as xylanases, cellulases, proteases and oxidative enzymes. In phase 3, in collaboration with the other two subprojects, the student will perform enzyme characterisation and develop information on process-production performance. Given the conditions under which Galdieria grows, all identified enzymes must function at high temperature and very low pH. These are therefore expected to be among the most heat-resistant enzymes ever isolated from a eukaryote, given that Galdieria lives at the absolute temperature limit of eukaryotic life.
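
As a rough planning aid, the sequencing effort implied by the ~20X target can be estimated from the run yield reported above (1.2 Gb per MinION run). The sketch below is illustrative only: the genome size of 14 Mb is an assumed placeholder rather than a figure from this proposal, the function names are hypothetical, and the calculation ignores read filtering, uneven yield between runs and barcoding losses.

```python
import math

# Figures taken from the text above.
RUN_YIELD_BP = 1_200_000_000   # 1.2 Gb generated by one MinION run
TARGET_COVERAGE = 20           # ~20X coverage per genome

# ASSUMPTION: placeholder genome size, not stated in this proposal.
ASSUMED_GENOME_SIZE_BP = 14_000_000


def required_bases(genome_size_bp: int, coverage: int) -> int:
    """Bases needed to reach the target mean coverage of one genome."""
    return genome_size_bp * coverage


def runs_needed(n_genomes: int, genome_size_bp: int,
                coverage: int, yield_per_run_bp: int) -> int:
    """Whole runs needed to cover n_genomes, assuming a constant yield per run."""
    total_bp = n_genomes * required_bases(genome_size_bp, coverage)
    return math.ceil(total_bp / yield_per_run_bp)


if __name__ == "__main__":
    # Lower and upper bounds of the planned 12 to 24 genomes.
    for n_genomes in (12, 24):
        runs = runs_needed(n_genomes, ASSUMED_GENOME_SIZE_BP,
                           TARGET_COVERAGE, RUN_YIELD_BP)
        print(f"{n_genomes} genomes at {TARGET_COVERAGE}X: about {runs} run(s)")
```

Under these assumptions the planned 12 to 24 genomes would need on the order of three to six runs of comparable yield; a different genome size or per-run yield changes the estimate proportionally.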