Viral Dark Matter (#103)
Metagenomic sequencing has significantly advanced our ability to characterize the genetic and metabolic capacities of viral and microbial communities. The first viral metagenomes (viromes), derived from sampling two near-shore communities, showed that >75% of the sequences were unknowns. Since then, GenBank has grown by an order of magnitude, but the fraction of unknowns in viromes still exceeds 70%. By the same criterion, only about 15% of the sequences in the microbial fractions from many environments are unknowns, and of that 15%, many are proviruses or ORFans, which are also probably of viral origin. This high percentage of viral unknowns, combined with the observation that viruses outnumber their hosts by more than ten to one, means that viruses are probably the largest sector of unexplored diversity in the biosphere. We have launched a project to determine what this viral dark matter is doing, using a combination of metaproteomics, high-throughput crystallization, and metabolomics. I will discuss the approach and some of the more exciting findings from this work.