The computational technique can identify subtle differences between the metabolic processes of the different microbial communities, and the information can be used to detect how the microbes are adapting to changes in the environment, such as the effects of global warming.
"It is also extremely relevant for human health. It is important to understand how a change in the microbial communities could affect us, and they are affected by global warming," Folker Meyer, a researcher at Argonne, told LabTechnologist.com. "We need to know which metabolic processes are present; we are now looking at DNA in the world around us to find new pathways to this information."
Researchers are now turning to the growing discipline of metagenomics to find this information. Whereas traditional genetics techniques would require each microbe in a sample to be cloned and studied separately, metagenomics takes a raw sample, made from a mixture of different organisms, and sequences all of the genomes present in the sample at once. The genetic information obtained is then pieced together, to identify the different organisms that are present in the sample, and to identify the metabolic processes they are exhibiting.
So far the researchers have analysed the frequency distributions of more than 14 million microbial and viral sequences from almost 90 different ecological communities. Piecing together all of the DNA fragments present in a sample is extremely demanding computationally, so the scientists at Argonne have developed computational shortcuts that reduce the number of calculations required, along with high-performance computing infrastructure to carry them out faster.
The high-throughput pipeline used for this study, called the metagenomics RAST server, compares the sequences found in the sample against the SEED database of known microbial genetic information, both to identify which organisms are present and to find genes known to encode metabolic enzymes. This tells the researchers which enzymes the microbes could use for their metabolism, suggesting which metabolic processes were active in the sample.
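To give a rough sense of what comparing sample sequences against a reference collection involves, the following is a minimal Python sketch. It is not the RAST server's method: it swaps in a simple count of shared exact k-mers (short subsequences) in place of a real similarity search, and the reference entries, function names and thresholds are all hypothetical.

```python
from collections import Counter

# Hypothetical toy reference: function label -> known gene sequence.
# These are made-up sequences, not entries from the SEED database.
REFERENCE = {
    "enzyme_A": "ATGGCGTACGTTAGC",
    "enzyme_B": "ATGCCCTTTAAAGGG",
}


def kmers(seq, k=5):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(read, reference=REFERENCE, k=5, min_shared=2):
    """Return the reference label sharing the most k-mers with the read,
    or None if no reference shares at least min_shared k-mers."""
    read_kmers = kmers(read, k)
    scores = {
        name: len(read_kmers & kmers(ref, k))
        for name, ref in reference.items()
    }
    name, score = max(scores.items(), key=lambda item: item[1])
    return name if score >= min_shared else None


def function_profile(reads):
    """Count how many reads were assigned to each reference function."""
    matches = (best_match(read) for read in reads)
    return Counter(m for m in matches if m is not None)
```

Calling `function_profile` on a list of reads returns a tally of matched functions, which is a crude analogue of the picture of active metabolic processes described above. Reads that resemble nothing in the reference are simply left out of the tally.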
Rather than repeating these calculations for the same organism, or for the same molecular processes within different organisms, the system saves resources by performing each calculation only once and reusing the result.
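The idea of computing each result once and reusing it is commonly known as memoization. The sketch below illustrates it in Python using the standard library's `functools.lru_cache`. The `annotate` function and its toy GC-content rule are hypothetical stand-ins for an expensive comparison, and nothing here is taken from the Argonne system itself.

```python
from functools import lru_cache

# Counts how often the "expensive" work is really performed.
CALLS = {"count": 0}


@lru_cache(maxsize=None)
def annotate(sequence: str) -> str:
    """Hypothetical stand-in for a costly annotation step.

    The toy rule labels a sequence by its GC content; a real pipeline
    would run a far more expensive database comparison here.
    """
    CALLS["count"] += 1
    gc_fraction = sum(base in "GC" for base in sequence) / len(sequence)
    return "high-GC" if gc_fraction >= 0.5 else "low-GC"


def annotate_sample(reads):
    """Annotate every read; repeated sequences are served from the cache."""
    return [annotate(read) for read in reads]
```

If a sample contains the same sequence many times, `annotate` does its work only for the first occurrence, and later occurrences are answered from the cache. The `CALLS` counter exists only to make that saving visible.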
"Without the RAST technology this simply wouldn't be possible - the sample sizes are too large," said Meyer. "The [SEED] database at the National Microbial Pathogen Data Resource is a huge resource. They have cleaned up the database for microbes, helping us to reconstruct the metabolic pathways of different communities."