Category Archives: Publications

New paper out: functional metagenomics powered by synthetic biology

Why do functional metagenomics and synthetic biology (or synbio) make such an interesting combination? This week, our new article in Nature Chemical Biology, ‘The evolving interface between synthetic biology and functional metagenomics’, sheds light on how progress in synthetic biology can advance, and already has advanced, the field of functional metagenomics.

We are facing a growing and aging world population, and mankind thus needs new drug molecules and ways to produce nutrients. Instead of using chemical synthesis, drugs and nutrients can be sustainably produced by modified bacteria. Moreover, most of those interesting molecules are already produced by billions of bacteria in the environment. Unfortunately, it is difficult to grow most types of bacteria in a laboratory, and it is therefore not possible to harness their useful capabilities directly. However, bacteria contain all the information needed to produce these valuable molecules in their DNA. Using methods known collectively as ‘functional metagenomics,’ the DNA of these bacteria can be recovered from the environment and used by host bacteria that can be cultivated in a lab. This allows us to make use of the capabilities of the billions of bacteria that are present in the environment without actually growing them, but by directly utilizing their DNA instead.

Construction of a metagenomic library. Environmental DNA is extracted, purified, fragmented and cloned into a shuttle vector. The library of plasmids is then transformed into an expression host such as Escherichia coli. Finally, the resulting clones can be analyzed according to their genotype and/or phenotype.

Which kind of metagenomics should be used?

In practice, there are two ‘metagenomic’ approaches: sequence-based approaches (where environmental DNA is sequenced and a function is assigned computationally) and function-based approaches (where the environmental DNA is transformed into a host bacterium and the genes are expressed and interrogated). In our article, we focused on the functional approach, specifically on interrogating metagenomic DNA using a genetic circuit.

The term “metagenomics” can refer to many different techniques and procedures. In our new article, we focused on using genetic circuits to functionally mine a metagenomic library.

In our publication, we first surveyed the ways in which genetic circuits have been used in the recent past to interrogate metagenomic libraries. Though scientists have been quite creative, researchers need to move from ‘screening’ methods to more high-throughput interrogation methods. ‘Screening’ methods require researchers to painstakingly examine each bacterial colony, for example for a visual change associated with the production of a compound. In more effective high-throughput methods, researchers couple the production of a compound of interest to the survival of the cell. The death of cells lacking the target compound replaces the labor-intensive process of scrutinizing massive numbers of clones. My colleague Hans has previously used this approach to identify new vitamin transporters, as outlined in the illustration below.

Example of a genetic circuit consisting of a riboswitch coupled with two selectable markers, which can be used to mine a metagenomic library for vitamin B1 transporting or producing genes. The genetic switch can also be formalized as an AND-gate with vitamin B1 as the input and cell survival as the output.
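
To make the AND-gate reading concrete, here is a minimal Python sketch. It assumes a simplified model in which both selectable markers are switched on by the same riboswitch, and in which a marker can also be switched on by an escape mutation. The function names and the escape-mutation parameters are hypothetical and purely illustrative; they are not taken from the paper.

```python
def marker_expressed(b1_in_cell: bool, escape_mutation: bool = False) -> bool:
    """Hypothetical model of one selectable marker.

    The riboswitch turns the marker on when vitamin B1 is present in the
    cell; an escape mutation turns it on regardless of B1.
    """
    return b1_in_cell or escape_mutation


def cell_survives(b1_in_cell: bool,
                  escape_a: bool = False,
                  escape_b: bool = False) -> bool:
    """AND gate: the cell survives selection only if BOTH markers are on."""
    return (marker_expressed(b1_in_cell, escape_a)
            and marker_expressed(b1_in_cell, escape_b))


if __name__ == "__main__":
    print(cell_survives(True))                   # B1 imported or produced
    print(cell_survives(False))                  # no B1 in the cell
    print(cell_survives(False, escape_a=True))   # one marker escapes
```

Under these assumptions, a clone without vitamin B1 only survives if both markers escape independently, which is the intuition behind coupling two markers to one switch.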

Insights for improving genetic selection circuits can also be obtained from biocontainment research, as it is notoriously difficult to perform experiments in which all cultured cells commit suicide. For example, the research group led by Farren Isaacs showed how multi-layered circuits can aid in this, and a recent review from the Collins lab summarizes the latest advances in biocontainment systems.

We anticipate that the expansion of synthetic biology tools, such as automated circuit design and computational design of proteins, will usher in greater efficiencies in the mining of functional metagenomics libraries. These advances in functional metagenomics and synthetic biology are already demonstrating remarkable potential in industrial and medical applications. Our full paper, available at Nature Chemical Biology, goes into more depth on all the previously constructed genetic circuits and new technologies that will continue to propel the field forward:

van der Helm, E., Genee, H. J., & Sommer, M. O. A. (2018). The evolving interface between synthetic biology and functional metagenomics. Nature Chemical Biology. https://doi.org/10.1038/s41589-018-0100-x

Other resources

Gallagher, R. R., Patel, J. R., Interiano, A. L., Rovner, A. J., & Isaacs, F. J. (2015). Multilayered genetic safeguards limit growth of microorganisms to defined environments. Nucleic Acids Research, 43(3), 1945–54. https://doi.org/10.1093/nar/gku1378 [-]

Genee, H. J., Bali, A. P., Petersen, S. D., Siedler, S., Bonde, M. T., Gronenberg, L. S., Sommer, M. O. A. (2016). Functional mining of transporters using synthetic selections. Nature Chemical Biology, 12, 1015–1022. https://doi.org/10.1038/nchembio.218 [-]

Lee, J. W., Chan, C. T. Y., Slomovic, S., & Collins, J. J. (2018). Next-generation biocontainment systems for engineered organisms. Nature Chemical Biology, 14(6), 530–537. https://doi.org/10.1038/s41589-018-0056-x [-]

Nielsen, A. K., Der, B. S., Shin, J., Vaidyanathan, P., Densmore, D., & Voigt, C. A. (2016). Genetic circuit design automation. Science, 352(6281), 53–63. https://doi.org/10.1126/science.aac7341 [$]

Taylor, N. D., Garruss, A. S., Moretti, R., Chan, S., Arbing, M., Cascio, D., Raman, S. (2016). Engineering an allosteric transcription factor to respond to new ligands. Nature Methods, 13(2), 177–183. https://doi.org/10.1038/nmeth.3696 [-]

Note: parts of this blogpost are sourced from my PhD thesis

Filed under Publications

Background on the poreFUME pre-print

Last week our pre-print on nanopore sequencing came online at bioRxiv. Nanopore sequencing is a relatively new sequencing technology that is starting to come of age. As part of this process, we started playing with the ONT MinION sequencer last year. This post summarizes a bit of the background behind the pre-print.

Previously I covered the London Calling 2015 event, where a lot of progress on the development of the MinION was showcased. We were keen to find out how the MinION could contribute to our daily lab work, but also to see what new ground could be covered with this new sequencing technology.

One of the topics colleagues in the lab are working on is the dissemination of antibiotic resistance genes, as a major healthcare challenge is the emergence of pathogens that are resistant to antibiotics. Therefore we thought of combining the MinION with antibiotic resistance gene profiling, and more specifically of coupling functional metagenomic selections with nanopore sequencing.

Previous work in this field, for example by Justin O’Grady and colleagues, showed the use of the MinION [$] to identify the structure and chromosomal insertion site of a bacterial antibiotic resistance island in Salmonella Typhi.

Instead of going after single isolates, we set out to map the antibiotic resistance genes that are present in the gut of a hospitalized patient (the resistome). The resistome can influence the outcome of antibiotic treatment, and it is therefore highly interesting to gain insight into this complex network. Through a collaboration under the EvoTAR programme with Willem van Schaik of the University of Utrecht, we had a clinical fecal sample from an ICU patient available, which we used in the experiments.

Typical functional metagenomic workflow where metagenomic DNA is isolated from a (complex) environment, in this case a fecal sample. The DNA is sheared, ligated and transformed into E. coli. When profiling for antibiotic resistance genes, the cells are plated on agar containing various antibiotics. Finally, the metagenomic inserts are sequenced and annotated.

Key to the whole experimental setup for capturing the resistome is the use of functional metagenomic selections. Instead of culturing individual microorganisms directly from a fecal sample, metagenomic DNA is extracted from the sample. This metagenomic DNA is subsequently sheared, ligated and transformed into E. coli, which is finally plated out on solid agar containing various antibiotics. Only E. coli cells that harbor a metagenomic DNA fragment encoding an antibiotic-resistant phenotype can survive. With these functional metagenomic selections in hand, the complexity of the resistome can be rapidly mapped.

And this is where the MinION comes in. Although other sequencing technologies, such as the Illumina and the PacBio platforms, are available, they do not provide both long reads and low capital requirements.

After some initial failed attempts to get the MinION sequencer running in our lab, we started to see >100 Mbase runs in October last year. PoreCamp in Birmingham last December also provided, on top of a great experience and nice people, some useful data (a new round of PoreCamp takes place next week).

In order to analyze the sequencing data that Metrichor generates, we developed the poreFUME pipeline, which automates the process of barcode demultiplexing, error correction (using nanocorrect) and antibiotic resistance gene annotation (using CARD). The poreFUME software is available on GitHub as a Python script. The subsequent analysis is also available on GitHub, in a Jupyter notebook.
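
To give a feel for the first step, barcode demultiplexing, here is a minimal Python sketch. It is not the poreFUME code: it assumes the barcode sits at the very start of each read and compares it by Hamming distance, which ignores the insertions and deletions that are common in nanopore reads. The barcode sequences, read names and the `max_mismatches` parameter are made up for illustration.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads: dict, barcodes: dict, max_mismatches: int = 1) -> dict:
    """Assign each read to the barcode that best matches its prefix.

    Reads whose best barcode has more than `max_mismatches` mismatches
    end up in the "unassigned" bin.
    """
    bins = defaultdict(list)
    for name, seq in reads.items():
        best_label, best_dist = "unassigned", max_mismatches + 1
        for label, barcode in barcodes.items():
            prefix = seq[:len(barcode)]
            if len(prefix) < len(barcode):
                continue  # read is shorter than the barcode
            dist = hamming(prefix, barcode)
            if dist < best_dist:
                best_label, best_dist = label, dist
        bins[best_label].append(name)
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"BC01": "ACGTAC", "BC02": "TTGCAA"}
reads = {
    "read1": "ACGTACGGGTTT",  # exact match to BC01
    "read2": "TTGCATGGGAAA",  # one mismatch to BC02
    "read3": "GGGGGGGGGGGG",  # matches neither barcode
}

if __name__ == "__main__":
    print(demultiplex(reads, barcodes))
```

A real pipeline would more likely align the barcodes against the read ends, so that indels are tolerated as well.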

The Jupyter notebook with the analysis in the pre-print is available here.

In order to benchmark the nanopore sequencing data, we also sequenced the sample with Sanger and PacBio sequencing. Based on these results we achieved a sequence accuracy of >97%, and we were able to identify all 26 antibiotic resistance genes in both the PacBio and the nanopore set.
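
As an illustration of what a sequence accuracy figure can mean, the sketch below computes percent identity between a read and a reference that have already been aligned (with `-` marking gaps). This is one common definition, matching columns divided by alignment length, and is an assumption on my part rather than a description of the exact calculation in the pre-print. The two example sequences are made up.

```python
def percent_identity(aligned_read: str, aligned_ref: str) -> float:
    """Percent identity over all columns of a pairwise alignment.

    Both strings must come from the same alignment and therefore have the
    same length; gaps are written as '-'. A gap never counts as a match.
    """
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_read:
        raise ValueError("alignment is empty")
    matches = sum(a == b and a != "-"
                  for a, b in zip(aligned_read, aligned_ref))
    return 100.0 * matches / len(aligned_read)


if __name__ == "__main__":
    # Hypothetical 10-column alignment with one gap and one substitution.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
```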

Since the whole workflow can be performed relatively quickly, it would be really interesting to move these techniques to the next stage and do in-situ resistome profiling. Integrating Matt Loose’s read-until functionality in particular could open up new avenues. Furthermore, these experiments were done with the R7 chemistry, while it seems that the new R9 chemistry is able to deliver even higher accuracies and faster turn-around.

The FASTA files and poreFUME output used in the analysis are already online. The raw PacBio and MinION data are available at ENA.

Update 2016-11-01: Added the ENA link to the raw data

deFUME webserver paper published last week!

Last week we published our deFUME paper in the open access journal BMC Research Notes. deFUME aims to provide an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, specifically targeting wet-lab scientists (or non-bioinformaticians).
A quick intro to functional metagenomics: it is a subfield of the more widely known metagenomics. The term metagenomics was first introduced by Handelsman and Clardy in 1998 and refers to extracting DNA from the environment (the metagenome) and studying it by either sequencing or functional analysis. The sequencing approach does what the name says: extract and sequence as much DNA as possible, then use bioinformatics tools to try to determine its function. In this way Hess et al. [2] were able to computationally identify 27,755 putative carbohydrate-active genes in cow rumen. A drawback of this method, however, is that these genes still need to be experimentally validated.

Different phenotypes that can be observed when expressing a metagenomic library, for example halo formation, pigmentation or morphological changes.

Functional metagenomics works, in that sense, the other way around: a metagenomic library is transformed into a laboratory host (for example E. coli) and cultured while monitoring for a phenotypic change. For example, if one is looking for proteases, the agar plate can be supplemented with milk, and colonies creating a halo can be deemed positive for proteolytic activity. These colonies can subsequently be sequenced and the predicted genes functionally annotated. For this last process we created the deFUME webserver, which integrates the whole process from vector trimming to domain annotation into one pipeline.

The workflow of deFUME is visualized in the figure below where processes are depicted in red and (intermediate) files in black:

deFUME webserver flowchart: processes are in red and files/objects in black. From [1]

As input files deFUME takes either Sanger chromatograms (as .ab1 files) or, in the case of a next-generation sequencing run, the assembled nucleotide sequences in FASTA format. In the next steps the data is processed and annotated with BLAST and InterPro data. The user can then interact with the data in an interactive table, for example to filter on e-value, remove hypothetical proteins or show more or less detail. Finally, the annotations can be exported in FASTA or GenBank format, or as a simple CSV file.
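
The filtering and export steps are handled by the web interface, but the idea can be sketched in a few lines of Python. This is not deFUME code: the record fields (`contig`, `description`, `evalue`), the e-value cut-off and the example annotations are all hypothetical.

```python
import csv
import io


def filter_annotations(annotations, max_evalue=1e-5, drop_hypothetical=True):
    """Keep hits at or below the e-value cut-off, optionally dropping
    hits described as hypothetical proteins."""
    kept = []
    for ann in annotations:
        if ann["evalue"] > max_evalue:
            continue
        if drop_hypothetical and "hypothetical" in ann["description"].lower():
            continue
        kept.append(ann)
    return kept


def to_csv(annotations) -> str:
    """Serialize annotation records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["contig", "description", "evalue"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(annotations)
    return buffer.getvalue()


# Made-up annotation records, for illustration only.
annotations = [
    {"contig": "contig_1", "description": "beta-lactamase", "evalue": 1e-50},
    {"contig": "contig_2", "description": "hypothetical protein", "evalue": 1e-30},
    {"contig": "contig_3", "description": "serine protease", "evalue": 0.5},
]
filtered = filter_annotations(annotations)

if __name__ == "__main__":
    print(to_csv(filtered))
```

With these example records only `contig_1` passes both filters: `contig_2` is dropped as a hypothetical protein and `contig_3` fails the e-value cut-off.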

Why would you use the webserver?

  1. It’s free for academic users
  2. It saves time compared to, for example, running the same workflow in CLC
  3. It’s easy because you don’t spend time on intermediate files, for example vector trimming the contigs and pushing those to BLAST.

Screenshot of deFUME showing the functional annotations (A) and the interactive toolbox (B). From [1]

So where did this idea originate from?

It actually started in the summer of 2013 with a small project at the CIID (Copenhagen Institute of Interaction Design), where we designed all kinds of interactive visualizations. In the lab we had a functional metagenomic data set lying around, but some colleagues found it challenging to analyze the data and interact with it. So out of curiosity I made the following sketch (on GitHub) in Processing that would, based on InterPro data, give a quick overview of the sequences and annotated InterPro domains.

Screenshot of the initial sketch made in Processing

This small Processing sketch was a direct hit, and the idea arose to make this kind of interaction more widely available. One basic necessity would be to include the data processing in the visualization, so that the user only has to push one button to get an interactive visualization.
Therefore we implemented a backend that runs on the Center for Biological Sequence Analysis (CBS) servers at the Technical University of Denmark (DTU) and handles the data pipeline, from basecalling to BLASTing. Another quick realization was that a Processing sketch is not very portable or user-friendly, whereas a web interface would be. Therefore we built a table-based module (using jqGrid) to display the functional annotations, and we use the HTML5 canvas to draw a visual representation of the data. We used JavaScript to let the different components talk to each other and some D3.js to display a histogram of GO terms. On the backend the pipeline is implemented in Perl, and all the data is structured and stored in a single JSON object that is delivered to the client using PHP.

What is next?
We are very happy with the current version, but while developing we already came across a number of features that would make a great addition to version 2, for example EcoCyc integration, reporting of GC content along the length of the contig, exporting the InterPro annotations in the GenBank file and optimizing the coloring scheme. So in case you are a student and interested in working on deFUME, you can drop me an email.
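
As a small taste of one of these ideas, GC content along a contig could be reported with a sliding window. The Python sketch below is only an illustration of the calculation, not planned deFUME code; the function name, window size and step size are arbitrary choices.

```python
def gc_content_windows(sequence: str, window: int = 100, step: int = 50):
    """Return (start, gc_fraction) tuples for windows along a sequence.

    Windows start every `step` bases. A sequence shorter than `window`
    yields a single, shorter window; an empty sequence yields no windows.
    """
    seq = sequence.upper()
    results = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        if not chunk:
            break  # empty input sequence
        gc = sum(base in "GC" for base in chunk)
        results.append((start, gc / len(chunk)))
    return results


if __name__ == "__main__":
    # Toy contig: GC-rich at the start, AT-rich at the end.
    print(gc_content_windows("GGCCAATT", window=4, step=2))
```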

The deFUME paper can be found here, and the webserver here, with a working example here. Contributions can be made to the deFUME GitHub repository.

[1] van der Helm, E., Geertz-Hansen, H. M., Genee, H. J., Malla, S. & Sommer, M. O. A. deFUME: Dynamic exploration of functional metagenomic sequencing data. BMC Res. Notes 8, 328 (2015).

[2] Hess, M. et al. Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science 331, 463–7 (2011).
