RetroBioCat: Building a highly accessible computer-aided design tool for biocatalytic retrosynthesis

Putting digital technologies into the hands of wet-lab experimentalists is key to realising the potential of computing in R&D. In this behind-the-paper blog post, we detail the process we went through in developing RetroBioCat from an initial proof-of-principle to a fully realised web-app.
Published in Chemistry

Our paper is available in Nature Catalysis at https://www.nature.com/articles/s41929-020-00556-z

RetroBioCat started life as a quick coding experiment to investigate whether RDKit reaction rules for a small number of enzymes could, via automated biocatalytic retrosynthesis, generate some of the enzyme cascades being worked on within the Turner and Flitsch research groups (Figure 1A). Following some success with this initial proof-of-principle, we began to systematically incorporate reaction rules for more of the enzymes utilised in biocatalysis (with the help of the textbook ‘Biocatalysis in Organic Synthesis: The Retrosynthesis Approach’ by Turner & Humphreys). What started off as a short coding experiment expanded into a complete Python package and became a promising tool for suggesting biocatalytic reactions for the synthesis of a given molecule.
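To give a flavour of what an RDKit reaction rule looks like in practice, here is a minimal sketch of a single retrosynthetic step. The two reaction SMARTS, the `RETRO_RULES` dictionary and the `one_step_precursors` function are hypothetical and written purely for illustration; they are not the rules or the API used in RetroBioCat. Each rule is written in the retro direction (product >> precursor), and co-substrates such as cofactors and amine donors are left out for simplicity.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hypothetical retro-rules (product >> precursor), for illustration only.
# They are NOT the rules shipped with RetroBioCat.
RETRO_RULES = {
    # secondary alcohol <- ketone (e.g. a ketoreductase step, read backwards)
    "ketone reduction (retro)":
        "[#6:3][CH1:1]([OH1:2])[#6:4]>>[#6:3][C:1](=[O:2])[#6:4]",
    # primary amine on a secondary carbon <- ketone (e.g. a transaminase step)
    "transamination (retro)":
        "[#6:3][CH1:1]([NH2])[#6:4]>>[#6:3][C:1](=O)[#6:4]",
}


def one_step_precursors(target_smiles, rules=RETRO_RULES):
    """Apply every retro-rule once and return {rule name: [precursor SMILES]}."""
    target = Chem.MolFromSmiles(target_smiles)
    if target is None:
        raise ValueError(f"Could not parse SMILES: {target_smiles!r}")

    suggestions = {}
    for name, smarts in rules.items():
        rxn = AllChem.ReactionFromSmarts(smarts)
        found = set()
        # RunReactants returns one tuple of product molecules per rule match
        for products in rxn.RunReactants((target,)):
            try:
                for mol in products:
                    Chem.SanitizeMol(mol)
            except ValueError:
                # Skip outcomes that are not chemically valid
                continue
            # Canonical SMILES lets symmetry-equivalent matches collapse
            found.add(".".join(sorted(Chem.MolToSmiles(m) for m in products)))
        if found:
            suggestions[name] = sorted(found)
    return suggestions


if __name__ == "__main__":
    # 1-phenylethanol and 1-phenylethylamine as example targets
    for smiles in ("CC(O)c1ccccc1", "CC(N)c1ccccc1"):
        print(smiles, "->", one_step_precursors(smiles))
```

Calling the function again on each suggested precursor, and recording the target-to-precursor links as edges, is one way to build up a retrosynthesis network like the one shown in Figure 1A.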

However, RetroBioCat would only be a useful tool if it was easily accessible to wet-lab scientists seeking to plan new reactions. Many Python projects utilise Jupyter notebooks to incorporate code, results, and text into a single document. Initially, RetroBioCat was designed using this approach, with the code hosted on GitHub and accessible entirely online via Binder. However, internal feedback revealed that this was not a popular route to interact with the program. As a result, we developed an easily accessible web-app, which is available at https://retrobiocat.com (Figure 1B).

Figure 1 – Progress from an early prototype to the current RetroBioCat web-app. A. An early prototype of a retrosynthesis network generated from a handful of rules, visualised using NetworkX.  SMILES strings are shown where structures are used in panel B. B. The same network as in A, generated using the network explorer on retrobiocat.com.

It is often said that 20% of the work takes 80% of the time; this was certainly the case for RetroBioCat. While an early prototype was developed rapidly, refining the code and ensuring that all reactions appeared as expected took longer than anticipated. We expect that the creation of reaction rules for potential biotransformations will be an ongoing project, as the enzyme toolbox for biocatalysis is likely to continue expanding over the coming years. With this in mind, we have built web portals into the RetroBioCat web-app for suggesting, adding or amending reaction rules. As such, RetroBioCat has the potential to continue to grow beyond this publication, as more reaction rules are added.

Alongside the biocatalysis reaction rules, RetroBioCat utilises a database of reactions described in the literature, to suggest steps and enzymes which have previously been demonstrated. The UK national lockdown, resulting from the Covid-19 pandemic, presented an opportunity to begin crowd-sourcing the data entry required to build this database.  A special thank you goes to the volunteers, including researchers from the Turner and Flitsch groups, and AstraZeneca (Figure 2), who contributed their time during lockdown to this project. Going forward, we hope to continue to crowd-source the curation of literature biocatalytic reactions. Importantly, the data curation platform is openly available at retrobiocat.com. 

Figure 2 – RetroBioCat database contributors. A live view of the number of papers curated by each user (accessed 5th Jan 2020), available at retrobiocat.com.  This page is continually updated as new papers are added.

It is our hope that RetroBioCat will serve as a useful resource for chemists and biologists in the planning of new biocatalytic cascades and reactions. In addition, it has the potential to be a continually growing knowledge base for the community. 
