RetroBioCat: Building a highly accessible computer-aided design tool for biocatalytic retrosynthesis
Putting digital technologies into the hands of wet-lab experimentalists is key to realising the potential of computing in R&D. In this ‘behind the paper’ blog post, we detail the process we went through in developing RetroBioCat from an initial proof-of-principle to a fully realised web-app.
Our paper is available in Nature Catalysis at https://www.nature.com/articles/s41929-020-00556-z
RetroBioCat started life as a quick coding experiment to investigate whether RDKit reaction rules for a small number of enzymes could, via automated biocatalytic retrosynthesis, generate some of the enzyme cascades being worked on within the Turner and Flitsch research groups (Figure 1A). Following some success with this initial proof-of-principle, we began to systematically incorporate reaction rules for more of the enzymes utilised in biocatalysis (with the help of the textbook ‘Biocatalysis in Organic Synthesis: The Retrosynthesis Approach’ by Turner & Humphreys). What started off as a short coding experiment expanded into a complete Python package and became a promising tool for suggesting biocatalytic reactions for the synthesis of a given molecule.
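To give a flavour of what rule-based retrosynthesis involves, here is a minimal, dependency-free sketch of the search idea: start from a target, apply every applicable retro-rule, and repeat on the precursors up to a fixed number of steps. It is not RetroBioCat's code. The real tool encodes its rules as RDKit reaction rules acting on molecular structures, whereas this toy version uses plain text labels, and the rule table and the `enumerate_routes` function are hypothetical names invented for illustration.

```python
from typing import Dict, List, Tuple

# One retrosynthetic step: (enzyme class, precursor label).
Step = Tuple[str, str]

# Hypothetical toy rule table: product label -> possible retro-steps.
# These labels stand in for real molecules; RetroBioCat itself uses
# RDKit reaction rules rather than a lookup table like this.
RETRO_RULES: Dict[str, List[Step]] = {
    "amine": [("transaminase", "ketone"), ("imine reductase", "imine")],
    "ketone": [("alcohol dehydrogenase", "alcohol")],
}


def enumerate_routes(target: str, max_steps: int) -> List[List[Step]]:
    """List every route of 1..max_steps retro-steps starting from target.

    Each route is read backwards from the target: the first step gives the
    enzyme that makes the target and the precursor it would act on.
    """
    routes: List[List[Step]] = []

    def expand(molecule: str, path: List[Step]) -> None:
        if path:
            routes.append(path)
        if len(path) == max_steps:
            return  # depth limit reached, stop expanding this branch
        for enzyme, precursor in RETRO_RULES.get(molecule, []):
            expand(precursor, path + [(enzyme, precursor)])

    expand(target, [])
    return routes


if __name__ == "__main__":
    for route in enumerate_routes("amine", max_steps=2):
        print(" <- ".join(f"{precursor} [{enzyme}]" for enzyme, precursor in route))
```

Even in this toy form, the number of candidate routes grows quickly with the number of rules and the allowed depth, which is why presenting and filtering the suggestions matters as much as generating them.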
However, RetroBioCat would only be a useful tool if it were easily accessible to wet-lab scientists seeking to plan new reactions. Many Python projects utilise Jupyter notebooks to incorporate code, results, and text into a single document. Initially, RetroBioCat was designed using this approach, with the code hosted on GitHub and accessible entirely online via Binder. However, internal feedback revealed that this was not a popular way to interact with the program. As a result, we developed an easily accessible web-app, which is available at https://retrobiocat.com (Figure 1B).
It is often said that 20% of the work takes 80% of the time - this was certainly the case for RetroBioCat. While an early prototype was developed rapidly, refining the code and ensuring that all reactions appeared as expected took longer than anticipated. We expect that the creation of reaction rules for potential biotransformations will be an ongoing project, as the enzyme toolbox for biocatalysis is likely to continue expanding over the coming years. With this in mind, we have built web-portals into the RetroBioCat web-app for suggesting, adding or amending reaction rules. As such, RetroBioCat has the potential to continue to grow beyond this publication, as more reaction rules are added.
Alongside the biocatalysis reaction rules, RetroBioCat utilises a database of reactions described in the literature to suggest steps and enzymes which have previously been demonstrated. The UK national lockdown, resulting from the COVID-19 pandemic, presented an opportunity to begin crowd-sourcing the data entry required to build this database. A special thank you goes to the volunteers, including researchers from the Turner and Flitsch groups, and AstraZeneca (Figure 2), who contributed their time during lockdown to this project. Going forward, we hope to continue to crowd-source the curation of literature biocatalytic reactions. Importantly, the data curation platform is openly available at retrobiocat.com.
It is our hope that RetroBioCat will serve as a useful resource for chemists and biologists in the planning of new biocatalytic cascades and reactions. In addition, it has the potential to be a continually growing knowledge base for the community.