The paper in Nature is here: go.nature.com/2mpXRzg
Natural products (NPs) are bioactive small molecules produced by living organisms. Microbes, which are prolific producers of NPs, biosynthesize and use these compounds in their natural environments to kill or inhibit the growth of competing species. The structures and properties of NPs have been optimized by Nature over the course of evolution to inhibit specific enzymes in target organisms. NPs have therefore been invaluable sources of, and inspirations for, drugs to treat many human diseases, including antibiotics, antifungals, anticancer compounds, cholesterol-lowering drugs and immunosuppressants.
NPs also have valuable roles in agricultural applications, including some of the widely used crop protection agents such as herbicides, insecticides and fungicides. In both human health and agricultural applications, the emergence of resistance in the target organisms has significantly lowered the effectiveness of these molecules. For example, multidrug-resistant bacteria have spread steadily over the last half century, resulting in “superbugs” that are untreatable or treatable only with last-line-of-defense antibiotics. Similarly, the rapid rise of herbicide-resistant weeds poses a major agricultural crisis by decreasing crop yields and impacting the global food supply. For these reasons, it is more urgent than ever to increase efforts towards the discovery of new NPs, both in terms of new structural classes and new biological activities.
The post-genomics era has brought a renaissance in NP discovery using genomic approaches. Rapid sequencing of microbes coupled with bioinformatic analyses has revealed that the true biosynthetic potential of microbes is far from being realized. Because the genes encoding the enzymes that synthesize an NP are typically clustered together to form a biosynthetic gene cluster (BGC), the number of BGCs encoded in a genome is a reasonable estimate of the total number of NPs an organism can produce. Since NPs are biosynthesized in response to complex combinations of environmental and growth signals, only a small number of the total BGCs are turned on under laboratory culturing conditions. As a result, it is estimated that less than 10% of all BGCs produce NPs, and even fewer have characterized NPs. There is therefore a tremendous opportunity to explore the chemical space encoded in the other 90%, the cryptic BGCs. Different methods have been used in the community to “activate” these pathways, including pathway-specific transcriptional activation, epigenetic methods that alter chromatin structure and affect global NP profiles, and heterologous expression of targeted pathways in model hosts. While these approaches are successful in awakening BGCs, the true biological activities of the NPs produced are typically unknown. Unlike more classical NP discovery, in which a phenotypic assay guides the purification of an NP, the genomic approaches are not activity-guided. Given the large number of BGCs available, it is essential that genome-driven discovery of NPs be prioritized by biological activity.
How can we predict the activity of an NP from the sequence of its BGC? An answer to this question could unlock the true potential of the tens of thousands of unexplored BGCs. In our work, we took a resistance gene-guided approach. It has been known for some time that, for the producing organism not to be harmed by the NP it produces, self-resistance mechanisms must be in place when the NP is produced. Several such self-defense mechanisms are known, including the use of efflux pumps to actively transport the NP to the extracellular space; the use of antidotal proteins to bind and detoxify the NP; the use of specific enzymes to modify the target of the NP as a means to evade inhibition; and the use of a functionally equivalent resistance enzyme to compensate for the inhibition of the housekeeping enzyme by the NP. This last mechanism is especially intriguing because the resistance enzyme can be a second, homologous copy of the housekeeping enzyme that is targeted by the NP. This second copy can perform the same catalytic function, but is sufficiently mutated to be insensitive to the NP. The resistance gene is also typically co-clustered with the NP BGC and is co-transcribed with the rest of the cluster. A well-known example is the lovastatin BGC, in which a second copy of the target HMG-CoA reductase is present to confer resistance to lovastatin, a potent HMG-CoA reductase inhibitor. We reasoned that this type of resistance mechanism can be used to predict the function of the NP encoded by a BGC, as the co-clustered resistance enzyme can be a predictive window that links BGC sequence and NP activity.
To perform such target-guided genome mining, we sought to find an NP that can inhibit the branched-chain amino acid (BCAA) pathway and serve as an herbicide lead. This pathway is present in bacteria, fungi and plants, but absent in animals, including humans, which makes it an attractive target for anti-infectives and herbicides. In fact, the first enzyme in the pathway, acetolactate synthase (ALS), is the most targeted enzyme among commercial herbicides. The third enzyme in this pathway, dihydroxyacid dehydratase (DHAD), has also been intensely pursued as an herbicide target. However, no effective inhibitor is available and no NP inhibitor has been reported. We reasoned that since this pathway is also essential in fungi, any fungal producer of a DHAD inhibitor must also have a resistance copy of the enzyme encoded in the gene cluster. Using a simple bioinformatic algorithm, we used DHAD as a query and searched for gene clusters that are colocalized with a DHAD homolog. One search constraint is that the DHAD homolog needs to be a second copy, in addition to the widely conserved housekeeping copy of DHAD. To our delight, we were able to find a well-conserved four-gene BGC in several well-characterized fungal species, including Aspergillus terreus. This BGC contains three biosynthetic enzymes, a terpene synthase and two cytochrome P450 monooxygenases, and a putative self-resistance enzyme, AstD, that is ~60% homologous to housekeeping DHADs.
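The search logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm used in the paper: the `Gene` record, the family labels, the `window` size, and the rule that the copy most similar to the query is the housekeeping one are all hypothetical choices made for the example. It assumes that genes have already been annotated and that homology to the DHAD query has been scored by some external tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical, pre-annotated gene record (not a real file format)."""
    genome: str                      # genome or strain identifier
    contig: str                      # contig or chromosome name
    index: int                       # ordinal position of the gene on its contig
    family: str                      # annotation label, e.g. "DHAD" or "P450"
    identity_to_query: float = 0.0   # % identity to the query; used only for target homologs


# Assumed set of labels that mark a gene as a core biosynthetic enzyme.
BIOSYNTHETIC = {"terpene_synthase", "P450", "PKS", "NRPS"}


def find_resistance_guided_clusters(genes, target="DHAD", window=5, min_biosynthetic=1):
    """Return (genome, candidate resistance gene, neighbouring biosynthetic genes).

    A genome qualifies only if it carries at least two homologs of the target.
    The homolog most similar to the query is treated as the housekeeping copy
    (an assumption), and every other copy is reported if biosynthetic genes lie
    within `window` gene positions of it on the same contig.
    """
    by_genome = defaultdict(list)
    for gene in genes:
        by_genome[gene.genome].append(gene)

    hits = []
    for genome, members in by_genome.items():
        homologs = [g for g in members if g.family == target]
        if len(homologs) < 2:
            continue  # no second copy, so no candidate resistance gene
        housekeeping = max(homologs, key=lambda g: g.identity_to_query)
        for homolog in homologs:
            if homolog is housekeeping:
                continue
            neighbours = sorted(
                (
                    g for g in members
                    if g.contig == homolog.contig
                    and g.family in BIOSYNTHETIC
                    and abs(g.index - homolog.index) <= window
                ),
                key=lambda g: g.index,
            )
            if len(neighbours) >= min_biosynthetic:
                hits.append((genome, homolog, neighbours))
    return hits
```

In a real search, the annotations would come from a homology search and a BGC prediction tool, and the criterion for telling the housekeeping copy from the resistance copy would need more care than a single identity score.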
We then set out to activate this gene cluster. We opted for heterologous expression since the cluster is quite compact. Expression of the cluster in A. nidulans did not work well, but expression in baker’s yeast, Saccharomyces cerevisiae, was a success. A new compound was produced in this host at a reasonable titer of 20 mg/L. Future strain improvement and optimization should be able to increase this titer significantly, as has been the case with other sesquiterpene products from yeast. The structure of the compound was solved by NMR and shown to be that of a known NP, aspterric acid (AA), which was isolated almost forty years ago. We were initially disappointed with this finding, as the compound is not new. However, our disappointment was short-lived once we realized that there was no known biological target for this compound. With our target-guided approach we had rediscovered this compound, but the key was to verify that it indeed inhibits DHAD. To do that, we assayed three different DHAD enzymes: the housekeeping fungal DHAD from A. terreus; the plant DHAD from Arabidopsis thaliana; and the putative resistance DHAD, AstD, from the AA BGC. Satisfyingly, AA was shown to be a potent inhibitor of both housekeeping DHADs, with Ki of ~300 nM. In contrast, no inhibition was observed with AstD even at the solubility limit of 8 mM. The activity of AstD is lower than that of the housekeeping DHAD, with a ~100-fold attenuation in kcat. We attributed this to the possibility that Nature, while generating resistance to AA in AstD through mutations, had to sacrifice some of the enzyme's activity.
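To give a feel for what a Ki of ~300 nM means, here is a minimal Python sketch of the standard Michaelis-Menten rate law with an inhibitor. It rests on assumptions that are not stated in the text above: the competitive mode of inhibition is assumed purely for illustration, and the Km value and the substrate concentration are hypothetical placeholders rather than measured values. Only the ~300 nM Ki is taken from the text.

```python
def competitive_rate(s, i, vmax, km, ki):
    """Michaelis-Menten rate with a competitive inhibitor (assumed mode).

    v = Vmax * [S] / (Km * (1 + [I]/Ki) + [S]); all concentrations share one unit.
    """
    return vmax * s / (km * (1.0 + i / ki) + s)


def fractional_activity(s, i, km, ki):
    """Rate with inhibitor divided by the uninhibited rate (Vmax cancels)."""
    return competitive_rate(s, i, 1.0, km, ki) / competitive_rate(s, 0.0, 1.0, km, ki)


KI_NM = 300.0          # ~300 nM, from the text
KM_NM = 1_000_000.0    # hypothetical Km of 1 mM, for illustration only
S_NM = KM_NM           # hypothetical substrate concentration, set equal to Km

for inhibitor_nm in (0.0, 300.0, 3_000.0, 30_000.0):
    remaining = fractional_activity(S_NM, inhibitor_nm, KM_NM, KI_NM)
    print(f"[AA] = {inhibitor_nm:>8.0f} nM -> fraction of activity remaining: {remaining:.3f}")
```

Under these assumptions, an inhibitor concentration equal to Ki with substrate at Km leaves two thirds of the activity. An enzyme that shows no inhibition even at millimolar inhibitor, as reported for AstD, would correspond to an effective Ki several orders of magnitude higher.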
While the in vitro characterization was sufficient to support our hypothesis that target-guided genome mining is a viable strategy to search for NPs with a desired biological activity, we wanted to push this project further in two parallel directions. First, we wanted to know whether AA can truly be used as an herbicide and whether AstD can be a functional resistance gene in vivo (or in planta). Second, we wanted to understand the structural basis of AA inhibition through X-ray crystallography. To pursue these two very different directions, we formed collaborations with two other research labs.
To study the activity of AA on plants, we teamed up with Prof. Steve Jacobsen’s lab at UCLA. We performed growth assays of various plants in agar, and found that AA has impressive broad-spectrum herbicidal activity. We next performed spray experiments on A. thaliana as a model plant. We soon realized that the formulation of an herbicide is just as important as the compound itself. After trying a few formulations without significant success, we decided to use a commercial formulation named Finale®. This formulation is used for spraying the potent herbicide glufosinate. We reasoned that although AA is chemically distinct from glufosinate, the optimized components of Finale® are a good starting point for testing AA on plants. We used a glufosinate-resistant strain of A. thaliana for this assay, and sprayed plants with Finale® spiked with 250 mM of AA. The experiment showed that AA can effectively eliminate the growth of the plant, indicating that this compound can be a promising herbicide lead with a new mode of action. We then tested whether AstD can confer resistance to AA in A. thaliana. The Jacobsen lab generated a transgenic A. thaliana that expresses astD in the chloroplast, where BCAA biosynthesis takes place. The transgenic plant was able to grow in the presence of AA with no notable phenotypic difference compared to control A. thaliana not treated with AA. This is an important result for the potential commercialization of AA as an herbicide, since the availability of a resistance gene may be useful in the generation of transgenic crops, as seen in Roundup Ready crops that are resistant to glyphosate.
We collaborated with Prof. Jiahai Zhou at the Shanghai Institute of Organic Chemistry to solve the crystal structure of A. thaliana DHAD. Surprisingly, there was no structure of DHAD available in the literature prior to our study, despite its central role in the BCAA pathway. DHAD has a 2Fe-2S cluster in its active site, which makes it prone to oxidation. Therefore, protein purification and crystallization were performed in an oxygen-free environment in a glove box in the Zhou lab. We were able to obtain the apo (without 2Fe-2S) and the holo structures and reported these in the paper. The structures show an active site at the dimer interface. At the end of the active site is the 2Fe-2S cluster, which provides a positively charged environment to bind the carboxy end of AA. A large hydrophobic channel leads to the active site, where the alkyl parts of the substrates can bind. The large hydrophobic channel also explains how the bulkier, tricyclic AA can bind tightly. Structural modeling of AA binding was performed to show that AA can fit well in the pocket. Homology modeling of AstD based on the DHAD structure showed that while the overall structures are similar, there are potential amino acid substitutions at the entrance of the active site that constrict the entrance in AstD, preventing binding of AA while still allowing the natural substrate to enter. This paves the way for mutagenesis experiments to pinpoint exactly which residues are implicated in resistance. We have subsequently solved the structures of DHAD complexed with AA, as well as the holo structure of AstD. These structures will be reported in a subsequent paper.
In conclusion, we used a target-guided genome mining approach to rediscover aspterric acid as a DHAD inhibitor. This is the first known natural product inhibitor of DHAD. Looking back on this project, two things are clear. First, without the target-guided approach, we would not have worked on this gene cluster. The cluster is exceedingly simple by the standards of the NP biosynthetic community, with only three fairly generic enzymes. This showcases Nature’s ability to generate functional molecular scaffolds with a very concise pathway. Second, by looking at aspterric acid without the context of the resistance gene, we would not have associated this molecule with inhibition of the BCAA pathway, nor identified it as a DHAD inhibitor. In hindsight, AA is a beautifully constructed molecule that is a perfect inhibitor for DHAD. One must marvel at Nature’s creativity in crafting an inhibitor of amino acid biosynthesis from a sesquiterpene scaffold. Collectively, these observations and reflections further underscore the usefulness of a target-guided approach to identify and prioritize BGCs. While it is true that not all BGCs have an encoded resistance enzyme, we believe there is a sufficient number of BGCs that do, and powerful data mining algorithms may be needed to search through the genomes. Lastly, studying the mechanisms of resistance genes found in BGCs can also provide interesting insights into how Nature evolves a housekeeping enzyme to become resistant to its own evolved inhibitor while retaining catalytic function. These can offer useful insights into the emergence of resistance against antibiotics, herbicides and other useful NPs. We will end with a question: which came first, the self-resistance enzyme or the NP itself?