Traditional one-variable-at-a-time optimization of a target performance wastes resources, and its assumption that the parameters are independent of one another tends to yield conditions that correspond to local maxima. To address these issues, our group began studying more advanced optimization techniques for efficiently finding the desired global maximum. Without a full reaction assessment, it is difficult for chemists to grasp the trends and correlations between the outcome of a novel reaction and its multiple flow reaction variables (e.g., flow rate, tube diameter and length, or micromixer (reactor) type). We therefore applied Gaussian process regression (GPR) to capture these trends and correlations and to estimate the optimal reaction conditions (Fig. 1). Our team used GPR as implemented in GPy to construct a machine-learning regression model from a limited number of observations, and then searched for the next appropriate parameter values by using the model as a surrogate for the flow reaction conditions.1,2
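To make the surrogate-model idea concrete, here is a minimal sketch of GPR on a single scaled variable. It is not our published code: it uses plain NumPy rather than GPy to stay dependency-free, and the squared-exponential kernel, the hyperparameters, and the four data points are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4, length_scale=1.0):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test, length_scale)
    k_ss = rbf_kernel(x_test, x_test, length_scale)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical observations: yield (as a fraction) at four scaled flow rates.
x_obs = np.array([[0.0], [0.3], [0.6], [1.0]])
y_obs = np.array([0.20, 0.55, 0.70, 0.35])
y_mean = y_obs.mean()  # center the targets because the GP prior mean is zero

x_grid = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
mu, sigma = gp_posterior(x_obs, y_obs - y_mean, x_grid, length_scale=0.3)
mu = mu + y_mean

# The surrogate's current best guess for the scaled flow rate.
best = x_grid[np.argmax(mu), 0]
print(f"predicted best scaled flow rate: {best:.2f}")
```

The posterior mean plays the role of the estimated trend, while the standard deviation indicates where further experiments would be most informative.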
GPR was successfully applied to the parameter screening of an enantioselective organocatalyzed Rauhut–Currier and [3+2] annulation sequence that provided chiral spirooxindoles in high yield with good ee within a few seconds (Fig. 2).1 Despite this advance, it remained difficult to optimize multiple flow reaction variables efficiently and simultaneously while appropriately balancing exploitation and exploration of the reaction space in the search for the desired global maximum. Bayesian optimization (BO), a powerful probabilistic method for determining the global maximum of a black-box objective function, has recently proven useful for multi-parameter screening in flow platforms as well as batch systems. Our team applied BO-assisted screening of numerical parameters to the electrochemical oxidation of amines3 and to electrochemical reductive carboxylation in flow.4
Generally, to use categorical variables in data-driven optimization, the steric and electronic properties of a molecule are converted into numeric values with descriptors, which requires a precise representation and quantum chemical property calculations to construct a practical model. It is difficult to obtain a chemical reaction dataset with the selected categorical parameters and a minimal set of features. It is also challenging to convert dominant non-numerical parameters into numerical ones by selecting proper physical and engineering features, even though these categorical parameters are crucial to achieving good outcomes. To demonstrate a more practical BO-assisted method of identifying optimal reaction conditions, we studied the direct optimization of categorical parameters with neither feature extraction nor model construction. In our study, we enhanced the BO algorithm by labeling each category of a categorical variable with an integer and representing it by one-hot encoding rather than ordinal encoding, which avoids the effect of a relative magnitude between the integer values (e.g., mixer A: ‘0’ represented by [1, 0, 0]; mixer B: ‘1’ represented by [0, 1, 0]; mixer C: ‘2’ represented by [0, 0, 1]). A categorical variable can be rounded to the closest integer and thereby assigned the appropriate value, alongside the optimization of a larger number of continuous numerical factors.5
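The encoding can be illustrated with a short sketch. The helper names and the order of the mixers are assumptions made for this example, and projecting a relaxed vector back onto a category with `argmax` is one simple way to realize the rounding step, not necessarily the exact rule in our scripts.

```python
import numpy as np

# Hypothetical ordering: index 0, 1, 2 corresponds to mixer A, B, C in the text.
MIXERS = ["Comet X", "β-type", "T-shaped"]

def one_hot(mixer):
    """Return the one-hot vector for a mixer name."""
    vec = np.zeros(len(MIXERS))
    vec[MIXERS.index(mixer)] = 1.0
    return vec

def decode(relaxed):
    """Map a relaxed (continuous) vector to the nearest category."""
    return MIXERS[int(np.argmax(relaxed))]

def snap(relaxed):
    """Project a relaxed vector onto the nearest valid one-hot vector."""
    return one_hot(decode(relaxed))

# Ordinal codes 0, 1, 2 would place the first and third mixers twice as far
# apart as neighboring mixers. One-hot vectors are all equidistant instead.
codes = np.array([one_hot(m) for m in MIXERS])
pairwise = np.linalg.norm(codes[:, None, :] - codes[None, :, :], axis=-1)

print(one_hot("β-type"))
print(decode(np.array([0.2, 0.1, 0.7])))
```

Because every pair of one-hot vectors is the same distance apart, the surrogate model has no built-in reason to treat any two mixers as more alike than any other two.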
Using BO-assisted screening of six numerical and categorical parameters, we determined appropriate continuous-flow synthetic conditions for the production of functionalized biaryls via the redox-neutral cross-coupling reaction of iminoquinone monoacetals (IQMAs) or quinone monoacetals (QMAs) with arenols (Fig. 3). To establish a practical optimization methodology for the flow reaction conditions, we used IQMA, 2-naphthol, and a catalytic amount of TfOH in toluene and conducted six reactions to screen five continuous numerical parameters and one categorical parameter, as follows:
- The amount of naphthol (1–3 equiv.)
- Temperature (20–60 °C)
- The concentration of IQMAs or QMAs in toluene (0.01–0.1 M)
- Flow rate (0.05–0.2 mL/min)
- Catalyst loading (0.5–2 mol%)
- The mixer type (Comet X, β-type, and T-shaped)
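The six parameters listed above can be written down as a search space from which an initial set of experiments is drawn. The following is a sketch only: the key names are hypothetical, and uniform random sampling with the mixers cycled in turn is an assumed way of choosing six starting points, not a description of how our initial dataset was selected.

```python
import numpy as np

# Ranges taken from the list above; the dictionary keys are hypothetical names.
BOUNDS = {
    "naphthol_equiv": (1.0, 3.0),
    "temperature_C": (20.0, 60.0),
    "concentration_M": (0.01, 0.1),
    "flow_rate_mL_min": (0.05, 0.2),
    "catalyst_mol_pct": (0.5, 2.0),
}
MIXERS = ("Comet X", "β-type", "T-shaped")

def sample_conditions(n, seed=0):
    """Draw n candidate conditions uniformly within the numeric bounds."""
    rng = np.random.default_rng(seed)
    rows = []
    for i in range(n):
        row = {name: float(rng.uniform(lo, hi)) for name, (lo, hi) in BOUNDS.items()}
        # Cycle through the mixers so that each one appears in the initial set.
        row["mixer"] = MIXERS[i % len(MIXERS)]
        rows.append(row)
    return rows

initial = sample_conditions(6)
for row in initial:
    print(row)
```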
In our study, we employed BO with the parallel lower confidence bound (LCB) as the acquisition function. Parallel BO evaluates an expensive objective function efficiently at several points simultaneously. Other acquisition functions, such as single expected improvement (EI), LCB, and parallel EI, did not optimize the mixer efficiently. We started from a broader initial dataset of six data points in order to find suitable conditions while avoiding expensive solvents and toxic reagents and reducing the amount of chemicals used. With a batch size of three, each suggestion comprised a mixer together with the next numerical parameters, based on the initial dataset. After these three suggested conditions had been evaluated experimentally, the BO protocol was applied to all entries to suggest the next batch. Gratifyingly, the functionalized biaryl 7 was obtained in high yield using a microflow system (Comet X micromixer, flow rate = 0.08 mL/min, residence time = 15 min) (Fig. 4).
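One way to picture a batch of three LCB suggestions is sketched below. Several points are assumptions rather than details of our protocol: the sign convention (minimizing the negative yield), the batching heuristic (each picked point is temporarily assigned its predicted mean before the next pick, often called the kriging-believer strategy), the kernel and its hyperparameters, and all data values. Each candidate row holds two scaled numeric parameters followed by a one-hot mixer block.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel with unit prior variance."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x, y, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k = rbf(x, x) + noise * np.eye(len(x))
    k_s = rbf(x, x_new)
    mean = k_s.T @ np.linalg.solve(k, y)
    var = 1.0 - np.sum(k_s * np.linalg.solve(k, k_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def lcb(mean, std, kappa=2.0):
    """Lower confidence bound; smaller is more promising when minimizing."""
    return mean - kappa * std

def suggest_batch(x, y, candidates, batch_size=3, kappa=2.0):
    """Pick batch_size distinct candidate indices by sequential LCB."""
    x_aug, y_aug = x.copy(), y.copy()
    picked = []
    for _ in range(batch_size):
        mean, std = gp_predict(x_aug, y_aug, candidates)
        score = lcb(mean, std, kappa)
        if picked:
            score[picked] = np.inf  # never suggest the same candidate twice
        idx = int(np.argmin(score))
        picked.append(idx)
        # Pretend the predicted mean was observed so that the uncertainty
        # around this point shrinks before the next pick.
        x_aug = np.vstack([x_aug, candidates[idx]])
        y_aug = np.append(y_aug, mean[idx])
    return picked

rng = np.random.default_rng(0)

def random_conditions(n):
    numeric = rng.uniform(0.0, 1.0, size=(n, 2))     # two scaled parameters
    mixer = np.eye(3)[rng.integers(0, 3, size=n)]    # one-hot mixer block
    return np.hstack([numeric, mixer])

x_obs = random_conditions(6)                          # six initial entries
neg_yield = -np.array([0.12, 0.35, 0.41, 0.08, 0.52, 0.27])  # made-up yields
candidates = random_conditions(200)

batch = suggest_batch(x_obs, neg_yield - neg_yield.mean(), candidates)
for i in batch:
    print(candidates[i])
```

After the three suggested conditions are run, the measured yields would replace the placeholder values and the procedure would be repeated on the enlarged dataset.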
Although a cross-coupling reaction of quinone monoacetal 8 with 9 was performed under the optimized conditions to extend the substrate scope, the isolated yield of the desired biarenol 10 was only 38%. Thus, to determine appropriate reaction conditions for QMAs 8, BO-assisted screening was performed with 8 and 9 as model substrates. As before, after repeated rounds of BO with parallel LCB and experimental evaluation, the yield of product 10 improved to 69% under conditions using the β-type mixer (Fig. 5). The previous reaction (Fig. 4) required a different mixer (Comet X) and a lower concentration of IQMAs 5. When we tested this lower concentration with QMAs 8, low conversion was observed, whereas testing IQMAs 5 under the new conditions generated many side products. Hence, the difference in the mixer suited to each reaction can be attributed to the difference in the respective mixing methods.
Crucially, our algorithm6 can screen for engineering variables such as the type of micromixer, providing a method for chemists that does not require complicated quantification or descriptors. Our group is currently investigating BO-assisted screening of multiple categorical parameters in large-scale synthesis7 and the highly enantioselective synthesis of biaryls using an immobilized chiral catalyst in flow.
- Kondo, M., Wathsala, H. D. P., Sako, M., Hanatani, Y., Ishikawa, K., Hara, S., Takaai, T., Washio, T., Takizawa, S. & Sasai, H. Exploration of flow reaction conditions using machine-learning for enantioselective organocatalyzed Rauhut–Currier and [3+2] annulation sequence. Chem. Commun. 56, 1259–1262 (2020).
- Sato, E., Fujii, M., Tanaka, H., Mitsudo, K., Kondo, M., Takizawa, S., Sasai, H., Washio, T., Ishikawa, K. & Suga, S. Application of an electrochemical microflow reactor for cyanosilylation: Machine learning-assisted exploration of suitable reaction conditions for semi-large-scale synthesis. J. Org. Chem. 86, 16035–16044 (2021).
- Kondo, M., Sugizaki, A., Khalid, M. I., Wathsala, H. D. P., Ishikawa, K., Hara, S., Takaai, T., Washio, T., Takizawa, S. & Sasai, H. Energy-, time-, and labor-saving synthesis of α-ketiminophosphonates: machine-learning-assisted simultaneous multiparameter screening for electrochemical oxidation. Green Chem. 23, 5825–5831 (2021).
- Naito, Y., Kondo, M., Nakamura, Y., Shida, N., Ishikawa, K., Washio, T., Takizawa, S. & Atobe, M. Bayesian optimization with constraint on passed charge for multiparameter screening of electrochemical reductive carboxylation in a flow microreactor. Chem. Commun. 58, 3893–3896 (2022).
- Kondo, M., Wathsala, H. D. P., Salem, M. S. H., Ishikawa, K., Hara, S., Takaai, T., Washio, T., Sasai, H. & Takizawa, S. Bayesian optimization-driven parallel-screening of multiple parameters for the flow synthesis of biaryl compounds. Commun. Chem. 5, 1–9 (2022).
- Kondo, M., Wathsala, H. D. P., Salem, M. S. H., Ishikawa, K., Hara, S., Takaai, T., Washio, T., Sasai, H. & Takizawa, S. (2022 October 6). Scripts for categorical Bayesian optimization-assisted screening of reaction conditions in the flow biaryl synthesis. [script]. Zenodo. https://doi.org/10.5281/zenodo.7151503.