All the information in the world in a spoonful of polyesters

Polymerization is a chemical reaction that creates long-chain molecules through the successive formation of covalent bonds between unit molecules (monomers). For a century, synthetic macromolecules have served as materials for every aspect of modern civilization. When two or more monomers are polymerized simultaneously, the resulting polymers have only random or repetitive sequences, with a statistical distribution in the number of monomers joined during polymerization. Unlike DNA, therefore, synthetic polymers are unable to store information as a sequence of monomers. At the dawn of the centennial of the ‘macromolecular hypothesis’ proposed by Prof. Staudinger, polymer chemists still face a decades-old challenge: Can the chemical structures of synthetic polymers be as accurate as those of biopolymers?

We began our research with the aim of synthesizing high-molecular-weight polymers with absolutely defined molecular weights and sequences. We chose an iterative convergent method to synthesize poly(α-hydroxy acid)s (PAHs) as an alternative to solid-phase synthesis, because this approach has been widely used to create dendrimers and polymers without molecular weight distributions. The molecular weight of the product increases exponentially with each growth step, but purification is required to isolate the monodisperse polymers. We purified the products by size-exclusion chromatography (SEC), which exploits the difference in hydrodynamic volume between the reagents and the coupled product. Because this difference in size persists throughout the iteration of convergent couplings, this purification method greatly increased the molecular weight of the polymers that could be isolated. We succeeded in synthesizing poly(rac-phenyllactic acid)s composed of 256 repeating units (> 38000 Da) in far fewer synthetic steps than solid-phase synthesis of the same polymers would require.
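The step-count advantage can be illustrated with a minimal sketch. It assumes that each convergent cycle couples two identical chains and so doubles the chain length, and that a stepwise (solid-phase style) route adds one monomer per coupling. The function names are hypothetical, and deprotection and purification steps are not counted.

```python
def convergent_cycles(target_units: int) -> int:
    """Doubling cycles needed to grow from 1 unit to target_units.

    Assumption: every cycle couples two chains of equal length,
    so only powers of two are reachable.
    """
    if target_units < 1 or target_units & (target_units - 1):
        raise ValueError("target_units must be a power of two")
    return target_units.bit_length() - 1


def stepwise_couplings(target_units: int) -> int:
    """Couplings needed when one monomer is added per step."""
    if target_units < 1:
        raise ValueError("target_units must be positive")
    return target_units - 1


if __name__ == "__main__":
    units = 256  # chain length reported in the text
    print("convergent cycles:", convergent_cycles(units))
    print("stepwise couplings:", stepwise_couplings(units))
```

Under these assumptions a 256-unit chain takes 8 doubling cycles, compared with 255 single-monomer couplings.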

In an iterative convergent method, the coupled product becomes a reagent for the subsequent coupling step. Because of this self-iterative nature, the convergent approach yields only polymers with repetitive or segmental sequences. To overcome this limitation, we devised a cross-convergent approach to obtain sequence-defined polymers. We used two monomers, phenyllactic acid and lactic acid, representing 0 and 1, respectively, to encode digital information as the sequence of a copolymer, poly(phenyllactic-co-lactic acid) (PcL). All permutations of two monomers (00, 01, 10, and 11) were synthesized and cross-converged to form an information-encoded PcL. The word 'SEQUENCE' was encoded as binary code in a PcL storing 64 bits (Figure 1). The stored information could be fully retrieved from a single tandem mass spectrum measured on a MALDI-TOF spectrometer (Figure 2). Information in a longer PcL was decoded from a MALDI-TOF spectrum of the chemically degraded PcL. Our work suggests that copolyesters can store digital information at a density higher than that of DNA, which would make these copolymers a low-cost alternative to DNA as a molecular storage medium.
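The encoding scheme described above can be sketched as follows. This is an illustration only: it assumes 8-bit ASCII with the most significant bit first, and the single-letter labels "P" (phenyllactic acid, 0) and "L" (lactic acid, 1) are hypothetical shorthand that does not come from the original work.

```python
# Bit-to-monomer mapping stated in the text: 0 = phenyllactic acid, 1 = lactic acid.
# The letters "P" and "L" are illustrative labels only.
MONOMER_FOR_BIT = {"0": "P", "1": "L"}
BIT_FOR_MONOMER = {monomer: bit for bit, monomer in MONOMER_FOR_BIT.items()}


def text_to_bits(text: str) -> str:
    """Encode ASCII text as a bit string (assumed: 8 bits per character, MSB first)."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def bits_to_monomers(bits: str) -> str:
    """Translate each bit into its monomer label."""
    return "".join(MONOMER_FOR_BIT[bit] for bit in bits)


def dyads(bits: str) -> list[str]:
    """Split the bit string into the 2-bit building blocks (00, 01, 10, 11)."""
    return [bits[i:i + 2] for i in range(0, len(bits), 2)]


def monomers_to_text(sequence: str) -> str:
    """Decode a monomer sequence back to ASCII text."""
    bits = "".join(BIT_FOR_MONOMER[monomer] for monomer in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("ascii")


if __name__ == "__main__":
    bits = text_to_bits("SEQUENCE")
    polymer = bits_to_monomers(bits)
    print(len(bits), "bits")
    print(polymer)
    print(monomers_to_text(polymer))
```

With this mapping, the eight characters of 'SEQUENCE' give a 64-monomer sequence, which can be viewed as 32 two-bit building blocks of the kind that are cross-converged.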

Beyond developing molecular media for information storage, we hope that our method will contribute to the exploration of new properties and functions arising from the unlimited structural diversity of monodisperse, sequence-defined polymers. These polymers should open a new era in polymer chemistry as our exciting journey toward perfect polymers continues.

Figure 1. Writing information in a PcL by cross-convergent synthesis. The color-coded segments store a 2-bit code corresponding to an ASCII character.

Figure 2. MALDI-TOF mass spectra of PcLs storing 64-bit and 128-bit information. Tandem mass spectrometry (MALDI-TOF MS/MS) was used for decoding the sequence of monomers, which was subsequently converted to binary code and ASCII characters.