Prediction of protein folds and the design of protein assemblies have recently seen remarkable advances. While new protein folds, particularly smaller ones within current de novo design capabilities, can be constructed without explicit consideration of the folding pathway, the ability to understand, and moreover to design, the folding pathway could open exciting possibilities and shed new light on this important process.
One strategy for designing new types of protein folds is coiled-coil protein origami (CCPO; 1,2), in which polyhedral protein scaffolds are designed with edges defined by coiled-coil (CC) dimers, in such a way that the polypeptide chain traverses each edge exactly twice and defines a stable fold. Polyhedra can approximate almost any shape, and mathematics tells us that this type of single-chain CCPO design is in principle always possible, regardless of the shape. The key questions are whether a sufficient number of orthogonal building modules is available to build more complex shapes and whether the folding process can be guided.
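The "always possible" claim can be illustrated with a small graph argument: if every edge of a connected polyhedral graph is doubled, every vertex has even degree, so a closed walk that uses each original edge exactly twice must exist. The sketch below finds one such walk for a tetrahedron with Hierholzer's algorithm. It is only an illustration of the counting argument; the function names and the vertex numbering are assumptions made here, and a real CCPO design must also respect constraints that are ignored in this sketch, such as the parallel or antiparallel orientation of each CC dimer.

```python
from collections import Counter, defaultdict


def double_trace(edges):
    """Return a closed walk using every edge of a connected graph exactly twice.

    Assumes `edges` is a non-empty list of (u, v) pairs of a connected graph.
    """
    # Doubling each edge makes every vertex degree even, so an Eulerian
    # circuit of the doubled multigraph is guaranteed to exist.
    doubled = list(edges) + list(edges)
    adjacency = defaultdict(list)  # vertex -> list of (neighbour, edge id)
    for edge_id, (u, v) in enumerate(doubled):
        adjacency[u].append((v, edge_id))
        adjacency[v].append((u, edge_id))

    used = [False] * len(doubled)
    stack = [doubled[0][0]]
    walk = []
    # Iterative Hierholzer's algorithm.
    while stack:
        vertex = stack[-1]
        # Discard edge copies already traversed from the other endpoint.
        while adjacency[vertex] and used[adjacency[vertex][-1][1]]:
            adjacency[vertex].pop()
        if adjacency[vertex]:
            neighbour, edge_id = adjacency[vertex].pop()
            used[edge_id] = True
            stack.append(neighbour)
        else:
            walk.append(stack.pop())
    return walk


def edge_usage(walk):
    """Count how often each undirected edge appears in a walk."""
    return Counter(frozenset(step) for step in zip(walk, walk[1:]))


# Hypothetical vertex numbering 0..3 for the four corners of a tetrahedron.
tetrahedron = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
walk = double_trace(tetrahedron)
print(walk)              # 13 vertices: 12 edge traversals, start == end
print(edge_usage(walk))  # each of the 6 edges is counted twice
```

Each of the 12 steps in the walk corresponds to one CC-forming segment of the chain, which is why a single-chain tetrahedron needs 12 segments forming 6 dimers.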
While the design of an expanded CC toolbox is also progressing (3), two recent reports demonstrated that CCPO technology can advance beyond this limitation (4,5).
First, we realized that if the CCPO is composed of more than a single chain, the same building modules might be used in each of the chains. This turned out not to be a trivial task; nevertheless, we demonstrated that CCPOs can self-assemble from two chains that are divided in either an asymmetric (e.g. 12+2) or a pseudo-symmetric (9+9) way. This principle was demonstrated on a trigonal bipyramid, a new protein cage composed of 9 edges. In the pseudo-symmetric assembly, the bipyramid is composed of two preorganized tetrahedral halves that can have the same three CC modules in each chain. In addition, we demonstrated that proteolysis can be used to regulate the self-assembly. This is reported in a paper by Fabio Lapenta et al. (4), in which he worked hard to crack some unexpected problems regarding how to design the sequence and divide it into two segments. Characterization by SAXS turned out to be key: an assembly in which the termini are at opposite vertices of the bipyramid looked at first glance to be in order, but SAXS revealed that the resulting cage was collapsed. Nevertheless, after many attempts Fabio managed to identify the design rule under which the assembly formed as planned (4).
The second new approach was to exploit the possibility of a stepwise folding pathway in modular proteins such as CCPOs. The question we posed was which parameter governs the order in which CC modules form during folding. Due to their modular architecture, CCPO cages are a great model system for investigating the role of the distance between interacting segments in determining the folding pathway, relative to the role of their thermodynamic stability. It has long been established that the folding rates of small proteins correlate with their contact order, i.e. the average separation in the amino acid sequence between residues that form native contacts. Using stopped-flow kinetics experiments on site-specifically labelled proteins, we found that the distance between contacting segments plays a key role in determining the folding pathway (5). Simulations and stopped-flow FRET experiments showed that folding of CCPO cages is governed by the distance between pairs of building modules in the primary structure and in subsequent folding intermediates. The folding results were implemented in a robust mathematical folding model, which was successfully used for the design of CCPO tetrahedra containing an increasing number of identical CC building modules.
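To make the notion of contact order concrete, here is a minimal sketch that computes the average sequence separation over a list of native contacts. The function name and the two contact lists are hypothetical and chosen only for illustration; the optional division by chain length gives the commonly used relative contact order. This is not the folding model used in the paper.

```python
def contact_order(contacts, chain_length=None):
    """Average sequence separation between residues in native contact.

    `contacts` is a non-empty list of (i, j) residue index pairs.
    If `chain_length` is given, the result is divided by it, which
    yields the relative contact order.
    """
    separations = [abs(i - j) for i, j in contacts]
    mean_separation = sum(separations) / len(separations)
    if chain_length is None:
        return mean_separation
    return mean_separation / chain_length


# Hypothetical contact maps for a 20-residue chain.
local_contacts = [(1, 4), (2, 5), (3, 6)]        # partners close in sequence
distant_contacts = [(1, 20), (2, 19), (3, 18)]   # partners far apart in sequence

print(contact_order(local_contacts))                      # 3.0
print(contact_order(distant_contacts))                    # 17.0
print(contact_order(distant_contacts, chain_length=20))   # relative contact order
```

In a modular CCPO chain the same idea applies at the level of whole segments: two CC-forming segments that are neighbours in the sequence have a short distance to bridge, while segments at opposite ends of the chain do not.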
The best way to test a rule is to check what predictions can be made from it and to test them experimentally. We realized that if we could guide the folding pathway, we should in principle be able to use the same type of building module several times within the same chain, provided we knew how to arrange the copies so that each would interact uniquely with its correct partner. Jana Aupič prepared a simple mathematical model that evaluates the folding potential of all combinations and found that, by applying the above-mentioned rule, we could indeed design such a sequence. A sequence for a tetrahedron was designed in which one of the six pairs occurred twice. According to SAXS, CD and refolding experiments, the design folded as efficiently as the design composed of unique pairs. The same was observed for the design in which two segments were repeated, and even for the design composed of only three different CC pairs, all prepared and tested by Žiga Strmšek. Remarkably, only half of the initial orthogonal building modules were sufficient for successful folding.
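The idea of arranging repeated modules so that each finds its correct partner can be illustrated with a deliberately simplified toy, which is not the model from the paper. It assumes that segments are written as letters, that an upper-case segment pairs only with the matching lower-case one, and that folding is greedy: the compatible unpaired segments that are closest in sequence pair first. All names, the letter encoding and the 8-segment example chains are assumptions made for this sketch, and the chains are not real polyhedral designs.

```python
def greedy_pairing(chain):
    """Pair compatible segments greedily, closest in sequence first.

    `chain` is a sequence of single letters; 'A' is assumed to pair
    only with 'a', 'B' only with 'b', and so on (toy encoding).
    """
    candidates = sorted(
        (j - i, i, j)
        for i in range(len(chain))
        for j in range(i + 1, len(chain))
        if chain[i].swapcase() == chain[j]
    )
    paired = set()
    pairs = set()
    for _, i, j in candidates:
        if i not in paired and j not in paired:
            pairs.add((i, j))
            paired.update((i, j))
    return pairs


def folds_as_intended(chain, intended_pairs):
    """True if the greedy toy pathway reproduces the intended pairing."""
    return greedy_pairing(chain) == set(intended_pairs)


# The A/a module is used twice in both chains; only the arrangement differs.
good_chain = "ABaCbAca"
good_target = [(0, 2), (1, 4), (3, 6), (5, 7)]

bad_chain = "ABaCabAc"
bad_target = [(0, 4), (1, 5), (2, 6), (3, 7)]

print(folds_as_intended(good_chain, good_target))  # True
print(folds_as_intended(bad_chain, bad_target))    # False
```

In the second chain the two copies of the repeated module sit closer to the wrong partner than to the intended one, so the greedy pathway mis-pairs them. Screening all arrangements with a check of this kind is one simple way to picture what "evaluating all combinations" could look like.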
Our results have several implications. First, they demonstrate that the achievable size and complexity of modular coiled-coil-based assemblies is not limited by the available orthogonal set, as previously thought. More broadly, our results bear significance for naturally occurring tandem repeat proteins, which represent a substantial part of the eukaryotic proteome. It has been observed that homologous domains are only rarely positioned adjacently in the amino acid sequence of multi-domain proteins, which may serve to prevent misfolding. For cellular proteins, avoiding misfolding is crucial, since it not only leads to loss of protein function but can also cause the formation of aggregate species connected to pathologies.
We expect that similar rules hold for other programmable biopolymers, such as nucleic acids, although there a shortage of orthogonal sequences has not been an issue. In fact, we showed several years ago that DNA oligonucleotides of different stability can be used to make knotted structures that fold rapidly (6).
The CCPO protein design platform, which is the topic of the ERC Advanced Grant MaCChines, has so far kept delivering new concepts: from the first demonstration of a 3D CC-based tetrahedron and the in vivo folding of CCPOs, to the design of the folding pathway, the multiple use of the same type of CC modules, assembly from multiple chains, and the introduction of its regulation.
Department of Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana, Slovenia
1. Ljubetič, A. et al. Design of coiled-coil protein-origami cages that self-assemble in vitro and in vivo. Nat. Biotechnol. 35, 1094–1101 (2017).
2. Gradišar, H. et al. Design of a single-chain polypeptide tetrahedron assembled from coiled-coil segments. Nat. Chem. Biol. 9, 362–6 (2013).
3. Boldridge, W. C. et al. A multiplexed bacterial two-hybrid for rapid characterization of protein-protein interactions and iterative protein design. bioRxiv 2020.11.12.377184 (2020). doi:10.1101/2020.11.12.377184
4. Lapenta, F. et al. Self-assembly and regulation of protein cages from pre-organised coiled-coil modules. Nat. Commun. 12, 1–12 (2021).
5. Aupič, J. et al. Designed folding pathway of modular coiled-coil-based proteins. Nat. Commun. 12, 1–12 (2021).
6. Kočar, V. et al. Design principles for rapid folding of knotted DNA nanostructures. Nat. Commun. 7, (2016).