Paper title
From evolution to folding of repeat proteins
Paper authors
Paper abstract
Repeat proteins are made of tandem copies of similar amino acid stretches that fold into elongated architectures. Owing to their symmetry, these proteins constitute excellent model systems to investigate how evolution relates to structure, folding and function. Here, we propose a scheme to map evolutionary information at the sequence level to a coarse-grained model for repeat-protein folding and use it to investigate the folding of thousands of repeat proteins. We model the energetics by combining an inverse Potts model scheme with an explicit mechanistic model of duplications and deletions of repeats, to calculate the evolutionary parameters of the system at the single-residue level. These parameters inform an Ising-like model that allows for the generation of folding curves, apparent domain emergence and occupation of intermediate states that are highly compatible with experimental data in specific case studies. We analyzed the folding of thousands of natural Ankyrin-repeat proteins and found that a multiplicity of folding mechanisms is possible. Fully cooperative all-or-none transitions are obtained for arrays with enough sequence-similar elements and strong interactions between them, while non-cooperative element-by-element intermittent folding arises if the elements are dissimilar and the interactions between them are energetically weak. In between, we characterised nucleation-propagation and multi-domain folding mechanisms. Finally, we showed that stability and cooperativity of a repeat-array can be quantitatively predicted from a simple energy score, paving the way for guiding protein folding design with a co-evolutionary model.
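To make the idea of an Ising-like folding model concrete, the following is a minimal sketch of a generic one-dimensional model in which each repeat is either unfolded (0) or folded (1). It is not the parameterisation used in the paper: the per-repeat energies `eps`, the nearest-neighbour interface couplings `J`, and the uniform denaturing shift `x` (all in units of kT) are hypothetical inputs chosen for illustration, whereas the paper derives its energetics from evolutionary sequence information. The sketch enumerates all states exactly, so it is only practical for short arrays.

```python
import itertools
import math


def state_probabilities(eps, J, x=0.0):
    """Boltzmann probability of every folded/unfolded pattern of the array.

    eps[i]: hypothetical intrinsic folding energy of repeat i (kT units).
    J[i]:   hypothetical coupling between repeats i and i+1 when both are
            folded (negative values are stabilising).
    x:      hypothetical uniform destabilising shift added to every repeat,
            standing in for denaturant or temperature.
    """
    n = len(eps)
    weights = {}
    for s in itertools.product((0, 1), repeat=n):
        energy = sum((eps[i] + x) * s[i] for i in range(n))
        energy += sum(J[i] * s[i] * s[i + 1] for i in range(n - 1))
        weights[s] = math.exp(-energy)
    z = sum(weights.values())  # partition function
    return {s: w / z for s, w in weights.items()}


def fraction_folded(eps, J, x=0.0):
    """Mean fraction of folded repeats; scanning x gives a folding curve."""
    probs = state_probabilities(eps, J, x)
    return sum(p * sum(s) for s, p in probs.items()) / len(eps)


def intermediate_population(eps, J, x=0.0):
    """Total probability of partially folded states (neither all 0 nor all 1)."""
    n = len(eps)
    probs = state_probabilities(eps, J, x)
    return sum(p for s, p in probs.items() if 0 < sum(s) < n)
```

Scanning `x` with `fraction_folded` traces a folding curve, and `intermediate_population` evaluated near the transition midpoint gives a rough handle on cooperativity: in this toy model, strongly coupled arrays populate fewer partially folded states than uncoupled ones, in line with the qualitative trend described in the abstract.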