Science, discussed.

Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites

In their latest submission to bioRxiv, Echave, Jackson, and Wilke present a model to explain site-specific evolutionary rates among amino acids in a protein based on biophysical principles.

Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites
Julian Echave, Eleisha L. Jackson, Claus O. Wilke

They compare two approaches, based on (1) the overall change in thermodynamic stability after mutation and (2) the change in internal stress due to the packing of side chains in the protein interior. The latter is measured with an Elastic Network Model that treats pairwise atomic distances as harmonic springs. Mutations act as perturbations of these springs, which are at an energy minimum only for the wild-type reference structure.
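To make the spring picture concrete, here is a minimal sketch of how a stress energy could be computed. It is not the authors' implementation: the function names, the contact cutoff, the spring constant, and the Gaussian perturbation of rest lengths are all assumptions chosen for illustration. Every pair of nodes within the cutoff is joined by a spring whose rest length is the wild-type distance, and a "mutation" at a site randomly shifts the rest lengths of the springs attached to it.

```python
import math
import random

def contacts(ref_coords, cutoff):
    """Pairs of nodes that lie within `cutoff` of each other in the reference."""
    n = len(ref_coords)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(ref_coords[i], ref_coords[j]) <= cutoff]

def stress_energy(ref_coords, site, cutoff=7.0, k=1.0, sigma=0.3, seed=0):
    """Harmonic energy of the wild-type structure under 'mutated' springs.

    Assumption: mutating `site` shifts the rest length of each spring that
    touches it by a Gaussian random amount (std. dev. `sigma`). Pass
    site=None for the unperturbed wild type.
    """
    rng = random.Random(seed)
    energy = 0.0
    for i, j in contacts(ref_coords, cutoff):
        d0 = math.dist(ref_coords[i], ref_coords[j])  # wild-type distance
        rest = d0
        if site in (i, j):
            rest = d0 + rng.gauss(0.0, sigma)  # perturbed rest length
        energy += 0.5 * k * (d0 - rest) ** 2
    return energy

# Toy chain of four nodes spaced 3.8 apart (hypothetical coordinates).
wild_type = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

e_wt = stress_energy(wild_type, site=None)  # no spring is perturbed
e_mut = stress_energy(wild_type, site=1)    # springs touching node 1 are perturbed
print(e_wt, e_mut)
```

In this toy version the wild type sits exactly at the energy minimum, and any mutation raises the energy. Sites with more contacts have more springs to perturb, which is the intuition behind linking packing to evolutionary rate.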

They find that the two models perform equally well on average in predicting empirical site variations of evolutionary rates, but that their predictive power can vary when compared between individual proteins, suggesting that internal stress plays a larger role in some proteins than in others.

One potential shortcoming of their stability model is the assumption that epistasis is negligible. In other words, mutational effects per site are treated independently and additively. The importance of epistasis has been argued by many, although there is conflicting evidence in the case of stability effects of mutations. Some argue that the stability change of a particular substitution is independent of the genetic background, i.e. the identity of amino acids at non-mutated sites. Others argue for epistasis, finding that compensatory mutations will enable otherwise destabilizing substitutions at other sites.

Interestingly, one lab has published two sets of experimental results on the same protein (influenza nucleoprotein), finding no evidence for stability-based epistasis in one paper but finding it in another. I think this illustrates that stability-based epistasis needs more attention.

I would like to see a more nuanced discussion of epistasis than just citing one half of the literature. How would epistasis have to be implemented? What are the challenges? Their “neutral” stability model that either accepts or rejects a mutation based on a single stability cut-off would probably need to be extended to allow stability effects of varying magnitudes.
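As a rough illustration of that last point, the sketch below contrasts a hard stability cut-off with one possible graded alternative. None of it is taken from the paper: the function names, the two-state fraction-folded fitness, the acceptance rule, and the numbers are assumptions. The sign convention is that folding free energy dG is negative for a stable protein and ddG is positive for a destabilizing mutation.

```python
import math

KT = 0.593  # kcal/mol at roughly 298 K

def accept_neutral(dg_wt, ddg, dg_threshold=0.0):
    """Hard cut-off: accept the mutation only if the mutant stays stable enough."""
    return dg_wt + ddg <= dg_threshold

def fraction_folded(dg, kt=KT):
    """Two-state folding: fraction of molecules in the folded state."""
    return 1.0 / (1.0 + math.exp(dg / kt))

def accept_probability_soft(dg_wt, ddg, kt=KT):
    """Graded alternative (an assumption): acceptance scales with the ratio of
    mutant to wild-type fraction folded, capped at 1."""
    ratio = fraction_folded(dg_wt + ddg, kt) / fraction_folded(dg_wt, kt)
    return min(1.0, ratio)

dg_wt = -5.0  # hypothetical wild-type stability in kcal/mol
for ddg in (1.0, 4.0, 6.0):
    print(ddg, accept_neutral(dg_wt, ddg), round(accept_probability_soft(dg_wt, ddg), 3))
```

Under the hard cut-off, the first two mutations are accepted and the third is rejected outright. The graded version instead assigns a steadily decreasing acceptance probability, which is one way that stability effects of varying magnitude could enter the model.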

They also compare two force fields commonly used for stability calculations, as provided by FoldX and Rosetta’s ddg_monomer, finding only a negligible difference in their model predictions. All stability effect predictors share the same fundamental flaw of treating a single conformation as the reference native structure. One of the major remaining challenges, however, is the proper treatment of unfolded conformations. A large proportion of stability effects is likely to manifest in the unfolded ensemble. Simulations of protein folding are still very time-consuming and are therefore unlikely to be applied to mutational studies on the scale presented here any time soon.

The advances in combining evolutionary models with structural/biophysical data are very promising and the authors have made very significant contributions (especially in previous publications). The present article suggests that future models will incorporate a combination of different structure-based measures.


About Tobias Sikosek

Researcher in Computational Biophysics and Protein Evolution at the University of Toronto

8 comments on “Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites”

  1. Julian Echave


    Thank you for your interest and comments.

    Epistasis is “lost” in the model when one makes the mean-field assumption that the ddG due to mutations is sequence independent (i.e. that ddG is additive). Of course, the ddG due to mutating a site does depend on sequence, especially that of the immediate environment of the mutated site, and this is automatically taken into account since ddG is calculated for the protein sequence of a member of the family. So this assumption is not as “strong” as it sounds: it will be valid as long as ddG is independent of the precise sequence within proteins of the same protein family, which are expected to conserve amino-acid properties.

    Even though one could imagine ways to relax such an assumption, for example by calculating ddG for different proteins of the same family, it does not seem necessary in order to account for patterns of relative average site-specific evolution, which is the aim of the paper. Such rates are usually themselves obtained by making a “mean-field” assumption: the sites are assumed to evolve independently, i.e. the substitution probability matrix of the whole sequence is assumed to be the product of the substitution probability matrices of the sites, which amounts to the assumption of additive ddG for a ddG-dependent model.

    I think one could somehow extend the model to consider site-site coevolution. However, when doing so, one should take into account that methods to find coevolving sites frequently use as “null model” an independent-sites site-independent substitution process. Any departure from such a model is interpreted as coevolution. Therefore, one could obtain seemingly coevolving sites that are not actually coevolving but rather have similar true substitution matrices that are different from the model matrix. This sort of “coevolution” could be captured by the present model. For example, it could be used to simulate sets of sequences following a given tree, to be used as an improved “baseline” to detect truly coevolving sites, or to obtain a better null model that is independent-sites but not sites-independent and allows one to define coevolving sites as those whose correlations are higher than expected under the present model (this is the way correlation is defined in physics: as departures from a self-consistent mean-field approximation).

  2. James V. Kohl

    Metabolism of nutrients to species-specific pheromones that control the physiology of reproduction in species from microbes to man enables epistasis via nutrient-dependent RNA-directed DNA methylation and RNA-mediated amino acid substitutions that differentiate cell types in the context of thermodynamic cycles of protein biosynthesis and degradation.

    Species-specific pheromones control the physiology of reproduction and “fix” the amino acid substitutions in populations. The substitutions stabilize the DNA in organized genomes and the substitutions prevent most of the damage that would otherwise result from accumulated mutations.

    Nutrient-dependent DNA repair mechanisms also typically prevent the accumulation of damage that might otherwise occur during life history transitions. For example, vitamin D uptake or natural production links ecological adaptations in populations where malaria is endemic via substitution of the amino acid associated with the hemoglobin S (sickle cell) variant.

    The idea of protein evolution is foreign to me because protein evolution is not exemplified in model organisms. The universal trend of amino acid gain and loss in proteins occurs when nutrient-dependent substitutions stabilize the DNA in organized genomes. Accumulated mutations that cause too much dysfunction are typically eliminated — until nutrient stress and/or social stress overwhelm the ability of organisms that might otherwise continue to ecologically adapt as if they were immortal.

    Two external factors cause changes in amino acid compositions of proteins in all genera that lead to biodiversity via mortality of individuals. The external factors are nutrient uptake and the pheromone-controlled physiology of reproduction. Together, they link the epigenetic landscape to the physical landscape of DNA in organized genomes via the bio-physically constrained chemistry of protein folding. That is how protein folding can be linked to the conserved molecular mechanisms of ecological adaptations in all species via amino acid substitutions.

    • @p_gl

      Dear James, thank you for commenting on our blog post “Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites”. While we appreciate your contribution, we believe that the clarity of your comment and its relevance for this post could be improved, in particular for what concerns the initial three paragraphs.
      We thus invite you to elaborate further, and kindly ask that you provide references for your arguments.

      • James V. Kohl

        In our 1996 Hormones and Behavior review we wrote: “Small intranuclear proteins also participate in generating alternative splicing techniques of pre-mRNA and, by this mechanism, contribute to sexual differentiation…” No other conserved molecular mechanisms have since been linked to cell type differentiation of any cells in any individuals of any genera. I have elaborated on our claim that cell type differentiation is RNA-mediated in a series of published works and included examples in:

        Nutrient-dependent/pheromone-controlled adaptive evolution: a model.
        Kohl JV. Socioaffect Neurosci Psychol. 2013 Jun 14;3:20553. doi: 10.3402/snp.v3i0.20553. eCollection 2013. Review.

        I welcome comments from anyone who can place the bio-physically constrained chemistry of protein folding into any other context without evolutionary inferences. For example, claims that proteins evolve have not been supported by experimental evidence of biologically-based cause and effect. Links between metabolic networks and genetic networks clearly show how ecological variation leads to ecological adaptations without the involvement of mutations that perturb protein folding.

        See also:
        Human pheromones and food odors: epigenetic influences on the socioaffective nature of evolved behaviors. Kohl JV. Socioaffect Neurosci Psychol. 2012 Mar 15;2:17338. doi: 10.3402/snp.v2i0.17338. eCollection 2012.

        Human pheromones: integrating neuroendocrinology and ethology. Kohl JV, Atzmueller M, Fink B, Grammer K. Neuro Endocrinol Lett. 2001 Oct;22(5):309-21. Review.

        From fertilization to adult sexual behavior. Diamond M, Binstock T, Kohl JV.
        Horm Behav. 1996 Dec;30(4):333-53.

  3. James, as far as I could understand your comment, I think we’re talking about different levels of organization. The paper that initiated this discussion deals with evolution at the level of single proteins, without considering epistatic interactions at levels higher than the single-protein level (e.g. protein-protein interactions, co-participation in metabolic networks, etc.). Such a single-molecule level is suitable for the study of site-dependent patterns of evolution of single-domain monomeric proteins, including site-site coevolution. At this level, epistasis refers to coevolution between different sites of the same protein molecule that may result from global protein-level constraints, such as stability, rapid folding, or the right native structure. Thus, even though your comment is interesting, I’m not sure of its relevance to the previous discussion.

  4. James V. Kohl

    Thank you. I think we are using different terms. I use ecological variation and ecological adaptation because RNA-mediated events clearly link the epigenetic landscape to the physical landscape of DNA in the organized genomes of species from microbes to man via amino acid substitutions that stabilize protein folding, which is how cell types are differentiated.

    Cell type differentiation occurs in a context that links atoms to ecosystems via what is currently known about quantum mechanics, quantum smell, quantum biology, and quantum consciousness.

    Use of terms such as “protein evolution” or “co-evolution” introduces theories that include mutations and inferences that extend the time it takes for an amino acid substitution to concurrently effect differentiation of many different cell types, including cell type differentiation linked to species-specific behaviors. I included an example of a human population in what is now Central China in my model of cell type differentiation: Nutrient-dependent/pheromone-controlled adaptive evolution: a model.

    Additional support for my model has since shown up in “Multiple haplotype-resolved genomes reveal population patterns of gene and protein diplotypes”

    The population patterns seem to reveal what can be expected, which is that mutations perturb protein folding and that nutrient-dependent amino acid substitutions stabilize DNA in organized genomes. That strongly suggests that mutations do not lead to epistasis via what is currently known about biologically-based cause and effect at any level of examination or organization.

  5. James V. Kohl

    Excerpt: All biologically-based facts reported during the past year link feedback loops to A 3D Map of the Human Genome at Kilobase Resolution [that] Reveals [the] Principles of Chromatin Looping. Indeed, the reporting of the 2014 Breakthrough in regenerative medicine included a link to a video representation of the report on chromatin looping Reversing Aging Processes with One Protein: a 2014 Breakthrough

    The likelihood that mutations, which perturb protein folding, contribute to anything but pathology appears to have been all but eliminated from the thoughts of intelligent serious scientists during the past year.

  6. Pingback: Thermodynamic constraints did not evolve | RNA-Mediated


Creative Commons License
This blog is licensed under a Creative Commons Attribution 3.0 License.
