In December 2013, Stergachis and colleagues published a high-profile paper on the dual role that transcription factors (TFs) might have as a consequence of binding to protein-coding regions.
This observation could imply that not all codons in a protein sequence are equal, as some would be under selective pressure due to TF binding. The implications, as nicely summarised by Weatheritt and Babu in a commentary also published in Science, go beyond protein expression profiles and might extend to pathogen targets, diseases, etc.
The dual function of TFs also means some core molecular biology concepts would have to be revised.
In a paper just accepted for publication in Molecular Biology and Evolution, Ke Xing and Xionglei He revisit Stergachis’ analysis and point to a bias in the phylogenetic scoring that affected the original conclusions. In their reanalysis, no significant differences in codon usage (particularly at the third base of the codon) can be observed between TF-bound and TF-depleted sites (see fig 1 from the paper, left). In brief, and without being a computational biologist myself, the implication is that TF binding in coding regions of the genome is most likely a by-product of the low complexity of TF recognition sequences. This does not exclude a functional role for TF binding in exons, or effects of other trans factors on properties such as expression and evolution. However, it suggests that the ‘duon’ hypothesis results from the bias of TFs towards G/C sites rather than from a secondary selective pressure on protein function.
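To make the third-base comparison a little more concrete, here is a minimal Python sketch of how the G/C content at the third codon position could be computed for two sets of sequences. It is only an illustration, not the method used in either paper: the function name `gc3_fraction` and the two toy sequences are my own assumptions, chosen to show the extremes.

```python
def gc3_fraction(coding_seq: str) -> float:
    """Return the fraction of codons whose third base is G or C.

    Assumes the sequence starts in frame; trailing bases that do not
    form a complete codon are ignored.
    """
    seq = coding_seq.upper()
    usable = len(seq) - len(seq) % 3      # drop an incomplete final codon
    third_bases = seq[2:usable:3]         # every third base, starting at index 2
    if not third_bases:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Toy, made-up sequences (hypothetical, not real data):
tf_bound = "GCCGAGCTGACC"      # codons GCC GAG CTG ACC
tf_depleted = "GCAGAATTAACT"   # codons GCA GAA TTA ACT

print(gc3_fraction(tf_bound), gc3_fraction(tf_depleted))
```

In a real comparison one would compute this over many sites and control for the base composition of the TF recognition sequences themselves, which is the kind of confounder at the centre of the reanalysis.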
I expect the duon hypothesis to be discussed further, and I hope more data will be gathered on this topic: challenging existing theories (e.g. on gene regulation) and envisioning new underlying biological processes (e.g. indirect and localised selective pressures) are the best outcomes scientific publishing can aim for.