Most proteins in all organisms undergo crucial N-terminal modifications involving N-terminal methionine excision, N-α-acetylation or N-myristoylation (N-Myr), or S-palmitoylation. We investigated the occurrence of these poorly annotated but essential modifications in proteomes, focusing on eukaryotes. Experimental data for the N-terminal sequences of animal, fungal, and archaeal proteins were used to build dedicated predictive modules in a new software tool. In vitro N-Myr experiments were performed with both plant and animal N-myristoyltransferases to allow accurate prediction of this modification. N-terminal modifications of proteins from Arabidopsis thaliana, whose genome is fully sequenced, were determined by mass spectrometry (MS). We identified 105 new modified protein N-termini, which were used to check the accuracy of the predictions. An accuracy of more than 95% was achieved, demonstrating (i) overall conservation of the specificity of the modification machinery in higher eukaryotes and (ii) robustness of the prediction tool. Predictions were made for various proteomes. Proteins that had undergone both N-terminal methionine (Met) cleavage and N-acetylation were found to be strongly overrepresented among the most abundant proteins, in contrast to those retaining their genuine unblocked Met. Here we propose that the nature of the second residue of an ORF is a key marker of the abundance of the mature protein in eukaryotes.