The QUALITY ASSURANCE Team was headed by John Markley, PhD, and was responsible for assessing the integrity of CESG proteins.
All purified proteins from the E. coli cell-based pipeline and all protein candidates for structure determination produced from the wheat germ cell-free pipeline were analyzed by MALDI-TOF and ESI mass spectrometry (MS) to confirm target protein identity and integrity and, as relevant, to determine incorporation of selenomethionine (Se-Met) or isotopes. Proteins chemically modified by reductive methylation (to enhance crystallization) were also analyzed.
CESG initially focused on the plant Arabidopsis thaliana as its structural genomics target but has since expanded to mouse, human, and other organisms. This and other structural genomics projects aim to describe new protein folds so that, ultimately, examples of all folds found in nature are available to researchers.
Mass spectrometry has become an extremely useful technique in protein research of all kinds. Electrospray Ionization (ESI) MS can determine the mass of whole proteins to within several daltons. Proteolytic digestion combined with Matrix-Assisted Laser Desorption/Ionization Time-of-Flight (MALDI-TOF) MS and Liquid Chromatography Tandem Mass Spectrometry (LC-MS/MS) can determine amino acid sequence and protein identity.
It was expected that routine use of MS in this project would safeguard against misidentified proteins, determine incorporation of selenomethionine and of 15N and 13C labels, and provide evidence for post-translational modifications and covalently bound cofactors. MS was also expected to be useful in solving less routine, more research-oriented challenges.
Cases in which the initial MS results failed to agree with the predicted mass were investigated further through peptide sequencing by tryptic digestion and LC-MS/MS. The efficiency of reductive methylation ranged from 85% to greater than 95%. Incorporation of Se-Met was greater than 90% in the great majority of cases, while incorporation of the stable isotope 13C was generally greater than 95%. Incorporation of 15N appears to be as efficient in wheat germ cell-free protein synthesis as in E. coli-based synthesis. Use of ICP-MS to detect metal ligands resulted in the identification of proteins with bound Ni2+, Fe2+, or Ca2+.
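The two intact-mass checks described above, agreement with the predicted mass and the degree of Se-Met incorporation, can be illustrated with a short Python sketch. This is not CESG's actual analysis code. The 5 Da tolerance, the function names, and the example protein are assumptions chosen for illustration. The only physical input is the difference between the standard average atomic masses of selenium and sulfur, which is the mass gained each time a methionine is replaced by selenomethionine.

```python
# Standard average atomic masses in Da (rounded). Replacing Met with Se-Met
# swaps one S for one Se, so each substituted residue gains this much mass.
SE_MINUS_S_DA = 78.971 - 32.06


def agrees_with_prediction(observed_da, predicted_da, tolerance_da=5.0):
    """Return True when the observed intact mass is within the tolerance.

    The 5 Da default is an assumed stand-in for "within several daltons".
    """
    return abs(observed_da - predicted_da) <= tolerance_da


def semet_incorporation(observed_da, unlabeled_da, n_met):
    """Estimate the fraction of Met positions that carry Se.

    observed_da  : measured average mass of the labeled protein
    unlabeled_da : predicted average mass with ordinary methionine
    n_met        : number of methionine residues in the sequence
    """
    if n_met <= 0:
        raise ValueError("the protein must contain at least one methionine")
    fraction = (observed_da - unlabeled_da) / (n_met * SE_MINUS_S_DA)
    # Clamp so that measurement noise cannot yield values outside 0..1.
    return max(0.0, min(1.0, fraction))


# Hypothetical protein: 20,000 Da unlabeled, 4 Met, observed at 20,178 Da.
print(f"{semet_incorporation(20178.0, 20000.0, 4):.1%}")
```

Because an intact-mass measurement of a mixed population reports a weighted average, the value returned is an average incorporation across all methionine positions rather than a per-site figure.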
Routine metal analysis of purified proteins using Inductively Coupled Plasma Mass Spectrometry (ICP-MS) has detected metals in 44 of 171 proteins assayed (26%); by comparison, 17% of the 139,947 proteins in the Swiss-Prot protein sequence database on 11/28/03 were associated with one or more “biological” metals. Eleven of these 44 proteins have gone on to produce structures. Nickel was the most commonly detected metal in our assays, followed, in order of abundance, by Fe > Zn > Ca > Cu > Mn. We suspect that nickel from IMAC columns sometimes substitutes for the physiological metal during purification, or binds non-specifically to some proteins. Satisfyingly, after discounting proteins that might contain adventitious Ni, the frequency of metals found in the proteins purified by CESG matches the distribution observed in the Swiss-Prot database (in order of abundance: Fe > Zn > Ca > Mg > Mn > Cu > Mo > Co > Ni).
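The arithmetic behind these figures can be checked with a few lines of Python. The sketch below recomputes the observed fraction and, as an illustrative extra that the original analysis does not report, asks how likely 44 or more metal-containing proteins would be among 171 if each protein independently had the 17% Swiss-Prot annotation rate. The independence assumption and the helper name `binomial_upper_tail` are assumptions of this sketch, and the two populations are not strictly comparable because adventitious nickel inflates the assay count.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


assayed = 171           # proteins assayed by ICP-MS
with_metal = 44         # proteins in which a metal was detected
swissprot_rate = 0.17   # fraction of Swiss-Prot entries with a "biological" metal

observed_fraction = with_metal / assayed
tail = binomial_upper_tail(with_metal, assayed, swissprot_rate)

print(f"observed fraction: {observed_fraction:.1%}")
print(f"P(X >= {with_metal} | n={assayed}, p={swissprot_rate}): {tail:.4f}")
```

A small tail probability would only indicate that the assayed set differs from the database rate under these assumptions. It would not distinguish physiological metals from nickel picked up during purification.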
Enzyme Activity Screens
Proteins purified in substantial quantities by CESG are run through a series of high-throughput screens designed to detect phosphatases, phosphodiesterases, esterases, dehydrogenases, and monooxygenases. These enzyme activity screens are based on those described in Kuznetsova et al. [Enzyme Genomics: Application of general enzymatic screens to discover new enzymes. FEMS Microbiol Rev (2005) 29(2):263-79]. In addition, many CESG targets have been further characterized to reveal their biological substrates. CESG has solved the structures of a number of proteins for which enzyme assays have demonstrated activity.
Multiple Approaches to Functional Insights
We pursued several routes to functional discovery:
Nomination of Targets by the Scientific Community has resulted in seventeen structures with pre-established biological relevance, including the human heme degradative enzyme, heme oxygenase.
High-Throughput Enzyme Activity Screening has been applied to about 500 purified proteins and has identified a large number of putative phosphodiesterases, phosphatases, dehydrogenases, and oxidases. Several enzymes have been subjected to further extensive analysis in collaboration with labs with expertise in functional characterization, including a plant ADP-glucose phosphorylase that may play an unrecognized role in starch production.
Metal Analysis by ICP-MS has been applied routinely and has discovered metals in over one quarter of the 171 purified proteins surveyed, including an iron-bound cysteine dioxygenase from mouse, whose human homolog has been implicated in a number of diseases.
NMR Functional Follow-Up Studies have identified substrates and protein:protein interactions, including a frog poly(A)-binding protein that may play a fundamental role in RNA processing and poly(A) metabolism.
X-ray Crystallography Functional Follow-Up Studies have been performed on a number of proteins with successfully determined structures, including a series of structures of mouse pyrimidine 5’-nucleotidase type 1 trapped in various catalytic states. A deficiency in this enzyme’s activity in humans results in nonspherocytic hemolytic anemia.
Notably, researchers outside of CESG have been involved in many of the functional follow-ups as well as in analysis of community-nominated target structures, resulting in eighteen collaborative publications thus far.