102 research outputs found
A Helix Propensity Scale Based on Experimental Studies of Peptides and Proteins
The average globular protein contains 30% α-helix, the most common type of secondary structure. Some amino acids occur more frequently in α-helices than others; this tendency is known as helix propensity. Here we derive a helix propensity scale for solvent-exposed residues in the middle positions of α-helices. The scale is based on measurements of helix propensity in 11 systems, including both proteins and peptides. Alanine has the highest helix propensity, and, excluding proline, glycine has the lowest, ∼1 kcal/mol less favorable than alanine. Based on our analysis, the helix propensities of the amino acids are as follows (kcal/mol): Ala=0, Leu=0.21, Arg=0.21, Met=0.24, Lys=0.26, Gln=0.39, Glu=0.40, Ile=0.41, Trp=0.49, Ser=0.50, Tyr=0.53, Phe=0.54, Val=0.61, His=0.61, Asn=0.65, Thr=0.66, Cys=0.68, Asp=0.69, and Gly=1.
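The scale above can be held in a small lookup table. The sketch below is illustrative only: the values are copied from the abstract, proline is left out because no value is given for it here, and the helper name `total_helix_penalty` is hypothetical. Summing per-residue values over a sequence is a simplifying assumption, since the scale is stated only for solvent-exposed residues in the middle positions of α-helices.

```python
# Helix propensities in kcal/mol relative to Ala, as listed in the abstract.
# Proline is omitted: the abstract text gives no value for it.
HELIX_PROPENSITY = {
    "A": 0.00, "L": 0.21, "R": 0.21, "M": 0.24, "K": 0.26,
    "Q": 0.39, "E": 0.40, "I": 0.41, "W": 0.49, "S": 0.50,
    "Y": 0.53, "F": 0.54, "V": 0.61, "H": 0.61, "N": 0.65,
    "T": 0.66, "C": 0.68, "D": 0.69, "G": 1.00,
}


def total_helix_penalty(sequence):
    """Sum the per-residue penalties (kcal/mol) for a one-letter sequence.

    Raises KeyError for any residue without a listed value (e.g. Pro).
    """
    return sum(HELIX_PROPENSITY[residue] for residue in sequence.upper())


def residues_ranked():
    """Return residues ordered from most to least helix-favoring."""
    return sorted(HELIX_PROPENSITY, key=HELIX_PROPENSITY.get)


if __name__ == "__main__":
    for peptide in ("AAAAA", "AAWAA", "AAGAA"):
        print(peptide, round(total_helix_penalty(peptide), 2))
```

A lower total means the sequence is, under this additive assumption, more favorable for helix formation.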
Peptide Sequence and Conformation Strongly Influence Tryptophan Fluorescence
This article probes the denatured state ensemble of ribonuclease Sa (RNase Sa) using fluorescence. To interpret the results obtained with RNase Sa, it is essential that we gain a better understanding of the fluorescence properties of tryptophan (Trp) in peptides. We describe studies of N-acetyl-L-tryptophanamide (NATA), a tripeptide (AWA), and six pentapeptides (AAWAA, WVSGT, GYWHE, HEWTV, EAWQE, and DYWTG). The latter five peptides have the same sequence as those surrounding the Trp residues studied in RNase Sa. The fluorescence emission spectra, the fluorescence lifetimes, and the fluorescence quenching by acrylamide and iodide were measured in concentrated solutions of urea and guanidine hydrochloride. Excited-state electron transfer from the indole ring of Trp to the carbonyl groups of peptide bonds is thought to be the most important mechanism for intramolecular quenching of Trp fluorescence. We find the maximum fluorescence intensities vary from 49,000 for NATA with two carbonyls, to 24,400 for AWA with four carbonyls, to 28,500 for AAWAA with six carbonyls. This suggests that the four carbonyls of AWA are better able to quench Trp fluorescence than the six carbonyls of AAWAA, and this must reflect a difference in the conformations of the peptides. For the pentapeptides, EAWQE has a fluorescence intensity that is more than 50% greater than DYWTG, showing that the amino acid sequence influences the fluorescence intensity directly through side-chain quenching, indirectly through an influence on the conformational ensemble of the peptides, or both. Our results show that peptides are generally better models for the Trp residues in proteins than NATA. Finally, our results emphasize that we have much to learn about Trp fluorescence even in simple compounds.
Distinct Secondary Structures of the Leucine-Rich Repeat Proteoglycans Decorin and Biglycan: Glycosylation-Dependent Conformational Stability
Biglycan and decorin, closely related small leucine-rich repeat proteoglycans, have been overexpressed in eukaryotic cells and two major glycoforms isolated under native conditions: a proteoglycan substituted with glycosaminoglycan chains, and a core protein form secreted devoid of glycosaminoglycans. A comparative biophysical study of these glycoforms has revealed that the overall secondary structures of biglycan and decorin are different. Far-UV circular dichroism (CD) spectroscopy of decorin and biglycan proteoglycans indicates that, although they are predominantly β-sheet, biglycan has a significantly higher content of α-helical structure. Decorin proteoglycan and core protein are very similar, whereas the biglycan core protein exhibits closer similarity to the decorin glycoforms than to the biglycan proteoglycan form. However, enzymatic removal of the chondroitin sulfate chains from biglycan proteoglycan does not induce a shift to the core protein structure, suggesting that the final form is influenced by polysaccharide addition only during biosynthesis. Fluorescence emission spectroscopy demonstrated that the single tryptophan residue, which is at a conserved position at the C-terminal domain of both biglycan and decorin, is found in similar microenvironments. This indicates that, at least in this specific domain, the different glycoforms do exhibit apparent conservation of structure. Exposure of decorin and biglycan to 10 M urea resulted in an increase in fluorescence intensity, which indicates that the emission from tryptophan in the native state is quenched. Comparison of urea-induced protein unfolding curves provided further evidence that decorin and biglycan assume different structures in solution. Decorin proteoglycan and core protein unfold in a manner similar to a classic two-state model, in which there is a steep transition to an unfolded state between 1 and 2 M urea. The biglycan core protein also shows a similar steep transition. However, biglycan proteoglycan shows a broad unfolding transition between 1 and 6 M urea, probably indicating the presence of stable unfolding intermediates.
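The "classic two-state model" mentioned above can be sketched as follows. This is a minimal illustration, not the analysis used in the study: it assumes the common linear-extrapolation form, in which the unfolding free energy falls linearly with urea concentration, and the parameter values (`dg_water`, `m_value`) are hypothetical numbers chosen only to place the midpoint between 1 and 2 M urea.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(urea_molar, dg_water, m_value, temp_k=298.15):
    """Fraction of unfolded protein in a two-state (native <-> unfolded) model.

    Assumes dG_unfold = dg_water - m_value * [urea], with dg_water in kcal/mol
    and m_value in kcal/(mol*M).
    """
    dg_unfold = dg_water - m_value * urea_molar
    k_unfold = math.exp(-dg_unfold / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


def transition_midpoint(dg_water, m_value):
    """Urea concentration at which half of the protein is unfolded."""
    return dg_water / m_value


if __name__ == "__main__":
    # Hypothetical parameters, not taken from the abstract.
    dg_water, m_value = 3.0, 2.0
    print("midpoint (M):", transition_midpoint(dg_water, m_value))
    for urea in (0.0, 1.0, 1.5, 2.0, 3.0):
        print(urea, round(fraction_unfolded(urea, dg_water, m_value), 3))
```

In this picture a steep transition corresponds to a large `m_value`. A broad transition such as the one reported for biglycan proteoglycan is not described well by a single two-state curve, which is consistent with the abstract's suggestion of intermediates.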
Changing the net charge from negative to positive makes ribonuclease Sa cytotoxic
Ribonuclease Sa (pI = 3.5) from Streptomyces aureofaciens and its 3K (D1K, D17K, E41K) (pI = 6.4) and 5K (3K + D25K, E74K) (pI = 10.2) mutants were tested for cytotoxicity. The 5K mutant was cytotoxic to normal and v-ras-transformed NIH3T3 mouse fibroblasts, but RNase Sa and 3K were not. The structure, stability, and activity of the three proteins are comparable, but the net charge at pH 7 increases from -7 for RNase Sa to -1 for 3K and to +3 for 5K. These results suggest that a net positive charge is a key determinant of ribonuclease cytotoxicity. The cytotoxic 5K mutant preferentially attacks v-ras-NIH3T3 fibroblasts, suggesting that mammalian cells expressing the ras oncogene are potential targets for ribonuclease-based drugs.
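The net charges quoted above follow from simple bookkeeping, sketched here. The assumption, not stated in the abstract, is that at pH 7 each Asp or Glu side chain carries a charge of -1 and each Lys side chain a charge of +1, so every acid-to-lysine substitution raises the net charge by 2. The function name is hypothetical.

```python
def net_charge_after_swaps(start_charge, n_acid_to_lys):
    """Net charge after replacing n acidic residues (Asp/Glu) with Lys.

    Each swap removes one -1 charge and adds one +1 charge, a change of +2,
    assuming all of these side chains are fully ionized at pH 7.
    """
    return start_charge + 2 * n_acid_to_lys


if __name__ == "__main__":
    wild_type = -7  # net charge of RNase Sa at pH 7, from the abstract
    print("3K:", net_charge_after_swaps(wild_type, 3))
    print("5K:", net_charge_after_swaps(wild_type, 5))
```

Under this assumption the 3K and 5K variants come out at -1 and +3, matching the values reported in the abstract.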
Charge-charge interactions are key determinants of the pK values of ionizable groups in ribonuclease Sa (pI=3.5) and a basic variant (pI=10.2)
The pK values of the titratable groups in ribonuclease Sa (RNase Sa) (pI=3.5), and a charge-reversed variant with five carboxyl to lysine substitutions, 5K RNase Sa (pI=10.2), have been determined by NMR at 20 °C in 0.1 M NaCl. In RNase Sa, 18 pK values and in 5K, 11 pK values were measured. The carboxyl group of Asp33, which is buried and forms three intramolecular hydrogen bonds in RNase Sa, has the lowest pK (2.4), whereas Asp79, which is also buried but does not form hydrogen bonds, has the most elevated pK (7.4). These results highlight the importance of desolvation and charge–dipole interactions in perturbing pK values of buried groups. Alkaline titration revealed that the terminal amine of RNase Sa and all eight tyrosine residues have significantly increased pK values relative to model compounds. A primary objective in this study was to investigate the influence of charge–charge interactions on the pK values by comparing results from RNase Sa with those from the 5K variant. The solution structures of the two proteins are very similar as revealed by NMR and other spectroscopic data, with only small changes at the N terminus and in the α-helix. Consequently, the ionizable groups will have similar environments in the two variants and desolvation and charge–dipole interactions will have comparable effects on the pK values of both. Their pK differences, therefore, are expected to be chiefly due to the different charge–charge interactions. As anticipated from its higher net charge, all measured pK values in 5K RNase are lowered relative to wild-type RNase Sa, with the largest decrease being 2.2 pH units for Glu14. The pK differences (pKSa−pK5K) calculated using a simple model based on Coulomb's Law and a dielectric constant of 45 agree well with the experimental values. This demonstrates that the pK differences between wild-type and 5K RNase Sa are mainly due to changes in the electrostatic interactions between the ionizable groups. pK values calculated using Coulomb's Law also showed a good correlation (R=0.83) with experimental values. The more complex model based on a finite-difference solution to the Poisson–Boltzmann equation, which considers desolvation and charge–dipole interactions in addition to charge–charge interactions, was also used to calculate pK values. Surprisingly, these values are more poorly correlated (R=0.65) with the values from experiment. Taken together, the results are evidence that charge–charge interactions are the chief perturbant of the pK values of ionizable groups on the protein surface, which is where the majority of the ionizable groups are positioned in proteins. This work was supported by grants GM-37039 and GM-52483 from the National Institutes of Health (USA), grants BE-1060 and BE-1281 from the Robert A. Welch Foundation, and a grant PB-93-06777 to M.R. from the Dirección General de Investigación Científica y Técnica (Spain).
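A Coulomb's Law estimate of a pK shift can be sketched as below. The abstract does not spell out the authors' exact model, so this is only one plausible form: the interaction energy of each neighboring charge with a unit positive charge at the titrating site is computed with a uniform dielectric constant of 45 and converted to pH units by dividing by 2.303RT at 20 °C. The distance used in the example is hypothetical.

```python
import math

COULOMB_KCAL = 332.06  # kcal*Angstrom/(mol*e^2), Coulomb constant in these units
R_KCAL = 1.987e-3      # gas constant in kcal/(mol*K)


def pk_shift(neighbors, dielectric=45.0, temp_k=293.15):
    """Estimate the pK shift of a titratable site caused by nearby charges.

    neighbors: iterable of (charge_in_e, distance_in_angstrom) pairs.
    Binding a proton adds +1 at the site, so nearby positive charges make
    protonation less favorable and lower the pK (negative shift).
    """
    energy = sum(COULOMB_KCAL * charge / (dielectric * distance)
                 for charge, distance in neighbors)
    return -energy / (math.log(10) * R_KCAL * temp_k)


if __name__ == "__main__":
    distance = 10.0  # hypothetical separation in Angstrom
    before = pk_shift([(-1.0, distance)])  # carboxylate neighbor (wild type)
    after = pk_shift([(+1.0, distance)])   # lysine neighbor (after substitution)
    print("pK change from charge reversal:", round(after - before, 2))
```

The sign convention reproduces the trend in the abstract: replacing negative neighbors with positive ones lowers the pK of the remaining titratable groups.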
Comparative Structural Analysis of Human DEAD-Box RNA Helicases
DEAD-box RNA helicases play various, often critical, roles in all processes where RNAs are involved. Members of this family of proteins are linked to human disease, including cancer and viral infections. DEAD-box proteins contain two conserved domains that both contribute to RNA and ATP binding. Despite recent advances, the molecular details of how these enzymes convert chemical energy into RNA remodeling are unknown. We present crystal structures of the isolated DEAD-domains of human DDX2A/eIF4A1, DDX2B/eIF4A2, DDX5, DDX10/DBP4, DDX18/myc-regulated DEAD-box protein, DDX20, DDX47, DDX52/ROK1, and DDX53/CAGE, and of the helicase domains of DDX25 and DDX41. Together with prior knowledge, this enables a family-wide comparative structural analysis. We propose a general mechanism for opening of the RNA binding site. This analysis also provides insights into the diversity of DExD/H proteins, with implications for understanding the functions of individual family members.
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed.
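One of the baseline methods named above, Term Frequency-Inverse Document Frequency, can be illustrated with a short sketch. This is a generic textbook variant (whitespace tokenization, smoothed inverse document frequency, cosine similarity) and not the implementation used by the consortium, whose details the abstract does not give. The function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of text documents."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        if not tokens:
            vectors.append({})
            continue
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF keeps weights positive for terms found everywhere.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    titles = [
        "protein folding stability",
        "protein folding kinetics",
        "genome sequence assembly",
    ]
    seed, *candidates = tfidf_vectors(titles)
    for title, vector in zip(titles[1:], candidates):
        print(title, round(cosine_similarity(seed, vector), 3))
```

Ranking candidate articles by this score against a seed article gives a simple title-based recommender of the kind the benchmark is designed to evaluate.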
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.