1,202 research outputs found

    Increasing peptide identifications and decreasing search times for ETD spectra by pre-processing and calculation of parent precursor charge

    Abstract

    Background: Electron Transfer Dissociation [ETD] can dissociate multiply charged precursor polypeptides, providing extensive peptide backbone cleavage. ETD spectra contain charge-reduced precursor peaks, usually of high intensity, whose pattern depends on the parent precursor charge. These charge-reduced precursor peaks and their associated neutral loss peaks should be removed before the spectra are searched for peptide identifications. ETD spectra can also contain ion types other than c and z˙. Modifying search strategies to accommodate these ion types may increase peptide identifications. Additionally, if the precursor mass is measured using a lower-resolution instrument such as a linear ion trap, the charge of the precursor is often not known, which reduces sensitivity and increases search times. We implemented algorithms to remove these precursor peaks, to accommodate new ion types in the noise filtering routine in OMSSA, and to estimate any unknown precursor charge using Linear Discriminant Analysis [LDA].

    Results: Spectral pre-processing to remove precursor peaks and their associated neutral losses prior to protein sequence library searches resulted in a 9.8% increase in peptide identifications at a 1% False Discovery Rate [FDR] compared to the previous OMSSA filter. Modifications to the OMSSA noise filter to accommodate various ion types resulted in a further 4.2% increase in peptide identifications at 1% FDR. Moreover, searching ETD spectra with charge states obtained from the precursor charge determination algorithm is shown to be up to 3.5 times faster than the general range search method, with a minor 3.8% increase in sensitivity.

    Conclusion: Overall, there is an 18.8% increase in peptide identifications at 1% FDR from incorporating the new precursor filter and noise filter and using the charge determination algorithm, when compared to previous versions of OMSSA.
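    The precursor-removal step described in this abstract can be illustrated with a minimal sketch. It is not the OMSSA implementation: the function names, the m/z tolerance `tol`, and the neutral-loss window `max_loss` are hypothetical values chosen for illustration. It assumes that a precursor of charge z that captures electrons without dissociating keeps essentially the same mass, so its charge-reduced peaks appear near (z × precursor m/z) / c for each lower charge c.

```python
def charge_reduced_targets(precursor_mz, charge):
    """Return (m/z, charge) pairs for the precursor and its charge-reduced forms.

    Assumption: electron capture changes the charge but not (appreciably) the mass,
    so the ion mass stays close to precursor_mz * charge.
    """
    total_mass = precursor_mz * charge
    return [(total_mass / c, c) for c in range(charge, 0, -1)]


def filter_precursor_peaks(peaks, precursor_mz, charge, tol=2.0, max_loss=60.0):
    """Drop peaks near a (charge-reduced) precursor or within a neutral-loss window below it.

    peaks     -- list of (m/z, intensity) tuples
    tol       -- hypothetical m/z tolerance around each target
    max_loss  -- hypothetical largest neutral-loss mass (Da) to strip
    """
    targets = charge_reduced_targets(precursor_mz, charge)

    def is_precursor_related(mz):
        # A neutral loss of mass m shifts a charge-c peak down by m / c in m/z.
        return any(t - max_loss / c - tol <= mz <= t + tol for t, c in targets)

    return [(mz, inten) for mz, inten in peaks if not is_precursor_related(mz)]
```

    For a triply charged precursor at m/z 500, this sketch strips windows ending near m/z 500, 750, and 1500 and keeps fragment peaks elsewhere. When the charge is unknown, the same filter would have to be applied once per candidate charge, which is one reason an upfront charge estimate is useful.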

    MMDB: annotating protein sequences with Entrez's 3D-structure database

    Three-dimensional (3D) structure is now known for a large fraction of all protein families. Thus, it has become rather likely that one will find a homolog with known 3D structure when searching a sequence database with an arbitrary query sequence. Depending on the extent of similarity, such neighbor relationships may allow one to infer biological function and to identify functional sites such as binding motifs or catalytic centers. Entrez's 3D-structure database, the Molecular Modeling Database (MMDB), provides easy access to the richness of 3D structure data and its large potential for functional annotation. Entrez's search engine offers several tools to assist biologist users: (i) links between databases, such as between protein sequences and structures, (ii) pre-computed sequence and structure neighbors, (iii) visualization of structure and sequence/structure alignment. Here, we describe an annotation service that combines some of these tools automatically, Entrez's ‘Related Structure’ links. For all proteins in Entrez, similar sequences with known 3D structure are detected by BLAST and alignments are recorded. The ‘Related Structure’ service summarizes this information and presents 3D views mapping sequence residues onto all 3D structures available in MMDB.

    Automated annotation of chemical names in the literature with tunable accuracy

    Abstract

    Background: A significant portion of the biomedical and chemical literature refers to small molecules. The accurate identification and annotation of compound names that are relevant to the topic of the given literature can establish links between scientific publications and various chemical and life science databases. Manual annotation is the preferred method for these works because well-trained indexers can understand the paper topics as well as recognize key terms. However, considering the hundreds of thousands of new papers published annually, an automatic annotation system with high precision and relevance can be a useful complement to manual annotation.

    Results: An automated chemical name annotation system, MeSH Automated Annotations (MAA), was developed to annotate small molecule names in scientific abstracts with tunable accuracy. This system aims to reproduce the MeSH term annotations on biomedical and chemical literature that would be created by indexers. When automated free-text matching was compared with manual indexing on 26 thousand MEDLINE abstracts, more than 40% of the annotations were false-positive (FP) cases. To reduce the FP rate, MAA incorporated several filters to remove "incorrect" annotations caused by nonspecific, partial, and low-relevance chemical names. In part, relevance was measured by the position of the chemical name in the text. Tunable accuracy was obtained by adding or restricting the sections of the text scanned for chemical names. The best precision obtained was 96% with a 28% recall rate. The best performance of MAA, as measured with the F statistic, was 66%, which compares favorably to other chemical name annotation systems.

    Conclusions: Accurate chemical name annotation can help researchers not only identify important chemical names in abstracts, but also match unindexed and unstructured abstracts to chemical records. The current work is tested against MEDLINE, but the algorithm is not specific to this corpus, and it may also be applicable to papers from chemical physics, materials, polymer and environmental science, as well as patents, biological assay descriptions and other textual data.
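    The precision, recall, and F figures quoted in this abstract follow the usual definitions, which a short sketch can make concrete. The function name and the counts in the usage note are hypothetical, and the abstract does not say which F variant was used; the balanced F1 (beta = 1) is assumed here.

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Compute precision, recall and the F-beta score from annotation counts.

    tp -- automated annotations that match the manual index (true positives)
    fp -- automated annotations absent from the manual index (false positives)
    fn -- manual annotations the system missed (false negatives)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score
```

    With illustrative counts of 168 true positives, 7 false positives, and 432 false negatives, the function returns a precision of 0.96 and a recall of 0.28, and the balanced F value is about 0.43. Under the F1 assumption, this suggests that the reported best F of 66% was obtained at a different, less restrictive filter setting than the one giving the best precision.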

    CDD: a Conserved Domain Database for protein classification

    The Conserved Domain Database (CDD) is the protein classification component of NCBI's Entrez query and retrieval system. CDD is linked to other Entrez databases such as Proteins, Taxonomy and PubMed®, and can be accessed at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. CD-Search, which is available at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, is a fast, interactive tool to identify conserved domains in new protein sequences. CD-Search results for protein sequences in Entrez are pre-computed to provide links between proteins and domain models, and computational annotation visible upon request. Protein–protein queries submitted to NCBI's BLAST search service at http://www.ncbi.nlm.nih.gov/BLAST are scanned for the presence of conserved domains by default. While CDD started out as essentially a mirror of publicly available domain alignment collections, such as SMART, Pfam and COG, we have continued an effort to update, and in some cases replace, these models with domain hierarchies curated at the NCBI. Here, we report on the progress of the curation effort and associated improvements in the functionality of the CDD information retrieval system.

    Observation of Hadronic W Decays in t-tbar Events with the Collider Detector at Fermilab

    We observe hadronic W decays in t-tbar -> W (-> l nu) + >= 4 jet events using a 109 pb^{-1} data sample of p-pbar collisions at sqrt{s} = 1.8 TeV collected with the Collider Detector at Fermilab (CDF). A peak in the dijet invariant mass distribution is obtained that is consistent with W decay and inconsistent with the background prediction by 3.3 standard deviations. From this peak we measure the W mass to be 77.2 +- 4.6 (stat+syst) GeV/c^2. This result demonstrates the presence of two W bosons in t-tbar candidates in the W (-> l nu) + >= 4 jet channel. Comment: 20 pages, 4 figures, submitted to PR
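    The dijet invariant mass used in this abstract is the standard relativistic quantity built from the summed four-momenta of two jets. The sketch below is a generic illustration, not the CDF reconstruction: the function name and the tuple layout (E, px, py, pz) in GeV with c = 1 are assumptions.

```python
import math


def dijet_invariant_mass(jet1, jet2):
    """Invariant mass of two jets given as (E, px, py, pz) in GeV (units with c = 1).

    m^2 = (E1 + E2)^2 - |p1 + p2|^2
    """
    energy = jet1[0] + jet2[0]
    px = jet1[1] + jet2[1]
    py = jet1[2] + jet2[2]
    pz = jet1[3] + jet2[3]
    mass_squared = energy * energy - (px * px + py * py + pz * pz)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(mass_squared, 0.0))
```

    Two back-to-back massless jets of 40 GeV each give an invariant mass of 80 GeV, close to the W mass scale where the abstract reports a peak.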

    Measurement of the lepton charge asymmetry in W-boson decays produced in p-pbar collisions

    We describe a measurement of the charge asymmetry of leptons from W boson decays in the rapidity range 0 enu, munu events from 110+/-7 pb^{-1} of data collected by the CDF detector during 1992-95. The asymmetry data constrain the ratio of d and u quark momentum distributions in the proton over the x range of 0.006 to 0.34 at Q^2 \approx M_W^2. The asymmetry predictions that use parton distribution functions obtained from previously published CDF data in the central rapidity region (0.0<|y_l|<1.1) do not agree with the new data in the large rapidity region (|y_l|>1.1). Comment: 13 pages, 3 tables, 1 figure
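    The abstract does not spell out the asymmetry itself. The conventional definition of the lepton charge asymmetry as a function of lepton rapidity y_l is assumed here, with \sigma^{\pm} denoting the cross sections for positively and negatively charged leptons from W decay:

```latex
A(y_l) = \frac{d\sigma^{+}/dy_l - d\sigma^{-}/dy_l}
              {d\sigma^{+}/dy_l + d\sigma^{-}/dy_l}
```

    Because u quarks carry on average more of the proton momentum than d quarks, W+ bosons tend to be boosted along the proton direction and W- bosons along the antiproton direction, which is why this ratio is sensitive to the d/u momentum distribution ratio mentioned above.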