16 research outputs found

    A Rational Deconstruction of Landin's SECD Machine with the J Operator

    Landin's SECD machine was the first abstract machine for applicative expressions, i.e., functional programs. Landin's J operator was the first control operator for functional languages and was specified by an extension of the SECD machine. We present a family of evaluation functions corresponding to this extension of the SECD machine, using a series of elementary transformations (chiefly transformation into continuation-passing style (CPS) and defunctionalization) and their left inverses (transformation into direct style and refunctionalization). To this end, we modernize the SECD machine into a bisimilar one that operates in lockstep with the original but (1) does not use a data stack and (2) uses the caller-save rather than the callee-save convention for environments. We also identify that the dump component of the SECD machine is managed in a callee-save way. The caller-save counterpart of the modernized SECD machine precisely corresponds to Thielecke's double-barrelled continuations and to Felleisen's encoding of J in terms of call/cc. We then variously characterize the J operator in terms of CPS and in terms of delimited-control operators in the CPS hierarchy. As a byproduct, we also present several reduction semantics for applicative expressions with the J operator, based on Curien's original calculus of explicit substitutions. These reduction semantics mechanically correspond to the modernized versions of the SECD machine and, to the best of our knowledge, provide the first syntactic theories of applicative expressions with the J operator.
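The two core transformations the abstract names can be illustrated in miniature. The sketch below is ours, not the paper's, and all names are illustrative: it applies CPS transformation and then defunctionalization to factorial, ending with the first-order apply function whose data-structure continuations are the shape of an abstract machine's control component.

```python
# A minimal sketch (ours, not the paper's) of CPS transformation and
# defunctionalization, applied to factorial.

# 1. Direct style.
def fac(n):
    return 1 if n == 0 else n * fac(n - 1)

# 2. Continuation-passing style (CPS): every call takes an explicit
#    continuation k and returns by calling it.
def fac_cps(n, k):
    if n == 0:
        return k(1)
    return fac_cps(n - 1, lambda r: k(n * r))

# 3. Defunctionalization: each lambda becomes a data constructor, and a
#    first-order apply function interprets them -- continuations are now
#    inspectable data, as in an abstract machine.
def fac_defun(n, k):
    if n == 0:
        return apply_cont(k, 1)
    return fac_defun(n - 1, ("MUL", n, k))

def apply_cont(k, r):
    if k == "HALT":
        return r
    _tag, n, outer = k  # k == ("MUL", n, outer)
    return apply_cont(outer, n * r)
```

All three agree: `fac(5)`, `fac_cps(5, lambda r: r)`, and `fac_defun(5, "HALT")` each yield 120. The left inverses the abstract mentions (refunctionalization and the direct-style transformation) run these steps in reverse.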

    Identification and Quantification of Proteoforms by Mass Spectrometry

    A proteoform is a defined form of a protein derived from a given gene with a specific amino acid sequence and localized post-translational modifications. In top-down proteomic analyses, proteoforms are identified and quantified through mass spectrometric analysis of intact proteins. Recent technological developments have enabled comprehensive proteoform analyses in complex samples, and an increasing number of laboratories are adopting top-down proteomic workflows. In this review, we outline some recent advances and discuss current challenges and future directions for the field.

    Enhanced protein isoform characterization through long-read proteogenomics

    [Background] The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts that can be used to predict full-length protein isoforms. [Results] We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm that directly incorporates long-read transcriptome data, enabling detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis. [Conclusions] Our work suggests that incorporating long-read sequencing and proteomic data can improve characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research. This work was supported by National Institutes of Health (NIH) grant R35GM142647 (G.M.S.), NIH grant R35GM126914 (L.M.S.), and the Jackson Laboratory (A.D.M.). The codeathon that initiated the project was supported by the NIH STRIDES Initiative. Peer reviewed.

    Application of the 187Re-187Os geochronometer to crustal materials: Systematics, methodology, data reporting, and interpretation

    The rhenium-osmium (187Re-187Os) system is a highly versatile chronometer that is regularly applied to a wide range of geological and extraterrestrial materials. In addition to providing geo- or cosmo-chronological information, the Re-Os system can also be used as a tracer of processes across a range of temporal (millennial to gigayear) and spatial (lower mantle to cryosphere) scales. An increasing number of sulfide minerals are now routinely dated, which further expands the ability of this system to refine mineral exploration models as society moves toward a new, green economy with related technological needs. An expanding range of natural materials amenable to Re-Os geochronology brings additional complexities in data interpretation and in the translation of measured isotopic ratios to a properly contextualized age. Herein, we provide an overview of the 187Re-187Os system as applied to sedimentary rocks, sulfides, and other crustal materials, and highlight further innovations on the horizon. Additionally, we outline next steps and best practices required to improve the precision of the chronometer and establish community-wide data-reduction procedures, including the choice of decay constant, regression technique, and software packages. These best practices will expand the utility and viability of published results and essential metadata, ensuring that such data conform to evolving standards of being findable, accessible, interoperable, and reusable (FAIR).
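As a hedged illustration of the chronometry described above: a Re-Os isochron's slope equals e^(λt) − 1, so an age follows directly from the slope and the 187Re decay constant. The sketch below is ours, not the paper's, and assumes the commonly used Smoliar et al. (1996) decay-constant value; as the abstract notes, the choice of decay constant is itself a community best-practice question.

```python
import math

# Hedged sketch (not from the paper): converting a 187Re-187Os isochron slope
# to an age. LAMBDA_RE187 is the Smoliar et al. (1996) value commonly used in
# the community; the choice of decay constant is itself a best-practice issue.
LAMBDA_RE187 = 1.666e-11  # decay constant of 187Re, in yr^-1 (assumed value)

def isochron_age_yr(slope):
    """Isochron relation: slope = exp(lambda * t) - 1,
    so t = ln(1 + slope) / lambda."""
    return math.log1p(slope) / LAMBDA_RE187
```

For example, a slope near 0.0425 corresponds to an age of roughly 2.5 Gyr.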

    Ultrafast Peptide Label-Free Quantification with FlashLFQ

    The rapid and accurate quantification of peptides is a critical element of modern proteomics that has become increasingly challenging as proteomic data sets grow in size and complexity. We present here FlashLFQ, a computer program for high-speed label-free quantification of peptides following a search of bottom-up mass spectrometry data. FlashLFQ is approximately an order of magnitude faster than established label-free quantification methods. The increased speed makes it practical to base quantification upon all of the charge states for a given peptide rather than solely upon the charge state that was selected for MS2 fragmentation. This increases the number of quantified peptides, improves replicate-to-replicate reproducibility, and increases quantitative accuracy. We integrated FlashLFQ into the graphical user interface of the MetaMorpheus search software, allowing it to work together with the global post-translational modification discovery (G-PTM-D) engine to accurately quantify modified peptides. FlashLFQ is also available as a NuGet package, facilitating its integration into other software, and as a standalone command-line program for the quantification of search results from other programs (e.g., MaxQuant).
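The feature the abstract highlights, quantifying a peptide from all of its charge states rather than only the one selected for MS2 fragmentation, can be sketched as follows. This is a minimal illustration of the idea, not FlashLFQ's actual implementation.

```python
# Minimal sketch (ours, not FlashLFQ's implementation) of quantifying a
# peptide from ALL of its observed precursor (MS1) charge states.
from collections import defaultdict

def quantify(peaks):
    """peaks: iterable of (peptide, charge, ms1_intensity) tuples.
    Returns peptide -> total intensity summed over every observed
    charge state, not just the charge state chosen for fragmentation."""
    totals = defaultdict(float)
    for peptide, _charge, intensity in peaks:
        totals[peptide] += intensity
    return dict(totals)

# Hypothetical peaks: PEPTIDEK is quantified from both its 2+ and 3+ states.
peaks = [("PEPTIDEK", 2, 1.2e6), ("PEPTIDEK", 3, 4.0e5), ("OTHERPEPR", 2, 8.0e5)]
```

Summing over charge states uses more of the available signal per peptide, which is one reason the abstract reports more quantified peptides and better reproducibility.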

    A Hybrid Spectral Library and Protein Sequence Database Search Strategy for Bottom-Up and Top-Down Proteomic Data Analysis

    Tandem mass spectrometry (MS/MS) is widely employed for the analysis of complex proteomic samples. While protein sequence database searching and spectral library searching are both well-established peptide identification methods, each has shortcomings. Protein sequence databases lack fragment peak intensity information, which can result in poor discrimination between correct and incorrect spectrum assignments. Spectral libraries usually contain fewer peptides than protein sequence databases, which limits the number of peptides that can be identified. Notably, few post-translationally modified peptides are represented in spectral libraries, because few search engines can both identify a broad spectrum of post-translational modifications (PTMs) and create corresponding spectral libraries. Also, programs that generate spectral libraries using deep learning approaches are not yet able to accurately predict spectra for the vast majority of PTMs. Here, we address these limitations through a hybrid search strategy that combines protein sequence database and spectral library searches to improve identification success rates and sensitivity. This software uses Global PTM Discovery (G-PTM-D) to produce spectral libraries for a wide variety of PTMs. These features, along with a new spectrum annotation and visualization tool, have been integrated into the freely available and open-source search engine MetaMorpheus.
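The fragment-intensity information that spectral libraries add can be seen in a minimal sketch of library matching (ours, not MetaMorpheus code): a query spectrum is scored against a library spectrum with a normalized dot product, a comparison that a sequence database lacking intensities cannot make.

```python
import math

# Minimal sketch (ours, not MetaMorpheus code) of intensity-aware spectral
# library scoring: a normalized dot product between two fragment spectra.
def normalized_dot(query, library):
    """query/library: dict mapping fragment m/z bin -> intensity.
    Returns 1.0 for identical spectra, 0.0 for spectra with no shared peaks."""
    keys = set(query) | set(library)
    q = [query.get(k, 0.0) for k in keys]
    l = [library.get(k, 0.0) for k in keys]
    dot = sum(a * b for a, b in zip(q, l))
    nq = math.sqrt(sum(a * a for a in q))
    nl = math.sqrt(sum(b * b for b in l))
    return dot / (nq * nl) if nq and nl else 0.0
```

A hybrid strategy in the abstract's sense falls back on database matching (peak positions only) for peptides absent from the library, while exploiting intensity-aware scores like this one where library spectra exist.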

    Serial KinderMiner (SKiM) discovers and annotates biomedical knowledge using co-occurrence and transformer models

    [Background] The PubMed archive contains more than 34 million articles; consequently, it is becoming increasingly difficult for a biomedical researcher to keep up to date with different knowledge domains. Computationally efficient and interpretable tools are needed to help researchers find and understand associations between biomedical concepts. The goal of literature-based discovery (LBD) is to connect concepts in isolated literature domains that would normally go undiscovered. This usually takes the form of an A–B–C relationship, where A and C terms are linked through a B-term intermediate. Here we describe Serial KinderMiner (SKiM), an LBD algorithm for finding statistically significant links between an A term and one or more C terms through one or more B-term intermediates. The development of SKiM is motivated by the observation that only a few LBD tools provide a functional web interface, and that the available tools are limited in one or more of the following ways: (1) they identify a relationship but not its type, (2) they do not allow users to provide their own lists of B or C terms, hindering flexibility, (3) they do not allow querying thousands of C terms (crucial if, for instance, the user wants to query connections between a disease and the thousands of available drugs), or (4) they are specific to a particular biomedical domain (such as cancer). We provide an open-source tool and web interface that improves on all of these issues. [Results] We demonstrate SKiM's ability to discover useful A–B–C linkages in three control experiments: classic LBD discoveries, drug repurposing, and finding associations related to cancer. Furthermore, we supplement SKiM with a knowledge graph built with transformer machine-learning models to aid in interpreting the relationships between terms found by SKiM. Finally, we provide a simple and intuitive open-source web interface (https://skim.morgridge.org) with comprehensive lists of drugs, diseases, phenotypes, and symptoms so that anyone can easily perform SKiM searches. [Conclusions] SKiM is a simple algorithm that can perform LBD searches to discover relationships between arbitrary user-defined concepts. SKiM is generalized for any domain, can perform searches with many thousands of C-term concepts, and moves beyond simple identification of the existence of a relationship; many relationships are given relationship-type labels from our knowledge graph.
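The A–B–C pattern the abstract describes can be sketched in a few lines. This is a minimal illustration under our own simplified co-occurrence model, not the SKiM implementation, which additionally tests each link for statistical significance.

```python
# Minimal sketch of A-B-C literature-based discovery (ours, not the SKiM
# implementation): a C term is linked to the A term only through B terms
# that co-occur with each of them in the literature.
def abc_links(a_cooccur, c_cooccur):
    """a_cooccur: set of B terms that co-occur with the A term in a corpus.
    c_cooccur: dict mapping each candidate C term to its co-occurring B terms.
    Returns each linked C term with the B intermediates connecting it to A."""
    return {c: a_cooccur & bs
            for c, bs in c_cooccur.items()
            if a_cooccur & bs}

# Swanson's classic discovery: A = fish oil, C = Raynaud's disease,
# linked via the B term "blood viscosity". Term lists are illustrative.
a_bs = {"blood viscosity", "platelet aggregation"}
c_bs = {"Raynaud's disease": {"blood viscosity", "vasoconstriction"},
        "unrelated disease": {"gene X"}}
```

Here `abc_links(a_bs, c_bs)` surfaces Raynaud's disease via the shared intermediate, while the unlinked C term is dropped; SKiM layers significance testing and knowledge-graph relationship labels on top of this basic pattern.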

    Precursor intensity-based label-free quantification software tools for proteomic and multi-omic analysis within the galaxy platform

    For mass spectrometry-based peptide and protein quantification, label-free quantification (LFQ) based on precursor mass peak (MS1) intensities is considered reliable due to its dynamic range, reproducibility, and accuracy. LFQ enables peptide-level quantitation, which is useful in proteomics (analyzing peptides carrying post-translational modifications) and multi-omics studies such as metaproteomics (analyzing taxon-specific microbial peptides) and proteogenomics (analyzing non-canonical sequences). Bioinformatics workflows accessible via the Galaxy platform have proven useful for analysis of such complex multi-omic studies. However, workflows within the Galaxy platform have lacked well-tested LFQ tools. In this study, we have evaluated moFF and FlashLFQ, two open-source LFQ tools, and implemented them within the Galaxy platform to offer access and use via established workflows. Through rigorous testing and communication with the tool developers, we have optimized the performance of each tool. Software features evaluated include: (a) match-between-runs (MBR); (b) using multiple file formats as input for improved quantification; (c) use of containers and/or conda packages; (d) parameters needed for analyzing large datasets; and (e) optimization and validation of software performance. This work establishes a process for software implementation, optimization, and validation, and offers access to two robust software tools for LFQ-based analysis within the Galaxy platform.