
    Minimum message length inference of secondary structure from protein coordinate data

    Motivation: Secondary structure underpins the folding pattern and architecture of most proteins. Accurate assignment of secondary structure elements is therefore an important problem. Although many approximate solutions of the secondary structure assignment problem exist, the statement of the problem has resisted a consistent and mathematically rigorous definition. A variety of comparative studies have highlighted major disagreements in the way the available methods define and assign secondary structure to coordinate data.
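
    For readers unfamiliar with the framework named in the title, the sketch below states the standard two-part minimum message length (MML) criterion in textbook form; the paper's concrete encoding of secondary structure elements is not reproduced here.

```latex
% Standard two-part MML criterion (textbook form, not the paper's
% concrete encoding). H = a candidate secondary structure assignment,
% D = the protein coordinate data.
\[
  I(H, D) \;=\; I(H) \;+\; I(D \mid H)
         \;\approx\; -\log_2 P(H) \;-\; \log_2 P(D \mid H),
\]
% and the preferred assignment is the one minimizing this total message
% length, trading the complexity of the assignment against its fit to
% the coordinates.
```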

    Without magic bullets: the biological basis for public health interventions against protein folding disorders

    Protein folding disorders of aging, like Alzheimer's and Parkinson's diseases, currently present intractable medical challenges. 'Small molecule' interventions - drug treatments - often have, at best, palliative impact, failing to alter disease course. The design of individual or population level interventions will likely require a deeper understanding of protein folding and its regulation than currently provided by contemporary 'physics' or culture-bound medical magic bullet models. Here, a topological rate distortion analysis is applied to the problem of protein folding and regulation that is similar in spirit to Tlusty's (2010a) elegant exploration of the genetic code. The formalism produces large-scale, quasi-equilibrium 'resilience' states representing normal and pathological protein folding regulation under a cellular-level cognitive paradigm similar to that proposed by Atlan and Cohen (1998) for the immune system. Generalization to long times produces diffusion models of protein folding disorders in which epigenetic or life history factors determine the rate of onset of regulatory failure: in essence, a premature aging driven by familiar synergisms between disjunctions of resource allocation and need in the context of socially or physiologically toxic exposures and chronic powerlessness at individual and group scales. An HPA axis model is applied to recently observed differences in Alzheimer's onset rates in White and African American subpopulations as a function of an index of distress-proneness.
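
    For orientation, the sketch below states the classical Shannon rate-distortion function that a topological rate distortion analysis builds on, in standard notation rather than the paper's; the topological machinery itself is not reproduced.

```latex
% Classical Shannon rate-distortion function in standard notation (not
% the paper's topological generalization): the minimum rate at which a
% source X can be represented by \hat{X} while keeping the expected
% distortion d at or below a tolerance D is
\[
  R(D) \;=\;
  \min_{p(\hat{x} \mid x)\;:\;\mathbb{E}[d(X,\hat{X})] \le D}
  I(X;\hat{X}).
\]
% R(D) is convex and nonincreasing in D; its formal homology with free
% energy is what quasi-equilibrium arguments of this kind exploit.
```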

    Hunter-gatherers in a howling wilderness: Neoliberal capitalism as a language that speaks itself

    The 'self-referential' character of evolutionary process noted by Goldenfeld and Woese (2010) can be restated in the context of a generalized Darwinian theory applied to economic process through a 'language' model: The underlying inherited and learned culture of the firm, the short-time cognitive response of the firm to patterns of threat and opportunity that is sculpted by that culture, and the embedding socioeconomic environment, are represented as interacting information sources constrained by the asymptotic limit theorems of information theory. If unregulated, the larger, compound source that characterizes high probability evolutionary paths of this composite then becomes, literally, a self-dynamic language that speaks itself. Such a structure is, for those enmeshed in it, more akin to a primitive hunter-gatherer society at the mercy of internal ecological dynamics than to, say, a Neolithic agricultural community in which a highly ordered, deliberately adapted ecosystem is consciously farmed so as to match its productivity to human needs.

    Graph Signal Processing: Overview, Challenges and Applications

    Research in Graph Signal Processing (GSP) aims to develop tools for processing data defined on irregular graph domains. In this paper we first provide an overview of core ideas in GSP and their connection to conventional digital signal processing. We then summarize recent advances in the development of basic GSP tools, including methods for sampling, filtering, and graph learning. Next, we review progress in several application areas using GSP, including the processing and analysis of sensor network data and biological data, and applications to image processing and machine learning. We finish by providing a brief historical perspective to highlight how concepts recently developed in GSP build on top of prior research in other areas.
    Comment: To appear, Proceedings of the IEEE
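
    As a concrete taste of the tools such an overview covers, here is a minimal sketch of the graph Fourier transform and an ideal low-pass graph filter; the graph, the signal values, and the cutoff are illustrative assumptions, not examples from the paper.

```python
# Minimal GSP sketch: graph Fourier transform (GFT) via the Laplacian
# eigenbasis, plus an ideal low-pass graph filter. The graph, the
# signal, and the cutoff are assumed for illustration. Requires numpy.
import numpy as np

# Adjacency matrix of a small undirected graph (a 4-cycle).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

L = np.diag(A.sum(axis=1)) - A       # combinatorial graph Laplacian
lam, U = np.linalg.eigh(L)           # eigenvalues act as graph frequencies

x = np.array([1.0, 0.2, 0.9, 0.1])   # one signal value per node
x_hat = U.T @ x                      # GFT: expand x in the Laplacian basis

cutoff = lam[len(lam) // 2]          # keep the lower half of the spectrum
h = (lam <= cutoff).astype(float)    # ideal low-pass spectral response
x_smooth = U @ (h * x_hat)           # filter, then inverse GFT
print(x_smooth)
```

    Small Laplacian eigenvalues play the role of low frequencies: signals aligned with those eigenvectors vary slowly across edges, which is what the filter retains.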

    Lost in translation: Toward a formal model of multilevel, multiscale medicine

    For a broad spectrum of low-level cognitive regulatory and other biological phenomena, isolation from mutual signal crosstalk requires more metabolic free energy than permitting correlation. This allows an evolutionary exaptation leading to dynamic global broadcasts of interacting physiological processes at multiple scales. The argument is similar to the well-studied exaptation of noise to trigger stochastic resonance amplification in physiological subsystems. Not only is the living state characterized by cognition at every scale and level of organization, but by multiple, shifting, tunable, cooperative larger-scale broadcasts that link selected subsets of functional modules to address problems. This multilevel dynamical viewpoint has implications for initiatives in translational medicine that have followed the implosive collapse of pharmaceutical industry 'magic bullet' research. In short, failure to respond to the inherently multilevel, multiscale nature of human pathophysiology will doom translational medicine to a similar implosion.
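
    Since the argument leans on the well-studied exaptation of noise in stochastic resonance, the sketch below gives a minimal numerical illustration of that effect; the hard-threshold detector and all parameters are assumptions for illustration, not taken from the paper.

```python
# Minimal stochastic-resonance sketch: a subthreshold periodic signal
# produces no threshold crossings on its own, but an intermediate amount
# of added noise makes the detector's output track the signal, while too
# much noise drowns it again. All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(5000)
signal = 0.8 * np.sin(2 * np.pi * t / 100)  # peak 0.8 < threshold 1.0
threshold = 1.0

def detection_quality(noise_std):
    """Correlate the thresholded output with the hidden input signal."""
    noisy = signal + rng.normal(0.0, noise_std, size=t.shape)
    out = (noisy > threshold).astype(float)
    if out.std() == 0.0:
        return 0.0                           # no crossings: nothing seen
    return np.corrcoef(out, signal)[0, 1]

for sigma in (0.0, 0.3, 1.0, 3.0):
    print(f"noise std {sigma:.1f} -> output/signal correlation "
          f"{detection_quality(sigma):.3f}")
```

    The printed correlations trace the classic stochastic-resonance curve: zero without noise, highest at an intermediate noise level, and decaying as noise dominates.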

    Computational Methods for Protein Inference in Shotgun Proteomics Experiments

    Since the beginning of this millennium, the advent of high-throughput methods in numerous fields of the life sciences has led to a shift in paradigms. A broad variety of technologies emerged that allow comprehensive quantification of molecules involved in biological processes. Simultaneously, a major increase in data volume has been recorded with these techniques through enhanced instrumentation and other technical advances. By supplying computational methods that automatically process raw data to obtain biological information, the field of bioinformatics plays an increasingly important role in the analysis of the ever-growing mass of data. Computational mass spectrometry in particular is a bioinformatics field of research which provides means to gather, analyze and visualize data from high-throughput mass spectrometric experiments. For the study of the entirety of proteins in a cell or an environmental sample, even current techniques reach limitations that need to be circumvented by simplifying the samples subjected to the mass spectrometer. These pre-digested (so-called bottom-up) proteomics experiments then pose an even bigger computational burden during analysis, since complex ambiguities need to be resolved during protein inference, grouping and quantification. In this thesis, we present several developments in the pursuit of our goal to provide means for a fully automated analysis of complex and large-scale bottom-up proteomics experiments. Firstly, due to prohibitive computational complexities in state-of-the-art Bayesian protein inference techniques, a refined, more stable technique for performing inference on sums of random variables was developed to enable a variation of standard Bayesian inference for the problem. Building on the integration of this method into a co-developed library for Bayesian inference, an OpenMS tool for protein inference is presented. It performs efficient protein inference on a discrete Bayesian network using a loopy belief propagation algorithm and, despite the strictly probabilistic formulation of the problem, outperforms most established methods in computational efficiency. Its interface also offers unique input and output options, such as regularizing the number of proteins in a group, protein-specific priors, and recalibrated peptide posteriors. Finally, this work presents a complete, easy-to-use, yet scalable workflow for protein inference and quantification built around the new tool. The pipeline is implemented in nextflow and is part of a set of standardized, well-tested, and community-maintained workflows by the nf-core collective. Our workflow runs on large-scale data with complex experimental designs and allows a one-command analysis of local and publicly available data sets with state-of-the-art accuracy on various high-performance computing environments or the cloud.
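
    Per the thesis, the 'inference on sums of random variables' step is organized by so-called convolution trees: the probability mass function (PMF) of a sum of independent discrete variables is the convolution of the summands' PMFs, and a balanced tree of pairwise convolutions computes it in roughly O(k log^2 k) with FFT-based convolution instead of O(k^2) sequentially. The sketch below shows that basic sum-product operation with assumed Bernoulli 'peptide present' probabilities, using direct np.convolve for clarity; the thesis's contribution, an accurate and numerically stable max-product variant, is not reproduced here.

```python
# Minimal sketch of the operation a convolution tree organizes: the PMF
# of N = X1 + ... + Xk for independent discrete variables is obtained by
# pairwise-convolving the summands' PMFs in a balanced tree. Direct
# np.convolve is used here for clarity (FFT convolution gives the
# asymptotic speedup). Probabilities below are assumed for illustration.
import numpy as np

def convolve_pmfs(pmfs):
    """Pairwise-convolve PMFs in a balanced tree until one remains."""
    layer = [np.asarray(p, dtype=float) for p in pmfs]
    while len(layer) > 1:
        nxt = [np.convolve(layer[i], layer[i + 1])
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:          # odd PMF out: carry it to the next layer
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]

# Four independent Bernoulli "peptide present" indicators with assumed
# probabilities; the result is the PMF of how many are present.
pmfs = [np.array([1.0 - p, p]) for p in (0.9, 0.6, 0.3, 0.8)]
print(convolve_pmfs(pmfs))          # P(N = 0), ..., P(N = 4)
```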

    Proceedings of the Fifth Workshop on Information Theoretic Methods in Science and Engineering

    These are the online proceedings of the Fifth Workshop on Information Theoretic Methods in Science and Engineering (WITMSE), which was held in the Trippenhuis, Amsterdam, in August 2012.