8 research outputs found

    Predict, Refine, Synthesize: Self-Guiding Diffusion Models for Probabilistic Time Series Forecasting

    Diffusion models have achieved state-of-the-art performance in generative modeling tasks across various domains. Prior work on time series diffusion models has primarily focused on developing conditional models tailored to specific forecasting or imputation tasks. In this work, we explore the potential of task-agnostic, unconditional diffusion models for several time series applications. We propose TSDiff, an unconditionally trained diffusion model for time series. Our proposed self-guidance mechanism enables conditioning TSDiff for downstream tasks during inference, without requiring auxiliary networks or altering the training procedure. We demonstrate the effectiveness of our method on three different time series tasks: forecasting, refinement, and synthetic data generation. First, we show that TSDiff is competitive with several task-specific conditional forecasting methods (predict). Second, we leverage the learned implicit probability density of TSDiff to iteratively refine the predictions of base forecasters with reduced computational overhead over reverse diffusion (refine). Notably, the generative performance of the model remains intact: downstream forecasters trained on synthetic samples from TSDiff outperform forecasters that are trained on samples from other state-of-the-art generative time series models, occasionally even outperforming models trained on real data (synthesize).

    Neural forecasting: Introduction and literature overview

    Neural network based forecasting methods have become ubiquitous in large-scale industrial forecasting applications in recent years. As the prevalence of neural network based solutions among the best entries in the recent M4 competition shows, the popularity of neural forecasting methods is not limited to industry and has also reached academia. This article aims to provide an introduction to, and an overview of, some of the advances that have permitted the resurgence of neural networks in machine learning. Building on these foundations, the article then gives an overview of the recent literature on neural networks for forecasting and applications. Comment: 66 pages, 5 figures

    An integrated workflow for crosslinking mass spectrometry

    We present a concise workflow to enhance the mass spectrometric detection of crosslinked peptides by introducing sequential digestion and the crosslink identification software xiSEARCH. Sequential digestion enhances peptide detection by selective shortening of long tryptic peptides. We demonstrate our simple 12‐fraction protocol for crosslinked multi‐protein complexes and cell lysates, quantitative analysis, and high‐density crosslinking, without requiring specific crosslinker features. This overall approach reveals dynamic protein–protein interaction sites, which are accessible, have fundamental functional relevance, and are therefore ideally suited for the development of small molecule inhibitors.

    Nutzung neuer Informationsquellen für die Proteinstrukturvorhersage (Using new sources of information for protein structure prediction)

    Three-dimensional protein structures are an invaluable stepping stone towards the understanding of cellular processes. Computational protein structure prediction holds the promise of providing these structural models at low cost and effort. However, the major bottleneck towards effective protein structure prediction is the high dimensionality and vast size of the protein conformational space. These properties of the conformational space make it extremely difficult to locate the native structure through search. Information alleviates this issue by guiding search towards the native protein structure. Thus, information is invaluable in conformational space search. Not surprisingly, state-of-the-art structure prediction methods rely heavily on information. Unlocking novel sources of information should therefore further increase our ability to accurately predict protein structure. This thesis leverages three novel sources of information to advance protein structure prediction. First, we leverage physicochemical information that is encoded in energy functions and predicted structure models. Native contact networks form characteristic patterns in order to be energetically favorable. This thesis develops a network-based representation to capture these patterns and uses this representation to predict residue-residue contacts. The second source of information is experimental data from high-density cross-linking/mass spectrometry (CLMS) experiments. We integrate this information in an experimental/computational hybrid method for protein structure determination. The third information source is corroborating information. Corroborating information judges the likelihood of the co-occurrence of structural constraints. Nearly all methods provide these constraints in isolation, thereby neglecting any corroborating evidence between them. We develop a network-based analysis method to refine structure constraints with corroborating information.
We demonstrate the value of these information sources in extensive ab initio structure prediction experiments with a customized conformational space search algorithm and a novel structure prediction pipeline. This pipeline reached state-of-the-art contact and ab initio structure prediction performance in the 11th community-wide Critical Assessment of Protein Structure Prediction experiment (CASP11). Using our CLMS-based hybrid method, we reconstruct the domain structures of human serum albumin in solution and in its native environment, human blood serum. This represents a disruptive first step towards a mass spectrometry-driven, ab initio structure determination method that is able to probe protein structure where it really matters: in the proteins' natural environment, which is their very place of action.

    High-dimensional multivariate forecasting with low-rank Gaussian Copula processes

    Predicting the dependencies between observations from multiple time series is critical for applications such as anomaly detection, financial risk management, causal analysis, or demand forecasting. However, the computational and numerical difficulties of estimating time-varying and high-dimensional covariance matrices often limit existing methods to handling at most a few hundred dimensions or require making strong assumptions on the dependence between series. We propose to combine an RNN-based time series model with a Gaussian copula process output model with a low-rank covariance structure to reduce the computational complexity and handle non-Gaussian marginal distributions. This drastically reduces the number of parameters and consequently allows the modeling of time-varying correlations of thousands of time series. We show on several real-world datasets that our method provides significant accuracy improvements over state-of-the-art baselines and perform an ablation study analyzing the contributions of the different components of our model.
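
    The parameter saving mentioned in this abstract can be illustrated with a minimal sketch. It assumes a diagonal-plus-low-rank covariance of the form diag(d) + V Vᵀ, which is one common way to impose low-rank structure; the abstract itself does not spell out the exact parameterization, and the RNN and the copula's marginal transforms are omitted here. All function names below are hypothetical and chosen for illustration only.

```python
import numpy as np


def full_param_count(n_series: int) -> int:
    """Free parameters of an unrestricted symmetric n x n covariance matrix."""
    return n_series * (n_series + 1) // 2


def low_rank_param_count(n_series: int, rank: int) -> int:
    """Parameters of the assumed form diag(d) + V V^T: n diagonal entries plus n * rank factor entries."""
    return n_series + n_series * rank


def sample_low_rank_gaussian(diag, factor, n_samples, rng):
    """Draw samples from N(0, diag(d) + V V^T) without ever building the n x n matrix."""
    n, r = factor.shape
    eps_diag = rng.standard_normal((n_samples, n))  # independent per-series noise
    eps_rank = rng.standard_normal((n_samples, r))  # shared low-dimensional noise
    return eps_diag * np.sqrt(diag) + eps_rank @ factor.T


rng = np.random.default_rng(0)
n_series, rank = 2000, 10  # illustrative sizes, not taken from the paper
diag = rng.uniform(0.5, 1.5, size=n_series)
factor = rng.standard_normal((n_series, rank)) / np.sqrt(rank)

samples = sample_low_rank_gaussian(diag, factor, n_samples=5, rng=rng)
print(samples.shape)
print(full_param_count(n_series), low_rank_param_count(n_series, rank))
```

    Under this assumption the parameter count grows linearly in the number of series rather than quadratically, and sampling costs O(n · rank) per draw because the dense covariance matrix is never formed.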