7 research outputs found
Deep Generative Modeling for Financial Time Series with Application in VaR: A Comparative Review
In the financial services industry, forecasting the risk factor distribution
conditional on the history and the current market environment is the key to
market risk modeling in general and to value-at-risk (VaR) models in particular. As
one of the most widely adopted VaR models in commercial banks, historical
simulation (HS) uses the empirical distribution of daily returns in a
historical window as the forecast distribution of risk factor returns in the
next day. The objectives for financial time series generation are to generate
synthetic data paths with good variety, and similar distribution and dynamics
to the original historical data. In this paper, we apply multiple existing deep
generative methods (e.g., CGAN, CWGAN, Diffusion, and Signature WGAN) for
conditional time series generation, and propose and test two new methods for
conditional multi-step time series generation, namely Encoder-Decoder CGAN and
Conditional TimeVAE. Furthermore, we introduce a comprehensive framework with a
set of KPIs to measure the quality of the generated time series for financial
modeling. The KPIs cover distribution distance, autocorrelation and
backtesting. All models (HS, parametric and neural networks) are tested on both
historical USD yield curve data and additional data simulated from GARCH and
CIR processes. The study shows that the top-performing models are HS, GARCH, and
CWGAN. Future research directions in this area are also discussed.
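The historical simulation approach described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `historical_simulation_var`, the 250-day window, and the 99% confidence level are assumptions chosen for illustration, and the returns are synthetic.

```python
import numpy as np


def historical_simulation_var(returns, window=250, alpha=0.99):
    """One-day-ahead VaR from the empirical distribution of recent returns.

    `window` and `alpha` are illustrative defaults, not values from the paper.
    """
    returns = np.asarray(returns, dtype=float)
    if returns.size < window:
        raise ValueError("need at least `window` observations")
    # The forecast distribution is simply the last `window` observed returns.
    recent = returns[-window:]
    # Report VaR as a positive loss: the negated (1 - alpha) quantile.
    return -np.quantile(recent, 1.0 - alpha)


# Synthetic daily returns, used only to exercise the function.
rng = np.random.default_rng(0)
daily_returns = rng.normal(0.0, 0.01, size=1000)
var_99 = historical_simulation_var(daily_returns)
print(f"99% one-day HS VaR: {var_99:.4f}")
```

Because HS takes the forecast directly from the historical window, it has no parameters to fit beyond the window length and the confidence level.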
Computational Methods for Protein Structure Comparison and Analysis
Proteins are involved in almost all functions in a living cell, and functions of proteins are realized by their tertiary structures. Protein three-dimensional structures can be solved by multiple experimental methods, but computational approaches serve as an important complement to experimental methods for comparing and analyzing protein structures. Protein structure comparison allows the transfer of knowledge about known proteins to a novel protein and plays an important role in function prediction. Obtaining a global perspective of the variety and distribution of protein structures also lays a foundation for our understanding of the building principles of protein structures. This dissertation introduces our computational method to compare protein 3D structures and presents a novel mapping of protein shapes that represents the variety and the similarities of 3D shapes of proteins and their assemblies. The methods developed in this work can be applied to obtain new biological insights into protein atomic structures and electron density maps.
A global map of the protein shape universe.
Proteins are involved in almost all functions in a living cell, and functions of proteins are realized by their tertiary structures. Obtaining a global perspective of the variety and distribution of protein structures lays a foundation for our understanding of the building principles of protein structures. In light of the rapid accumulation of low-resolution structure data from electron tomography and cryo-electron microscopy, here we map and classify three-dimensional (3D) surface shapes of proteins into a similarity space. Surface shapes of proteins were represented with 3D Zernike descriptors, mathematical moment-based invariants, which have previously been demonstrated to be effective for biomolecular structure similarity search. In addition to single chains of proteins, we have also analyzed the shape space occupied by protein complexes. From the mapping, we have obtained various new insights into the relationship between shapes, main-chain folds, and complex formation. The unique view obtained from shape mapping opens up new ways to understand design principles, functions, and evolution of proteins.
Protein Shape Retrieval Contest
This track aimed to retrieve the evolutionary classification of proteins based on their surface meshes only. Given that proteins are dynamic, non-rigid objects and that evolution tends to conserve patterns related to their activity and function, this track poses a challenging problem using biologically relevant molecules. We evaluated the performance of 5 different algorithms and analyzed their ability, over a dataset of 5,298 objects, to retrieve various conformations of identical proteins and various conformations of ortholog proteins (proteins from different organisms and showing the same activity). All methods were able to retrieve a member of the same class as the query in at least 94% of the cases when considering the first match, but their results diverged more when more matches were considered. Finally, similarity metrics trained on databases dedicated to proteins improved the results.
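The first-match figure quoted in this abstract corresponds to what is commonly called a nearest-neighbor retrieval score. The abstract does not give the exact definition used in the contest, so the following is a sketch under that assumption; the function name `first_match_success_rate` and the toy distance matrix are hypothetical.

```python
import numpy as np


def first_match_success_rate(distances, labels):
    """Fraction of queries whose closest other object shares the query's class.

    `distances` is a square matrix of pairwise dissimilarities (assumed layout).
    """
    d = np.array(distances, dtype=float)  # copy, so the input is not modified
    labels = np.asarray(labels)
    # A query must not retrieve itself, so mask the diagonal.
    np.fill_diagonal(d, np.inf)
    nearest = d.argmin(axis=1)
    return float(np.mean(labels[nearest] == labels))


# Toy example: four objects in two classes (illustrative values only).
toy_distances = [
    [0, 1, 5, 6],
    [1, 0, 4, 5],
    [5, 4, 0, 9],
    [6, 5, 9, 0],
]
toy_labels = [0, 0, 1, 1]
print(first_match_success_rate(toy_distances, toy_labels))  # prints 0.5
```

In the toy data, the two class-0 objects retrieve each other, while both class-1 objects have a class-0 object as their closest match, which gives a rate of 0.5.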
Classification in cryo-electron tomograms
Prediction of homoprotein and heteroprotein complexes by protein docking and template-based modeling: A CASP-CAPRI experiment
We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, while nonetheless often providing helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy.
Prediction of homoprotein and heteroprotein complexes by protein docking and template-based modeling: A CASP-CAPRI experiment
We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, while nonetheless often providing helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy.
We are most grateful to the PDBe at the European Bioinformatics Institute in Hinxton, UK, for hosting the CAPRI website. Our deepest thanks go to all the structural biologists and to the following structural genomics initiatives: Northeast Structural Genomics Consortium, Joint Center for Structural Genomics, NatPro PSI:Biology, New York Structural Genomics Research Center, Midwest Center for Structural Genomics, Structural Genomics Consortium, for contributing the targets for this joint CASP-CAPRI experiment. MFL acknowledges support from the FRABio FR3688 Research Federation "Structural & Functional Biochemistry of Biomolecular Assemblies."