    Copulas in finance and insurance

    Copulas provide a potentially useful modeling tool to represent the dependence structure among variables and to generate joint distributions by combining given marginal distributions. Simulations play a relevant role in finance and insurance: they are used to replicate efficient frontiers or extremal values, to price options, to estimate joint risks, and so on. Using copulas, it is easy to construct and simulate from multivariate distributions based on almost any choice of marginals and any type of dependence structure. In this paper we outline recent contributions of statistical modeling using copulas in finance and insurance. We review issues related to the notion of copulas, copula families, copula-based dynamic and static dependence structure, copulas and latent factor models, and simulation of copulas. Finally, we outline hot topics in copulas with a special focus on model selection and goodness-of-fit testing.
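
The claim that copulas make it easy to simulate from a joint distribution with almost any marginals can be illustrated with a minimal sketch. It assumes a bivariate Gaussian copula and exponential marginals; the function names, the correlation value, and the rates are illustrative choices, not taken from the paper.

```python
import math
import random
from statistics import NormalDist


def exponential_inv_cdf(rate):
    """Return the inverse CDF (quantile function) of an exponential marginal."""
    return lambda u: -math.log1p(-u) / rate


def gaussian_copula_sample(n, rho, inv_cdf_x, inv_cdf_y, seed=0):
    """Draw n pairs whose dependence is a Gaussian copula with correlation rho.

    Step 1: draw correlated standard normals (z1, z2).
    Step 2: map them to uniforms with the normal CDF (this is the copula sample).
    Step 3: push the uniforms through the chosen marginal inverse CDFs.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    eps = 1e-12  # keep uniforms strictly inside (0, 1)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        u1 = min(max(std_normal.cdf(z1), eps), 1.0 - eps)
        u2 = min(max(std_normal.cdf(z2), eps), 1.0 - eps)
        pairs.append((inv_cdf_x(u1), inv_cdf_y(u2)))
    return pairs


if __name__ == "__main__":
    sample = gaussian_copula_sample(
        5000, 0.7, exponential_inv_cdf(2.0), exponential_inv_cdf(0.5)
    )
    mean_x = sum(x for x, _ in sample) / len(sample)
    mean_y = sum(y for _, y in sample) / len(sample)
    print(f"mean of X ~ {mean_x:.3f}, mean of Y ~ {mean_y:.3f}")
```

Swapping in a different marginal only requires a different inverse CDF; the dependence step is untouched, which is the separation the abstract describes.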

    Comparison of Statistical Testing and Predictive Analysis Methods for Feature Selection in Zero-inflated Microbiome Data

    Background: Recent advances in next-generation sequencing (NGS) technology enable researchers to collect a large volume of microbiome data. Microbiome data consist of operational taxonomic unit (OTU) count data characterized by zero-inflation, over-dispersion, and grouping structure among the samples. Currently, statistical testing methods based on generalized linear mixed effect models (GLMM) are commonly used to identify OTUs that are associated with a phenotype such as human diseases or plant traits. Statistical testing methods have a number of limitations, including these two: (1) the validity of the p-value/q-value depends sensitively on the correctness of the model, and (2) statistical significance does not necessarily imply predictivity. Statistical testing methods depend on model correctness and attempt to select "marginally relevant" features, not the most predictive ones. Predictive analysis using methods such as LASSO is an alternative approach for feature selection. To the best of our knowledge, this approach has not been used widely for analyzing microbiome data. Methodology: We use four synthetic datasets simulated from a zero-inflated negative binomial distribution and a real human gut microbiome dataset to compare the feature selection performance of LASSO with the likelihood ratio test methods applied to GLMMs. We also investigate the performance of cross-validation in estimating the out-of-sample predictivity of selected features in zero-inflated data. Results: Our studies with synthetic datasets show that the feature selection performance of LASSO is excellent in zero-inflated data and is comparable with the likelihood ratio test applied to the true data generating model. The feature selection performance of LASSO is better when the distributions of counts are more differentiated by the phenotype, which is a categorical variable in our synthetic datasets.
In addition, we performed leave-one-out cross-validation (LOOCV) on the training set and out-of-sample prediction on the test set. The cross-validatory (CV) predictive measures are very close to the out-of-sample predictivity measures. This indicates that LOOCV predictive metrics provide honest measures of the predictivity of the features selected by LASSO. Therefore, the CV predictive measures are good guidance for choosing cutoffs (shrinkage parameter λ) in selecting features with LASSO. By contrast, when wrong models are fitted to a dataset, the differences between the q-values and the actual false discovery rates are huge; hence, the q-values are highly misleading for selecting features. Our comparison of LASSO and statistical testing methods (the likelihood ratio test in our analysis) in the real dataset shows that small q-values do not necessarily imply high predictivity of the selected OTUs. However, researchers often use q-values to find predictors, which is why q-values need to be interpreted carefully. Conclusions: Statistical testing methods perform very well in zero-inflated datasets on both synthetic and real data. However, serious model checking should be conducted before we use q-values to choose features. Predictive analysis with LASSO is recommended to supplement q-values for selecting features and for measuring the predictivity of selected features.
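
To make the synthetic-data setup concrete, here is a minimal sketch of drawing counts from a zero-inflated negative binomial distribution. It assumes the usual gamma-Poisson mixture form of the negative binomial; the parameter names (`mu`, `dispersion`, `zero_prob`) and their values are hypothetical and are not the settings used in the study.

```python
import math
import random


def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def sample_zinb(rng, mu, dispersion, zero_prob):
    """One zero-inflated negative binomial draw.

    With probability zero_prob the count is a structural zero. Otherwise the
    count is Poisson with a gamma-distributed rate (mean mu, shape dispersion),
    which is a negative binomial with variance mu + mu**2 / dispersion.
    """
    if rng.random() < zero_prob:
        return 0
    rate = rng.gammavariate(dispersion, mu / dispersion)
    return sample_poisson(rng, rate)


def simulate_otu_counts(n, mu, dispersion, zero_prob, seed=0):
    """Simulate n counts for a single hypothetical OTU."""
    rng = random.Random(seed)
    return [sample_zinb(rng, mu, dispersion, zero_prob) for _ in range(n)]


if __name__ == "__main__":
    counts = simulate_otu_counts(20000, mu=5.0, dispersion=2.0, zero_prob=0.3)
    zero_fraction = sum(c == 0 for c in counts) / len(counts)
    print(f"fraction of zeros: {zero_fraction:.3f}")
    print(f"sample mean: {sum(counts) / len(counts):.3f}")
```

With these illustrative values the expected zero fraction is 0.3 + 0.7 · (2/7)² ≈ 0.357 and the expected mean is 0.7 · 5 = 3.5, so both zero-inflation and over-dispersion are visible in the output.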

    Modeling Dependencies in Finance using Copulae

    In this paper we provide a review of copula theory with applications to finance. We illustrate the idea in the bivariate framework and discuss the simple, elliptical and Archimedean classes of copulae. Since copulae model the dependency structure between random variables, we next explain the link between copulae and common dependency measures, such as Kendall's tau and Spearman's rho. In the next section the copulae are generalized to the multivariate case. In this general setup we discuss estimation and simulation techniques and provide an extensive literature review. A separate section is devoted to goodness-of-fit tests. We illustrate the importance of copulae in finance with the examples of asset allocation problems, Value-at-Risk and time series models. The paper is complemented with an extensive simulation study and an application to financial data. Keywords: Distribution functions, Dimension Reduction, Risk management, Statistical models
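
The link between a copula and Kendall's tau can be checked numerically. The sketch below assumes a Gaussian copula, for which the standard relation is tau = (2/π) · arcsin(ρ), and compares it with a sample estimate. Because Kendall's tau is invariant under increasing transformations of each variable, it is computed directly on the correlated normals rather than on the transformed copula sample. The function names and the value ρ = 0.6 are illustrative.

```python
import math
import random


def kendall_tau(pairs):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    n = len(pairs)
    concordant = discordant = 0
    for i in range(n):
        xi, yi = pairs[i]
        for j in range(i + 1, n):
            xj, yj = pairs[j]
            sign = (xi - xj) * (yi - yj)
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def correlated_normals(n, rho, seed=0):
    """Draw n pairs of standard normals with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((z1, z2))
    return pairs


def gaussian_copula_tau(rho):
    """Kendall's tau implied by a Gaussian copula with correlation rho."""
    return 2.0 / math.pi * math.asin(rho)


if __name__ == "__main__":
    rho = 0.6
    estimate = kendall_tau(correlated_normals(800, rho, seed=1))
    print(f"sample tau: {estimate:.3f}")
    print(f"theoretical tau: {gaussian_copula_tau(rho):.3f}")
```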

    Portable random number generators

    Computers are deterministic devices, and a computer-generated random number is a contradiction in terms. As a result, computer-generated pseudorandom numbers are fraught with peril for the unwary. We summarize much that is known about the most well-known pseudorandom number generators: congruential generators. We also provide machine-independent programs to implement the generators in any language that has 32-bit signed integers, for example C, C++, and FORTRAN. Based on an extensive search, we provide parameter values better than those previously available. Keywords: Programming (Mathematics); Computers
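
One well-known way to keep a congruential generator inside 32-bit signed integers is Schrage's decomposition. The sketch below uses the classic Park-Miller "minimal standard" constants (multiplier 16807, modulus 2^31 - 1) as a stand-in; the improved parameter values the abstract refers to are not given there and are not reproduced here. Python integers do not overflow, so the code only demonstrates that every intermediate product stays below 2^31.

```python
A = 16807          # Park-Miller multiplier (assumed stand-in, not the paper's values)
M = 2**31 - 1      # prime modulus
Q = M // A         # 127773
R = M % A          # 2836


def lcg_next(x):
    """Return (A * x) mod M using Schrage's method.

    Writing M = A * Q + R with R < Q gives
    A * x mod M = A * (x mod Q) - R * (x div Q), plus M if the result is negative.
    Both products fit in a 32-bit signed integer.
    """
    hi, lo = divmod(x, Q)
    t = A * lo - R * hi
    return t if t > 0 else t + M


def lcg_uniform(x):
    """Advance the state and return (new_state, uniform variate in (0, 1))."""
    x = lcg_next(x)
    return x, x / M


if __name__ == "__main__":
    state = 1
    for _ in range(10000):
        state = lcg_next(state)
    # Park and Miller's published check value for seed 1 after 10000 steps.
    print(state)  # 1043618065
```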

    Random number generation with multiple streams for sequential and parallel computing

    We provide a review of the state of the art on the design and implementation of random number generators (RNGs) for simulation, on both sequential and parallel computing environments. We focus on the need for multiple streams and substreams of random numbers, explain how they can be constructed and managed, review software libraries that offer them, and illustrate their usefulness via examples. We also review the basic quality criteria for good random number generators and their theoretical and empirical testing.
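
One common way to construct multiple streams is to give each stream a starting state far ahead in the sequence of a single generator. The sketch below shows this jump-ahead idea on a small multiplicative congruential generator, chosen only because its jump is a single modular exponentiation. The generator, the spacing, and the function names are illustrative assumptions; the libraries reviewed in the paper rely on generators with far longer periods.

```python
A = 16807        # multiplier of a small illustrative generator
M = 2**31 - 1    # prime modulus


def step(state):
    """Advance the generator by one step."""
    return (A * state) % M


def jump_ahead(state, k):
    """Advance the generator by k steps at once: state * A^k mod M."""
    return (pow(A, k, M) * state) % M


def make_streams(seed, n_streams, spacing):
    """Starting states of n_streams streams, spaced `spacing` steps apart."""
    return [jump_ahead(seed, i * spacing) for i in range(n_streams)]


def draw(state, count):
    """Return (new_state, list of `count` uniforms in (0, 1)) from one stream."""
    values = []
    for _ in range(count):
        state = step(state)
        values.append(state / M)
    return state, values


if __name__ == "__main__":
    starts = make_streams(seed=12345, n_streams=4, spacing=2**20)
    for index, start in enumerate(starts):
        _, values = draw(start, 3)
        print(index, [round(v, 4) for v in values])
```

As long as no stream consumes more than `spacing` numbers, the streams use disjoint segments of the underlying sequence, so each parallel worker or simulation component can be handed its own stream.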

    Efficient Color-Dressed Calculation of Virtual Corrections

    With the advent of generalized unitarity and parametric integration techniques, the construction of a generic Next-to-Leading Order Monte Carlo becomes feasible. Such a generator will entail the treatment of QCD color in the amplitudes. We extend the concept of color dressing to one-loop amplitudes, resulting in the formulation of an explicit algorithmic solution for the calculation of arbitrary scattering processes at Next-to-Leading Order. The resulting algorithm is of exponential complexity; that is, the numerical evaluation time of the virtual corrections grows by a constant multiplicative factor as the number of external partons is increased. To study the properties of the method, we calculate the virtual corrections to n-gluon scattering. Comment: 48 pages, 23 figures

    SAFE-NET: Secure and Fast Encryption using Network of Pseudo-Random Number Generators

    We propose a general framework for designing a broad class of random number generators suitable for both computer simulation and computer security applications. It can include the newly proposed generators SAFE (Secure And Fast Encryption) and ChaCha, a variant of Salsa, one of the four finalists of the eSTREAM ciphers. Two requirements for a cipher to be considered secure are that its output must be unpredictable and must have good distributional properties. The proposed SAFE-NET is a network of n nodes, with external pseudo-random number generators as input nodes and several inner layers of nodes that pass a sequence of random variates through ARX (Addition, Rotation, XOR) transformations to diffuse the components of the initial state vector. After several rounds of transformations (with complex inner connections), the output layer of n nodes is produced via additional transformations. By utilizing random number generators with desirable empirical properties, SAFE-NET injects randomness into the keystream generation process and constantly updates the cipher’s state with external pseudo-random numbers during each iteration. Through the integration of shuffle tables and advanced output functions, extra layers of security are provided, making it harder for attackers to exploit weaknesses in the cipher. Empirical results demonstrate that SAFE-NET requires fewer operations than ChaCha while still producing a sequence of uniformly distributed random numbers.
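
For readers unfamiliar with ARX transformations, the ChaCha quarter round is a compact, publicly specified example: it mixes four 32-bit words using only addition modulo 2^32, rotation, and XOR. The sketch below implements that quarter round as defined in RFC 8439. It is not the SAFE-NET node transformation, which the abstract does not specify.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a, b, c, d):
    """ChaCha quarter round: four add-xor-rotate steps on 32-bit words."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


if __name__ == "__main__":
    # Test vector from RFC 8439, section 2.1.1.
    result = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
    print([f"{word:08x}" for word in result])
    # ['ea2a92f4', 'cb1cf8ce', '4581472e', '5881c4bb']
```

Each step changes one word as a function of two others, so after a few rounds every output bit depends on every input word, which is the diffusion property the abstract attributes to the inner layers.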