
    Evaluating hedge fund performance: a stochastic dominance approach

    We introduce a general and flexible framework for hedge fund performance evaluation and asset allocation: stochastic dominance (SD) theory. Our approach utilizes statistical tests for stochastic dominance to compare the returns of hedge funds. We form hedge fund portfolios using SD criteria and examine the out-of-sample performance of these portfolios. Compared with portfolios of randomly selected hedge funds and mean-variance efficient hedge funds, our results show that a fund selection method based on SD criteria greatly improves the performance of the hedge fund portfolio.
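    A minimal sketch of what an SD-based comparison can look like in practice, using plain empirical CDFs rather than the formal statistical tests the paper applies; the fund names and return series below are synthetic placeholders, not the paper's data.

```python
import numpy as np

def dominates_fsd(returns_a, returns_b, grid_size=200):
    """Empirical first-order stochastic dominance of A over B.

    A dominates B (FSD) if F_A(x) <= F_B(x) everywhere. This is a plain
    sample-based check, not a formal statistical test.
    """
    lo = min(returns_a.min(), returns_b.min())
    hi = max(returns_a.max(), returns_b.max())
    grid = np.linspace(lo, hi, grid_size)
    cdf_a = np.searchsorted(np.sort(returns_a), grid, side="right") / len(returns_a)
    cdf_b = np.searchsorted(np.sort(returns_b), grid, side="right") / len(returns_b)
    return np.all(cdf_a <= cdf_b)

def dominates_ssd(returns_a, returns_b, grid_size=200):
    """Second-order SD: the integrated CDF of A never exceeds that of B."""
    lo = min(returns_a.min(), returns_b.min())
    hi = max(returns_a.max(), returns_b.max())
    grid = np.linspace(lo, hi, grid_size)
    step = grid[1] - grid[0]
    cdf_a = np.searchsorted(np.sort(returns_a), grid, side="right") / len(returns_a)
    cdf_b = np.searchsorted(np.sort(returns_b), grid, side="right") / len(returns_b)
    return np.all(np.cumsum(cdf_a) * step <= np.cumsum(cdf_b) * step + 1e-12)

# Example: rank candidate funds by how many peers they SSD-dominate.
rng = np.random.default_rng(0)
funds = {name: rng.normal(mu, 0.04, 120) for name, mu in
         [("fund_a", 0.010), ("fund_b", 0.006), ("fund_c", 0.002)]}
scores = {a: sum(dominates_ssd(ra, rb) for b, rb in funds.items() if b != a)
          for a, ra in funds.items()}
print(scores)
```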

    Uplift Modeling with Multiple Treatments and General Response Types

    Randomized experiments have been used to assist decision-making in many areas. They help people select the optimal treatment for the test population with certain statistical guarantees. However, subjects can show significant heterogeneity in response to treatments. The problem of customizing treatment assignment based on subject characteristics is known in the literature as uplift modeling, differential response analysis, or personalized treatment learning. A key feature of uplift modeling is that the data are unlabeled: it is impossible to know whether the chosen treatment is optimal for an individual subject because the response under alternative treatments is unobserved. This presents a challenge to both the training and the evaluation of uplift models. In this paper we describe how to obtain an unbiased estimate of the key performance metric of an uplift model, the expected response. We present a new uplift algorithm that builds a forest of randomized trees. The trees are grown with a splitting criterion designed to directly optimize their uplift performance under the proposed evaluation method. Both the evaluation method and the algorithm apply to an arbitrary number of treatments and general response types. Experimental results on synthetic data and industry-provided data show that our algorithm leads to significant performance improvements over other applicable methods.
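    A minimal sketch of one standard way to estimate the expected response of a personalized policy from randomized-experiment data, using inverse-probability weighting; this is consistent with the evaluation idea described above but is not necessarily the paper's exact estimator, and all data and names below are synthetic placeholders.

```python
import numpy as np

def expected_response(policy, X, treatment, response, treat_probs):
    """Unbiased (IPW-style) estimate of a policy's expected response
    from randomized-experiment data.

    policy      : function mapping a feature row to a treatment index
    X           : (n, d) feature matrix
    treatment   : (n,) treatment actually assigned in the experiment
    response    : (n,) observed response under the assigned treatment
    treat_probs : (k,) randomization probabilities for the k treatments

    Because assignment was randomized independently of X, weighting each
    subject whose assigned treatment matches the policy's recommendation
    by 1 / P(assigned treatment) gives an unbiased estimate of the mean
    response we would see if the policy were deployed.
    """
    recommended = np.array([policy(x) for x in X])
    match = (treatment == recommended)
    weights = 1.0 / treat_probs[treatment]
    return np.sum(response * match * weights) / len(response)

# Toy usage: two treatments assigned uniformly at random.
rng = np.random.default_rng(1)
n = 10_000
X = rng.normal(size=(n, 1))
treatment = rng.integers(0, 2, size=n)
# True effect: treatment 1 helps only when the feature is positive.
response = 1.0 + 0.5 * treatment * (X[:, 0] > 0) + rng.normal(0, 0.1, n)
policy = lambda x: int(x[0] > 0)          # personalized assignment rule
print(expected_response(policy, X, treatment, response, np.array([0.5, 0.5])))
```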

    Comparison between statistical and dynamical downscaling of rainfall over the Gwadar‐Ormara basin, Pakistan

    This paper evaluated and compared the performance of a statistical downscaling method and a dynamical downscaling method in simulating the spatial–temporal rainfall distribution. Outputs from the RegCM4 Regional Climate Model (RCM) and the CanESM2 Atmosphere–Ocean General Circulation Model (AOGCM) were selected for the data-scarce Gwadar‐Ormara basin, Pakistan. The evaluation was based on the climatological average and standard deviation for the historic (1971–2000) and future (2041–2070) periods under Representative Concentration Pathway (RCP) 4.5 and 8.5 scenarios. The performance evaluation showed that statistical downscaling is preferred for simulating and projecting rainfall patterns in the study area. Additionally, the Statistical DownScaling Model (SDSM) showed low R2 values in calibration and validation of the simulations against observed data for the historic period. Overall, SDSM generated satisfactory results in simulating the monthly rainfall cycle of the entire basin. In this study, RegCM4 showed large rainfall errors and missed one rainfall season in the historic period. This study also explored whether the grid-based rainfall time series of the Asian Precipitation—Highly Resolved Observational Daily Integration Towards Evaluation (APHRODITE) dataset could be used to enlarge and complement the sample of in situ observed rainfall time series. A spatial correlogram was used on the observed and APHRODITE rainfall data to assess the consistency between the two data sources, which resulted in rejecting the APHRODITE data. For the future period (2041–2070) under the RCP 4.5 and 8.5 scenarios, rainfall projections did not show significant differences between the two downscaling approaches. This may relate to the driving model (CanESM2 AOGCM) and does not necessarily suggest poor performance of either statistical or dynamical downscaling. Hence, the study recommends evaluating a multi-model ensemble including other GCMs and RCMs for the same study area.
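    A minimal sketch of a spatial correlogram of the kind used above to check the consistency of gauge observations with a gridded rainfall product; the station coordinates, rainfall series, and distance bins below are synthetic placeholders, not the study's data.

```python
import numpy as np

def spatial_correlogram(coords, series, bins):
    """Bin pairwise inter-station correlations of rainfall series by
    separation distance, returning one mean correlation per distance bin.

    coords : (n, 2) station coordinates (e.g. km easting/northing)
    series : (n, t) rainfall time series, one row per station
    bins   : monotonically increasing distance bin edges
    """
    n = len(coords)
    corr = np.corrcoef(series)                       # (n, n) correlation matrix
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    iu = np.triu_indices(n, k=1)                     # unique station pairs
    which = np.digitize(dist[iu], bins)
    return np.array([corr[iu][which == b].mean() if np.any(which == b) else np.nan
                     for b in range(1, len(bins))])

# Hypothetical usage: compare gauge and gridded-product correlograms.
rng = np.random.default_rng(2)
coords = rng.uniform(0, 200, size=(15, 2))           # 15 stations within 200 km
obs = rng.gamma(2.0, 5.0, size=(15, 360))            # 30 years of monthly rainfall
grid = obs + rng.normal(0, 8.0, size=obs.shape)      # stand-in gridded dataset
bins = np.arange(0, 250, 50)
print(spatial_correlogram(coords, obs, bins))
print(spatial_correlogram(coords, grid, bins))
```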

    Network tomography based on 1-D projections

    Network tomography has been regarded as one of the most promising methodologies for performance evaluation and diagnosis of the massive and decentralized Internet. This paper proposes a new estimation approach for solving a class of inverse problems in network tomography, based on marginal distributions of a sequence of one-dimensional linear projections of the observed data. We give a general identifiability result for the proposed method and study the design of these one-dimensional projections in terms of statistical efficiency. We show that for a simple Gaussian tomography model, there is an optimal set of one-dimensional projections such that the estimator obtained from these projections is asymptotically as efficient as the maximum likelihood estimator based on the joint distribution of the observed data. For practical applications, we carry out simulation studies of the proposed method for two instances of network tomography. The first is traffic demand tomography using a Gaussian Origin-Destination traffic model with a power relation between its mean and variance, and the second is network delay tomography in which link delays are to be estimated from end-to-end path delays. We compare the estimators obtained from our method with those obtained from the joint distribution and from other lower-dimensional projections, and show that in both cases the proposed method yields satisfactory results. Comment: Published at http://dx.doi.org/10.1214/074921707000000238 in the IMS Lecture Notes Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org).
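    A minimal sketch of the projection idea for traffic demand tomography, under simplifying assumptions not taken from the paper: a small fixed routing matrix, a power exponent of 1 in the mean-variance relation, and plain moment matching on the projections rather than the paper's estimator.

```python
import numpy as np
from scipy.optimize import least_squares

# OD flows x_j ~ N(lambda_j, phi * lambda_j); links observe y = A @ x.
# We recover lambda by matching the means and variances of 1-D projections w^T y.
rng = np.random.default_rng(3)
A = np.array([[1, 1, 0],      # routing matrix: which OD flows cross each link
              [0, 1, 1],
              [1, 0, 1]], dtype=float)
lam_true, phi, T = np.array([5.0, 2.0, 8.0]), 1.0, 2000

X = rng.normal(lam_true, np.sqrt(phi * lam_true), size=(T, 3))
Y = X @ A.T                                   # repeated link-load measurements

W = np.vstack([np.eye(3), np.ones((1, 3))])   # a few 1-D projection directions
proj = Y @ W.T
emp_mean, emp_var = proj.mean(axis=0), proj.var(axis=0)

def residuals(lam):
    model_mean = W @ A @ lam
    model_var = (W @ A) ** 2 @ (phi * lam)    # variance of each projection w^T y
    return np.concatenate([model_mean - emp_mean, model_var - emp_var])

fit = least_squares(residuals, x0=np.ones(3), bounds=(1e-6, np.inf))
print("estimated OD means:", fit.x, "true:", lam_true)
```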

    RealText-cs - Corpus based domain independent Content Selection model

    Content selection is a highly domain-dependent task responsible for retrieving relevant information from a knowledge source for a given communicative goal. This paper presents a domain-independent content selection model that uses keywords as the communicative goal. We employ the DBpedia triple store as our knowledge source, and triples are selected based on weights assigned to each triple. The weights are calculated using the log-likelihood distance between a domain corpus and a general reference corpus. The method was evaluated using keywords extracted from the QALD dataset, and its performance was compared with cross-entropy-based statistical content selection. The evaluation results showed that the proposed method performs 32% better than cross-entropy-based statistical content selection.
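    A minimal sketch of term weighting by log-likelihood distance between a domain corpus and a general reference corpus, assuming the common Dunning G2 formulation; the paper's exact formula, corpora, and triple-weighting step may differ, and the corpora below are tiny placeholders.

```python
import math
from collections import Counter

def log_likelihood_scores(domain_tokens, reference_tokens):
    """Dunning-style log-likelihood (G2) score per term, comparing its
    frequency in a domain corpus against a general reference corpus.
    Higher scores mark terms distinctive of the domain, which is one way
    weights could be assigned to triples mentioning those terms.
    """
    dom, ref = Counter(domain_tokens), Counter(reference_tokens)
    n_dom, n_ref = sum(dom.values()), sum(ref.values())
    scores = {}
    for term in dom:
        a, b = dom[term], ref.get(term, 0)
        e1 = n_dom * (a + b) / (n_dom + n_ref)   # expected counts under a
        e2 = n_ref * (a + b) / (n_dom + n_ref)   # single shared term rate
        g2 = 2 * (a * math.log(a / e1) if a else 0.0)
        g2 += 2 * (b * math.log(b / e2) if b else 0.0)
        scores[term] = g2
    return scores

domain = "hedge fund returns stochastic dominance portfolio returns fund".split()
reference = "the general reference corpus covers many everyday topics and words".split()
print(sorted(log_likelihood_scores(domain, reference).items(),
             key=lambda kv: -kv[1])[:5])
```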