Active Sampling for Large-scale Information Retrieval Evaluation
Evaluation is crucial in Information Retrieval. The development of models,
tools and methods has significantly benefited from the availability of reusable
test collections formed through a standardized and thoroughly tested
methodology, known as the Cranfield paradigm. Constructing these collections
requires obtaining relevance judgments for a pool of documents, retrieved by
systems participating in an evaluation task, and thus involves immense human labor.
To alleviate this effort different methods for constructing collections have
been proposed in the literature, falling under two broad categories: (a)
sampling, and (b) active selection of documents. The former devises a smart
sampling strategy by choosing only a subset of documents to be assessed and
inferring evaluation measures on the basis of the obtained sample; the sampling
distribution is fixed at the beginning of the process. The latter
recognizes that systems contributing documents to be judged vary in quality,
and actively selects documents from good systems. The quality of systems is
re-estimated each time a new document is judged. In this paper we seek to
solve the problem of large-scale retrieval evaluation combining the two
approaches. We devise an active sampling method that avoids the bias of the
active selection methods towards good systems, and at the same time reduces the
variance of the current sampling approaches by placing a distribution over
systems, which varies as judgments become available. We validate the proposed
method using TREC data and demonstrate its advantages over past approaches.
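To make the interplay concrete, here is a minimal Python sketch of an active sampling loop in this spirit. It is not the authors' estimator: the quality-update rule, the mixture distribution over systems, and the helper names are illustrative assumptions.

```python
import numpy as np

def active_sample_judgments(pools, judge, n_judgments, seed=0):
    """Hypothetical sketch of active sampling for IR evaluation.

    pools : dict system_name -> ranked list of doc ids (the judging pool)
    judge : callable doc_id -> 0/1 relevance (the human assessor)
    Returns the collected judgments together with the sampling
    probabilities used, so metrics can later be estimated with
    inverse-propensity weighting.
    """
    rng = np.random.default_rng(seed)
    systems = list(pools)
    quality = {s: 1.0 for s in systems}   # prior: all systems equal
    judged, probs = {}, {}
    for _ in range(n_judgments):
        # distribution over systems; it varies as judgments arrive
        w = np.array([quality[s] for s in systems])
        p_sys = w / w.sum()
        # marginal probability of each unjudged doc under the mixture
        doc_p = {}
        for s, ps in zip(systems, p_sys):
            unjudged = [d for d in pools[s] if d not in judged]
            for d in unjudged:
                doc_p[d] = doc_p.get(d, 0.0) + ps / len(unjudged)
        docs = list(doc_p)
        p = np.array([doc_p[d] for d in docs])
        d = rng.choice(docs, p=p / p.sum())
        judged[d], probs[d] = judge(d), doc_p[d]
        # reward systems that contributed a relevant document
        for s in systems:
            if d in pools[s]:
                quality[s] += judged[d]
    return judged, probs
```

Sampling from a system-level mixture, rather than deterministically picking from the best system, is what avoids the bias toward good systems, while the adaptive weights still reduce variance.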
Variance optimal hedging for continuous time additive processes and applications
For a large class of vanilla contingent claims, we establish an explicit
F\"ollmer-Schweizer decomposition when the underlying is an exponential of an
additive process. This allows to provide an efficient algorithm for solving the
mean variance hedging problem. Applications to models derived from the
electricity market are performed
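For reference, the Föllmer-Schweizer decomposition of a square-integrable claim H takes the standard form below; the notation is generic, not necessarily the paper's.

```latex
% Föllmer-Schweizer decomposition of a claim H with underlying price S:
% \xi is the hedging integrand and L a martingale orthogonal to the
% martingale part of S.
H = H_0 + \int_0^T \xi_t \, dS_t + L_T
```

The mean-variance hedging problem minimizes the expected squared hedging error $\mathbb{E}\big[(H - c - \int_0^T \theta_t \, dS_t)^2\big]$ over the initial capital $c$ and admissible strategies $\theta$, which is why an explicit decomposition translates into an efficient algorithm.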
A Scalable MCEM Estimator for Spatio-Temporal Autoregressive Models
Very large spatio-temporal lattice data are becoming increasingly common
across a variety of disciplines. However, estimating interdependence across
space and time in large areal datasets remains challenging, as existing
approaches are often (i) not scalable, (ii) designed for conditionally Gaussian
outcome data, or (iii) limited to cross-sectional and univariate outcomes.
This paper proposes an MCEM estimation strategy for a family of latent-Gaussian
multivariate spatio-temporal models that addresses these issues. The proposed
estimator is applicable to a wide range of non-Gaussian outcomes, and
implementations for binary and count outcomes are discussed explicitly. The
methodology is illustrated on simulated data, as well as on weekly data of
IS-related events in Syrian districts.
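As a rough illustration of the estimation strategy, a generic MCEM loop alternates Monte Carlo simulation of the latent Gaussian field with maximization of the simulated complete-data log-likelihood. The sketch below assumes user-supplied sampling and likelihood routines and is not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

def mcem(y, sample_latent, complete_loglik, theta0, n_iter=50, m_draws=100):
    """Generic Monte Carlo EM skeleton (illustrative only).

    sample_latent(y, theta, m) -> m draws of the latent Gaussian field
                                  (e.g. via MCMC); assumed supplied.
    complete_loglik(y, z, theta) -> complete-data log-likelihood.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        # E-step: simulate the latent field given current parameters
        draws = sample_latent(y, theta, m_draws)
        # M-step: maximize the Monte Carlo estimate of Q(theta)
        def neg_q(t):
            return -np.mean([complete_loglik(y, z, t) for z in draws])
        theta = minimize(neg_q, theta, method="Nelder-Mead").x
    return theta
```

Scalability in practice hinges on how cheaply `sample_latent` exploits the sparse spatio-temporal neighborhood structure; the skeleton only fixes the outer loop.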
Structural positions and risk budgeting: quantifying the impact of structural positions and deriving implications for active portfolio management
Structural positions are very common in investment practice. A structural position is defined as a permanent overweighting of a riskier asset class relative to a prespecified benchmark portfolio. The most prominent example of a structural position is the equity bias in a balanced fund that arises from consistently overweighting equities in tactical asset allocation. Another example is the permanent allocation to credit in a fixed income portfolio with a government benchmark. The analysis provided in this article shows that, whenever possible, structural positions should be avoided. Graphical illustrations based on the Pythagorean theorem are used to connect the active risk/return and the total risk/return frameworks. Structural positions alter the risk profile of the portfolio substantially, and the appeal of active management, namely providing active returns uncorrelated with benchmark returns and hence shifting the efficient frontier outwards, is lost. The article demonstrates that the commonly used alpha/tracking-error criterion is not sufficient for active management. In addition, structural positions complicate the measurement of managers' skill. The paper also develops normative implications for active portfolio management. Tactical asset allocation should be based on comparing the expected excess return of an asset class to the equilibrium risk premium of that same asset class, not to the expected excess returns of other asset classes. For cases where structural positions cannot be avoided, a risk budgeting approach is introduced and applied to determine the optimal position size. Finally, investors are advised not to base performance evaluation solely on simple manager rankings, because this encourages managers to take structural positions and does not reward efforts to produce alpha. The same holds true for comparing managers' information ratios. Information ratios, defined in investment practice as the ratio of active return to active risk, do not uncover structural positions.
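The Pythagorean connection mentioned above is easy to check numerically: when active returns are uncorrelated with the benchmark, total variance is the sum of benchmark variance and squared tracking error, while a structural position adds a correlation term. The figures below are assumed for illustration only.

```python
import numpy as np

sigma_b = 0.10   # benchmark volatility (assumed, annualized)
te      = 0.04   # tracking error, i.e. active risk (assumed)

# Pure active management: active returns uncorrelated with the benchmark,
# so total risk obeys the Pythagorean relation sigma_p^2 = sigma_b^2 + te^2.
sigma_p_uncorr = np.sqrt(sigma_b**2 + te**2)

# Structural position: the permanent overweight correlates active returns
# with the benchmark (rho > 0), pushing total risk beyond the Pythagorean case.
rho = 0.6
sigma_p_struct = np.sqrt(sigma_b**2 + te**2 + 2 * rho * sigma_b * te)

print(f"uncorrelated: {sigma_p_uncorr:.4f}, structural: {sigma_p_struct:.4f}")
# uncorrelated: 0.1077, structural: 0.1281
```

The same decomposition explains why an information ratio computed as active return over active risk looks identical in both cases and therefore cannot reveal the structural position.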
Deep Neural Network Cloud-Type Classification (DeepCTC) model and its application in evaluating PERSIANN-CCS
Satellite remote sensing plays a pivotal role in characterizing hydrometeorological components, including cloud types and their associated precipitation. The Cloud Profiling Radar (CPR) on the polar-orbiting CloudSat satellite has provided a unique dataset for characterizing cloud types. However, data from this nadir-looking radar offer limited capability for estimating precipitation because of the narrow satellite swath coverage and low temporal frequency. We use these high-quality observations to build a Deep Neural Network Cloud-Type Classification (DeepCTC) model that estimates cloud types from multispectral data from the Advanced Baseline Imager (ABI) onboard the GOES-16 platform. The DeepCTC model is trained and tested using coincident data from both CloudSat and ABI over the CONUS region. Evaluations of DeepCTC indicate that the model performs well for a variety of cloud types, including Altostratus, Altocumulus, Cumulus, Nimbostratus, Deep Convective, and High clouds, although capturing low-level clouds remains a challenge. Results from simulated GOES-16 ABI imagery of the Hurricane Harvey event show that large-scale, rapid, and consistent cloud-type monitoring is possible with the DeepCTC model. Additionally, assessments using half-hourly Multi-Radar/Multi-Sensor (MRMS) precipitation rate data for Hurricane Harvey as a case study show the ability of DeepCTC to identify rainy clouds, including Deep Convective and Nimbostratus, and their precipitation potential. We also use DeepCTC to evaluate the performance of the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Cloud Classification System (PERSIANN-CCS) product over different cloud types, with MRMS as the reference, at a half-hourly time scale for July 2018. Our analysis suggests that DeepCTC provides supplementary insights into the variability of cloud types, helping diagnose the weaknesses and strengths of near-real-time GEO-based precipitation retrievals. With additional training and testing, we believe DeepCTC has the potential to augment the widely used PERSIANN-CCS algorithm for estimating precipitation.
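For a rough sense of the setup, a pixel-wise classifier mapping ABI's 16 spectral bands to CloudSat-derived cloud-type labels could look like the sketch below; the architecture, layer sizes, and the use of eight CPR cloud-type classes are assumptions, not the published DeepCTC model.

```python
import tensorflow as tf

NUM_ABI_BANDS = 16     # GOES-16 ABI spectral bands
NUM_CLOUD_TYPES = 8    # e.g. the eight CloudSat 2B-CLDCLASS categories

# Minimal pixel-wise classifier (illustrative stand-in for DeepCTC)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_ABI_BANDS,)),   # one multispectral pixel
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLOUD_TYPES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training pairs ABI pixels with coincident CloudSat CPR labels, e.g.:
# model.fit(abi_pixels, cpr_labels, epochs=10, validation_split=0.1)
```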
Stochastic Programming with Probability Constraints
In this work we study optimization problems subject to a failure constraint.
This constraint is expressed in terms of a condition that causes failure,
representing a physical or technical breakdown. We formulate the problem in
terms of a probability constraint, where the level of "confidence" is a
modelling parameter and has the interpretation that the probability of failure
should not exceed that level. Application of the stochastic Arrow-Hurwicz
algorithm poses two difficulties: one is structural and arises from the lack of
convexity of the probability constraint, and the other is the estimation of the
gradient of the probability constraint. We develop two gradient estimators with
decreasing bias via a convolution method and a finite difference technique,
respectively, and we provide a full analysis of convergence of the algorithms.
Convergence results are used to tune the parameters of the numerical algorithms
in order to achieve the best convergence rates, and numerical results are
presented for an example application in finance.
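A stripped-down primal-dual iteration makes both difficulties visible: the dual step enforces the probability constraint, and a convolution-smoothed indicator supplies a gradient estimate whose bias shrinks with the bandwidth. The kernel, step sizes, and noise model below are illustrative assumptions, not the paper's tuned choices.

```python
import numpy as np

def smoothed_indicator_grad(g, grad_g, x, xi, h):
    """Convolution-style estimator: replace the failure indicator
    1{g(x, xi) > 0} by a smooth step of bandwidth h, then differentiate.
    (Illustrative kernel; the bias vanishes as h -> 0.)"""
    s = 1.0 / (1.0 + np.exp(-g(x, xi) / h))     # smooth step in [0, 1]
    return (s * (1.0 - s) / h) * grad_g(x, xi)

def arrow_hurwicz(grad_j, g, grad_g, x0, alpha, n_iter=10_000, seed=0):
    """Stochastic Arrow-Hurwicz sketch for min E[J] s.t. P(g > 0) <= alpha.
    grad_j, g, grad_g take (x, xi); xi is a noise draw."""
    rng = np.random.default_rng(seed)
    x, lam = np.asarray(x0, dtype=float), 0.0
    for k in range(1, n_iter + 1):
        xi = rng.standard_normal(x.shape)       # assumed noise model
        eps, rho, h = 1.0 / k, 1.0 / k**0.6, 1.0 / k**0.25
        # primal descent on the Lagrangian with the smoothed gradient
        grad_c = smoothed_indicator_grad(g, grad_g, x, xi, h)
        x = x - eps * (grad_j(x, xi) + lam * grad_c)
        # dual ascent, projected onto lam >= 0
        lam = max(0.0, lam + rho * (float(g(x, xi) > 0) - alpha))
    return x, lam
```

The coupled decay rates for the step sizes and the smoothing bandwidth are exactly the parameters whose tuning the convergence analysis informs.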
Unbiased Comparative Evaluation of Ranking Functions
Eliciting relevance judgments for ranking evaluation is labor-intensive and
costly, motivating careful selection of which documents to judge. Unlike
traditional approaches that make this selection deterministically,
probabilistic sampling has shown intriguing promise since it enables the design
of estimators that are provably unbiased even when reusing data with missing
judgments. In this paper, we first unify and extend these sampling approaches
by viewing the evaluation problem as a Monte Carlo estimation task that applies
to a large number of common IR metrics. Drawing on the theoretical clarity that
this view offers, we tackle three practical evaluation scenarios: comparing two
systems, comparing systems against a baseline, and ranking systems. For
each scenario, we derive an estimator and a variance-optimizing sampling
distribution while retaining the strengths of sampling-based evaluation,
including unbiasedness, reusability despite missing data, and ease of use in
practice. In addition to the theoretical contribution, we empirically evaluate
our methods against previously used sampling heuristics and find that they
generally cut the number of required relevance judgments at least in half.
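In its simplest form, the Monte Carlo view reduces to inverse-propensity weighting of the sampled judgments. Below is a minimal sketch for precision@k, a simplified stand-in for the paper's estimators.

```python
def ht_precision_at_k(ranking, judged, inclusion_prob, k=10):
    """Horvitz-Thompson style estimate of precision@k from sampled judgments.

    judged         : dict doc_id -> 0/1 relevance (sampled docs only)
    inclusion_prob : dict doc_id -> probability the doc was chosen for judging
    Unjudged documents contribute zero, and judged ones are reweighted by
    1/p, which keeps the estimate unbiased under the sampling distribution.
    """
    total = 0.0
    for d in ranking[:k]:
        if d in judged:
            total += judged[d] / inclusion_prob[d]
    return total / k
```

The same reweighting extends to any metric expressible as a sum over rank positions, which is the unification the paper exploits when comparing and ranking systems.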