
    Robust methods for inferring sparse network structures

    This is the post-print version of the final paper published in Computational Statistics & Data Analysis. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2013 Elsevier B.V.

    Networks appear in many fields, from finance to medicine, engineering, biology and social science. They often comprise a very large number of entities, the nodes, and the interest lies in inferring the interactions between these entities, the edges, from relatively limited data. If the underlying network of interactions is sparse, two main statistical approaches are used to retrieve such a structure: covariance modeling approaches with a penalty constraint that encourages sparsity of the network, and nodewise regression approaches with sparse regression methods applied at each node. In the presence of outliers or departures from normality, robust approaches have been developed which relax the assumption of normality. Robust covariance modeling approaches are reviewed and compared with novel nodewise approaches where robust methods are used at each node. For low-dimensional problems, classical deviance tests are also included and compared with penalized likelihood approaches. Overall, copula approaches are found to perform best: they are comparable to the other methods under an assumption of normality or mild departures from it, but superior when the assumption of normality is strongly violated.
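
    The nodewise strategy the abstract contrasts with covariance modeling can be sketched in a few lines: regress each node on all the others with an L1 penalty and keep an edge wherever a coefficient survives. The sketch below is a minimal, non-robust illustration in the style of Meinshausen-Bühlmann selection; the coordinate-descent Lasso, the penalty value, and the toy chain data are illustrative choices, not taken from the paper, and the paper's robust variants would swap a robust sparse regression in at each node.

```python
import numpy as np

def lasso_cd(X, y, alpha, n_iter=300):
    """Minimal coordinate-descent Lasso for (1/2n)||y - Xw||^2 + alpha*||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with feature j removed from the current fit
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            # soft-thresholding update for coordinate j
            w[j] = np.sign(rho) * max(abs(rho) - n * alpha, 0.0) / col_ss[j]
    return w

def nodewise_network(X, alpha=0.05):
    """Nodewise selection: edge (i, j) is kept when either node's sparse
    regression assigns the other a nonzero weight (the "OR" rule)."""
    p = X.shape[1]
    coef = np.zeros((p, p))
    for j in range(p):
        others = [k for k in range(p) if k != j]
        coef[j, others] = lasso_cd(X[:, others], X[:, j], alpha)
    adj = (coef != 0) | (coef != 0).T
    np.fill_diagonal(adj, False)
    return adj

# toy chain 0 -- 1 -- 2: nodes 0 and 2 interact only through node 1
rng = np.random.default_rng(0)
x0 = rng.normal(size=1000)
x1 = x0 + 0.5 * rng.normal(size=1000)
x2 = x1 + 0.5 * rng.normal(size=1000)
A = nodewise_network(np.column_stack([x0, x1, x2]))
```

    The recovered adjacency is symmetric by construction and contains the chain edges; replacing the squared-error loss inside each nodewise fit with a robust loss gives the kind of robust nodewise approach the paper compares.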

    Estimating Position Bias without Intrusive Interventions

    Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal. While it was recently shown how counterfactual learning-to-rank (LTR) approaches \cite{Joachims/etal/17a} can provably overcome presentation bias when observation propensities are known, it remains to be shown how to estimate these propensities effectively. In this paper, we propose the first method for producing consistent propensity estimates without manual relevance judgments, disruptive interventions, or restrictive relevance modeling assumptions. First, we show how to harvest a specific type of intervention data from the historic feedback logs of multiple different ranking functions, and show that this data is sufficient for consistent propensity estimation in the position-based model. Second, we propose a new extremum estimator that makes effective use of this data. In an empirical evaluation, we find that the new estimator provides superior propensity estimates in two real-world systems: Arxiv Full-text Search and Google Drive Search. Beyond these two case studies, we find that the method is robust to a wide range of settings in simulation studies.
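
    The harvesting idea can be illustrated on simulated logs. Under the position-based model, a document's click-through rate at position k factors as (examination propensity at k) × (document relevance), so for any document that two historic rankers placed at different positions, the ratio of its CTRs estimates the propensity ratio and relevance cancels. The snippet below is a deliberately simplified ratio estimator on synthetic data, not the paper's extremum estimator (which pools such pairs more effectively); all propensity, relevance, and ranking values are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

true_prop = np.array([1.0, 0.6, 0.3])   # P(examine | position), unknown in practice
relevance = np.array([0.9, 0.5, 0.2])   # P(click | examined), per document

# two historic rankers that happen to place each document at two positions
rankings = [np.array([0, 1, 2]), np.array([1, 2, 0])]

shows = np.zeros((3, 3))    # shows[doc, pos]
clicks = np.zeros((3, 3))   # clicks[doc, pos]
for _ in range(20000):
    ranking = rankings[rng.integers(2)]
    for pos, doc in enumerate(ranking):
        shows[doc, pos] += 1
        # position-based click model: click = examined AND judged relevant
        if rng.random() < true_prop[pos] * relevance[doc]:
            clicks[doc, pos] += 1

ctr = np.divide(clicks, shows, out=np.zeros_like(clicks), where=shows > 0)

# relevance cancels in the ratio: CTR(d, k) / CTR(d, 0) estimates p_k / p_0
est = np.array([1.0, ctr[1, 1] / ctr[1, 0], ctr[0, 2] / ctr[0, 0]])
print(np.round(est, 2))   # close to the true relative propensities [1.0, 0.6, 0.3]
```

    No manual judgments or live interventions are needed here: the position swaps come for free from the fact that different production rankers ordered the same results differently.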

    Automatic Bayesian Density Analysis

    Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for exploratory data analysis are usually not flexible enough to deal with the uncertainty inherent in real-world data: they are often restricted to fixed latent interaction models and homogeneous likelihoods; they are sensitive to missing, corrupt and anomalous data; moreover, their expressiveness generally comes at the price of intractable inference. As a result, supervision from statisticians is usually needed to find the right model for the data. However, since domain experts are not necessarily also experts in statistics, we propose Automatic Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible to a broad audience. Specifically, ABDA allows for automatic and efficient missing value estimation, statistical data type and likelihood discovery, anomaly detection and dependency structure mining, on top of providing accurate density estimation. Extensive empirical evidence shows that ABDA is a suitable tool for automatic exploratory analysis of mixed continuous and discrete tabular data.

    Comment: In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19).

    A review of regional and global estimates of unconventional gas resources

    This Research Report assesses the currently available evidence on the size of unconventional gas resources at the regional and global level. Focusing in particular on shale gas, it provides a comprehensive summary and comparison of the estimates that have been produced to date. It also examines the methods by which these resource estimates have been produced, the strengths and weaknesses of those methods, the range of uncertainty in the results, and the factors that are relevant to their interpretation.

    An Interview with Thomas J. Sargent

    The rational expectations hypothesis swept through macroeconomics during the 1970s and permanently altered the landscape. It remains the prevailing paradigm in macroeconomics, and rational expectations is routinely used as the standard solution concept in both theoretical and applied macroeconomic modelling. The rational expectations hypothesis was initially formulated by John F. Muth Jr. in the early 1960s. Together with Robert Lucas Jr., Thomas (Tom) Sargent pioneered the rational expectations revolution in macroeconomics in the 1970s. We interviewed Tom Sargent for Macroeconomic Dynamics.

    How does big data affect GDP? Theory and evidence for the UK

    We present an economic approach to measuring the impact of Big Data on GDP and GDP growth. We define data, information, ideas and knowledge. We present a conceptual framework to understand and measure the production of “Big Data”, which we classify as transformed data and data-based knowledge. We use this framework to understand how current official datasets and concepts used by Statistics Offices might already measure Big Data in GDP, or might miss it. We also set out how unofficial data sources might be used to measure the contribution of data to GDP and present estimates of its contribution to growth. Using new estimates of employment and investment in Big Data as set out in Chebli, Goodridge et al. (2015) and Goodridge and Haskel (2015a), and treating transformed data and data-based knowledge as capital assets, we estimate that for the UK: (a) in 2012, “Big Data” assets added £1.6bn to market sector GVA; (b) over 2005-2012, they accounted for 0.02% of growth in market sector value-added; (c) much Big Data activity is already captured in the official data on software: 76% of investment in Big Data is already included in official software investment, and 76% of the contribution of Big Data to GDP growth is likewise already in the software contribution; and (d) in the coming decade, data-based assets may contribute around 0.07% to 0.23% per annum to annual growth on average.
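
    The capitalisation argument reduces to standard growth accounting: an asset's contribution to value-added growth is its income share times the growth rate of the capital services it delivers. The two input numbers below are hypothetical placeholders, not the paper's estimates; they only show the shape of the calculation behind a figure like the 0.02% contribution quoted above.

```python
# Standard growth-accounting identity (hypothetical inputs, for illustration):
# contribution of an asset = income share x growth of its capital services.
data_income_share = 0.001      # hypothetical share of "Big Data" capital income in GVA
data_services_growth = 0.20    # hypothetical annual growth of Big Data capital services

contribution = data_income_share * data_services_growth
print(f"{100 * contribution:.3f}% of annual value-added growth")  # 0.020% ...
```

    The same arithmetic, run with measured income shares and capital-services growth per asset, is what lets official statisticians decompose GDP growth into contributions from software, data, and other capital.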