49,062 research outputs found

    A Framework to Adjust Dependency Measure Estimates for Chance

    Full text link
    Estimating the strength of dependency between two variables is fundamental for exploratory analysis and many other applications in data mining. For example, non-linear dependencies between two continuous variables can be explored with the Maximal Information Coefficient (MIC), and categorical variables that are dependent on the target class are selected using Gini gain in random forests. Nonetheless, because dependency measures are estimated on finite samples, interpreting their values and ranking dependencies accurately become challenging. Dependency estimates are not equal to 0 when variables are independent, cannot be compared if computed on different sample sizes, and are inflated by chance on variables with more categories. In this paper, we propose a framework to adjust dependency measure estimates on finite samples. Our adjustments, which are simple and applicable to any dependency measure, improve interpretability when quantifying dependency and accuracy on the task of ranking dependencies. In particular, we demonstrate that our approach enhances the interpretability of MIC when used as a proxy for the amount of noise between variables, and improves accuracy when ranking variables during the splitting procedure in random forests.
    Comment: In Proceedings of the 2016 SIAM International Conference on Data Mining
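
    Chance adjustments of this kind are commonly computed against a permutation null model: subtract the expected estimate under independence and rescale by the maximum attainable value. Below is a minimal sketch of that scheme, using mutual information between discrete variables as a stand-in measure; the function names, permutation count, and entropy-based upper bound are illustrative assumptions, not the paper's exact procedure.

        import numpy as np
        from sklearn.metrics import mutual_info_score

        def entropy_of(labels):
            # Empirical Shannon entropy (nats) of a discrete variable.
            _, counts = np.unique(labels, return_counts=True)
            p = counts / counts.sum()
            return -np.sum(p * np.log(p))

        def adjusted_dependency(x, y, measure=mutual_info_score, n_perm=200, seed=0):
            # Adjust a dependency estimate for chance via a permutation null:
            # adjusted = (estimate - E[null]) / (max - E[null]), so independent
            # variables score ~0 regardless of sample size or category counts.
            rng = np.random.default_rng(seed)
            estimate = measure(x, y)
            null = np.mean([measure(x, rng.permutation(y)) for _ in range(n_perm)])
            # Upper bound for mutual information between discrete variables
            # (illustrative; the right bound depends on the chosen measure).
            max_val = min(entropy_of(x), entropy_of(y))
            return (estimate - null) / (max_val - null)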

    Comparing performance of statistical models for individual’s ability index and ranking

    Get PDF
    Efficient allocation of resources is a basic problem in economics. Firms, educational institutions, and universities face the problem of estimating the true abilities of individuals and ranking them for jobs, admissions, scholarship awards, and so on. This study provides a guideline on which technique should be used to estimate true ability indices and rankings that reveal ability with maximum efficiency, with the clear advantage of differentiating among individuals who have equal raw scores. Two major theories, Classical Test Theory (CTT) and Item Response Theory (IRT), have been used in the literature. We design two different Monte Carlo studies to investigate which theory is better and which model performs more efficiently. By examining the weaknesses of CTT, this study shows that IRT is superior to CTT. Among the IRT models used in the literature, we measure performance and find that the logistic P2 (two-parameter) model performs best. Using this model, we estimate students' ability indices from their entry test scores and then compare them with the abilities obtained from their final board examination results (used as a proxy for true abilities). This is reasonable because the final exam consists of several papers, so chance variation in the ability index is minimal. With this real-life application, the study also shows that IRT estimates true abilities more efficiently than the classical methodology.
    Keywords: Ability Index, Monte Carlo study, Logistic and Probit models, Item Response Theory, Classical Test Theory, Ranking of Students
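
    For context, the logistic P2 model gives the probability of a correct response as 1 / (1 + exp(-a(theta - b))), with discrimination a and difficulty b. The sketch below simulates item responses and recovers an ability index by maximum likelihood; all parameter values and names are illustrative, not the study's actual design.

        import numpy as np
        from scipy.optimize import minimize_scalar

        def p_correct(theta, a, b):
            # 2PL item response function: P(correct | ability theta).
            return 1.0 / (1.0 + np.exp(-a * (theta - b)))

        def estimate_ability(responses, a, b):
            # Maximum-likelihood ability estimate for one examinee,
            # given known item parameters a (discrimination) and b (difficulty).
            def neg_log_lik(theta):
                p = p_correct(theta, a, b)
                return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
            return minimize_scalar(neg_log_lik, bounds=(-4, 4), method="bounded").x

        # Illustrative Monte Carlo check: one examinee, 40 items.
        rng = np.random.default_rng(1)
        a = rng.uniform(0.5, 2.0, 40)        # discrimination parameters
        b = rng.normal(0.0, 1.0, 40)         # difficulty parameters
        true_theta = 0.8
        responses = (rng.random(40) < p_correct(true_theta, a, b)).astype(float)
        print(estimate_ability(responses, a, b))  # should land near 0.8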

    Chance-Constrained Equilibrium in Electricity Markets With Asymmetric Forecasts

    Full text link
    We develop a stochastic equilibrium model for an electricity market with asymmetric renewable energy forecasts. In our setting, market participants optimize their profits using public information about the conditional expectation of energy production but private information about the forecast error distribution. This information is given in the form of samples and is incorporated into the participants' profit-maximizing optimizations through chance constraints. We model information asymmetry by varying the sample size of participants' private information. We show that with more information available, the equilibrium gradually converges to the ideal solution provided by the perfect-information scenario. Under information scarcity, however, we show that the market still converges to the ideal equilibrium if participants either infer the forecast error distribution from the statistical properties of the data at hand or share their private forecasts.
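
    A chance constraint of the form P(delivery shortfall) <= epsilon is typically reformulated from error samples via an empirical quantile. The sketch below illustrates that sample-based approximation for a single producer's commitment decision, and how it behaves as the private sample grows; the numbers and the single-producer framing are illustrative assumptions, not the paper's market model.

        import numpy as np

        def max_safe_commitment(expected_output, error_samples, epsilon=0.05):
            # Largest commitment q with P(output >= q) >= 1 - epsilon, where
            # output = expected + error; the chance constraint is replaced by
            # the empirical epsilon-quantile of the sampled outputs.
            outputs = expected_output + np.asarray(error_samples)
            return np.quantile(outputs, epsilon)

        # Illustrative: a participant with few vs. many private error samples.
        rng = np.random.default_rng(0)
        true_errors = rng.normal(0.0, 10.0, 10_000)   # "true" error distribution
        for n in (10, 100, 10_000):                   # varying private sample size
            q = max_safe_commitment(100.0, rng.choice(true_errors, n), epsilon=0.05)
            print(n, round(q, 1))  # approaches the 5% quantile (~83.6) as n grows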

    Discovering Reliable Dependencies from Data: Hardness and Improved Algorithms

    Get PDF
    The reliable fraction of information is an attractive score for quantifying (functional) dependencies in high-dimensional data. In this paper, we systematically explore the algorithmic implications of using this measure for optimization. We show that the problem is NP-hard, which justifies the use of worst-case exponential-time as well as heuristic search methods. We then substantially improve the practical performance of both optimization styles by deriving a novel admissible bounding function that has an unbounded potential for additional pruning over the previously proposed one. Finally, we empirically investigate the approximation ratio of the greedy algorithm and show that it produces highly competitive results in a fraction of the time needed for a complete branch-and-bound-style search.
    Comment: Accepted to Proceedings of the IEEE International Conference on Data Mining (ICDM'18)
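
    The greedy style of search evaluated here has a familiar shape: repeatedly add the feature whose inclusion most increases the score, and stop when no feature helps. The sketch below uses a plain (unadjusted) fraction-of-information estimate as a stand-in score; the paper's reliable variant corrects this estimate for chance, and all names here are illustrative.

        import numpy as np
        from sklearn.metrics import mutual_info_score

        def entropy(labels):
            _, counts = np.unique(labels, return_counts=True)
            p = counts / counts.sum()
            return -np.sum(p * np.log(p))

        def fraction_of_information(cols, y):
            # F(X; Y) = I(X; Y) / H(Y) for discrete data; the selected columns
            # are joined into one compound variable. Assumes y is not constant.
            joint = np.array(["|".join(map(str, row)) for row in zip(*cols)])
            return mutual_info_score(joint, y) / entropy(y)

        def greedy_search(X, y):
            # Add the column with the largest score improvement at each step;
            # stop as soon as no candidate improves the current best score.
            selected, best = [], 0.0
            remaining = list(range(X.shape[1]))
            while remaining:
                gains = []
                for j in remaining:
                    cols = [X[:, k] for k in selected + [j]]
                    gains.append((fraction_of_information(cols, y), j))
                score, j = max(gains)
                if score <= best:
                    break
                selected.append(j)
                remaining.remove(j)
                best = score
            return selected, best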

    Is late-life dependency increasing or not? A comparison of the Cognitive Function and Ageing Studies (CFAS)

    Get PDF
    Background: Little is known about how dependency levels have changed between generational cohorts of older people. We estimated years lived in different care states at age 65 in 1991 and 2011 and new projections of future demand for care. Methods: Two population-based studies of older people in defined geographical areas conducted two decades apart (the Cognitive Function and Ageing Studies) provided prevalence estimates of dependency in four states: high (24-hour care); medium (daily care); low (less than daily); independent. Years in each dependency state were calculated by Sullivan’s method. To project future demand, the proportions in each dependency state (by age group and sex) were applied to the 2014 England population projections. Findings: Between 1991 and 2011 there were significant increases in years lived from age 65 with low (men:1·7 years, 95%CI 1·0-2·4; women:2·4 years, 95%CI 1·8-3·1) and high dependency (men:0·9 years, 95%CI 0·2-1·7; women:1·3 years, 95%CI 0·5-2·1). The majority of men’s extra years of life were independent (36%) or with low dependency (36%) whilst for women the majority were spent with low dependency (58%), only 5% being independent. There were substantial reductions in the proportions with medium and high dependency who lived in care homes, although, if these dependency and care home proportions remain constant in the future, further population ageing will require an extra 71,000 care home places by 2025. Interpretation: On average older men now spend 2·4 years and women 3·0 years with substantial care needs (medium or high dependency), and most will live in the community. These findings have considerable implications for older people’s families who provide the majority of unpaid care, but the findings also supply valuable new information for governments and care providers planning the resources and funding required for the care of their future ageing populations
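
    Sullivan’s method weights life-table person-years by the observed prevalence of each state: expected years lived in a state from age 65 is the sum over age groups of person-years lived times state prevalence, divided by the number of survivors at 65. A minimal sketch with made-up inputs (illustrative numbers only, not CFAS data):

        import numpy as np

        def sullivan_years(person_years, prevalence, survivors_at_start):
            # Expected years lived in a state from the starting age:
            # age-interval person-years weighted by state prevalence,
            # divided by the life-table survivors at the starting age.
            return np.dot(person_years, prevalence) / survivors_at_start

        # Illustrative life-table inputs for age groups 65-74, 75-84, 85+.
        person_years = np.array([9000.0, 6000.0, 2000.0])  # per 1,000 alive at 65
        prevalence_high = np.array([0.02, 0.08, 0.25])     # high-dependency prevalence
        print(sullivan_years(person_years, prevalence_high, survivors_at_start=1000.0))
        # -> 1.16 expected years of high dependency from age 65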

    Thousands of models, one story: current account imbalances in the global economy

    Get PDF
    The global financial crisis has led to a revival of the empirical literature on current account imbalances. This paper contributes to that literature by investigating the importance of evaluating model and parameter uncertainty before reaching any firm conclusion. We explore three alternative econometric strategies: examining all models, selecting a few, and combining them all. Out of thousands (indeed, millions) of models a story emerges: the chance that current accounts were aligned with fundamentals prior to the financial crisis appears to be minimal.
    Keywords: Macroeconomics - Econometric models
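
    "Combining them all" is typically implemented as Bayesian model averaging: estimate every subset of candidate regressors, weight each model by an approximation to its posterior probability (here exp(-BIC/2)), and average the coefficients. The sketch below is one standard way to do this, not the paper's exact procedure; all names are illustrative.

        import numpy as np
        from itertools import combinations

        def bma_coefficients(X, y, names):
            # Average OLS coefficients over all regressor subsets, weighting
            # each model by exp(-BIC/2) as a posterior-probability proxy.
            n, k = X.shape
            subsets = [s for r in range(1, k + 1) for s in combinations(range(k), r)]
            bics, betas = [], []
            for s in subsets:
                Xs = X[:, s]
                beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
                rss = float(np.sum((y - Xs @ beta) ** 2))
                bics.append(n * np.log(rss / n) + len(s) * np.log(n))
                betas.append(beta)
            w = np.exp(-0.5 * (np.array(bics) - min(bics)))  # shift for stability
            w /= w.sum()
            avg = dict.fromkeys(names, 0.0)
            for weight, s, beta in zip(w, subsets, betas):
                for j, b in zip(s, beta):
                    avg[names[j]] += weight * b
            return avg  # coefficients are implicitly 0 in models excluding a regressor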

    Towards a scope management of non-functional requirements in requirements engineering

    Get PDF
    Getting business stakeholders’ goals formulated clearly and the project scope defined realistically increases the chance of success for any application development process. Consequently, at early project stages stakeholders acquire as much knowledge as possible about the requirements, their risk estimates, and their prioritization. Current industrial practice suggests that in most software projects this scope assessment is performed on the users’ functional requirements (FRs), while the non-functional requirements (NFRs) remain, by and large, ignored. However, increasing software complexity and competition in the software industry have highlighted the need to consider NFRs as an integral part of software modeling and development. This paper contributes towards harmonizing the need to build the functional behavior of a system with the need to model the associated NFRs, while maintaining scope management for the NFRs. The paper presents a systematic and precisely defined model for the early integration of NFRs within requirements engineering (RE). Early experiences with the model indicate its ability to facilitate the process of acquiring knowledge on the priority and risk of NFRs.
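
    One lightweight way to make knowledge about NFR priority and risk actionable during scoping is to record each NFR with its estimates and rank the backlog; the data shape and the priority-times-risk scoring rule below are illustrative assumptions, not the model proposed in the paper.

        from dataclasses import dataclass

        @dataclass
        class NFR:
            name: str
            priority: int   # 1 (low) .. 5 (critical), elicited from stakeholders
            risk: float     # 0..1, estimated likelihood the NFR is not met

        def scope_rank(nfrs):
            # Rank NFRs for scope assessment: high priority and high risk first.
            # The product is an illustrative scoring rule, not a prescribed one.
            return sorted(nfrs, key=lambda r: r.priority * r.risk, reverse=True)

        backlog = [
            NFR("response time < 200 ms", priority=5, risk=0.6),
            NFR("WCAG 2.1 AA accessibility", priority=3, risk=0.4),
            NFR("99.9% availability", priority=4, risk=0.7),
        ]
        for r in scope_rank(backlog):
            print(f"{r.priority * r.risk:.2f}  {r.name}")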

    Dynamics of Poverty in Ethiopia

    Get PDF
    Keywords: poverty dynamics, vulnerability, households, duration