A Framework to Adjust Dependency Measure Estimates for Chance
Estimating the strength of dependency between two variables is fundamental
for exploratory analysis and many other applications in data mining. For
example: non-linear dependencies between two continuous variables can be
explored with the Maximal Information Coefficient (MIC); and categorical
variables that are dependent on the target class are selected using Gini gain
in random forests. Nonetheless, because dependency measures are estimated on
finite samples, the interpretability of their quantification and the accuracy
when ranking dependencies become challenging. Dependency estimates are not
equal to 0 when variables are independent, cannot be compared if computed on
different sample sizes, and are inflated by chance for variables with more
categories. In this paper, we propose a framework to adjust dependency measure
estimates on finite samples. Our adjustments, which are simple and applicable
to any dependency measure, are helpful in improving interpretability when
quantifying dependency and in improving accuracy on the task of ranking
dependencies. In particular, we demonstrate that our approach enhances the
interpretability of MIC when used as a proxy for the amount of noise between
variables, and to gain accuracy when ranking variables during the splitting
procedure in random forests.
Comment: In Proceedings of the 2016 SIAM International Conference on Data Mining.
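The chance-adjustment idea described in this abstract can be sketched generically: estimate the measure's expected value under the null of independence by permuting one variable, then subtract that baseline from the observed estimate. The sketch below is an illustration of the idea only, not the paper's actual framework; the helper names are hypothetical and plug-in mutual information stands in for an arbitrary dependency measure:

```python
import random
from collections import Counter
from math import log

def mutual_information(x, y):
    """Plug-in mutual information estimate (in nats) for two
    categorical sequences of equal length."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    return sum(c / n * log((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def adjusted_for_chance(x, y, measure=mutual_information, n_perm=200, seed=0):
    """Subtract the measure's mean value under the permutation null,
    so that independent variables score approximately 0 on average."""
    rng = random.Random(seed)
    observed = measure(x, y)
    y = list(y)           # copy, so shuffling does not affect the caller
    null = []
    for _ in range(n_perm):
        rng.shuffle(y)
        null.append(measure(x, y))
    return observed - sum(null) / n_perm
```

For perfectly dependent variables the adjusted score stays close to the raw estimate, while for independent variables it hovers around zero instead of being positively biased on small samples.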
Comparing performance of statistical models for individual's ability index and ranking
Efficient allocation of resources is a basic problem in economics. Firms, educational institutions, and universities face the problem of estimating the true abilities of individuals and ranking them for jobs, admissions, scholarship awards, etc. This study provides a guideline on which technique should be used to estimate true ability indices and to produce a ranking that reveals ability with maximum efficiency and that can differentiate among individuals with equal raw scores. Two major theories, Classical Test Theory (CTT) and Item Response Theory (IRT), have been used in the literature. We design two different Monte Carlo studies to investigate which theory is better and which model performs more efficiently. By discussing the weaknesses of CTT, this study shows that IRT is superior to CTT. Among the IRT models used in the literature, we measured performance and found that the Logistic P2 (two-parameter logistic) model performs best. Using this model, we estimate students' ability indices from their entry-test scores and then compare them with the abilities obtained from the final board examination results (used as a proxy for true abilities). This is reasonable because the final exam consists of various papers, so chance variation in the ability index is minimal. With a real-life application, this study also shows that IRT estimates true abilities more efficiently than the classical methodology.
Ability Index, Monte Carlo study, Logistic and Probit models, Item Response Theory, Classical Test Theory, Ranking of Students
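The Logistic P2 (two-parameter logistic, 2PL) model mentioned above gives the probability of a correct response as a logistic function of ability. A minimal sketch, with assumed function names and a simple grid-search maximum-likelihood estimate standing in for the Newton/EM routines real IRT software uses:

```python
from math import exp, log

def p_correct(theta, a, b):
    """2PL IRT model: probability that an examinee with ability theta
    answers an item with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def ability_mle(responses, items):
    """Maximum-likelihood ability estimate via a grid search over
    theta in [-4, 4]; responses are 0/1, items are (a, b) pairs."""
    def loglik(theta):
        return sum(u * log(p_correct(theta, a, b))
                   + (1 - u) * log(1.0 - p_correct(theta, a, b))
                   for u, (a, b) in zip(responses, items))
    grid = [g / 100.0 for g in range(-400, 401)]
    return max(grid, key=loglik)
```

An examinee who answers harder items correctly receives a higher ability estimate than one with the same raw score on easier items, which is exactly the property the abstract highlights over raw-score ranking.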
Chance-Constrained Equilibrium in Electricity Markets With Asymmetric Forecasts
We develop a stochastic equilibrium model for an electricity market with
asymmetric renewable energy forecasts. In our setting, market participants
optimize their profits using public information about a conditional expectation
of energy production but use private information about the forecast error
distribution. This information is given in the form of samples and incorporated
into profit-maximizing optimizations of market participants through chance
constraints. We model information asymmetry by varying the sample size of
participants' private information. We show that with more information
available, the equilibrium gradually converges to the ideal solution provided
by the perfect information scenario. Under information scarcity, however, we
show that the market converges to the ideal equilibrium if participants either
infer the forecast error distribution from the statistical properties of the
data at hand or share their private forecasts.
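The sample-based chance constraints described above can be illustrated with a one-dimensional sketch: given forecast-error samples, an offer satisfies P(shortfall) <= epsilon if it does not exceed the empirical epsilon-quantile of realized production. The function name is an assumption, and the paper's equilibrium model is far richer than this single-agent rule:

```python
def chance_constrained_offer(point_forecast, error_samples, epsilon=0.05):
    """Sample approximation of the chance constraint
    P(actual production >= offer) >= 1 - epsilon, where
    actual = point_forecast + error. With n samples, the largest
    admissible offer is the floor(epsilon * n)-th smallest realization."""
    realizations = sorted(point_forecast + e for e in error_samples)
    k = int(epsilon * len(realizations))  # allowed shortfall scenarios
    return realizations[k]
```

As the number of private samples grows, this empirical quantile converges to the true one, which mirrors the abstract's result that more information moves the equilibrium toward the perfect-information solution.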
Discovering Reliable Dependencies from Data: Hardness and Improved Algorithms
The reliable fraction of information is an attractive score for quantifying
(functional) dependencies in high-dimensional data. In this paper, we
systematically explore the algorithmic implications of using this measure for
optimization. We show that the problem is NP-hard, which justifies the usage of
worst-case exponential-time as well as heuristic search methods. We then
substantially improve the practical performance for both optimization styles by
deriving a novel admissible bounding function that has an unbounded potential
for additional pruning over the previously proposed one. Finally, we
empirically investigate the approximation ratio of the greedy algorithm and
show that it produces highly competitive results in a fraction of time needed
for complete branch-and-bound style search.
Comment: Accepted to Proceedings of the IEEE International Conference on Data Mining (ICDM'18).
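The greedy heuristic whose approximation quality the paper studies can be sketched abstractly: grow a feature subset one element at a time, always adding the candidate that most improves the score, and compare against exhaustive enumeration. The names and the toy additive score below are illustrative assumptions, not the paper's reliable-fraction-of-information algorithm:

```python
from itertools import combinations

def greedy_maximize(score, candidates, max_size):
    """Greedy subset search: repeatedly add the candidate that most
    improves the score, stopping when no addition helps. A common
    heuristic when exact search is too expensive."""
    chosen = frozenset()
    while len(chosen) < max_size:
        best = max((chosen | {c} for c in candidates if c not in chosen),
                   key=score, default=None)
        if best is None or score(best) <= score(chosen):
            break
        chosen = best
    return chosen

def exhaustive_maximize(score, candidates, max_size):
    """Exact baseline: enumerate all subsets up to max_size, as a
    branch-and-bound search would (without the pruning)."""
    best = frozenset()
    for k in range(1, max_size + 1):
        for subset in combinations(candidates, k):
            if score(frozenset(subset)) > score(best):
                best = frozenset(subset)
    return best
```

For additive (modular) scores the two agree; the paper's empirical finding is that greedy also stays highly competitive on the non-modular dependency score, at a fraction of the cost.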
Is late-life dependency increasing or not? A comparison of the Cognitive Function and Ageing Studies (CFAS)
Background: Little is known about how dependency levels have changed between generational cohorts of older people. We estimated years lived in different care states at age 65 in 1991 and 2011 and made new projections of future demand for care.
Methods: Two population-based studies of older people in defined geographical areas conducted two decades apart (the Cognitive Function and Ageing Studies) provided prevalence estimates of dependency in four states: high (24-hour care); medium (daily care); low (less than daily); independent. Years in each dependency state were calculated by Sullivan's method. To project future demand, the proportions in each dependency state (by age group and sex) were applied to the 2014 England population projections.
Findings: Between 1991 and 2011 there were significant increases in years lived from age 65 with low dependency (men: 1.7 years, 95% CI 1.0-2.4; women: 2.4 years, 95% CI 1.8-3.1) and high dependency (men: 0.9 years, 95% CI 0.2-1.7; women: 1.3 years, 95% CI 0.5-2.1). The majority of men's extra years of life were independent (36%) or with low dependency (36%), whilst for women the majority were spent with low dependency (58%), only 5% being independent. There were substantial reductions in the proportions with medium and high dependency who lived in care homes, although, if these dependency and care home proportions remain constant in the future, further population ageing will require an extra 71,000 care home places by 2025.
Interpretation: On average older men now spend 2.4 years and women 3.0 years with substantial care needs (medium or high dependency), and most will live in the community. These findings have considerable implications for older people's families, who provide the majority of unpaid care, but they also supply valuable new information for governments and care providers planning the resources and funding required for the care of their future ageing populations.
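Sullivan's method, which the study uses to compute years lived in each care state, weights life-table person-years by the prevalence of the state in each age band. A minimal sketch with made-up illustrative numbers; the function name and inputs are assumptions, not the CFAS data:

```python
def sullivan_years_in_state(person_years, prevalence, survivors_at_start):
    """Sullivan's method: expected years lived in a health state from
    the index age = sum over age bands of life-table person-years (Lx)
    weighted by the prevalence of the state, per survivor at the
    index age (l at the starting age)."""
    assert len(person_years) == len(prevalence)
    return sum(L * p for L, p in zip(person_years, prevalence)) / survivors_at_start
```

For example, with three age bands of person-years [450000, 350000, 200000] per 100,000 survivors at 65 and hypothetical low-dependency prevalences [0.1, 0.2, 0.3], the expected years with low dependency come to 1.75; setting every prevalence to 1 recovers total life expectancy at 65 (10 years in this toy life table).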
Thousands of models, one story: current account imbalances in the global economy
The global financial crisis has led to a revival of the empirical literature on current account imbalances. This paper contributes to that literature by investigating the importance of evaluating model and parameter uncertainty prior to reaching any firm conclusion. We explore three alternative econometric strategies: examining all models, selecting a few, and combining them all. Out of thousands (or indeed millions) of models, a story emerges. The chance that current accounts were aligned with fundamentals prior to the financial crisis appears to be minimal.
Macroeconomics - Econometric models
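The "combining them all" strategy can be sketched with a standard information-criterion weighting (Akaike weights). This is a generic illustration of model averaging under assumed names, not necessarily the averaging scheme the paper uses:

```python
from math import exp

def akaike_weights(aics):
    """Convert AIC values to model weights: w_i proportional to
    exp(-delta_i / 2), where delta_i = AIC_i - min AIC."""
    best = min(aics)
    raw = [exp(-(a - best) / 2.0) for a in aics]
    total = sum(raw)
    return [r / total for r in raw]

def combined_forecast(predictions, aics):
    """AIC-weighted average of the models' point predictions."""
    return sum(w * p for w, p in zip(akaike_weights(aics), predictions))
```

Models fitting equally well receive equal weight, and poorly fitting models are down-weighted exponentially, so the combined estimate reflects model uncertainty rather than a single selected specification.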
Towards a scope management of non-functional requirements in requirements engineering
Getting business stakeholders' goals formulated clearly and project scope defined realistically increases the chance of success for any application development process. As a consequence, stakeholders at early project stages acquire as much knowledge as possible about the requirements, their risk estimates, and their prioritization. Current industrial practice suggests that in most software projects this scope assessment is performed on the user's functional requirements (FRs), while the non-functional requirements (NFRs) remain, by and large, ignored. However, increasing software complexity and competition in the software industry have highlighted the need to consider NFRs as an integral part of software modeling and development. This paper contributes towards harmonizing the need to build the functional behavior of a system with the need to model the associated NFRs while maintaining scope management for NFRs. The paper presents a systematic and precisely defined model for an early integration of NFRs within requirements engineering (RE). Early experiences with the model indicate its ability to facilitate the process of acquiring knowledge on the priority and risk of NFRs.
Dynamics of Poverty in Ethiopia
poverty dynamics, vulnerability, households, duration