331 research outputs found
Combining the Attribute Oriented Induction and Graph Visualization to Enhancement Association Rules Interpretation
Data mining encompasses many methods, one of which is association rule mining. Association rule mining produces a huge number of rules, and analysts spend considerable time searching through them for the interesting ones. One solution to this problem is to combine an association rule visualization method with a generalization method. The visualization method is graph-based; the generalization method is the Attribute Oriented Induction (AOI) algorithm. After the combination, AOI is called Modified AOI because some steps of the traditional AOI are removed or changed. The graph technique is likewise called the grouped graph method because it displays the aggregated rules that result from AOI. The results of this paper are reported as a compression ratio that reflects the clarity of the visualization. These results make it possible to test the rules and drill down into them, or to understand them and roll up
Rule mining in maintenance: analysing large knowledge bases
Association rule mining is a very powerful tool for extracting knowledge from records contained in industrial databases. A difficulty is that the mining process may result in a huge set of rules that is hard to analyse. This problem is often addressed by a priori filtering of the candidate rules, which prevents the user from accessing all the potentially interesting knowledge. Another popular solution is visual mining, where visualization techniques allow the user to browse through the rules. We suggest in this article a different approach: generating a large number of rules as a first step, then drilling down into the produced rule base by alternating semantic analysis (based on a priori knowledge) and objective analysis (based on numerical characteristics of the rules). It will be shown on real industrial examples in the maintenance domain that UML Class Diagrams may provide efficient support for subjective analysis, the practical management of the rules (display, sorting and filtering) being ensured by a classical spreadsheet
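The "objective analysis" mentioned in this abstract rests on numerical characteristics of rules. The abstract does not say which measures the authors use, so the sketch below assumes the three most common ones (support, confidence and lift) and applies the sorting and filtering step in plain Python rather than in a spreadsheet. The maintenance records, item names and the 0.4 confidence threshold are hypothetical and chosen only for illustration.

```python
# Hypothetical maintenance records: each set lists attributes observed together.
transactions = [
    {"pump", "vibration", "bearing_failure"},
    {"pump", "vibration", "bearing_failure"},
    {"pump", "leak"},
    {"valve", "leak"},
    {"pump", "vibration"},
]


def support(itemset, records):
    """Fraction of records that contain every item of `itemset`."""
    return sum(itemset <= record for record in records) / len(records)


def rule_metrics(antecedent, consequent, records):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    s_a = support(antecedent, records)
    s_c = support(consequent, records)
    s_ac = support(antecedent | consequent, records)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Hypothetical candidate rules, standing in for the output of a rule miner.
rules = [
    (frozenset({"vibration"}), frozenset({"bearing_failure"})),
    (frozenset({"pump"}), frozenset({"leak"})),
    (frozenset({"leak"}), frozenset({"valve"})),
]

scored = [(a, c, rule_metrics(a, c, transactions)) for a, c in rules]

# Filter on confidence (assumed threshold), then sort by lift, highest first.
min_confidence = 0.4
kept = sorted(
    (rule for rule in scored if rule[2]["confidence"] >= min_confidence),
    key=lambda rule: rule[2]["lift"],
    reverse=True,
)

for antecedent, consequent, metrics in kept:
    print(sorted(antecedent), "->", sorted(consequent), metrics)
```

Filtering after scoring, rather than before mining, keeps every candidate rule available for later inspection, which matches the "generate first, drill down afterwards" idea described in the abstract.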
Beyond L1: Faster and Better Sparse Models with skglm
We propose a new fast algorithm to estimate any sparse generalized linear
model with convex or non-convex separable penalties. Our algorithm is able to
solve problems with millions of samples and features in seconds, by relying on
coordinate descent, working sets and Anderson acceleration. It handles
previously unaddressed models, and is extensively shown to improve on state-of-the-art
algorithms. We provide a flexible, scikit-learn compatible package, which
easily handles customized datafits and penalties
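To make the coordinate descent ingredient concrete, here is a minimal sketch for one special case only: the L1-penalized least-squares problem (Lasso), minimizing (1/(2n))·||y − Xw||² + lam·||w||₁. It is not the skglm API and it leaves out the working sets and Anderson acceleration that the abstract names; the function names, the toy data and the value of `lam` are assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink `value` toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Plain cyclic coordinate descent for the Lasso (illustrative only)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, since w starts at zero
    # Mean squared norm of each column; it scales every coordinate update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature never enters the model
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Hypothetical toy data: feature 0 carries signal, feature 1 is close to noise.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
weights = lasso_cd(X, y, lam=0.1)
print(weights)
```

On this toy problem the penalty sets the second coefficient exactly to zero, which is the sparsity effect that separable penalties are meant to produce; other penalties would replace `soft_threshold` with their own proximal step.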
How do risk attitudes affect measured confidence?
We examine the relationship between confidence in own absolute performance and risk attitudes using two confidence elicitation procedures: self-reported (non-incentivised) confidence and an incentivised procedure that elicits the certainty equivalent of a bet based on performance. The former procedure reproduces the “hard-easy effect” (underconfidence in easy tasks and overconfidence in hard tasks) found in a large number of studies using non-incentivised self-reports. The latter procedure produces general underconfidence, which is significantly reduced, but not eliminated when we filter out the effects of risk attitudes. Finally, we find that self-reported confidence correlates significantly with features of individual risk attitudes including parameters of individual probability weighting
Interactive visual exploration of association rules with rule-focusing methodology
On account of the enormous amounts of rules that can be produced by data mining algorithms, knowledge post-processing is a difficult stage in an association rule discovery process. In order to find relevant knowledge for decision making, the user (a decision maker specialized in the data studied) needs to rummage through the rules. To assist him/her in this task, we here propose the rule-focusing methodology, an interactive methodology for the visual post-processing of association rules. It allows the user to explore large sets of rules freely by focusing his/her attention on limited subsets. This new approach relies on rule interestingness measures, on a visual representation, and on interactive navigation among the rules. We have implemented the rule-focusing methodology in a prototype system called ARVis. It exploits the user's focus to guide the generation of the rules by means of a specific constraint-based rule-mining algorithm
A COMPREHENSIVE GEOSPATIAL KNOWLEDGE DISCOVERY FRAMEWORK FOR SPATIAL ASSOCIATION RULE MINING
Continuous advances in modern data collection techniques help spatial scientists gain access to massive and high-resolution spatial and spatio-temporal data. Thus there is an urgent need to develop effective and efficient methods seeking to find unknown and useful information embedded in big-data datasets of unprecedentedly large size (e.g., millions of observations), high dimensionality (e.g., hundreds of variables), and complexity (e.g., heterogeneous data sources, space–time dynamics, multivariate connections, explicit and implicit spatial relations and interactions). Responding to this line of development, this research focuses on the utilization of the association rule (AR) mining technique for a geospatial knowledge discovery process.
Prior attempts have sidestepped the complexity of the spatial dependence structure embedded in the studied phenomenon. Thus, adopting association rule mining in spatial analysis is rather problematic. Interestingly, a very similar predicament afflicts spatial regression analysis with a spatial weight matrix that would be assigned a priori, without validation on the specific domain of application. Besides, a dependable geospatial knowledge discovery process necessitates algorithms supporting automatic and robust but accurate procedures for the evaluation of mined results. Surprisingly, this has received little attention in the context of spatial association rule mining.
To remedy the deficiencies mentioned above, the foremost goal of this research is to construct a comprehensive geospatial knowledge discovery framework using spatial association rule mining for the detection of spatial patterns embedded in geospatial databases and to demonstrate its application within the domain of crime analysis. It is the first attempt at delivering a complete geospatial knowledge discovery framework using spatial association rule mining
An intersectional framework for counterfactual fairness in risk prediction
Along with the increasing availability of data in many sectors has come the
rise of data-driven models to inform decision-making and policy. In the health
care sector, these models have the potential to benefit both patients and
health care providers but can also entrench or exacerbate health inequities.
Existing "algorithmic fairness" methods for measuring and correcting model bias
fall short of what is needed for clinical applications in two key ways. First,
methods typically focus on a single grouping along which discrimination may
occur rather than considering multiple, intersecting groups such as gender and
race. Second, in clinical applications, risk prediction is typically used to
guide treatment, and use of a treatment presents distinct statistical issues
that invalidate most existing fairness measurement techniques. We present novel
unfairness metrics that address both of these challenges. We also develop a
complete framework of estimation and inference tools for our metrics, including
the unfairness value ("u-value"), used to determine the relative extremity of
an unfairness measurement, and standard errors and confidence intervals
employing an alternative to the standard bootstrap
Free water imaging of the cholinergic system in dementia with Lewy bodies and Alzheimer's disease
INTRODUCTION: Degeneration of cortical cholinergic projections from the nucleus basalis of Meynert (NBM) is characteristic of dementia with Lewy bodies (DLB) and Alzheimer's disease (AD), whereas involvement of cholinergic projections from the pedunculopontine nucleus (PPN) to the thalamus is less clear. METHODS: We studied both cholinergic projection systems using a free water-corrected diffusion tensor imaging (DTI) model in the following cases: 46 AD, 48 DLB, 35 mild cognitive impairment (MCI) with AD, 38 MCI with Lewy bodies, and 71 controls. RESULTS: Free water in the NBM-cortical pathway was increased in both dementia and MCI groups compared to controls and associated with cognition. Free water along the PPN-thalamus tract was increased only in DLB and related to visual hallucinations. Results were largely replicated in an independent cohort. DISCUSSION: While NBM-cortical projections degenerate early in AD and DLB, the thalamic cholinergic input from the PPN appears to be more selectively affected in DLB and might associate with visual hallucinations. Highlights: Free water in the NBM-cortical cholinergic pathways is increased in AD and DLB. NBM-cortical pathway integrity is related to overall cognitive performance. Free water in the PPN-thalamus cholinergic pathway is only increased in DLB, not AD. PPN-thalamus pathway integrity might be related to visual hallucinations in DLB
Maximally Machine-Learnable Portfolios
When it comes to stock returns, any form of predictability can bolster
risk-adjusted profitability. We develop a collaborative machine learning
algorithm that optimizes portfolio weights so that the resulting synthetic
security is maximally predictable. Precisely, we introduce MACE, a multivariate
extension of Alternating Conditional Expectations that achieves the
aforementioned goal by wielding a Random Forest on one side of the equation,
and a constrained Ridge Regression on the other. There are two key improvements
with respect to Lo and MacKinlay's original maximally predictable portfolio
approach. First, it accommodates any (nonlinear) forecasting algorithm and
predictor set. Second, it handles large portfolios. We conduct exercises at the
daily and monthly frequency and report significant increases in predictability
and profitability using very little conditioning information. Interestingly,
predictability is found in bad as well as good times, and MACE successfully
navigates the debacle of 2022