Improving the cost effectiveness equation of cascade testing for Familial Hypercholesterolaemia (FH)
Purpose of Review: Many international recommendations for the management of Familial Hypercholesterolaemia (FH) propose the use of Cascade Testing (CT) using the family mutation to unambiguously identify affected relatives. In the current economic climate, DNA information is often regarded as too expensive. Here we review the literature and suggest strategies to improve the cost effectiveness of CT. Recent findings: Advances in next-generation sequencing have both shortened the time taken for a genetic diagnosis and reduced costs. It is also now clear that, in the majority of patients with a clinical diagnosis of FH in whom no mutation can be found, the most likely cause of elevated LDL-cholesterol (LDL-C) is that they have inherited a greater than average number of common LDL-C-raising variants in many different genes. The major cost driver for CT is not DNA testing but the cost of treatment over the remaining lifetime of the identified relative. With potent statins now off-patent, the overall cost has fallen considerably; combining these three factors, the cost of an FH service based around DNA-CT is now less than 25% of that estimated by NICE in 2009. Summary: While all patients with a clinical diagnosis of FH need to have their LDL-C lowered, CT should be focused on those with the monogenic form and not the polygenic form.
FGC: an efficient constraint-based frequent set miner
Despite advances in algorithmic design, association rule mining remains problematic from a performance viewpoint when the underlying transaction database is large. The well-known Apriori approach, while reducing the computational effort involved, still suffers from poor scalability due to its reliance on generating candidate itemsets. In this paper we present a novel approach that combines the power of preprocessing with the application of user-defined constraints to prune the itemset space prior to building a compact FP-tree. Experimentation shows that our algorithm significantly outperforms the current state-of-the-art algorithm, FP-bonsai.
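The idea of pushing a user-defined constraint in front of candidate generation can be sketched as follows. This is a minimal Apriori-style illustration, not the FP-tree-based FGC algorithm itself; the `keep_item` parameter is a hypothetical stand-in for the paper's constraints, and the classic all-subsets pruning step is omitted for brevity.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support, keep_item=lambda i: True):
    """Level-wise frequent itemset mining with a user-defined constraint
    pushed in front of candidate generation: items failing `keep_item`
    are discarded before any itemset is formed, shrinking the search
    space up front."""
    txns = [frozenset(i for i in t if keep_item(i)) for t in transactions]
    items = sorted({i for t in txns for i in t})
    result = {}
    k = 1
    current = [frozenset([i]) for i in items]
    while current:
        counts = {c: sum(1 for t in txns if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        # join frequent k-itemsets into (k+1)-candidates
        keys = sorted(frequent, key=sorted)
        current = sorted({a | b for a, b in combinations(keys, 2)
                          if len(a | b) == k + 1}, key=sorted)
        k += 1
    return result
```

Because the constraint is applied before any candidate is formed, every level of the search benefits from the smaller item universe, which is the essence of constraint-based pruning.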
Non-redundant rare itemset generation
Rare itemsets are likely to be of great interest because they often relate to high-impact transactions which may give rise to rules of great practical significance. Research into the rare association rule mining problem has gained momentum in the recent past. In this paper, we propose a novel approach that captures such rare rules while ensuring that redundant rules are eliminated. Extensive testing on real-world datasets from the UCI repository confirms that our approach outperforms both the Apriori-Inverse (Koh et al. 2006) and Relative Support (Yun et al. 2003) algorithms.
Integration of Data Mining and Data Warehousing: a practical methodology
The ever-growing repositories of data in all fields pose new challenges to modern analytical systems. Real-world datasets, with mixed numeric and nominal variables, are difficult to analyze and require effective visual exploration that conveys the semantic relationships in the data. Traditional data mining techniques such as clustering handle only the numeric data, and little research has been carried out into clustering high-cardinality nominal variables to gain better insight into the underlying dataset. Several works in the literature have demonstrated the feasibility of integrating data mining with warehousing to discover knowledge from data. For seamless integration, the mined data has to be modeled in the form of a data warehouse schema, but schema generation is a complex manual task that requires familiarity with both the domain and warehousing. Automated schema generation techniques are required to overcome these dependencies. To fulfill the growing analytical needs and overcome the existing limitations, we propose a novel methodology that permits efficient analysis of mixed numeric and nominal data, effective visual data exploration, automatic warehouse schema generation, and integration of data mining and warehousing. The proposed methodology is evaluated through a case study on a real-world dataset. The results show that multidimensional analysis can be performed in an easier and more flexible way to discover meaningful knowledge from large datasets.
Cascade effects of load shedding in coupled networks
Intricate webs of interlinked critical infrastructures, such as the electrical grid, telecommunications, and transportation, are essential for the minimal functioning of contemporary societies and economies. Advances in Information and Communication Technology (ICT) underpin the increasing interconnectivity of these systems, which has created new vulnerabilities to hardware failure, link cuts, human error, natural disasters, physical attacks and cyber-attacks. Failure of a fraction of nodes may lead to failure of dependent nodes in the other networks. The main objective of this paper is therefore to investigate the cascade phenomena caused by load shedding between two interconnected networks, using Bak-Tang-Wiesenfeld sandpile modelling. We found that large avalanches occur when the node degree and/or the interconnectivity links become dense. In addition, coupled random-regular networks were found to be more robust than coupled Erdos-Renyi networks. However, coupled random-regular networks are vulnerable to random attack and coupled Erdos-Renyi networks are vulnerable to targeted attack, due to their degree distributions.
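The Bak-Tang-Wiesenfeld toppling rule at the heart of such models can be sketched in a few lines. This is a minimal single-avalanche sketch, not the paper's coupled-network load-shedding model: it assumes each node's capacity equals its degree, and it uses hypothetical `None` entries in the adjacency lists as dissipative boundary links (grains leaving the system), which is what guarantees the avalanche terminates.

```python
def sandpile_cascade(adj, load, start):
    """One Bak-Tang-Wiesenfeld avalanche: dropping a grain on `start` may
    topple nodes whose load reaches their degree; each toppling node sheds
    one grain along every incident link. Returns the avalanche size
    (number of topplings) and the stabilised loads."""
    load = dict(load)
    load[start] += 1
    toppled = 0
    unstable = [n for n in adj if load[n] >= len(adj[n])]
    while unstable:
        n = unstable.pop()
        deg = len(adj[n])
        if load[n] < deg:          # may have been queued more than once
            continue
        load[n] -= deg
        toppled += 1
        for m in adj[n]:
            if m is None:          # grain dissipates at the boundary
                continue
            load[m] += 1
            if load[m] >= len(adj[m]):
                unstable.append(m)
        if load[n] >= deg:         # node may still be unstable
            unstable.append(n)
    return toppled, load
```

In a two-network setting, the interconnectivity links simply appear as extra entries in the adjacency lists, so a toppling in one network can push a neighbour in the other network past its capacity, producing the cross-network cascades studied in the paper.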
Measuring cascade effects in interdependent networks by using effective graph resistance
Understanding the correlation between the underlying network structure and the cascade effects that play out over it is one of the major challenges in complex network studies. Several existing metrics can be used to measure cascades; however, different metrics, such as average node degree, capture different characteristics of the network topology, and few metrics have been identified that effectively measure cascading performance in interdependent networks. In this paper, we propose to model the interdependent networks and their interconnectivity with a combined Laplacian matrix, and then use its effective graph resistance as an indicator of cascading behaviour. Moreover, we have conducted extensive comparative studies between metrics such as average node degree and the proposed effective resistance. We have found that the effective resistance metric describes the topological structure of interdependent networks more accurately and at a finer granularity than average node degree, which is widely adopted in existing studies for measuring cascading performance in interdependent networks.
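The effective graph resistance has a standard spectral form: for a graph on N nodes it equals N times the sum of the reciprocals of the nonzero eigenvalues of the Laplacian L = D - A. A minimal sketch of that computation is below; for interdependent networks, the paper's combined Laplacian would be obtained by assembling the adjacency matrices of both networks plus their interconnection links into one matrix before applying the same recipe.

```python
import numpy as np

def effective_graph_resistance(adj):
    """Effective graph resistance from the Laplacian spectrum:
    R = N * sum(1 / mu_i) over the nonzero eigenvalues mu_i of
    the combinatorial Laplacian L = D - A."""
    A = np.asarray(adj, dtype=float)
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A      # combinatorial Laplacian
    mu = np.linalg.eigvalsh(L)          # real eigenvalues, ascending
    nonzero = mu[mu > 1e-9]             # drop the zero eigenvalue(s)
    return n * float(np.sum(1.0 / nonzero))
```

A lower value indicates a better-connected graph: a 3-node path gives R = 4, while closing it into a triangle (adding one link) drops R to 2, which is why the metric discriminates topologies more finely than a coarse statistic like average degree.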
Evolving integrated multi-model framework for online multiple time series prediction
Time series prediction has been extensively researched in both the statistical and computational intelligence literature, with robust methods being developed that can be applied across any given application domain. A much less researched problem is multiple time series prediction, where the objective is to simultaneously forecast the values of multiple variables that interact with each other in time-varying amounts continuously over time. In this paper we describe the use of a novel Integrated Multi-Model Framework (IMMF) that combines models developed at three different levels of data granularity, namely the Global, Local and Transductive models, to perform multiple time series prediction. The IMMF is implemented by training a neural network to assign relative weights to predictions from the models at the three different levels of data granularity. Our experimental results indicate that the IMMF significantly outperforms well-established methods of time series prediction when applied to the multiple time series prediction problem.
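The weighting step can be illustrated in miniature. The paper trains a neural network to assign the relative weights; the sketch below substitutes a linear least-squares combiner as a deliberately simple stand-in, just to show how per-model predictions (e.g. Global, Local, Transductive) are merged into one forecast.

```python
import numpy as np

def fit_combiner(preds, target):
    """Fit weights that combine several models' predictions into one
    forecast by ordinary least squares (a linear stand-in for the
    IMMF's weighting neural network)."""
    X = np.column_stack(preds)
    w, *_ = np.linalg.lstsq(X, np.asarray(target, dtype=float), rcond=None)
    return w

def combine(preds, w):
    """Apply the learned weights to a new set of model predictions."""
    return np.column_stack(preds) @ w
```

With a neural network in place of `lstsq`, the weights can also depend on the inputs themselves and be updated incrementally, which is what gives the framework its online, evolving character.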
Mining software metrics from Jazz
In this paper, we describe the extraction of source code metrics from the Jazz repository and the application of data mining techniques to identify the most useful of those metrics for predicting the success or failure of an attempt to construct a working instance of the software product. We present results from a systematic study using the J48 classification method. The results indicate that only a relatively small number of the available software metrics that we considered have any significance for predicting the outcome of a build. These significant metrics and the implications of the results are discussed, particularly the relative difficulty of predicting failed build attempts.
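The notion of a metric "having significance" for build outcomes can be made concrete with the attribute-ranking ingredient behind J48 (Weka's C4.5 implementation): information gain, i.e. how much knowing a metric's value reduces the entropy of the success/failure label. This is a minimal sketch for discretised metric values; J48 itself refines this to the gain ratio and builds a full tree.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in label entropy obtained by splitting the examples on
    one discretised metric -- the core attribute-ranking criterion
    behind C4.5/J48-style decision trees."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(labels) - remainder
```

A metric that perfectly separates failed from successful builds has a gain of 1 bit on a balanced sample, while an uninformative metric has a gain of 0 and would be excluded, mirroring the paper's finding that only a small subset of metrics carries predictive signal.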
The reduced cost of providing a nationally recognised service for familial hypercholesterolaemia
OBJECTIVE: Familial hypercholesterolaemia (FH) affects 1 in 500 people in the UK population and is associated with premature morbidity and mortality from coronary heart disease. In 2008, the National Institute for Health and Care Excellence (NICE) recommended genetic testing of potential FH index cases and cascade testing of their relatives. Commissioners have been slow to respond, although there is strong evidence of cost and clinical effectiveness. Our study quantifies the recent reduced cost of providing an FH service using generic atorvastatin and compares NICE costing estimates with three suggested alternative models of care (a specialist-led service, a dual-model service where general practitioners (GPs) can access specialist advice, and a GP-led service). METHODS: Revision of the existing 3-year costing template provided by NICE for FH services, and prediction of the costs of running a programme over 10 years. Costs were modelled for the first population-based FH service in England, which covers Southampton, Hampshire, Isle of Wight and Portsmouth (SHIP); population 1.95 million. RESULTS: With expiry of the Lipitor (Pfizer atorvastatin) patent, the cost of providing a 10-year FH service in SHIP falls by 42.5% (£4.88 million on patent vs £2.80 million off patent). Further cost reductions are possible as a result of the reduced cost of DNA testing, more management in general practice, and lower referral rates to specialists. For instance, a dual-care model with GP management of patients supported by specialist advice when required costs £1.89 million. CONCLUSIONS: The three alternative models of care now cost less than 50% of the original estimates undertaken by NICE.
Dynamic Interaction Networks in modelling and predicting the behaviour of multiple interactive stock markets
The behaviour of multiple stock markets can be described within the framework of complex dynamic systems. A representative technique of the framework is the dynamic interaction network (DIN), recently developed in the bioinformatics domain. DINs are capable of modelling dynamic interactions between genes and predicting their future expressions. In this paper, we adopt a DIN approach to extract and model interactions between stock markets. The network is further able to learn online and updates incrementally with the unfolding of the stock market time-series. The approach is applied to a case study involving 10 market indexes in the Asia Pacific region. The results show that the DIN model reveals important and complex dynamic relationships between stock markets, demonstrating the ability of complex dynamic systems approaches to go beyond the scope of traditional statistical methods