5,288 research outputs found

A review of associative classification mining

Author: Thabtah Fadi
Publication date: 01/01/2007

Associative classification mining is a promising approach in data mining that uses association rule discovery techniques to construct classification systems, also known as associative classifiers. In the last few years, a number of associative classification algorithms have been proposed, e.g., CPAR, CMAR, MCAR, MMAC and others. These algorithms employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule evaluation methods. This paper focuses on surveying and comparing the state-of-the-art associative classification techniques with regard to the above criteria. Finally, future directions in associative classification, such as incremental learning and mining low-quality data sets, are also highlighted in this paper.
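
The rule discovery, ranking and prediction steps named in this abstract can be illustrated with a minimal sketch. It is not any of the surveyed algorithms (CPAR, CMAR, MCAR, MMAC); it assumes a brute-force enumeration of short itemsets, a confidence-then-support ranking, and first-match prediction. The function names `mine_class_rules` and `predict` and all thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(rows, labels, min_support=0.3, min_confidence=0.7, max_len=2):
    """Return (itemset, label, support, confidence) rules, best first.

    Brute-force sketch: enumerates every itemset up to max_len items,
    so it is only suitable for tiny examples.
    """
    n = len(rows)
    itemset_counts = Counter()  # how often each itemset occurs
    rule_counts = Counter()     # how often each itemset occurs with each label
    for items, label in zip(rows, labels):
        for size in range(1, max_len + 1):
            for itemset in combinations(sorted(items), size):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1

    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))

    # Assumed ranking: higher confidence, then higher support, then shorter rule.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def predict(rules, items, default):
    """Classify with the first (best-ranked) rule whose itemset matches."""
    items = set(items)
    for itemset, label, _, _ in rules:
        if items.issuperset(itemset):
            return label
    return default


if __name__ == "__main__":
    rows = [{"sunny", "hot"}, {"sunny", "mild"}, {"rainy", "mild"}, {"rainy", "cool"}]
    labels = ["no", "no", "yes", "yes"]
    rules = mine_class_rules(rows, labels)
    print(rules)
    print(predict(rules, {"rainy", "hot"}, default="no"))
```

Rule pruning, which the survey treats as a separate criterion, is reduced here to the two thresholds; real associative classifiers typically add further pruning such as database coverage.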

CiteSeerX

University of Huddersfield Repository

Leveraging the explainability of associative classifiers to support quantitative stock trading

Authors: Attanasio Giuseppe; Baralis Elena; Cagliero Luca
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2020

Forecasting the stock market is particularly challenging due to the presence of a variety of inter-related economic and political factors. In recent years, the application of Machine Learning algorithms in quantitative stock trading systems has become established, as it enables a data-driven approach to investing in the financial markets. However, most professional traders still look for an explanation of automatically generated signals to verify their adherence to technical and fundamental rules. This paper presents an explainable approach to stock trading. It investigates the use of classification rules, which represent reliable associations between a set of discrete indicator values and the target class, to address next-day stock price prediction. Adopting associative classifiers in short-term stock trading not only provides reliable signals but also allows domain experts to understand the rationale behind signal generation. The backtesting of a state-of-the-art associative classifier, relying on a lazy pruning strategy, has shown promising performance in terms of equity appreciation and robustness of the trading system to market drawdowns.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Cocycle Twists and Extensions of Braided Doubles

Authors: Bazlov Yuri; Berenstein Arkady
Publication date: 24/11/2012

It is well known that central extensions of a group G correspond to 2-cocycles on G. Cocycles can be used to construct extensions of G-graded algebras via a version of the Drinfeld twist introduced by Majid. We show how 2-cocycles can be defined for an abstract monoidal category C, following Panaite, Staic and Van Oystaeyen. A braiding on C leads to analogues of Nichols algebras in C, and we explain how the recent work on twists of Nichols algebras by Andruskiewitsch, Fantino, Garcia and Vendramin fits in this context. Furthermore, we propose an approach to twisting the multiplication in braided doubles, which are a class of algebras with triangular decomposition over G. Braided doubles are not G-graded, but may be embedded in a double of a Nichols algebra, where a twist may be carried out if careful choices are made. This is a source of new algebras with triangular decomposition. As an example, we show how to twist the rational Cherednik algebra of the symmetric group by the cocycle arising from the Schur covering group, obtaining the spin Cherednik algebra introduced by Wang.

Comment: 60 pages, LaTeX; v2: references added, misprints corrected
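
For orientation, the cocycle twist of a G-graded algebra mentioned in this abstract can be written out in its standard textbook form. The notation below is an assumption and may differ from the paper's conventions: $\mu$ is a 2-cocycle on $G$ with values in the nonzero scalars $\Bbbk^{\times}$ (trivial action), $A = \bigoplus_{g \in G} A_g$ is a $G$-graded algebra, and $\star$ denotes the twisted product.

```latex
% 2-cocycle condition (multiplicative form, trivial G-action on the scalars)
\mu(g,h)\,\mu(gh,k) \;=\; \mu(h,k)\,\mu(g,hk), \qquad g,h,k \in G.

% Twisted multiplication on homogeneous elements a \in A_g, b \in A_h
a \star b \;=\; \mu(g,h)\, ab.

% Associativity of \star for a \in A_g, b \in A_h, c \in A_k
(a \star b) \star c \;=\; \mu(g,h)\,\mu(gh,k)\, abc
\;=\; \mu(h,k)\,\mu(g,hk)\, abc \;=\; a \star (b \star c).
```

The middle equality in the last line is exactly the cocycle condition, which is why a 2-cocycle is what is needed for the twisted product to remain associative.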

arXiv.org e-Print Archive

Crossref

MIMS EPrints

Speech Recognition by Composition of Weighted Finite Automata

Authors: Pereira Fernando C. N.; Riley Michael D.
Publication date: 01/01/1996

We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can be used for combining information sources in actual recognizers and for optimizing their application. In particular, a single composition algorithm is used both to combine in advance information sources such as language models and dictionaries, and to combine acoustic observations and information sources dynamically during recognition.

Comment: 24 pages, uses psfig.st
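
The composition operation at the center of this abstract can be sketched for a simplified case. The code below is not the authors' algorithm: it assumes epsilon-free transducers, weights in the tropical semiring (weights add along a path), and a hypothetical dictionary layout with keys `start`, `finals` and `arcs`.

```python
from collections import deque


def compose(a, b):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Each machine is a dict with:
      'start'  : start state
      'finals' : {state: final weight}
      'arcs'   : {state: [(in_sym, out_sym, weight, next_state), ...]}
    The result maps a's input symbols to b's output symbols.
    """
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        arcs[state] = []
        # A pair state is final only if both components are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for in_a, out_a, w_a, next_a in a["arcs"].get(qa, []):
            for in_b, out_b, w_b, next_b in b["arcs"].get(qb, []):
                if out_a != in_b:
                    continue  # a's output must feed b's input
                nxt = (next_a, next_b)
                arcs[state].append((in_a, out_b, w_a + w_b, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


if __name__ == "__main__":
    # Toy machines: a maps "a" to "x"; b maps "x" to "Y".
    a = {"start": 0, "finals": {1: 0.0},
         "arcs": {0: [("a", "x", 1.0, 1), ("b", "z", 2.0, 1)]}}
    b = {"start": 0, "finals": {1: 0.0},
         "arcs": {0: [("x", "Y", 0.5, 1)]}}
    print(compose(a, b))
```

Only state pairs reachable from the start pair are built, which is what makes it plausible to apply the same operation lazily during recognition; handling epsilon transitions correctly requires an extra filter that this sketch omits.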

arXiv.org e-Print Archive

CiteSeerX

What Causes My Test Alarm? Automatic Cause Analysis for Test Alarms in System and Integration Testing

Authors: Jiang He; Li Xiaochen; Xuan Jifeng; Yang Zijiang
Publication date: 02/03/2017

Driven by new software development processes and testing in clouds, system and integration testing nowadays tends to produce an enormous number of alarms. Such test alarms lay an almost unbearable burden on software testing engineers who have to manually analyze the causes of these alarms. The causes are critical because they decide which stakeholders are responsible for fixing the bugs detected during the testing. In this paper, we present a novel approach that aims to relieve the burden by automating the procedure. Our approach, called Cause Analysis Model, exploits information retrieval techniques to efficiently infer test alarm causes based on test logs. We have developed a prototype and evaluated our tool on two industrial datasets with more than 14,000 test alarms. Experiments on the two datasets show that our tool achieves an accuracy of 58.3% and 65.8%, respectively, which outperforms the baseline algorithms by up to 13.3%. Our algorithm is also extremely efficient, spending about 0.1s per cause analysis. Due to the attractive experimental results, our industrial partner, a leading information and communication technology company in the world, has deployed the tool and it achieves an average accuracy of 72% after two months of running, nearly three times more accurate than a previous strategy based on regular expressions.

Comment: 12 pages
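
The general idea of inferring an alarm cause from test logs with information retrieval can be sketched as a nearest-neighbor lookup. This is not the paper's Cause Analysis Model: it assumes whitespace tokenization, a smoothed TF-IDF weighting, and cosine similarity against already-labeled historical logs. The function names, example logs and cause labels are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector per document plus the IDF table."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    # Smoothed IDF so that terms found in every document keep a small weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
               for tokens in tokenized]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_cause(history, new_log):
    """history: list of (log_text, cause). Returns the cause of the closest log."""
    vectors, idf = tfidf_vectors([log for log, _ in history])
    # Terms never seen in the history get weight 0 in the query vector.
    query = {t: tf * idf.get(t, 0.0)
             for t, tf in Counter(new_log.lower().split()).items()}
    best = max(range(len(history)), key=lambda i: cosine(query, vectors[i]))
    return history[best][1]


if __name__ == "__main__":
    history = [
        ("connection timeout to database server", "environment issue"),
        ("assertion failed expected 200 got 500", "product code defect"),
        ("script syntax error in test case", "test script defect"),
    ]
    print(predict_cause(history, "timeout while opening database connection"))
```

A production system would need log-specific preprocessing (timestamps, identifiers, stack traces) before term weighting; the sketch skips that step entirely.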

arXiv.org e-Print Archive

Crossref

BAC: A bagged associative classifier for big data frameworks

Authors: Apiletti Daniele; Garza Paolo; Venturini Luca
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Big Data frameworks allow powerful distributed computations extending the results achievable on a single machine. In this work, we present a novel distributed associative classifier, named BAC, based on ensemble techniques. Ensembles are a popular approach that builds several models on different subsets of the original dataset, eventually voting to provide a unique classification outcome. Preliminary experiments on Apache Spark show that the proposed ensemble classifier achieves a quality comparable with the single-machine version on popular real-world datasets, and overcomes the scalability limits of that version on large synthetic datasets.
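
The ensemble scheme this abstract describes (several models built on different subsets, then a vote) can be sketched on a single machine. This is not BAC and involves no Apache Spark: it assumes bootstrap-style sampling with replacement, plain majority voting, and a toy base learner. The names `train_bagged`, `vote` and `majority_per_item` are hypothetical.

```python
import random
from collections import Counter


def train_bagged(train_fn, rows, labels, n_models=5, sample_frac=0.8, seed=0):
    """Train n_models base models, each on a random sample drawn with replacement."""
    rng = random.Random(seed)
    size = max(1, int(sample_frac * len(rows)))
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(rows)) for _ in range(size)]
        models.append(train_fn([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def vote(models, row):
    """Return the label predicted by the largest number of models."""
    return Counter(model(row) for model in models).most_common(1)[0][0]


def majority_per_item(rows, labels):
    """Toy base learner: each item votes for the class it co-occurs with most."""
    by_item = {}
    for items, label in zip(rows, labels):
        for item in items:
            by_item.setdefault(item, Counter())[label] += 1
    default = Counter(labels).most_common(1)[0][0]

    def model(items):
        votes = Counter()
        for item in items:
            if item in by_item:
                votes[by_item[item].most_common(1)[0][0]] += 1
        return votes.most_common(1)[0][0] if votes else default

    return model


if __name__ == "__main__":
    rows = [{"a"}, {"a"}, {"b"}, {"b"}]
    labels = ["x", "x", "y", "y"]
    models = train_bagged(majority_per_item, rows, labels)
    print(vote(models, {"b"}))
```

In a distributed setting each base model could be trained on a separate partition; how BAC partitions the data and merges the models is not specified in the abstract, so the sketch makes no claim about it.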

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

I-prune: Item selection for associative classification

Authors: Baralis; Coenen; Coenen; Guyon; Hall; Li; Quinlan; Rak; Tan; Wang; Wang; Zaïane
Publication venue: John Wiley & Sons, Inc.
Publication date: 01/01/2012

Associative classification is characterized by accurate models and high model generation time. Most time is spent in extracting and postprocessing a large set of irrelevant rules, which are eventually pruned. We propose I-prune, an item-pruning approach that selects uninteresting items by means of an interestingness measure and prunes them as soon as they are detected. Thus, the number of extracted rules is reduced and model generation time decreases correspondingly. A wide set of experiments on real and synthetic data sets has been performed to evaluate I-prune and select the appropriate interestingness measure. The experimental results show that I-prune allows a significant reduction in model generation time, while increasing (or at worst preserving) model accuracy. Experimental evaluation also points to the chi-square measure as the most effective interestingness measure for item pruning.
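
Item pruning with the chi-square measure, which this abstract singles out, can be sketched as a filter applied before rule extraction. This is not the I-prune implementation, which prunes items during extraction: the sketch assumes a Pearson chi-square statistic on the item-versus-class contingency table and an illustrative threshold of 3.84 (the 0.05 critical value for one degree of freedom, which applies to two classes). The function names are hypothetical.

```python
from collections import Counter


def chi_square(rows, labels, item):
    """Pearson chi-square between the presence of `item` and the class label."""
    n = len(rows)
    class_totals = Counter(labels)
    with_item = Counter(label for items, label in zip(rows, labels) if item in items)
    n_item = sum(with_item.values())
    score = 0.0
    for label, n_class in class_totals.items():
        cells = ((n_item, with_item[label]),                    # item present
                 (n - n_item, n_class - with_item[label]))      # item absent
        for row_total, observed in cells:
            expected = row_total * n_class / n
            if expected > 0:
                score += (observed - expected) ** 2 / expected
    return score


def prune_items(rows, labels, threshold=3.84):
    """Drop every item whose chi-square score falls below the threshold."""
    all_items = set().union(*rows)
    keep = {i for i in all_items if chi_square(rows, labels, i) >= threshold}
    return [set(items) & keep for items in rows]


if __name__ == "__main__":
    # "a" and "b" separate the classes perfectly; "n" is spread evenly (noise).
    rows = [{"a", "n"}, {"a"}, {"a", "n"}, {"a"},
            {"b", "n"}, {"b"}, {"b", "n"}, {"b"}]
    labels = ["x"] * 4 + ["y"] * 4
    print(chi_square(rows, labels, "a"), chi_square(rows, labels, "n"))
    print(prune_items(rows, labels))
```

Removing the uninformative item shrinks every transaction, so fewer candidate itemsets (and therefore fewer rules) have to be generated afterwards, which is the effect on model generation time that the abstract reports.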

Crossref

Archivio istituzionale della ricerca - Politecnico di Milano

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino