7,607 research outputs found

    New probabilistic interest measures for association rules

    Mining association rules is an important technique for discovering meaningful patterns in transaction databases. Many different measures of interestingness have been proposed for association rules. However, these measures fail to take the probabilistic properties of the mined data into account. In this paper, we start by presenting a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. We use such data and a real-world database from a grocery outlet to explore the behavior of confidence and lift, two popular interest measures used for rule mining. The results show that confidence is systematically influenced by the frequency of the items in the left-hand side of rules and that lift performs poorly at filtering random noise in transaction data. Based on the probabilistic framework, we develop two new interest measures, hyper-lift and hyper-confidence, which can be used to filter or order mined association rules. The new measures show significantly better performance than lift for applications where spurious rules are problematic.
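
    For reference, the two measures discussed above can be computed directly from transaction data. The following is a minimal R sketch (item names, probabilities, and data are illustrative only) that evaluates support, confidence, and lift for a single rule {butter} => {bread}; since the two items are generated independently, any lift away from 1 here is exactly the kind of random noise the new measures are meant to filter.

```r
## Illustrative only: support, confidence and lift for a rule X -> Y,
## computed from a simulated 0/1 transaction-by-item incidence matrix.
set.seed(1)
m <- 1000                                   # number of transactions
trans <- cbind(
  butter = rbinom(m, 1, 0.20),              # items drawn independently, so
  bread  = rbinom(m, 1, 0.35)               # there is no true association
)

supp_X  <- mean(trans[, "butter"] == 1)
supp_Y  <- mean(trans[, "bread"]  == 1)
supp_XY <- mean(trans[, "butter"] == 1 & trans[, "bread"] == 1)

confidence <- supp_XY / supp_X              # P(Y | X)
lift       <- supp_XY / (supp_X * supp_Y)   # P(X,Y) / (P(X) P(Y))

c(support = supp_XY, confidence = confidence, lift = lift)
```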

    Computing Real Roots of Real Polynomials

    Computing the roots of a univariate polynomial is a fundamental and long-studied problem of computational algebra with applications in mathematics, engineering, computer science, and the natural sciences. For isolating as well as for approximating all complex roots, the best algorithm known is based on an almost optimal method for approximate polynomial factorization, introduced by Pan in 2002. Pan's factorization algorithm goes back to the splitting circle method introduced by Schoenhage in 1982. The main drawbacks of Pan's method are that it is quite involved and that all roots have to be computed at the same time. For the important special case where only the real roots have to be computed, much simpler methods are used in practice; however, they considerably lag behind Pan's method with respect to complexity. In this paper, we resolve this discrepancy by introducing a hybrid of the Descartes method and Newton iteration, denoted ANEWDSC, which is simpler than Pan's method but achieves a run-time comparable to it. Our algorithm computes isolating intervals for the real roots of any real square-free polynomial, given by an oracle that provides arbitrarily good approximations of the polynomial's coefficients. ANEWDSC can also be used to isolate only the roots in a given interval and to refine the isolating intervals to an arbitrarily small size; it achieves near-optimal complexity for the latter task.
    Comment: to appear in the Journal of Symbolic Computation.
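
    The Descartes part of such a hybrid can be illustrated compactly. The R sketch below is emphatically not ANEWDSC and not Pan's method: it is a toy bisection isolator driven by the classical sign-variation (Descartes) test, with none of the paper's ingredients (coefficient oracle, approximate arithmetic, Newton acceleration), and all function names are ours.

```r
## Toy real-root isolation by bisection plus Descartes' rule of signs.
compose_linear <- function(p, a, h) {
  # coefficients of p(a + h*x), with p given as c(p0, p1, ..., pn) (Horner)
  if (length(p) < 2) return(p)
  res <- p[length(p)]
  for (i in (length(p) - 1):1) {
    res <- a * c(res, 0) + h * c(0, res)    # res <- res * (a + h*x)
    res[1] <- res[1] + p[i]
  }
  res
}

descartes_count <- function(p, a, b) {
  # sign variations of (1+x)^n * q(1/(1+x)) with q(x) = p(a + (b-a)*x):
  # an upper bound on the number of roots of p in (a, b), exact if 0 or 1
  q <- compose_linear(p, a, b - a)          # maps (0, 1) onto (a, b)
  r <- compose_linear(rev(q), 1, 1)
  s <- sign(r[r != 0])
  sum(s[-1] != s[-length(s)])
}

isolate_roots <- function(p, lo, hi) {
  # plain bisection; p is assumed square-free with no root at a midpoint
  v <- descartes_count(p, lo, hi)
  if (v == 0) return(list())
  if (v == 1) return(list(c(lo, hi)))
  mid <- (lo + hi) / 2
  c(isolate_roots(p, lo, mid), isolate_roots(p, mid, hi))
}

isolate_roots(c(2, -3, 1), 0, 10)           # x^2 - 3x + 2: isolates roots 1 and 2
```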

    Yang-Mills streamlines and semi-classical confinement

    Semi-classical configurations in Yang-Mills theory have been derived from lattice Monte Carlo configurations using a recently proposed constrained cooling technique which is designed to preserve every Polyakov line (at any point in space-time, in any direction). Consequently, confinement was found to be sustained by the ensemble of semi-classical configurations. The existence of gluonic and fermionic near-to-zero modes was demonstrated as a precondition for a possible semi-classical expansion around the cooled configurations, as well as providing the gapless spectrum of the Dirac operator necessary for chiral symmetry breaking. The cluster structure of the topological charge of the semi-classical streamline configurations was analysed and shown to support the axial anomaly of the right size, although the structure differs from the instanton gas or liquid. Here, we present further details on the space-time structure and the time evolution of the streamline configurations.
    Comment: Invited talk presented at the conference "Quark confinement and the hadron spectrum IX", Madrid, Aug 30 - Sept 3, 2010.

    A computational algorithm for crack determination: The multiple crack case

    An algorithm for recovering a collection of linear cracks in a homogeneous electrical conductor from boundary measurements of voltages induced by specified current fluxes is developed. The technique is a variation of Newton's method and is based on taking weighted averages of the boundary data. The method also adaptively changes the applied current flux at each iteration to maintain maximum sensitivity to the estimated locations of the cracks.
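
    The abstract does not give the functionals themselves, but the iteration it describes has the familiar least-squares Newton form. The R fragment below is a hedged, generic sketch only: forward_model() is a hypothetical placeholder mapping crack parameters (e.g. segment endpoints) to predicted boundary voltages for a fixed current flux, and the paper's weighted averaging of boundary data and adaptive choice of current flux are not reproduced here.

```r
## Generic Gauss-Newton style update (illustration only, not the paper's method)
gauss_newton_step <- function(params, forward_model, measured, h = 1e-6) {
  r <- forward_model(params) - measured            # residual at current guess
  J <- sapply(seq_along(params), function(j) {     # forward-difference Jacobian
    dp <- params
    dp[j] <- dp[j] + h
    (forward_model(dp) - forward_model(params)) / h
  })
  params - drop(solve(crossprod(J), crossprod(J, r)))  # least-squares step
}
```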

    TSP--Infrastructure for the Traveling Salesperson Problem

    The traveling salesperson (or salesman) problem (TSP) is a well-known and important combinatorial optimization problem. The goal is to find the shortest tour that visits each city in a given list exactly once and then returns to the starting city. Despite this simple problem statement, solving the TSP is difficult since it belongs to the class of NP-complete problems. The importance of the TSP arises, besides its theoretical appeal, from the variety of its applications. Typical applications in operations research include vehicle routing, computer wiring, cutting wallpaper, and job sequencing. The main application in statistics is combinatorial data analysis, e.g., reordering rows and columns of data matrices or identifying clusters. In this paper, we introduce the R package TSP which provides a basic infrastructure for handling and solving the traveling salesperson problem. The package features S3 classes for specifying a TSP and its (possibly optimal) solution as well as several heuristics to find good solutions. In addition, it provides an interface to Concorde, one of the best exact TSP solvers currently available.
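
    A minimal usage sketch of the interface described above (as documented for the TSP package; available methods and defaults may differ between package versions, and the data here are random):

```r
library(TSP)

set.seed(42)
pts <- matrix(runif(20), ncol = 2)          # 10 random cities in the unit square
rownames(pts) <- paste0("city", 1:10)
tsp <- TSP(dist(pts))                       # S3 TSP object from a distance matrix

tour <- solve_TSP(tsp, method = "nearest_insertion")   # one of the heuristics
tour_length(tour)                           # length of the heuristic tour
labels(tour)                                # order in which the cities are visited

## with Concorde installed: solve_TSP(tsp, method = "concorde") for an exact tour
```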

    Implications of probabilistic data modeling for rule mining

    Mining association rules is an important technique for discovering meaningful patterns in transaction databases. In the current literature, the properties of algorithms to mine associations are discussed in great detail. In this paper we investigate properties of transaction data sets from a probabilistic point of view. We present a simple probabilistic framework for transaction data and its implementation using the R statistical computing environment. The framework can be used to simulate transaction data when no associations are present. We use such data to explore the ability of confidence and lift, two popular interest measures used for rule mining, to filter noise. Based on the framework, we develop the measure hyperlift and compare this new measure to lift using simulated data and a real-world grocery database.
    Series: Research Report Series / Department of Statistics and Mathematics
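
    A minimal sketch of such an independence model (not the paper's actual implementation; sizes and probabilities below are made up): items occur independently with item-specific probabilities, so every pairwise lift computed from the simulated data is pure noise, which is exactly the filtering problem the hyperlift measure is developed to address.

```r
## Simulate transactions with independent items and inspect pairwise lift.
set.seed(7)
m <- 5000                                   # transactions
n <- 50                                     # items
p <- runif(n, 0.01, 0.2)                    # item occurrence probabilities
trans <- sapply(p, function(prob) rbinom(m, 1, prob))   # m x n incidence matrix

supp <- colMeans(trans)                     # item supports
pair_lift <- (crossprod(trans) / m) / outer(supp, supp) # all pairwise lift values
summary(pair_lift[upper.tri(pair_lift)])    # spread of lift under independence
```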

    More Than an Academic Question: Defining Student Ownership of Intellectual Property Rights

    Intellectual property is increasingly important due to technology’s rapid development. The importance of intellectual property is also reflected within universities as traditional centers of research and expression, where students and faculty are encouraged to develop inventions and creative works throughout the educational experience. The commercialization potential of the intellectual property that emerges from these efforts has led many universities to adopt policies to determine ownership of intellectual property rights. Many of these policies take different approaches to ownership, and most students are unaware of their rights and are unlikely to consider whether the university has a claim to ownership. The purpose of this Article is to outline how intellectual property rights arise in the academic environment and to analyze how university policies determine ownership rights for students and the university. This Article concludes by urging universities and students to acknowledge the existence of these issues, adopt policies to address ownership rights, and make these policies known to members of the university community.

    Direct-to-Consumer Advertising in Pharmaceutical Markets

    We study effects of direct-to-consumer advertising (DTCA) in a market with two pharmaceutical firms providing horizontally differentiated (branded) drugs. Patients varying in their susceptibility to medication are a priori uninformed of available medication. Physicians making the prescription choice perfectly identify a patient’s most suitable drug. Firms promote drugs to physicians (detailing) to influence prescription decisions and, if allowed, to consumers (DTCA) to increase awareness of the drug. The main findings are: Firstly, firms benefit from DTCA only if prices are regulated. On the one hand, DTCA reduces the physicians’ market power and thus detailing expenses, while, on the other, it triggers price competition as a larger share of patients is aware of the alternatives. Secondly, under price regulation DTCA is welfare improving as long as the regulated price is not too high. Under price competition, DTCA is harmful to welfare unless detailing is wasteful and the drugs are poor substitutes.
    Keywords: Advertising; Pharmaceuticals; Oligopoly
