Efficient Incremental Breadth-Depth XML Event Mining

Author: Boussaïd Omar
Darmont Jérôme
Salem Rashed
Publication date: 01/01/2011

Many applications continuously log large numbers of events. Extracting interesting knowledge from logged events is an emerging, active research area in data mining. In this context, we propose an approach for mining frequent events and association rules from events logged in XML format. The approach consists of two main phases: (I) constructing a novel tree structure called the Frequency XML-based Tree (FXT), which contains the frequency of the events to be mined; (II) querying the constructed FXT using XQuery to discover frequent itemsets and association rules. The FXT is constructed in a single pass over the logged data. We implement the proposed algorithm and study various performance issues. The performance study shows that the algorithm is efficient for both constructing the FXT and discovering association rules.
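The two phases described in this abstract can be illustrated with a minimal sketch. It is not the authors' FXT or their XQuery queries: a plain Python `Counter` stands in for the frequency structure, and the log layout (`<transaction>` and `<event name="..."/>` elements), the function names, and the thresholds are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations
import xml.etree.ElementTree as ET

# Hypothetical log layout (an assumption, not taken from the paper).
LOG = """<log>
  <transaction><event name="login"/><event name="search"/><event name="buy"/></transaction>
  <transaction><event name="login"/><event name="search"/></transaction>
  <transaction><event name="login"/><event name="buy"/></transaction>
  <transaction><event name="search"/></transaction>
</log>"""

def count_itemsets(xml_text, max_size=2):
    """Phase I stand-in: one pass over the log, counting itemsets up to max_size."""
    counts = Counter()
    n = 0
    for tx in ET.fromstring(xml_text).iter("transaction"):
        # Deduplicate and sort so each itemset has one canonical tuple form.
        events = sorted({e.get("name") for e in tx.iter("event")})
        n += 1
        for k in range(1, max_size + 1):
            counts.update(combinations(events, k))
    return counts, n

def derive_rules(counts, n, min_support=0.5, min_confidence=0.6):
    """Phase II stand-in: query the counts for rules lhs -> rhs from frequent pairs."""
    out = []
    for itemset, c in counts.items():
        if len(itemset) != 2 or c / n < min_support:
            continue
        a, b = itemset
        for lhs, rhs in ((a, b), (b, a)):
            confidence = c / counts[(lhs,)]
            if confidence >= min_confidence:
                out.append((lhs, rhs, c / n, confidence))
    return sorted(out)

counts, n = count_itemsets(LOG)
found = derive_rules(counts, n)
for lhs, rhs, support, confidence in found:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

Because all frequencies are gathered in the single pass, the rule-derivation step never rereads the log. Counting every candidate itemset up front is only practical for small `max_size`; the abstract does not say how the FXT bounds this.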

arXiv.org e-Print Archive

Crossref

HAL Descartes

HAL

A customizable multi-agent system for distributed data mining

Author: Di Fatta Giuseppe
Fortino Giancarlo
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2007

We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design separates the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation was carried out on a parallel frequent subgraph mining algorithm, which showed good scalability.

Central Archive at the University of Reading

CiteSeerX

Crossref

Privacy Violation and Detection Using Pattern Mining Techniques

Author: Bhattacharya Jaijit
Chakraborti Debamitro
Dass Rajanish
Gupta S K
Kapoor Vishal

Privacy, its violations, and techniques to avoid privacy violations have taken centre stage in both academia and industry in recent months. Corporations worldwide have become conscious of the implications of privacy violation and its impact on them and on other stakeholders. Moreover, nations across the world are introducing privacy-protecting legislation to prevent data privacy violations. Such legislation, however, exposes organizations to the risk of intentional or unintentional violations of data privacy. A violation by either malicious external hackers or internal employees can expose an organization to costly litigation. In this paper, we propose PRIVDAM, a data-mining-based intelligent architecture for a Privacy Violation Detection and Monitoring system whose purpose is to detect possible privacy violations and to prevent them in the future. Experimental evaluations show that our approach is scalable and robust and that it can detect privacy violations, or the likelihood of violations, quite accurately.

Research Papers in Economics

A Global Alliance Against Forced Labour: Report I (B)

Author: International Labour Conference 93rd Session
International Labour Office
Publication venue: DigitalCommons@ILR
Publication date: 18/01/2005

Explains how the concept of forced labor is defined in international law and discusses some parameters for identifying contemporary forced labor situations in practice. Provides the first minimum global estimate by an international organization of the number of people in forced labor, broken down by geographical region and by form of forced labor. Gives a global picture of contemporary patterns of forced labor, and of action to eradicate it. Reviews the ILO’s assistance to member States for the eradication of forced labor, in view of the creation of a Special Action Programme to Combat Forced Labour. Lastly, it makes recommendations for future action.

DigitalCommons@ILR

Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling

Author: Cotta Randell
Hu Mingyang
Jiang Dan
Liao Peizhou
Publication venue: Association for Computing Machinery (ACM)
Publication date: 03/01/2019

We evaluate the impact of probabilistically-constructed digital identity data, collected from approximately Sep. to Dec. 2017, in the context of Lookalike-targeted campaigns. The backbone of this study is a large set of probabilistically-constructed "identities", represented as small bags of cookies and mobile ad identifiers with associated metadata, that are likely all owned by the same underlying user. The identity data allows us to generate "identity-based", rather than "identifier-based", user models, giving a fuller picture of the interests of the users underlying the identifiers. We employ off-policy techniques to evaluate the potential of identity-powered lookalike models without incurring the risk of allowing untested models to direct large amounts of ad spend or the large cost of performing A/B tests. We add to historical work on off-policy evaluation by noting a significant type of "finite-sample bias" that occurs for studies combining modestly-sized datasets and evaluation metrics involving rare events (e.g., conversions). We illustrate this bias using a simulation study that later informs the handling of inverse propensity weights in our analyses on real data. We demonstrate significant lift in identity-powered lookalikes versus an identity-ignorant baseline: on average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for identifiers having little data themselves, but that can be inferred to belong to users with substantial data to aggregate across identifiers. This implies that identity-powered user modeling is especially important in the context of identifiers having very short lifespans (i.e., frequently churned cookies). Our work motivates and informs the use of probabilistically-constructed identities in marketing. It also deepens the canon of examples in which off-policy learning has been employed to evaluate the complex systems of the internet economy. Comment: Accepted by WSDM 201
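To illustrate why inverse propensity weights need careful handling when the outcome is rare, here is a minimal sketch of a standard inverse-propensity-scoring (IPS) estimate. It is not the authors' estimator: the function name, the toy data, and the optional weight clipping (a commonly used variance-control device that the abstract does not mention) are all assumptions made for illustration.

```python
def ips_estimate(rewards, target_probs, logging_probs, clip=None):
    """Standard IPS estimate of the mean reward under a target policy.

    rewards[i]       -- observed outcome (e.g. 1 for a conversion, 0 otherwise)
    target_probs[i]  -- probability the target policy takes the logged action
    logging_probs[i] -- probability the logging policy took it (must be > 0)
    clip             -- optional cap on each weight (an assumption, see lead-in)
    """
    total = 0.0
    for r, p_new, p_old in zip(rewards, target_probs, logging_probs):
        weight = p_new / p_old
        if clip is not None:
            weight = min(weight, clip)
        total += weight * r
    return total / len(rewards)

# Made-up logged data: only two conversions in eight impressions,
# one of them logged with a low propensity (0.1).
rewards = [0, 0, 1, 0, 0, 0, 0, 1]
logging_probs = [0.5, 0.5, 0.1, 0.5, 0.5, 0.5, 0.5, 0.5]
target_probs = [0.5] * 8

unclipped = ips_estimate(rewards, target_probs, logging_probs)
clipped = ips_estimate(rewards, target_probs, logging_probs, clip=2.0)
print(unclipped, clipped)
```

In this toy data a single low-propensity conversion carries a weight of 5 and dominates the unclipped estimate, which shows how sensitive such estimates can be when the dataset is modest and the events are rare.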

arXiv.org e-Print Archive

Crossref

Human Trafficking in Europe: An Economic Perspective

Author: InFocus Programme on Promoting the Declaration on Fundamental Principles and Rights at Work
Van Liemt Gijsbert
Publication venue: DigitalCommons@ILR
Publication date: 01/06/2004

Based on a document originally prepared for the Eleventh Economic Forum of the Organization for Security and Co-operation in Europe, held in Prague from 20 to 23 May 2003. Attempts to comprehend and document human trafficking’s underlying economic dimensions, and places the concerns of trafficking within broader migration analysis (including the role of irregular migration). It also comments on the financial flows involved in trafficking, and on the different patterns of financing trafficking services. Further, it contains a brief review of the evidence on the extent to which organized crime is involved in human trafficking.

DigitalCommons@ILR

Compilation of Reports from the Conference on Trafficking of Human Beings and Migration: A human rights approach

Author: Anti-Slavery International
Publication venue: DigitalCommons@ILR
Publication date: 01/01/2005

This document is part of a digital collection provided by the Martin P. Catherwood Library, ILR School, Cornell University, pertaining to the effects of globalization on the workplace worldwide. Special emphasis is placed on labor rights, working conditions, labor market changes, and union organizing.

DigitalCommons@ILR

eCommons@Cornell

Understanding the Policy Context of Hiring, Human Trafficking and Modern-Day Slavery: A Brief for Responsible Business

Author: Verité
Publication venue: DigitalCommons@ILR
Publication date: 01/01/2010

This document is part of a digital collection provided by the Martin P. Catherwood Library, ILR School, Cornell University, pertaining to the effects of globalization on the workplace worldwide. Special emphasis is placed on labor rights, working conditions, labor market changes, and union organizing.

DigitalCommons@ILR

eCommons@Cornell