Graph-based Modelling of Concurrent Sequential Patterns
Structural relation patterns have been introduced recently to extend the search for complex patterns often hidden behind large sequences of data. This has motivated a novel approach to sequential patterns post-processing, and a corresponding data mining method was proposed for Concurrent Sequential Patterns (ConSP). This article refines the approach in the context of ConSP modelling, where a companion graph-based model is devised as an extension of previous work. Two new modelling methods are presented here together with a construction algorithm, to complete the transformation of concurrent sequential patterns to a ConSP-Graph representation. Customer orders data is used to demonstrate the effectiveness of ConSP mining, while synthetic sample data highlights the strength of the modelling technique, illuminating the theories developed.
Improving lifecycle query in integrated toolchains using linked data and MQTT-based data warehousing
The development of increasingly complex IoT systems requires large
engineering environments. These environments generally consist of tools from
different vendors and are not necessarily integrated well with each other. In
order to automate various analyses, queries across resources from multiple
tools have to be executed in parallel to the engineering activities. In this
paper, we identify the necessary requirements on such a query capability and
evaluate different architectures according to these requirements. We propose an
improved lifecycle query architecture, which builds upon the existing Tracked
Resource Set (TRS) protocol, and complements it with the MQTT messaging
protocol in order to allow the data in the warehouse to be kept updated in
real-time. As part of the case study focusing on the development of an IoT
automated warehouse, this architecture was implemented for a toolchain
integrated using RESTful microservices and linked data.
Comment: 12 pages, worksho
Privacy and Confidentiality in an e-Commerce World: Data Mining, Data Warehousing, Matching and Disclosure Limitation
The growing expanse of e-commerce and the widespread availability of online
databases raise many fears regarding loss of privacy and many statistical
challenges. Even with encryption and other nominal forms of protection for
individual databases, we still need to protect against the violation of privacy
through linkages across multiple databases. These issues parallel those that
have arisen and received some attention in the context of homeland security.
Following the events of September 11, 2001, there has been heightened attention
in the United States and elsewhere to the use of multiple government and
private databases for the identification of possible perpetrators of future
attacks, as well as an unprecedented expansion of federal government data
mining activities, many involving databases containing personal information. We
present an overview of some proposals that have surfaced for the search of
multiple databases which supposedly do not compromise possible pledges of
confidentiality to the individuals whose data are included. We also explore
their link to the related literature on privacy-preserving data mining. In
particular, we focus on the matching problem across databases and the concept
of "selective revelation" and their confidentiality implications.
Comment: Published at http://dx.doi.org/10.1214/088342306000000240 in the
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org