Integrating E-Commerce and Data Mining: Architecture and Challenges
We show that the e-commerce domain can provide all the right ingredients for
successful data mining and claim that it is a killer domain for data mining. We
describe an integrated architecture, based on our experience at Blue Martini
Software, for supporting this integration. The architecture can dramatically
reduce the pre-processing, cleaning, and data understanding effort often
documented to take 80% of the time in knowledge discovery projects. We
emphasize the need for data collection at the application server layer (not the
web server) in order to support logging of data and metadata that is essential
to the discovery process. We describe the data transformation bridges required
from the transaction processing systems and customer event streams (e.g.,
clickstreams) to the data warehouse. We detail the mining workbench, which
needs to provide multiple views of the data through reporting, data mining
algorithms, visualization, and OLAP. We conclude with a set of challenges. Comment: KDD workshop: WebKDD 200
Benchmarking Summarizability Processing in XML Warehouses with Complex Hierarchies
Business Intelligence plays an important role in decision making. Based on
data warehouses and Online Analytical Processing, a business intelligence tool
can be used to analyze complex data. Still, summarizability issues in data
warehouses cause ineffective analyses that may become critical problems to
businesses. To settle this issue, many researchers have studied and proposed
various solutions, both in relational and XML data warehouses. However, they
have difficulty evaluating the performance of their proposals since the
available benchmarks lack complex hierarchies. In order to contribute to
summarizability analysis, this paper proposes an extension to the XML warehouse
benchmark (XWeB) with complex hierarchies. The benchmark enables us to generate
XML data warehouses with scalable complex hierarchies as well as
summarizability processing. We experimentally demonstrated that complex
hierarchies can definitely be included into a benchmark dataset, and that our
benchmark is able to compare two alternative approaches dealing with
summarizability issues. Comment: 15th International Workshop on Data Warehousing and OLAP (DOLAP
2012), Maui, United States (2012)
Using Ontologies for the Design of Data Warehouses
Obtaining an implementation of a data warehouse is a complex task that forces
designers to acquire wide knowledge of the domain, thus requiring a high level
of expertise and making it a failure-prone task. Based on our experience, we
have identified a set of situations we have faced in real-world projects
in which we believe that the use of ontologies will improve several aspects of
the design of data warehouses. The aim of this article is to describe several
shortcomings of current data warehouse design approaches and discuss the
benefit of using ontologies to overcome them. This work is a starting point for
discussing the convenience of using ontologies in data warehouse design. Comment: 15 pages, 2 figures
An Intelligent Data Mining System to Detect Health Care Fraud
The chapter begins with an overview of the types of healthcare fraud. Next, there is a brief discussion of issues with current fraud detection approaches. The chapter then develops information-technology-based approaches and illustrates how these technologies can improve current practice. Finally, there is a summary of the major findings and the implications for healthcare practice.
Pattern tree-based XOLAP rollup operator for XML complex hierarchies
With the rise of XML as a standard for representing business data, XML data
warehousing appears as a suitable solution for decision-support applications.
In this context, it is necessary to allow OLAP analyses on XML data cubes.
Thus, XQuery extensions are needed. To define a formal framework and allow
much-needed performance optimizations on analytical queries expressed in
XQuery, defining an algebra is desirable. However, XML-OLAP (XOLAP) algebras
from the literature still largely rely on the relational model. Hence, we
propose in this paper a rollup operator based on a pattern tree in order to
handle multidimensional XML data expressed within complex hierarchies.
E-commerce, Warehousing and Distribution Facilities in California: A Dynamic Landscape and the Impacts on Disadvantaged Communities
This work addresses the distribution of warehouses and distribution centers (W&DCs) influenced by e-commerce, through spatial analysis and econometric modeling. Specifically, it analyzes the concentration of W&DCs in various metropolitan planning organizations (MPOs) in California between 1989 and 2016-18, and studies the spatial relationships between W&DC distribution and other demographic and environmental factors through econometric modeling techniques. The work conducts analyses to uncover common trends in W&DC distribution. The analyses used aggregate establishment, employment, and other socio-economic information, complemented with transportation-related variables. The results confirm that: 1) the weighted geometric centers of W&DCs have shifted slightly towards city central areas in all five MPOs; 2) W&DCs show a non-decreasing trend between 2008 and 2016; and 3) areas with more serious environmental problems are more likely to have W&DCs. A disaggregate analysis of properties sold and leased in one of the study regions shows a trend where businesses are buying or leasing smaller facilities, closer to the core of consumer demand. Among other factors, the growth of e-commerce sales and of expedited delivery services, which require proximity to customers, may explain these trends. The study results provide insights for planners and policy decision makers, and will be of interest to practitioners, public and private entities, and academia. Caltrans, MPOs, and affiliated institutions of the National Center for Sustainable Transportation will directly benefit from the results as they seek to avoid equity issues brought about by the rapid development of e-commerce and its potential impact on W&DC distribution.