7,807 research outputs found
Integrating E-Commerce and Data Mining: Architecture and Challenges
We show that the e-commerce domain can provide all the right ingredients for
successful data mining and claim that it is a killer domain for data mining. We
describe an integrated architecture, based on our experience at Blue Martini
Software, for supporting this integration. The architecture can dramatically
reduce the pre-processing, cleaning, and data understanding effort often
documented to take 80% of the time in knowledge discovery projects. We
emphasize the need for data collection at the application server layer (not the
web server) in order to support logging of data and metadata that is essential
to the discovery process. We describe the data transformation bridges required
from the transaction processing systems and customer event streams (e.g.,
clickstreams) to the data warehouse. We detail the mining workbench, which
needs to provide multiple views of the data through reporting, data mining
algorithms, visualization, and OLAP. We conclude with a set of challenges. Comment: KDD workshop: WebKDD 200
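The abstract's argument for logging at the application server rather than the web server can be sketched in a few lines: the application layer can attach business metadata (product ids, cart values) that a raw web-server access log never sees. This is an illustrative sketch only; the class and method names are invented, not Blue Martini's actual API.

```python
import json
import time
from dataclasses import dataclass, field

@dataclass
class EventLogger:
    """Hypothetical application-layer clickstream logger."""
    events: list = field(default_factory=list)

    def log_event(self, session_id, event_type, **metadata):
        # The application layer can attach business metadata that a
        # web-server log (URLs and status codes) cannot capture.
        self.events.append({
            "ts": time.time(),
            "session": session_id,
            "type": event_type,
            **metadata,
        })

    def to_warehouse_rows(self):
        # Flatten events into rows for the clickstream-to-warehouse bridge.
        return [json.dumps(e, sort_keys=True) for e in self.events]

log = EventLogger()
log.log_event("s1", "add_to_cart", product_id=42, cart_value=19.99)
log.log_event("s1", "checkout", order_total=19.99)
rows = log.to_warehouse_rows()
```

Each row already carries the metadata the discovery process needs, so the downstream transformation bridge has far less cleaning to do.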
Storage Location Assignment Problem: implementation in a warehouse design optimization tool
This paper focuses on possible improvements to common warehouse storage management practices, taking its cue from the Operations Research Storage Location Assignment Problem (SLAP), with the aim of reaching an efficient and organized allocation of products to warehouse slots. The implementation of a SLAP approach in a tool able to model multiple storage policies is discussed, with the twofold aim of reducing the overall required warehouse space, so that produced goods are allocated efficiently, and of minimizing internal material handling times. It is shown how some limits of existing warehouse information management system modules can be overcome, sketching the design of a software tool able to return an organized slot-product allocation. The results of validating a prototype on an industrial case are presented, showing the efficiency gains of the proposed approach under a dedicated-slot storage policy.
Data warehouse automation: trick or treat?
Data warehousing systems have been around for 25 years playing a crucial role in
collecting data and transforming that data into value, allowing users to make decisions
based on informed business facts. It is widely accepted that a data warehouse is a critical
component of a data-driven enterprise, and it becomes part of the organisation's
information systems strategy, with a significant impact on the business. However, after
25 years, building a Data Warehouse is still painful: it is too time-consuming, too
expensive and too difficult to change after deployment.
Data Warehouse Automation appears with the promise to address the limitations of
traditional approaches, turning the data warehouse development from a prolonged effort
into an agile one, with gains in efficiency and effectiveness in data warehousing
processes. So, is Data Warehouse Automation a Trick or Treat?
To answer this question, a case study of a data warehousing architecture using a data
warehouse automation tool, called WhereScape, was developed. A survey was also
conducted among organisations that use data warehouse automation tools, in order to
understand their motivation for adopting this kind of tool in their data warehousing
systems. Based on the results of the survey and on the case study, automation in the
data warehouse building process is necessary to deliver data warehouse systems faster,
and it is a solution to consider when modernizing data warehouse architectures, as a way
to achieve results faster while keeping costs controlled and reducing risk. Data
Warehouse Automation
definitely may be a Treat.
Design of dimensional model for clinical data storage and analysis
Current research in the Life and Medical Sciences is generating large volumes of data on a daily basis. It has thus become a necessity to find solutions for the efficient storage of this data, trying to correlate it and extract knowledge from it. Clinical data generated in hospitals, clinics and diagnostic centers falls under a similar paradigm. Patient records in various hospitals are increasing at an exponential rate, adding to the problem of data management and storage. A major storage problem is the varied dimensionality of the data, ranging from images to numerical form. Therefore there is a need to develop an efficient data model that can handle this multi-dimensional data and store it with its historical aspect.
For this problem in clinical informatics we propose a clinical dimensional model design that can be used to develop a clinical data mart. The model has been designed with temporal storage of patient data in mind, covering all possible clinical parameters, which can include both textual and image-based data. The availability of this data for each patient can then be used to apply data mining techniques, finding correlations among all the parameters at the level of the individual and of the population.
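A star-schema data mart of the kind the abstract proposes can be sketched with a fact table of clinical observations keyed to patient and date dimensions, where each observation carries either a numeric value or a reference to image data. The table and column names below are assumptions for illustration, not the paper's actual design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, mrn TEXT, sex TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE fact_observation (
    patient_key   INTEGER REFERENCES dim_patient,
    date_key      INTEGER REFERENCES dim_date,
    parameter     TEXT,   -- e.g. 'glucose' or 'chest_xray'
    numeric_value REAL,   -- NULL for image-based parameters
    image_uri     TEXT    -- NULL for numeric parameters
);
""")
conn.execute("INSERT INTO dim_patient VALUES (1, 'MRN001', 'F')")
conn.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01')")
conn.execute("INSERT INTO fact_observation VALUES (1, 20240101, 'glucose', 5.4, NULL)")
conn.execute("INSERT INTO fact_observation VALUES (1, 20240101, 'chest_xray', NULL, 'img/001.png')")

# Temporal, per-patient queries become simple joins against the dimensions.
rows = conn.execute("""
    SELECT d.full_date, f.parameter, f.numeric_value, f.image_uri
    FROM fact_observation f
    JOIN dim_patient p ON p.patient_key = f.patient_key
    JOIN dim_date    d ON d.date_key = f.date_key
    WHERE p.mrn = 'MRN001'
""").fetchall()
```

Keeping a separate date dimension is what gives the mart its historical aspect: every observation is anchored to a date row, so longitudinal queries over a patient's history are ordinary joins.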
The Data Lakehouse: Data Warehousing and More
Relational Database Management Systems designed for Online Analytical
Processing (RDBMS-OLAP) have been foundational to democratizing data and
enabling analytical use cases such as business intelligence and reporting for
many years. However, RDBMS-OLAP systems present some well-known challenges.
They are primarily optimized only for relational workloads, lead to
proliferation of data copies which can become unmanageable, and since the data
is stored in proprietary formats, it can lead to vendor lock-in, restricting
access to engines, tools, and capabilities beyond what the vendor offers. As
the demand for data-driven decision making surges, the need for a more robust
data architecture to address these challenges becomes ever more critical. Cloud
data lakes have addressed some of the shortcomings of RDBMS-OLAP systems, but
they present their own set of challenges. More recently, organizations have
often followed a two-tier architectural approach to take advantage of both
these platforms, leveraging both cloud data lakes and RDBMS-OLAP systems.
However, this approach brings additional challenges, complexities, and
overhead. This paper discusses how a data lakehouse, a new architectural
approach, achieves the same benefits of an RDBMS-OLAP and cloud data lake
combined, while also providing additional advantages. We take today's data
warehousing and break it down into implementation independent components,
capabilities, and practices. We then take these aspects and show how a
lakehouse architecture satisfies them. Then, we go a step further and discuss
what additional capabilities and benefits a lakehouse architecture provides
over an RDBMS-OLAP system.
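The core lakehouse idea the abstract describes, data kept in open file formats on cheap storage with a small metadata layer on top so any engine can discover the current table state, can be illustrated with a toy transaction log. Real lakehouse table formats (e.g. Delta Lake, Apache Iceberg) are far richer; this stdlib-only sketch only shows the separation of open data files from the metadata log.

```python
import csv
import json
import os
import tempfile

root = tempfile.mkdtemp()

def commit(rows, version):
    # Write a new immutable data file in an open format (CSV here),
    # then record it in the append-only metadata log.
    path = os.path.join(root, f"part-{version}.csv")
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    with open(os.path.join(root, "_log.json"), "a") as log:
        log.write(json.dumps({"version": version, "file": path}) + "\n")

def read_table():
    # Any reader reconstructs the table from the log, not from a
    # directory listing, so all engines see the same committed state.
    rows = []
    with open(os.path.join(root, "_log.json")) as log:
        for line in log:
            entry = json.loads(line)
            with open(entry["file"], newline="") as f:
                rows.extend(list(csv.reader(f)))
    return rows

commit([["id", "amount"], ["1", "9.99"]], version=0)
commit([["2", "4.50"]], version=1)
table = read_table()
```

Because the data files are in an open format and the log is plain text, no single engine owns the data, which is the lakehouse's answer to the vendor lock-in the abstract attributes to RDBMS-OLAP systems.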
Modern technologies for data storage, organization and managing in CRM systems
In our study we intend to emphasize the main objectives targeted by the implementation of CRM-type platforms. In line with these objectives, in order to cover the functionality of CRM platforms, we refer to the prime methods of collecting and organizing information: databases, data warehouses, and data centers from the Cloud Computing field. As a representative procedure for handling information we exemplify the OLAP technique, which is implemented by means of the SQL Server Analysis Services software instrument. Finally, we look over some of the Cloud Computing based CRM platforms and how OLAP techniques can be applied to them.
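The OLAP-style aggregation the study exemplifies with SQL Server Analysis Services amounts to summarizing a measure along chosen dimensions of a fact set. The toy below rolls CRM sales facts up by region and by region x quarter; the data and dimension names are invented for illustration.

```python
from collections import defaultdict

facts = [
    {"region": "North", "quarter": "Q1", "sales": 100},
    {"region": "North", "quarter": "Q2", "sales": 150},
    {"region": "South", "quarter": "Q1", "sales": 80},
]

def rollup(facts, dims):
    # Aggregate the 'sales' measure over the requested dimensions,
    # producing one cube cell per distinct dimension combination.
    out = defaultdict(int)
    for f in facts:
        out[tuple(f[d] for d in dims)] += f["sales"]
    return dict(out)

by_region = rollup(facts, ["region"])           # 1-D summary
by_cell = rollup(facts, ["region", "quarter"])  # full 2-D cube cells
```

An OLAP engine precomputes and indexes such cells across many dimensions; the essence is the same dimension-keyed aggregation.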
Designing and Implementing a Data Warehouse using Dimensional Modeling
As a part of the business intelligence activities initiated at the University of New Mexico (UNM) in the Office of Institutional Analytics, a need for a data warehouse was established. The goal of the data warehouse is to host data related to students, faculty, staff, finance and research, and make it readily available for the purposes of university analytics. In addition, this data warehouse will be used to generate required reports and help the university better analyze student success activities. In order to build real-time reports, it is essential that the massive amounts of transactional data related to university activities be structured in a way that is optimal for querying and reporting. This transactional data is stored in relational databases in an Operational Data Store (ODS) at UNM. But for reporting purposes, this design currently requires scores of database join operations between relational database views in order to answer even simple questions. Apart from affecting performance, i.e., the time taken to run these reports, development time is also a factor, as it is very difficult to comprehend the complex data models associated with the ODS in order to generate the appropriate queries. Dimensional modeling was employed to address this issue. Dimensional modeling was developed by two pioneers in the field, Bill Inmon and Ralph Kimball. This thesis explores both methods and implements Kimball's method of dimensional modeling, leading to a dimensional data mart based on a star schema design that was implemented using a high performance commercial database. In addition, a data integration tool was used for performing extract-transform-load (ETL) operations necessary to develop jobs and design workflows and to automate the loading of data into the data mart. HTML reports were developed from the data mart using a reporting tool and performance was evaluated relative to reports generated directly from the ODS.
On average, the reports developed on top of the data mart were at least 65% faster than those generated directly from the ODS. One of the reasons for this is that the number of joins between tables was drastically reduced. Another reason is that in the ODS, reports were built against views, which are slower to query than reports developed against tables.
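The join reduction that explains the speedup can be sketched in miniature: the ETL step pays the joins once at load time, denormalizing the operational schema into a wide fact table that reports query directly. The schema and data below are invented for illustration, not UNM's actual ODS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized operational tables (the many-join source).
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE enrollment (student_id INTEGER, term TEXT, credits INTEGER);
-- Dimensional target: one wide row per enrollment fact.
CREATE TABLE fact_enrollment (student_name TEXT, dept_name TEXT,
                              term TEXT, credits INTEGER);
""")
conn.executemany("INSERT INTO student VALUES (?,?,?)",
                 [(1, "Ana", 10), (2, "Ben", 20)])
conn.executemany("INSERT INTO dept VALUES (?,?)",
                 [(10, "Math"), (20, "CS")])
conn.executemany("INSERT INTO enrollment VALUES (?,?,?)",
                 [(1, "2024F", 12), (2, "2024F", 15)])

# The transform-and-load step: joins are paid once here, not per report.
conn.execute("""
INSERT INTO fact_enrollment
SELECT s.name, d.name, e.term, e.credits
FROM enrollment e JOIN student s ON s.id = e.student_id
                  JOIN dept    d ON d.id = s.dept_id
""")

# Reports now run against a single table, with no joins at query time.
total = conn.execute("SELECT SUM(credits) FROM fact_enrollment").fetchone()[0]
```

With scores of joins collapsed into the load-time transform, report queries shrink to single-table scans or small star joins, which is where the measured 65% improvement comes from.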