Reducing Spatial Data Complexity for Classification Models
Intelligent data analytics is gradually becoming a day-to-day reality of today's businesses. However, despite rapidly
increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy
corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be
frequently retrained, which further hinders their use. Various data reduction techniques ranging from data sampling up to
density retention models attempt to address this challenge by capturing a summarised data structure, yet they either do
not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our
response is a proposition of a new general framework for reducing the complexity of labelled data by means of controlled
spatial redistribution of class densities in the input space. Using the example of the Parzen Labelled Data Compressor (PLDC), we
demonstrate a simulated data condensation process directly inspired by electrostatic field interaction, in which the data are
moved and merged following the attracting and repelling interactions with the other labelled data. The process is controlled
by the class density function built on the original data that acts as a class-sensitive potential field ensuring preservation of
the original class density distributions, yet allowing data to rearrange and merge joining together their soft class partitions.
As a result, we achieved a model that reduces labelled datasets much further than any competing approach, yet with
the maximum retention of the original class densities and hence the classification performance. PLDC leaves the reduced
dataset with soft accumulative class weights, allowing for efficient online updates, and, as shown in a series of experiments,
when coupled with the Parzen Density Classifier (PDC) it significantly outperforms competing data condensation methods in terms of
classification performance at comparable compression levels.
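To make the classifier side of this abstract concrete, the following is a minimal sketch of a Parzen-window classifier that works on a condensed set of prototypes carrying soft class weights. It is not the authors' PLDC or PDC implementation: the isotropic Gaussian kernel, the fixed `bandwidth`, the `prototypes` data layout, and all function names are assumptions made for illustration, and the condensation step itself is not shown.

```python
import math

def gaussian_kernel(x, center, bandwidth):
    """Isotropic Gaussian kernel centred at `center`, evaluated at `x`."""
    dim = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2)) / norm

def class_densities(x, prototypes, bandwidth):
    """Sum the kernel responses per class, scaled by each prototype's soft weights.

    `prototypes` is a list of (point, {label: weight}) pairs (assumed layout):
    a merged prototype may carry weight for several classes at once.
    """
    totals = {}
    for point, weights in prototypes:
        k = gaussian_kernel(x, point, bandwidth)
        for label, w in weights.items():
            totals[label] = totals.get(label, 0.0) + w * k
    return totals

def classify(x, prototypes, bandwidth=1.0):
    """Return the label with the largest weighted density estimate at `x`."""
    densities = class_densities(x, prototypes, bandwidth)
    return max(densities, key=densities.get)

# Hypothetical condensed dataset: three prototypes, one shared by both classes.
prototypes = [
    ((0.0, 0.0), {"a": 3.0}),
    ((0.5, 0.5), {"a": 1.0, "b": 1.0}),
    ((3.0, 3.0), {"b": 4.0}),
]

print(classify((0.1, 0.1), prototypes))
print(classify((2.9, 3.0), prototypes))
```

Because each prototype stores accumulated per-class weights rather than a single hard label, adding a new labelled observation in this sketch would only require increasing a weight or appending a prototype.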
Feasibility of Warehouse Drone Adoption and Implementation
While aerial delivery drones capture headlines, the pace of adoption of drones in warehouses has shown the greatest acceleration. Warehousing constitutes 30% of the cost of logistics in the US. The rise of e-commerce, greater customer service demands of retail stores, and a shortage of skilled labor have intensified competition for efficient warehouse operations. This takes place during an era of shortening technology life cycles. This paper integrates several theoretical perspectives on technology diffusion and adoption to propose a framework to inform supply chain decision-makers on when to invest in new robotics technology
Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses
A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, "whether a total sales amount of 1,000 items indicates a good or bad sales performance" is still unclear. From the decision makers' point of view, the semantics that convey the meaning of the data, rather than the raw numbers, are very important. In this paper, we explore the use of fuzzy technology to provide these semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses.
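As an illustration of the step from a quantitative summary to a qualitative one, the sketch below maps a numeric sales total to fuzzy linguistic labels using trapezoidal membership functions. The labels, the breakpoints, and the function names are hypothetical and are not taken from the paper; the paper's algebraic operators and SQL extension are not reproduced here.

```python
INF = float("inf")

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on [b, c], linear in between."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical linguistic terms for "total sales (items)".
SALES_TERMS = {
    "poor": (0, 0, 500, 1000),
    "average": (500, 1000, 1500, 2000),
    "good": (1500, 2000, INF, INF),
}

def memberships(total):
    """Degree to which `total` belongs to each linguistic term."""
    return {label: trapezoid(total, *shape) for label, shape in SALES_TERMS.items()}

def best_label(total):
    """The term with the highest membership degree."""
    degrees = memberships(total)
    return max(degrees, key=degrees.get)

print(memberships(1000))
print(best_label(1000))
```

Under these assumed breakpoints, a total of 1,000 items is fully "average", while a total of 750 belongs partly to "poor" and partly to "average", which is the kind of graded reading a raw number alone does not give.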
Using Ontologies for the Design of Data Warehouses
Obtaining an implementation of a data warehouse is a complex task that forces
designers to acquire wide knowledge of the domain, thus requiring a high level
of expertise and making it a failure-prone task. Based on our experience, we
have identified a set of situations that we have faced in real-world projects
in which we believe that the use of ontologies will improve several aspects of
the design of data warehouses. The aim of this article is to describe several
shortcomings of current data warehouse design approaches and discuss the
benefit of using ontologies to overcome them. This work is a starting point for
discussing the convenience of using ontologies in data warehouse design.
Comment: 15 pages, 2 figure
Optimizing the number and location of warehouses in logistics networks considering the optimal delivery routes and set level of reserve stock
Forming the optimal structure of a warehouse network is one of the main strategic tasks in arranging an efficient logistics network. The suggested mathematical model and iterative method allow an optimal number and location of warehouses (in the area served by the logistics network) to be defined – with the minimum of total logistics costs for shipping of goods from suppliers through warehouses to customers. The focus is on optimal delivery routes and the optimal level to set for reserve stock contained in the storage network
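The trade-off between the number of warehouses and total cost can be illustrated with a small uncapacitated facility-location example. This is not the paper's model or its iterative method: it is a brute-force enumeration over candidate sites, it ignores delivery routing and reserve stock, and the cost figures and function names are invented for illustration.

```python
from itertools import combinations

def total_cost(open_sites, fixed_cost, ship_cost):
    """Fixed cost of the open warehouses plus each customer's cheapest shipping cost."""
    fixed = sum(fixed_cost[s] for s in open_sites)
    shipping = sum(min(row[s] for s in open_sites) for row in ship_cost)
    return fixed + shipping

def best_warehouses(fixed_cost, ship_cost):
    """Try every non-empty subset of candidate sites and keep the cheapest one."""
    sites = range(len(fixed_cost))
    best = None
    for k in range(1, len(fixed_cost) + 1):
        for subset in combinations(sites, k):
            cost = total_cost(subset, fixed_cost, ship_cost)
            if best is None or cost < best[1]:
                best = (subset, cost)
    return best

# Hypothetical data: 3 candidate sites, 4 customers.
fixed_cost = [10, 12, 8]
ship_cost = [  # rows are customers, columns are candidate sites
    [2, 9, 7],
    [3, 8, 6],
    [9, 2, 5],
    [8, 1, 6],
]

print(best_warehouses(fixed_cost, ship_cost))
```

Enumeration is only practical for a handful of candidate sites, since the number of subsets doubles with each added site; realistic instances call for an optimization model or a heuristic.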
SODA: Generating SQL for Business Users
The purpose of data warehouses is to enable business analysts to make better
decisions. Over the years the technology has matured and data warehouses have
become extremely successful. As a consequence, more and more data has been
added to the data warehouses and their schemas have become increasingly
complex. These systems still work well for generating pre-canned
reports. However, with their current complexity, they tend to be a poor match
for non-tech-savvy business analysts who need answers to ad-hoc queries that
were not anticipated. This paper describes the design, implementation, and
experience of the SODA system (Search over DAta Warehouse). SODA bridges the
gap between the business needs of analysts and the technical complexity of
current data warehouses. SODA enables a Google-like search experience for data
warehouses by taking keyword queries of business users and automatically
generating executable SQL. The key idea is to use a graph pattern matching
algorithm that uses the metadata model of the data warehouse. Our results with
real data from a global player in the financial services industry show that
SODA produces queries with high precision and recall, and makes it much easier
for business users to interactively explore highly-complex data warehouses.
Comment: VLDB201