447 research outputs found

Putting Pandas in a Box

Author: Hagedorn Stefan
Kläbe Steffen
Sattler Kai-Uwe
Publication venue: ilmedia
Publication date: 11/01/2021

Pandas - the Python Data Analysis Library - is a powerful and widely used framework for data analytics. In this work we present our approach to push down the computational part of Pandas scripts into the DBMS by using a transpiler. In addition to basic data processing operations, our approach also supports access to external data stored in files instead of the DBMS. Moreover, user-defined Python functions are transformed automatically to SQL UDFs executed in the DBMS. The latter allows the integration of complex computational tasks including machine learning. We show the usage of this feature to implement a so-called model join, i.e. applying pre-trained ML models to data in SQL tables.

Digitale Bibliothek Thüringen

The Collection Virtual Machine: An Abstraction for Multi-Frontend Multi-Backend Data Analysis

Author: Akhadov Sabir
Alonso Gustavo
Koutsoukos Dimitrios
Marroquín Renato
Müller Ingo
Wawrzoniak Mike
Publication date: 08/04/2020

Getting the best performance from the ever-increasing number of hardware platforms has been a recurring challenge for data processing systems. In recent years, the advent of data science with its increasingly numerous and complex types of analytics has made this challenge even more difficult. In practice, system designers are overwhelmed by the number of combinations and typically implement only one analysis/platform combination, leading to repeated implementation effort -- and a plethora of semi-compatible tools for data scientists. In this paper, we propose the "Collection Virtual Machine" (or CVM) -- an extensible compiler framework designed to keep the specialization process of data analytics systems tractable. It can capture at the same time the essence of a large span of low-level, hardware-specific implementation techniques as well as high-level operations of different types of analyses. At its core lies a language for defining nested, collection-oriented intermediate representations (IRs). Frontends produce programs in their IR flavors defined in that language, which get optimized through a series of rewritings (possibly changing the IR flavor multiple times) until the program is finally expressed in an IR of platform-specific operators. While reducing the overall implementation effort, this also improves the interoperability of both analyses and hardware platforms. We have used CVM successfully to build specialized backends for platforms as diverse as multi-core CPUs, RDMA clusters, and serverless computing infrastructure in the cloud and expect similar results for many more frontends and hardware platforms in the near future. Comment: This paper is currently under review at DaMoN'2

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Welcome to Sigmod 2019 - The 2019 ACM SIGMOD International Conference on the Management of Data!

Author: Ailamaki A. (Anastasia)
Boncz P.A. (Peter)
Manegold S. (Stefan)
Publication date: 30/06/2019

CWI's Institutional Repository

Proceedings of the 2019 International Conference on Management of Data

Publication date: 30/06/2019

CWI's Institutional Repository

WHAT IS SMART ABOUT SERVICES? BREAKING THE BOND BETWEEN THE SMART PRODUCT AND THE SERVICE

Author: Boukhris Aida
Fritzsche Albrecht
Publication venue: AIS Electronic Library (AISeL)
Publication date: 15/05/2019

While the conceptual delineation between conventional and smart products is rather conspicuous, the distinction between conventional services and their smart counterparts remains elusive. This study develops a conceptual framework for understanding the distinctive attributes of smart services and their relationship to smart products. In a systematic literature review of publications from top information systems outlets, 30 contributions holding relevant information on smart services are identified and subjected to content analysis. The analysis reveals a variety of definitions and characterizations of smart services and relations to concepts like data-driven services and services associated with smart products and smart objects. These findings are used to examine artifacts developed in rather design-oriented papers to derive five dimensions that impact the level of smartness of services: richness of the data, the knowledge intensiveness of the engine for decision support, the level of sophistication of the outcome delivered to the service user(s), the architecture of the stakeholders, and the automation level of the service processes. Within this scope, the product can have four roles: sensor, computer, interface, or integrator. The paper concludes by identifying some gaps in the overall research landscape and provides directions for future research.

AIS Electronic Library (AISeL)

An Integrated View on the Future of Logistics and Information Technology

Author: Dijkman Remco
Grefen Paul
Hofman Wout
Peters Sander
Veenstra Albert
Publication date: 01/01/2018

In this position paper, we present our vision on the future of the logistics business domain and the use of information technology (IT) in this domain. The vision is based on extensive experience with Dutch and European logistics in various contexts and from various perspectives. We expect that the vision also holds for logistics outside Europe. We build our vision in a number of steps. First, we make an inventory of the most important trends in the logistics domain - we call these mega-trends. Next, we do the same for the information technology domain, restricted to technologies that have relevance for logistics. Then, we introduce logistics meta-concepts that we use to describe our vision and relate them to business engineering. We use these three ingredients to analyze leading concepts that we currently observe in the logistics domain. Next, we consolidate all elements into a model that represents our vision of the integrated future of logistics and IT. We elaborate on the role of data platforms and open standards in this integrated vision. Comment: 22 pages, 7 figures, 3 tables

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Integrating Conversational Agents and Knowledge Graphs Within the Scholarly Domain

Author: Angioni Simone
Meloni Antonello
Motta Enrico
Osborne Francesco
Reforgiato Recupero Diego
Salatino Angelo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2023

In the last few years, chatbots have become mainstream solutions adopted in a variety of domains for automating communication at scale. In the same period, knowledge graphs have attracted significant attention from business and academia as robust and scalable representations of information. In the scientific and academic research domain, they are increasingly used to illustrate the relevant actors (e.g., researchers, institutions), documents (e.g., articles, patents), entities (e.g., concepts, innovations), and other related information. Following the same direction, this paper describes how to integrate conversational agents with knowledge graphs focused on the scholarly domain, a.k.a. Scientific Knowledge Graphs. On top of the proposed architecture, we developed AIDA-Bot, a simple chatbot that leverages a large-scale knowledge graph of scholarly data. AIDA-Bot can answer natural language questions about scientific articles, research concepts, researchers, institutions, and research venues. We have developed four prototypes of AIDA-Bot on Alexa products, web browsers, Telegram clients, and humanoid robots. We performed a user study evaluation with 15 domain experts showing a high level of interest and engagement with the proposed agent.

Open Research Online (The Open University)