
    Machine learning supported next-maintenance prediction for industrial vehicles

    Industrial and construction vehicles require tight periodic maintenance operations, whose schedule depends on vehicle characteristics and usage. The latter can be accurately monitored through various on-board devices, enabling the application of Machine Learning techniques to analyze vehicle usage patterns and design predictive analytics. This paper presents a data-driven application that automatically schedules the periodic maintenance operations of industrial vehicles. It aims to predict, for each vehicle and date, the actual number of remaining days until the next maintenance is due. Our Machine Learning solution is designed to address the following challenges: (i) the non-stationarity of the per-vehicle utilization time series, which limits the effectiveness of classic scheduling policies, and (ii) the potential lack of historical data for vehicles that have recently been added to the fleet, which hinders the learning of accurate predictors from past data. Preliminary results collected in a real industrial scenario demonstrate the effectiveness of the proposed solution on heterogeneous vehicles. The system proposed here is currently under deployment, enabling further tests and tuning.
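A minimal sketch of the prediction task described above, under the simplifying assumption that maintenance is due after a fixed usage budget and that recent usage is summarized by a moving average (the interval, thresholds, and function names are illustrative, not the paper's actual model):

```python
# Hypothetical baseline: estimate remaining days until the next maintenance
# from hours of use since the last service and recent daily usage.
# A moving-average rate is a weak model for non-stationary usage series
# (the very challenge the paper addresses); this is only a baseline.

def remaining_days(hours_since_service: float,
                   recent_daily_usage: list[float],
                   interval_hours: float = 500.0) -> float:
    """Days left before the usage budget `interval_hours` is exhausted."""
    avg = sum(recent_daily_usage) / len(recent_daily_usage)  # hours/day
    hours_left = max(interval_hours - hours_since_service, 0.0)
    return hours_left / avg if avg > 0 else float("inf")

print(remaining_days(400.0, [8.0, 10.0, 12.0]))  # 100 h left / 10 h/day -> 10.0
```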

    Fine-Grained Complexity of Regular Path Queries

    A regular path query (RPQ) is a regular expression q that returns all node pairs (u, v) from a graph database that are connected by an arbitrary path labelled with a word from L(q). The obvious algorithmic approach to RPQ evaluation (called the PG-approach), i.e., constructing the product graph between an NFA for q and the graph database, is appealing due to its simplicity and also leads to efficient algorithms. However, it is unclear whether the PG-approach is optimal. We address this question by thoroughly investigating which upper complexity bounds can be achieved by the PG-approach, and we complement these with conditional lower bounds (in the sense of the fine-grained complexity framework). A special focus is put on enumeration and delay bounds, as well as the data complexity perspective. A main insight is that we can achieve optimal (or near-optimal) algorithms with the PG-approach, but the delay for enumeration is rather high (linear in the database). We explore three successful approaches towards enumeration with sub-linear delay: super-linear preprocessing, approximations of the solution sets, and restricted classes of RPQs.
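The PG-approach the abstract refers to can be sketched directly: pair each graph node with an NFA state and search the product graph. The encodings below (edge triples, NFA transition triples) are illustrative, and running one BFS per source node is only the naive variant:

```python
# Sketch of the PG-approach: BFS over the product of an NFA for q and an
# edge-labelled graph database, reporting (u, v) whenever a final NFA
# state is reached. This is a toy evaluator, not an optimized algorithm.
from collections import deque

def evaluate_rpq(graph_edges, nfa_delta, start_state, final_states, nodes):
    """graph_edges: set of (u, label, v); nfa_delta: set of (p, label, p')."""
    results = set()
    for u in nodes:
        seen = {(u, start_state)}          # product-graph vertices reached
        queue = deque(seen)
        while queue:
            v, p = queue.popleft()
            if p in final_states:
                results.add((u, v))
            for (x, a, y) in graph_edges:  # product-graph successors
                if x != v:
                    continue
                for (q1, b, q2) in nfa_delta:
                    if q1 == p and a == b and (y, q2) not in seen:
                        seen.add((y, q2))
                        queue.append((y, q2))
    return results

# q = a*b : NFA states 0 --a--> 0, 0 --b--> 1 (final)
edges = {(1, "a", 2), (2, "a", 3), (3, "b", 4)}
delta = {(0, "a", 0), (0, "b", 1)}
print(sorted(evaluate_rpq(edges, delta, 0, {1}, {1, 2, 3, 4})))
# [(1, 4), (2, 4), (3, 4)]
```

Collecting all pairs before output also makes the enumeration delay visible: between two reported pairs the search may traverse a portion of the product graph linear in the database, which is the delay issue the paper targets.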

    Evaluating espresso coffee quality by means of time-series feature engineering

    Espresso quality attracts the interest of many stakeholders: from consumers to local business activities, from coffee-machine vendors to international coffee industries. So far, it has been mostly addressed by means of human experts, electronic noses, and chemical approaches. The current work, instead, proposes a data-driven analysis exploiting time-series feature engineering. We analyze a real-world dataset of espresso brewing by professional coffee-making machines. The novelty of the proposed work lies in its focus on the brewing time series, from which we engineer features able to improve previous data-driven metrics determining the quality of the espresso. Thanks to the proposed features, better quality-evaluation predictions are achieved with respect to previous data-driven approaches that relied solely on metrics describing each brewing as a whole (e.g., average flow, total amount of water). Yet, the engineered features are simple to compute and add a very limited workload to the coffee-machine sensor-data collection device, making them suitable for large-scale IoT installations on board professional coffee machines, such as those typically installed in consumer-oriented business activities, shops, and workplaces. To the best of the authors' knowledge, this is the first attempt to perform a data-driven analysis of real-world espresso-brewing time series. The presented results yield a three-fold improvement in the classification accuracy of high-quality espresso coffees with respect to current data-driven approaches (from 30% to 100%), exploiting simple threshold-based quality evaluations defined in the newly proposed feature space.
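The contrast the abstract draws, whole-brew metrics versus time-series features with threshold rules, can be illustrated as follows. All feature names, thresholds, and the flow encoding here are hypothetical, not the paper's actual features:

```python
# Illustrative time-series feature engineering on a per-second flow series,
# followed by a threshold-based quality rule in the engineered space.

def brew_features(flow: list[float], dt: float = 1.0) -> dict:
    """Engineer simple features from a flow-rate time series."""
    total_water = sum(f * dt for f in flow)        # whole-brew metric
    avg_flow = total_water / (len(flow) * dt)      # whole-brew metric
    # Time-series features: when flow first starts, and its variability.
    first_flow = next((i * dt for i, f in enumerate(flow) if f > 0.1), None)
    var = sum((f - avg_flow) ** 2 for f in flow) / len(flow)
    return {"avg_flow": avg_flow, "first_flow_s": first_flow, "flow_var": var}

def is_high_quality(feats: dict) -> bool:
    # Hypothetical threshold rule: flow must start early and stay steady.
    return (feats["first_flow_s"] is not None
            and feats["first_flow_s"] <= 8.0
            and feats["flow_var"] < 0.5)

flow = [0.0] * 5 + [1.0] * 20    # flow starts after 5 s, then steady
print(is_high_quality(brew_features(flow)))  # True
```

Features like `first_flow_s` are cheap to compute incrementally on the sensor-data collector, which is the property the abstract highlights for large-scale IoT deployments.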

    A Purely Regular Approach to Non-Regular Core Spanners

    The regular spanners (characterised by vset-automata) are closed under the algebraic operations of union, join and projection, and have desirable algorithmic properties. The core spanners (introduced by Fagin, Kimelfeld, Reiss, and Vansummeren (PODS 2013, JACM 2015) as a formalisation of the core functionality of the query language AQL used in IBM's SystemT) additionally need string-equality selections, and it has been shown by Freydenberger and Holldack (ICDT 2016, Theory of Computing Systems 2018) that this leads to high complexity and even undecidability of the typical problems in static analysis and query evaluation. We propose an alternative approach to core spanners: by incorporating the string-equality selections directly into the regular language that represents the underlying regular spanner (instead of treating them as an algebraic operation on the table extracted by the regular spanner), we obtain a fragment of core spanners that, while having slightly weaker expressive power than the full class of core spanners, arguably still covers the intuitive applications of string-equality selections for information extraction and has much better upper complexity bounds for the typical problems in static analysis and query evaluation.
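A toy illustration of the contrast the abstract describes: applying a string-equality selection algebraically (extract all pairs, then filter) versus pushing the equality into the matching pattern itself. Python's backreference syntax only mimics the idea; the paper's actual construction works with ref-words inside vset-automata, not with backreferences:

```python
# Two ways to realise a string-equality selection over extracted spans.
import re

text = "abc-abc xy-zw foo-foo"

# Algebraic style: extract all (x, y) pairs, then select rows with x = y.
pairs = re.findall(r"(\w+)-(\w+)", text)
selected = [(x, y) for (x, y) in pairs if x == y]

# "Pushed-in" style: the equality constraint is part of the pattern.
pushed = re.findall(r"(\w+)-\1", text)

print(selected)  # [('abc', 'abc'), ('foo', 'foo')]
print(pushed)    # ['abc', 'foo']
```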

    On Shapley Value in Data Assemblage Under Independent Utility

    In many applications, an organization may want to acquire data from many data owners. Data marketplaces allow data owners to produce data assemblage needed by data buyers through coalition. To encourage coalitions to produce data, it is critical to allocate revenue to data owners in a fair manner according to their contributions. Although in the literature Shapley fairness and alternatives have been well explored to facilitate revenue allocation in data assemblage, computing exact Shapley values for many data owners and large assembled data sets through coalition remains challenging due to the combinatoric nature of Shapley value. In this paper, we explore the decomposability of utility in data assemblage by formulating the independent utility assumption. We argue that independent utility enjoys many applications. Moreover, we identify interesting properties of independent utility and develop fast computation techniques for exact Shapley value under independent utility. Our experimental results on a series of benchmark data sets show that our new approach not only guarantees the exactness of Shapley value, but also achieves faster computation by orders of magnitude.
    Comment: Accepted by VLDB 202
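The decomposability idea can be sketched as follows. Assume, as a simplification of the paper's setting, that coalition utility is a sum over data items, where an item's value is realized only once all of its owners have joined. Each item is then a scaled unanimity game, whose Shapley value has a closed form: every owner of the item receives value / (number of owners). Summing over items gives exact Shapley values without enumerating the exponentially many coalitions:

```python
# Closed-form Shapley values under a sum-decomposable (independent) utility,
# cross-checked against brute-force averaging over all join orders.
from itertools import permutations

def shapley_fast(items):
    """items: list of (owner_set, value). Each item's value is split
    equally among its owners (Shapley value of a unanimity game)."""
    phi = {}
    for owners, v in items:
        share = v / len(owners)
        for o in owners:
            phi[o] = phi.get(o, 0.0) + share
    return phi

def shapley_exact(items, players):
    """Brute force: average marginal contribution over all orderings."""
    def utility(coalition):
        return sum(v for owners, v in items if owners <= coalition)
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        joined = set()
        for p in order:
            before = utility(joined)
            joined.add(p)
            phi[p] += utility(joined) - before
    return {p: phi[p] / len(perms) for p in phi}

items = [({"a", "b"}, 6.0), ({"b"}, 2.0), ({"a", "c"}, 4.0)]
print(shapley_fast(items))                       # matches the brute force
print(shapley_exact(items, ["a", "b", "c"]))
```

The fast version runs in time linear in the total number of (item, owner) pairs, versus the factorial cost of the brute force, which is the flavor of speed-up (by orders of magnitude, with exactness preserved) that the abstract reports.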