490 research outputs found

    Risk-Averse Matchings over Uncertain Graph Databases

    Full text link
    A large number of applications such as querying sensor networks, and analyzing protein-protein interaction (PPI) networks, rely on mining uncertain graph and hypergraph databases. In this work we study the following problem: given an uncertain, weighted (hyper)graph, how can we efficiently find a (hyper)matching with high expected reward, and low risk? This problem naturally arises in the context of several important applications, such as online dating, kidney exchanges, and team formation. We introduce a novel formulation for finding matchings with maximum expected reward and bounded risk under a general model of uncertain weighted (hyper)graphs that we introduce in this work. Our model generalizes probabilistic models used in prior work, and captures both continuous and discrete probability distributions, thus allowing to handle privacy related applications that inject appropriately distributed noise to (hyper)edge weights. Given that our optimization problem is NP-hard, we turn our attention to designing efficient approximation algorithms. For the case of uncertain weighted graphs, we provide a 13\frac{1}{3}-approximation algorithm, and a 15\frac{1}{5}-approximation algorithm with near optimal run time. For the case of uncertain weighted hypergraphs, we provide a Ω(1k)\Omega(\frac{1}{k})-approximation algorithm, where kk is the rank of the hypergraph (i.e., any hyperedge includes at most kk nodes), that runs in almost (modulo log factors) linear time. We complement our theoretical results by testing our approximation algorithms on a wide variety of synthetic experiments, where we observe in a controlled setting interesting findings on the trade-off between reward, and risk. We also provide an application of our formulation for providing recommendations of teams that are likely to collaborate, and have high impact.Comment: 25 page

    NOSQL design for analytical workloads: Variability matters

    Get PDF
    Big Data has recently gained popularity and has strongly questioned relational databases as universal storage systems, especially in the presence of analytical workloads. As result, co-relational alternatives, commonly known as NOSQL (Not Only SQL) databases, are extensively used for Big Data. As the primary focus of NOSQL is on performance, NOSQL databases are directly designed at the physical level, and consequently the resulting schema is tailored to the dataset and access patterns of the problem in hand. However, we believe that NOSQL design can also benefit from traditional design approaches. In this paper we present a method to design databases for analytical workloads. Starting from the conceptual model and adopting the classical 3-phase design used for relational databases, we propose a novel design method considering the new features brought by NOSQL and encompassing relational and co-relational design altogether.Peer ReviewedPostprint (author's final draft

    Automated database design for document stores with multicriteria optimization

    Get PDF
    Document stores have gained popularity among NoSQL systems mainly due to the semi-structured data storage structure and the enhanced query capabilities. The database design in document stores expands beyond the first normal form by encouraging de-normalization through nesting. This hinders the process, as the number of alternatives grows exponentially with multiple choices in nesting (including different levels) and referencing (including the direction of the reference). Due to this complexity, document store data design is mostly carried out in trial-and-error or ad-hoc rule-based approaches. However, the choices affect multiple, often conflicting, aspects such as query performance, storage space, and complexity of the documents. To overcome these issues, in this paper, we apply multicriteria optimization. Our approach is driven by a query workload and a set of optimization objectives. First, we formalize a canonical model to represent alternative designs and introduce an algebra of transformations that can systematically modify a design. Then, using these transformations, we implement a local search algorithm driven by a loss function that can propose near-optimal designs with high probability. Finally, we compare our prototype against an existing document store data design solution purely driven by query cost, where our proposed designs have better performance and are more compact with less redundancy.Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This research has been funded by the European Commission through the Erasmus Mundus Joint Doctorate "Information Technologies for Business Intelligence—Doctoral College" (IT4BI-DC). Sergi Nadal is partly supported by the Spanish Ministerio de Ciencia e Innovación, as well as the European Union—NextGenerationEU, under project FJC2020-045809-I / AEI/10.13039/501100011033.Peer ReviewedPostprint (published version

    Semantics and Verification of UML Activity Diagrams for Workflow Modelling

    Get PDF
    This thesis defines a formal semantics for UML activity diagrams that is suitable for workflow modelling. The semantics allows verification of functional requirements using model checking. Since a workflow specification prescribes how a workflow system behaves, the semantics is defined and motivated in terms of workflow systems. As workflow systems are reactive and coordinate activities, the defined semantics reflects these aspects. In fact, two formal semantics are defined, which are completely different. Both semantics are defined directly in terms of activity diagrams and not by a mapping of activity diagrams to some existing formal notation. The requirements-level semantics, based on the Statemate semantics of statecharts, assumes that workflow systems are infinitely fast w.r.t. their environment and react immediately to input events (this assumption is called the perfect synchrony hypothesis). The implementation-level semantics, based on the UML semantics of statecharts, does not make this assumption. Due to the perfect synchrony hypothesis, the requirements-level semantics is unrealistic, but easy to use for verification. On the other hand, the implementation-level semantics is realistic, but difficult to use for verification. A class of activity diagrams and a class of functional requirements is identified for which the outcome of the verification does not depend upon the particular semantics being used, i.e., both semantics give the same result. For such activity diagrams and such functional requirements, the requirements-level semantics is as realistic as the implementation-level semantics, even though the requirements-level semantics makes the perfect synchrony hypothesis. The requirements-level semantics has been implemented in a verification tool. The tool interfaces with a model checker by translating an activity diagram into an input for a model checker according to the requirements-level semantics. The model checker checks the desired functional requirement against the input model. If the model checker returns a counterexample, the tool translates this counterexample back into the activity diagram by highlighting a path corresponding to the counterexample. The tool supports verification of workflow models that have event-driven behaviour, data, real time, and loops. Only model checkers supporting strong fairness model checking turn out to be useful. The feasibility of the approach is demonstrated by using the tool to verify some real-life workflow models

    Recommender systems in model-driven engineering: A systematic mapping review

    Full text link
    Recommender systems are information filtering systems used in many online applications like music and video broadcasting and e-commerce platforms. They are also increasingly being applied to facilitate software engineering activities. Following this trend, we are witnessing a growing research interest on recommendation approaches that assist with modelling tasks and model-based development processes. In this paper, we report on a systematic mapping review (based on the analysis of 66 papers) that classifies the existing research work on recommender systems for model-driven engineering (MDE). This study aims to serve as a guide for tool builders and researchers in understanding the MDE tasks that might be subject to recommendations, the applicable recommendation techniques and evaluation methods, and the open challenges and opportunities in this field of researchThis work has been funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 813884 (Lowcomote [134]), by the Spanish Ministry of Science (projects MASSIVE, RTI2018-095255-B-I00, and FIT, PID2019-108965GB-I00) and by the R&D programme of Madrid (Project FORTE, P2018/TCS-431

    A Layered Reference Architecture for Metamodels to Tailor Quality Modeling and Analysis

    Get PDF
    corecore