96 research outputs found

    The Lannion report on Big Data and Security Monitoring Research

    During the last decade, big data management has attracted increasing interest from both the industrial and academic communities. In parallel, Cyber Security has become mandatory due to increasingly varied and intensive threats. In June 2022, a group of researchers met to reflect on their community's impact on current research challenges. In particular, they considered four dimensions: (1) dedicated systems, namely data processing and analytics platforms or time series management systems; (2) graph analytics and distributed computation; (3) privacy; and (4) new hardware.

    Metadata-Aware Query Processing over Data Streams

    Many modern applications need to process queries over potentially infinite data streams to provide answers in real time. This dissertation proposes novel techniques to optimize CPU and memory utilization in stream processing by exploiting metadata on streaming data or queries. It focuses on four topics: 1) exploiting stream metadata to optimize SPJ query operators via operator configuration, 2) exploiting stream metadata to optimize SPJ query plans via query rewriting, 3) exploiting workload metadata to optimize parameterized queries via indexing, and 4) exploiting event constraints to optimize event stream processing via run-time early termination. The first part of this dissertation proposes algorithms for one of the most common and expensive query operators, namely join, that identify and purge no-longer-needed data from the state at runtime based on punctuations. The combined exploitation of punctuations and commonly used window constraints is also studied. Extensive experimental evaluations demonstrate both reductions in memory usage and improvements in execution time due to the proposed strategies. The second part proposes herald-driven runtime query plan optimization techniques. We identify four query optimization techniques and design a lightweight algorithm to efficiently detect the optimization opportunities at runtime upon receiving heralds. We propose a novel execution paradigm to support multiple concurrent logical plans by maintaining one physical plan. An extensive experimental study confirms that our techniques significantly reduce query execution times. The third part deals with the shared execution of parameterized queries instantiated from a query template. We design a lightweight index mechanism that provides multiple access paths to data to facilitate a wide range of parameterized queries. To withstand workload fluctuations, we propose an index tuning framework to tune the index configurations in a timely manner. Extensive experimental evaluations demonstrate the effectiveness of the proposed strategies. The last part proposes event query optimization techniques that exploit event constraints, such as exclusiveness or ordering relationships among events, extracted from workflows. Our proposed constraint-aware event processing techniques are shown to achieve significant performance gains.
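
    The punctuation-based purging described in the first part of this abstract can be illustrated with a minimal sketch. It assumes a symmetric hash equi-join and a simplified punctuation of the form "no further tuples with this key will arrive on this stream"; the class and method names are hypothetical and are not taken from the dissertation.

```python
from collections import defaultdict


class PunctuatedJoin:
    """Symmetric hash equi-join over two streams, 'left' and 'right'.

    Illustrative only: the punctuation format (one closed key per stream)
    is an assumption made for this sketch.
    """

    def __init__(self):
        # One hash table per input stream: join key -> stored payloads.
        self.state = {"left": defaultdict(list), "right": defaultdict(list)}
        # Keys that each stream has declared finished via a punctuation.
        self.closed = {"left": set(), "right": set()}

    @staticmethod
    def _other(side):
        return "right" if side == "left" else "left"

    def on_tuple(self, side, key, payload):
        """Probe the opposite state, then store the tuple if it can still match."""
        other = self._other(side)
        matches = self.state[other].get(key, [])
        if side == "left":
            results = [(payload, m) for m in matches]
        else:
            results = [(m, payload) for m in matches]
        # No need to store the tuple when the opposite stream closed this key.
        if key not in self.closed[other]:
            self.state[side][key].append(payload)
        return results

    def on_punctuation(self, side, key):
        """`side` promises no more tuples with `key`: purge the opposite state."""
        self.closed[side].add(key)
        self.state[self._other(side)].pop(key, None)

    def state_size(self):
        """Total number of tuples currently held in both hash tables."""
        return sum(len(v) for table in self.state.values() for v in table.values())
```

    In this sketch, a punctuation on the left stream for key 1 lets the operator drop every right-stream tuple stored under key 1, because no future left tuple can join with them; later right tuples with that key are probed against the left state but never stored.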

    Reasoning in description logics using resolution and deductive databases


    A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data

    We describe a generic framework for representing and reasoning with annotated Semantic Web data, a task becoming more important with the recent increase in inconsistent and unreliable metadata on the web. We formalise the annotated language and the corresponding deductive system, and address the query answering problem. Previous contributions on specific RDF annotation domains are encompassed by our unified reasoning formalism, as we show by instantiating it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we provide a generic method for combining multiple annotation domains, which makes it possible to represent, e.g., temporally-annotated fuzzy RDF. Furthermore, we address the development of a query language -- AnQL -- that is inspired by SPARQL, including several features of SPARQL 1.1 (subqueries, aggregates, assignment, solution modifiers) along with the formal definitions of their semantics.

    Approximate Assertional Reasoning Over Expressive Ontologies

    In this thesis, approximate reasoning methods for scalable assertional reasoning are provided whose computational properties can be established in a well-understood way, namely in terms of soundness and completeness, and whose quality can be analyzed in terms of statistical measurements, namely recall and precision. The basic idea of these approximate reasoning methods is to speed up reasoning by trading off the quality of reasoning results against increased speed
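
    As a rough illustration of how recall and precision relate to completeness and soundness, the sketch below compares the answers of an approximate reasoner against those of a sound and complete one. The function name and the set-of-assertions representation are assumptions made for this example, not the thesis's evaluation code.

```python
def precision_recall(approximate, exact):
    """Compare approximate answers with the answers of a sound and complete reasoner.

    Precision 1.0 means every returned answer is correct (soundness on this query);
    recall 1.0 means no correct answer is missing (completeness on this query).
    """
    approximate, exact = set(approximate), set(exact)
    correct = approximate & exact
    # Empty answer sets are treated as trivially perfect to avoid division by zero.
    precision = len(correct) / len(approximate) if approximate else 1.0
    recall = len(correct) / len(exact) if exact else 1.0
    return precision, recall


# Hypothetical instance-retrieval result: sound but incomplete.
exact_answers = {"alice", "bob", "carol", "dave"}
approx_answers = {"alice", "bob"}
precision, recall = precision_recall(approx_answers, exact_answers)
```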

    Correctness-aware high-level functional matching approaches for semantic web services

    Existing service matching approaches trade precision for recall, creating the need for humans to choose the correct services, which is a major obstacle to automating the service matching and service aggregation processes. To overcome this problem, the matchmaker must automatically determine the correctness of the matching results according to the defined users' goals. That is, only services that achieve the users' goals are considered correct. This requires the high-level functional semantics of services, users, and application domains to be captured in a machine-understandable format. It also requires the matchmaker to determine the achievement of users' goals without invoking the services. We propose the G+ model to capture the high-level functional specifications of services and users (namely goals, achievement contexts and external behaviors), providing the basis for automated goal achievement determination; we also propose the concepts substitutability graph to capture the application domains' semantics. To avoid the false negatives that result from adopting existing constraint and behavior matching approaches during service matching, we also propose new constraint and behavior matching approaches that match constraints with different scopes and behavior models with different numbers of state transitions. Finally, we propose two correctness-aware matching approaches (direct and aggregate) that semantically match and aggregate semantic web services according to their G+ models, providing the required theoretical proofs and the corresponding verifying simulation experiments.

    Methods for Efficient and Accurate Discovery of Services

    With an increasing number of services developed and offered in an enterprise setting or on the Web, users can hardly verify their requirements manually in order to find appropriate services. In this thesis, we develop a method to discover semantically described services. We exploit comprehensive service and request descriptions such that a wide variety of use cases can be supported. In our discovery method, we compute the matchmaking decision by employing an efficient model checking technique.

    Efficient Optimally Lazy Algorithms for Minimal-Interval Semantics

    Minimal-interval semantics associates with each query over a document a set of intervals, called witnesses, that are incomparable with respect to inclusion (i.e., they form an antichain): witnesses define the minimal regions of the document satisfying the query. Minimal-interval semantics makes it easy to define and compute several sophisticated proximity operators, provides snippets for user presentation, and can be used to rank documents. In this paper we provide algorithms for computing conjunction and disjunction that are linear in the number of intervals and logarithmic in the number of operands; for additional operators, such as ordered conjunction and Brouwerian difference, we provide linear algorithms. In all cases, space is linear in the number of operands. More importantly, we define a formal notion of optimal laziness, and either prove it, or prove its impossibility, for each algorithm. We cast our results in a general framework of antichains of intervals on total orders, making our algorithms directly applicable to other domains. Comment: 24 pages, 4 figures. A preliminary (now outdated) version was presented at SPIRE 200
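
    To make the notion of witnesses concrete, the following brute-force sketch computes the minimal intervals for a conjunction of terms from their positions in a document. It is deliberately naive (it enumerates every combination of positions) and is not one of the paper's lazy algorithms; the function names are hypothetical.

```python
from itertools import product


def minimal_intervals(intervals):
    """Keep only intervals that do not contain another candidate interval.

    The result is an antichain with respect to inclusion, sorted by left endpoint.
    """
    candidates = set(intervals)

    def strictly_contains(outer, inner):
        return outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]

    return sorted(
        iv for iv in candidates
        if not any(strictly_contains(iv, other) for other in candidates)
    )


def and_witnesses(*position_lists):
    """Witnesses of a conjunction: minimal intervals covering one position per term."""
    # Every choice of one position per term spans a candidate interval.
    candidates = [(min(choice), max(choice)) for choice in product(*position_lists)]
    return minimal_intervals(candidates)


# Term "a" occurs at positions 0 and 5, term "b" at positions 2 and 9.
witnesses = and_witnesses([0, 5], [2, 9])
```

    In this example the candidate interval (0, 9) is discarded because it contains smaller candidates, leaving three pairwise incomparable witnesses: (0, 2), (2, 5) and (5, 9).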

    Presence Condition Reasoning with Feature Model Interfaces: Master’s Thesis

    Family-based static analysis techniques make it possible to efficiently analyze the exponential variant space of a software product line (SPL). Rather than analyzing each variant completely on its own, a family-based analysis processes information shared among multiple variants only once. The analysis delegates parts of the exponential complexity to reasoning about boolean presence conditions, which control the inclusion or exclusion of code fragments in a particular variant. The feature model of the SPL, which determines the set of valid variants, must be included in the reasoning process to obtain correct results. However, typically not all parts of a feature model are relevant for a particular condition. In this work, we propose a method to accelerate presence condition reasoning by decreasing the size of the model used for reasoning. In particular, we use the concept of feature model interfaces to decompose a feature diagram according to its hierarchical structure, and obtain an abstraction that can be selectively refined on the fly for a given condition. We formalize our approach in terms of propositional feature diagram semantics and prove its correctness. In our evaluation, we demonstrate that our approach accelerates reasoning by up to 24 percent.
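
    The point that the feature model must take part in presence condition reasoning can be shown with a toy example. The sketch below assumes a hypothetical three-feature model (mandatory root A with an alternative group B, C) and checks satisfiability by brute-force enumeration; it does not implement feature model interfaces or the thesis's decomposition.

```python
from itertools import product

# Hypothetical feature names used only for this illustration.
FEATURES = ("A", "B", "C")


def feature_model(cfg):
    """Root A is mandatory; B and C form an alternative group (exactly one is chosen)."""
    return cfg["A"] and (cfg["B"] != cfg["C"])


def satisfiable(condition, model=None):
    """Return True if some configuration satisfies the condition (and the model, if given)."""
    for values in product((False, True), repeat=len(FEATURES)):
        cfg = dict(zip(FEATURES, values))
        if condition(cfg) and (model is None or model(cfg)):
            return True
    return False


def needs_b_and_c(cfg):
    """Presence condition of a code fragment that requires both B and C."""
    return cfg["B"] and cfg["C"]


alone = satisfiable(needs_b_and_c)                       # ignores the feature model
with_model = satisfiable(needs_b_and_c, feature_model)   # respects valid variants only
```

    Taken alone, the condition is satisfiable, yet no valid variant selects both B and C, so the fragment is dead code once the feature model is included. Brute-force enumeration is exponential in the number of features, which is why shrinking the model used for reasoning matters in practice.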

    Explanation and diagnosis services for unsatisfiability and inconsistency in description logics

    Description Logics (DLs) are a family of knowledge representation formalisms with formal semantics and well-understood computational complexities. In recent years, they have found applications in many domains, including domain modeling, software engineering, configuration, and the Semantic Web. DLs have deeply influenced the design and standardization of the Web Ontology Language OWL. The acceptance of OWL as a web standard has reciprocally resulted in the widespread use of DL ontologies on the web. As more applications emerge with increasing complexity, non-standard reasoning services, such as explanation and diagnosis, have become important capabilities that a DL reasoner should provide. For example, unsatisfiability and inconsistency may arise in an ontology due to unintentional design defects or changes in the ontology evolution process. Without explanations, searching for the cause is like looking for a needle in a haystack. It is, therefore, surprising that most of the existing DL reasoners do not provide explanation services; they provide "Yes/No" answers to satisfiability or consistency queries without giving any reasons. This thesis presents our solution for providing explanation and diagnosis services for DL reasoners. We first propose a framework based on resolution to explain inconsistency and unsatisfiability in Description Logic. A sound and complete algorithm is developed to generate explanations for the DL language ALCHI based on the unsatisfiability and inconsistency patterns in ALCHI. We also develop a technique based on Shapley values to measure inconsistencies in ontologies for diagnosis purposes. This measure is used to identify which axioms in an input ontology, or which parts of these axioms, need to be repaired in order to make the input consistent. We also investigate optimization techniques to compute the inconsistency measures based on particular properties of DLs. Based on the above theoretical foundations, a running prototype system is implemented to evaluate the practicability of the proposed services. Our preliminary empirical results show that the resolution-based explanation framework and the diagnosis procedure based on inconsistency measures can be applied in real-world applications.
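
    The idea of using Shapley values to apportion blame for an inconsistency can be sketched on a much simpler setting than DL ontologies. The example below assumes a knowledge base of distinct propositional literals (strings such as "p" and "~p") and a drastic measure that is 1 for an inconsistent set and 0 otherwise; both the representation and the function names are assumptions of this sketch, not the thesis's implementation.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def inconsistent(literals):
    """A set of literals is inconsistent if it contains both x and ~x."""
    present = set(literals)
    return any(("~" + x) in present for x in present if not x.startswith("~"))


def drastic(literals):
    """Drastic inconsistency measure: 1 if inconsistent, 0 otherwise."""
    return 1 if inconsistent(literals) else 0


def shapley_inconsistency(kb, measure=drastic):
    """Shapley value of each formula with respect to the given inconsistency measure.

    Assumes the formulas in `kb` are pairwise distinct.
    """
    kb = tuple(kb)
    n = len(kb)
    values = {}
    for i, formula in enumerate(kb):
        rest = kb[:i] + kb[i + 1:]
        total = Fraction(0)
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
            for coalition in combinations(rest, size):
                # Marginal contribution of `formula` to the coalition's inconsistency.
                gain = measure(coalition + (formula,)) - measure(coalition)
                total += weight * gain
        values[formula] = total
    return values


blame = shapley_inconsistency(["p", "~p", "q"])
```

    Here the two conflicting literals share the blame equally (1/2 each), the unrelated literal "q" receives 0, and the values sum to the measure of the whole knowledge base, which points a repair procedure at the formulas worth changing.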