36,940 research outputs found
Middleware-based Database Replication: The Gaps between Theory and Practice
The need for high availability and performance in data management systems has
been fueling a long running interest in database replication from both academia
and industry. However, academic groups often attack replication problems in
isolation, overlooking the need for completeness in their solutions, while
commercial teams take a holistic approach that often misses opportunities for
fundamental innovation. This has created over time a gap between academic
research and industrial practice.
This paper aims to characterize the gap along three axes: performance,
availability, and administration. We build on our own experience developing and
deploying replication systems in commercial and academic settings, as well as
on a large body of prior related work. We sift through representative examples
from the last decade of open-source, academic, and commercial database
replication systems and combine this material with case studies from real
systems deployed at Fortune 500 customers. We propose two agendas, one for
academic research and one for industrial R&D, which we believe can bridge the
gap within 5-10 years. This way, we hope to both motivate and help researchers
in making the theory and practice of middleware-based database replication more
relevant to each other.Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on
Management of Data, Vancouver, Canada, June 200
Winnowing ontologies based on application use
The requirements of specific applications and services are often over estimated when ontologies are reused or built. This sometimes results in many ontologies being too large for their intended purposes. It is not uncommon that when applications and services are deployed over an ontology, only a few parts of the ontology are queried and used. Identifying which parts of an ontology are being used could be helpful to winnow the ontology, i.e., simplify or shrink the ontology to smaller, more fit for purpose size. Some approaches to handle this problem have already been suggested in the literature. However, none of that work showed how ontology-based applications can be used in the ontology-resizing process, or how they might be affected by it. This paper presents a study on the use of the AKT Reference Ontology by a number of applications and services,and investigates the possibility of relying on this usage information to winnow that ontology
Reviewer Integration and Performance Measurement for Malware Detection
We present and evaluate a large-scale malware detection system integrating
machine learning with expert reviewers, treating reviewers as a limited
labeling resource. We demonstrate that even in small numbers, reviewers can
vastly improve the system's ability to keep pace with evolving threats. We
conduct our evaluation on a sample of VirusTotal submissions spanning 2.5 years
and containing 1.1 million binaries with 778GB of raw feature data. Without
reviewer assistance, we achieve 72% detection at a 0.5% false positive rate,
performing comparable to the best vendors on VirusTotal. Given a budget of 80
accurate reviews daily, we improve detection to 89% and are able to detect 42%
of malicious binaries undetected upon initial submission to VirusTotal.
Additionally, we identify a previously unnoticed temporal inconsistency in the
labeling of training datasets. We compare the impact of training labels
obtained at the same time training data is first seen with training labels
obtained months later. We find that using training labels obtained well after
samples appear, and thus unavailable in practice for current training data,
inflates measured detection by almost 20 percentage points. We release our
cluster-based implementation, as well as a list of all hashes in our evaluation
and 3% of our entire dataset.Comment: 20 papers, 11 figures, accepted at the 13th Conference on Detection
of Intrusions and Malware & Vulnerability Assessment (DIMVA 2016
Ontology-based semantic interpretation of cylindricity specification in the next-generation GPS
Cylindricity specification is one of the most important geometrical specifications in geometrical product development. This specification can be referenced from the rules and examples in tolerance standards and technical handbooks in practice. These rules and examples are described in the form of natural language, which may cause ambiguities since different designers may have different understandings on a rule or an example.
To address the ambiguous problem, a categorical data model of cylindricity specification in the next-generation Geometrical Product Specifications (GPS) was proposed at the University of Huddersfield. The modeling language used in the categorical data model is category
language. Even though category language can develop a syntactically correct data model, it is difficult to interpret the semantics of the cylindricity specification explicitly. This paper proposes an ontology-based approach to interpret the semantics of cylindricity specification on
the basis of the categorical data model. A scheme for translating the category language to the OWL 2 Web Ontology Language (OWL 2) is presented in this approach. Through such a scheme, the categorical data model is translated into a semantically enriched model, i.e. an OWL 2
ontology for cylindricity specification. This ontology can interpret the semantics of cylindricity specification explicitly. As the benefits of such semantic interpretation, consistency checking, inference procedures and semantic queries can be performed on the OWL 2 ontology. The proposed approach could be easily extended to support the semantic interpretations of other kinds of geometrical specifications
A Semantic Collaboration Method Based on Uniform Knowledge Graph
The Semantic Internet of Things is the extension of the Internet of Things and the Semantic Web, which aims to build an interoperable collaborative system to solve the heterogeneous problems in the Internet of Things. However, the Semantic Internet of Things has the characteristics of both the Internet of Things and the Semantic Web environment, and the corresponding semantic data presents many new data features. In this study, we analyze the characteristics of semantic data and propose the concept of a uniform knowledge graph, allowing us to be applied to the environment of the Semantic Internet of Things better. Here, we design a semantic collaboration method based on a uniform knowledge graph. It can take the uniform knowledge graph as the form of knowledge organization and representation, and provide a useful data basis for semantic collaboration by constructing semantic links to complete semantic relation between different data sets, to achieve the semantic collaboration in the Semantic Internet of Things. Our experiments show that the proposed method can analyze and understand the semantics of user requirements better and provide more satisfactory outcomes
Analyzing the Tagging Quality of the Spanish OpenStreetMap
In this paper, a framework for the assessment of the quality of OpenStreetMap is presented, comprising a batch of methods to analyze the quality of entity tagging. The approach uses Taginfo as a reference base and analyses quality measures such as completeness, compliance, consistence, granularity, richness and trust . The framework has been used to analyze the quality of OpenStreetMap in Spain, comparing the main cities of Spain. Also a comparison between Spain and some major European cities has been carried out. Additionally, a Web tool has been also developed in order to facilitate the same kind of analysis in any area of the world
- âŠ