Automated schema matching techniques: an exploratory study
Manual schema matching is a problem for many database applications that use multiple data sources, including data warehousing and e-commerce applications. Current research attempts to address this problem by developing algorithms that automate aspects of the schema-matching task. In this paper, an approach that uses an external dictionary to facilitate automated discovery of the semantic meaning of database schema terms, called SemMA, is proposed. An experimental study was conducted to evaluate the performance and accuracy of five schema-matching techniques under the proposed approach. The proposed approach and results are compared with two existing semi-automated schema-matching approaches, and suggestions for future research are made.
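As a hedged illustration of the kind of dictionary-based matching the abstract describes (SemMA itself is not public, so the scoring function, threshold, and example terms below are assumptions), a minimal sketch using WordNet as the external dictionary:

```python
# Illustrative sketch of dictionary-based schema matching in the spirit of
# SemMA; names and the scoring scheme are assumptions, not the paper's
# algorithm. Requires: pip install nltk, then nltk.download('wordnet').
from nltk.corpus import wordnet as wn

def synonym_set(term: str) -> set[str]:
    """Collect the term plus all WordNet lemma names across its senses."""
    names = {term.lower()}
    for synset in wn.synsets(term):
        names.update(lemma.lower() for lemma in synset.lemma_names())
    return names

def match_score(term_a: str, term_b: str) -> float:
    """Jaccard overlap of the two terms' dictionary synonym sets."""
    a, b = synonym_set(term_a), synonym_set(term_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# 'client' and 'customer' share WordNet senses, so they should score
# noticeably higher than unrelated schema terms such as 'client'/'invoice'.
print(match_score("client", "customer"))  # relatively high
print(match_score("client", "invoice"))   # near zero
```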
RelBAC: Relation Based Access Control
Web 2.0, GRID applications and, more recently, semantic desktop applications are bringing the Web to a situation where more and more data and metadata are shared and made available to large user groups. In this context, metadata may be tags or complex graph structures such as file system or web directories, or (lightweight) ontologies. In turn, users can themselves be tagged by certain properties, and can be organized in complex directory structures, very much in the same way as data. Things are further complicated by the highly unpredictable and autonomous dynamics of data, users, permissions and access control rules. In this paper we propose a new access control model and logic, called RelBAC (for Relation Based Access Control), which allows us to deal with this novel scenario. The key idea, which differentiates RelBAC from the state of the art, e.g., Role Based Access Control (RBAC), is that permissions are modeled as relations between users and data, while access control rules are their instantiations on specific sets of users and objects. As such, access control rules are assigned an arity, which allows fine tuning of which users can access which data, and can evolve independently, according to the desires of the policy manager(s). Furthermore, the formalization of the RelBAC model as an Entity-Relationship (ER) model allows for its direct translation into Description Logics (DL). In turn, this allows us to reason, possibly at run time, about access control policies.
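A minimal sketch of RelBAC's key idea, that a permission is a relation between users and data and that rules instantiate it on specific sets; the class and rule shapes below are our own illustration, not the paper's ER/Description Logic formalization:

```python
# Sketch: permissions as relations between user sets and object sets.
from dataclasses import dataclass, field

@dataclass
class Permission:
    """A permission is a named relation; each rule instantiates it on a
    specific (user set, object set) pair, as RelBAC proposes."""
    name: str
    rules: list[tuple[frozenset, frozenset]] = field(default_factory=list)

    def grant(self, users: set[str], objects: set[str]) -> None:
        self.rules.append((frozenset(users), frozenset(objects)))

    def allows(self, user: str, obj: str) -> bool:
        return any(user in us and obj in objs for us, objs in self.rules)

read = Permission("Read")
read.grant({"alice", "bob"}, {"report.pdf", "budget.xls"})
print(read.allows("alice", "report.pdf"))   # True
print(read.allows("carol", "report.pdf"))   # False
```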
Improving Data Integration through Disambiguation Techniques
In this paper, the Word Sense Disambiguation (WSD) issue in the context of data integration is outlined, and an Approximate Word Sense Disambiguation (AWSD) approach is proposed for the automatic lexical annotation of structured and semi-structured data sources.
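A small sketch of WSD-driven lexical annotation in this spirit, using NLTK's simplified Lesk as a stand-in for the AWSD algorithm (which the abstract does not specify); the schema labels used as context are invented:

```python
# Annotate a schema element's label with a WordNet sense, using its
# neighbouring labels as the disambiguation context. Lesk here is an
# assumed stand-in, not the paper's AWSD method.
from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

context = "order customer address shipping invoice".split()
sense = lesk(context, "order", pos=wn.NOUN)
if sense:
    print(sense.name(), "-", sense.definition())
```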
Enhanced normalization approach addressing stop-word complexity in compound-word schema labels
An extensive review of existing schema matching approaches revealed room for improvement in the field of semantic schema matching. Normalization and lexical annotation methods using WordNet have been somewhat successful in general cases; however, in the presence of stop-words these approaches yield poor accuracy. Stop-words have previously been ignored in most studies, resulting in false-negative conclusions. This paper proposes NORMSTOP (NORMalizer of schemata having STOP-words), an improved schema normalization approach that addresses the complexity of stop-words (e.g. 'by', 'at', 'and', 'or') in Compound Word (CW) schema labels. Using a combined set of WordNet features, NORMSTOP isolates these labels during the preprocessing stage and resets the base form to a relevant WordNet term or an annotatable compound noun. When tested on the same real dataset used by the earlier approach, NORMS (NORMalizer of Schemata), NORMSTOP shows up to a 13% improvement in annotation recall. This level of improvement takes the overall schema matching process another step closer to perfect accuracy, while its absence exposes a gap in expectations, especially in today's databases, where stop-words are in abundance.
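A rough sketch of the normalization step described above, under our own assumptions about tokenization and the stop-word list (the paper's exact algorithm and WordNet feature set are not reproduced here):

```python
# Sketch of stop-word-aware normalization of compound-word schema labels.
from nltk.corpus import wordnet as wn

STOP_WORDS = {"by", "at", "and", "or", "of", "the"}

def normalize_label(label: str) -> str:
    """If the whole label is an annotatable WordNet compound, keep it;
    otherwise drop stop-words and rejoin the content tokens."""
    tokens = label.lower().replace("-", "_").split("_")
    compound = "_".join(tokens)
    if wn.synsets(compound):      # e.g. 'point_of_view' is a WordNet noun
        return compound
    content = [t for t in tokens if t not in STOP_WORDS]
    return "_".join(content)

print(normalize_label("point_of_view"))    # kept as a compound noun
print(normalize_label("shipped_by_date"))  # reduced to 'shipped_date'
```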
Challenges in Integrating Biological Data Sources
In this report, we examine the technical challenges to integration, critique the available tools and resources, and compare the costs and advantages of various methodologies. We begin by analyzing the basic steps in strict and complete integration: (1) transformation of the various schemas to a common data model; (2) matching of semantically related schema objects; (3) schema integration; (4) transformation of data to the federated database on demand; and (5) matching of semantically equivalent data. Some progress has been made on generic problems such as (1) and (3) within the wider database community, but issues of semantics (steps (2) and (5)) have been dealt with to any degree of success only by domain experts within the biological community. We then look at the solution space of integration strategies as defined by two axes, the "tightness" of federation and the "degree" of instantiation, discuss where various solutions fall on this plane, and examine their costs and advantages/disadvantages. Finally, we examine technical challenges that are not …
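As a toy illustration of steps (1) and (4) of this recipe, a sketch in which each source registers a transformation into a common data model and the federation applies it on demand; the source names and field names are invented:

```python
# Hypothetical on-demand transformation into a common data model.
from typing import Callable

# Each source registers a record -> common-model mapping (assumed fields).
TO_COMMON: dict[str, Callable[[dict], dict]] = {
    "source_a": lambda r: {"gene": r["locus"], "organism": r["species"]},
    "source_b": lambda r: {"gene": r["gene_symbol"], "organism": r["org"]},
}

def fetch(source: str, record: dict) -> dict:
    """Transform a source record into the federated schema on demand."""
    return TO_COMMON[source](record)

print(fetch("source_a", {"locus": "BRCA1", "species": "H. sapiens"}))
print(fetch("source_b", {"gene_symbol": "BRCA1", "org": "H. sapiens"}))
```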
Semantic Integration Approach to Efficient Business Data Supply Chain: Integration Approach to Interoperable XBRL
As an open standard for electronic communication of business and financial data, XBRL has the potential to improve the efficiency of the business data supply chain. A number of jurisdictions have developed different XBRL taxonomies as their data standards. Semantic heterogeneity exists in these taxonomies, the corresponding instances, and the internal systems that store the original data. Consequently, there are still substantial difficulties in creating and using XBRL instances that involve multiple taxonomies. To fully realize the potential benefits of XBRL, we have to develop technologies to reconcile semantic heterogeneity and enable interoperability of various parts of the supply chain. In this paper, we analyze the XBRL standard and use examples of different taxonomies to illustrate the interoperability challenge. We also propose a technical solution that incorporates schema matching and context mediation techniques to improve the efficiency of the production and consumption of XBRL data.
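An illustrative sketch of the schema-matching-plus-context-mediation idea: map an element from one taxonomy to another and reconcile the reported value across contexts. The element names, scale factors, and exchange rate below are invented for the example, not taken from the paper:

```python
# Hypothetical mediator between two XBRL-style taxonomies and contexts.
ELEMENT_MAP = {
    # source taxonomy element -> target taxonomy element (assumed mapping)
    "us-gaap:NetIncomeLoss": "ifrs:ProfitLoss",
}

def mediate(element: str, value: float,
            src_ctx: dict, dst_ctx: dict) -> tuple[str, float]:
    """Rename the element and reconcile scale and currency across contexts."""
    target = ELEMENT_MAP[element]
    value *= src_ctx["scale"] / dst_ctx["scale"]   # e.g. thousands -> units
    if src_ctx["currency"] != dst_ctx["currency"]:
        value *= src_ctx["rate_to"][dst_ctx["currency"]]
    return target, value

src = {"scale": 1_000, "currency": "USD", "rate_to": {"EUR": 0.9}}
dst = {"scale": 1, "currency": "EUR"}
print(mediate("us-gaap:NetIncomeLoss", 2_500.0, src, dst))
# -> ('ifrs:ProfitLoss', 2250000.0)
```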
Survey on Techniques for Ontology Interoperability in Semantic Web
An ontology is a shared conceptualization of the knowledge of a particular domain, used to make semantic information explicit, and it is considered a key element in Semantic Web development. Because of the dynamic nature of the web, creating a single global web data source is impossible; ontology interoperability instead provides for the reusability of ontologies. Different domain experts and ontology engineers create different ontologies for the same or similar domains depending on their data modeling requirements, which causes ontology heterogeneity and inconsistency problems. Ontology mapping is the solution for obtaining better and more precise results. As the use of ontologies has increased, providing means of resolving semantic differences has also become very important. Papers on ontology interoperability report results on different frameworks, which makes their comparison almost impossible. Therefore, the main focus of this paper is on providing some basics of ontology interoperability and briefly introducing its different approaches. In this paper we survey the approaches that have been proposed for providing interoperability among domain ontologies, together with the related techniques and tools.
Solving semantic ambiguity to improve semantic web based ontology matching
A new paradigm in Semantic Web research focuses on the development of a new generation of knowledge-based problem solvers, which can exploit the massive amounts of formally specified information available on the Web to produce novel intelligent functionalities. An important example of this paradigm can be found in the area of Ontology Matching, where new algorithms, which derive mappings from an exploration of multiple and heterogeneous online ontologies, have been proposed. While these algorithms exhibit very good performance, they rely on merely syntactic techniques to anchor the terms to be matched to those found on the Semantic Web. As a result, their precision can be affected by ambiguous words. In this paper, we aim to solve these problems by introducing techniques from Word Sense Disambiguation, which validate the mappings by exploring the semantics of the ontological terms involved in the matching process. Specifically, we discuss how two techniques, which exploit the ontological context of the matched and anchor terms and the information provided by WordNet, can be used to filter out mappings resulting from the incorrect anchoring of ambiguous terms. Our experiments show that each of the proposed disambiguation techniques, and even more their combination, can lead to an important increase in precision, without having too negative an impact on recall.
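A hedged sketch of the kind of disambiguation filter described here: a mapping is kept only if the matched term and its anchor term, each disambiguated against its own ontological context, end up in related WordNet senses. The contexts, terms, similarity measure, and threshold are illustrative assumptions, not the paper's exact techniques:

```python
# Sketch: validate an ontology mapping by sense agreement in WordNet.
from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

def keep_mapping(term: str, term_ctx: list[str],
                 anchor: str, anchor_ctx: list[str],
                 threshold: float = 0.8) -> bool:
    """Disambiguate each term in its own context, then keep the mapping
    only if the chosen senses are sufficiently similar."""
    s1 = lesk(term_ctx, term, pos=wn.NOUN)
    s2 = lesk(anchor_ctx, anchor, pos=wn.NOUN)
    if not (s1 and s2):
        return False          # cannot validate -> reject (or flag) mapping
    sim = s1.wup_similarity(s2)
    return sim is not None and sim >= threshold

# 'bank' anchored among financial concepts should not validate against a
# 'bank' concept whose neighbours are river-related.
print(keep_mapping("bank", ["money", "account", "loan"],
                   "bank", ["river", "water", "slope"]))
```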