267 research outputs found

    Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference

    No abstract available

    A Cooperative Approach for Composite Ontology Matching

    Ontologies have proven to be an essential element in a range of applications in which knowledge plays a key role. Resolving the semantic heterogeneity problem is crucial to allow interoperability between ontology-based systems. This makes automatic ontology matching, as an anticipated solution to semantic heterogeneity, an important research issue. Many different approaches to the matching problem have emerged from the literature. An important issue in ontology matching is to find effective ways of choosing among many techniques and their variations, and then combining their results. An innovative and promising option is to formalize the combination of matching techniques using agent-based approaches, such as cooperative negotiation and argumentation. In this thesis, a formalization of the ontology matching problem following an agent-based approach is proposed. The proposal is evaluated using state-of-the-art data sets. The results show that the consensus obtained by negotiation and argumentation represents intermediate values that are closer to the best matcher. As the best matcher may vary depending on specific differences between data sets, cooperative approaches are an advantage.
    *** RESUMO - Ontologies are essential elements in knowledge-based systems. Resolving the semantic heterogeneity problem is fundamental to enabling interoperability between ontology-based systems. Automatic ontology mapping can be seen as a solution to this problem. Different and complementary approaches to the problem have been proposed in the literature. An important aspect of mapping is selecting the appropriate set of approaches and their variations, and then combining their results. A promising option is to formalize the combination of mapping techniques using approaches based on cooperative agents, such as negotiation and argumentation. In this thesis, a formalization of the problem of combining mapping techniques using such approaches is proposed and evaluated. The evaluation, which uses test sets suggested by the scientific community, leads to the conclusion that the consensus obtained by negotiation and argumentation does not exactly improve on all individual results, but represents intermediate values that are close to the best technique. Since the best technique may vary depending on specific differences between data sets, cooperative approaches are an advantage
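The idea of combining several matchers into one alignment, as described in the abstract above, can be illustrated with a very small sketch. This is not the thesis's negotiation or argumentation protocol: plain score averaging is swapped in as a stand-in, and the matcher names, entity pairs, scores, and threshold are all hypothetical.

```python
from statistics import mean

# Hypothetical similarity scores from three matchers.
# Keys are (source_entity, target_entity) pairs; values are scores in [0, 1].
lexical = {("Author", "Writer"): 0.4, ("Paper", "Article"): 0.5, ("Paper", "Person"): 0.1}
structural = {("Author", "Writer"): 0.8, ("Paper", "Article"): 0.75, ("Paper", "Person"): 0.3}
semantic = {("Author", "Writer"): 0.9, ("Paper", "Article"): 1.0, ("Paper", "Person"): 0.2}


def combine(matchers, threshold=0.5):
    """Keep a correspondence when its mean score across matchers reaches the threshold.

    Averaging is an assumed stand-in for a consensus step; a pair that a
    matcher did not score counts as 0.0 for that matcher.
    """
    candidates = set().union(*matchers)  # every pair proposed by any matcher
    consensus = {}
    for pair in candidates:
        score = mean(m.get(pair, 0.0) for m in matchers)
        if score >= threshold:
            consensus[pair] = score
    return consensus


alignment = combine([lexical, structural, semantic])
```

In this toy setting the combined score for each pair lies between the lowest and highest individual scores, which loosely mirrors the abstract's observation that the consensus yields intermediate values.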

    Extensible metadata management framework for personal data lake

    Common Internet users today are inundated with a deluge of diverse data being generated and siloed in a variety of digital services, applications, and a growing body of personal computing devices as we enter the era of the Internet of Things. Alongside potential privacy compromises, users are facing increasing difficulties in managing their data and are losing control over it. There appears to be a de facto agreement in business and scientific fields that there is critical new value and interesting insight that can be attained by users from analysing their own data, if only it can be freed from its silos and combined with other data in meaningful ways. This thesis takes the point of view that users should have an easy-to-use modern personal data management solution that enables them to centralise and efficiently manage their data by themselves, under their full control, for their best interests, with minimal time and effort. In that direction, we describe the basic architecture of a management solution that is designed based on solid theoretical foundations and state-of-the-art big data technologies. This solution (called Personal Data Lake - PDL) collects the data of a user from a plurality of heterogeneous personal data sources and stores it in a highly scalable schema-less storage repository. To simplify the user experience of PDL, we propose a novel extensible metadata management framework (MMF) that: (i) annotates heterogeneous data with rich lineage and semantic metadata, (ii) exploits the garnered metadata for automating data management workflows in PDL, with an extensive focus on data integration, and (iii) facilitates the use and reuse of the stored data for various purposes by querying it at the metadata level, either directly by the user or through third-party personal analytics services. We first show how the proposed MMF is positioned in the PDL architecture, and then describe its principal components. Specifically, we introduce a simple yet effective lineage manager for tracking the provenance of personal data in PDL. We then introduce an ontology-based data integration component called SemLinker, which comprises two new algorithms: the first generates graph-based representations to express the native schemas of (semi-)structured personal data, and the second metamodels the extracted representations to a common extensible ontology. SemLinker outputs are utilised by MMF to generate user-tailored unified views that are optimised for querying heterogeneous personal data through low-level SPARQL or high-level SQL-like queries. Next, we introduce an unsupervised automatic keyphrase extraction algorithm called SemCluster that specialises in extracting thematically important keyphrases from unstructured data, and associating each keyphrase with ontological information drawn from an extensible WordNet-based ontology. SemCluster outputs serve as semantic metadata and are utilised by MMF to annotate unstructured contents in PDL, thus enabling various management functionalities such as relationship discovery and semantic search. Finally, we describe how MMF can be utilised to perform holistic integration of personal data and to query it jointly in its native representations

    Applying Wikipedia to Interactive Information Retrieval

    There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases (e.g. controlled vocabularies, classification schemes, thesauri and ontologies) to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday web-scale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval

    An Ontology Based Approach Towards A Universal Description Framework for Home Networks

    Current home networks typically involve two or more machines sharing network resources. The vision for the home network has grown from a simple computer network to everyday appliances embedded with network capabilities. In this environment, devices and services within the home can interoperate, regardless of protocol or platform. Network clients can discover required resources by performing network discovery over component descriptions. Common approaches to this discovery process involve simple matching of keywords or attribute/value pairings. Interest emerging from the Semantic Web community has led to ontology languages being applied to network domains, providing a logical and semantically rich approach to both describing and discovering network components. In much of the existing work within this domain, developers have focused on defining new description frameworks in isolation from existing protocol frameworks and vocabularies. This work proposes an ontology-based description framework which takes the ontology approach a step further: existing description frameworks are incorporated into the ontology-based framework, allowing discovery mechanisms to cover multiple existing domains. In this manner, existing protocols and networking approaches can participate in semantically rich discovery processes. The framework also includes a system architecture developed for the purpose of reconciling existing home network solutions with the ontology-based discovery process. This work also describes an implementation of the approach, deployed within a home-network environment. The implementation involves existing home networking frameworks, protocols and components, allowing the claims of this work to be examined and evaluated from a ‘real-world’ perspective
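The baseline that the abstract above contrasts with ontology-based discovery, simple attribute/value matching over component descriptions, can be sketched as follows. The device descriptions, attribute names, and the `discover` helper are hypothetical and are not taken from the work itself.

```python
# Hypothetical component descriptions, each a flat set of attribute/value pairs.
devices = [
    {"name": "living-room-tv", "type": "display", "protocol": "UPnP"},
    {"name": "hall-printer", "type": "printer", "protocol": "Bonjour"},
    {"name": "kitchen-screen", "type": "display", "protocol": "Bonjour"},
]


def discover(descriptions, **required):
    """Return the names of components whose description contains
    every required attribute/value pair (exact string equality)."""
    return [
        d["name"]
        for d in descriptions
        if all(d.get(key) == value for key, value in required.items())
    ]


displays = discover(devices, type="display")
upnp_displays = discover(devices, type="display", protocol="UPnP")
screens = discover(devices, type="screen")  # a synonym of "display" finds nothing
```

Because the comparison is purely syntactic, a query for `screen` misses components described as `display`, and descriptions written in different protocol vocabularies cannot be matched against each other. That kind of gap is what a semantically richer description framework aims to close.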

    Ontological View-driven Semantic Integration in Open Environments

    In an open computing environment, such as the World Wide Web or an enterprise intranet, various information systems are expected to work together to support information exchange, processing, and integration. However, information systems are usually built by different people, at different times, to fulfil different requirements and goals. Consequently, in the absence of an architectural framework geared toward semantic integration, there are widely varying viewpoints and assumptions regarding what is essentially the same subject. Communication among the components supporting various applications is therefore not possible without at least some translation. This problem, however, goes well beyond simple agreement on tags or mappings between roughly equivalent sets of tags in related standards. Industry-wide initiatives and academic studies have shown that complex representation issues can arise. To deal with these issues, a deep understanding and appropriate treatment of semantic integration is needed. Ontology is an important and widely accepted approach to semantic integration. However, information systems usually have no explicit ontologies. Rather, the associated semantics are implied within the supporting information model, which reflects a specific view of the conceptualization and thereby implicitly defines an ontological view. This research proposes to adopt ontological views to facilitate semantic integration for information systems in open environments. It proposes a theoretical foundation of ontological views, practical assumptions, and related solutions for research issues. The proposed solutions mainly focus on three aspects: the architecture of a semantic integration enabled environment, ontological view modeling and representation, and semantic equivalence relationship discovery. The solutions are applied to the collaborative intelligence project for the collaborative promotion / advertisement domain. Various quality aspects of the solutions are evaluated and future directions of the research are discussed