92 research outputs found

    Visualizing Statistical Linked Knowledge for Decision Support

    In a global and interconnected economy, decision makers often need to consider information from various domains. A tourism destination manager, for example, has to correlate tourist behavior with financial and environmental indicators to allocate funds for strategic long-term investments. Statistical data underpins a broad range of such cross-domain decision tasks. A variety of statistical datasets are available as Linked Open Data and are often incorporated into visual analytics solutions to support decision making. What are the principles, architectures, workflows and implementation design patterns that should be followed for building such visual cross-domain decision support systems? This article introduces a methodology for integrating and visualizing cross-domain statistical data sources by applying selected RDF Data Cube (QB) principles. A visual dashboard built according to this methodology is presented and evaluated in the context of two use cases in the tourism and telecommunications domains.
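
    The RDF Data Cube vocabulary mentioned above models each statistical data point as a qb:Observation described by dimension and measure properties. As a rough illustration of how such a source can be consumed, here is a minimal Python sketch using rdflib; the Turtle file name is a hypothetical placeholder, not taken from the article.

```python
# Minimal sketch: reading observations from an RDF Data Cube (QB) dataset.
# "tourism-statistics.ttl" is a hypothetical placeholder file.
from rdflib import Graph

g = Graph()
g.parse("tourism-statistics.ttl", format="turtle")

# Select each observation together with its dimension/measure values.
query = """
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?obs ?prop ?value WHERE {
    ?obs a qb:Observation ;
         ?prop ?value .
}
"""
for obs, prop, value in g.query(query):
    print(obs, prop, value)
```

    A visualization layer would group these rows by dimension (e.g. year, region) before charting the measures.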

    Design of a Controlled Language for Critical Infrastructures Protection

    We describe a project for the construction of a controlled language for critical infrastructures protection (CIP). This project originates from the need to coordinate and categorize communications on CIP at the European level. These communications can be physically represented by official documents, reports on incidents, informal communications and plain e-mail. We explore the application of traditional library science tools for the construction of controlled languages in order to achieve our goal. Our starting point is an analogous work done during the sixties in the field of nuclear science, known as the Euratom Thesaurus.

    Human-competitive automatic topic indexing

    Topic indexing is the task of identifying the main topics covered by a document. These are useful for many purposes: as subject headings in libraries, as keywords in academic publications and as tags on the web. Knowing a document's topics helps people judge its relevance quickly. However, assigning topics manually is labor intensive. This thesis shows how to generate them automatically in a way that competes with human performance. Three kinds of indexing are investigated: term assignment, a task commonly performed by librarians, who select topics from a controlled vocabulary; tagging, a popular activity of web users, who choose topics freely; and a new method of keyphrase extraction, where topics are equated to Wikipedia article names. A general two-stage algorithm is introduced that first selects candidate topics and then ranks them by significance based on their properties. These properties draw on statistical, semantic, domain-specific and encyclopedic knowledge. They are combined using a machine learning algorithm that models human indexing behavior from examples. This approach is evaluated by comparing automatically generated topics to those assigned by professional indexers and by amateurs. We claim that the algorithm is human-competitive because it chooses topics that are as consistent with those assigned by humans as their topics are with each other. The approach is generalizable, requires little training data and applies across different domains and languages.
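
    The two-stage scheme (candidate selection, then learned ranking) can be sketched as below. This is a toy Python illustration: the two features, the Naive Bayes ranker and the toy training vectors are assumptions for demonstration, not the thesis's actual feature set or model.

```python
# Sketch of two-stage topic indexing: (1) select candidate topics,
# (2) rank them with a model trained on human-indexed examples.
# Features and classifier here are illustrative assumptions.
import re
from sklearn.naive_bayes import GaussianNB

def candidates(text, vocabulary):
    """Stage 1: candidates = lowercase vocabulary terms whose words occur in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [term for term in vocabulary if set(term.split()) <= words]

def features(term, text):
    """Stage 2 features: occurrence count and relative position of first occurrence."""
    lower = text.lower()
    return [lower.count(term), lower.find(term) / max(len(lower), 1)]

# Train on feature vectors of topics that human indexers did / did not assign.
model = GaussianNB()
model.fit([[3, 0.1], [1, 0.9]], [1, 0])  # toy labelled examples

def index(text, vocabulary, k=5):
    cands = candidates(text, vocabulary)
    if not cands:
        return []
    scored = model.predict_proba([features(c, text) for c in cands])[:, 1]
    return [c for _, c in sorted(zip(scored, cands), reverse=True)][:k]
```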

    Aligning Controlled vocabularies for enabling semantic matching in a distributed knowledge management system

    The underlying idea of the Semantic Web is that web content should be expressed not only in natural language but also in a language that can be unambiguously understood, interpreted and used by software agents, thus permitting them to find, share and integrate information more easily. Central to the Semantic Web are ontologies: shared vocabularies providing taxonomies of concepts, objects and relationships between them, which describe particular domains of knowledge. A vocabulary stores words, synonyms, word sense definitions (i.e. glosses), and relations between word senses and concepts; such a vocabulary is generally referred to as a Controlled Vocabulary (CV) if the choice or selection of terms is done by domain specialists. A facet is a distinct, dimensional feature of a concept or a term that allows a taxonomy, ontology or CV to be viewed or ordered in multiple ways, rather than in a single way. A facet is clearly defined, mutually exclusive, and composed of collectively exhaustive properties or characteristics of a domain. For example, a collection of rice might be represented using a name facet, a place facet, etc. This thesis presents a methodology for producing mappings between Controlled Vocabularies based on a technique called "Hidden Semantic Matching". It is "hidden" in that it does not rely on any externally provided background knowledge: the sole exploited knowledge comes from the "semantic context" of the CVs being matched. We build a facet for each concept of these CVs, considering more general concepts (broader terms), less general concepts (narrower terms) and related concepts (related terms). Together these form a concept facet (CF), which is then used to boost the matching process.
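
    A minimal sketch of the concept-facet idea follows, under the assumption that facet overlap is scored with Jaccard similarity; the abstract does not specify the actual scoring function, so this measure and the toy vocabularies are illustrative.

```python
# Build a concept facet (CF) from broader, narrower and related terms,
# then score two facets by Jaccard overlap (an assumed measure).
def concept_facet(concept, broader, narrower, related):
    """Union of broader, narrower and related terms for one CV concept."""
    return (set(broader.get(concept, []))
            | set(narrower.get(concept, []))
            | set(related.get(concept, [])))

def facet_similarity(facet_a, facet_b):
    """Jaccard overlap between two concept facets."""
    if not facet_a or not facet_b:
        return 0.0
    return len(facet_a & facet_b) / len(facet_a | facet_b)

# Toy CVs: 'rice' and 'paddy' share most of their semantic context.
bt = {"rice": ["cereal"], "paddy": ["cereal"]}
nt = {"rice": ["basmati"], "paddy": ["basmati"]}
rt = {"rice": ["grain"], "paddy": ["farming"]}
a = concept_facet("rice", bt, nt, rt)
b = concept_facet("paddy", bt, nt, rt)
print(facet_similarity(a, b))  # 0.5
```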

    Ontology driven information retrieval.

    Ontology-driven information retrieval deals with the use of entities specified in domain ontologies to enhance search and browse. The entities or concepts of lightweight ontological resources are traditionally used to index resources in specialised domains. Indexing with concepts is often achieved manually, and reusing them to enhance search remains a challenge. Other challenges range from the difficulty of merging multiple ontologies for use in retrieval to the problem of integrating concept-based search into existing search systems. We mainly encounter these challenges in enterprise search environments, which have not kept pace with Web search engines and mostly rely on full-text search systems. Full-text search systems are keyword-based and suffer from well-known vocabulary mismatch problems. Ontologies model domain knowledge and have the potential for use in understanding the unstructured content of documents. In this thesis, we investigate the challenges of using domain ontologies for enhancing search in enterprise systems. Firstly, we investigate methods for annotating documents by identifying the best concepts that represent their contents. We explore ways to overcome the challenges of insufficient textual features in lightweight ontologies and introduce an unsupervised method for annotating documents based on generating concept descriptors from external resources. Specifically, we augment concepts with descriptive textual content by exploiting the taxonomic structure of an ontology to ensure that we generate useful descriptors. Secondly, the need often arises for cross-ontology reasoning when using multiple ontologies in ontology-driven search. Once again, we attempt to overcome the absence of rich features in lightweight ontologies by exploring the use of background knowledge for the alignment process. We propose novel ontology alignment techniques which integrate string metrics, semantic features, and term weights for discovering diverse correspondence types in supervised and unsupervised ontology alignment. Thirdly, we investigate different representational schemes for queries and documents and explore semantic ranking models using conceptual representations. Accordingly, we propose a semantic ranking model that incorporates the knowledge of concept relatedness, and a predictive model to apply semantic ranking only when it is deemed beneficial for retrieval. Finally, we conduct comprehensive evaluations of the proposed methods and discuss our findings.
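
    As a concrete illustration of the semantic ranking idea, the sketch below interpolates a keyword score with concept relatedness between a query's concepts and a document's annotations. The relatedness table, the blend weight alpha and all names are illustrative assumptions; the thesis's predictive model for deciding when to apply semantic ranking is not shown.

```python
# Sketch: blend keyword evidence with query-document concept relatedness.
def semantic_score(query_concepts, doc_concepts, relatedness):
    """Average best relatedness of each query concept to the document's concepts."""
    if not query_concepts or not doc_concepts:
        return 0.0
    best = [max(relatedness.get((q, d), 0.0) for d in doc_concepts)
            for q in query_concepts]
    return sum(best) / len(best)

def rank(docs, query_concepts, keyword_scores, relatedness, alpha=0.6):
    """Rank documents by a linear blend; alpha weights the keyword score."""
    def score(doc_id):
        return (alpha * keyword_scores.get(doc_id, 0.0)
                + (1 - alpha) * semantic_score(query_concepts,
                                               docs[doc_id], relatedness))
    return sorted(docs, key=score, reverse=True)

# Toy example: concept relatedness rescues a vocabulary mismatch.
rel = {("heart attack", "myocardial infarction"): 0.9}
docs = {"d1": ["myocardial infarction"], "d2": ["fracture"]}
print(rank(docs, ["heart attack"], {"d1": 0.2, "d2": 0.3}, rel))  # ['d1', 'd2']
```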

    Generic adaptation framework for unifying adaptive web-based systems

    The Generic Adaptation Framework (GAF) research project first and foremost creates a common formal framework for describing current and future adaptive hypermedia systems (AHS) and adaptive web-based systems in general. It provides a commonly agreed upon taxonomy and a reference model that encompasses the most general architectures of the present and future, including conventional AHS and different types of personalization-enabling systems and applications, such as recommender systems (RS), personalized web search, semantic web enabled applications used in personalized information delivery, adaptive e-Learning applications and many more. At the same time, GAF tries to bring together two (seemingly non-intersecting) views on adaptation: the classical pre-authored type, with conventional domain and overlay user models, and data-driven adaptation, which draws on a set of data mining, machine learning and information retrieval tools. To bring these research fields together we conducted a number of GAF compliance studies covering RS, AHS and other applications combining adaptation, recommendation and search. We also performed a number of case studies of real systems to prove the point and perform a detailed analysis and evaluation of the framework. Secondly, the project introduces a number of new ideas in the field of AH, such as the Generic Adaptation Process (GAP), which aligns with a layered (data-oriented) architecture and serves as a reference adaptation process. This also helps explain the compliance features mentioned earlier. Besides that, GAF deals with important and novel aspects of adaptation enabling and leveraging technologies such as provenance and versioning. The existence of such a reference basis should stimulate AHS research and enable researchers to demonstrate ideas for new adaptation methods much more quickly than if they had to start from scratch. GAF will thus help bootstrap any adaptive web-based system research, design, analysis and evaluation.
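
    The "conventional domain and overlay user models" referenced above can be made concrete with a small sketch: an overlay user model mirrors the domain model with a per-concept knowledge estimate that drives adaptation decisions. The update rule and thresholds below are illustrative assumptions, not part of GAF itself.

```python
# Sketch of a domain model plus overlay user model, the classical
# pre-authored adaptation setup. Concepts map to their prerequisites.
domain_model = {"html": [], "css": ["html"], "javascript": ["html"]}

class OverlayUserModel:
    def __init__(self, domain):
        self.domain = domain
        self.knowledge = {concept: 0.0 for concept in domain}  # 0 = unknown

    def observe(self, concept, evidence=0.3):
        """Raise the knowledge estimate when the user works with a concept."""
        self.knowledge[concept] = min(1.0, self.knowledge[concept] + evidence)

    def ready_for(self, concept, threshold=0.5):
        """Adaptation decision: suggest a concept once its prerequisites are known."""
        return all(self.knowledge[p] >= threshold for p in self.domain[concept])

um = OverlayUserModel(domain_model)
um.observe("html"); um.observe("html")
print(um.ready_for("css"))  # True once 'html' knowledge reaches 0.5
```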

    A Semantic Web Based Search Engine with X3D Visualisation of Queries and Results

    Parts of this PhD have been published: Gkoutzis, Konstantinos, and Vladimir Geroimenko. "Moving from Folksonomies to Taxonomies: Using the Social Web and 3D to Build an Unlimited Semantic Ontology." Proceedings of the 2011 15th International Conference on Information Visualisation. IEEE Computer Society, 2011. The Semantic Web project has introduced new techniques for managing information. Data can now be organised more efficiently and in such a way that computers can take advantage of the relationships that characterise the given input to present more relevant output. Semantic Web based search engines can quickly educe exactly what needs to be found and retrieve it while avoiding information overload. Up until now, search engines have interacted with their users by asking them to look for words and phrases. We propose the creation of a new-generation Semantic Web search engine that offers a visual interface for queries and results. To create such an engine, information input must be viewed not merely as keywords, but as specific concepts and objects which are all part of the same universal system. To make the manipulation of the interconnected visual objects simpler and more natural, 3D graphics are utilised, based on the X3D Web standard, allowing users to semantically synthesise their queries faster and in a more logical way, both for them and for the computer.

    Evolutionary Design of Search and Triage Interfaces for Large Document Sets

    This dissertation is concerned with the design of visual interfaces for searching and triaging large document sets. Data proliferation has generated new and challenging information-based tasks across various domains. Yet, as the document sets of these tasks grow, it has become increasingly difficult for users to remain active participants in the information-seeking process, such as when searching and triaging large document sets. During information search, users seek to understand their document set, align domain knowledge, formulate effective queries, and use those queries to develop document set mappings which help generate encounters with valued documents. During information triage, users encounter the documents mapped by information search to judge their relevance to information-seeking objectives. Yet, information search and triage can be challenging for users. Studies have found that when using traditional design strategies in tool interfaces for search and triage, users routinely struggle to understand the domain being searched, apply their expertise, communicate their objectives during query building, and assess the relevance of search results during information triage. Users must understand and apply domain-specific vocabulary when communicating information-seeking objectives. Yet, task vocabularies typically do not align with those of users, especially in tasks of complex domains. Ontologies can be valuable mediating resources for bridging between the vocabularies of users and tasks. They are created by domain experts to provide a standardized mapping of knowledge that can be leveraged both by computational- as well as human-facing systems. We believe that the activation of ontologies within user-facing interfaces has the potential to help users when searching and triaging large document sets; however, more research is required.

    Towards a search engine for functionally appropriate, Web-enabled models and simulations

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2006. Includes bibliographical references (p. 100-104). New emerging modeling and simulation environments have the potential to provide easy access to design models and simulations on the Internet, much as the World Wide Web (WWW) has provided easy access to information. To support sharing, integration and reuse of web-enabled applications (design models and simulations), a search engine for functionally appropriate/similar models is needed. There are ongoing efforts to develop ontological descriptions for web content and simulation model functionality, where the semantics of available services are explicitly represented using a shared knowledge representation of concepts and rules. Simulation publishers are responsible for semantically marking up the interfaces with such ontological annotations. In contrast to such an approach, this work proposes a flexible, implicit, pattern matching solution that does not require any extra annotations to accompany the models, much as the way current web search engines operate. A learning-through-association, similarity-based approach was developed. It uses only pre-existing low-level information in web-enabled simulation interfaces, such as model and parameter names, parameter units, parameter scale, input/output structure, causality, and documentation, to synthesize templates that become archetypes for functional concepts. Then, different interfaces are matched against templates and are classified based on how similar they are to a certain template. Newly found functionally similar interfaces can be merged into the original template, thereby both generalizing the pattern for a functional role and strengthening the most critical aspects of the pattern. This thesis also developed algorithms based on graph theory and pre-defined heuristic attribute similarity metrics. The information from model interfaces is represented using Attributed Relational Graphs (ARGs), where nodes represent parameters and arcs represent causality relationships. Templates are represented as Fuzzy Attributed Relational Graphs, which are extended ARGs whose node attributes are fuzzy sets. Then, a bipartite graph-matching algorithm is used to compare graphs and compute the similarity between an interface and a template. A graph merging algorithm is also designed for template generalization. A prototype implementation of the proposed algorithms was developed and applied to a suite of real-life engineering models. The results validate the hypothesis and demonstrate the plausibility of the approach. By Qing Cao, Ph.D.
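
    The bipartite matching step lends itself to a short sketch: score attribute similarity between the parameter nodes of an interface and a template, then compute the optimal one-to-one assignment (Hungarian algorithm). The similarity function below (character overlap on names plus a unit check) is an illustrative stand-in for the thesis's heuristic attribute metrics.

```python
# Sketch: match interface parameters to template parameters via optimal
# bipartite assignment. Node attributes and weights are assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment

def node_similarity(a, b):
    """Toy attribute similarity between two parameter nodes (name, unit)."""
    chars_a, chars_b = set(a["name"]), set(b["name"])
    name_sim = len(chars_a & chars_b) / len(chars_a | chars_b)
    return 0.5 * name_sim + 0.5 * (a["unit"] == b["unit"])

def match_score(interface, template):
    """Similarity of an interface to a template via the best assignment."""
    cost = np.array([[-node_similarity(i, t) for t in template]
                     for i in interface])  # negate: assignment minimizes cost
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    return -cost[rows, cols].sum() / max(len(interface), len(template))

beam = [{"name": "force", "unit": "N"}, {"name": "length", "unit": "m"}]
template = [{"name": "load", "unit": "N"}, {"name": "span", "unit": "m"}]
print(match_score(beam, template))
```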

    Interoperability of Enterprise Software and Applications
