
    Information Retrieval for Multivariate Research Data Repositories

    In this dissertation, I tackle the challenge of information retrieval for multivariate research data by providing novel means of content-based access. Large amounts of multivariate data are produced and collected in different areas of scientific research and industrial application, including the human and natural sciences, the social and economic sciences, and applications like quality control, security and machine monitoring. Archival and re-use of this kind of data has been identified as an important factor in the supply of information to support research and industrial production. Due to increasing efforts in the digital library community, such multivariate data are collected, archived and often made publicly available by specialized research data repositories. A multivariate research data document consists of tabular data with m columns (measurement parameters, e.g., temperature, pressure, humidity) and n rows (observations). To render such data-sets accessible, they are annotated with meta-data according to a well-defined meta-data standard when being archived. These annotations include the time, location, parameters, title and author (and potentially many more attributes) of the document under concern. In particular for multivariate data, each column is annotated with the parameter name and unit of its data (e.g., water depth [m]). Retrieving and ranking the documents an information seeker is looking for is an important and difficult challenge. To date, access to this data is primarily provided by means of the annotated, textual meta-data described above. An information seeker can search for documents of interest by querying the annotated meta-data. For example, an information seeker can retrieve all documents that were obtained in a specific region or within a certain period of time. Similarly, she can search for data-sets that contain a particular measurement via its parameter name, or for data-sets that were produced by a specific scientist.
However, retrieval via textual annotations is limited and does not allow for content-based search, e.g., retrieving data which contains a particular measurement pattern like a linear relationship between water depth and water pressure, or which is similar to example data the information seeker provides. In this thesis, I deal with this challenge and develop novel indexing and retrieval schemes to extend the established, meta-data based access to multivariate research data. By analyzing and indexing the data patterns occurring in multivariate data, one can support new techniques for content-based retrieval and exploration, well beyond meta-data based query methods. This allows information seekers to query for multivariate data-sets that exhibit patterns similar to an example data-set they provide. Furthermore, information seekers can specify one or more particular patterns they are looking for, to retrieve multivariate data-sets that contain similar patterns. To this end, I also develop visual-interactive techniques to support information seekers in formulating such queries, which are inherently more complex than textual search strings. These techniques include providing an overview of potentially interesting patterns to search for, which interactively adapts to the user's query as it is being entered. Furthermore, based on the pattern description of each multivariate data document, I introduce a similarity measure for multivariate data. This allows scientists to quickly discover data that is similar (or contradictory) to their own measurements.
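    To make the idea of a pattern-based similarity measure concrete, the following sketch describes each multivariate table by the pairwise Pearson correlations of its columns and compares tables by the distance between these descriptors. This is only an illustrative stand-in, not the dissertation's actual indexing scheme; the toy depth/pressure data is invented for the example.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def pattern_descriptor(columns):
    """Describe a table by the correlations of all column pairs."""
    return [pearson(a, b) for a, b in combinations(columns, 2)]

def pattern_distance(d1, d2):
    """Euclidean distance between descriptors of tables with the same parameters."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

# A data-set with a near-linear depth/pressure relation vs. unrelated noise.
query = [[1, 2, 3, 4, 5], [2.1, 4.0, 6.2, 7.9, 10.1]]
candidate = [[1, 2, 3, 4, 5], [5.0, 1.0, 4.0, 2.0, 3.0]]

d_q = pattern_descriptor(query)
d_c = pattern_descriptor(candidate)
# d_q[0] is close to 1 (linear pattern); the descriptor distance to the
# noisy candidate is therefore large, ranking it low for this query.
```

    A real system would of course use richer pattern descriptions than raw correlations, but the retrieval principle, comparing compact descriptors instead of textual annotations, is the same.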

    Filter (Dials | ...)

    An increasing number of data-providing organizations publish their unique information as Linked Data, which is available in the Semantic Web in RDF (Resource Description Framework) format. RDF is a standard model for describing web information in a way that is understandable to computer applications. For users without much technical knowledge, information visualization provides support to interpret the structure of the underlying data in the web, understand their relationships, form queries and extract more interesting information. This thesis proposes a visual query technique focusing on the visualization of the overall availability of data. The prototype visualization tool developed shows the available groups of data items and their size according to different filtering options. The thesis report starts with a clear description of the research problem and an analysis of the efforts made so far in the field towards finding a solution. A survey of the related work is presented to assess the potential for the visualization. The Filter Dials, a novel visualization concept, serves as the base work and starting point; the new technique that evolved from it attempts to overcome some of the notable drawbacks of the Dials. The thesis also proposes a new visual query technique that aims to visually present an overview of the large and complex data available in the web. A prototypical implementation of the visualization tool is built and evaluated by users, and conclusions are drawn on the applications, advantages, disadvantages and possible future directions of the proposed method.
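    The core of such a filter visualization is computing, for each filter dimension, how many data items fall into each group. A minimal sketch, assuming RDF statements are already loaded as (subject, predicate, object) triples; the predicate and resource names here are hypothetical, not from the thesis:

```python
from collections import Counter

# Hypothetical triples as they might be loaded from an RDF graph.
triples = [
    ("ex:ds1", "dc:subject", "Climate"),
    ("ex:ds2", "dc:subject", "Climate"),
    ("ex:ds3", "dc:subject", "Oceanography"),
    ("ex:ds1", "dc:creator", "Smith"),
    ("ex:ds3", "dc:creator", "Smith"),
]

def facet_counts(triples, predicate):
    """Group sizes for one filter dimension: how many items share each value."""
    return Counter(obj for _, pred, obj in triples if pred == predicate)

counts = facet_counts(triples, "dc:subject")
# counts == Counter({"Climate": 2, "Oceanography": 1})
```

    The visualization then only has to map these counts to the size of each dial segment, so the user sees at a glance how much data each filter choice would retain.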

    Online sketch-based image retrieval using keyshape mining of geometrical objects

    Online image retrieval has become an active area of information sharing due to the massive use of the Internet. The key challenges are the semantic gap between low-level visual features and high-level semantic perception and interpretation, which stems from the complexity of understanding images; the hand-drawn query input, which is not a regular input representation; and the huge number of web images. Moreover, state-of-the-art research seeks to combine multiple types of feature representations to close the semantic gap. This study developed a new schema to retrieve images directly from the web repository. It comprises three major phases. Firstly, a new online input representation based on pixel mining was designed to detect sketch shape features and correlate them with the semantic meaning of sketch objects. Secondly, a training process was developed to obtain common templates using the Singular Value Decomposition (SVD) technique to detect common sketch templates. The outcome of this step is a dictionary of sketch templates. Lastly, the retrieval phase matched and compared the sketch with the image repository using metadata annotation to retrieve the most relevant images. The sequence of processes in this schema converts the drawn input sketch to a string which contains the sketch object elements. The string is then matched with the templates dictionary to determine the sketch metadata name. This name is sent to a web repository to match and retrieve the relevant images. A series of experiments was conducted to evaluate the performance of the schema against the state of the art found in the literature, using the same datasets comprising one million images from FlickerIm and 0.2 million images from ImageNet. The schema achieved 100% precision for the first five retrieved images in all cases, whereas the state of the art achieved only 88.8%.
The schema addresses many low-level feature obstacles to retrieving accurate images, such as imperfect sketches, rotation, transposition and scaling, by using high-level semantics to retrieve images from large databases and the web.
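    The "string matched against a templates dictionary" step can be illustrated with approximate string matching, which also shows how imperfect sketches can still resolve to the right metadata name. The element encoding and template entries below are invented for illustration; the thesis does not publish its actual dictionary format:

```python
from difflib import SequenceMatcher

# Hypothetical templates dictionary mapping a sketch-element string
# to a metadata name.
templates = {
    "circle-line-line": "traffic sign",
    "rect-rect-triangle": "house",
    "circle-circle-rect": "car",
}

def match_sketch(element_string, templates, threshold=0.6):
    """Return the metadata name of the most similar template, if any."""
    best_name, best_score = None, 0.0
    for tmpl, name in templates.items():
        score = SequenceMatcher(None, element_string, tmpl).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

# An imperfect sketch string still maps to the closest known template.
label = match_sketch("rect-rect-triangl", templates)
# label == "house"
```

    Approximate rather than exact matching is what gives the scheme its tolerance to imperfect input: the query string need only be closer to one template than to all others.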

    A Data-driven, High-performance and Intelligent CyberInfrastructure to Advance Spatial Sciences

    In the field of Geographic Information Science (GIScience), we have witnessed an unprecedented data deluge brought about by the rapid advancement of high-resolution data observing technologies. For example, with the advancement of Earth Observation (EO) technologies, massive amounts of EO data, including remote sensing data and other sensor observations of earthquakes, climate, oceans, hydrology, volcanoes, glaciers, etc., are being collected on a daily basis by a wide range of organizations. In addition to observation data, human-generated data including microblogs, photos, consumption records, evaluations, unstructured webpages and other Volunteered Geographic Information (VGI) are incessantly generated and shared on the Internet. Meanwhile, the emerging cyberinfrastructure rapidly increases our capacity for handling such massive data with regard to data collection and management, data integration and interoperability, data transmission and visualization, high-performance computing, and more. Cyberinfrastructure (CI) consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high-performance networks to improve research productivity and enable breakthroughs that are not otherwise possible. The Geospatial CI (GCI, or CyberGIS), as the synthesis of CI and GIScience, has inherent advantages in enabling computationally intensive spatial analysis and modeling (SAM) as well as collaborative geospatial problem solving and decision making. This dissertation is dedicated to addressing several critical issues and improving the performance of existing methodologies and systems in the field of CyberGIS.
This dissertation comprises three parts. The first part focuses on developing methodologies to help researchers efficiently and effectively find appropriate open geospatial datasets among millions of records provided by thousands of organizations scattered around the world; machine learning and semantic search methods are utilized in this research. The second part develops an interoperable and replicable geoprocessing service by synthesizing a high-performance computing (HPC) environment, the core spatial statistics and analysis algorithms from the widely adopted open-source Python Spatial Analysis Library (PySAL), and the rich datasets acquired in the first part. The third part studies optimization strategies for feature data transmission and visualization, aiming to solve the performance issues of transmitting large feature data through the Internet and visualizing it on the client (browser) side. Taken together, the three parts constitute an endeavor towards the methodological improvement and practical implementation of a data-driven, high-performance and intelligent CI to advance the spatial sciences.

    Exploring relations in neuroscientific literature using Augmented Reality: A design study

    To support scientists in maintaining an overview of disciplinary concepts and their interrelations, we investigate whether Augmented Reality can serve as a platform to make automated methods more accessible and integrated into current literature exploration practices. Building on insights from text and immersive analytics, we identify information and design requirements. We embody these in DatAR, a system design and implementation focused on the analysis of co-occurrences in neuroscientific text collections. We conducted a scenario-based video survey with a sample of neuroscientists and other domain experts, focusing on participants' willingness to adopt such an AR system in their regular literature review practices. The AR-tailored epistemic and representational designs of our system were generally perceived as suitable for performing complex analytics. We also discuss several fundamental issues with our chosen 3D visualisations, making steps towards understanding in which ways AR is a suitable medium for high-level conceptual literature exploration.
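    The co-occurrence analysis underlying such a system can be sketched simply: count how often each pair of concepts appears in the same document. The concept sets below are invented examples, and a real pipeline would obtain them via automated entity recognition rather than by hand:

```python
from collections import Counter
from itertools import combinations

# Hypothetical documents, each reduced to the set of concepts it mentions.
doc_concepts = [
    {"hippocampus", "memory", "theta rhythm"},
    {"hippocampus", "memory"},
    {"amygdala", "memory"},
]

def cooccurrences(doc_concepts):
    """Count how often each concept pair appears in the same document."""
    pairs = Counter()
    for concepts in doc_concepts:
        for a, b in combinations(sorted(concepts), 2):
            pairs[(a, b)] += 1
    return pairs

pairs = cooccurrences(doc_concepts)
# ("hippocampus", "memory") co-occurs in two of the three documents.
```

    The resulting pair counts are exactly the relation strengths a 3D visualisation would render as links between concept nodes.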

    AXMEDIS 2008

    The AXMEDIS International Conference series aims to explore all subjects and topics related to cross-media and digital-media content production, processing, management, standards, representation, sharing, protection and rights management, and to address the latest developments and future trends of these technologies and their applications, impacts and exploitation. The AXMEDIS events offer venues for exchanging concepts, requirements, prototypes, research ideas and findings which can contribute to academic research and also benefit business and industrial communities. In the Internet and digital era, cross-media production and distribution represent key developments and innovations, fostered by emergent technologies to ensure better value for money while optimising productivity and market coverage.

    B!SON: A Tool for Open Access Journal Recommendation

    Finding a suitable open access journal to publish scientific work is a complex task: researchers have to navigate a constantly growing number of journals, institutional agreements with publishers, funders' conditions and the risk of predatory publishers. To help with these challenges, we introduce a web-based journal recommendation system called B!SON. It is developed based on a systematic requirements analysis, built on open data, gives publisher-independent recommendations and works across domains. It suggests open access journals based on the title, abstract and references provided by the user. The recommendation quality has been evaluated using a large test set of 10,000 articles. Development by two German scientific libraries ensures the longevity of the project.
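    The basic shape of such a recommender, ranking candidate journals by textual similarity between the manuscript and journals' published content, can be sketched as follows. This is a deliberately simplified token-overlap stand-in; B!SON's actual index, similarity function and journal profiles are not described in this abstract, and the journal names and texts below are invented:

```python
def tokens(text):
    """Lower-cased word set of a title or abstract."""
    return set(text.lower().split())

def jaccard(a, b):
    """Overlap between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical journal profiles built from previously published articles.
journals = {
    "Journal of Open Geodata": "open geospatial data sharing repositories",
    "Neuroinformatics Letters": "brain imaging neural networks analysis",
}

def recommend(manuscript, journals):
    """Rank journals by textual similarity to the submitted title and abstract."""
    q = tokens(manuscript)
    return sorted(journals,
                  key=lambda j: jaccard(q, tokens(journals[j])),
                  reverse=True)

top = recommend("sharing open geospatial research data", journals)[0]
# top == "Journal of Open Geodata"
```

    A production system would additionally exploit the reference list (e.g., which journals the cited works appeared in), which is why B!SON asks for references as well as title and abstract.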

    Geospatial Information Research: State of the Art, Case Studies and Future Perspectives

    Geospatial information science (GI science) is concerned with the development and application of geodetic and information science methods for modeling, acquiring, sharing, managing, exploring, analyzing, synthesizing, visualizing, and evaluating data on spatio-temporal phenomena related to the Earth. As an interdisciplinary scientific discipline, it focuses on developing and adapting information technologies to understand processes on the Earth and human-place interactions, to detect and predict trends and patterns in the observed data, and to support decision making. The authors, members of the Geoinformatics division of DGK, the Committee on Geodesy of the Bavarian Academy of Sciences and Humanities, representing geodetic research and university teaching in Germany, have prepared this paper as a means to point out future research questions and directions in geospatial information science. For the different facets of geospatial information science, the state of the art is presented and illustrated mostly with the authors' own case studies. The paper thus shows which contributions the German GI community makes and which research perspectives arise in geospatial information science. The paper further demonstrates that GI science, with its expertise in data acquisition and interpretation, information modeling and management, integration, decision support, visualization, and dissemination, can help solve many of the grand challenges facing society today and in the future.