
    A two-stage framework for designing visual analytics systems to augment organizational analytical processes

    A perennially interesting research topic in the field of visual analytics is how to effectively develop systems that support organizational knowledge workers' decision-making and reasoning processes. The primary objective of a visual analytics system is to facilitate analytical reasoning and the discovery of insights through interactive visual interfaces. It also enables the transfer of capability and expertise from where it resides to where it is needed, across individuals and organizations as necessary. The problem, however, is that domain analytical practices generally vary from organization to organization. This leads to diverse designs of visual analytics systems for incorporating domain analytical processes, making it difficult to generalize success from one domain to another. Exacerbating this problem is the dearth of general models of analytical workflows available to enable such timely and effective designs. To alleviate these problems, this dissertation presents a two-stage framework for informing the design of a visual analytics system. This framework builds upon and extends current practices pertaining to analytical workflows and focuses, in particular, on investigating their effect on the design of visual analytics systems for organizational environments. It aims to empower organizations with more systematic and purposeful information analyses by modeling domain users' reasoning processes. The first stage, Observation and Design, designs and implements a visual analytics system that abstracts and encapsulates general organizational analytical processes through extensive collaboration with domain users. The second stage, User-centric Refinement, interactively enriches and refines the encapsulated domain analysis process by inferring users' intentions from their task behavior.
To implement this framework when designing a visual analytics system, this dissertation proposes four general design recommendations that, when followed, empower such systems to bring users closer to the center of their analytical processes. This dissertation makes three primary contributions. First, it presents a general characterization of the analytical workflow in organizational environments; this characterization fills the gap left by the current lack of such an analytical model and represents a set of domain analytical tasks commonly applicable to various organizations. Second, it describes a two-stage framework for facilitating domain users' workflows by integrating their analytical models into interactive visual analytics systems. Finally, it presents recommendations on enriching and refining domain analysis by capturing and analyzing knowledge workers' analysis processes. To demonstrate the generalizability of these design recommendations, this dissertation presents three visual analytics systems developed following them: Taste for Xerox Corporation, OpsVis for Microsoft, and IRSV for the U.S. Department of Transportation. All of these systems were deployed to domain knowledge workers and adopted in their analytical practices. Extensive empirical evaluations were further conducted to demonstrate the efficacy of these systems in facilitating domain analytical processes.

    Data integration and FAIR data management in Solid Earth Science

    Integrated use of multidisciplinary data is nowadays a recognized trend in scientific research, particularly in the domain of solid Earth science, where the understanding of a physical process is improved and completed by different types of measurements – for instance, ground acceleration, SAR imaging, and crustal deformation – describing a physical phenomenon. The FAIR principles are recognized as a means to foster data integration by providing a common set of criteria for building data stewardship systems for Open Science. However, the implementation of the FAIR principles raises issues along dimensions such as governance and the legal sphere, beyond, of course, the technical one. On the technical side in particular, the development of FAIR data provision systems is often delegated to research infrastructures or data providers, with support in terms of metrics and best practices offered by cluster projects or dedicated initiatives. In the current work, we describe the approach to FAIR data management in the European Plate Observing System (EPOS), a distributed research infrastructure in the solid Earth science domain that includes more than 250 individual research infrastructures across 25 countries in Europe. We focus in particular on the technical aspects, while also covering governance, policies, and organizational elements, by describing the architecture of the EPOS delivery framework from both the organizational and the technical point of view and by outlining the key principles used in the technical design. We describe how a combination of approaches, namely rich metadata and service-based systems design, is required to achieve data integration. We show the system architecture and the basic features of the EPOS data portal, which integrates data from more than 220 services in a FAIR way.
The construction of such a portal was driven by the EPOS FAIR data management approach, which, by defining a clear roadmap for compliance with the FAIR principles, produced a number of best practices and technical approaches. This work, which spans more than a decade but concentrates its key efforts in the last five years with the EPOS Implementation Phase project and the establishment of EPOS-ERIC, was carried out in synergy with other EU initiatives dealing with FAIR data. On the basis of the EPOS experience, future directions are outlined, emphasizing the need to provide i) FAIR reference architectures that can help data practitioners and engineers from the domain communities adopt the FAIR principles and build FAIR data systems; ii) a FAIR data management framework addressing FAIR throughout the entire data lifecycle, including reproducibility and provenance; and iii) the extension of the FAIR principles to the policy and governance dimensions.
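The role of rich metadata in a service-based design of this kind can be illustrated with a small sketch. The field names, identifiers, and endpoints below are illustrative assumptions only, not the actual EPOS metadata schema (which builds on community standards); the point is that findability and accessibility come from searchable, machine-readable metadata rather than from the data themselves:

```python
from dataclasses import dataclass

@dataclass
class ServiceMetadata:
    identifier: str   # persistent identifier (FAIR F1)
    title: str
    keywords: list    # rich, searchable description (F2)
    license: str      # clear reuse conditions (R1.1)
    endpoint: str     # where the data are served (A1)

def find(catalogue, keyword):
    """Findability comes from searching the metadata, not the data itself."""
    kw = keyword.lower()
    return [m.identifier for m in catalogue
            if kw in m.title.lower() or any(kw in k.lower() for k in m.keywords)]

# Hypothetical catalogue entries for two data services.
catalogue = [
    ServiceMetadata("epos:svc/001", "Seismic waveform archive",
                    ["seismology", "ground acceleration"], "CC-BY-4.0",
                    "https://example.org/fdsnws/dataselect"),
    ServiceMetadata("epos:svc/002", "GNSS crustal deformation products",
                    ["geodesy", "deformation"], "CC-BY-4.0",
                    "https://example.org/gnss/products"),
]
print(find(catalogue, "seismology"))  # → ['epos:svc/001']
```

A portal aggregating hundreds of such records can then route a user's query to the relevant service endpoints without ever copying the underlying datasets.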

    PerCon: A Personal Digital Library for Heterogeneous Data Management and Analysis

    Systems are needed to support access to and analysis of ever larger and more heterogeneous scientific datasets. Users need support in locating, organizing, analyzing, and interpreting data, with services and tools appropriate to their current activities. We developed PerCon, a data management and analysis environment, to support such use. PerCon processes and integrates data gathered via queries to existing data providers to create a personal or small-group digital library of data. Users may then search, browse, visualize, annotate, and organize the data as they proceed with analysis and interpretation. Analysis and interpretation in PerCon take place in a visual workspace in which multiple data visualizations and annotations are placed into spatial arrangements based on the current task. The system watches for patterns in the user's data selection, exploration, and organization, and then, through mixed-initiative interaction, assists users by suggesting potentially relevant data from unexplored data sources. To identify relevant data, PerCon builds precomputed feature tables over data objects and their metadata (e.g. similarities, distances) and a user interest model to infer the user's interest or specific information need. In particular, probabilistic networks in PerCon model user interactions (i.e. event features) and, through network training, predict the data type of greatest interest. In turn, the most relevant data objects of the inferred data type are identified through a weighted feature computation and then recommended to the user. PerCon's data location and analysis capabilities were evaluated in a controlled study with 24 users. The study participants were asked to locate and analyze heterogeneous weather and river data with and without the visual workspace and mixed-initiative interaction, respectively.
Results indicate that the visual workspace facilitated information representation and aided in the identification of relationships between datasets. The system's suggestions encouraged data exploration, leading participants to identify more evidence of correlations among data streams and more potential interactions between weather and river data.
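The weighted feature computation described above can be sketched as follows. The feature names, weights, and data objects are hypothetical; in the actual system the probabilistic network first infers the data type of interest, whereas this sketch only shows the final ranking step over fixed weights:

```python
# Hypothetical PerCon-style ranking: each candidate data object carries
# precomputed feature values (e.g. metadata similarity to the current
# selection, proximity), and a weighted sum ranks the candidates.
def score(features, weights):
    """Weighted sum of feature values; higher means more relevant."""
    return sum(weights[name] * value for name, value in features.items())

def recommend(candidates, weights, top_k=3):
    """Return the ids of the top_k candidates by weighted feature score."""
    ranked = sorted(candidates, key=lambda c: score(c["features"], weights),
                    reverse=True)
    return [c["id"] for c in ranked[:top_k]]

candidates = [
    {"id": "river_gauge_42", "features": {"metadata_sim": 0.9, "proximity": 0.7}},
    {"id": "weather_stn_7",  "features": {"metadata_sim": 0.4, "proximity": 0.9}},
    {"id": "river_gauge_13", "features": {"metadata_sim": 0.6, "proximity": 0.2}},
]
weights = {"metadata_sim": 0.6, "proximity": 0.4}
print(recommend(candidates, weights, top_k=2))
# → ['river_gauge_42', 'weather_stn_7']
```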

    Active provenance for data intensive research

    The role of provenance information in data-intensive research is a significant topic of discussion among technical experts and scientists. Typical use cases addressing traceability, versioning, and reproducibility of research findings are extended with more interactive scenarios in support, for instance, of computational steering and results management. In this thesis we investigate the impact that lineage records can have on the early phases of analysis, for instance when performed through near-real-time systems and Virtual Research Environments (VREs) tailored to the requirements of a specific community. By positioning provenance at the centre of the computational research cycle, we highlight the importance of mechanisms at the data scientist's side that, by integrating with the abstractions offered by processing technologies such as scientific workflows and data-intensive tools, facilitate the expert's contribution to the lineage at runtime. Ultimately, by encouraging tuning and use of provenance for rapid feedback, the thesis aims to improve the synergy between different user groups and to increase productivity and understanding of their processes. We present a provenance model, called S-PROV, that uses and further extends PROV and ProvONE. The relationships and properties characterising the workflow's abstractions and their concrete executions are re-elaborated to include aspects related to delegation, distribution, and steering of stateful streaming operators. The model is supported by the Active framework for tuneable and actionable lineage, which ensures user engagement by fostering rapid exploitation. Here, concepts such as provenance types, configuration, and explicit state management allow users to capture complex provenance scenarios and activate selective controls based on domain and user-defined metadata.
We outline how the traces are recorded in a new comprehensive system, called S-ProvFlow, which enables different classes of consumers to explore the provenance data with services and tools for monitoring, in-depth validation, and comprehensive visual analytics. The work of this thesis is discussed in the context of an existing computational framework and the experience gained in implementing provenance-aware tools for seismology and climate VREs. It will continue to evolve through newly funded projects, thereby providing generic and user-centred solutions for data-intensive research.
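A minimal sketch of the kind of lineage record such a system maintains is given below. The schema and names are simplified assumptions for illustration; the real S-PROV model is considerably richer, covering delegation, distribution, and stateful streaming operators:

```python
# Simplified lineage capture: each operator invocation records which data
# items it consumed and produced, plus user-defined metadata that selective
# controls could match on. Tracing back from an output walks the records
# recursively to collect every upstream contributor.
from dataclasses import dataclass, field

@dataclass
class Invocation:
    operator: str
    inputs: list
    outputs: list
    metadata: dict = field(default_factory=dict)

class Lineage:
    def __init__(self):
        self.invocations = []

    def record(self, inv):
        self.invocations.append(inv)

    def trace_back(self, item):
        """Collect all upstream data items that contributed to `item`."""
        upstream = set()
        for inv in self.invocations:
            if item in inv.outputs:
                for src in inv.inputs:
                    upstream.add(src)
                    upstream |= self.trace_back(src)
        return upstream
```

For example, in a hypothetical seismology pipeline that filters a raw waveform and then picks phases from it, tracing back from the picks recovers both the filtered and the raw waveform as contributors.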

    Making Social Dynamics and Content Evolution Transparent in Collaboratively Written Text

    This dissertation presents models and algorithms for accurately and efficiently extracting, from revisioned content in Collaborative Writing Systems, data about (i) the provenance and history of specific sequences of text, and (ii) interactions between editors via the content changes they perform, especially disagreement. Visualization tools are presented to gain further insights into the extracted data, and collaboration mechanisms to be researched with these new data and tools are discussed.
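One simple way to attribute text provenance across revisions, sketched here with Python's standard difflib rather than the dissertation's own algorithms, is to align each revision with its predecessor and tag every token with the revision that introduced it:

```python
# Align consecutive revisions; tokens in "equal" spans keep their recorded
# origin, while tokens in inserted or replaced spans are attributed to the
# revision currently being processed.
import difflib

def attribute_tokens(revisions):
    """revisions: list of token lists, oldest first.
    Returns a list of (token, index_of_revision_that_introduced_it)."""
    origin = [(tok, 0) for tok in revisions[0]]
    for rev_idx, new in enumerate(revisions[1:], start=1):
        old_tokens = [tok for tok, _ in origin]
        matcher = difflib.SequenceMatcher(a=old_tokens, b=new, autojunk=False)
        new_origin = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                new_origin.extend(origin[i1:i2])   # token survives, keep origin
            else:
                new_origin.extend((tok, rev_idx) for tok in new[j1:j2])
        origin = new_origin
    return origin

revisions = [
    ["the", "cat", "sat"],
    ["the", "cat", "sat", "down"],
    ["the", "dog", "sat", "down"],
]
print(attribute_tokens(revisions))
# → [('the', 0), ('dog', 2), ('sat', 0), ('down', 1)]
```

Real revision histories need tokenization choices and handling of moved or reverted text, which is where the dissertation's more accurate algorithms come in.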

    Linked Data Supported Information Retrieval

    Search engines have become indispensable for locating content on the World Wide Web. Semantic Web and Linked Data technologies enable a more detailed and unambiguous structuring of content and allow entirely new approaches to solving information retrieval problems. This thesis examines how information retrieval applications can benefit from incorporating Linked Data. New methods for computer-assisted semantic text analysis, semantic search, information prioritization, and information visualization are presented and comprehensively evaluated. Linked Data resources and their relationships are integrated into these methods to increase their effectiveness and their usability. First, an introduction to the fundamentals of information retrieval and Linked Data is given. Then, new manual and automated methods for semantically annotating documents by linking them to Linked Data resources (entity linking) are presented. A comprehensive evaluation of these methods is carried out, and the underlying evaluation system is substantially improved. Building on the annotation methods, two new retrieval models for semantic search are presented and evaluated. These models are based on the generalized vector space model and incorporate semantic similarity, derived from taxonomy-based relationships among the Linked Data resources in documents and queries, into the ranking of search results. With the aim of further refining the computation of semantic similarity, a method for prioritizing Linked Data resources is presented and evaluated. Building on this, visualization techniques are presented with the goal of improving explorability and navigability within a semantically annotated document corpus.
For this purpose, two applications are presented: a Linked Data based exploratory extension complementing a traditional keyword-based search engine, and a Linked Data based recommender system.
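The taxonomy-based similarity used by such retrieval models can be illustrated with a toy sketch (not the thesis's actual formulation): a generalized vector space model replaces exact term matching with pairwise term similarity, here derived from shared ancestors in a small hypothetical taxonomy of Linked Data resources.

```python
# Toy taxonomy: child -> parent. "Berlin" and "Paris" are both cities, so a
# query for one should partially match a document mentioning the other.
taxonomy = {"Berlin": "City", "Paris": "City", "City": "Place", "Place": None}

def ancestors(node):
    """Node plus all its ancestors up to the taxonomy root."""
    out = []
    while node is not None:
        out.append(node)
        node = taxonomy.get(node)
    return out

def term_sim(a, b):
    """1.0 for identical terms; otherwise Jaccard overlap of ancestor sets."""
    if a == b:
        return 1.0
    anc_a, anc_b = set(ancestors(a)), set(ancestors(b))
    return len(anc_a & anc_b) / max(len(anc_a | anc_b), 1)

def gvsm_score(query_terms, doc_terms):
    """Sum pairwise term similarities instead of counting exact matches."""
    return sum(term_sim(q, d) for q in query_terms for d in doc_terms)

print(gvsm_score(["Berlin"], ["Paris"]))   # related via the taxonomy
print(gvsm_score(["Berlin"], ["Madrid"]))  # unknown term, no overlap
```

A plain vector space model would score both documents zero; the taxonomy lets the semantically related one rank higher.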
