
    A Change Support Model for Distributed Collaborative Work

    Distributed collaborative software development tends to make artifacts and decisions inconsistent and uncertain. We address this problem by providing an information repository that precisely reflects the state of the work, managing both the states of artifacts/products produced through collaborative work and the states of decisions made through communication. In this paper, we propose models and a tool to construct the artifact-related part of the information repository, and explain how the repository is used to resolve inconsistencies caused by concurrent changes to artifacts. We first present the model and the tool that generate the dependency relationships among UML model elements as content of the information repository. Next, we present the model and the method that generate change support workflows from the information repository; these workflows provide an efficient way to modify the change-related artifacts for each change request. Finally, we define inconsistency patterns that make it possible to detect potential inconsistencies. By combining this mechanism with version control systems, changes can be made safely. Our models and tool are useful in the maintenance phase for performing changes safely and efficiently. Comment: 10 pages, 13 figures, 4 tables
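
    The dependency-based change workflows described above can be pictured with a small sketch: given a graph of which artifacts depend on a changed UML element, a traversal yields the ordered list of artifacts to revisit for one change request. The graph contents and names below are illustrative, not the paper's model or tool.

```python
# Hypothetical sketch of deriving a change-support workflow from a
# dependency graph over model elements; names and contents are
# illustrative, not taken from the paper's tool.
from collections import deque

# element -> elements that depend on it (i.e., must be revisited if it changes)
dependents = {
    "ClassDiagram.Order": ["SequenceDiagram.PlaceOrder", "Code.Order"],
    "SequenceDiagram.PlaceOrder": ["TestCase.PlaceOrder"],
    "Code.Order": ["TestCase.OrderUnit"],
    "TestCase.PlaceOrder": [],
    "TestCase.OrderUnit": [],
}

def change_workflow(changed_element):
    """Return dependent artifacts in breadth-first order as a simple
    change-support workflow for one change request."""
    order, seen, queue = [], {changed_element}, deque([changed_element])
    while queue:
        current = queue.popleft()
        for dep in dependents.get(current, []):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order

print(change_workflow("ClassDiagram.Order"))
# ['SequenceDiagram.PlaceOrder', 'Code.Order', 'TestCase.PlaceOrder', 'TestCase.OrderUnit']
```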

    Automatic Software Repair: a Bibliography

    This article presents a survey on automatic software repair. Automatic software repair consists of automatically finding a solution to software bugs without human intervention. This article considers all kinds of repairs. First, it discusses behavioral repair, where test suites, contracts, models, and crashing inputs are taken as oracles. Second, it discusses state repair, also known as runtime repair or runtime recovery, with techniques such as checkpoint and restart, reconfiguration, and invariant restoration. The uniqueness of this article is that it spans the research communities that contribute to this body of knowledge: software engineering, dependability, operating systems, programming languages, and security. It provides a novel and structured overview of the diversity of bug oracles and repair operators used in the literature.
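
    As a rough illustration of the behavioral-repair setting the survey covers, where a test suite serves as the oracle, the toy generate-and-validate loop below tries simple candidate patches until every test passes. It is a minimal sketch of the general idea, not the algorithm of any specific surveyed system.

```python
# Toy generate-and-validate repair: the test suite acts as the oracle,
# and candidate patches are simple expression substitutions in a
# one-line buggy function. Purely illustrative.
buggy_source = "def absolute(x):\n    return x if x > 0 else x"  # bug: should negate in the else branch

tests = [(-3, 3), (0, 0), (5, 5)]  # (input, expected output) pairs: the oracle

def passes(source):
    namespace = {}
    exec(source, namespace)               # define the candidate function
    return all(namespace["absolute"](inp) == out for inp, out in tests)

# Candidate patches: replace the final 'x' with an alternative expression.
candidates = ["-x", "x + 1", "abs(x)", "0"]

def repair(source):
    for replacement in candidates:
        patched = source[:source.rfind("x")] + replacement
        if passes(patched):
            return patched                # plausible repair: passes the whole test suite
    return None                           # no candidate within the search space worked

print(repair(buggy_source))
```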

    Consistency in Continuous Distributed Interactive Media

    In this paper we investigate how consistency can be ensured for continuous distributed interactive media, i.e. distributed media which change their state in reaction to user-initiated operations as well as with the passing of time. Existing approaches to reaching consistency in discrete distributed interactive media are briefly outlined, and it is shown that these fail in the continuous domain. To allow a thorough discussion of the problem, a formal definition of the term consistency in the continuous domain is given. Based on this definition, we show that an important trade-off exists between the responsiveness of the medium and the appearance of short-term inconsistencies. Currently this trade-off is not taken into consideration for consistency in the continuous domain, severely limiting the consistency-related fidelity for a large number of applications. We show that for those applications the fidelity can be significantly raised by voluntarily decreasing the responsiveness of the medium. This concept is called local lag, and it enables the distribution of continuous interactive media which are more vulnerable to short-term inconsistencies than, e.g., battlefield simulations. We demonstrate that the concept of local lag is valid by describing how it was successfully used to ensure consistency in a 3D telecooperation application.
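
    The local-lag idea can be sketched concretely: an operation issued at time t is not executed immediately at the issuing site but scheduled, at every site, for t plus a fixed lag, so that all sites apply it at the same point in time. The timing constants and state below are illustrative, not the paper's protocol.

```python
# Minimal sketch of the local-lag idea: an operation issued at time t is
# executed everywhere (including locally) at t + LAG, trading a little
# responsiveness for fewer short-term inconsistencies.
import heapq

LAG = 0.100  # seconds of voluntary local delay (illustrative value)

class Site:
    def __init__(self):
        self.pending = []      # min-heap of (execute_at, state_delta)
        self.state = 0

    def receive(self, issued_at, delta):
        # Both the issuing site and remote sites schedule the operation
        # for the same execution time.
        heapq.heappush(self.pending, (issued_at + LAG, delta))

    def advance(self, now):
        while self.pending and self.pending[0][0] <= now:
            _, delta = heapq.heappop(self.pending)
            self.state += delta

a, b = Site(), Site()
for site in (a, b):
    site.receive(issued_at=0.0, delta=+5)   # same operation, same timestamp
a.advance(0.15); b.advance(0.15)
print(a.state, b.state)                     # 5 5 -> identical states at both sites
```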

    Factor Structure and Psychometric Properties of Cognitive-Behavioral Scales in Caregivers of Persons with Dementia

    Background/Purpose: Caregiving can be costly to dementia caregivers' well-being. Assessing the factor structure and psychometric properties of cognitive-behavioral scales in dementia caregivers is an essential step in addressing the gap in the current state of research. Specifically, it is essential to determine first whether the factorial structures of the three measures used in this study, namely the Positive Thinking Skills Scale, the Revised Memory and Behavior Problems Checklist, and the Zarit Burden Interview, are good representations of the data by examining model fit. Next, evaluating the reliability of each factor of the three measures is essential for learning about the precision of the factors. Lastly, it is vital to study the factor correlations and their relevance to the underlying theory in order to establish the validity of the factors. Methods: A descriptive, correlational, cross-sectional design in a convenience sample of 100 caregivers. Results: Results indicated that the factorial structure of each of the three scales is a good representation of the data, that each factor of the three measures has acceptable reliability, and that the factors correlated as expected and showed their relevance to the underlying theory. Conclusions: Future studies might consider the mediating/moderating effects of positive thinking on care recipients' challenging behavior problems. The findings can be used as a guide to provide positive thinking training interventions among caregivers.
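
    The per-factor reliability analysis mentioned above is commonly reported as Cronbach's alpha (the abstract does not name the coefficient, so this is an assumption); a minimal sketch of the standard formula on made-up item scores follows.

```python
# Cronbach's alpha for the items loading on one factor:
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
# The item scores below are invented illustrative data, not the study's.
import statistics

items = [            # rows = respondents, columns = items of one factor
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
k = len(items[0])
item_vars = [statistics.variance(col) for col in zip(*items)]
total_var = statistics.variance([sum(row) for row in items])
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 3))
```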

    Exploring heterogeneity of unreliable machines for p2p backup

    P2P architecture is a viable option for enterprise backup. In contrast to dedicated backup servers, nowadays a standard solution, making backups directly on an organization's workstations should be cheaper (as existing hardware is used), more efficient (as there is no single bottleneck server) and more reliable (as the machines are geographically dispersed). We present the architecture of a p2p backup system that uses pairwise replication contracts between a data owner and a replicator. In contrast to standard p2p storage systems that use a DHT directly, the contracts allow our system to optimize replica placement depending on a specific optimization strategy, and thus to take advantage of the heterogeneity of the machines and the network. Such optimization is particularly appealing in the context of backup: replicas can be geographically dispersed, the load sent over the network can be minimized, or the optimization goal can be to minimize the backup/restore time. However, managing the contracts, keeping them consistent, and adjusting them in response to a dynamically changing environment is challenging. We built a scientific prototype and ran experiments on 150 workstations in the university's computer laboratories and, separately, on 50 PlanetLab nodes. We found that the main factor affecting the quality of the system is the availability of the machines. Yet our main conclusion is that it is possible to build an efficient and reliable backup system on highly unreliable machines (our computers had just 13% average availability).
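
    One way to picture the contract-based placement optimization is a greedy strategy that, for a given data owner, prefers replicators with high availability at other sites; this is only one of the possible strategies the abstract alludes to, and the machine data below are invented.

```python
# Illustrative greedy placement of pairwise replication contracts: pick
# replicators for an owner by highest availability while avoiding the
# owner's own site. Machines and numbers are made up for the sketch.
machines = [
    {"id": "m1", "site": "lab-A", "availability": 0.13},
    {"id": "m2", "site": "lab-B", "availability": 0.45},
    {"id": "m3", "site": "lab-B", "availability": 0.30},
    {"id": "m4", "site": "lab-C", "availability": 0.22},
]

def choose_replicators(owner_id, replicas=2):
    owner = next(m for m in machines if m["id"] == owner_id)
    # Geographic dispersion: only consider machines at other sites.
    candidates = [m for m in machines if m["site"] != owner["site"]]
    candidates.sort(key=lambda m: m["availability"], reverse=True)
    # Each chosen machine gets a pairwise contract with the owner.
    return [{"owner": owner_id, "replicator": m["id"]} for m in candidates[:replicas]]

print(choose_replicators("m1"))
# [{'owner': 'm1', 'replicator': 'm2'}, {'owner': 'm1', 'replicator': 'm3'}]
```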

    Data Cleaning for XML Electronic Dictionaries via Statistical Anomaly Detection

    Many important forms of data are stored digitally in XML format. Errors can occur in the textual content of the data in the fields of the XML. Fixing these errors manually is time-consuming and expensive, especially for large amounts of data. There is increasing interest in the research, development, and use of automated techniques for assisting with data cleaning. Electronic dictionaries, an important form of data frequently stored in XML format, often have errors introduced through a mixture of manual typographical entry errors and optical character recognition errors. In this paper we describe methods for flagging statistical anomalies as likely errors in electronic dictionaries stored in XML format. We describe six systems based on different sources of information. The systems detect errors using various signals in the data, including uncommon characters, text length, character-based language models, word-based language models, tied-field length ratios, and tied-field transliteration models. Four of the systems detect errors based on expectations automatically inferred from content within elements of a single field type; we call these single-field systems. Two of the systems detect errors based on correspondence expectations automatically inferred from content within elements of multiple related field types; we call these tied-field systems. For each system, we provide an intuitive analysis of the type of error that it is successful at detecting. Finally, we describe two larger-scale evaluations, one using crowdsourcing with Amazon's Mechanical Turk platform and one using the annotations of a domain expert. The evaluations consistently show that the systems are useful for improving the efficiency with which errors in XML electronic dictionaries can be detected. Comment: 8 pages, 4 figures, 5 tables; published in Proceedings of the 2016 IEEE Tenth International Conference on Semantic Computing (ICSC), Laguna Hills, CA, USA, pages 79-86, February 2016
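
    As a sketch of one of the single-field signals, a character-based language model can be trained on the values of a field type and used to flag low-likelihood entries as possible OCR or typing errors. The bigram model and training strings below are illustrative, not the paper's exact models.

```python
# Sketch of a character-bigram language model over values of one field
# type; entries with unusually low per-character likelihood are flagged
# as candidate errors. Training strings are illustrative only.
import math
from collections import Counter

train = ["casa", "perro", "gato", "libro", "ventana", "camino"]
bigrams = Counter()
for w in train:
    padded = f"^{w}$"                     # add start/end markers
    bigrams.update(zip(padded, padded[1:]))
total = sum(bigrams.values())
vocab = len(bigrams) + 1                  # for add-one smoothing

def log_likelihood(word):
    padded = f"^{word}$"
    score = 0.0
    for bg in zip(padded, padded[1:]):
        # add-one smoothing so unseen bigrams are penalized, not impossible
        score += math.log((bigrams[bg] + 1) / (total + vocab))
    return score / (len(padded) - 1)      # length-normalized

for candidate in ["camisa", "c@sa"]:
    print(candidate, round(log_likelihood(candidate), 2))
# the value with the unusual character receives a lower score
```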

    XLIndy: interactive recognition and information extraction in spreadsheets

    Over the years, spreadsheets have established their presence in many domains, including business, government, and science. However, challenges arise because spreadsheets are partially structured and carry implicit (visual and textual) information. This creates a bottleneck for automatic analysis and extraction of information. Therefore, we present XLIndy, a Microsoft Excel add-in with a machine learning back-end written in Python. It showcases our novel methods for layout inference and table recognition in spreadsheets. For a selected task and method, users can visually inspect the results, change configurations, and compare different runs. This enables iterative fine-tuning. Additionally, users can manually revise the predicted layout and tables, and subsequently save them as annotations. The latter are used to measure performance and (re-)train classifiers. Finally, data in the recognized tables can be extracted for further processing. XLIndy supports several standard formats, such as CSV and JSON. Peer reviewed. Postprint (author's final draft).
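
    A hedged sketch of the kind of cell-level featurization that a layout-inference classifier could consume is shown below; the feature set, toy sheet, and labels are illustrative and are not XLIndy's actual features or model.

```python
# Illustrative cell featurization for layout inference: each cell is
# turned into simple content/style/position features that a classifier
# could use to label it as header, data, etc. Not XLIndy's feature set.
def cell_features(value, is_bold=False, row=0, col=0):
    text = "" if value is None else str(value)
    return {
        "is_empty": int(text.strip() == ""),
        "is_numeric": int(text.replace(".", "", 1).replace("-", "", 1).isdigit()),
        "length": len(text),
        "is_bold": int(is_bold),
        "row": row,
        "col": col,
    }

# A toy 2x3 sheet: a bold header row followed by one data row.
sheet = [
    [("Region", True), ("Q1", True), ("Q2", True)],
    [("North", False), ("120.5", False), ("98.0", False)],
]
features = [
    cell_features(value, bold, r, c)
    for r, row in enumerate(sheet)
    for c, (value, bold) in enumerate(row)
]
print(features[0]["is_numeric"], features[4]["is_numeric"])  # 0 1 -> header cell vs data cell
```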

    A Peer-to-Peer Middleware Framework for Resilient Persistent Programming

    The persistent programming systems of the 1980s offered a programming model that integrated computation and long-term storage. In these systems, reliable applications could be engineered without requiring the programmer to write translation code to manage the transfer of data to and from non-volatile storage. More importantly, this simplified the programmer's conceptual model of an application, and avoided the many coherency problems that result from multiple cached copies of the same information. Although technically innovative, persistent languages were not widely adopted, perhaps due in part to their closed-world model: each persistent store was located on a single host, and there were no flexible mechanisms for communication or transfer of data between separate stores. Here we re-open the work on persistence and combine it with modern peer-to-peer techniques in order to provide support for orthogonal persistence in resilient and potentially long-running distributed applications. Our vision is of an infrastructure within which an application can be developed and distributed with minimal modification, whereupon the application becomes resilient to certain failure modes. If a node, or the connection to it, fails during execution of the application, the objects are re-instantiated from distributed replicas, without their reference holders being aware of the failure. Furthermore, we believe that this can be achieved within a spectrum of application programmer intervention, ranging from minimal to totally prescriptive, as desired. The same mechanisms encompass an orthogonally persistent programming model. We outline our approach to implementing this vision, and describe current progress. Comment: Submitted to EuroSys 200
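
    The failure-masking behavior described above, re-instantiating an object from distributed replicas without the reference holder noticing, can be sketched as a resilient reference that falls back to another replica store when a node is down. The class and key names below are hypothetical, not the framework's API.

```python
# Sketch of failure masking via replicas: a reference holds a logical
# object key and, if the node serving it is unreachable, transparently
# re-instantiates the object from another replica. Names are hypothetical.
class NodeDown(Exception):
    pass

class ReplicaStore:
    def __init__(self, name, objects, alive=True):
        self.name, self.objects, self.alive = name, objects, alive

    def fetch(self, key):
        if not self.alive:
            raise NodeDown(self.name)
        return self.objects[key]

class ResilientRef:
    def __init__(self, key, replicas):
        self.key, self.replicas = key, replicas

    def get(self):
        for store in self.replicas:          # try replica stores in order
            try:
                return store.fetch(self.key)
            except NodeDown:
                continue                     # the reference holder never sees the failure
        raise RuntimeError("all replicas unavailable")

primary = ReplicaStore("n1", {"acct:42": {"balance": 10}}, alive=False)
backup = ReplicaStore("n2", {"acct:42": {"balance": 10}})
print(ResilientRef("acct:42", [primary, backup]).get())   # served from the backup replica
```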

    Reengineering workflow for curation of DICOM datasets

    Reusable, publicly available data is a pillar of open science and rapid advancement of cancer imaging research. Sharing data from completed research studies not only saves the research dollars required to collect data, but also helps ensure that studies are both replicable and reproducible. The Cancer Imaging Archive (TCIA) is a global shared repository for imaging data related to cancer. Ensuring the consistency, scientific utility, and anonymity of data stored in TCIA is of utmost importance. As the rate of submission to TCIA has been increasing, both in volume and in complexity of the DICOM objects stored, the process of curating collections has become a bottleneck in the acquisition of data. In order to increase the rate of curation of image sets, improve the quality of the curation, and better track the provenance of changes made to submitted DICOM image sets, a custom set of tools was developed, using novel methods for the analysis of DICOM data sets. These tools are written in the programming language Perl, use the open-source database PostgreSQL, make use of the Perl DICOM routines in the open-source package Posda, and incorporate DICOM diagnostic tools from other open-source packages, such as dicom3tools. These tools are referred to as the Posda Tools. The Posda Tools are open source and available via git at https://github.com/UAMS-DBMI/PosdaTools . In this paper, we briefly describe the Posda Tools and discuss the novel methods they employ to facilitate rapid analysis of DICOM data, including the following: (1) they use a database schema which is more permissive, and differently normalized, than traditional DICOM databases; (2) they perform integrity checks automatically on a bulk basis; (3) they apply revisions to DICOM datasets on a bulk basis, either through a web-based interface or via command-line executable Perl scripts; (4) all such edits are tracked in a revision tracker and may be rolled back; (5) a UI is provided to inspect the results of such edits and verify that they are what was intended; (6) DICOM Studies, Series, and SOP instances are identified using nicknames which are persistent and have well-defined scope, making reported DICOM errors easier to express and manage; and (7) potential duplicate DICOM datasets can be rapidly identified by pixel data, which can be used, e.g., to identify submission subjects that may relate to the same individual, without identifying the individual.
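
    The pixel-data duplicate detection in point (7) can be sketched as hashing each instance's raw pixel bytes and grouping instances that share a digest; the byte strings and UIDs below stand in for real DICOM PixelData and are not the Posda Tools implementation.

```python
# Illustrative duplicate detection by pixel data: hash each instance's raw
# pixel bytes and group instances sharing a digest. The byte strings below
# stand in for real DICOM PixelData.
import hashlib
from collections import defaultdict

instances = {
    "1.2.840.x.1": b"\x00\x01\x02\x03" * 1024,
    "1.2.840.x.2": b"\x00\x01\x02\x03" * 1024,   # same pixels, different SOP instance
    "1.2.840.x.3": b"\xff\xfe\xfd\xfc" * 1024,
}

by_digest = defaultdict(list)
for sop_instance_uid, pixel_data in instances.items():
    by_digest[hashlib.sha1(pixel_data).hexdigest()].append(sop_instance_uid)

duplicates = [uids for uids in by_digest.values() if len(uids) > 1]
print(duplicates)   # [['1.2.840.x.1', '1.2.840.x.2']]
```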