
    ATLAS: A flexible and extensible architecture for linguistic annotation

    We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a suite of tools for manipulating these annotations. The abstract logical model provides for a range of storage formats and promotes the reuse of tools that interact through this API. We focus first on ``Annotation Graphs,'' a graph model for annotations on linear signals (such as text and speech) indexed by intervals, for which efficient database storage and querying techniques are applicable. We note how a wide range of existing annotated corpora can be mapped to this annotation graph model. This model is then generalized to encompass a wider variety of linguistic ``signals,'' including both naturally occurring phenomena (as recorded in images, video, multi-modal interactions, etc.) and the derived resources that are increasingly important to the engineering of natural language processing systems (such as word lists, dictionaries, aligned bilingual corpora, etc.). We conclude with a review of current efforts towards implementing key pieces of this architecture. (Comment: 8 pages, 9 figures)
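
The core idea of the annotation-graph model described above is that annotations are labeled arcs between anchor points on a linear signal. The following is a minimal illustrative sketch of that idea, not the ATLAS API itself; all names (`Node`, `AnnotationGraph`, `annotate`, `query`) are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    """An anchor point on the linear signal (e.g. a time offset in seconds)."""
    id: int
    offset: float

@dataclass
class AnnotationGraph:
    """Labeled arcs between signal anchors; one arc = one annotation."""
    arcs: list = field(default_factory=list)  # (start, end, type, label)

    def annotate(self, start: Node, end: Node, type_: str, label: str):
        self.arcs.append((start, end, type_, label))

    def query(self, type_: str):
        """All annotations of a given type, ordered by start offset."""
        return sorted((a for a in self.arcs if a[2] == type_),
                      key=lambda a: a[0].offset)

# Word- and phrase-level annotations over a short speech signal
n0, n1, n2 = Node(0, 0.00), Node(1, 0.42), Node(2, 0.81)
g = AnnotationGraph()
g.annotate(n0, n1, "word", "hello")
g.annotate(n1, n2, "word", "world")
g.annotate(n0, n2, "phrase", "greeting")
print([a[3] for a in g.query("word")])  # → ['hello', 'world']
```

Because annotations are just interval-indexed arcs, overlapping layers (words, phrases, speaker turns) coexist in one graph and can be queried or stored independently, which is what makes the model amenable to database storage.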

    Hierarchical Event Descriptors (HED): Semi-Structured Tagging for Real-World Events in Large-Scale EEG.

    Real-world brain imaging by EEG requires accurate annotation of complex subject-environment interactions in event-rich tasks and paradigms. This paper describes the evolution of the Hierarchical Event Descriptor (HED) system for systematically describing both laboratory and real-world events. HED version 2, first described here, provides the semantic capability to describe a variety of subject and environmental states. HED descriptions can include stimulus presentation events on screen or in virtual worlds, experimental or spontaneous events occurring in the real-world environment, and events experienced via one or multiple sensory modalities. Furthermore, HED 2 can distinguish between the mere presence of an object and its actual (or putative) perception by a subject. Although the HED framework has implicit ontological and linked-data representations, the user interface for HED annotation is more intuitive than traditional ontological annotation. We believe that hiding the formal representations allows for a more user-friendly interface, making consistent, detailed tagging of experimental and real-world events possible for research users. HED is extensible while retaining the advantages of an enforced common core vocabulary. We have developed a collection of tools to support HED tag assignment and validation; these are available at hedtags.org. A plug-in for EEGLAB (sccn.ucsd.edu/eeglab), CTAGGER, is also available to speed the process of tagging existing studies.
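
The combination of an enforced core vocabulary with user extensibility can be sketched as follows. This is a toy illustration of the validation idea, not the real HED schema or tools: the vocabulary below is a made-up four-entry stand-in for the much larger hierarchical HED vocabulary.

```python
# A toy core vocabulary of hierarchical, slash-delimited tag paths
# (an invented stand-in for the real HED schema).
CORE = {
    "Event/Category/Experimental stimulus",
    "Event/Category/Participant response",
    "Sensory presentation/Visual",
    "Sensory presentation/Auditory",
}

def is_valid(tag: str, extensible: bool = True) -> bool:
    """A tag is valid if it is in the core vocabulary, or (when extension
    is allowed) refines an existing core path with additional levels."""
    if tag in CORE:
        return True
    if extensible:
        return any(tag.startswith(core + "/") for core in CORE)
    return False

print(is_valid("Sensory presentation/Visual"))                # core tag
print(is_valid("Sensory presentation/Visual/Screen/Center"))  # extension
print(is_valid("Made/Up/Tag"))                                # rejected
```

Extensions must hang off an existing core path, so annotations stay mutually comparable at the shared prefix levels even when individual labs add finer detail.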

    CompendiumLD – a tool for effective, efficient and creative learning design

    Developers and teachers go through a complex decision-making process when designing new learning activities – working towards an effective pedagogical mix, combining resources, tools, and student and tutor support. This paper describes CompendiumLD, a prototype tool we have built to support practitioners through the process of designing learning activities. We describe how the tool fits into our vision of a dynamic, interactive set of resources and system tools to support effective, efficient and creative learning design. The paper describes CompendiumLD's features, explains the rationale behind their development, and shows how the tool is intended to help designers make choices and plan developments, facilitating creativity and efficiency in the design process. In our conclusions we consider how such a system can support the design of effective learning activities.

    Towards structured sharing of raw and derived neuroimaging data across existing resources

    Data sharing efforts increasingly contribute to the acceleration of scientific discovery. Neuroimaging data is accumulating in distributed domain-specific databases, and there is currently neither an integrated access mechanism nor an accepted format for the critically important meta-data necessary for making use of the combined, available neuroimaging data. In this manuscript, we present work from the Derived Data Working Group, an open-access group sponsored by the Biomedical Informatics Research Network (BIRN) and the International Neuroinformatics Coordinating Facility (INCF), focused on practical tools for distributed access to neuroimaging data. The working group develops models and tools facilitating the structured interchange of neuroimaging meta-data and is making progress towards a unified set of tools for such data and meta-data exchange. We report on the key components required for integrated access to raw and derived neuroimaging data, as well as associated meta-data and provenance, across neuroimaging resources. The components include (1) a structured terminology that provides semantic context to data, (2) a formal data model for neuroimaging with robust tracking of data provenance, (3) a web service-based application programming interface (API) that provides a consistent mechanism to access and query the data model, and (4) a provenance library that can be used for the extraction of provenance data by image analysts and imaging software developers. We believe that the framework and set of tools outlined in this manuscript have great potential for solving many of the issues the neuroimaging community faces when sharing raw and derived neuroimaging data across the various existing database systems for the purpose of accelerating scientific discovery.
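
Component (2), provenance tracking for derived data, amounts to recording which inputs, tool, and parameters produced each derived file, so that any result can be traced back to raw acquisitions. The sketch below illustrates that idea under invented names (`ProvenanceRecord`, `lineage`, the file names); it is not the working group's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceRecord:
    """Minimal provenance for one processing step: what was produced,
    from which inputs, by which tool/version, and with which parameters."""
    output: str
    inputs: list
    tool: str
    version: str
    params: dict = field(default_factory=dict)

def lineage(records: dict, artifact: str) -> list:
    """Walk provenance records back to the raw inputs of an artifact."""
    rec = records.get(artifact)
    if rec is None:          # no record: raw data, terminates the walk
        return [artifact]
    out = [artifact]
    for src in rec.inputs:
        out += lineage(records, src)
    return out

records = {
    "brain_mask.nii": ProvenanceRecord("brain_mask.nii", ["t1.nii"],
                                       "bet", "6.0", {"frac": 0.5}),
    "stats.nii": ProvenanceRecord("stats.nii",
                                  ["brain_mask.nii", "bold.nii"],
                                  "glm_fit", "1.2"),
}
print(lineage(records, "stats.nii"))
# → ['stats.nii', 'brain_mask.nii', 't1.nii', 'bold.nii']
```

Pairing such records with a structured terminology (component 1) is what lets a query API (component 3) answer questions like "find all statistical maps derived from T1 images processed with tool version X" across databases.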

    FixMiner: Mining Relevant Fix Patterns for Automated Program Repair

    Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionalities. In this context, given the recurrence of bugs across projects, the associated similar patches can be leveraged to extract generic fix actions. While the literature includes various approaches leveraging similarity among patches to guide program repair, these approaches often do not yield fix patterns that are tractable and reusable as actionable input to automated program repair (APR) systems. In this paper, we propose a systematic and automated approach to mining relevant and actionable fix patterns based on an iterative clustering strategy applied to atomic changes within patches. The goal of FixMiner is thus to infer separate and reusable fix patterns that can be leveraged in other patch generation systems. Our technique, FixMiner, leverages the Rich Edit Script, a specialized tree structure of the edit scripts that captures the AST-level context of code changes. FixMiner uses a different tree representation of Rich Edit Scripts for each round of clustering to identify similar changes: abstract syntax trees, edit action trees, and code context trees. We have evaluated FixMiner on thousands of software patches collected from open source projects. Preliminary results show that we are able to mine accurate patterns, efficiently exploiting change information in Rich Edit Scripts. We further integrated the mined patterns into an automated program repair prototype, PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J benchmark. Beyond this quantitative performance, we show that the mined fix patterns are sufficiently relevant to produce patches with a high probability of correctness: 81% of PARFixMiner's generated plausible patches are correct. (Comment: 31 pages, 11 figures)
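
The iterative clustering idea — grouping atomic changes by successively more concrete tree representations — can be sketched as follows. This is a heavily simplified illustration, not FixMiner's implementation: the patch dictionaries stand in for Rich Edit Scripts, and `abstract_code` stands in for comparing tree representations.

```python
import re
from collections import defaultdict

# Toy "atomic changes": an edit action on an AST node plus the changed code
# fragment (invented stand-ins for FixMiner's Rich Edit Scripts).
patches = [
    {"id": "p1", "action": "UPDATE BinaryOp", "code": "x != null"},
    {"id": "p2", "action": "UPDATE BinaryOp", "code": "y != null"},
    {"id": "p3", "action": "INSERT IfStmt",   "code": "if (v == null) return;"},
]

def abstract_code(code: str) -> str:
    """Replace concrete identifiers with a placeholder so that structurally
    identical changes compare equal (keywords and literals are kept)."""
    return re.sub(r"\b(?!null\b|if\b|return\b)[A-Za-z_]\w*", "$v", code)

def cluster(items, key_fn):
    groups = defaultdict(list)
    for item in items:
        groups[key_fn(item)].append(item["id"])
    return dict(groups)

# Round 1 groups by edit-action shape; round 2 refines each group by the
# abstracted code, mirroring the move to a more concrete representation.
round1 = cluster(patches, lambda p: p["action"])
round2 = cluster(patches, lambda p: (p["action"], abstract_code(p["code"])))
print(round1["UPDATE BinaryOp"])  # → ['p1', 'p2']: a recurring null-check fix
```

Clusters that survive the refinement rounds with more than one member (here, the `$v != null` update) are the reusable patterns a repair system can instantiate against new buggy code.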

    The Digital Revolution in Qualitative Research: Working with Digital Audio Data Through Atlas.Ti

    Modern versions of Computer Assisted Qualitative Data Analysis Software (CAQDAS) are enabling the analysis of audio sound files instead of relying solely on text-based analysis. Along with other developments in computer technologies such as the proliferation of digital recording devices and the potential for using streamed media in online academic publication, this innovation is increasing the possibilities of systematically using media-rich, naturalistic data in place of transcribed 'de-naturalised' forms. This paper reports on a project assessing online learning materials that used Atlas.ti software to analyse sound files, and it describes the problems faced in gathering, analysing and using this data for report writing. It concludes that there are still serious barriers to the full and effective integration of audio data into qualitative research: the absence of 'industry standard' recording technology, the underdevelopment of audio interfaces in Atlas.ti (as a key CAQDAS package), and the conventional approach to data use in many online publication formats all place serious restrictions on the integrated use of this data. Nonetheless, it is argued here that there are clear benefits in pushing for resolutions to these problems as the use of this naturalistic data through digital formats may help qualitative researchers to overcome some long-standing methodological issues: in particular, the ability to overcome the reliance on data transcription rather than 'natural' data, and the possibility of implementing research reports that facilitate a more transparent use of 'reusable' data, are both real possibilities when using these digital technologies, which could substantially change the shape of qualitative research practice.
    Keywords: CAQDAS, Recording Technology, Online Publication

    Coordination and control in project-based work: digital objects and infrastructures for delivery

    A major infrastructure project is used to investigate the role of digital objects in the coordination of engineering design work. From a practice-based perspective, research emphasizes objects as important in enabling cooperative knowledge work and knowledge sharing. The term ‘boundary object’ has come to be used in the analysis of mutual and reciprocal knowledge sharing around physical and digital objects. The aim is to extend this work by analysing the introduction of an extranet into the public–private partnership project used to construct a new motorway. Multiple categories of digital objects are mobilized in coordination across heterogeneous, cross-organizational groups. The main findings are that digital objects provide mechanisms for accountability and control, as well as for mutual and reciprocal knowledge sharing; and that different types of objects are nested, forming a digital infrastructure for project delivery. Reconceptualizing boundary objects as a digital infrastructure for delivery has practical implications for management practices on large projects and for the use of digital tools, such as building information models, in construction. It provides a starting point for future research into the changing nature of digitally enabled coordination in project-based work.

    Theories of understanding others: the need for a new account and the guiding role of the person model theory

    What would be an adequate theory of social understanding? In the last decade, the philosophical debate has focused on Theory Theory, Simulation Theory and Interaction Theory as the three possible candidates. In the following, we look carefully at each of these and describe its main advantages and disadvantages. Based on this critical analysis, we formulate the need for a new account of social understanding. We propose the Person Model Theory as an independent new account which has greater explanatory power compared to the existing theories.

    The PEG-BOARD project: A case study for BRIDGE
