
    JISC Final Report: IncReASe (Increasing Repository Content through Automation and Services)

    IncReASe (Increasing Repository Content through Automation and Services) was an eighteen-month project (subsequently extended to twenty months) to enhance White Rose Research Online (WRRO). WRRO is a shared repository of research outputs (primarily publications) from the Universities of Leeds, Sheffield and York; it runs on the EPrints open source repository platform. The repository was created in 2004 and grew steadily but, in common with many similar repositories, had difficulty achieving a “critical mass” of content and becoming truly embedded within researchers’ workflows. The main aim of the IncReASe project was to assess ingestion routes into WRRO with a view to lowering barriers to deposit. We reviewed the feasibility of bulk import of pre-existing metadata and/or full-text research outputs, hoping this activity would have a positive knock-on effect on repository growth and embedding. Prior to the project, we had identified researchers’ reluctance to duplicate effort in metadata creation as a significant barrier to WRRO uptake; we investigated how WRRO might share data with internal and external IT systems. This work included a review of how WRRO, as an institution-based repository, might interact with the subject repository of the Economic and Social Research Council (ESRC). The project addressed four main areas: (i) researcher behaviour: we investigated researcher awareness, motivation and workflow through a survey of archiving activity on the university web sites, a questionnaire and discussions with researchers; (ii) bulk import: we imported data from local systems, including York’s submission data for the 2008 Research Assessment Exercise (RAE), and developed an import plug-in for use with the arXiv repository; (iii) interoperability: we looked at how WRRO might interact with university and departmental publication databases and ESRC’s repository; (iv) metadata: we assessed metadata issues raised by importing publication data from a variety of sources. A number of outputs from the project have been made available from the IncReASe project web site: http://eprints.whiterose.ac.uk/increase/. The project highlighted low levels of researcher awareness of WRRO, and of broader open access issues including research funders’ deposit requirements; we designed some new publicity materials to begin to address this. Departmental publication databases provided a useful jumping-off point for advocacy and liaison; this activity was helpful in promoting awareness of WRRO. Bulk import proved time-consuming, both in adjusting EPrints plug-ins to incorporate different datasets and in the staff time required to improve publication metadata. A number of deposit scenarios were developed in the context of our work with ESRC; we concentrated on investigating how a local deposit of a research paper and attendant metadata in WRRO might be used to populate ESRC’s repository. This work improved our understanding of researcher workflows and of the SWORD protocol as a potential (if partial) solution to the single-deposit, multiple-destination model we wish to develop; we think the prospect of institutional repository / ESRC data sharing is now a step closer. IncReASe experienced some staff recruitment difficulties. It was also necessary to adapt the project to the changing IT landscape at the three partner institutions, in particular the introduction of a centralised publication management system at the University of Leeds. Although these factors had some impact on deliverables, the aims and objectives of the project were largely achieved.
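    The single-deposit, multiple-destination model mentioned above can be sketched in terms of the SWORD (v1.3, AtomPub-based) deposit protocol: one locally prepared package is POSTed to each destination's collection URI. The endpoint URLs, credentials and package below are hypothetical placeholders, not real WRRO or ESRC service addresses; this is a minimal sketch of the request shape, not the project's implementation.

```python
import base64
import urllib.request

def build_sword_deposit(collection_uri: str, package: bytes,
                        username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a SWORD v1.3 deposit request for one destination."""
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        collection_uri,
        data=package,
        headers={
            "Content-Type": "application/zip",
            # Tells the server how the deposited package is laid out.
            "X-Packaging": "http://purl.org/net/sword-types/METSDSpaceSIP",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )

# One local deposit fanned out to several destinations (hypothetical URIs).
destinations = [
    "https://wrro.example.ac.uk/sword/deposit/eprints",
    "https://esrc.example.ac.uk/sword/deposit/outputs",
]
package = b"zipped metadata and full text would go here"
requests_to_send = [
    build_sword_deposit(uri, package, "depositor", "secret")
    for uri in destinations
]
```

    In practice each repository may require a different packaging format, which is part of why SWORD is only a partial solution to the model.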

    The value of research data to the nation

    Executive Director’s report: Ross Wilkinson, ANDS. How can Australia address the challenge of living on bushfire-prone city fringes? How can Australia most effectively farm and preserve our precious soil? How can Australia understand the Great Barrier Reef? No single discipline can answer these questions; to address these challenges, data are needed from a range of sources and disciplines. Research data that are well organised and available allow research to make substantial contributions vital to Australia’s future. For example, by drawing on data that can be used by soil scientists, geneticists, plant scientists, climate analysts and others, it is possible to conduct the multidisciplinary investigations necessary to tackle truly difficult and important challenges. The data might be provided by a Terrestrial Ecosystem Research Network OzFlux tower, insect observations recorded by a citizen through the Atlas of Living Australia, genetic sequencing of insects through a Bioplatforms Australia facility, weather observations by the Bureau of Meteorology, or historical data generated by CSIRO over many decades. Each will provide a part of the jigsaw, but the pieces must be able to be put together. This requires careful collection and organisation, which together deliver enormous value to the country. However, nationally significant problems are often tackled through international cooperation, so Australia’s data assets enable Australian researchers to work with the best in the world, solving problems of both national and international significance. Australia’s data assets and research data infrastructure provide Australian researchers with an excellent platform for international collaboration. Australia has world-leading research data infrastructure: our ability to store, compute, discover, explore, analyse and publish is the best in the world.
Australian researchers can capture data through a wide range of capabilities, from the Australian Synchrotron to Integrated Marine Observing System [IMOS: imos.org.au] ocean gliders; store and compute at national scale through the RDSI, NCI and Pawsey initiatives; publish and discover data through ANDS; and analyse and explore data through Nectar and state and local eResearch capabilities. These are just some of the resources Australian researchers are able to access, and, importantly, their international partners are able to work with them using many of these resources. Australian research organisations are also assembling many resources of their own to support their research, including policies, procedures, practical infrastructure and, very importantly, people: the eResearch teams and the data librarians are always keen to help. This issue of Share highlights how the data resources of Australia are providing a very substantial national benefit, and how that benefit is being realised.

    A Practitioner Survey Exploring the Value of Forensic Tools, AI, Filtering, & Safer Presentation for Investigating Child Sexual Abuse Material

    For those investigating cases of Child Sexual Abuse Material (CSAM), prolonged exposure to illicit content carries a risk of trauma, and research has shown that those working on such cases can experience psychological distress. As a result, there has been a greater effort to create and implement technologies that reduce exposure to CSAM. However, little work has gathered insight from the practitioners who use digital forensic tools and data science technologies regarding their functionality, effectiveness, accuracy and importance. This study focused specifically on examining the value practitioners place on the tools and technologies they use to investigate CSAM cases. General findings indicated that implementing filtering technologies is considered more important than safe-viewing technologies; false positives are a greater concern than false negatives; resources such as time, personnel and money continue to be a concern; and an improved workflow is highly desirable. Results also showed that practitioners are not well versed in data science and Artificial Intelligence (AI), which is alarming given that tools already implement these techniques and that practitioners face large amounts of data during investigations. Finally, the data showed that practitioners are generally not taking advantage of tools that implement data science techniques, and that their biggest needs are automated child nudity detection, age estimation and skin tone detection.
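    A common filtering technique of the kind discussed above is matching file hashes against a database of previously identified material, so examiners only ever see files that automated filtering could not classify. A minimal sketch follows, using plain SHA-256 for simplicity; real deployments use curated hash lists from bodies such as NCMEC and typically perceptual hashes (e.g. PhotoDNA), and the filenames and hash set here are illustrative.

```python
import hashlib

# Hypothetical known-hash database; the single entry is the SHA-256 of b"test".
KNOWN_HASHES = {
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

def filter_known(files: dict[str, bytes]) -> tuple[list[str], list[str]]:
    """Split files into (known_matches, needs_review).

    Known matches can be logged as evidence without being displayed,
    reducing the examiner's exposure to harmful content."""
    known, review = [], []
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        (known if digest in KNOWN_HASHES else review).append(name)
    return known, review
```

    The survey's finding that false positives worry practitioners more than false negatives is one argument for exact or near-exact hash matching over purely ML-based detection.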

    Automation for digital forensics: towards a definition for the community

    With the increasing amount of digital evidence per case, the automation of investigative tasks is of utmost importance to the digital forensics community. Consequently, tools are published, frameworks are released, and artificial intelligence is explored. However, because the foundation, i.e., a definition, classification and common terminology, is missing, the field resembles the wild west: some consider keyword searches or file carving to be automation while others do not. We therefore reviewed the automation literature (in the domain of digital forensics as well as other domains), performed three practitioner interviews, and discussed the topic with domain experts from academia. On this basis, we propose a definition and then showcase several considerations with respect to automation for digital forensics, e.g., what we classify as no/basic automation as well as full automation (autonomous). We conclude that these foundational discussions are required to promote and progress the discipline through a common understanding.
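    The kind of classification the paper argues for can be pictured as an ordered scale of automation levels. The level names and the task-to-level mapping below are illustrative assumptions, not the paper's taxonomy; indeed, the paper's point is precisely that practitioners currently disagree on where tasks such as keyword search or file carving belong.

```python
from enum import IntEnum

class AutomationLevel(IntEnum):
    """Ordered so that levels can be compared (NO_AUTOMATION < ... < FULL)."""
    NO_AUTOMATION = 0  # the examiner performs the task entirely by hand
    BASIC = 1          # a tool runs a predefined routine the examiner triggers
    FULL = 2           # autonomous: the tool decides and acts without supervision

# Illustrative mapping only; a community-agreed definition would fix these.
TASK_LEVELS = {
    "manual timeline review": AutomationLevel.NO_AUTOMATION,
    "keyword search": AutomationLevel.BASIC,
    "file carving": AutomationLevel.BASIC,
    "autonomous triage agent": AutomationLevel.FULL,
}
```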

    Methodology for the Automated Metadata-Based Classification of Incriminating Digital Forensic Artefacts

    The ever-increasing volume of data in digital forensic investigations is one of the most discussed challenges in the field. Usually, most of the file artefacts on seized devices are not pertinent to the investigation, and manually retrieving the suspicious files that are relevant is akin to finding a needle in a haystack. In this paper, a methodology for the automatic prioritisation of suspicious file artefacts (i.e., file artefacts that are pertinent to the investigation) is proposed to reduce the manual analysis effort required. The methodology is designed to work in a human-in-the-loop fashion: it predicts/recommends that an artefact is likely to be suspicious rather than giving a final analysis result. A supervised machine learning approach is employed, which leverages the recorded results of previously processed cases. The processes of feature extraction, dataset generation, training and evaluation are presented, and a toolkit for data extraction from disk images is outlined, enabling the method to be integrated with the conventional investigation process and to work in an automated fashion.
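    The human-in-the-loop prioritisation idea can be sketched as follows: score each artefact from labels recorded in past cases, then sort the review queue by score while leaving the final decision to the examiner. This toy uses a k-nearest-neighbour scorer over hand-picked metadata features; the features, labels and k are illustrative assumptions, not the paper's trained model.

```python
import math

# Toy metadata features per artefact: (size_kb, path_depth, is_archive),
# with labels taken from previously processed cases, as in the paper's approach.
LABELLED = [
    ((2048.0, 6.0, 1.0), True),   # suspicious in a past case
    ((  12.0, 2.0, 0.0), False),  # benign in a past case
    (( 900.0, 5.0, 1.0), True),
    ((  40.0, 3.0, 0.0), False),
]

def suspicion_score(features, k=3):
    """Score in [0, 1]: fraction of the k nearest past artefacts labelled
    suspicious. A recommendation only; the examiner makes the final call."""
    dists = sorted((math.dist(features, f), label) for f, label in LABELLED)
    return sum(label for _, label in dists[:k]) / k

def prioritise(artefacts):
    """Return artefact names ordered most-suspicious-first for review."""
    return sorted(artefacts, key=lambda a: suspicion_score(artefacts[a]),
                  reverse=True)
```

    A real pipeline would normalise features and use a properly evaluated model, but the review-queue structure stays the same.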

    Integrating Clinical Trial Imaging Data Resources Using Service-Oriented Architecture and Grid Computing

    Clinical trials that use imaging typically require data management and workflow integration across several parties. We identify opportunities for all parties involved to realize benefits with a modular interoperability model based on service-oriented architecture and grid computing principles. We discuss middleware products for implementing this model, and propose caGrid as an ideal candidate due to its healthcare focus; free, open source license; and mature developer tools and support.
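    The service-oriented idea is that every party (site, sponsor, core lab) exposes the same service contract, so workflows compose regardless of who hosts the data. The interface and class names below are hypothetical illustrations, not caGrid's actual API, and the in-memory implementation stands in for a grid-hosted service.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass(frozen=True)
class ImageQuery:
    trial_id: str
    subject_id: str
    modality: str  # e.g. "CT", "MR"

class ImagingDataService(ABC):
    """Hypothetical shared contract each party implements."""
    @abstractmethod
    def find_series(self, query: ImageQuery) -> list[str]: ...
    @abstractmethod
    def fetch_series(self, series_uid: str) -> bytes: ...

class InMemorySiteService(ImagingDataService):
    """Toy in-process implementation; a real deployment would back this
    with a grid or web service endpoint."""
    def __init__(self, store: dict[str, tuple[ImageQuery, bytes]]):
        self._store = store
    def find_series(self, query):
        return [uid for uid, (q, _) in self._store.items() if q == query]
    def fetch_series(self, series_uid):
        return self._store[series_uid][1]
```

    Because consumers depend only on `ImagingDataService`, a site can swap its storage backend without breaking the trial's integrated workflow.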

    Local Motion Planner for Autonomous Navigation in Vineyards with a RGB-D Camera-Based Algorithm and Deep Learning Synergy

    With the advent of agriculture 3.0 and 4.0, researchers are increasingly focusing on the development of innovative smart farming and precision agriculture technologies by introducing automation and robotics into agricultural processes. Autonomous agricultural field machines have been gaining significant attention from farmers and industry as a way to reduce costs, human workload, and required resources. Nevertheless, achieving sufficient autonomous navigation capabilities requires the simultaneous cooperation of different processes; localization, mapping, and path planning are just some of the steps that aim at providing the machine with the right set of skills to operate in semi-structured and unstructured environments. In this context, this study presents a low-cost local motion planner for autonomous navigation in vineyards based only on an RGB-D camera, low-range hardware, and a dual-layer control algorithm. The first algorithm exploits the disparity map and its depth representation to generate a proportional control for the robotic platform. Concurrently, a second, back-up algorithm, based on representation learning and resilient to illumination variations, can take control of the machine in case of a momentary failure of the first block. Moreover, due to the dual nature of the system, after initial training of the deep learning model on an initial dataset, the strict synergy between the two algorithms opens the possibility of exploiting new, automatically labeled data coming from the field to extend the existing model's knowledge. The machine learning algorithm has been trained and tested, using transfer learning, on images acquired during different field surveys in northern Italy, and then optimized for on-device inference with model pruning and quantization. Finally, the overall system has been validated with a customized robot platform in the relevant environment.
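    The first control layer described above can be sketched as: from a disparity map (large disparity = near obstacle, e.g. a vine row), find the horizontal centre of the far, traversable region and steer proportionally to its offset from the image centre. The threshold and gain below are illustrative assumptions, not the authors' values, and the backup deep-learning layer is only referenced in a comment.

```python
def proportional_steering(disparity, far_threshold=30, gain=0.01):
    """disparity: 2D list of per-pixel disparity values (large = near).
    Returns a steering command in [-1, 1]; positive steers right."""
    height = len(disparity)
    width = len(disparity[0])
    # A column is free space if most of its pixels are far away.
    free_cols = [
        x for x in range(width)
        if sum(disparity[y][x] < far_threshold for y in range(height)) > height / 2
    ]
    if not free_cols:
        return 0.0  # no free space seen: hold course (the backup layer would act)
    centroid = sum(free_cols) / len(free_cols)
    error = centroid - (width - 1) / 2  # pixels off-centre
    return max(-1.0, min(1.0, gain * error))
```

    With free space centred between two vine rows the error is zero and the platform drives straight, which is the desired in-row behaviour.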