
    RIVPACS database documentation. Final report

    With the advent of the EU Water Framework Directive (WFD), the concept of the 'reference condition' has become explicit within the legislative framework of the European Union. The reference condition has been established as a quality standard against which assessments of biological degradation must be compared. It is therefore essential that Member States can demonstrate that the biological datasets used to define their reference conditions meet the criteria of the WFD. The RIVPACS reference site dataset is therefore central to the definition of reference conditions for macroinvertebrates in streams and rivers in the United Kingdom.
    Objectives of research:
    • To establish the ownership of the RIVPACS reference site dataset
    • To liaise with all stakeholders of the dataset to establish unhindered access to the RIVPACS reference site dataset for the UK agencies (in perpetuity)
    • To deliver the RIVPACS reference site dataset to the UK agencies and to the public domain in a readily accessible database, together with its accompanying physicochemical variables (both existing and newly collated as part of this project), historical and current anthropogenic stress data, and a range of calculated biotic indices.
    Key findings and recommendations: Ownership of the RIVPACS dataset resides with no single organization; several different organizations consider that they own different portions of the dataset. Formal permissions to release the dataset into the public domain have been obtained from all twelve extant organizations identified as having funded various phases of RIVPACS research. In addition, CEH/NERC has also agreed to release the RIVPACS dataset to the public domain. Terms and conditions relating to the end use of the RIVPACS dataset have now been established. The RIVPACS database has been assembled in Microsoft® Access and can now be downloaded from the CEH web site.
This report details the terms and conditions that apply to all end users of the database, and it documents the tables given in the database, their structure, and the origin of their data. A separate Pressure Data Analysis report describes the screening of the RIVPACS sites in terms of the current and emerging definitions of reference condition.

    Accuracy Assessment of the 2006 National Land Cover Database Percent Impervious Dataset

    An impervious surface is any surface that prevents water from infiltrating the ground. As impervious surface area increases within watersheds, stream networks and water quality are negatively impacted. The Multi-Resolution Land Characteristics Consortium developed a percent impervious dataset using Landsat imagery as part of the 2006 National Land Cover Database. This percent impervious dataset estimates imperviousness for each 30-meter cell in the land cover database. The percent impervious dataset permits study of impervious surfaces, can be used to identify impacted or critical areas, and allows for development of impact mitigation plans; however, the accuracy of this dataset is unknown. To determine the accuracy of the 2006 percent impervious dataset, reference data were digitized from one-foot digital aerial imagery for three study areas in Arkansas, USA. Digitized reference data were compared to percent impervious dataset estimates of imperviousness at 900 m², 8,100 m², and 22,500 m² sample grids to determine if accuracy varied by ground area. Analyses showed percent impervious estimates and digitized reference data differ modestly; however, as ground area increases, percent impervious estimates and reference data match more closely. These findings suggest that the percent impervious dataset is useful for planning purposes for ground areas of at least 2.25 ha.
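    The comparison described above can be sketched as follows. This is a minimal 1-D toy, not the study's actual workflow: the per-cell percentages are invented, and groupings of 1 and 4 cells stand in for the 900 m², 8,100 m² (9-cell), and 22,500 m² (25-cell) grids used in the paper.

```python
# Toy sketch: agreement between digitized reference imperviousness and
# dataset estimates, at the cell level and after aggregating into coarser
# sample grids. All values are illustrative, not from the study.

def mean_abs_difference(reference, estimated):
    """Mean absolute difference between paired percent-impervious values."""
    assert len(reference) == len(estimated)
    return sum(abs(r - e) for r, e in zip(reference, estimated)) / len(reference)

def aggregate(cells, cells_per_grid):
    """Average consecutive 30 m cell values into coarser sample grids."""
    return [
        sum(cells[i:i + cells_per_grid]) / cells_per_grid
        for i in range(0, len(cells) - cells_per_grid + 1, cells_per_grid)
    ]

# Illustrative per-cell percent-impervious values (reference vs. estimate).
ref = [10, 30, 55, 20, 40, 60, 15, 35, 50, 25, 45, 65]
est = [14, 24, 60, 18, 46, 52, 20, 30, 55, 22, 50, 58]

# One 30 m cell = 900 m^2; grouping 4 cells mimics a coarser sample grid.
for n_cells, label in [(1, "900 m^2 (single cell)"), (4, "4-cell grid (toy)")]:
    d = mean_abs_difference(aggregate(ref, n_cells), aggregate(est, n_cells))
    print(f"{label}: mean abs difference = {d:.2f}%")
```

    As in the study's findings, per-cell disagreements partly cancel when averaged over larger ground areas, so the aggregated grids match more closely than individual cells.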

    Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text

    Real-world multimedia data is often composed of multiple modalities, such as an image or a video with associated text (e.g. captions, user comments, etc.) and metadata. Such multimodal data packages are prone to manipulations, where a subset of these modalities can be altered to misrepresent or repurpose data packages, with possible malicious intent. It is, therefore, important to develop methods to assess or verify the integrity of these multimedia packages. Using computer vision and natural language processing methods to directly compare the image (or video) and the associated caption to verify the integrity of a media package is only possible for a limited set of objects and scenes. In this paper, we present a novel deep learning-based approach for assessing the semantic integrity of multimedia packages containing images and captions, using a reference set of multimedia packages. We construct a joint embedding of images and captions with deep multimodal representation learning on the reference dataset in a framework that also provides image-caption consistency scores (ICCSs). The integrity of query media packages is assessed as the inlierness of the query ICCSs with respect to the reference dataset. We present the MultimodAl Information Manipulation dataset (MAIM), a new dataset of media packages from Flickr, which we make available to the research community. We use both the newly created dataset as well as the Flickr30K and MS COCO datasets to quantitatively evaluate our proposed approach. The reference dataset does not contain unmanipulated versions of tampered query packages. Our method is able to achieve F1 scores of 0.75, 0.89 and 0.94 on MAIM, Flickr30K and MS COCO, respectively, for detecting semantically incoherent media packages.
    Comment: Ayush Jaiswal and Ekraam Sabir contributed equally to the work in this paper.
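    The consistency-score-plus-inlierness idea can be illustrated with a deliberately simplified stand-in: score an (image, caption) pair by the cosine similarity of their joint-space embeddings, then flag a query whose score falls far outside the distribution of reference scores. The 2-D "embeddings", reference scores, and mean/standard-deviation inlier test below are all toy assumptions, not the paper's learned model.

```python
# Hypothetical sketch of an image-caption consistency score (ICCS) and a
# simple inlierness check against a reference set. Embeddings and scores
# are invented for illustration.
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def is_inlier(query_score, reference_scores, n_std=3.0):
    """Inlier if within n_std standard deviations of the reference mean."""
    m = sum(reference_scores) / len(reference_scores)
    var = sum((s - m) ** 2 for s in reference_scores) / len(reference_scores)
    return abs(query_score - m) <= n_std * math.sqrt(var)

# ICCSs computed on an (assumed) reference set of untampered packages.
reference_scores = [0.92, 0.88, 0.96, 0.90, 0.94, 0.98]

consistent_pair = cosine([0.6, 0.8], [0.6, 0.8])      # embeddings agree
tampered_pair = cosine([0.6, 0.8], [0.95, -0.1])      # caption repurposed

print("consistent pair is inlier:", is_inlier(consistent_pair, reference_scores))
print("tampered pair is inlier:", is_inlier(tampered_pair, reference_scores))
```

    The paper's approach learns the joint embedding with deep multimodal representation learning and uses a more principled outlier model; the sketch only conveys why a repurposed caption yields an anomalously low consistency score.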

    Improved Imputation of Common and Uncommon Single Nucleotide Polymorphisms (SNPs) with a New Reference Set

    Statistical imputation of genotype data is an important technique for analysis of genome-wide association studies (GWAS). We have built a reference dataset to improve imputation accuracy for studies of individuals of primarily European descent using genotype data from the Hap1, Omni1, and Omni2.5 human SNP arrays (Illumina). Our dataset contains 2.5-3.1 million variants for 930 European, 157 Asian, and 162 African/African-American individuals. Imputation accuracy of European data from Hap660 or OmniExpress array content, measured by the proportion of variants imputed with R² > 0.8, improved by 34%, 23% and 12% for variants with MAF of 3%, 5% and 10%, respectively, compared to imputation using publicly available data from the 1,000 Genomes and International HapMap projects. The improved accuracy with the use of the new dataset could increase the power for GWAS by as much as 8% relative to genotyping all variants. This reference dataset is available to the scientific community through the NCBI dbGaP portal. Future versions will include additional genotype data as well as non-European populations.
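    The accuracy metric used above (proportion of variants imputed with R² > 0.8, per minor-allele-frequency bin) is straightforward to compute once per-variant imputation R² values are in hand. A minimal sketch, with invented R² and MAF values:

```python
# Sketch of the abstract's accuracy metric: the fraction of variants whose
# imputation R^2 exceeds a cutoff, optionally restricted to a MAF range.
# The variant records below are illustrative, not from the dataset.

def well_imputed_fraction(variants, r2_threshold=0.8, maf_min=None, maf_max=None):
    """Fraction of variants (dicts with 'r2' and 'maf') passing the R^2 cutoff."""
    selected = [
        v for v in variants
        if (maf_min is None or v["maf"] >= maf_min)
        and (maf_max is None or v["maf"] < maf_max)
    ]
    if not selected:
        return 0.0
    return sum(v["r2"] > r2_threshold for v in selected) / len(selected)

variants = [
    {"maf": 0.03, "r2": 0.85}, {"maf": 0.03, "r2": 0.70},
    {"maf": 0.05, "r2": 0.90}, {"maf": 0.05, "r2": 0.82},
    {"maf": 0.10, "r2": 0.95}, {"maf": 0.10, "r2": 0.88},
]

print(well_imputed_fraction(variants))                 # all variants
print(well_imputed_fraction(variants, maf_min=0.05))   # more common variants
```

    Comparing this fraction between two reference panels, within each MAF bin, gives the kind of relative improvement (34%, 23%, 12%) reported in the abstract.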

    REVIEW - A reference data set for retinal vessel profiles

    This paper describes REVIEW, a new retinal vessel reference dataset. This dataset includes 16 images with 193 vessel segments, demonstrating a variety of pathologies and vessel types. The vessel edges are marked by three observers using a special drawing tool. The paper also describes the algorithm used to process these segments to produce vessel profiles, against which vessel width measurement algorithms can be assessed. Recommendations are given for use of the dataset in performance assessment. REVIEW can be downloaded from http://ReviewDB.lincoln.ac.uk
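    One way a reference profile can be derived from multi-observer edge marks, sketched below, is to take each observer's edge-to-edge distance as a width estimate and average across observers. This is a hedged illustration of the general idea only: the coordinates are invented, and REVIEW's actual profile-construction algorithm is more involved.

```python
# Toy sketch: a reference vessel width from three observers' edge marks,
# against which an algorithm's width estimate can be assessed.
# Point coordinates are made up for illustration.
import math

def width(p_left, p_right):
    """Euclidean distance between the two marked vessel edges."""
    return math.dist(p_left, p_right)

def reference_width(observer_marks):
    """Mean width over all observers' edge-point pairs for one cross-section."""
    widths = [width(left, right) for left, right in observer_marks]
    return sum(widths) / len(widths)

# Three observers mark the same cross-section ((x, y) image coordinates).
marks = [((10.0, 5.0), (10.0, 11.0)),
         ((10.1, 5.2), (10.0, 10.9)),
         ((9.9, 4.9), (10.1, 11.1))]

ref = reference_width(marks)
algorithm_estimate = 6.3  # hypothetical output of a width-measurement algorithm
print(f"reference width = {ref:.2f} px, error = {abs(algorithm_estimate - ref):.2f} px")
```

    Averaging over observers reduces the influence of any single observer's marking error, which is why the dataset records marks from three observers per segment.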

    Pattern-based phylogenetic distance estimation and tree reconstruction

    We have developed an alignment-free method that calculates phylogenetic distances using a maximum likelihood approach for a model of sequence change on patterns that are discovered in unaligned sequences. To evaluate the phylogenetic accuracy of our method, and to conduct a comprehensive comparison of existing alignment-free methods (freely available as Python package decaf+py at http://www.bioinformatics.org.au), we have created a dataset of reference trees covering a wide range of phylogenetic distances. Amino acid sequences were evolved along the trees and input to the tested methods; from their calculated distances we inferred trees whose topologies we compared to the reference trees. We find our pattern-based method statistically superior to all other tested alignment-free methods on this dataset. We also demonstrate the general advantage of alignment-free methods over an approach based on automated alignments when sequences violate the assumption of collinearity. Similarly, we compare methods on empirical data from an existing alignment benchmark set that we used to derive reference distances and trees. Our pattern-based approach yields distances that show a linear relationship to reference distances over a substantially longer range than other alignment-free methods. The pattern-based approach outperforms alignment-free methods, and its phylogenetic accuracy is statistically indistinguishable from alignment-based distances.
    Comment: 21 pages, 3 figures, 2 tables.
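    The core alignment-free idea can be illustrated with a generic stand-in: represent each unaligned sequence by its k-mer (pattern) counts and derive a pairwise distance from those counts. The cosine-based distance below is a common simple choice and is not the paper's maximum-likelihood model; the sequences are invented.

```python
# Toy alignment-free distance: compare k-mer count profiles of unaligned
# sequences. A stand-in for the general technique, not the paper's method.
import math
from collections import Counter

def kmer_profile(seq, k=2):
    """Counts of all overlapping k-mers in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_distance(a, b, k=2):
    """1 - cosine similarity of k-mer count vectors (0 = identical profiles)."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    keys = set(pa) | set(pb)
    dot = sum(pa[x] * pb[x] for x in keys)
    na = math.sqrt(sum(c * c for c in pa.values()))
    nb = math.sqrt(sum(c * c for c in pb.values()))
    return 1.0 - dot / (na * nb)

s1 = "MKTAYIAKQR"   # illustrative amino acid sequences
s2 = "MKTAYIAKQW"   # one substitution relative to s1
s3 = "GGGPLLWWNC"   # unrelated

print(kmer_distance(s1, s2))  # small: profiles mostly shared
print(kmer_distance(s1, s3))  # large: no shared 2-mers
```

    Because no alignment is required, such distances remain computable when sequences violate collinearity (e.g. after rearrangements), which is the setting where the abstract reports alignment-free methods winning.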

    Spatio-temporal Video Re-localization by Warp LSTM

    The need to efficiently find the video content a user wants is increasing because of the explosion of user-generated videos on the Web. Existing keyword-based or content-based video retrieval methods usually determine what occurs in a video, but not when and where. In this paper, we answer the question of when and where by formulating a new task, namely spatio-temporal video re-localization. Specifically, given a query video and a reference video, spatio-temporal video re-localization aims to localize tubelets in the reference video such that the tubelets semantically correspond to the query. To accurately localize the desired tubelets in the reference video, we propose a novel warp LSTM network, which propagates spatio-temporal information over a long period and thereby captures the corresponding long-term dependencies. Another issue for spatio-temporal video re-localization is the lack of properly labeled video datasets. Therefore, we reorganize the videos in the AVA dataset to form a new dataset for spatio-temporal video re-localization research. Extensive experimental results show that the proposed model achieves superior performance over the designed baselines on the spatio-temporal video re-localization task.