RIVPACS database documentation. Final report
With the advent of the EU Water Framework Directive the concept of the 'reference condition' has become explicit within the legislative framework of the European Union. Reference condition has been established as a quality standard against which assessments of biological degradation must be compared. It is therefore essential that Member States can demonstrate that the biological datasets used to define their reference conditions meet the criteria of the WFD. The RIVPACS reference site dataset is therefore central to the definition of reference conditions for macroinvertebrates in streams and rivers in the United Kingdom.
Objectives of research:
• To establish the ownership of the RIVPACS reference site dataset
• To liaise with all stakeholders of the dataset to establish unhindered access to the RIVPACS reference site dataset for the UK agencies (in perpetuity)
• To deliver the RIVPACS reference site dataset to the UK agencies and to the public domain in a readily accessible database together with its accompanying physicochemical variables (both existing and newly collated as part of this project), historical and current anthropogenic stress data, and a range of calculated biotic indices.
Key findings and recommendations:
No single organization owns the RIVPACS dataset; several organizations each consider that they own different portions of it. Formal permissions to release the dataset into the public domain have been obtained from all twelve extant organizations that have been identified as having funded various phases of RIVPACS research. In addition, CEH/NERC has agreed to release the RIVPACS dataset to the public domain. Terms and conditions relating to the end use of the RIVPACS dataset have now been established. The RIVPACS database has been assembled in Microsoft® Access and can now be downloaded from the CEH web site. This report details the terms and conditions that apply to all end users of the database, and it documents the tables given in the database, their structure and the origin of their data. A separate Pressure Data Analysis report describes the screening of the RIVPACS sites in terms of the current and emerging definitions of reference condition.
Accuracy Assessment of the 2006 National Land Cover Database Percent Impervious Dataset
An impervious surface is any surface that prevents water from infiltrating the ground. As impervious surface area increases within watersheds, stream networks and water quality are negatively impacted. The Multi-Resolution Land Characteristic Consortium developed a percent impervious dataset using Landsat imagery as part of the 2006 National Land Cover Database. This percent impervious dataset estimates imperviousness for each 30-meter cell in the land cover database. The percent impervious dataset permits study of impervious surfaces, can be used to identify impacted or critical areas, and allows for development of impact mitigation plans; however, the accuracy of this dataset is unknown. To determine the accuracy of the 2006 percent impervious dataset, reference data were digitized from one-foot digital aerial imagery for three study areas in Arkansas, USA. Digitized reference data were compared to percent impervious dataset estimates of imperviousness at multiple 900 m², 8,100 m², and 22,500 m² sample grids to determine if accuracy varied by ground area. Analyses showed percent impervious estimates and digitized reference data differ modestly; however, as ground area increases, percent impervious estimates and reference data match more closely. These findings suggest that the percent impervious dataset is useful for planning purposes for ground areas of at least 2.25 ha.
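The three sample-grid areas correspond to square blocks of 1, 3 and 5 cells per side at 30 m resolution (30 m, 90 m and 150 m sides). The sketch below is only an illustration of comparing estimates with reference data at those block sizes; the function names and the choice of mean absolute error are assumptions, not the study's documented procedure.

```python
CELL_SIDE_M = 30  # side length of one land cover cell, from the abstract


def block_area_m2(k):
    """Ground area of a k x k block of 30 m cells, in square metres."""
    return (k * CELL_SIDE_M) ** 2


def block_means(grid, k):
    """Average a 2-D grid of per-cell percentages over non-overlapping k x k blocks.

    Rows and columns that do not fill a complete block are ignored.
    """
    rows, cols = len(grid), len(grid[0])
    means = []
    for r in range(0, rows - rows % k, k):
        for c in range(0, cols - cols % k, k):
            cells = [grid[i][j] for i in range(r, r + k) for j in range(c, c + k)]
            means.append(sum(cells) / len(cells))
    return means


def mean_abs_error(estimated, reference, k):
    """Mean absolute difference between block-averaged estimates and reference values.

    Hypothetical accuracy measure chosen for illustration only.
    """
    est = block_means(estimated, k)
    ref = block_means(reference, k)
    return sum(abs(e - r) for e, r in zip(est, ref)) / len(est)
```

With per-cell errors that cancel within a block, `mean_abs_error` shrinks as `k` grows, which is one simple way agreement can improve with ground area.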
Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text
Real world multimedia data is often composed of multiple modalities such as
an image or a video with associated text (e.g. captions, user comments, etc.)
and metadata. Such multimodal data packages are prone to manipulations, where a
subset of these modalities can be altered to misrepresent or repurpose data
packages, with possible malicious intent. It is, therefore, important to
develop methods to assess or verify the integrity of these multimedia packages.
Using computer vision and natural language processing methods to directly
compare the image (or video) and the associated caption to verify the integrity
of a media package is only possible for a limited set of objects and scenes. In
this paper, we present a novel deep learning-based approach for assessing the
semantic integrity of multimedia packages containing images and captions, using
a reference set of multimedia packages. We construct a joint embedding of
images and captions with deep multimodal representation learning on the
reference dataset in a framework that also provides image-caption consistency
scores (ICCSs). The integrity of query media packages is assessed as the
inlierness of the query ICCSs with respect to the reference dataset. We present
the MultimodAl Information Manipulation dataset (MAIM), a new dataset of media
packages from Flickr, which we make available to the research community. We use
both the newly created dataset as well as Flickr30K and MS COCO datasets to
quantitatively evaluate our proposed approach. The reference dataset does not
contain unmanipulated versions of tampered query packages. Our method is able
to achieve F1 scores of 0.75, 0.89 and 0.94 on MAIM, Flickr30K and MS COCO,
respectively, for detecting semantically incoherent media packages. Comment:
*Ayush Jaiswal and Ekraam Sabir contributed equally to the work in this paper
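The idea of scoring a query package by the inlierness of its consistency score relative to a reference set can be sketched as follows. This is a toy illustration under stated assumptions: cosine similarity between already-computed image and caption embeddings stands in for the learned ICCS, and a simple low-percentile cutoff stands in for whatever outlier detector the authors use. All names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (stand-in consistency score)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def lower_threshold(reference_scores, fraction=0.05):
    """Score at roughly the given lower fraction of the reference scores."""
    ordered = sorted(reference_scores)
    index = int(fraction * (len(ordered) - 1))
    return ordered[index]


def is_inlier(query_score, reference_scores, fraction=0.05):
    """Treat a query as consistent if its score is not below the reference cutoff."""
    return query_score >= lower_threshold(reference_scores, fraction)
```

A query whose image and caption embeddings disagree gets a low score and falls below the cutoff derived from the reference packages.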
Improved Imputation of Common and Uncommon Single Nucleotide Polymorphisms (SNPs) with a New Reference Set
Statistical imputation of genotype data is an important technique for analysis of genome-wide association studies (GWAS). We have built a reference dataset to improve imputation accuracy for studies of individuals of primarily European descent using genotype data from the Hap1, Omni1, and Omni2.5 human SNP arrays (Illumina). Our dataset contains 2.5-3.1 million variants for 930 European, 157 Asian, and 162 African/African-American individuals. Imputation accuracy of European data from Hap660 or OmniExpress array content, measured by the proportion of variants imputed with R² > 0.8, improved by 34%, 23% and 12% for variants with MAF of 3%, 5% and 10%, respectively, compared to imputation using publicly available data from the 1000 Genomes and International HapMap projects. The improved accuracy with the use of the new dataset could increase the power for GWAS by as much as 8% relative to genotyping all variants. This reference dataset is available to the scientific community through the NCBI dbGaP portal. Future versions will include additional genotype data as well as non-European populations.
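The accuracy measure quoted above, the proportion of variants imputed with R² above 0.8, can be sketched as below. The abstract does not say how R² was obtained, so this example assumes it is the squared Pearson correlation between known genotypes (0, 1 or 2 allele copies) and imputed dosages for each variant; the function names are hypothetical.

```python
def r_squared(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages.

    Returns 0.0 when either series has no variance (correlation undefined).
    """
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return 0.0
    return cov * cov / (var_t * var_d)


def proportion_well_imputed(variants, threshold=0.8):
    """Fraction of variants whose R^2 exceeds the threshold.

    `variants` is a list of (true_genotypes, imputed_dosages) pairs.
    """
    scores = [r_squared(t, d) for t, d in variants]
    return sum(score > threshold for score in scores) / len(scores)
```

Comparing this proportion between two reference panels, within a minor allele frequency bin, gives the kind of improvement figure reported in the abstract.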
REVIEW - A reference data set for retinal vessel profiles
This paper describes REVIEW, a new retinal vessel reference dataset. This dataset includes 16 images with 193 vessel segments, demonstrating a variety of pathologies and vessel types. The vessel edges are marked by three observers using a special drawing tool. The paper also describes the algorithm used to process these segments to produce vessel profiles, against which vessel width measurement algorithms can be assessed. Recommendations are given for use of the dataset in performance assessment. REVIEW can be downloaded from http://ReviewDB.lincoln.ac.uk
Pattern-based phylogenetic distance estimation and tree reconstruction
We have developed an alignment-free method that calculates phylogenetic
distances using a maximum likelihood approach for a model of sequence change on
patterns that are discovered in unaligned sequences. To evaluate the
phylogenetic accuracy of our method, and to conduct a comprehensive comparison
of existing alignment-free methods (freely available as Python package decaf+py
at http://www.bioinformatics.org.au), we have created a dataset of reference
trees covering a wide range of phylogenetic distances. Amino acid sequences
were evolved along the trees and input to the tested methods; from their
calculated distances we inferred trees whose topologies we compared to the
reference trees.
We find our pattern-based method statistically superior to all other tested
alignment-free methods on this dataset. We also demonstrate the general
advantage of alignment-free methods over an approach based on automated
alignments when sequences violate the assumption of collinearity. Similarly, we
compare methods on empirical data from an existing alignment benchmark set that
we used to derive reference distances and trees. Our pattern-based approach
yields distances that show a linear relationship to reference distances over a
substantially longer range than other alignment-free methods. The pattern-based
approach outperforms alignment-free methods and its phylogenetic accuracy is
statistically indistinguishable from alignment-based distances. Comment: 21 pages, 3 figures, 2 tables
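For readers unfamiliar with alignment-free distances in general, the sketch below computes a distance from k-mer frequency profiles of two unaligned sequences. It is a generic, much simpler technique swapped in for illustration; it is not the pattern-based maximum likelihood method described in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(sequence, k=2):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_distance(seq_a, seq_b, k=2):
    """Euclidean distance between the k-mer frequency profiles of two sequences.

    Both sequences must be at least k characters long.
    """
    counts_a, counts_b = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    kmers = set(counts_a) | set(counts_b)
    return math.sqrt(sum((counts_a[x] / total_a - counts_b[x] / total_b) ** 2
                         for x in kmers))
```

A matrix of such pairwise distances can be passed to a standard distance-based tree-building method, which is the general workflow the abstract evaluates.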
Spatio-temporal Video Re-localization by Warp LSTM
The need to efficiently find the video content a user wants is increasing
because of the explosion of user-generated videos on the Web. Existing
keyword-based or content-based video retrieval methods usually determine what
occurs in a video but not when and where. In this paper, we answer the
question of when and where by formulating a new task, namely
spatio-temporal video re-localization. Specifically, given a query video and a
reference video, spatio-temporal video re-localization aims to localize
tubelets in the reference video such that the tubelets semantically correspond
to the query. To accurately localize the desired tubelets in the reference
video, we propose a novel warp LSTM network, which propagates the
spatio-temporal information for a long period and thereby captures the
corresponding long-term dependencies. Another issue for spatio-temporal video
re-localization is the lack of properly labeled video datasets. Therefore, we
reorganize the videos in the AVA dataset to form a new dataset for
spatio-temporal video re-localization research. Extensive experimental results
show that the proposed model outperforms the designed baselines on the
spatio-temporal video re-localization task.
