
    Reconstructing human-generated provenance through similarity-based clustering

    In this paper, we revisit our method for reconstructing the primary sources of documents, which make up an important part of their provenance. Our method is based on the assumption that if two documents are semantically similar, there is a high chance that they also share a common source. We previously evaluated this assumption on an excerpt from a news archive, achieving 68.2% precision and 73% recall when reconstructing the primary sources of all articles. However, because we could not release this dataset to the public, our results were hard to compare with others. In this work, we extend the flexibility of our method by adding a new parameter, and re-evaluate it on the human-generated dataset created for the 2014 Provenance Reconstruction Challenge. The extended method achieves up to 86% precision and 59% recall, and is now directly comparable to any approach that uses the same dataset.
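The core assumption of the abstract above (semantically similar documents likely share a source) can be sketched as a similarity search with a tunable threshold. This is a minimal illustration, not the authors' implementation: the feature choice (bag-of-words vectors with cosine similarity) and the function names are assumptions.

```python
# Hypothetical sketch: find candidate sources for a document by
# cosine similarity over simple word-count vectors. The threshold is
# the kind of tunable parameter that trades precision against recall.
import math
from collections import Counter

def word_vector(text):
    """Bag-of-words count vector for a document."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def candidate_sources(doc, corpus, threshold=0.5):
    """Return corpus documents whose similarity to `doc` exceeds the
    threshold; these are the candidate primary sources."""
    v = word_vector(doc)
    return [other for other in corpus
            if other != doc and cosine(v, word_vector(other)) >= threshold]
```

Raising the threshold makes the method stricter (fewer, more reliable candidates), which is consistent with the precision/recall trade-off reported in the abstract.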

    Reconstructing Provenance


    Automatic discovery of high-level provenance using semantic similarity

    As interest in provenance grows among the Semantic Web community, it is recognized as a useful tool across many domains. However, existing automatic provenance collection techniques are not universally applicable. Most existing methods either rely on (low-level) observed provenance, or require that the user discloses formal workflows. In this paper, we propose a new approach for automatic discovery of provenance, at multiple levels of granularity. To accomplish this, we detect entity derivations, relying on clustering algorithms, linked data and semantic similarity. The resulting derivations are structured in compliance with the Provenance Data Model (PROV-DM). While the proposed approach is purposely kept general, allowing adaptation in many use cases, we provide an implementation for one of these use cases, namely discovering the sources of news articles. With this implementation, we were able to detect 73% of the original sources of 410 news stories, at 68% precision. Lastly, we discuss possible improvements and future work.
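The derivation-structuring step described above can be illustrated with a toy function that turns a cluster of similar entities into PROV-style `wasDerivedFrom` statements. This is a sketch under stated assumptions: the IRIs, the use of publication date to pick the presumed source, and the PROV-N-style string output are all illustrative, not the paper's actual pipeline.

```python
# Illustrative only: given a cluster of semantically similar entities,
# assume the earliest-published one is the source and derive the rest
# from it, emitting PROV-N-style wasDerivedFrom statements.
from datetime import date

def cluster_derivations(cluster):
    """cluster: list of (iri, publication_date) pairs judged similar.
    Returns one wasDerivedFrom statement per later entity."""
    ordered = sorted(cluster, key=lambda entity: entity[1])
    source_iri = ordered[0][0]
    return [f"wasDerivedFrom({iri}, {source_iri})" for iri, _ in ordered[1:]]
```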

    Reproducibility of scientific workflows execution using cloud-aware provenance (ReCAP)

    © 2018, Springer-Verlag GmbH Austria, part of Springer Nature. Provenance of scientific workflows has been considered a means of providing workflow reproducibility. However, the provenance approaches adopted so far are not applicable in the context of the Cloud because the provenance trace lacks Cloud-specific information. This paper presents a novel approach that collects Cloud-aware provenance and represents it as a graph. Workflow execution reproducibility on the Cloud is determined by comparing the workflow provenance at three levels, i.e., workflow structure, execution infrastructure and workflow outputs. The experimental evaluation shows that the implemented approach can detect changes in the provenance traces and in the outputs produced by the workflow.
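The three-level comparison described in this abstract can be sketched as a simple check over two provenance records. The field names below are hypothetical stand-ins, not ReCAP's actual graph schema; the point is only the structure of the comparison.

```python
# Sketch of a three-level reproducibility check, assuming provenance
# traces captured as flat dicts (the real system uses graphs).
def same_execution(trace_a, trace_b):
    """Compare two traces at three levels (workflow structure,
    execution infrastructure, workflow outputs) and report which
    levels match and whether the execution is deemed reproduced."""
    checks = {
        "structure": trace_a["jobs"] == trace_b["jobs"],
        "infrastructure": trace_a["vm_flavor"] == trace_b["vm_flavor"],
        "outputs": trace_a["output_hashes"] == trace_b["output_hashes"],
    }
    return all(checks.values()), checks
```

Returning the per-level results, not just a boolean, mirrors the abstract's claim that the approach can localize where a re-execution diverged.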

    Enabling automatic provenance-based trust assessment of web content


    Hydrologic Information Systems: Advancing Cyberinfrastructure for Environmental Observatories

    Recently, community initiatives have emerged for the establishment of large-scale environmental observatories. Cyberinfrastructure is the backbone upon which these observatories will be built, and scientists' ability to access and use the data collected within observatories to address research questions will depend on the successful implementation of cyberinfrastructure. The research described in this dissertation advances the cyberinfrastructure available for supporting environmental observatories. This has been accomplished through both development of new cyberinfrastructure components as well as through the demonstration and application of existing tools, with a specific focus on point observations data. The cyberinfrastructure that was developed and deployed to support collection, management, analysis, and publication of data generated by an environmental sensor network in the Little Bear River environmental observatory test bed is described, as is the sensor network design and deployment. Results of several analyses that demonstrate how high-frequency data enable identification of trends and analysis of physical, chemical, and biological behavior that would be impossible using traditional, low-frequency monitoring data are presented. This dissertation also illustrates how the cyberinfrastructure components demonstrated in the Little Bear River test bed have been integrated into a data publication system that is now supporting a nationwide network of 11 environmental observatory test bed sites, as well as other research sites within and outside of the United States. Enhancements to the infrastructure for research and education that are enabled by this research are impacting a diverse community, including the national community of researchers involved with prospective Water and Environmental Research Systems (WATERS) Network environmental observatories as well as other observatory efforts, research watersheds, and test beds. The results of this research provide insight into and potential solutions for some of the bottlenecks associated with design and implementation of cyberinfrastructure for observatory support.

    Retrieving haystacks: a data driven information needs model for faceted search.

    The research aim was to develop an understanding of information need characteristics for word co-occurrence-based search result filters (facets). No prior research has been identified into what enterprise searchers may find useful for exploratory search and why. Various word co-occurrence techniques were applied to results from sample queries performed on industry membership content. The results were used in an international survey of 54 practising petroleum engineers from 32 organizations. Subject familiarity, job role, personality and query specificity are possible causes for survey response variation. An information needs model is presented: Broad, Rich, Intriguing, Descriptive, General, Expert and Situational (BRIDGES). This may help professionals to more effectively meet their information needs and stimulate new needs, improving a system's ability to facilitate serendipity. This research has implications for faceted search in enterprise search and digital library deployments.

    Systems Biology Knowledgebase for a New Era in Biology: A Genomics:GTL Report from the May 2008 Workshop


    Proceedings of the 10th International Conference on Ecological Informatics: translating ecological data into knowledge and decisions in a rapidly changing world: ICEI 2018

    The Conference Proceedings are an impressive display of the current scope of Ecological Informatics. Whilst Data Management, Analysis, Synthesis and Forecasting have been lasting popular themes over the past nine biannual ICEI conferences, ICEI 2018 addresses distinctively novel developments in Data Acquisition enabled by cutting-edge in situ and remote sensing technology. The ICEI 2018 abstracts presented here capture well the current trends and challenges of Ecological Informatics towards:
    • regional, continental and global sharing of ecological data,
    • thorough integration of complementing monitoring technologies including DNA-barcoding,
    • sophisticated pattern recognition by deep learning,
    • advanced exploration of valuable information in ‘big data’ by means of machine learning and process modelling,
    • decision-informing solutions for biodiversity conservation and sustainable ecosystem management in light of global changes
    • …