4,191 research outputs found

Supporting collaboration within the eScience community

Author: Boldyreff Cornelia
Nutter David
Rank Stephen
Publication date: 14/01/2004

Collaboration is a core activity at the heart of large-scale co-operative scientific experimentation. In order to support the emergence of Grid-based scientific collaboration, new models of e-Science working methods are needed. Scientific collaboration involves production and manipulation of various artefacts. Based on work done in the software engineering field, this paper proposes models and tools which will support the representation and production of such artefacts. It is necessary to provide facilities to classify, organise, acquire, process, share, and reuse artefacts generated during collaborative working. The concept of a "design space" will be used to organise scientific design and the composition of experiments, and methods such as self-organising maps will be used to support the reuse of existing artefacts. It is proposed that this work can be carried out and evaluated in the UK e-Science community, using an "industry as laboratory" approach to the research, building on the knowledge, expertise, and experience of those directly involved in e-Science.

University of Lincoln Institutional Repository

Using semantic indexing to improve searching performance in web archives

Author: Khan Arshad
Martin David J.
Tiropanis Thanassis
Publication date: 28/01/2013

The sheer volume of electronic documents being published on the Web can be overwhelming for users if the searching aspect is not properly addressed. This problem is particularly acute inside archives and repositories containing large collections of web resources or, more precisely, web pages and other web objects. With the existing search capabilities in web archives, results can be compromised by the size of the data, content heterogeneity, and changes in scientific terminologies and meanings. During the course of this research, we will explore whether semantic web technologies, particularly ontology-based annotation and retrieval, could improve precision in search results in multi-disciplinary web archives.

Southampton (e-Prints Soton)

National Centre for Research Methods: NCRM EPrints Repository

Community next steps for making globally unique identifiers work for biocollections data

Author: Agosti Donat
Catapano Terry
Cellinese Nico
Deck John
Guralnick Robert P.
Hagedorn Gregor
Kunze John
Page Roderic D.
Penev Lyubomir
Pyle Richard L.
Walls Ramona
Wieczorek John
Publication venue: Pensoft Publishers
Publication date: 01/01/2015

Biodiversity data are being digitized and made available online at a rapidly increasing rate, but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. For data associated with biocollections, the biodiversity community has long recognized that an essential part of establishing and preserving linkages is to apply globally unique identifiers at the point when data are generated in the field and to persist these identifiers downstream, but this is seldom implemented in practice. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. In order to further progress towards a broader community consensus, a group of biocollections and informatics experts assembled in Stockholm in October 2014 to discuss community next steps to overcome current roadblocks. The workshop participants divided into four groups focusing on: identifier practice in current field biocollections; identifier application for legacy biocollections; identifiers as applied to biodiversity data records as they are published and made available in semantically marked-up publications; and cross-cutting identifier solutions that bridge across these domains. The main outcome was consensus on key issues, including recognition of differences between legacy and new biocollections processes, the need for identifier metadata profiles that can report information on identifier persistence missions, and the unambiguous indication of the type of object associated with the identifier. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.
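
The practice described in this abstract (assign an identifier when the data are generated, persist it downstream, and state the type of object it refers to) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not endorse any identifier scheme, so the use of `urn:uuid:` identifiers, the `Record` class, the `mint` and `lineage` helpers, and the object-type labels are all hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical record: an identifier plus the type of object it names."""
    identifier: str
    object_type: str
    derived_from: Optional[str] = None  # identifier of the parent record, if any


def mint(object_type: str, parent: Optional[Record] = None) -> Record:
    """Mint a new identifier at creation time and keep a link to the parent."""
    return Record(
        identifier=f"urn:uuid:{uuid.uuid4()}",  # assumed scheme, not from the source
        object_type=object_type,
        derived_from=parent.identifier if parent else None,
    )


def lineage(record: Record, index: Dict[str, Record]) -> List[str]:
    """Follow derived_from links back to the original field object."""
    chain = [record.object_type]
    while record.derived_from is not None:
        record = index[record.derived_from]
        chain.append(record.object_type)
    return chain


# Illustrative data only: a specimen, a sample taken from it, and a derived record.
specimen = mint("field specimen")
tissue = mint("tissue sample", parent=specimen)
sequence = mint("sequence record", parent=tissue)
index = {r.identifier: r for r in (specimen, tissue, sequence)}

print(lineage(sequence, index))
```

Because each derived record carries its parent's identifier, the provenance chain can be rebuilt from the records alone, which is the linkage the abstract says is usually lost.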

Biodiversity Heritage Library OAI Repository

ZENODO

Directory of Open Access Journals

PubMed Central

Enlighten

ARPHA OAI-PMH Endpoint

ARPHA Preprints

Extracting and re-using research data from chemistry e-theses: the SPECTRa-T project

Author: Downing Jim
Harvey Matt
Morgan Peter
Murray-Rust Peter
Rzepa Henry S
Stewart Diana
Tonge Alan
Townsend Joseph A
Publication venue: 11th International Symposium on Electronic Theses and Dissertations
Publication date: 01/06/2008

Scientific e-theses are data-rich resources, but much of the information they contain is not readily accessible. For chemistry, the SPECTRa-T project has addressed this problem by developing data-mining techniques to extract experimental data, creating RDF (Resource Description Framework) triples for exposure to sophisticated Semantic Web searches. We used OSCAR3, an Open Source chemistry text-mining tool, to parse and extract data from theses in PDF, and from theses in Office Open XML document format. Theses in PDF suffered data corruption and a loss of formatting that prevented the identification of chemical objects. Theses in .docx yielded semantically rich SciXML that enabled the additional extraction of associated data. Chemical objects were placed in a data repository, and RDF triples deposited in a triplestore. Data-mining from chemistry e-theses is both desirable and feasible, but the use of PDF, the de facto format standard for deposit in most repositories, prevents the optimal extraction of data for semantic querying. To facilitate such extraction, we recommend that universities also require deposition of chemistry e-theses in an XML document format. Further work is required to clarify the complex IPR issues and ensure that they do not become an unwarranted barrier to data extraction and re-use.
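
The idea of depositing extracted data as RDF triples and then querying them can be sketched in a few lines. This is not the SPECTRa-T pipeline or the OSCAR3 API: it is a minimal illustration that assumes a plain in-memory set of (subject, predicate, object) tuples in place of a real triplestore, and the `ex:` names, the `query` helper, and the values are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# A triple is (subject, predicate, object); all names below are made up.
Triple = Tuple[str, str, str]


def query(
    store: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Illustrative data only, standing in for facts extracted from a thesis.
store: Set[Triple] = {
    ("ex:compound1", "ex:hasMeltingPoint", "121 C"),
    ("ex:compound1", "ex:reportedIn", "ex:thesisA"),
    ("ex:compound2", "ex:reportedIn", "ex:thesisA"),
}

# Which compounds are reported in thesisA?
for subject, _, _ in query(store, p="ex:reportedIn", o="ex:thesisA"):
    print(subject)
```

A real deployment would use an RDF library and a SPARQL endpoint; the tuple set is used here only to show the pattern-matching shape of a triple query.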

Apollo (Cambridge)

A semantic Grid for molecular science

Author: Glen Robert C
Murray-Rust Peter
Rzepa Henry S
Stewart James JP
Townsend Joseph A
Willighagen Egon L
Yong Zhang
Publication venue
Publication date: 26/06/2008

Proceedings of the 2003 UK e-Science All Hands Meeting, 31st August - 3rd September, Nottingham, UK. The properties of molecules have very well defined semantics and allow the creation of a semantic GRID. Markup languages (CML - Chemical Markup Language) and dictionary-based ontologies have been designed to support a wide range of applications, including chemical supply, publication and the safety of compounds. Many properties can be computed by Quantum Mechanical (QM) programs and we have developed a "black-box" system based on XML wrappers for all components. This is installed on a Condor system on which we have computed properties for 250,000 compounds. The results of this will be available in an OpenData/OpenSource peer-to-peer (P2P) system (WorldWide Molecular Matrix - WWMM).

Apollo (Cambridge)

Supporting collaborative grid application development within the escience community

Author: Boldyreff Cornelia
Nutter David
Publication venue
Publication date: 08/11/2003

The systemic representation and organisation of software artefacts, e.g. specifications, designs, interfaces, and implementations, resulting from the development of large distributed systems from software components have been addressed by our research within the Practitioner and AMES projects [1,2,3,4]. Without appropriate representations and organisations, large collections of existing software are not amenable to the activities of software reuse and software maintenance, as these activities are likely to be severely hindered by the difficulties of understanding the software applications and their associated components. In both of these projects, static analysis of source code and other development artefacts, where available, and subsequent application of reverse engineering techniques were successfully used to develop a more comprehensive understanding of the software applications under study [5,6]. Later research addressed the maintenance of a component library in the context of component-based software product line development and maintenance [7]. The classic software decompositions, horizontal and vertical, proposed by Goguen [8] influenced all of this research. While they are adequate for static composition, they fail to address the dynamic aspects of composing large distributed software applications from components, especially where these include software services. The separation of component co-ordination concerns from component functionality proposed in [9] offers a partial solution.

University of Lincoln Institutional Repository

Interoperability and FAIRness through a novel combination of Web technologies

Author: Bolleman Jerven T.
Bonino da Silva Santos Luiz Olavo
Ciccarese Paolo
Clark Tim
Dumontier Michel
Gavai Anand
Gray Alasdair J. G.
Kaliyaperumal Rajaram
Kelpin Fleur D. L.
Kuzniar Arnold
Schultes Erik A.
Swertz Morris A.
Thompson Mark
van Mulligen Erik M.
Verborgh Ruben
Wilkinson Mark D.
Publication venue: PeerJ
Publication date: 01/01/2017

Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity and security considerations. For example, clinical observations about genetic mutations in patients are highly sensitive, while observations of species diversity are generally not. The lack of uniformity in data models from one repository to another, and in the richness and availability of metadata descriptions, makes integration and analysis of these data a manual, time-consuming task with no scalability. Here we explore a set of resource-oriented Web design patterns for data discovery, accessibility, transformation, and integration that can be implemented by any general- or special-purpose repository as a means to assist users in finding and reusing their data holdings. We show that by using off-the-shelf technologies, interoperability can be achieved at the level of an individual spreadsheet cell. We note that the behaviours of this architecture compare favourably to the desiderata defined by the FAIR Data Principles, and can therefore represent an exemplar implementation of those principles. The proposed interoperability design patterns may be used to improve discovery and integration of both new and legacy data, maximizing the utility of all scholarly outputs.

Maastricht University Research Portal

Heriot Watt Pure

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Ghent University Academic Bibliography

Directory of Open Access Journals

Dissertations of the University of Groningen

Estimating Fire Weather Indices via Semantic Reasoning over Wireless Sensor Network Data Streams

Author: Bruenig Michael
Gao Lianli
Hunter Jane
Publication date: 01/10/2014

Wildfires are frequent, devastating events in Australia that regularly cause significant loss of life and widespread property damage. Fire weather indices are a widely adopted method for measuring fire danger and they play a significant role in issuing bushfire warnings and in anticipating demand for bushfire management resources. Existing systems that calculate fire weather indices are limited due to low spatial and temporal resolution. Localized wireless sensor networks, on the other hand, gather continuous sensor data measuring variables such as air temperature, relative humidity, rainfall and wind speed at high resolutions. However, using wireless sensor networks to estimate fire weather indices is a challenge due to data quality issues, lack of standard data formats and lack of agreement on thresholds and methods for calculating fire weather indices. Within the scope of this paper, we propose a standardized approach to calculating Fire Weather Indices (a.k.a. fire danger ratings) and overcome a number of the challenges by applying Semantic Web Technologies to the processing of data streams from a wireless sensor network deployed in the Springbrook region of South East Queensland. This paper describes the underlying ontologies, the semantic reasoning and the Semantic Fire Weather Index (SFWI) system that we have developed to enable domain experts to specify and adapt rules for calculating Fire Weather Indices. We also describe the Web-based mapping interface that we have developed, which enables users to improve their understanding of how fire weather indices vary over time within a particular region. Finally, we discuss our evaluation results, which indicate that the proposed system outperforms state-of-the-art techniques in terms of accuracy, precision and query performance. Comment: 20 pages, 12 figures

arXiv.org e-Print Archive

University of Queensland eSpace