Search CORE

88,947 research outputs found

Towards a Semantic-based Approach for Modeling Regulatory Documents in Building Industry

Author: Bouzidi Khalil Riad
Corby Olivier
Faron-Zucker Catherine
Fies Bruno
Nhan Le-Thanh
Publication venue
Publication date: 25/07/2012
Field of study

Regulations in the Building Industry are becoming increasingly complex and involve more than one technical area. They cover products, components and project implementation. They also play an important role to ensure the quality of a building, and to minimize its environmental impact. In this paper, we are particularly interested in the modeling of the regulatory constraints derived from the Technical Guides issued by CSTB and used to validate Technical Assessments. We first describe our approach for modeling regulatory constraints in the SBVR language, and formalizing them in the SPARQL language. Second, we describe how we model the processes of compliance checking described in the CSTB Technical Guides. Third, we show how we implement these processes to assist industrials in drafting Technical Documents in order to acquire a Technical Assessment; a compliance report is automatically generated to explain the compliance or noncompliance of this Technical Documents

arXiv.org e-Print Archive

Crossref

HAL-UNICE

INRIA a CCSD electronic archive server

HAL-Rennes 1

A cancer cell-line titration series for evaluating somatic classification.

Author: Beck Timothy
Brown Andrew MK
Denroche Robert E
McPherson John D
Mullen Laura
Stein Lincoln
Timms Lee
Yung Christina K
Publication venue: eScholarship, University of California
Publication date: 01/12/2015
Field of study

BackgroundAccurate detection of somatic single nucleotide variants and small insertions and deletions from DNA sequencing experiments of tumour-normal pairs is a challenging task. Tumour samples are often contaminated with normal cells confounding the available evidence for the somatic variants. Furthermore, tumours are heterogeneous so sub-clonal variants are observed at reduced allele frequencies. We present here a cell-line titration series dataset that can be used to evaluate somatic variant calling pipelines with the goal of reliably calling true somatic mutations at low allele frequencies.ResultsCell-line DNA was mixed with matched normal DNA at 8 different ratios to generate samples with known tumour cellularities, and exome sequenced on Illumina HiSeq to depths of >300×. The data was processed with several different variant calling pipelines and verification experiments were performed to assay >1500 somatic variant candidates using Ion Torrent PGM as an orthogonal technology. By examining the variants called at varying cellularities and depths of coverage, we show that the best performing pipelines are able to maintain a high level of precision at any cellularity. In addition, we estimate the number of true somatic variants undetected as cellularity and coverage decrease.ConclusionsOur cell-line titration series dataset, along with the associated verification results, was effective for this evaluation and will serve as a valuable dataset for future somatic calling algorithm development. The data is available for further analysis at the European Genome-phenome Archive under accession number EGAS00001001016. Data access requires registration through the International Cancer Genome Consortium's Data Access Compliance Office (ICGC DACO)

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Recommended from our members

Linking early geospatial documents, one place at a time: annotation of geographic documents with Recogito

Author: Barker Elton
de Soto Cañamares Pau
Isaksen Leif
Simon Rainer
Publication venue
Publication date: 01/01/2015
Field of study

Recogito is an open source tool for the semi-automatic annotation of place references in maps and texts. It was developed as part of the Pelagios 3 research project, which aims to build up a comprehensive directory of places referred to in early maps and geographic writing predating the year 1492. Pelagios 3 focuses specifically on sources from the Classical Latin, Greek and Byzantine periods; on Mappae Mundi and narrative texts from the European Medieval period; on Late Medieval Portolans; and on maps and texts from the early Islamic and early Chinese traditions. Since the start of the project in September 2013, the team has harvested more than 120,000 toponyms, manually verifying almost 60,000 of them. Furthermore, the team held two public annotation workshops supported through the Open Humanities Awards 2014. In these workshops, a mixed audience of students and academics of different backgrounds used Recogito to add several thousand contributions on each workshop day. A number of benefits arise out of this work: on the one hand, the digital identification of places – and the names used for them – makes the documents' contents amenable to information retrieval technology, i.e. documents become more easily search- and discoverable to users than through conventional metadata-based search alone. On the other hand, the documents are opened up to new forms of re-use. For example, it becomes possible to “map” and compare the narrative of texts, and the contents of maps with modern day tools like Web maps and GIS; or to analyze and contrast documents’ geographic properties, toponymy and spatial relationships. Seen in a wider context, we argue that initiatives such as ours contribute to the growing ecosystem of the “Graph of Humanities Data” that is gathering pace in the Digital Humanities (linking data about people, places, events, canonical references, etc.), which has the potential to open up new avenues for computational and quantitative research in a variety of fields including History, Geography, Archaeology, Classics, Genealogy and Modern Languages

Open Research Online (The Open University)

Humanities Commons

Open Research Exeter

Lancaster E-Prints

Recommended from our members

UK Research Information Shared Service (UKRISS) Final Report, July 2014

Author: Gartner R
Joerg B
Jones R
McDonald D
Ritchie M
Scoble R
Sudlow A
Tolev E
Trowell S
Waddington S
Publication venue: JISC
Publication date: 01/01/2014
Field of study

The reporting of research information is a complex and expensive activity for research organisations (ROs). There is little alignment between funders of the reporting requests made to institutions and requests made to individual researchers about their research outputs and outcomes. This inevitably results in duplication and increased costs across the sector, whilst limiting the potential sharing and reuse of the information. The UK Research Information Shared Service (UKRISS) project conducted a feasibility and scoping study for the reporting of research information at a national level based on CERIF (Common European Research Information Format), with the objective of increasing efficiency, productivity and quality across the sector. The aim was to define and prototype solutions which are compelling, easy to use, have a low entry barrier, and support innovative information sharing and benchmarking. CERIF has emerged as the preferred format for expressing research information across Europe. To date, CERIF has been piloted for specific applications, but not as a format for reporting requirements across all UK ROs. The final report presents the work carried out by the UKRISS project, including requirements gathering, modelling and prototyping, as well as recommendation for sustainability. UKRISS was divided into two phases. Phase 1, mapping the reporting landscape, ran from March 2012 to December 2012. Phase 2, exploring delivery of potential solutions, began in February 2013 and ended in December 2013

Brunel University Research Archive

FIRE ecosystem progress report (1st interim report)

Author: Boniface M.J.
Grace Paul
Kirkpatrick Scott
Quetin Geraldine
Schaffers Hans
Publication venue: 'University of Southampton'
Publication date: 31/08/2014
Field of study

Southampton (e-Prints Soton)

The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations

Author: Abzianidze Lasha
Bjerva Johannes
Bos Johan
Evang Kilian
Haagsma Hessel
Ludmann Pierre
Nguyen Duc-Duy
van Noord Rik
Publication venue
Publication date: 01/01/2017
Field of study

The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations comprising over 11 million words divided over four languages (English, German, Italian, and Dutch). Our approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, assuming that the translations are meaning-preserving. The semantic annotation consists of five main steps: (i) segmentation of the text in sentences and lexical items; (ii) syntactic parsing with Combinatory Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and (v) compositional semantic analysis based on Discourse Representation Theory. These steps are performed using statistical models trained in a semi-supervised manner. The employed annotation models are all language-neutral. Our first results are promising.Comment: To appear at EACL 201

arXiv.org e-Print Archive

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Infectious Disease Ontology

Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain

PhilPapers

CiteSeerX

Crossref

Automated speech and audio analysis for semantic access to multimedia

Author: Huijbregts Marijn
Jong Franciska de
Ordelman Roeland
Publication venue: Springer Verlag
Publication date: 01/01/2006
Field of study

The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques will be presented, including the alignment of speech and text resources, large vocabulary speech recognition, key word spotting and speaker classification. The applicability of techniques will be discussed from a media crossing perspective. The added value of the techniques and their potential contribution to the content value chain will be illustrated by the description of two (complementary) demonstrators for browsing broadcast news archives

University of Twente Research Information