5,233 research outputs found

Image annotation with Photocopain

Author: Brewster Christopher
Chakravarthy Ajay
Ciravegna Fabio
Dupplaw David P.
Gibbins Nicholas
Harris Stephen
O'Hara Kieron
Shadbolt Nigel R.
Sleeman Derek
Tuffield Mischa
Wilks Yorick
Publication date: 01/01/2006

Photo annotation is a resource-intensive task, yet it is increasingly essential as image archives and personal photo collections grow in size. There is an inherent conflict in the process of describing and archiving personal experiences, because casual users are generally unwilling to expend large amounts of effort on creating the annotations required to organise their collections so that they can make best use of them. This paper describes the Photocopain system, a semi-automatic image annotation system which combines information about the context in which a photograph was captured with information from other readily available sources in order to generate outline annotations for that photograph that the user may further extend or amend.

Southampton (e-Prints Soton)

Aston Publications Explorer

Vagueness and referential ambiguity in a large-scale annotated corpus

Author: Versley Yannick
Publication date: 01/01/2008

In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus serves to corroborate this point, using a quantitative investigation to assess which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail, and we then propose a generalisation of Poesio, Reyle and Stevenson’s Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement and argue that a deeper understanding of the phenomena involved allows us to tackle problematic cases in a more principled fashion than would be possible using only pre-theoretic intuitions.

CiteSeerX

Hochschulschriftenserver - Universität Frankfurt am Main

Ontology Driven Web Extraction from Semi-structured and Unstructured Data for B2B Market Analysis

Author: Darlington John
Imtiaz Hazzaz
Zuo Landong
Publication date: 01/09/2009

The Market Blended Insight project has the objective of improving UK business-to-business marketing performance using semantic web technologies. In this project, we are implementing an ontology-driven web extraction and translation framework to supplement our backend triple store of UK companies, people and geographical information. It deals with both semi-structured data and unstructured text on the web, annotating the extracted data and then translating it according to the backend schema.

Southampton (e-Prints Soton)

CHORUS Deliverable 4.4: Report of the 2nd CHORUS Conference

Author: Karlgren Jussi
Publication venue: Chorus Project Consortium
Publication date: 01/01/2008

The Second CHORUS Conference and third Yahoo! Research Workshop on the Future of Web Search was held during April 4-5, 2008, in Granvalira, Andorra to discuss future directions in multi-medial information access and other specialised topics in the near future of retrieval. Attendance was at capacity, with 97 participants from 11 countries and 3 continents.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

A plant disease extension of the Infectious Disease Ontology

Author: Goldfain Albert
Elser Justin
Smith Barry
Stevenson Dennis W.
Walls Ramona
Publication date: 01/01/2012

Plants from a handful of species provide the primary source of food for all people, yet this source is vulnerable to multiple stressors, such as disease, drought, and nutrient deficiency. With rapid population growth and climate uncertainty, the need to produce crops that can tolerate or resist plant stressors is more crucial than ever. Traditional plant breeding methods may not be sufficient to overcome this challenge, and methods such as high-throughput sequencing and automated scoring of phenotypes can provide significant new insights. Ontologies are essential tools for accessing and analysing the large quantities of data that come with these newer methods. As part of a larger project to develop ontologies that describe plant phenotypes and stresses, we are developing a plant disease extension of the Infectious Disease Ontology (IDOPlant). The IDOPlant is envisioned as a reference ontology designed to cover any plant infectious disease. In addition to novel terms for infectious diseases, IDOPlant includes terms imported from other ontologies that describe plants, pathogens, and vectors, the geographic location and ecology of diseases and hosts, and molecular functions and interactions of hosts and pathogens. To encompass this range of data, we are suggesting in-house ontology development complemented with reuse of terms from orthogonal ontologies developed as part of the Open Biomedical Ontologies (OBO) Foundry. The study of plant diseases provides an example of how an ontological framework can be used to model complex biological phenomena such as plant disease, and how plant infectious diseases differ from, and are similar to, infectious diseases in other organisms.

PhilPapers

Recent developments in linguistic annotations of the TüBa-D/Z treebank

Author: Hinrichs Erhard
Kübler Sandra
Naumann Karin
Telljohann Heike
Trushkina Julia
Publication date: 01/01/2004

The purpose of this paper is to describe recent developments in the morphological, syntactic, and semantic annotation of the TüBa-D/Z treebank of German. The TüBa-D/Z annotation scheme is derived from the Verbmobil treebank of spoken German [4, 10], but has been extended along various dimensions to accommodate the characteristics of written texts. TüBa-D/Z uses as its data source the "die tageszeitung" (taz) newspaper corpus. The Verbmobil treebank annotation scheme distinguishes four levels of syntactic constituency: the lexical level, the phrasal level, the level of topological fields, and the clausal level. The primary ordering principle of a clause is the inventory of topological fields, which characterize the word order regularities among different clause types of German, and which are widely accepted among descriptive linguists of German [3, 6]. The TüBa-D/Z annotation relies on a context-free backbone (i.e. proper trees without crossing branches) of phrase structure combined with edge labels that specify the grammatical function of the phrase in question. The syntactic annotation scheme of the TüBa-D/Z is described in more detail in [12, 11]. TüBa-D/Z currently comprises approximately 15 000 sentences, with approximately 7 000 sentences being in the correction phase. The latter will be released along with an updated version of the existing treebank before the end of this year. The treebank is available in an XML format, in the NEGRA export format [1] and in the Penn treebank bracketing format. The XML format contains all types of information as described above, the NEGRA export format contains all sentence-internal information, while the Penn treebank format includes only those layers of information that can be expressed as pure tree structures. Over the course of the last year, more fine-grained linguistic annotations have been added along the following dimensions: (1) the basic Stuttgart-Tübingen tagset (STTS) labels [9] have been enriched by relevant features of inflectional morphology, (2) named entity information has been encoded as part of the syntactic annotation, and (3) a set of anaphoric and coreference relations has been added to link referentially dependent noun phrases. In the following sections, we will describe each of these innovations in turn and will demonstrate how the additional annotations can be incorporated into one comprehensive annotation scheme.
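
The "context-free backbone with edge labels" mentioned in this abstract can be pictured with a small sketch. The Python below is only an illustration under stated assumptions: the class name `TreeNode`, the function `to_brackets`, the labels `CLAUSE`, `NP`, `SUBJ` and `HEAD`, and the `category-edge` suffix convention are all hypothetical and are not taken from the TüBa-D/Z annotation scheme or its export formats.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    category: str                 # node label (illustrative, not the real tagset)
    edge: Optional[str] = None    # grammatical-function label on the edge to the parent
    word: Optional[str] = None    # set only on leaves
    children: List["TreeNode"] = field(default_factory=list)


def to_brackets(node: TreeNode) -> str:
    """Serialise a proper tree (no crossing branches) as a bracketed string.

    The edge label is appended to the category with a hyphen; this
    convention is an assumption made for the sketch.
    """
    label = node.category if node.edge is None else f"{node.category}-{node.edge}"
    if node.word is not None:
        return f"({label} {node.word})"
    return f"({label} " + " ".join(to_brackets(c) for c in node.children) + ")"


# A toy two-word clause with hypothetical labels.
tree = TreeNode("CLAUSE", children=[
    TreeNode("NP", edge="SUBJ", children=[TreeNode("N", edge="HEAD", word="Maria")]),
    TreeNode("V", edge="HEAD", word="liest"),
])
bracketed = to_brackets(tree)
print(bracketed)
```

Because every node has exactly one parent, the structure serialises directly into nested brackets. Layers that are not tree-shaped, such as links between coreferent noun phrases, would need a separate representation, which is consistent with the abstract's remark that the bracketing format carries only pure tree structures.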

Hochschulschriftenserver - Universität Frankfurt am Main

MemoryBook: Generating Narratives from Lifelogs

Author: Lewis Paul
Packer Heather S.
Smith Ash
Publication date: 01/06/2012

Southampton (e-Prints Soton)

Ontological representation of CDC Active Bacterial Core Surveillance Case Reports

Author: Cowell Lindsay G.
Goldfain Albert
Smith Barry
Publication date: 01/01/2014

The Centers for Disease Control and Prevention’s Active Bacterial Core Surveillance (CDC ABCs) Program is a collaborative effort between the CDC, state health departments, laboratories, and universities to track invasive bacterial pathogens of particular importance to public health [1]. The year-end surveillance reports produced by this program help to shape public policy and coordinate responses to emerging infectious diseases over time. The ABCs case report form (CRF) data represents an excellent opportunity for data reuse beyond the original surveillance purposes.

PhilPapers

ATLAS: A flexible and extensible architecture for linguistic annotation

Author: Bird Steven
Day David
Garofolo John
Henderson John
Laprun Christophe
Liberman Mark
Publication date: 01/01/2000

We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a suite of tools for manipulating these annotations. The abstract logical model provides for a range of storage formats and promotes the reuse of tools that interact through this API. We focus first on "Annotation Graphs," a graph model for annotations on linear signals (such as text and speech) indexed by intervals, for which efficient database storage and querying techniques are applicable. We note how a wide range of existing annotated corpora can be mapped to this annotation graph model. This model is then generalized to encompass a wider variety of linguistic "signals," including both naturally occurring phenomena (as recorded in images, video, multi-modal interactions, etc.), as well as the derived resources that are increasingly important to the engineering of natural language processing systems (such as word lists, dictionaries, aligned bilingual corpora, etc.). We conclude with a review of the current efforts towards implementing key pieces of this architecture. Comment: 8 pages, 9 figures
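
To make the idea of annotations "on linear signals indexed by intervals" concrete, here is a minimal Python sketch of a graph whose nodes carry optional signal offsets and whose arcs carry a type and a label. It is not the ATLAS API: the names `Node`, `Arc`, `AnnotationGraph`, `add` and `overlapping` are hypothetical, and the overlap query is a plain linear scan rather than the efficient database techniques the abstract refers to.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    id: str
    offset: Optional[float] = None  # position in the signal (e.g. seconds); may be unknown


@dataclass(frozen=True)
class Arc:
    start: Node
    end: Node
    type: str    # annotation layer, e.g. "word" or "phrase" (illustrative)
    label: str   # annotation content


@dataclass
class AnnotationGraph:
    arcs: List[Arc] = field(default_factory=list)

    def add(self, start: Node, end: Node, type_: str, label: str) -> Arc:
        arc = Arc(start, end, type_, label)
        self.arcs.append(arc)
        return arc

    def overlapping(self, lo: float, hi: float) -> List[Arc]:
        """Arcs with known offsets whose interval intersects (lo, hi)."""
        return [
            a for a in self.arcs
            if a.start.offset is not None and a.end.offset is not None
            and a.start.offset < hi and a.end.offset > lo
        ]


# Two word-level arcs and one phrase-level arc over the same stretch of signal.
n0, n1, n2 = Node("n0", 0.0), Node("n1", 0.42), Node("n2", 0.95)
graph = AnnotationGraph()
graph.add(n0, n1, "word", "hello")
graph.add(n1, n2, "word", "world")
graph.add(n0, n2, "phrase", "greeting")

labels = [a.label for a in graph.overlapping(0.5, 0.9)]
print(labels)
```

Sharing nodes between arcs is what lets several annotation layers coexist over one signal: the phrase arc spans the same endpoints as the two word arcs without any layer needing to know about the others.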

arXiv.org e-Print Archive

CiteSeerX

Semantic Markup for Geographic Web Maps in HTML

Author: Reißig Malte
Publication venue: ScholarWorks@UMass Amherst
Publication date: 27/01/2018

In recent years, more and more geographical web maps have been developed and published on the Open Web Platform. Technically, this has turned all variants of these maps into documents of the Hypertext Markup Language (HTML), making them appear to us naturally as graph-like and semi-structured data. In engaging with geographical web maps and HTML, we draw on the notion of so-called “map mashups”. Requiring an alternative model and definition of what such a map is, our research allows us to build and refine supportive technology which helps us in analyzing and interpreting the information map makers code into their visualizations. The lens we adopt to shed light on the current authoring practices behind many geographical web maps is informed by the perspective of a “critical map reader”. A task-oriented conception of “map critique” helped us to deduce a meaningful user perspective from which we specifically call on the semantic web community for support on how to represent the various information presented in maps from many authors and sources. With this perspective and these questions in mind, we investigated the Schema.org vocabulary as an ontology to use for turning elements of geographic web maps into textual statements referencing entities in the “outer world”. To illustrate our investigation of the corresponding web standard documents and make it easily applicable for map makers, to open up the discussion, and also to challenge and develop our first conclusions, we implemented them as a minimal extension to the standard API of the LeafletJS open source web mapping library.
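
As a rough illustration of what "turning elements of geographic web maps into textual statements" with Schema.org might look like, the Python sketch below builds a JSON-LD description of a single map marker. It uses the Schema.org types `Place` and `GeoCoordinates` with their `geo`, `latitude` and `longitude` properties. The function name `marker_to_jsonld` and the sample marker are hypothetical, and nothing here reproduces the author's LeafletJS extension.

```python
import json


def marker_to_jsonld(name: str, lat: float, lon: float) -> dict:
    """Describe one map marker as a Schema.org Place in JSON-LD.

    The mapping from a marker to a Place is an assumption made for
    this sketch, not a mapping defined in the abstract.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Place",
        "name": name,
        "geo": {
            "@type": "GeoCoordinates",
            "latitude": lat,
            "longitude": lon,
        },
    }


doc = marker_to_jsonld("Example marker", 52.52, 13.405)
print(json.dumps(doc, indent=2))
```

A statement of this kind could be embedded in the HTML page that hosts the map, so that the entity a marker refers to is readable by software as well as by a person looking at the rendered map.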

ScholarWorks@UMass Amherst