435 research outputs found

Linked Data for the Natural Sciences. Two Use Cases in Chemistry and Biology

Author: Cimiano Philipp
Wiljes Cord
Publication date: 01/01/2012

Wiljes C, Cimiano P. Linked Data for the Natural Sciences. Two Use Cases in Chemistry and Biology. In: Proceedings of the Workshop on the Semantic Publishing (SePublica 2012). 2012: 48-59.

The Web was designed to improve the way people work together. The Semantic Web extends the Web with a layer of Linked Data that offers new paths for scientific publishing and co-operation. Experimental raw data, released as Linked Data, could be discovered automatically, fostering its reuse and validation by scientists in different contexts and across the boundaries of disciplines. However, the technological barrier for scientists who want to publish and share their research data as Linked Data remains rather high. We present two real-life use cases in the fields of chemistry and biology and outline a general methodology for transforming research data into Linked Data. A key element of our methodology is the role of a scientific data curator, who is proficient in Linked Data technologies and works in close co-operation with the scientist.

Publications at Bielefeld University

Ontology-based Information Extraction with SOBA

Author: Buitelaar Paul
Cimiano Philipp
Racioppa Stefania
Siegel Melanie
Publication date: 20/12/2011

In this paper we describe SOBA, a sub-component of the SmartWeb multi-modal dialog system. SOBA is a component for ontology-based information extraction from soccer web pages for automatic population of a knowledge base that can be used for domain-specific question answering. SOBA realizes a tight connection between the ontology, the knowledge base and the information extraction component. The originality of SOBA lies in the fact that it extracts information from heterogeneous sources such as tabular structures, text and image captions in a semantically integrated way. In particular, it stores extracted information in a knowledge base, and in turn uses the knowledge base to interpret newly extracted information and link it to already existing entities.

Hochschulschriftenserver - Universität Frankfurt am Main

Learning a semantic parser from spoken utterances

Author: Cimiano Philipp
Gaspers Judith
Publication date: 01/01/2014

Gaspers J, Cimiano P. Learning a semantic parser from spoken utterances. In: IEEE International Conference on Acoustics, Speech and Signal Processing. 2014

Publications at Bielefeld University

The USAGE review corpus for fine grained multi lingual opinion analysis

Author: Cimiano Philipp
Klinger Roman
Publication venue: Reykjavik, Iceland
Publication date: 01/01/2014

Opinion mining has received wide attention in recent years. Models for this task are typically trained or evaluated with a manually annotated dataset. However, fine-grained annotation of sentiments, including information about aspects and their evaluation, is very labour-intensive, and the data available so far is limited. To help address this shortage, this paper describes the Bielefeld University Sentiment Analysis Corpus for German and English (USAGE), which we offer freely to the community and which contains annotations of product reviews from Amazon with both aspects and subjective phrases. It provides information on segments in the text that denote an aspect or a subjective evaluative phrase referring to the aspect. Relations and coreferences are explicitly annotated. The dataset contains 622 English and 611 German reviews, making it possible to investigate how to port sentiment analysis systems across languages and domains. We describe how the corpus was created and provide statistics including inter-annotator agreement. We further provide figures for a baseline system and results for German and English as well as in a cross-domain setting. The results are encouraging in that they show that aspects and phrases can be extracted robustly without the need for tuning to a particular type of product.

Forschungsinformationssystem der Universität Bamberg

A Systematic Investigation of Blocking Strategies for Real-time Classification of Social Media Content into Events

Author: Cimiano Philipp
Reuter Timo
Publication venue: AAAI Press
Publication date: 01/01/2012

Reuter T, Cimiano P. A Systematic Investigation of Blocking Strategies for Real-time Classification of Social Media Content into Events. In: Proceedings of the 6th International Conference on Weblogs and Social Media (ICWSM) - Workshop on Real-Time Analysis and Mining of Social Streams (RAMSS). Palo Alto, California: AAAI Press; 2012.

Events play a prominent role in our lives, such that many social media documents describe or are related to some event. Organizing social media documents with respect to events thus seems a promising approach to better manage and organize the ever-increasing amount of user-generated content in social media applications. It would support the navigation of data by events or allow one to be notified about new postings related to the events one is interested in, to name just two applications. A challenge is to automate this process so that incoming documents can be assigned to their corresponding event without any user intervention. We present a system that is able to classify a stream of social media data into a growing and evolving set of events. In order to scale up to the data sizes and data rates in social media applications, the use of a candidate retrieval or blocking step is crucial to reduce the number of events considered as potential candidates to which an incoming data point could belong. In this paper we present and experimentally compare different blocking strategies in terms of their cost versus effectiveness tradeoff. We show that using a blocking strategy that selects the 60 closest events with respect to upload time, we reach F-measures of about 85.1% while being able to process the incoming documents within 32 ms on average. We thus provide a principled approach for scaling up the classification of social media documents into events and for processing the incoming stream of documents in real time.

Publications at Bielefeld University

Association for the Advancement of Artificial Intelligence: AAAI Publications

Explicit versus Latent Concept Models for Cross-Language Information Retrieval

Author: Boutilier Craig
Cimiano Philipp
Schultz Antje
Sizov Sergej
Sorg Philipp
Staab Steffen
Publication venue: AAAI Press
Publication date: 01/01/2009

Cimiano P, Schultz A, Sizov S, Sorg P, Staab S. Explicit versus Latent Concept Models for Cross-Language Information Retrieval. In: Boutilier C, ed. IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press; 2009: 1513-1518

Publications at Bielefeld University

Orthonormal Explicit Topic Analysis for Cross-lingual Document Matching

Author: Cimiano Philipp
Klinger Roman
McCrae John
Publication date: 01/01/2013

McCrae J, Cimiano P, Klinger R. Orthonormal Explicit Topic Analysis for Cross-lingual Document Matching. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 2013: 1732-1740

Publications at Bielefeld University