Semantic Grounding Strategies for Tagbased Recommender Systems
Recommender systems usually operate on similarities between recommended items
or users. Tag-based recommender systems rely on similarities between tags. The
tags, however, are mostly free-form phrases entered by users. Therefore, similarities computed
without semantic grounding might lead to less relevant recommendations.
In this paper, we study semantic grounding used for calculating tag similarity.
We present a comprehensive analysis of semantic grounding given by 20 ontologies
from different domains. Among other things, the study reveals that currently
available OWL ontologies are very narrow and the percentage of similarity
expansions is rather small. WordNet scores slightly better because it is broader, but
not by much, as it does not support several semantic relationships. Furthermore,
the study reveals that even with such a small number of expansions, the recommendations
change considerably. Comment: 13 pages, 5 figures
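As a rough illustration of why grounding can change tag similarity, the sketch below compares the Jaccard similarity of two tag sets before and after mapping each tag to a shared concept. The `GROUNDING` table, the tag values and the choice of Jaccard similarity are all assumptions made for this example; the paper's own similarity measure and ontology lookups are not described in the abstract.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical grounding table standing in for an ontology or WordNet lookup:
# each free-text tag maps to a canonical concept label.
GROUNDING = {
    "car": "automobile",
    "auto": "automobile",
    "film": "movie",
    "movie": "movie",
}


def ground(tags, grounding):
    """Replace each tag with its concept; unknown tags are kept (lowercased)."""
    return {grounding.get(t.lower(), t.lower()) for t in tags}


# Two items tagged with different words for the same concepts (made-up data).
item_a = ["car", "film"]
item_b = ["auto", "movie"]

raw_similarity = jaccard(item_a, item_b)
grounded_similarity = jaccard(ground(item_a, GROUNDING), ground(item_b, GROUNDING))
```

In this toy case the raw tag sets share no strings, while the grounded sets coincide, which is the kind of shift in similarity the abstract attributes to semantic expansion.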
Open Science principles for accelerating trait-based science across the Tree of Life.
Synthesizing trait observations and knowledge across the Tree of Life remains a grand challenge for biodiversity science. Species traits are widely used in ecological and evolutionary science, and new data and methods have proliferated rapidly. Yet accessing and integrating disparate data sources remains a considerable challenge, slowing progress toward a global synthesis to integrate trait data across organisms. Trait science needs a vision for achieving global integration across all organisms. Here, we outline how the adoption of key Open Science principles (open data, open source and open methods) is transforming trait science, increasing transparency, democratizing access and accelerating global synthesis. To enhance widespread adoption of these principles, we introduce the Open Traits Network (OTN), a global, decentralized community welcoming all researchers and institutions pursuing the collaborative goal of standardizing and integrating trait data across organisms. We demonstrate how adherence to Open Science principles is key to the OTN community and outline five activities that can accelerate the synthesis of trait data across the Tree of Life, thereby facilitating rapid advances to address scientific inquiries and environmental issues. Lessons learned along the path to a global synthesis of trait data will provide a framework for addressing similarly complex data science and informatics challenges.
Automatic multi-label subject indexing in a multilingual environment
This paper presents an approach to automatic subject indexing of full-text documents with multiple labels, based on binary support vector machines (SVMs). The aim was to test the applicability of SVMs with a real-world dataset. We have also explored the feasibility of incorporating multilingual background knowledge, as represented in thesauri or ontologies, into our text document representation for indexing purposes. The test set for our evaluations has been compiled from an extensive document base maintained by the Food and Agriculture Organization (FAO) of the United Nations (UN). Empirical results show that SVMs are a good method for automatic multi-label classification of documents in multiple languages.
Ontologies and Information Extraction
This report argues that, even in the simplest cases, information extraction (IE) is an ontology-driven
process. It is not a mere text filtering method based on simple pattern
matching and keywords, because the extracted pieces of texts are interpreted
with respect to a predefined partial domain model. This report shows that
depending on the nature and the depth of the interpretation to be done for
extracting the information, more or less knowledge must be involved. The
report draws its examples mainly from biology, a domain in which there are critical
needs for content-based exploration of the scientific literature and which
is becoming a major application domain for IE.
Statistical mechanics of ontology based annotations
We present a statistical mechanical theory of the process of annotating an
object with terms selected from an ontology. The term selection process is
formulated as an ideal lattice gas model, but in a highly structured
inhomogeneous field. The model enables us to explain patterns recently observed
in real-world annotation data sets, in terms of the underlying graph structure
of the ontology. By relating the external field strengths to the information
content of each node in the ontology graph, the statistical mechanical model
also allows us to propose a number of practical metrics for assessing the
quality of both the ontology, and the annotations that arise from its use.
Using the statistical mechanical formalism we also study an ensemble of
ontologies of differing size and complexity; an analysis not readily performed
using real data alone. Focusing on regular tree ontology graphs we uncover a
rich set of scaling laws describing the growth in the optimal ontology size as
the number of objects being annotated increases. In doing so we provide a
further possible measure for assessment of ontologies. Comment: 27 pages, 5 figures
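The abstract refers to the information content of each node in the ontology graph without defining it. A common convention, assumed here and not confirmed by the abstract, is IC(t) = -log2 p(t), where p(t) is the fraction of annotations made with term t or any of its descendants. The sketch below computes this on a small hypothetical tree; the term names and counts are invented for illustration.

```python
import math

# Hypothetical tree ontology: each term maps to its parent (None for the root).
PARENT = {
    "root": None,
    "process": "root",
    "function": "root",
    "binding": "function",
}

# Hypothetical number of objects annotated directly with each term.
DIRECT = {"root": 0, "process": 3, "function": 1, "binding": 4}


def cumulative_counts(parent, direct):
    """Propagate each term's direct count to all of its ancestors."""
    total = dict(direct)
    for term, count in direct.items():
        ancestor = parent[term]
        while ancestor is not None:
            total[ancestor] += count
            ancestor = parent[ancestor]
    return total


def information_content(parent, direct):
    """IC(t) = log2(N / n_t), i.e. -log2 of the term's annotation share."""
    total = cumulative_counts(parent, direct)
    n_annotations = sum(direct.values())
    return {
        term: math.log2(n_annotations / count)
        for term, count in total.items()
        if count > 0  # terms that are never used have undefined IC here
    }


ic = information_content(PARENT, DIRECT)
```

Under this convention the root carries no information, and more specific or rarely used terms score higher, which is what makes the quantity usable as a per-node weight.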
Ontological Foundations for Scholarly Debate Mapping Technology
Mapping scholarly debates is an important genre of what can be called Knowledge Domain Analytics (KDA) technology, i.e. technology which combines both quantitative and qualitative methods of analysing specialist knowledge domains. However, current KDA technology research has emerged from diverse traditions and thus lacks a common conceptual foundation. This paper reports on the design of a KDA ontology that aims to provide this foundation. The paper then describes the argumentation extensions to the ontology for supporting scholarly debate mapping as a special form of KDA and demonstrates its expressive capabilities using a case study debate.