Search CORE

5,638 research outputs found

Integration of linked open data in case-based reasoning systems

Authors: Althoff Klaus-Dieter; Bach Kerstin; Sauer Christian
Publication date: 02/10/2010

This paper discusses the opportunities of integrating Linked Open Data (LOD) resources into Case-Based Reasoning (CBR) systems. Using travel medicine as the application domain, we exemplify how LOD can be used to fill three of the four knowledge containers on which a CBR system is based. The paper also presents the techniques applied in the realization and demonstrates the performance gain in knowledge acquisition achieved through the use of LOD.

UWL Repository

TiFi: Taxonomy Induction for Fictional Domains [Extended version]

Authors: Chu C.; Razniewski S.; Weikum G.
Publication date: 01/01/2019

Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases or highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, by identifying candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, by selecting subcategory relationships that correspond to class subsumption, and (iii) top-level construction, by mapping classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi is able to construct taxonomies for a diverse range of fictional domains such as Lord of the Rings, The Simpsons or Greek Mythology with very high precision and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.

MPG.PuRe

A review of the state of the art in Machine Learning on the Semantic Web: Technical Report CSTR-05-003

Author: Price S
Publication venue: Department of Computer Science, University of Bristol
Publication date: 01/01/2004

Explore Bristol Research

Towards better understanding Cybersecurity: Or are "Cyberspace" and "Cyber Space" the same?

Authors: Camina Steven; Choucri Nazli; Madnick Stuart E.; Woon Wei Lee
Publication venue: Massachusetts Institute of Technology. Engineering Systems Division
Publication date: 01/10/2014

Although there are many technology challenges and approaches to attaining cybersecurity, human actions (or inactions) also often pose large risks. There are many reasons, but one problem is whether we all “see the world” the same way. That is, what does “cybersecurity” actually mean, along with the many related concepts such as “cyberthreat” and “cybercrime”? Although dictionaries, glossaries, and other sources tell you what words and phrases are supposed to mean (somewhat complicated by the fact that they often contradict each other), they do not tell you how people are actually using them. If we are to have an effective solution, it is important that all the parties understand each other or, at least, understand that there are different perspectives. For the purpose of this paper and to demonstrate our methodology, we consider the case of the words “cyberspace” and “cyber space.” When we started, we assumed that “cyberspace” and “cyber space” were essentially the same word with just a minor variation in punctuation (i.e., the space, or lack thereof, between “cyber” and “space”) and that the choice of punctuation was a rather random occurrence. With that assumption in mind, we would expect the usage of these words (as determined by the taxonomies constructed by our algorithms) to be basically the same. As it turned out, they were quite different, both in overall shape and in groupings within the taxonomy. Since the overall field of cybersecurity is so new, understanding the field and how people think about it (as evidenced by their actual usage of terminology, and how usage changes over time) is an important goal. Our approach helps to illuminate these understandings.

DSpace@MIT

Taxonomy Induction using Hypernym Subsequences

Authors: Biemann Chris; Cram Damien; Grefenstette Gregory; Gupta Amit; Kozareva Zornitsa; Nastase Vivi; Oakes Michael P; Ponzetto S.; Ponzetto Simone Paolo; Snow Rion
Publication date: 05/05/2017

We propose a novel, semi-supervised approach towards domain taxonomy induction from an input vocabulary of seed terms. Unlike all previous approaches, which typically extract direct hypernym edges for terms, our approach utilizes a novel probabilistic framework to extract hypernym subsequences. Taxonomy induction from extracted subsequences is cast as an instance of the minimum-cost flow problem on a carefully designed directed graph. Through experiments, we demonstrate that our approach outperforms state-of-the-art taxonomy induction approaches across four languages. Importantly, we also show that our approach is robust to the presence of noise in the input vocabulary. To the best of our knowledge, no previous approaches have been empirically proven to manifest noise-robustness in the input vocabulary.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Towards the Automatic Classification of Documents in User-generated Classifications

Author: Morshed Ahsan-Ul
Publication date: 01/01/2006

A huge amount of information is scattered across the World Wide Web. Because information flows through the WWW at high speed, it needs to be organized so that users can access it easily. Previously, information was generally organized manually, by matching document contents to pre-defined categories. There are two approaches to this text-based categorization: manual and automatic. In the manual approach, a human expert performs the classification task; in the automatic approach, supervised classifiers are used to classify resources. Supervised classification still requires manual interaction to create training data before the automatic classification takes place. In our new approach, we propose to classify documents automatically through semantic keywords and formulas generated from these keywords. We can thus reduce human participation by combining the knowledge of a given classification with the knowledge extracted from the data. The main focus of this PhD thesis, supervised by Prof. Fausto Giunchiglia, is the automatic classification of documents into user-generated classifications. The key benefits foreseen from this automatic document classification relate not only to search engines but also to many other fields, such as document organization, text filtering, and semantic index management.

Unitn-eprints Research

Experiments on applying relaxation labeling to map multilingual hierarchies

Authors: Daude Ventura Jordi; Padró Lluís; Rigau Claramunt German
Publication date: 01/01/1999

This paper explores the automatic construction of a multilingual Lexical Knowledge Base from pre-existing lexical resources. It presents a new approach for linking already existing hierarchies. The relaxation labeling algorithm is used to select, among all the candidate connections proposed by a bilingual dictionary, the right connection for each node in the taxonomy.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC
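
The abstract of "Experiments on applying relaxation labeling to map multilingual hierarchies" mentions selecting one connection per taxonomy node from dictionary-proposed candidates. The general idea of relaxation labeling can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the toy Spanish/English data, and the single constraint used (a candidate gains support when its parent in the target hierarchy is a candidate for the source node's parent) are all hypothetical.

```python
def relaxation_labeling(candidates, source_parent, target_parent, iterations=20):
    """Pick one target label per source node by iterative weight updates.

    candidates:    source node -> list of candidate target labels
                   (hypothetically, proposed by a bilingual dictionary)
    source_parent: source node -> its parent in the source hierarchy
    target_parent: target label -> its parent in the target hierarchy
    """
    # Start with uniform weights over each node's candidate labels.
    weights = {
        node: {label: 1.0 / len(labels) for label in labels}
        for node, labels in candidates.items()
    }
    for _ in range(iterations):
        updated = {}
        for node, label_weights in weights.items():
            parent = source_parent.get(node)
            scores = {}
            for label, weight in label_weights.items():
                # Assumed constraint: support equals the current weight the
                # source parent gives to this label's parent in the target.
                support = 0.0
                if parent in weights:
                    support = weights[parent].get(target_parent.get(label), 0.0)
                scores[label] = weight * (1.0 + support)
            # Renormalize so the weights of each node sum to 1.
            total = sum(scores.values())
            updated[node] = {label: s / total for label, s in scores.items()}
        weights = updated
    # Return the highest-weighted label for each node.
    return {node: max(lw, key=lw.get) for node, lw in weights.items()}


# Toy example: "banco" is ambiguous between a financial bank and a bench.
candidates = {
    "institución": ["institution"],
    "banco": ["bank_financial", "bench"],
}
source_parent = {"banco": "institución"}
target_parent = {"bank_financial": "institution", "bench": "furniture"}

mapping = relaxation_labeling(candidates, source_parent, target_parent)
print(mapping)
```

In this toy run, the weight of "bank_financial" grows at each iteration because its target parent agrees with the only candidate for the source parent, so the ambiguous node resolves to the financial sense. A real system would combine several structural constraints and a convergence test rather than a fixed iteration count.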