OILSW: A New System for Ontology Instance Learning
The Semantic Web is expected to extend the current Web by
providing structured content through the addition of annotations. Because of
the large number of pages on the Web, manual annotation is very time
consuming, so automatic or semi-automatic methods for migrating the current
Web to the Semantic Web are highly desirable. In a specific domain, Web pages
are instances of that domain's ontology, so semi-automatic tools are needed to
find these instances and fill in their attributes.
In this article, we propose a new system, named OILSW, for instance
learning of an ontology from the Web pages of websites in a common
domain. This system is the first comprehensive system for automatically
populating an ontology for websites. Using this system, any website
in a given domain can be automatically annotated.
Using the web to resolve coreferent bridging in German newspaper text
We adopt Markert and Nissim's (2005) approach of using the World Wide Web to resolve cases of coreferent bridging, apply it to German, and discuss the strengths and weaknesses of this approach. As the general approach of using surface patterns to obtain information on ontological relations between lexical items has previously been tried only on English, it is also interesting to see whether the approach works as well for German as it does for English, and what differences between these languages need to be accounted for. We also present a novel approach to combining several patterns that yields an ensemble outperforming the best-performing single patterns in terms of both precision and recall.
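The surface-pattern idea the abstract refers to can be illustrated with a small sketch. All patterns, candidate names, and hit counts below are invented for illustration; the authors' patterns and combination scheme differ in detail, and a real system would obtain the counts from a web search engine.

```python
# Hypothetical sketch: to test whether "BMW" is a plausible antecedent of
# the anaphor "company", instantiate Hearst-style surface patterns and
# compare hit counts across candidate antecedents. Hit counts are supplied
# by the caller here; a real system would query the Web.

PATTERNS = [
    "{concept} such as {instance}",
    "{concept}s like {instance}",       # naive pluralisation, for illustration
    "{instance} and other {concept}s",
]

def normalised_score(hits, all_hits):
    """One pattern's hit count for a candidate, normalised over all candidates."""
    total = sum(all_hits)
    return hits / total if total else 0.0

def ensemble_score(hits_per_pattern, candidate):
    """Average the normalised per-pattern scores for one candidate:
    a simple ensemble in the spirit of combining several patterns."""
    scores = [
        normalised_score(hits.get(candidate, 0), hits.values())
        for hits in hits_per_pattern.values()
    ]
    return sum(scores) / len(scores)

# Toy hit counts for the anaphor "company" and two candidate antecedents.
hits = {
    "{concept} such as {instance}": {"BMW": 120, "Bavaria": 3},
    "{concept}s like {instance}": {"BMW": 80, "Bavaria": 10},
    "{instance} and other {concept}s": {"BMW": 45, "Bavaria": 1},
}
best = max(["BMW", "Bavaria"], key=lambda c: ensemble_score(hits, c))
```

The ensemble simply averages normalised pattern scores; the paper's point is that even such combinations can beat the best single pattern on both precision and recall.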
Una revisión de la literatura sobre población de ontologías [A literature review on ontology population]
The main goal of ontologies in computing is the definition of a common vocabulary for describing basic concepts and relationships in a specific domain. The main components of ontologies are classes (concepts), instances, properties, relations, and axioms, among other elements. The ontology population process receives an ontology as input and extracts and relates the instances of each ontology class from heterogeneous information sources. In this paper we perform a systematic state-of-the-art review of ontology population. We select papers from specialized databases and formulate a research question to drive the paper search. The results of our review point to ontology population as a topic of interest for researchers. Even though several techniques exist for driving the process, fully automated tools are still missing, as are high levels of precision and recall.
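The population process the review describes, taking an ontology as input and extracting instances of each class from source documents, can be sketched minimally as follows. The ontology, the patterns, and the sentences are invented for illustration; real systems use NER, wrappers, or learned patterns over heterogeneous sources rather than fixed regexes.

```python
import re

# Toy ontology: class name -> set of instances, initially empty.
ontology = {"City": set(), "Country": set()}

# Hand-written lexico-syntactic extraction patterns, one per class.
patterns = {
    "City": re.compile(r"the city of (\w+)"),
    "Country": re.compile(r"(\w+) is a country"),
}

def populate(ontology, patterns, text):
    """Attach every string matched by a class's pattern to that class."""
    for cls, pattern in patterns.items():
        ontology[cls].update(pattern.findall(text))

populate(ontology, patterns,
         "Portugal is a country. Its second-largest municipality is "
         "the city of Porto.")
```

After the call, `ontology["Country"]` contains `"Portugal"` and `ontology["City"]` contains `"Porto"`; the review's observation is that fully automating this step with high precision and recall remains open.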
A Health eLearning Ontology and Procedural Reasoning Approach for Developing Personalized Courses to Teach Patients about Their Medical Condition and Treatment
We propose a methodological framework to support the development of personalized courses that improve patients’ understanding of their condition and prescribed treatment. Inspired by Intelligent Tutoring Systems (ITSs), the framework uses an eLearning ontology to express domain and learner models and to create a course. We combine the ontology with a procedural reasoning approach and precompiled plans to operationalize a design across disease conditions. The resulting courses generated by the framework are personalized along four patient axes: condition and treatment, comprehension level, learning style based on the VARK (Visual, Aural, Read/write, Kinesthetic) presentation model, and the level of understanding of specific course content according to Bloom’s taxonomy. Customizing educational materials along these learning axes stimulates and sustains patients’ attention when learning about their conditions or treatment options. Our proposed framework creates a personalized course that prepares patients for their meetings with specialists and educates them about their prescribed treatment. We posit that the improvement in patients’ understanding of prescribed care will result in better outcomes, and we validate that the constructs of our framework are appropriate for representing content and deriving personalized courses for two use cases: anticoagulation treatment of an atrial fibrillation patient and lower back pain management to treat a lumbar degenerative disc condition. We conduct a mostly qualitative study, supported by a quantitative questionnaire, to investigate the acceptability of the framework among the target patient population and medical practitioners.
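As an illustration of the personalization axes described above, a course generator might filter learning objects by VARK style and Bloom level roughly like this. The data model and example topics are hypothetical, not the authors' ontology.

```python
from dataclasses import dataclass

@dataclass
class LearningObject:
    topic: str
    vark_style: str   # one of "V", "A", "R", "K"
    bloom_level: int  # 1 (remember) .. 6 (create)

def select_content(objects, learner_style, learner_level):
    """Keep objects matching the learner's preferred VARK style whose
    Bloom level does not exceed the learner's current level."""
    return [o for o in objects
            if o.vark_style == learner_style
            and o.bloom_level <= learner_level]

catalogue = [
    LearningObject("warfarin dosing", "V", 2),
    LearningObject("warfarin dosing", "R", 2),
    LearningObject("INR monitoring", "V", 4),
]
course = select_content(catalogue, learner_style="V", learner_level=3)
```

In the actual framework this selection is driven by the eLearning ontology and precompiled procedural-reasoning plans rather than a flat filter, but the two personalization axes act as the same kind of constraints.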
Taking antonymy mask off in vector space
Automatic detection of antonymy is an important task in Natural Language Processing (NLP) for Information Retrieval (IR), Ontology Learning (OL), and many other semantic applications. However, current unsupervised approaches to antonymy detection are still not fully effective because they cannot discriminate antonyms from synonyms. In this paper, we introduce APAnt, a new Average-Precision-based measure for the unsupervised discrimination of antonymy from synonymy using Distributional Semantic Models (DSMs). APAnt uses Average Precision to estimate the extent and salience of the intersection among the most descriptive contexts of two target words. Evaluation shows that the proposed method distinguishes antonyms from synonyms with high accuracy across different parts of speech, including nouns, adjectives, and verbs. APAnt outperforms the vector cosine and a baseline model implementing the co-occurrence hypothesis.
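A simplified version of the Average-Precision idea behind the measure can be sketched as follows. This is an illustrative reconstruction, not the published formula: the paper's weighting scheme and salience estimate differ, and the toy context weights are invented.

```python
def top_contexts(weights, n):
    """The n most descriptive contexts of a word, by association weight."""
    return sorted(weights, key=weights.get, reverse=True)[:n]

def average_precision(ranked, relevant):
    """Classic Average Precision of a ranked context list against a set."""
    hits, total = 0, 0.0
    for rank, ctx in enumerate(ranked, start=1):
        if ctx in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def context_overlap(weights_a, weights_b, n=3):
    """Symmetric AP over the two words' top contexts: higher when the
    most salient contexts are shared (synonym-like), lower otherwise."""
    top_a = top_contexts(weights_a, n)
    top_b = top_contexts(weights_b, n)
    return 0.5 * (average_precision(top_a, set(top_b))
                  + average_precision(top_b, set(top_a)))

# Toy distributional vectors (context -> association weight).
hot = {"sun": 0.9, "tea": 0.8, "summer": 0.7, "oven": 0.2}
warm = {"sun": 0.8, "summer": 0.7, "blanket": 0.6, "tea": 0.5}
cold = {"ice": 0.9, "winter": 0.8, "beer": 0.7, "sun": 0.1}

syn_score = context_overlap(hot, warm)   # salient contexts overlap
ant_score = context_overlap(hot, cold)   # salient contexts disjoint
```

Under the hypothesis the paper tests, antonym pairs share few of their most salient contexts, so a low overlap score signals antonymy while a high score signals synonymy.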
Challenges to knowledge representation in multilingual contexts
To meet the increasing demands of complex inter-organizational processes and the pressure for
continuous innovation and internationalization, new forms of organisation are
being adopted, fostering more intensive collaboration processes and sharing of resources, in what
can be called collaborative networks (Camarinha-Matos, 2006:03). Information and knowledge are
crucial resources in collaborative networks, and their management is a fundamental process to
optimize.
Knowledge organisation and collaboration systems are thus important instruments for the success of
collaborative networks of organisations, and have been researched over the last decade in
computer science, information science, management sciences, terminology, and linguistics.
Nevertheless, research in this area has paid little attention to multilingual contexts of
collaboration, which pose specific and challenging problems. Access to and
representation of knowledge will increasingly happen in multilingual settings, which implies
overcoming the difficulties inherent to the presence of multiple languages through
processes such as ontology localization.
Although localization, like other processes that involve multilingualism, is a well-developed
practice whose methodologies and tools are fruitfully employed by the language industry in the
development and adaptation of multilingual content, it has not yet been sufficiently explored as a
means of supporting the development of knowledge representations, in particular ontologies,
expressed in more than one language. Multilingual knowledge representation is therefore an open
research area calling for cross-contributions from knowledge engineering, terminology, ontology
engineering, cognitive sciences, computational linguistics, natural language processing, and
management sciences.
This workshop brought together researchers interested in multilingual knowledge representation, in a
multidisciplinary environment, to debate the possibilities of cross-fertilization between knowledge
engineering, terminology, ontology engineering, cognitive sciences, computational linguistics,
natural language processing, and management sciences, applied to contexts where multilingualism
continuously creates new and demanding challenges for current knowledge representation methods
and techniques.
Six papers dealing with different approaches to multilingual knowledge
representation are presented in this workshop, most of them describing tools, approaches, and
results obtained in the development of ongoing projects.
In the first paper, Andrés Domínguez Burgos, Koen Kerremans and Rita Temmerman present a
software module that is part of a workbench for terminological and ontological mining:
Termontospider, a wiki crawler that aims to traverse Wikipedia optimally in search of
domain-specific texts for extracting terminological and ontological information. The crawler is
part of a tool suite for automatically developing multilingual termontological databases, i.e.
ontologically underpinned multilingual terminological databases. In this paper the authors describe
the basic principles behind the crawler and summarize the research setting in which the tool is
currently being tested.
In the second paper, Fumiko Kano presents work comparing four feature-based similarity
measures derived from the cognitive sciences. The purpose of the comparative analysis is to
identify the most effective model for mapping independent ontologies in a culturally influenced
domain. For this, datasets based on standardized, pre-defined feature dimensions and values
obtainable from the UNESCO Institute for Statistics (UIS) were used for the comparative analysis
of the similarity measures. According to the author, the results demonstrate that the Bayesian
Model of Generalization provides the most effective cognitive model for identifying the most
similar corresponding concepts for a targeted socio-cultural community.
In the third paper, Thierry Declerck, Hans-Ulrich Krieger and Dagmar Gromann present ongoing
work and propose an approach to the automatic extraction of information from multilingual
financial Web resources, to provide candidate terms for building ontology elements or instances of
ontology concepts. The authors present an approach complementary to the direct
localization/translation of ontology labels: acquiring terminologies by accessing and harvesting
the multilingual Web presences of structured-information providers in the field of finance. This
leads to the detection of candidate terms in various multilingual sources in the financial
domain that can be used not only as labels of ontology classes and properties but also for the
possible generation of (multilingual) domain ontologies themselves.
In the next paper, Manuel Silva, António Lucas Soares and Rute Costa claim that despite the
availability of tools, resources and techniques aimed at the construction of ontological artifacts,
developing a shared conceptualization of a given reality still raises questions about the principles
and methods that support the initial phases of conceptualization. These questions become, according
to the authors, more complex when the conceptualization occurs in a multilingual setting. To tackle
these issues the authors present a collaborative platform, conceptME, where terminological and
knowledge representation processes support domain experts throughout a conceptualization
framework, allowing the inclusion of multilingual data as a way to promote knowledge sharing,
enhance conceptualization, and support a multilingual ontology specification.
In the fifth paper, Frieda Steurs and Hendrik J. Kockaert present TermWise, a large project
dealing with legal terminology and phraseology for the Belgian public services, i.e. the translation
office of the Ministry of Justice. The project aims to develop an advanced tool that embeds
expert knowledge in the algorithms that extract specialized language from textual data (legal
documents), and whose outcome is a knowledge database of Dutch/French equivalents for
legal concepts, enriched with the phraseology related to the terms under discussion.
Finally, Deborah Grbac, Luca Losito, Andrea Sada and Paolo Sirito report on the preliminary
results of a pilot project currently ongoing at the UCSC Central Library, in which they propose to
adapt, for subject librarians employed in large and multilingual academic institutions, the model
used by translators working within European Union institutions. The authors are using User
Experience (UX) analysis to provide subject librarians with visual support, by means of “ontology
tables” depicting conceptual links and connections of words with concepts, presented according to
their semantic and linguistic meaning.
The organizers hope that the selection of papers presented here will be of interest to a broad audience and will serve as a starting point for further discussion and cooperation.
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions, namely technological issues, user-centred issues and use cases, and
socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the
functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with the
related discussion of requirements for technological challenges. Both studies were carried out in cooperation and
consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several
meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project
coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of
gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily
technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as
well as emerging legal challenges.
- …