Challenges to knowledge representation in multilingual contexts
To meet the increasing demands of the complex inter-organizational processes and the demand for
continuous innovation and internationalization, it is evident that new forms of organisation are
being adopted, fostering more intensive collaboration processes and sharing of resources, in what
can be called collaborative networks (Camarinha-Matos, 2006:03). Information and knowledge are
crucial resources in collaborative networks, and managing them is a fundamental process to
optimize.
Knowledge organisation and collaboration systems are thus important instruments for the success of
collaborative networks of organisations, and have been researched in the last decade in the areas of
computer science, information science, management sciences, terminology and linguistics.
Nevertheless, research in this area has paid little attention to multilingual contexts of
collaboration, which pose specific and challenging problems. It is clear that access to and
representation of knowledge will increasingly happen in multilingual settings, which implies
overcoming the difficulties inherent to the presence of multiple languages through
processes such as the localization of ontologies.
Although localization, like other processes that involve multilingualism, is a rather well-developed
practice whose methodologies and tools are fruitfully employed by the language industry in the
development and adaptation of multilingual content, it has not yet been sufficiently explored as an
element of support to the development of knowledge representations, in particular ontologies,
expressed in more than one language. Multilingual knowledge representation is thus an open
research area calling for cross-contributions from knowledge engineering, terminology, ontology
engineering, cognitive sciences, computational linguistics, natural language processing, and
management sciences.
This workshop brought together researchers interested in multilingual knowledge representation, in a
multidisciplinary environment to debate the possibilities of cross-fertilization between knowledge
engineering, terminology, ontology engineering, cognitive sciences, computational linguistics,
natural language processing, and management sciences applied to contexts where multilingualism
continuously creates new and demanding challenges to current knowledge representation methods
and techniques.
In this workshop six papers dealing with different approaches to multilingual knowledge
representation are presented, most of them describing tools, approaches and results obtained in the
development of ongoing projects.
In the first paper, Andrés Domínguez Burgos, Koen Kerremans and Rita Temmerman present a
software module that is part of a workbench for terminological and ontological mining:
Termontospider, a wiki crawler that aims to traverse Wikipedia optimally in search of
domain-specific texts from which to extract terminological and ontological information. The crawler
is part of a tool suite for automatically developing multilingual termontological databases, i.e.
ontologically-underpinned multilingual terminological databases. In this paper the authors describe
the basic principles behind the crawler and summarize the research setting in which the tool is
currently being tested.
In the second paper, Fumiko Kano presents work comparing four feature-based similarity
measures derived from cognitive sciences. The purpose of the comparative analysis is to identify
the potentially most effective model for mapping independent ontologies in a culturally
influenced domain. To that end, datasets based on standardized, pre-defined feature dimensions and
values, obtainable from the UNESCO Institute for Statistics (UIS), have been used, so that the
similarity measures are verified against objectively developed datasets.
According to the author, the results demonstrate that the Bayesian Model of Generalization provides
the most effective cognitive model for identifying the most similar corresponding concepts
existing for a targeted socio-cultural community.
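As an illustration of what a feature-based similarity measure looks like, the sketch below implements Tversky's ratio model, a well-known measure from cognitive science. Whether it is one of the four measures compared in the paper is not stated here, and the feature sets and the `alpha`/`beta` weights are hypothetical values chosen only for the example.

```python
def tversky_ratio(a: set, b: set, alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky's ratio model: shared features weighed against distinctive ones.

    alpha weights features found only in `a`, beta those found only in `b`.
    Returns a value in [0, 1]; 0.0 when both feature sets are empty.
    """
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denominator = common + alpha * only_a + beta * only_b
    return common / denominator if denominator else 0.0


# Hypothetical feature sets for two concepts from different education systems.
concept_a = {"compulsory", "entry_age_6", "public"}
concept_b = {"compulsory", "entry_age_7", "public"}

score = tversky_ratio(concept_a, concept_b)
```

With `alpha = beta = 1` the measure reduces to the Jaccard index; unequal weights make the comparison asymmetric, which is the property that motivates such models when mapping concepts across cultures.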
In another presentation, Thierry Declerck, Hans-Ulrich Krieger and Dagmar Gromann present
ongoing work and propose an approach to the automatic extraction of information from multilingual
financial Web resources, to provide candidate terms for building ontology elements or instances of
ontology concepts. The authors present an approach complementary to the direct
localization/translation of ontology labels: terminologies are acquired by accessing and
harvesting the multilingual Web presences of structured information providers in the field of finance.
This leads to the detection of candidate terms in various multilingual sources in the financial
domain that can be used not only as labels of ontology classes and properties but also for the
possible generation of (multilingual) domain ontologies themselves.
In the next paper, Manuel Silva, António Lucas Soares and Rute Costa claim that despite the
availability of tools, resources and techniques aimed at the construction of ontological artifacts,
developing a shared conceptualization of a given reality still raises questions about the principles
and methods that support the initial phases of conceptualization. These questions become, according
to the authors, more complex when the conceptualization occurs in a multilingual setting. To tackle
these issues the authors present a collaborative platform, conceptME, where terminological and
knowledge representation processes support domain experts throughout a conceptualization
framework, allowing the inclusion of multilingual data as a way to promote knowledge sharing and
enhance conceptualization and support a multilingual ontology specification.
In another presentation, Frieda Steurs and Hendrik J. Kockaert present TermWise, a large project
dealing with legal terminology and phraseology for the Belgian public services, i.e. the translation
office of the ministry of justice. The project aims at developing an advanced tool that includes
expert knowledge in the algorithms that extract specialized language from textual data (legal
documents). Its outcome is a knowledge database including Dutch/French equivalents for
legal concepts, enriched with the phraseology related to the terms under discussion.
Finally, Deborah Grbac, Luca Losito, Andrea Sada and Paolo Sirito report on the preliminary
results of a pilot project currently ongoing at UCSC Central Library, where they propose to adapt
the model used by translators working within European Union Institutions to subject librarians
employed in large and multilingual academic institutions. The authors are using User Experience
(UX) Analysis in order to provide subject librarians with visual support, by means of “ontology
tables” depicting conceptual linking and connections of words with concepts presented according to
their semantic and linguistic meaning.
The organizers hope that the selection of papers presented here will be of interest to a broad
audience, and will be a starting point for further discussion and cooperation.
First Attempt towards a Standard Glossary of Ontology Engineering Terminology
In this paper we present the consensus-reaching process followed
within the NeOn consortium for the identification and definition of the
activities involved in the ontology network development process. This work
was motivated by the lack of standardization in Ontology Engineering
terminology, which clearly contrasts with the Software Engineering field, which
has the IEEE Standard Glossary of Software Engineering Terminology.
The paper also includes the NeOn Glossary of Activities, which is the result
of the consensus-reaching process explained here. Our future aim is to
standardize the NeOn Glossary of Activities.
Identifying Web Tables - Supporting a Neglected Type of Content on the Web
The abundance of data on the Internet facilitates the improvement of
extraction and processing tools. The trend towards open data publishing
encourages the adoption of structured formats like CSV and RDF. However, there
is still a plethora of unstructured data on the Web which we assume contains
semantics. For this reason, we propose an approach to derive semantics from web
tables, which are still the most popular publishing tool on the Web. The paper
also discusses methods and services for unstructured data extraction and
processing, as well as machine learning techniques to enhance such a workflow.
The eventual result is a framework to process, publish and visualize linked
open data. The software enables table extraction from various open data
sources in the HTML format and an automatic export to the RDF format, making the
data linked. The paper also gives an evaluation of machine learning techniques
in conjunction with string similarity functions to be applied in a table
recognition task.
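A string similarity function can be turned into a feature for table recognition roughly as follows. This is a minimal sketch, not the paper's method: it assumes a hypothetical vocabulary of expected column names (`VOCAB`) and uses the standard library's `difflib.SequenceMatcher` as the similarity function, scoring a table by how closely its header cells match that vocabulary. A classifier could then consume this score alongside other features.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def header_match_score(headers: list[str], vocabulary: list[str]) -> float:
    """Mean, over header cells, of the best similarity to any vocabulary term."""
    if not headers or not vocabulary:
        return 0.0
    best = [max(similarity(h, term) for term in vocabulary) for h in headers]
    return sum(best) / len(best)


# Hypothetical vocabulary of column names expected in genuine data tables.
VOCAB = ["name", "price", "date", "country"]

data_table_headers = ["Name", "Price", "Country"]   # looks like real data
layout_table_headers = ["", "click here", "»"]      # table used only for layout

data_score = header_match_score(data_table_headers, VOCAB)
layout_score = header_match_score(layout_table_headers, VOCAB)
```

A high score suggests a genuine data table, while layout tables tend to score low because their cells rarely resemble attribute names.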
Criteria for the Integration of Term Banks in the Professional Translation Environment
Translation-oriented terminology management is not limited to the study of terminology
problems with regard to specialization, currency, and reliability. The integration of
terminology databases within CAT tools, which facilitates their use, maintenance and retrieval
and contributes to the automation of the translation process and the consistency of terminology, has
also attracted attention from academia and the language industry alike. However, this approach
to terminology management seems to be carried out from a mostly theoretical perspective.
Thus, the aim of this paper is to present the results of a survey conducted among professional
translators in Spain regarding their actual experience with terminology in order to identify
potential gaps between the technological offer and the specific needs of translators.
Candel-Mora, MÁ. (2017). Criteria for the Integration of Term Banks in the Professional Translation Environment. Sendebar. 28:243-260. http://hdl.handle.net/10251/111703S2432602
Formal description of conceptual relationships with a view to implementing them in the ontology editor Protégé
In this article we present a catalogue of conceptual relationships in which each relationship is defined formally in terms of its properties and the nature of the conceptual classes involved. By making the conceptual relationships of the catalogue explicit in the standard ontology editor Protégé, we should be able to retrieve conceptual knowledge in an onomasiological way using the Queries function of the editor. In the final part of the article we present a sample query taken from the analysis of the terminology of finished ceramic products in order to show how information about relationships can be retrieved.
Criteria for the integration of terminological data banks in the professional translator's environment (Criterios para la integración de bancos de datos terminológicos en el entorno del traductor profesional)
Translation-oriented terminology management is not limited to the study of terminology problems with regard to specialization, currency, and reliability. The integration of terminology databases within CAT tools, which facilitates their use, maintenance and retrieval and contributes to the automation of the translation process and the consistency of terminology, has also attracted attention from academia and the language industry alike. However, this approach to terminology management seems to be carried out from a mostly theoretical perspective. Thus, the aim of this paper is to present the results of a survey conducted among professional translators in Spain regarding their actual experience with terminology in order to identify potential gaps between the technological offer and the specific needs of translators.
From Terminology Database to Platform for Terminology Services
Proceedings of the Workshop
CHAT 2011: Creation, Harmonization and Application of Terminology Resources.
Editors: Tatiana Gornostay and Andrejs Vasiļjevs.
NEALT Proceedings Series, Vol. 12 (2011), 16-21.
© 2011 The editors and contributors.
Published by Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt.
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/16956
Method for the semantic indexing of concept hierarchies, uniform representation, use of relational database systems and generic and case-based reasoning
This paper presents a method for semantic indexing and describes its
application in the field of knowledge representation. The starting point of
semantic indexing is the knowledge represented by concept hierarchies. The goal
is to assign keys to nodes (concepts) that are hierarchically ordered and
syntactically and semantically correct. With the indexing algorithm, keys are
computed such that concepts are partially unifiable with all more specific
concepts and only semantically correct concepts are allowed to be added. The
keys represent terminological relationships. Correctness and completeness of
the underlying indexing algorithm are proven. The use of classical relational
databases for the storage of instances is described. Because of the uniform
representation, inference can be done using case-based reasoning and generic
problem-solving methods.
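The idea of keys that make a concept match all of its more specific concepts can be illustrated with a simple prefix scheme. This is a simplified sketch under stated assumptions, not the paper's indexing algorithm: it assumes a tree-shaped hierarchy (one parent per concept), and the function names `assign_keys` and `subsumes` as well as the sample hierarchy are hypothetical.

```python
def assign_keys(hierarchy: dict[str, list[str]], root: str) -> dict[str, tuple[int, ...]]:
    """Give every concept a tuple key whose prefix is the key of its parent."""
    keys = {root: (1,)}
    stack = [root]
    while stack:
        node = stack.pop()
        # Children are numbered 1, 2, 3, ... in the order they are listed.
        for position, child in enumerate(hierarchy.get(node, []), start=1):
            keys[child] = keys[node] + (position,)
            stack.append(child)
    return keys


def subsumes(general: tuple[int, ...], specific: tuple[int, ...]) -> bool:
    """True if `general` is a prefix of `specific` (a concept subsumes itself)."""
    return specific[: len(general)] == general


# Hypothetical concept hierarchy: parent -> list of more specific concepts.
hierarchy = {
    "vehicle": ["car", "bicycle"],
    "car": ["sports car"],
}
keys = assign_keys(hierarchy, "vehicle")
```

Stored as delimited strings in a relational table, such keys would allow all concepts below a given node to be retrieved with a single prefix query, which is one way a classical relational database can support this kind of inference.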
A quest for the right word enhancing reflexivity and technology in terminology training
INTED2010, the 4th International Technology, Education and Development Conference, was held in Valencia (Spain), on March 8, 9 and 10, 2010.
When it comes to translator training, the acquisition of indexing and terminological competences
(at both the retrieval and the management stage) plays a major role in the performance of future
translators. A good terminological database, resulting from accurate research, along with
computer-assisted translation tools (CAT tools) can improve translation speed and quality and also
reduce revision costs, bringing benefits for all the players in the translation industry: language
service providers and clients.
That process (analysis, selection, retrieval and storage of terminology) takes place mostly in the
pre-translation stage, but underlies the whole translation work and can be a determining factor in the
quality of the final product and in its homogeneity, especially when carried out in a collaborative
environment.
The development of terminological databases is an essential step in the training of translators, and the
efficient search for the right word is a necessary skill in today's globalised translation market. Moreover,
since the quest for the right word is run almost entirely over the Internet, data diversity can greatly
increase the noise. This search poses several questions, mainly (1) how and where to retrieve
information and (2) how to manage it efficiently, especially for students who are experts neither in
terminology nor in translation.
To ease some of these problems, students were assigned a project in terminology (a database) and,
in order to accomplish it, both a Webquest and an ePortfolio were proposed as guidance tools. Throughout
the process, students were expected to build up their thematic and communicative competence and,
in parallel, widen their skills in computer-assisted translation tools as well as standard
office-automation software.
This paper aims at discussing how these two tools helped students guide their research, structure the
problem-solving activities, and develop critical thinking and terminological competencies.