1,356 research outputs found

    Challenges to knowledge representation in multilingual contexts

    To meet the growing demands of complex inter-organizational processes, continuous innovation and internationalization, new forms of organisation are being adopted that foster more intensive collaboration and sharing of resources, in what can be called collaborative networks (Camarinha-Matos, 2006:03). Information and knowledge are crucial resources in collaborative networks, and their management is a fundamental process to optimize. Knowledge organisation and collaboration systems are thus important instruments for the success of collaborative networks of organisations, and have been researched over the last decade in computer science, information science, management sciences, terminology and linguistics. Nevertheless, research in this area has paid little attention to multilingual contexts of collaboration, which pose specific and challenging problems. Access to and representation of knowledge will increasingly happen in a multilingual setting, which implies overcoming the difficulties inherent in the presence of multiple languages through processes such as the localization of ontologies. Although localization, like other processes that involve multilingualism, is a rather well-developed practice whose methodologies and tools are fruitfully employed by the language industry in the development and adaptation of multilingual content, it has not yet been sufficiently explored as an element of support to the development of knowledge representations, in particular ontologies, expressed in more than one language. Multilingual knowledge representation is thus an open research area calling for cross-contributions from knowledge engineering, terminology, ontology engineering, cognitive sciences, computational linguistics, natural language processing, and management sciences.
This workshop brought together researchers interested in multilingual knowledge representation in a multidisciplinary environment, to debate the possibilities of cross-fertilization between knowledge engineering, terminology, ontology engineering, cognitive sciences, computational linguistics, natural language processing, and management sciences, applied to contexts where multilingualism continuously creates new and demanding challenges to current knowledge representation methods and techniques. Six papers dealing with different approaches to multilingual knowledge representation are presented in this workshop, most of them describing tools, approaches and results obtained in ongoing projects. In the first paper, Andrés Domínguez Burgos, Koen Kerremans and Rita Temmerman present a software module that is part of a workbench for terminological and ontological mining: Termontospider, a wiki crawler that aims to traverse Wikipedia optimally in search of domain-specific texts from which to extract terminological and ontological information. The crawler is part of a tool suite for automatically developing multilingual termontological databases, i.e. ontologically-underpinned multilingual terminological databases. In this paper the authors describe the basic principles behind the crawler and summarize the research setting in which the tool is currently being tested. In the second paper, Fumiko Kano presents work comparing four feature-based similarity measures derived from cognitive sciences. The purpose of the comparative analysis is to verify which model is potentially the most effective for mapping independent ontologies in a culturally influenced domain. To that end, datasets based on standardized pre-defined feature dimensions and values, obtainable from the UNESCO Institute for Statistics (UIS), were used for the comparative analysis of the similarity measures.
The purpose of the comparison is to verify the similarity measures against these objectively developed datasets. According to the author, the results demonstrate that the Bayesian Model of Generalization provides the most effective cognitive model for identifying the most similar corresponding concepts existing for a targeted socio-cultural community. In another presentation, Thierry Declerck, Hans-Ulrich Krieger and Dagmar Gromann present ongoing work and propose an approach to the automatic extraction of information from multilingual financial Web resources, in order to provide candidate terms for building ontology elements or instances of ontology concepts. The authors present an approach complementary to the direct localization/translation of ontology labels: terminologies are acquired by accessing and harvesting the multilingual Web presences of structured information providers in the field of finance. This leads to the detection of candidate terms in various multilingual sources in the financial domain that can be used not only as labels of ontology classes and properties but also for the possible generation of (multilingual) domain ontologies themselves. In the next paper, Manuel Silva, António Lucas Soares and Rute Costa claim that, despite the availability of tools, resources and techniques aimed at the construction of ontological artifacts, developing a shared conceptualization of a given reality still raises questions about the principles and methods that support the initial phases of conceptualization. These questions become, according to the authors, more complex when the conceptualization occurs in a multilingual setting.
To tackle these issues the authors present a collaborative platform, conceptME, where terminological and knowledge representation processes support domain experts throughout a conceptualization framework, allowing the inclusion of multilingual data as a way to promote knowledge sharing, enhance conceptualization and support a multilingual ontology specification. In another presentation, Frieda Steurs and Hendrik J. Kockaert present TermWise, a large project dealing with legal terminology and phraseology for the Belgian public services, i.e. the translation office of the ministry of justice. The project aims at developing an advanced tool that includes expert knowledge in the algorithms that extract specialized language from textual data (legal documents); its outcome is a knowledge database containing Dutch/French equivalents for legal concepts, enriched with the phraseology related to the terms under discussion. Finally, Deborah Grbac, Luca Losito, Andrea Sada and Paolo Sirito report on the preliminary results of a pilot project currently under way at UCSC Central Library, where they propose to adapt the model used by translators working within European Union institutions to subject librarians employed in large, multilingual academic institutions. The authors are using User Experience (UX) Analysis in order to provide subject librarians with visual support by means of “ontology tables” depicting the conceptual linking and connections of words with concepts, presented according to their semantic and linguistic meaning. The organizers hope that the selection of papers presented here will be of interest to a broad audience and will be a starting point for further discussion and cooperation.

    First Attempt towards a Standard Glossary of Ontology Engineering Terminology

    In this paper we present the consensus-reaching process followed within the NeOn consortium for the identification and definition of the activities involved in the ontology network development process. This work was motivated by the lack of standardization in Ontology Engineering terminology, which clearly contrasts with the Software Engineering field, which has the IEEE Standard Glossary of Software Engineering Terminology. The paper also includes the NeOn Glossary of Activities, which is the result of the consensus-reaching process explained here. Our future aim is to standardize the NeOn Glossary of Activities.

    Identifying Web Tables - Supporting a Neglected Type of Content on the Web

    The abundance of data on the Internet facilitates the improvement of extraction and processing tools. The trend in open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a plethora of unstructured data on the Web which we assume contains semantics. For this reason, we propose an approach to derive semantics from web tables, which are still the most popular publishing tool on the Web. The paper also discusses methods and services for unstructured data extraction and processing, as well as machine learning techniques to enhance such a workflow. The eventual result is a framework to process, publish and visualize linked open data. The software enables table extraction from various open data sources in HTML format and automatic export to RDF, making the data linked. The paper also evaluates machine learning techniques in conjunction with string similarity functions applied to a table recognition task. Comment: 9 pages, 4 figures

    Criteria for the Integration of Term Banks in the Professional Translation Environment

    Translation-oriented terminology management is not limited to the study of terminology problems with regard to specialization, currency, and reliability. The integration of terminology databases within CAT tools, which facilitates their use, maintenance and retrieval and contributes to the automation of the translation process and the consistency of terminology, has also attracted attention from academia and the language industry alike. However, this approach to terminology management seems to be carried out from a mostly theoretical perspective. Thus, the aim of this paper is to present the results of a survey conducted among professional translators in Spain regarding their actual experience with terminology, in order to identify potential gaps between the technological offer and the specific needs of translators. Candel-Mora, MÁ. (2017). Criteria for the Integration of Term Banks in the Professional Translation Environment. Sendebar. 28:243-260. http://hdl.handle.net/10251/111703

    Formal description of conceptual relationships with a view to implementing them in the ontology editor Protégé

    In this article we present a catalogue of conceptual relationships in which each relationship is defined formally in terms of its properties and the nature of the conceptual classes involved. By making the conceptual relationships of the catalogue explicit in the standard ontology editor Protégé, we should be able to retrieve conceptual knowledge in an onomasiological way using the Queries function of the editor. In the final part of the article we present a sample query taken from the analysis of the terminology of finished ceramic products in order to show how information about relationships can be retrieved.

    Criterios para la integración de bancos de datos terminológicos en el entorno del traductor profesional

    Translation-oriented terminology management is not limited to the study of terminology problems with regard to specialization, currency, and reliability. The integration of terminology databases within CAT tools, which facilitates their use, maintenance and retrieval and contributes to the automation of the translation process and the consistency of terminology, has also attracted attention from academia and the language industry alike. However, this approach to terminology management seems to be carried out from a mostly theoretical perspective. Thus, the aim of this paper is to present the results of a survey conducted among professional translators in Spain regarding their actual experience with terminology, in order to identify potential gaps between the technological offer and the specific needs of translators.

    From Terminology Database to Platform for Terminology Services

    Proceedings of the Workshop CHAT 2011: Creation, Harmonization and Application of Terminology Resources. Editors: Tatiana Gornostay and Andrejs Vasiļjevs. NEALT Proceedings Series, Vol. 12 (2011), 16-21. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/16956

    Method for the semantic indexing of concept hierarchies, uniform representation, use of relational database systems and generic and case-based reasoning

    This paper presents a method for semantic indexing and describes its application in the field of knowledge representation. The starting point of the semantic indexing is the knowledge represented by concept hierarchies. The goal is to assign keys to nodes (concepts) that are hierarchically ordered and syntactically and semantically correct. With the indexing algorithm, keys are computed such that concepts are partially unifiable with all more specific concepts and only semantically correct concepts are allowed to be added. The keys represent terminological relationships. Correctness and completeness of the underlying indexing algorithm are proven. The use of classical relational databases for the storage of instances is described. Because of the uniform representation, inference can be done using case-based reasoning and generic problem-solving methods.

    A quest for the right word enhancing reflexivity and technology in terminology training

    INTED2010, the 4th International Technology, Education and Development Conference, was held in Valencia (Spain) on March 8, 9 and 10, 2010. In translator training, the acquisition of indexing and terminological competences (at both the retrieval and the management stage) plays a major role in the performance of future translators. A good terminological database, resulting from accurate research, used along with computer-assisted translation tools (CAT tools), can improve translation speed and quality and also reduce revision costs, bringing benefits for all the players in the translation industry: language service providers and clients. That process (analysis, selection, retrieval and storage of terminology) takes place mostly in the pre-translation stage, but it underlies the whole translation work and can be a determining factor in the quality of the final product and in its homogeneity, especially when carried out in a collaborative environment. The development of terminological databases is an essential step in the training of translators, and the efficient search for the right word is a necessary skill in today's globalised translation market. Moreover, since the quest for the right word is run almost entirely over the Internet, data diversity can greatly increase the noise. This search poses several questions, mainly (1) how and where to retrieve information and (2) how to manage it efficiently, especially for students who are experts neither in terminology nor in translation. To ease some of these problems, students were assigned a project in terminology (a database) and, in order to accomplish it, both a Webquest and an ePortfolio were proposed as guidance tools. Throughout the process, students were expected to build up their thematic and communicative competence and, in parallel, widen their skills in computer-assisted translation tools as well as standard office automation software.
This paper discusses how these two tools helped students guide their research, structure their problem-solving activities, and develop critical thinking and terminological competencies.