    More semantic links in the SIMPLE-CLIPS database

    Notwithstanding its acknowledged richness, the SIMPLE semantic model does not offer the representational vocabulary for encoding some conceptual links holding between events and their participants and among co-participants in events. Although critical for boosting performance in many NLP application tasks, such deep lexical information is therefore only partially encoded in the SIMPLE-CLIPS Italian semantic database. This paper reports on the enrichment of the SIMPLE relation set with expressive means, namely semantic relations borrowed from the EuroWordNet model, and on their implementation in the SIMPLE-CLIPS lexicon. The original state of the database with respect to this type of information is described, and the borrowed descriptive vocabulary is presented. Strategies based on the exploitation of the source lexicon data were adopted to induce new information: a wide range of semantic, but also syntactic, information was investigated to single out word senses that are candidates for linking by the new relations. The lexicon enrichment by 5,000 new relations instantiated so far has therefore been carried out as a largely automated, low-effort and cost-free process, with no heavy human intervention. The redundancy introduced by this extension of information is being addressed by the implementation of inheritance in the SIMPLE-CLIPS database (Del Gratta et al., 2008).

    Simple_PLUS: una red de relaciones léxico-semánticas

    The present article deals with the Italian lexical-semantic database Simple_PLUS and focuses on its essential core, i.e. the network of lexical semantic relations. This lexical resource builds on Parole-Simple-Clips, a four-layered electronic lexicon of Italian, founded on the SIMPLE model. Simple_PLUS consists of 30,000 semantic entries, partly imported from the source lexicon and partly newly created, but all encoding a wide-ranging set of information provided by the underpinning model. In Simple_PLUS, this semantic representation has been enriched with significant relational information, in a largely automated, inexpensive process. More than 5,000 relationships between events and their participants and among co-participants in events, links which could not previously be captured for lack of suitable representational means, have been encoded with the appropriate descriptive vocabulary borrowed from the EuroWordNet lexical model. Such conceptual links, which efficiently enhance the predicative representation in the lexicon, provide crucial lexical knowledge for NLP systems and for the Semantic Web.

    Simple_PLUS: a network of lexical semantic relations / Simple_PLUS: una red de relaciones léxico-semánticas

    The present article deals with the Italian lexical-semantic database Simple_PLUS and focuses on its essential core, i.e. the network of lexical semantic relations. This lexical resource builds on Parole-Simple-Clips, a four-layered electronic lexicon of Italian, founded on the SIMPLE model. Simple_PLUS consists of 30,000 semantic entries, partly imported from the source lexicon and partly newly created, but all encoding a wide-ranging set of information provided by the underpinning model. In Simple_PLUS, this semantic representation has been enriched with significant relational information, in a largely automated, inexpensive process. More than 5,000 relationships between events and their participants and among co-participants in events, links which could not previously be captured for lack of suitable representational means, have been encoded with the appropriate descriptive vocabulary borrowed from the EuroWordNet lexical model. Such conceptual links, which efficiently enhance the predicative representation in the lexicon, provide crucial lexical knowledge for NLP systems and for the Semantic Web.

    Web 2.0, language resources and standards to automatically build a multilingual named entity lexicon

    This paper proposes to advance the state of the art in automatic Language Resource (LR) building by taking into consideration three elements: (i) the knowledge available in existing LRs, (ii) the vast amount of information available from the collaborative paradigm that has emerged from the Web 2.0 and (iii) the use of standards to improve interoperability. We present a case study in which a set of LRs for different languages (WordNet for English and Spanish and Parole-Simple-Clips for Italian) is extended with Named Entities (NE) by exploiting Wikipedia and the aforementioned LRs. The practical result is a multilingual NE lexicon connected to these LRs and to two ontologies: SUMO and SIMPLE. Furthermore, the paper addresses interoperability, an important problem currently affecting Computational Linguistics, by making use of the ISO LMF standard to encode this lexicon. The different steps of the procedure (mapping, disambiguation, extraction, NE identification and postprocessing) are comprehensively explained and evaluated. The resulting resource contains 974,567, 137,583 and 125,806 NEs for English, Spanish and Italian respectively. Finally, in order to check the usefulness of the constructed resource, we apply it to a state-of-the-art Question Answering system and evaluate its impact; the NE lexicon improves the system’s accuracy by 28.1%. Compared to previous approaches to building NE repositories, the current proposal represents a step forward in terms of automation, language independence, amount of NEs acquired and richness of the information represented.

    Interactive searching and browsing of video archives: using text and using image matching

    Over the last few decades much research work has been done in the general area of video and audio analysis. Initially the applications driving this included capturing video in digital form and then being able to store, transmit and render it, which involved a large effort to develop compression and encoding standards. The technology needed to do all this is now easily available and cheap, with applications of digital video processing now commonplace, ranging from CCTV (Closed Circuit TV) for security, to home capture of broadcast TV on home DVRs for personal viewing. One consequence of the development in technology for creating, storing and distributing digital video is that there has been a huge increase in the volume of digital video, and this in turn has created a need for techniques to allow effective management of this video, and by that we mean content management. In the BBC, for example, the archives department receives approximately 500,000 queries per year and has over 350,000 hours of content in its library. Having huge archives of video information is of hardly any benefit if we have no effective means of locating video clips which are relevant to whatever our information needs may be. In this chapter we report our work on developing two specific retrieval and browsing tools for digital video information. Both of these are based on an analysis of the captured video for the purpose of automatically structuring it into shots or higher level semantic units like TV news stories. Some also include analysis of the video for the automatic detection of features such as the presence or absence of faces. Both include some elements of searching, where a user specifies a query or information need, and browsing, where a user is allowed to browse through sets of retrieved video shots. We support the presentation of these tools with illustrations of actual video retrieval systems developed and working on hundreds of hours of video content.

    Linking and Integrating two Electronic Lexicons

    In lexicography, much attention is being paid, when building lexical resources, to their interoperability and their easy integration in HLT-NLP applications for enhanced performance. For already existing computational lexicons, on the other hand, integration and interoperability are attainable provided their main features offer a basis for comparison. The two largest and most extensively encoded electronic lexicons of the Italian language fulfill this essential requirement. Although developed according to two different lexical models, ItalWordNet and PAROLE-SIMPLE-CLIPS in fact present many compatible aspects. Linking and eventually merging these lexical resources in a common representation framework therefore seems a wise move to offer the end-user more exhaustive and in-depth lexical information, combining the potential and the most outstanding features of the two lexical models. This paper reports on the ongoing linking of the two lexicons. The mapping of the ontologies on which the lexicons are structured is described; an overview of the adopted methodology, of the linking process and of the results of the first mapping phase regarding 1stOrder Entities is provided. Reciprocal benefits and enhancements for the two resources are also illustrated, which justify the soundness of our linking initiative.

    A story environment for learning object annotation and collection : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University, Palmerston North, New Zealand

    With the increase in computer power, network bandwidth and availability, e-learning is used more and more widely. In practice e-learning can be applied in a variety of ways, such as providing electronic resources to support teaching and learning, developing computer based tutoring programs or building computer supported collaborative learning environments. E-learning has become significantly important because it can improve the quality of learning through the use of interactive computers, online communications and information systems in ways that other teaching methods cannot achieve. An important advantage of e-learning is that it offers learners a large amount of sharable and reusable learning resources. Current approaches such as Internet search and learning object repositories do not effectively help users to search for appropriate learning objects. The original story concept introduces a new semantic layer between collections of learning objects and learning material. The basic idea of the story concept is to add an interpretative, semantically rich layer, informally called 'Story', between learning objects and learning material that links learning objects according to specific themes and subjects (Heinrich & Andres, 2003a). One motivation behind this approach is to put a more focused, semantic layer on top of the untargeted metadata that are commonly used to describe a single learning object. In an e-learning context, the stories build on learning objects and become information resources for learning material. The overall aim of this project was to design and build a story environment to realize the above story concept. The development of the story environment includes story metadata, story environment components, the story browsing and authoring processes, and the tools involved in story browsing and authoring. The story concept suggests that different types of metadata should be used in a story. This project developed those metadata specifications to support the story environment. Two prototype tools have been designed and implemented in this project to allow users to evaluate the story concept and story environment. The story browser helps story readers to read the story narrative and look at a story from different perspectives. The story authoring tool is used by story authors to author a story. Future work for this project has been identified in the areas of adding features to the current tools, user testing and further implementation of the story environment.