Semantics-based information extraction for detecting economic events
As today's financial markets are sensitive to breaking news on economic events, accurate and timely automatic identification of events in news items is crucial. Unstructured news items originating from many heterogeneous sources have to be mined in order to extract knowledge useful for guiding decision-making processes. Hence, we propose the Semantics-Based Pipeline for Economic Event Detection (SPEED), which focuses on extracting financial events from news articles and annotating these with metadata at a speed that enables real-time use. In our implementation, we use some components of an existing framework as well as new components, e.g., a high-performance Ontology Gazetteer, a Word Group Look-Up component, a Word Sense Disambiguator, and components for detecting economic events. Through their interaction with a domain-specific ontology, our novel, semantically enabled components constitute a feedback loop which fosters future reuse of acquired knowledge in the event detection process.
Using punctuation as an iconic system for describing and augmenting video structure
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2001. Includes bibliographical references (leaves 98-101).
Affordable digital cameras, high bandwidth connectivity and large-scale video hosting websites are combining to offer an alternative mode of production and channel of distribution for independent filmmakers and home moviemakers. There is a growing need to develop systems that meaningfully support the desires of these filmmakers to communicate and collaborate effectively with others and to propel cinematic storytelling into new and dynamic realms. This document proposes the development of a networked software application, called PlusShorts, that will allow a distributed group of users to contribute to and collaborate upon the creation of shared movie sequences. This system introduces an iconic language, consisting of punctuation symbols, for annotating, sharing and interpreting conceptual ideas about cinematic structure. The PlusShorts application presents individual movie sequences as elements within an evolving cinematic storyspace, where participants can explore, collaborate and share ideas.
Aisling Geraldine Mary Kelliher. S.M.
Automated Detection of Financial Events in News Text
Today’s financial markets are inextricably linked with financial events like acquisitions, profit announcements, or product launches. Information extracted from news messages that report on such events could hence be beneficial for financial decision making. The ubiquity of news, however, makes manual analysis impossible, and due to the unstructured nature of text, the (semi-)automatic extraction and application of financial events remains a non-trivial task. Therefore, the studies composing this dissertation investigate 1) how to accurately identify financial events in news text, and 2) how to effectively use such extracted events in financial applications.
Based on a detailed evaluation of current event extraction systems, this thesis presents a competitive, knowledge-driven, semi-automatic system for financial event extraction from text. A novel pattern language, which makes clever use of the system’s underlying knowledge base, allows for the definition of simple, yet expressive event extraction rules that can be applied to natural language texts. The system’s knowledge-driven internals remain synchronized with the latest market developments through the accompanying event-triggered update language for knowledge bases, enabling the definition of update rules.
Additional research covered by this dissertation investigates the practical applicability of extracted events. In automated stock trading experiments, the best performing trading rules not only make use of traditional numerical signals but also employ news-based event signals. Moreover, when stock data are cleaned of disruptions caused by financial events, financial risk analyses yield more accurate results. These results suggest that events detected in news can be used advantageously as supplementary parameters in financial applications.
Importance of Measuring Sentential Semantic Knowledge Base of a "Free Text" Medical Corpus
At present, the healthcare industry uses codified data mainly for billing purposes. Codified data could be used to improve patient care through decision support and analytical systems. However, to reduce medical errors, these systems need access to a wide range of medical data. Unfortunately, a great deal of data is only available in narrative or free-text form, requiring natural language processing (NLP) techniques for its codification. Structuring narrative data and analyzing its underlying meaning in a medical domain requires extensive knowledge acquired through studying the domain empirically. Existing NLP systems like MedLEE have a limited ability to analyze free-text medical observations and codify data against Unified Medical Language System (UMLS) codes. MedLEE was successful in extracting meaning from relatively simple sentences in radiological reports, but could not analyze the more complicated sentences which appear frequently in medical reports. An important problem in medical NLP is understanding how many codes or symbols are necessary to codify a medical domain completely. Another problem is determining whether existing medical lexicons such as SNOMED-CT and ICD-9 are suitable for representing the knowledge in medical reports unambiguously. This thesis investigates the problems behind current NLP systems and lexicons, and attempts to estimate the number of symbols or codes required to represent a large corpus of radiology reports. This knowledge will provide a greater understanding of how many symbols may be needed for the complete representation of concepts in other medical domains.
Quantitative Characteristics of Human-Written Short Stories as a Metric for Automated Storytelling
Evaluating the extent to which computer-produced stories are structured like human-invented narratives can be an important component of assessing the quality of a story plot. In this paper, we report on an empirical experiment in which human subjects invented short plots in a constrained scenario. The stories were annotated according to features commonly found in existing automatic story generators. The annotation was designed to measure the proportion and relations of story components that automatic computational systems should use to match human behaviour. Results suggest that there are relatively common patterns that can be used as input data for identifying similarity to human-invented stories in automatic storytelling systems. The patterns found are in line with narratological models, and the results provide numerical quantification and layout of story components. The proposed method of story analysis is tested on two additional sources, the ROCStories corpus and stories generated by automated storytellers, to illustrate the valuable insights that may be derived from them.
Modeling Narrative Discourse
This thesis describes new approaches to the formal modeling of narrative discourse. Although narratives of all kinds are ubiquitous in daily life, contemporary text processing techniques typically do not leverage the aspects that separate narrative from expository discourse. We describe two approaches to the problem. The first approach considers the conversational networks to be found in literary fiction as a key aspect of discourse coherence; by isolating and analyzing these networks, we are able to comment on longstanding literary theories. The second approach proposes a new set of discourse relations that are specific to narrative. By focusing on certain key aspects, such as agentive characters, goals, plans, beliefs, and time, these relations represent a theory-of-mind interpretation of a text. We show that these discourse relations are expressive, formal, robust, and, with the support of a software system, amenable to corpus collection projects that employ trained annotators. We have procured and released a collection of over 100 encodings, covering a set of fables as well as longer texts including literary fiction and epic poetry. We are able to inferentially find similarities and analogies between encoded stories based on the proposed relations, and an evaluation of this technique shows that human raters prefer such a measure of similarity to a more traditional one based on the semantic distances between story propositions.
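A web semântica no contexto educativo: um sistema para a recuperação de objectos de aprendizagem baseado nas tecnologias para a web semântica, para o e-learning e para os agentes [The Semantic Web in the educational context: a system for retrieving learning objects based on Semantic Web, e-learning and agent technologies]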
The Web can be understood as a worldwide multimedia library of documents. It is currently the
largest information repository and makes multimedia content available. However, locating that
content is not easy, mainly because its semantics or meaning can only be captured in context and
from a human perspective.
In recent years, the international scientific community has made significant efforts to
improve the location, retrieval and reuse of information objects, which may be inaccessible or
stored on servers scattered around the deep Web or the invisible Web.
Metadata and ontologies, metalanguages, annotation tools, tools for the creation of
ontologies and topic maps, intelligent agents and mobile agent systems, among other
technological developments in Computer Science and Artificial Intelligence within the
scope of Information and Knowledge Management and Distributed Web Systems, are key
elements for the development of solutions that will gradually change the reality of
today's Web.
The initiative that has attracted the most attention is the Semantic Web, whose main
purpose is the integration, interchange and semantic understanding of information, not only from
the viewpoint of humans but also from that of machines, by transforming the current Web into
a Web of semantic data. This would then allow for the description, interrelation and
understanding of contents through metadata, ontologies and software agents.
In this context, the present thesis specifies an architecture for a system that retrieves
learning objects, based on Semantic Web, e-Learning and software agent technologies, with the
main aim of solving the problem of discovering learning objects and training courses.
Following this architecture, we also developed an experimental prototype for the semantic
search of learning objects stored in e-Learning systems, learning object repositories and other
Web servers for educational contents.
The research project of this dissertation was partially funded by the Programa de Desenvolvimento Educativo para Portugal (PRODEP III), Axis 3 – Learning Society, Measure 5 (FSE) – Training of Teachers and Other Agents, Action 5.3 – Advanced Training of Higher Education Teachers.
Modelo de educación de la inteligencia colectiva [Collective intelligence education model]
The research carried out is part of the field of study of Collective Intelligence (CI) with the use of Information and Communication Technologies (ICT) in Higher Education.
The heart of this research was focused on the study, design and construction of electronic tools according to the paradigms of CI, to be applied in Higher Education. As an instrument for the implementation of these tools, an educational model with a collective work approach was designed.
The research strategy used was Design-Based Research (DBR), because it investigates a phenomenon in its real context, is iterative and incremental, and is especially recommended for the field of education. In each experimental cycle, DBR updates the literature, the model and the tools.
Empirical studies were conducted in four universities and fields of study in Latin America and Europe.
The refinements demanded by the research strategy provided the scientific and empirical evidence to design ICT tools that meet the requirements of CI. In addition, the results indicate that the educational model and the tools have generated a positive perception among teachers and students of their effects on the teaching-learning process. On this basis, the experimental cycles make significant contributions to research on CI with ICT tools in Higher Education.
Postprint (published version)