5,548 research outputs found

    Information and Experience in Metaphor: A Perspective From Computer Analysis

    Get PDF
    Novel linguistic metaphor can be seen as the assignment of attributes to a topic through a vehicle belonging to another domain. The experience evoked by the vehicle is a significant aspect of the meaning of the metaphor, especially for abstract metaphor, which involves more than mere physical similarity. In this article I indicate, through description of a specific model, some possibilities as well as limitations of computer processing directed toward both informative and experiential/affective aspects of metaphor. A background to the discussion is given by other computational treatments of metaphor analysis, as well as by some questions about metaphor originating in other disciplines. The approach on which the present metaphor analysis model is based is consistent with a theory of language comprehension that includes both the intent of the originator and the effect on the recipient of the metaphor. The model addresses the dual problem of (a) determining potentially salient properties of the vehicle concept, and (b) defining extensible symbolic representations of such properties, including affective and other connotations. The nature of the linguistic analysis underlying the model suggests how metaphoric expression of experiential components in abstract metaphor is dependent on the nominalization of actions and attributes. The inverse process of undoing such nominalizations in computer analysis of metaphor constitutes a translation of a metaphor to a more literal expression within the metaphor-nonmetaphor dichotomy

    Towards a Linked Semantic Web: Precisely, Comprehensively and Scalably Linking Heterogeneous Data in the Semantic Web

    Get PDF
    The amount of Semantic Web data is growing rapidly today. Individual users, academic institutions and businesses have already published and are continuing to publish their data in Semantic Web standards, such as RDF and OWL. Due to the decentralized nature of the Semantic Web, the same real world entity may be described in various data sources with different ontologies and assigned syntactically distinct identifiers. Furthermore, data published by each individual publisher may not be complete. This situation makes it difficult for end users to consume the available Semantic Web data effectively. In order to facilitate data utilization and consumption in the Semantic Web, without compromising the freedom of people to publish their data, one critical problem is to appropriately interlink such heterogeneous data. This interlinking process is sometimes referred to as Entity Coreference, i.e., finding which identifiers refer to the same real world entity. In the Semantic Web, the owl:sameAs predicate is used to link two equivalent (coreferent) ontology instances. An important question is where these owl:sameAs links come from. Although manual interlinking is possible on small scales, when dealing with large-scale datasets (e.g., millions of ontology instances), automated linking becomes necessary. This dissertation summarizes contributions to several aspects of entity coreference research in the Semantic Web. First of all, by developing the EPWNG algorithm, we advance the performance of the state-of-the-art by 1% to 4%. EPWNG finds coreferent ontology instances from different data sources by comparing every pair of instances and focuses on achieving high precision and recall by appropriately collecting and utilizing instance context information domain-independently. We further propose a sampling and utility function based context pruning technique, which provides a runtime speedup factor of 30 to 75. Furthermore, we develop an on-the-fly candidate selection algorithm, P-EPWNG, that enables the coreference process to run 2 to 18 times faster than the state-of-the-art on up to 1 million instances while only making a small sacrifice in the coreference F1-scores. This is achieved by utilizing the matching histories of the instances to prune instance pairs that are not likely to be coreferent. We also propose Offline, another candidate selection algorithm, that not only provides similar runtime speedup to P-EPWNG but also helps to achieve higher candidate selection and coreference F1-scores due to its more accurate filtering of true negatives. Different from P-EPWNG, Offline pre-selects candidate pairs by only comparing their partial context information that is selected in an unsupervised, automatic and domain-independent manner.In order to be able to handle really heterogeneous datasets, a mechanism for automatically determining predicate comparability is proposed. Combing this property matching approach with EPWNG and Offline, our system outperforms state-of-the-art algorithms on the 2012 Billion Triples Challenge dataset on up to 2 million instances for both coreference F1-score and runtime. An interesting project, where we apply the EPWNG algorithm for assisting cervical cancer screening, is discussed in detail. By applying our algorithm to a combination of different patient clinical test results and biographic information, we achieve higher accuracy compared to its ablations. We end this dissertation with the discussion of promising and challenging future work

    Flexible RDF data extraction from Wiktionary - Leveraging the power of community build linguistic wikis

    Get PDF
    We present a declarative approach implemented in a comprehensive opensource framework (based on DBpedia) to extract lexical-semantic resources (an ontology about language use) from Wiktionary. The data currently includes language, part of speech, senses, definitions, synonyms, taxonomies (hyponyms, hyperonyms, synonyms, antonyms) and translations for each lexical word. Main focus is on flexibility to the loose schema and configurability towards differing language-editions ofWiktionary. This is achieved by a declarative mediator/wrapper approach. The goal is, to allow the addition of languages just by configuration without the need of programming, thus enabling the swift and resource-conserving adaptation of wrappers by domain experts. The extracted data is as fine granular as the source data in Wiktionary and additionally follows the lemon model. It enables use cases like disambiguation or machine translation. By offering a linked data service, we hope to extend DBpedia’s central role in the LOD infrastructure to the world of Open Linguistics.

    Affective graphs: the visual appeal of linked data

    Get PDF
    The essence and value of Linked Data lies in the ability of humans and machines to query, access and reason upon highly structured and formalised data. Ontology structures provide an unambiguous description of the structure and content of data. While a multitude of software applications and visualization systems have been developed over the past years for Linked Data, there is still a significant gap that exists between applications that consume Linked Data and interfaces that have been designed with significant focus on aesthetics. Though the importance of aesthetics in affecting the usability, effectiveness and acceptability of user interfaces have long been recognised, little or no explicit attention has been paid to the aesthetics of Linked Data applications. In this paper, we introduce a formalised approach to developing aesthetically pleasing semantic web interfaces by following aesthetic principles and guidelines identified from literature. We apply such principles to design and develop a generic approach of using visualizations to support exploration of Linked Data, in an interface that is pleasing to users. This provides users with means to browse ontology structures, enriched with statistics of the underlying data, facilitating exploratory activities and enabling visual query for highly precise information needs. We evaluated our approach in three ways: an initial objective evaluation comparing our approach with other well-known interfaces for the semantic web and two user evaluations with semantic web researchers

    An Ontology-Based Assistant For Analyzing Agents\u27 Activities

    Get PDF
    This thesis reports on work in progress on software that helps an analyst identify and analyze activities of actors (such as vehicles) in an intelligence-relevant scenario. A system is being developed to aid intelligence analysts, IAGOA ((Intelligence Analyst’s Geospatial and Ontological Assistant). Analysis may be accomplished by retrieving simulated satellite data of ground vehicles and interacting with software modules that allow the analyst to conjecture the activities in which the actor is engaged along with the (largely geospatial and temporal) features of the area of operation relevant to the natures of those activities. Activities are conceptualized by ontologies. The research relies on natural language components (semantic frames) gathered from the FrameNet lexical database, which captures the semantics of lexical items with an ontology using OWL. The software has two components, one for the analyst, and one for a modeler who produces HTML and parameterized KML documents used by the analyst. The most significant input to the modeler software is the FrameNet OWL file, and the interface for the analyst and, to some extent, the modeler is provided by the Google Earth API

    Toward Entity-Aware Search

    Get PDF
    As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities, by distilling its underlying conceptual model Impression Model and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning--entity as input and entity as output, we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results we obtained so far have shown clear promise of entity-aware search, in its usefulness, effectiveness, efficiency and scalability
    • …
    corecore