152 research outputs found
Cloud-Based Benchmarking of Medical Image Analysis
Medical imagin
Complexity of Lexical Descriptions and its Relevance to Partial Parsing
In this dissertation, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (supertags) that impose complex constraints in a local context. However, increasing the complexity of descriptions makes the number of different descriptions for each lexical item much larger and hence increases the local ambiguity for a parser. This local ambiguity can be resolved by using supertag co-occurrence statistics collected from parsed corpora. We have explored these ideas in the context of Lexicalized Tree-Adjoining Grammar (LTAG) framework wherein supertag disambiguation provides a representation that is an almost parse. We have used the disambiguated supertag sequence in conjunction with a lightweight dependency analyzer to compute noun groups, verb groups, dependency linkages and even partial parses. We have shown that a trigram-based supertagger achieves an accuracy of 92.1‰ on Wall Street Journal (WSJ) texts. Furthermore, we have shown that the lightweight dependency analysis on the output of the supertagger identifies 83‰ of the dependency links accurately. We have exploited the representation of supertags with Explanation-Based Learning to improve parsing effciency. In this approach, parsing in limited domains can be modeled as a Finite-State Transduction. We have implemented such a system for the ATIS domain which improves parsing eciency by a factor of 15. We have used the supertagger in a variety of applications to provide lexical descriptions at an appropriate granularity. In an information retrieval application, we show that the supertag based system performs at higher levels of precision compared to a system based on part-of-speech tags. In an information extraction task, supertags are used in specifying extraction patterns. For language modeling applications, we view supertags as syntactically motivated class labels in a class-based language model. The distinction between recursive and non-recursive supertags is exploited in a sentence simplification application
Recommended from our members
Large-scale Affective Computing for Visual Multimedia
In recent years, Affective Computing has arisen as a prolific interdisciplinary field for engineering systems that integrate human affections. While human-computer relationships have long revolved around cognitive interactions, it is becoming increasingly important to account for human affect, or feelings or emotions, to avert user experience frustration, provide disability services, predict virality of social media content, etc. In this thesis, we specifically focus on Affective Computing as it applies to large-scale visual multimedia, and in particular, still images, animated image sequences and video streams, above and beyond the traditional approaches of face expression and gesture recognition. By taking a principled psychology-grounded approach, we seek to paint a more holistic and colorful view of computational affect in the context of visual multimedia. For example, should emotions like 'surprise' and `fear' be assumed to be orthogonal output dimensions? Or does a 'positive' image in one culture's view elicit the same feelings of positivity in another culture? We study affect frameworks and ontologies to define, organize and develop machine learning models with such questions in mind to automatically detect affective visual concepts.
In the push for what we call "Big Affective Computing," we focus on two dimensions of scale for affect -- scaling up and scaling out -- which we propose are both imperative if we are to scale the Affective Computing problem successfully. Intuitively, simply increasing the number of data points corresponds to "scaling up". However, less intuitive, is when problems like Affective Computing "scale out," or diversify. We show that this latter dimension of introducing data variety, alongside the former of introducing data volume, can yield particular insights since human affections naturally depart from traditional Machine Learning and Computer Vision problems where there is an objectively truthful target. While no one might debate a picture of a 'dog' should be tagged as a 'dog,' but not all may agree that it looks 'ugly'. We present extensive discussions on why scaling out is critical and how it can be accomplished while in the context of large-volume visual data.
At a high-level, the main contributions of this thesis include:
Multiplicity of Affect Oracles:
Prior to the work in this thesis, little consideration has been paid to the affective label generating mechanism when learning functional mappings between inputs and labels. Throughout this thesis but first in Chapter 2, starting in Section 2.1.2, we make a case for a conceptual partitioning of the affect oracle governing the label generation process in Affective Computing problems resulting a multiplicity of oracles, whereas prior works assumed there was a single universal oracle. In Chapter 3, the differences between intended versus expressed versus induced versus perceived emotion are discussed, where we argue that perceived emotion is particularly well-suited for scaling up because it reduces the label variance due to its more objective nature compared to other affect states. And in Chapter 4 and 5, a division of the affect oracle along cultural lines with manifestations along both language and geography is explored. We accomplish all this without sacrificing the 'scale up' dimension, and tackle significantly larger volume problems than prior comparable visual affective computing research.
Content-driven Visual Affect Detection:
Traditionally, in most Affective Computing work, prediction tasks use psycho-physiological signals from subjects viewing the stimuli of interest, e.g., a video advertisement, as the system inputs. In essence, this means that the machine learns to label a proxy signal rather than the stimuli itself. In this thesis, with the rise of strong Computer Vision and Multimedia techniques, we focus on the learning to label the stimuli directly without a human subject provided biometric proxy signal (except in the unique circumstances of Chapter 7). This shift toward learning from the stimuli directly is important because it allows us to scale up with much greater ease given that biometric measurement acquisition is both low-throughput and somewhat invasive while stimuli are often readily available. In addition, moving toward learning directly from the stimuli will allow researchers to precisely determine which low-level features in the stimuli are actually coupled with affect states, e.g., which set of frames caused viewer discomfort rather a broad sense that a video was discomforting. In Part I of this thesis, we illustrate an emotion prediction task with a psychology-grounded affect representation. In particular, in Chapter 3, we develop a prediction task over semantic emotional classes, e.g., 'sad,' 'happy' and 'angry,' using animated image sequences given annotations from over 2.5 million users. Subsequently, in Part II, we develop visual sentiment and adjective-based semantics models from million-scale digital imagery mined from a social multimedia platform.
Mid-level Representations for Visual Affect:
While discrete semantic emotions and sentiment are classical representations of affect with decades of psychology grounding, the interdisciplinary nature of Affective Computing, now only about two decades old, allows for new avenues of representation. Mid-level representations have been proposed in numerous Computer Vision and Multimedia problems as an intermediary, and often more computable, step toward bridging the semantic gap between low-level system inputs and high-level label semantic abstractions. In Part II, inspired by this work, we adapt it for vision-based Affective Computing and adopt a semantic construct called adjective-noun pairs. Specifically, in Chapter 4, we explore the use of such adjective-noun pairs in the context of a social multimedia platform and develop a multilingual visual sentiment ontology with over 15,000 affective mid-level visual concepts across 12 languages associated with over 7.3 million images and representations from over 235 countries, resulting in the largest affective digital image corpus in both depth and breadth to date. In Chapter 5, we develop computational methods to predict such adjective-noun pairs and also explore their usefulness in traditional sentiment analysis but with a previously unexplored cross-lingual perspective. And in Chapter 6, we propose a new learning setting called 'cross-residual learning' building off recent successes in deep neural networks, and specifically, in residual learning; we show that cross-residual learning can be used effectively to jointly learn across even multiple related tasks in object detection (noun), more traditional affect modeling (adjectives), and affective mid-level representations (adjective-noun pairs), giving us a framework for better grounding the adjective-noun pair bridge in both vision and affect simultaneously
Information-theoretic environment modeling for mobile robot localization
To enhance robotic computational efficiency without degenerating accuracy, it is imperative to fit the right and exact amount of information in its simplest form to the investigated task. This thesis conforms to this reasoning in environment model building and robot localization. It puts forth an approach towards building maps and localizing a mobile robot efficiently with respect to unknown, unstructured and moderately dynamic environments. For this, the environment is modeled on an information-theoretic basis, more specifically in terms of its transmission property. Subsequently, the presented environment model, which does not specifically adhere to classical geometric modeling, succeeds in solving the environment disambiguation effectively.
The proposed solution lays out a two-level hierarchical structure for localization. The structure makes use of extracted features, which are stored in two different resolutions in a single hybrid feature-map. This enables dual coarse-topological and fine-geometric localization modalities.
The first level in the hierarchy describes the environment topologically, where a defined set of places is described by a probabilistic feature representation. A conditional entropy-based criterion is proposed to quantify the transinformation between the feature and the place domains. This criterion provides a double benefit of pruning the large dimensional feature space, and at the same time selecting the best discriminative features that overcome environment aliasing problems. Features with the highest transinformation are filtered and compressed to form a coarse resolution feature-map (codebook). Localization at this level is conducted through place matching.
In the second level of the hierarchy, the map is viewed in high-resolution, as consisting of non-compressed entropy-processed features. These features are additionally tagged with their position information. Given the identified topological place provided by the first level, fine localization corresponding to the second level is executed using feature triangulation. To enhance the triangulation accuracy, redundant features are used and two metric evaluating criteria are employ-ed; one for dynamic features and mismatches detection, and another for feature selection.
The proposed approach and methods have been tested in realistic indoor environments using a vision sensor and the Scale Invariant Feature Transform local feature extraction. Through experiments, it is demonstrated that an information-theoretic modeling approach is highly efficient in attaining combined accuracy and computational efficiency performances for localization. It has also been proven that the approach is capable of modeling environments with a high degree of unstructuredness, perceptual aliasing, and dynamic variations (illumination conditions; scene dynamics). The merit of employing this modeling type is that environment features are evaluated quantitatively, while at the same time qualitative conclusions are generated about feature selection and performance in a robot localization task. In this way, the accuracy of localization can be adapted in accordance with the available resources.
The experimental results also show that the hybrid topological-metric map provides sufficient information to localize a mobile robot on two scales, independent of the robot motion model. The codebook exhibits fast and accurate topological localization at significant compression ratios. The hierarchical localization framework demonstrates robustness and optimized space and time complexities. This, in turn, provides scalability to large environments application and real-time employment adequacies
Los documentos electrónicos : modelización y consulta de documentos estructurados bajo un ambiente orientado a objetos
[Tesis] (Maestría en Informática Administrativa) U.A.N.L.UANLhttp://www.uanl.mx
Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations
The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed, in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at fixed time, to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernov
Toward a Model of Intercultural Warrant: A Case of the Korean Decimal Classification\u27s Cross-cultural Adaptation of the Dewey Decimal Classification
I examined the Korean Decimal Classification (KDC)\u27s adaptation of the Dewey Decimal Classification (DDC) by comparing the two systems. This case manifests the sociocultural influences on KOSs in a cross-cultural context. I focused my analysis on the changes resulting from the meeting of the two cultures, answering the main research question: “How does KDC adapt DDC in terms of underlying sociocultural perspectives in a classificatory form?” I took a comparative approach and address the main research question in two phases. In Phase 1, quantities of class numbers were analyzed by edition and discipline. The main class with the most consistently high number of class numbers in DDC was the social sciences, while the main class with the most consistently high number of class numbers in KDC was technology. The two main classes are expected to differ in semantic contents or specificities. In Phase 2, patterns of adaptations were analyzed by examining the class numbers, captions, and hierarchical relations within the developed adaptation taxonomy. Implementing the taxonomy as a coding scheme brings two comparative features of classifications: 1) semantic contents determined by captions and quantity of subordinate numbers; and 2) structural arrangement determined by ranks, the broader category, presence and the order of subordinate numbers. Surveying proper forms of adaptation resulted in the development of an adaptation taxonomy that will serve as a framework to account for the conflicts between and harmonization of multiple social and cultural influences in knowledge structures. This study has ramifications in theoretical and empirical foundations for the development of “intercultural warrant” in KOSs
- …