576 research outputs found

    A Graph Theoretic Approach for Object Shape Representation in Compositional Hierarchies Using a Hybrid Generative-Descriptive Model

    Full text link
    A graph theoretic approach is proposed for object shape representation in a hierarchical compositional architecture called Compositional Hierarchy of Parts (CHOP). In the proposed approach, vocabulary learning is performed using a hybrid generative-descriptive model. First, statistical relationships between parts are learned using a Minimum Conditional Entropy Clustering algorithm. Then, selection of descriptive parts is defined as a frequent subgraph discovery problem, and solved using a Minimum Description Length (MDL) principle. Finally, part compositions are constructed by compressing the internal data representation with discovered substructures. Shape representation and computational complexity properties of the proposed approach and algorithms are examined using six benchmark two-dimensional shape image datasets. Experiments show that CHOP can employ part shareability and indexing mechanisms for fast inference of part compositions using learned shape vocabularies. Additionally, CHOP provides better shape retrieval performance than the state-of-the-art shape retrieval methods.Comment: Paper : 17 pages. 13th European Conference on Computer Vision (ECCV 2014), Zurich, Switzerland, September 6-12, 2014, Proceedings, Part III, pp 566-581. Supplementary material can be downloaded from http://link.springer.com/content/esm/chp:10.1007/978-3-319-10578-9_37/file/MediaObjects/978-3-319-10578-9_37_MOESM1_ESM.pd

    Learning deep representations for robotics applications

    Get PDF
    In this thesis, two hierarchical learning representations are explored in computer vision tasks. First, a novel graph theoretic method for statistical shape analysis, called Compositional Hierarchy of Parts (CHOP), was proposed. The method utilises line-based features as its building blocks for the representation of shapes. A deep, multi-layer vocabulary is learned by recursively compressing this initial representation. The key contribution of this work is to formulate layerwise learning as a frequent sub-graph discovery problem, solved using the Minimum Description Length (MDL) principle. The experiments show that CHOP employs part shareability and data compression features, and yields state-of- the-art shape retrieval performance on 3 benchmark datasets. In the second part of the thesis, a hybrid generative-evaluative method was used to solve the dexterous grasping problem. This approach combines a learned dexterous grasp generation model with two novel evaluative models based on Convolutional Neural Networks (CNNs). The data- efficient generative method learns from a human demonstrator. The evaluative models are trained in simulation, using the grasps proposed by the generative approach and the depth images of the objects from a single view. On a real grasp dataset of 49 scenes with previously unseen objects, the proposed hybrid architecture outperforms the purely generative method, with a grasp success rate of 77.7% to 57.1%. The thesis concludes by comparing the two families of deep architectures, compositional hierarchies and DNNs, providing insights on their strengths and weaknesses

    Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

    Full text link
    Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design computer agents with intelligent capabilities such as understanding, reasoning, and learning through integrating multiple communicative modalities, including linguistic, acoustic, visual, tactile, and physiological messages. With the recent interest in video understanding, embodied autonomous agents, text-to-image generation, and multisensor fusion in application domains such as healthcare and robotics, multimodal machine learning has brought unique computational and theoretical challenges to the machine learning community given the heterogeneity of data sources and the interconnections often found between modalities. However, the breadth of progress in multimodal research has made it difficult to identify the common themes and open questions in the field. By synthesizing a broad range of application domains and theoretical frameworks from both historical and recent perspectives, this paper is designed to provide an overview of the computational and theoretical foundations of multimodal machine learning. We start by defining two key principles of modality heterogeneity and interconnections that have driven subsequent innovations, and propose a taxonomy of 6 core technical challenges: representation, alignment, reasoning, generation, transference, and quantification covering historical and recent trends. Recent technical achievements will be presented through the lens of this taxonomy, allowing researchers to understand the similarities and differences across new approaches. We end by motivating several open problems for future research as identified by our taxonomy

    A Language-centered Approach to support environmental modeling with Cellular Automata

    Get PDF
    Die Anwendung von Methodiken und Technologien aus dem Bereich der Softwaretechnik auf den Bereich der Umweltmodellierung ist eine gemeinhin akzeptierte Vorgehensweise. Im Rahmen der "modellgetriebenen Entwicklung"(MDE, model-driven engineering) werden Technologien entwickelt, die darauf abzielen, Softwaresysteme vorwiegend auf Basis von im Vergleich zu Programmquelltexten relativ abstrakten Modellen zu entwickeln. Ein wesentlicher Bestandteil von MDE sind Techniken zur effizienten Entwicklung von "domänenspezifischen Sprachen"( DSL, domain-specific language), die auf Sprachmetamodellen beruhen. Die vorliegende Arbeit zeigt, wie modellgetriebene Entwicklung, und insbesondere die metamodellbasierte Beschreibung von DSLs, darüber hinaus Aspekte der Pragmatik unterstützen kann, deren Relevanz im erkenntnistheoretischen und kognitiven Hintergrund wissenschaftlichen Forschens begründet wird. Hierzu wird vor dem Hintergrund der Erkenntnisse des "modellbasierten Forschens"(model-based science und model-based reasoning) gezeigt, wie insbesondere durch Metamodelle beschriebene DSLs Möglichkeiten bieten, entsprechende pragmatische Aspekte besonders zu berücksichtigen, indem sie als Werkzeug zur Erkenntnisgewinnung aufgefasst werden. Dies ist v.a. im Kontext großer Unsicherheiten, wie sie für weite Teile der Umweltmodellierung charakterisierend sind, von grundsätzlicher Bedeutung. Die Formulierung eines sprachzentrierten Ansatzes (LCA, language-centered approach) für die Werkzeugunterstützung konkretisiert die genannten Aspekte und bildet die Basis für eine beispielhafte Implementierung eines Werkzeuges mit einer DSL für die Beschreibung von Zellulären Automaten (ZA) für die Umweltmodellierung. Anwendungsfälle belegen die Verwendbarkeit von ECAL und der entsprechenden metamodellbasierten Werkzeugimplementierung.The application of methods and technologies of software engineering to environmental modeling and simulation (EMS) is common, since both areas share basic issues of software development and digital simulation. Recent developments within the context of "Model-driven Engineering" (MDE) aim at supporting the development of software systems at the base of relatively abstract models as opposed to programming language code. A basic ingredient of MDE is the development of methods that allow the efficient development of "domain-specific languages" (DSL), in particular at the base of language metamodels. This thesis shows how MDE and language metamodeling in particular, may support pragmatic aspects that reflect epistemic and cognitive aspects of scientific investigations. For this, DSLs and language metamodeling in particular are set into the context of "model-based science" and "model-based reasoning". It is shown that the specific properties of metamodel-based DSLs may be used to support those properties, in particular transparency, which are of particular relevance against the background of uncertainty, that is a characterizing property of EMS. The findings are the base for the formulation of an corresponding specific metamodel- based approach for the provision of modeling tools for EMS (Language-centered Approach, LCA), which has been implemented (modeling tool ECA-EMS), including a new DSL for CA modeling for EMS (ECAL). At the base of this implementation, the applicability of this approach is shown

    Abstract Representation of Music: A Type-Based Knowledge Representation Framework

    Get PDF
    The wholesale efficacy of computer-based music research is contingent on the sharing and reuse of information and analysis methods amongst researchers across the constituent disciplines. However, computer systems for the analysis and manipulation of musical data are generally not interoperable. Knowledge representation has been extensively used in the domain of music to harness the benefits of formal conceptual modelling combined with logic based automated inference. However, the available knowledge representation languages lack sufficient logical expressivity to support sophisticated musicological concepts. In this thesis we present a type-based framework for abstract representation of musical knowledge. The core of the framework is a multiple-hierarchical information model called a constituent structure, which accommodates diverse kinds of musical information. The framework includes a specification logic for expressing formal descriptions of the components of the representation. We give a formal specification for the framework in the Calculus of Inductive Constructions, an expressive logical language which lends itself to the abstract specification of data types and information structures. We give an implementation of our framework using Semantic Web ontologies and JavaScript. The ontologies capture the core structural aspects of the representation, while the JavaScript tools implement the functionality of the abstract specification. We describe how our framework supports three music analysis tasks: pattern search and discovery, paradigmatic analysis and hierarchical set-class analysis, detailing how constituent structures are used to represent both the input and output of these analyses including sophisticated structural annotations. We present a simple demonstrator application, built with the JavaScript tools, which performs simple analysis and visualisation of linked data documents structured by the ontologies. We conclude with a summary of the contributions of the thesis and a discussion of the type-based approach to knowledge representation, as well as a number of avenues for future work in this area

    The Road to General Intelligence

    Get PDF
    Humans have always dreamed of automating laborious physical and intellectual tasks, but the latter has proved more elusive than naively suspected. Seven decades of systematic study of Artificial Intelligence have witnessed cycles of hubris and despair. The successful realization of General Intelligence (evidenced by the kind of cross-domain flexibility enjoyed by humans) will spawn an industry worth billions and transform the range of viable automation tasks.The recent notable successes of Machine Learning has lead to conjecture that it might be the appropriate technology for delivering General Intelligence. In this book, we argue that the framework of machine learning is fundamentally at odds with any reasonable notion of intelligence and that essential insights from previous decades of AI research are being forgotten. We claim that a fundamental change in perspective is required, mirroring that which took place in the philosophy of science in the mid 20th century. We propose a framework for General Intelligence, together with a reference architecture that emphasizes the need for anytime bounded rationality and a situated denotational semantics. We given necessary emphasis to compositional reasoning, with the required compositionality being provided via principled symbolic-numeric inference mechanisms based on universal constructions from category theory. • Details the pragmatic requirements for real-world General Intelligence. • Describes how machine learning fails to meet these requirements. • Provides a philosophical basis for the proposed approach. • Provides mathematical detail for a reference architecture. • Describes a research program intended to address issues of concern in contemporary AI. The book includes an extensive bibliography, with ~400 entries covering the history of AI and many related areas of computer science and mathematics.The target audience is the entire gamut of Artificial Intelligence/Machine Learning researchers and industrial practitioners. There are a mixture of descriptive and rigorous sections, according to the nature of the topic. Undergraduate mathematics is in general sufficient. Familiarity with category theory is advantageous for a complete understanding of the more advanced sections, but these may be skipped by the reader who desires an overall picture of the essential concepts This is an open access book

    A Defense of Pure Connectionism

    Full text link
    Connectionism is an approach to neural-networks-based cognitive modeling that encompasses the recent deep learning movement in artificial intelligence. It came of age in the 1980s, with its roots in cybernetics and earlier attempts to model the brain as a system of simple parallel processors. Connectionist models center on statistical inference within neural networks with empirically learnable parameters, which can be represented as graphical models. More recent approaches focus on learning and inference within hierarchical generative models. Contra influential and ongoing critiques, I argue in this dissertation that the connectionist approach to cognitive science possesses in principle (and, as is becoming increasingly clear, in practice) the resources to model even the most rich and distinctly human cognitive capacities, such as abstract, conceptual thought and natural language comprehension and production. Consonant with much previous philosophical work on connectionism, I argue that a core principle—that proximal representations in a vector space have similar semantic values—is the key to a successful connectionist account of the systematicity and productivity of thought, language, and other core cognitive phenomena. My work here differs from preceding work in philosophy in several respects: (1) I compare a wide variety of connectionist responses to the systematicity challenge and isolate two main strands that are both historically important and reflected in ongoing work today: (a) vector symbolic architectures and (b) (compositional) vector space semantic models; (2) I consider very recent applications of these approaches, including their deployment on large-scale machine learning tasks such as machine translation; (3) I argue, again on the basis mostly of recent developments, for a continuity in representation and processing across natural language, image processing and other domains; (4) I explicitly link broad, abstract features of connectionist representation to recent proposals in cognitive science similar in spirit, such as hierarchical Bayesian and free energy minimization approaches, and offer a single rebuttal of criticisms of these related paradigms; (5) I critique recent alternative proposals that argue for a hybrid Classical (i.e. serial symbolic)/statistical model of mind; (6) I argue that defending the most plausible form of a connectionist cognitive architecture requires rethinking certain distinctions that have figured prominently in the history of the philosophy of mind and language, such as that between word- and phrase-level semantic content, and between inference and association
    • …
    corecore