
    Fuzzy Interval-Valued Multi Criteria Based Decision Making for Ranking Features in Multi-Modal 3D Face Recognition

    Soodamani Ramalingam, 'Fuzzy interval-valued multi criteria based decision making for ranking features in multi-modal 3D face recognition', Fuzzy Sets and Systems, in-press version available online 13 June 2017. This is an Open Access paper, made available under the Creative Commons license CC BY 4.0: https://creativecommons.org/licenses/by/4.0/

    This paper describes an application of multi-criteria decision making (MCDM) to the multi-modal fusion of features in a 3D face recognition system. A decision-making process is outlined that is based on the performance of multi-modal features in a face recognition task involving a set of 3D face databases. In particular, the fuzzy interval-valued MCDM technique TOPSIS is applied for ranking and deciding on the best choice of multi-modal features at the decision stage. It provides a formal mechanism for benchmarking their performance against a set of criteria, and it demonstrates that the approach scales as the number of multi-modal features grows. Peer reviewed.
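
    To make the ranking step concrete, the following is a minimal sketch of classic (crisp) TOPSIS in Python. The paper applies a fuzzy interval-valued variant; this sketch shows only the core ranking mechanics, and the feature-set scores, criteria, and weights are invented for illustration.

```python
# Minimal crisp TOPSIS sketch (the paper uses a fuzzy interval-valued
# variant). All numbers below are hypothetical.
import numpy as np

def topsis(matrix, weights, benefit):
    """Rank alternatives (rows) against criteria (columns).

    matrix  : alternatives x criteria performance scores
    weights : criterion weights, summing to 1
    benefit : True where larger is better, False where smaller is better
    """
    # Vector-normalise each criterion column, then apply the weights.
    norm = matrix / np.linalg.norm(matrix, axis=0)
    v = norm * weights
    # Ideal best/worst per criterion depend on the criterion's direction.
    best = np.where(benefit, v.max(axis=0), v.min(axis=0))
    worst = np.where(benefit, v.min(axis=0), v.max(axis=0))
    # Euclidean separation from the ideal best and worst points.
    d_best = np.linalg.norm(v - best, axis=1)
    d_worst = np.linalg.norm(v - worst, axis=1)
    # Relative closeness: 1 = ideal, 0 = anti-ideal.
    return d_worst / (d_best + d_worst)

# Hypothetical scores of three multi-modal feature sets on three criteria
# (e.g. recognition rate, robustness, cost -- cost is to be minimised).
scores = np.array([[0.92, 0.80, 3.0],
                   [0.88, 0.85, 2.0],
                   [0.95, 0.70, 4.0]])
closeness = topsis(scores, np.array([0.5, 0.3, 0.2]),
                   np.array([True, True, False]))
print(np.argsort(closeness)[::-1])  # feature-set indices, best first
```

    The closeness scores order the alternatives; broadly, the fuzzy interval-valued version replaces the crisp entries with intervals and adapts the normalisation and distance computations accordingly.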

Proceeding of the 2nd International Seminar on Linguistics


    Robust Modeling of Epistemic Mental States

    This work identifies and advances research challenges in the analysis of facial features and their temporal dynamics with epistemic mental states in dyadic conversations. The epistemic states considered are Agreement, Concentration, Thoughtful, Certain, and Interest. In this paper, we perform a number of statistical analyses and simulations to identify the relationship between facial features and epistemic states. Non-linear relations are found to be more prevalent, while temporal features derived from the original facial features demonstrate a strong correlation with intensity changes. We then propose a novel prediction framework that takes facial features and their non-linear relation scores as input and predicts different epistemic states in videos. Prediction of the epistemic states is boosted when the classification of emotion-change regions, such as rising, falling, or steady-state, is incorporated with the temporal features. The proposed predictive models predict the epistemic states with significantly improved accuracy: the correlation coefficient (CoERR) is 0.827 for Agreement, 0.901 for Concentration, 0.794 for Thoughtful, 0.854 for Certain, and 0.913 for Interest. Comment: Accepted for publication in Multimedia Tools and Applications, Special Issue: Socio-Affective Technologies.
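
    As a hedged illustration of the kind of pipeline the abstract describes, the sketch below derives temporal delta features from facial-feature tracks, adds a rising/falling/steady region label, and scores a regressor by correlation on held-out frames. The feature dimensions, threshold, synthetic data, and model choice are assumptions, not the authors' implementation.

```python
# Illustrative pipeline: facial features + temporal deltas + region label
# -> regressor -> correlation score. Data is synthetic; the target is a toy
# non-linear combination standing in for an annotated epistemic state.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
frames, n_feats = 500, 17            # e.g. 17 facial action-unit intensities
face = rng.random((frames, n_feats))
agreement = face[:, 0] * face[:, 3] + 0.1 * rng.random(frames)  # toy target

# Temporal features: first difference of each facial feature over time.
delta = np.vstack([np.zeros((1, n_feats)), np.diff(face, axis=0)])
# Region label per frame, from the first feature's delta for simplicity:
# rising (+1), falling (-1), steady (0) under an assumed 0.05 threshold.
region = np.sign(np.where(np.abs(delta[:, 0]) < 0.05, 0.0, delta[:, 0]))

X = np.hstack([face, delta, region[:, None]])
split = int(0.8 * frames)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:split], agreement[:split])
pred = model.predict(X[split:])
r, _ = pearsonr(pred, agreement[split:])
print(f"correlation on held-out frames: {r:.3f}")
```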

    LANGUAGE AND CIVILIZATION: PROCEEDING OF THE 2nd INTERNATIONAL SEMINAR ON LINGUISTICS


    PROCEEDING THE 2nd INTERNATIONAL SEMINAR ON LINGUISTICS (ISOL-2): Language and Civilization

    ISOL is a biennial international seminar held by the Linguistics Graduate Program of the Faculty of Humanities, Andalas University, in collaboration with the Linguistic Society of Indonesia (MLI), Unand Chapter. ISOL aims to provide a discussion platform for linguists and language observers across Indonesia; its main objective is to enhance the exchange of research and new approaches in language studies, and the seminar is open to interested people from outside Indonesia. The theme of the 2nd ISOL is Language and Civilization. Civilization is the process by which a society or place reaches an advanced stage of social development and organization; it is also defined as the society, culture, and way of life of a particular area. Over time, the word civilization has come to imply something beyond organization: it refers to a particular shared way of thinking about the world, as well as a reflection on that world in art, literature, drama, and a host of other cultural happenings. A civilization is any complex state society characterized by urban development, social stratification, symbolic forms of communication, and a perceived separation from and domination over the natural environment. To advance civilization is to construct a new social reality, which emerges through language; in other words, social reality is the operational expression of words and the meanings of them that society has agreed upon. Language is itself a social construct – a component of social reality. Thus, like all social constructs and conventions, it can be changed.

    Conversational artificial intelligence - demystifying statistical vs linguistic NLP solutions

    This paper aims to demystify the hype and attention around chatbots and their association with conversational artificial intelligence. Both are slowly emerging as a real presence in our lives, driven by impressive technological developments in machine learning, deep learning, and natural language understanding solutions. However, what is under the hood, and how far and to what extent can chatbot/conversational artificial intelligence solutions work – that is our question. Natural language is the most easily understood knowledge representation for people, but certainly not the best for computers, because of its inherently ambiguous, complex, and dynamic nature. We critique the knowledge representation of heavily statistical chatbot solutions against linguistic alternatives. In order to react intelligently to the user, natural language solutions must critically consider other factors such as context, memory, intelligent understanding, previous experience, and personalized knowledge of the user. We delve into the spectrum of conversational interfaces and focus on a strong artificial intelligence concept. This is explored via a text-based conversational software agent with a deep strategic role: to hold a conversation and enable the mechanisms needed to plan, to decide what to do next, and to manage the dialogue to achieve a goal. To demonstrate this, a deeply linguistically aware and knowledge-aware text-based conversational agent (LING-CSA) presents a proof of concept of a non-statistical conversational AI solution.
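
    The goal-driven dialogue management contrasted here with statistical chatbots can be illustrated with a small frame-based sketch: the agent tracks which slots of a goal frame are filled and plans its next utterance accordingly. The slot names and turn handling below are hypothetical and are not LING-CSA's actual design.

```python
# Minimal frame-based dialogue manager: plan the next action from the
# current dialogue state rather than from a statistical response model.
GOAL_SLOTS = ["destination", "date", "passengers"]

def next_action(frame):
    """Decide what to do next given the partially filled goal frame."""
    missing = [s for s in GOAL_SLOTS if s not in frame]
    if missing:
        return f"ask({missing[0]})"     # plan: elicit the next missing slot
    return "confirm_and_execute()"      # goal frame complete

frame = {}
for slot, value in [("destination", "Bradford"), ("date", "Friday"),
                    ("passengers", 2)]:
    print("agent:", next_action(frame))
    frame[slot] = value                 # stand-in for linguistic analysis
print("agent:", next_action(frame))
```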

Deep Hypernetworks for Learning from Dynamic Multimodal Data (동적 멀티모달 데이터 학습을 위한 심층 하이퍼네트워크)

    Doctoral dissertation, Department of Electrical and Computer Engineering, Seoul National University Graduate School, February 2015 (advisor: Byoung-Tak Zhang).

    Recent advancements in information and communication technology have led to an explosive increase of data. In particular, unlike traditional data, which are structured and unimodal, recent data generated from dynamic environments are characterized by high dimensionality, multimodality, and structurelessness, as well as huge scale. Learning from non-stationary multimodal data is essential for solving many difficult problems in artificial intelligence. However, despite many successful reports, existing machine learning methods have mainly focused on solving practical problems represented by large-scale but static databases, such as image classification, tagging, and retrieval. Hypernetworks are a probabilistic graphical model representing an empirical distribution, using a hypergraph structure that is a large collection of hyperedges encoding the associations among variables. This representation makes the model suitable for characterizing complex relationships between features with a population of building blocks. However, since a hypernetwork is represented by a huge combinatorial feature space, the model requires a large number of hyperedges to handle multimodal large-scale data and thus faces a scalability problem. In this dissertation, we propose a deep architecture of hypernetworks, deep hypernetworks, to deal with this scalability issue when learning from multimodal data with non-stationary properties, such as videos. Deep hypernetworks handle the issue through abstraction at multiple levels, using a hierarchy of multiple hypergraphs. We use a stochastic method based on Monte-Carlo simulation, a graph MC, to efficiently construct hypergraphs representing the empirical distribution of the observed data (a minimal sketch of this sampling idea follows the table of contents below). The structure of a deep hypernetwork continuously changes as learning proceeds, and this flexibility contrasts with other deep learning models. The proposed model learns incrementally from the data, thus handling non-stationary properties such as concept drift. The abstract representations in the learned models play the role of multimodal knowledge of the data, which is used for content-aware crossmodal transformation, including vision-language conversion. We view vision-language conversion as machine translation, and thus formulate vision-language translation in terms of statistical machine translation. Since knowledge of the video stories is used for translation, we call this story-aware vision-language translation. We evaluate deep hypernetworks on large-scale vision-language multimodal data, including benchmark datasets and cartoon video series. The experimental results show that deep hypernetworks effectively represent visual-linguistic information abstracted at multiple levels of the data contents, as well as the associations between vision and language. We explain how the introduction of a hierarchy deals with scalability and non-stationary properties. In addition, we present story-aware vision-language translation on cartoon videos by generating scene images from sentences and descriptive subtitles from scene images.
    Furthermore, we discuss the implications of our model for lifelong learning and directions for improvement toward achieving human-level artificial intelligence.

    Contents:
    1 Introduction
    1.1 Background and Motivation
    1.2 Problems to be Addressed
    1.3 The Proposed Approach and its Contribution
    1.4 Organization of the Dissertation
    2 Related Work
    2.1 Multimodal Learning
    2.2 Models for Learning from Multimodal Data
    2.2.1 Topic Model-Based Multimodal Learning
    2.2.2 Deep Network-Based Multimodal Learning
    2.3 Higher-Order Graphical Models
    2.3.1 Hypernetwork Models
    2.3.2 Bayesian Evolutionary Learning of Hypernetworks
    3 Multimodal Hypernetworks for Text-to-Image Retrieval
    3.1 Overview
    3.2 Hypernetworks for Multimodal Associations
    3.2.1 Multimodal Hypernetworks
    3.2.2 Incremental Learning of Multimodal Hypernetworks
    3.3 Text-to-Image Crossmodal Inference
    3.3.1 Representation of Textual-Visual Data
    3.3.2 Text-to-Image Query Expansion
    3.4 Text-to-Image Retrieval via Multimodal Hypernetworks
    3.4.1 Data and Experimental Settings
    3.4.2 Text-to-Image Retrieval Performance
    3.4.3 Incremental Learning for Text-to-Image Retrieval
    3.5 Summary
    4 Deep Hypernetworks for Multimodal Concept Learning from Cartoon Videos
    4.1 Overview
    4.2 Visual-Linguistic Concept Representation of Cartoon Videos
    4.3 Deep Hypernetworks for Modeling Visual-Linguistic Concepts
    4.3.1 Sparse Population Coding
    4.3.2 Deep Hypernetworks for Concept Hierarchies
    4.3.3 Implication of Deep Hypernetworks on Cognitive Modeling
    4.4 Learning of Deep Hypernetworks
    4.4.1 Problem Space of Deep Hypernetworks
    4.4.2 Graph Monte-Carlo Simulation
    4.4.3 Learning of Concept Layers
    4.4.4 Incremental Concept Construction
    4.5 Incremental Concept Construction from Cartoon Videos
    4.5.1 Data Description and Parameter Setup
    4.5.2 Concept Representation and Development
    4.5.3 Character Classification via Concept Learning
    4.5.4 Vision-Language Conversion via Concept Learning
    4.6 Summary
    5 Story-aware Vision-Language Translation using Deep Concept Hierarchies
    5.1 Overview
    5.2 Vision-Language Conversion as a Machine Translation
    5.2.1 Statistical Machine Translation
    5.2.2 Vision-Language Translation
    5.3 Story-aware Vision-Language Translation using Deep Concept Hierarchies
    5.3.1 Story-aware Vision-Language Translation
    5.3.2 Vision-to-Language Translation
    5.3.3 Language-to-Vision Translation
    5.4 Story-aware Vision-Language Translation on Cartoon Videos
    5.4.1 Data and Experimental Setting
    5.4.2 Scene-to-Sentence Generation
    5.4.3 Sentence-to-Scene Generation
    5.4.4 Visual-Linguistic Story Summarization of Cartoon Videos
    5.5 Summary
    6 Concluding Remarks
    6.1 Summary of the Dissertation
    6.2 Directions for Further Research
    Bibliography
    Abstract in Korean (한글초록)
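
    As referenced in the abstract above, here is a minimal sketch of the hyperedge-sampling idea behind hypernetwork learning: the empirical distribution is approximated by a large population of small random subsets of variables (hyperedges) drawn from observed examples by Monte-Carlo sampling. The toy corpus, hyperedge order, and sample counts are invented for illustration and do not reproduce the dissertation's graph MC procedure.

```python
# Toy hypernetwork construction: sample order-k hyperedges from multimodal
# examples; hyperedge frequency in the population approximates association
# strength between (cross-modal) variables.
import random
from collections import Counter

def sample_hyperedges(example, k, n_samples, rng):
    """Draw n_samples hyperedges of order k from one example's variables."""
    variables = list(example.items())
    return [frozenset(rng.sample(variables, k)) for _ in range(n_samples)]

rng = random.Random(0)
corpus = [
    {"vision:red": 1, "vision:round": 1, "text:apple": 1, "text:eat": 1},
    {"vision:red": 1, "vision:long": 1, "text:bus": 1, "text:ride": 1},
]
population = Counter()
for example in corpus:
    population.update(sample_hyperedges(example, k=2, n_samples=200, rng=rng))

# Most frequent hyperedges, i.e. the strongest observed associations.
for edge, count in population.most_common(3):
    print(sorted(dict(edge)), count)
```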

    Linguistic and Structural Basis of Engineering Design Knowledge

    Artefact descriptions are the primary carriers of engineering design knowledge, which is both an outcome and a driver of the design process. While an artefact could be described in different connotations, the design process requires a description to embody engineering design knowledge, which is expressed in text through the intricate placement of entities and relationships. As large language models learn from all kinds of text merely as sequences of characters/tokens, they are yet to generate text that embodies explicit engineering design facts. Existing ontological design theories are unlikely to guide large language models, whose applications are currently limited to ideation and learning purposes. In this article, we explicate engineering design knowledge as knowledge graphs from a large sample of 33,881 patent documents. We examine the constituents of these knowledge graphs to understand the linguistic and structural basis of engineering design knowledge. In terms of the linguistic basis, we observe that entities and relationships can be generalised to 64 and 24 linguistic syntaxes, respectively. While relationships mainly capture attributes ('of'), structure ('in', 'with'), purpose ('to', 'for'), hierarchy ('include'), exemplification ('such as'), and behaviour ('to', 'from'), hierarchical relationships can specifically be identified using 75 unique syntaxes. To understand the structural basis, we draw inspiration from various studies of biological/ecological networks and discover motifs in the patent knowledge graphs. We identify four 3-node and four 4-node patterns that can further be converged and simplified into sequence [->...->] and aggregation [->...] patterns. Expected to guide large-language-model-based design tools, we propose a few regulatory precepts for concretising abstract entities and relationships within subgraphs, while explicating hierarchical structures.
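
    As a hedged illustration of the motif analysis, the sketch below represents design facts as directed (head, relation, tail) triples and counts two simple 3-node patterns: sequence (a -> b -> c) and aggregation (a -> c <- b). The triples are invented; the paper mines its graphs from the 33,881 patent documents.

```python
# Count sequence and aggregation 3-node motifs in a tiny triple store.
from itertools import combinations

triples = [
    ("system", "include", "sensor"),
    ("sensor", "with", "housing"),
    ("controller", "with", "housing"),
    ("system", "include", "controller"),
]

# Successor and predecessor maps over the directed graph of triples.
succ, preds = {}, {}
for h, _, t in triples:
    succ.setdefault(h, set()).add(t)
    preds.setdefault(t, set()).add(h)

# Sequence motif: a -> b -> c over three distinct nodes.
sequence = sum(1 for a in succ for b in succ.get(a, ())
               for c in succ.get(b, ()) if c not in (a, b))
# Aggregation motif: two distinct heads pointing at the same tail.
aggregation = sum(len(list(combinations(ps, 2))) for ps in preds.values())

print("sequence motifs:", sequence, "| aggregation motifs:", aggregation)
```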