
    Multi-view feature engineering and learning

    We frame the problem of local representation of imaging data as the computation of minimal sufficient statistics that are invariant to nuisance variability induced by viewpoint and illumination. We show that, under very stringent conditions, these are related to “feature descriptors” commonly used in Computer Vision. Such conditions can be relaxed if multiple views of the same scene are available. We propose a sampling-based and a point-estimate based approximation of such a representation, compared empirically on image-to-(multiple)image matching, for which we introduce a multi-view wide-baseline matching benchmark, consisting of a mixture of real and synthetic objects with ground truth camera motion and dense three-dimensional geometry.
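    As a rough illustration of the two approximations (not the paper's construction), the sketch below uses a toy gradient-orientation histogram as a stand-in descriptor: the sampling-based variant averages per-view descriptors, while the point-estimate variant computes one descriptor from a fused (mean) view. All names are hypothetical.

```python
# Illustrative only: contrasts the sampling-based aggregate (average of
# per-view descriptors) with the point-estimate aggregate (descriptor of
# the fused view). The histogram descriptor is a stand-in.
import numpy as np

def grad_orientation_hist(patch, bins=8):
    """Gradient-orientation histogram of one grayscale patch (toy descriptor)."""
    gy, gx = np.gradient(patch.astype(float))
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

def sampling_based(views):
    """Sampling-based approximation: average the per-view descriptors."""
    return np.mean([grad_orientation_hist(v) for v in views], axis=0)

def point_estimate(views):
    """Point-estimate approximation: one descriptor of the mean (fused) view."""
    return grad_orientation_hist(np.mean(views, axis=0))

rng = np.random.default_rng(0)
views = [rng.random((16, 16)) for _ in range(5)]  # stand-in registered patches
print(sampling_based(views))
print(point_estimate(views))
```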

    Multi-graph learning

    University of Technology Sydney. Faculty of Engineering and Information Technology.
    Multi-instance learning (MIL) is a special learning task where labels are only available for a bag of instances. Although MIL has been used for many applications, existing MIL algorithms cannot handle complex data objects, and all require that the instances inside each bag are represented as feature vectors (i.e. in an instance-feature format). In reality, many real-world objects are inherently complicated, and an object can be represented as multiple instances with dependency structures (i.e. graphs). Such dependency allows relationships between objects to play important roles, which, unfortunately, remain unaddressed in traditional instance-feature representations. Motivated by these challenges, this thesis formulates a new multi-graph learning paradigm for representing and classifying complicated objects. With the proposed multi-graph representation, the thesis systematically addresses several key learning tasks:
    Multi-Graph Learning: A graph bag contains one or multiple graphs, and each bag is labeled as either positive or negative. The aim of multi-graph learning is to build a learning model from a number of labeled training bags to predict previously unseen bags with maximum accuracy. To solve the problem, we propose two types of approaches: 1) a Multi-Graph Feature based Learning (gMGFL) algorithm that explores and selects an optimal set of subgraphs as features to transfer each bag into a single instance for further learning; and 2) a Boosting based Multi-Graph Classification framework (bMGC), which employs dynamic weight adjustment, at both graph and bag levels, to select one subgraph in each iteration to form a set of weak graph classifiers.
    Multi-Instance Multi-Graph Learning: A bag contains a number of instances and graphs in pairs, and the learning objective is to derive classification models from labeled bags, containing both instances and graphs, to predict previously unseen bags with maximum accuracy. We propose a Dual Embedding Multi-Instance Multi-Graph Learning (DE-MIMG) algorithm, which employs a dual embedding learning approach to (1) embed instance distributions into the informative subgraph discovery process, and (2) embed discovered subgraphs into the instance feature selection process.
    Positive and Unlabeled Multi-Graph Learning: The training set contains only positive and unlabeled bags, where labels are available for bags but not for the individual graphs inside them. This setting raises significant challenges because the bag-of-graph setting has no features available to directly represent graph data, and no negative bags exist for deriving discriminative classification models. To solve the challenge, we propose a puMGL learning framework which relies on two iteratively combined processes: (1) deriving features to represent graphs for learning; and (2) deriving discriminative models with only positive and unlabeled graph bags.
    Multi-Graph-View Learning: A multi-graph-view model utilizes graphs constructed from multiple graph-views to represent an object. We formulate a new multi-graph-view learning task for graph classification, where each object to be classified is represented by graphs under multiple graph-views. To solve the problem, we propose a Cross Graph-View Subgraph Feature based Learning (gCGVFL) algorithm that explores an optimal set of subgraph features across multiple graph-views. In addition, a bag-based multi-graph model is used to relax the labeling requirement to one label per graph bag, where each bag corresponds to one object. For learning classification models, we propose a multi-graph-view bag learning algorithm (MGVBL) that explores subgraphs from multiple graph-views. Experiments on real-world data validate and demonstrate the performance of the proposed methods for classifying complicated objects using multi-graph learning.
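    A minimal sketch of the subgraph-feature idea behind gMGFL follows, assuming the candidate subgraphs are already given (the thesis mines and selects them): each graph in a bag is tested for each candidate, and the bag is OR-aggregated into one binary instance so a standard classifier can be trained on bags. Helper names are illustrative.

```python
# Sketch of a gMGFL-style bag-to-instance mapping. Uses networkx's
# (node-induced) subgraph isomorphism test as the occurrence check.
import networkx as nx
from networkx.algorithms.isomorphism import GraphMatcher

def graph_features(g, candidates):
    """Binary vector: does each candidate occur in g as an induced subgraph?"""
    return [int(GraphMatcher(g, s).subgraph_is_isomorphic()) for s in candidates]

def bag_to_instance(bag, candidates):
    """OR-aggregate the graphs' feature vectors into one bag-level instance."""
    feats = [graph_features(g, candidates) for g in bag]
    return [max(col) for col in zip(*feats)]

triangle, path3 = nx.cycle_graph(3), nx.path_graph(3)  # toy candidates
bag = [nx.complete_graph(4), nx.path_graph(5)]         # one graph bag
print(bag_to_instance(bag, [triangle, path3]))         # -> [1, 1]
```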

    Multimodal human behavior analysis: Learning correlation and interaction across modalities

    Multimodal human behavior analysis is a challenging task due to the presence of complex nonlinear correlations and interactions across modalities. We present a novel approach to this problem based on Kernel Canonical Correlation Analysis (KCCA) and Multi-view Hidden Conditional Random Fields (MV-HCRF). Our approach uses a nonlinear kernel to map multimodal data to a high-dimensional feature space and finds a new projection of the data that maximizes the correlation across modalities. We use a multi-chain structured graphical model with disjoint sets of latent variables, one set per modality, to jointly learn both view-shared and view-specific sub-structures of the projected data, capturing interaction across modalities explicitly. We evaluate our approach on a task of agreement and disagreement recognition from nonverbal audio-visual cues using the Canal 9 dataset. Experimental results show that KCCA makes capturing nonlinear hidden dynamics easier and that MV-HCRF helps learn interactions across modalities. (Supported by the United States Office of Naval Research, Grant N000140910625; the National Science Foundation, Grants IIS-1118018 and IIS-1018055; and the United States Army Research, Development, and Engineering Command.)
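    For the KCCA step, a compact regularized kernel CCA in its dual form can be sketched as follows; this is a generic textbook formulation, not the authors' implementation, and the RBF kernel and regularizer are assumptions.

```python
# Regularized kernel CCA sketch: map two views through RBF kernels and find
# the dual directions whose kernel projections are maximally correlated.
import numpy as np
from scipy.linalg import eigh

def rbf(X, gamma=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def center(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(X, Y, reg=1e-3, gamma=1.0):
    Kx, Ky = center(rbf(X, gamma)), center(rbf(Y, gamma))
    n = Kx.shape[0]
    # Generalized eigenproblem for the leading canonical pair (dual form).
    A = np.block([[np.zeros((n, n)), Kx @ Ky],
                  [Ky @ Kx, np.zeros((n, n))]])
    B = np.block([[Kx @ Kx + reg * np.eye(n), np.zeros((n, n))],
                  [np.zeros((n, n)), Ky @ Ky + reg * np.eye(n)]])
    vals, vecs = eigh(A, B)
    alpha, beta = vecs[:n, -1], vecs[n:, -1]  # top-correlation pair
    return Kx @ alpha, Ky @ beta              # projected views

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Y = X @ rng.normal(size=(4, 3))               # toy correlated second view
px, py = kcca(X, Y)
print(np.corrcoef(px, py)[0, 1])              # high canonical correlation
```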

    Instance-based and feature-based classification enhancement for short & sparse texts

    University of Technology, Sydney. Faculty of Engineering and Information Technology.
    Short, sparse texts are becoming increasingly prevalent as a result of the growing popularity of social networking web sites, such as micro-blogs, Twitter and Flickr, and sites offering online product reviews. These short & sparse texts usually consist of a dozen or more words, or a few sentences, which we represent as a sparse document-term matrix. Compared to normal texts, short & sparse texts have three specific characteristics: (1) insufficient word co-occurrence to measure similarity, (2) low quality data resulting from spelling errors, acronyms and slang, and (3) data sparseness. Normal classification methods therefore fail to achieve the desired level of accuracy when classifying short & sparse text. In this thesis, we present a series of novel approaches to enhance the performance of short & sparse text classification. Most texts can be represented as a two-dimensional matrix, and we use the terms “instance” and “feature” to denote the “row” and “column” concepts respectively. Corresponding to the matrix's two dimensions, we design an instance-based and a feature-based framework to expand the rows/columns in the matrix:
    • For the instance-based framework, we extract an auxiliary dataset from an external online source (i.e. Wikipedia) with predefined class information, and integrate the target and auxiliary datasets with an instance-based transfer learning tool to enhance the classification performance of the target short text domain. Moreover, we propose a sampling framework to handle the challenge of low quality data in the auxiliary dataset.
    • For the feature-based framework, we infer two kinds of feature sets from the given short texts, and then combine them with a multi-view learning tool to enhance the classification performance. To handle the view disagreement challenge, we integrate a Bagging framework with multi-view learning.
    The aim of the proposed algorithms is to improve classification performance (i.e. accuracy). To evaluate the proposed algorithms, we test them on a variety of benchmark and real-world datasets, such as sentiment texts from Twitter, pre-processed 20 Newsgroups data, review texts for seminars, and search snippets. Moreover, we compare the algorithms with other benchmark algorithms on all datasets. The results of our experiments demonstrate that the accuracy of our proposed algorithms is superior to that of other similar algorithms.
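    A toy sketch of the feature-based framework under stated assumptions: word and character n-grams serve as the two inferred views, a bagged base learner per view dampens view disagreement, and predictions are fused by averaging probabilities. The views and fusion rule are illustrative, not the thesis's exact design.

```python
# Two-view bagging sketch for short texts (word vs. character n-grams).
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great phone", "love it", "works perfectly", "excellent value",
         "very happy", "good battery", "terrible battery", "waste of money",
         "broke quickly", "awful screen", "very disappointed", "bad support"]
labels = np.array([1] * 6 + [0] * 6)

views = [TfidfVectorizer(analyzer="word"),
         TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3))]
models = []
for v in views:
    X = v.fit_transform(texts)
    # Subsampling (not bootstrapping) keeps both classes in every subset here.
    m = BaggingClassifier(LogisticRegression(), n_estimators=5,
                          max_samples=0.75, bootstrap=False).fit(X, labels)
    models.append(m)

test = ["battery is terrible"]
probs = np.mean([m.predict_proba(v.transform(test))
                 for m, v in zip(models, views)], axis=0)
print(probs.argmax(axis=1))  # fused prediction across the two views
```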

    Applications of Multi-view Learning Approaches for Software Comprehension

    Program comprehension concerns the ability of an individual to understand an existing software system in order to extend or transform it. Software systems comprise data that are noisy and incomplete, which makes program understanding even more difficult. A software system consists of various views, including the module dependency graph, execution logs, evolutionary information, and the vocabulary used in the source code, that collectively define the system. Each of these views contains unique and complementary information, which together can describe the data more accurately. In this paper, we investigate various techniques for combining different sources of information to improve the performance of a program comprehension task. We employ state-of-the-art learning techniques to 1) find a suitable similarity function for each view, and 2) compare different multi-view learning techniques to decompose a software system into high-level units and give component-level recommendations for refactoring of the system, as well as cross-view source code search. The experiments conducted on 10 relatively large Java software systems show that by fusing knowledge from different views, we can guarantee a lower bound on the quality of the modularization and even improve upon it. We proceed by integrating different sources of information to give a set of high-level recommendations as to how to refactor the software system. Furthermore, we demonstrate how learning a joint subspace allows for performing cross-modal retrieval across views, yielding results that are better aligned with what the user intends by the query. The multi-view approaches outlined in this paper can be employed to address problems in software engineering that can be encoded as learning problems, such as software bug prediction and feature location.
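    One plausible reading of the fusion step is late fusion of per-view similarity matrices followed by spectral clustering into modules; the sketch below illustrates that pattern with synthetic similarities (the weights, views and clustering choice are assumptions, not the paper's exact pipeline).

```python
# Late fusion of per-view similarities (e.g. dependencies, co-change
# history, vocabulary), then spectral clustering into candidate modules.
import numpy as np
from sklearn.cluster import SpectralClustering

def fuse_and_cluster(view_sims, n_modules, weights=None):
    """Weighted-average the per-view similarities, then spectral-cluster."""
    weights = weights or [1.0 / len(view_sims)] * len(view_sims)
    fused = sum(w * S for w, S in zip(weights, view_sims))
    sc = SpectralClustering(n_clusters=n_modules, affinity="precomputed")
    return sc.fit_predict(fused)

rng = np.random.default_rng(1)
n = 12                                   # toy "classes" of a software system
views = []
for _ in range(3):                       # three synthetic views
    A = rng.random((n, n))
    views.append((A + A.T) / 2)          # symmetric similarity matrices
print(fuse_and_cluster(views, n_modules=3))  # module label per class
```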

    Probabilistic models for multi-view semi-supervised learning and coding

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 146-160). By C. Mario Christoudias.
    This thesis investigates the problem of classification from multiple noisy sensors or modalities. Examples include speech and gesture interfaces and multi-camera distributed sensor networks. Reliable recognition in such settings hinges upon the ability to learn accurate classification models in the face of limited supervision and to cope with the relatively large amount of potentially redundant information transmitted by each sensor or modality (i.e., view). We investigate and develop novel multi-view learning algorithms capable of learning from semi-supervised noisy sensor data, of automatically adapting to new users and working conditions, and of performing distributed feature selection on bandwidth-limited sensor networks. We propose probabilistic models built upon multi-view Gaussian Processes (GPs) for solving this class of problems, and demonstrate our approaches on audio-visual speech and gesture classification and multi-view object classification problems. Multi-modal tasks are good candidates for multi-view learning, since each modality provides a potentially redundant view to the learning algorithm. On audio-visual speech unit classification, and on user agreement recognition using spoken utterances and head gestures, we demonstrate that multi-modal co-training can be used to learn from only a few labeled examples in one or both of the audio-visual modalities. We also propose a co-adaptation algorithm, which adapts existing audio-visual classifiers to a particular user or noise condition by leveraging the redundancy in the unlabeled data. Existing methods typically assume constant per-channel noise models. In contrast, we develop co-training algorithms that are able to learn from noisy sensor data corrupted by complex per-sample noise processes, e.g., occlusion, common to multi-sensor classification problems. We propose a probabilistic heteroscedastic approach to co-training that simultaneously discovers the amount of noise on a per-sample basis while solving the classification task. This results in accurate performance in the presence of occlusion or other complex noise processes. We also investigate an extension of this idea for supervised multi-view learning, where we develop a Bayesian multiple kernel learning algorithm that can learn a local weighting over each view of the input space. We additionally consider the problem of distributed object recognition or indexing from multiple cameras, where the computational power available at each camera sensor is limited and communication between cameras is prohibitively expensive. In this scenario, it is desirable to avoid sending redundant visual features from multiple views. Traditional supervised feature selection approaches are inapplicable, as the class label is unknown at each camera. In this thesis, we propose an unsupervised multi-view feature selection algorithm based on a distributed coding approach. With our method, a Gaussian Process model of the joint view statistics is used at the receiver to obtain a joint encoding of the views without directly sharing information across encoders. We demonstrate our approach on recognition and indexing tasks with multi-view image databases and show that our method compares favorably to an independent encoding of the features from each camera.
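    The co-training idea can be sketched with the standard two-view loop (the thesis develops GP-based, heteroscedastic variants; this generic version is only for orientation): each view-specific classifier promotes its most confident unlabeled examples, with pseudo-labels, into the shared labeled set.

```python
# Generic co-training sketch: entries of y outside `labeled` are treated as
# unknown placeholders and are overwritten with pseudo-labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(Xa, Xb, y, labeled, rounds=5, per_round=2):
    labeled = list(labeled)
    pool = [i for i in range(len(y)) if i not in labeled]
    ca, cb = LogisticRegression(), LogisticRegression()
    for _ in range(rounds):
        if not pool:
            break
        ca.fit(Xa[labeled], y[labeled])
        cb.fit(Xb[labeled], y[labeled])
        for clf, X in ((ca, Xa), (cb, Xb)):
            proba = clf.predict_proba(X[pool])
            picks = np.argsort(proba.max(axis=1))[-per_round:]  # most confident
            for p in sorted(picks, reverse=True):
                idx = pool.pop(int(p))
                y[idx] = proba[p].argmax()   # promote pseudo-label
                labeled.append(idx)
    return ca, cb

rng = np.random.default_rng(0)
Xa = rng.normal(size=(40, 3))
Xb = Xa @ rng.normal(size=(3, 3)) + 0.1 * rng.normal(size=(40, 3))  # 2nd view
y = (Xa[:, 0] > 0).astype(int)
ca, cb = co_train(Xa, Xb, y.copy(), labeled=range(8))
print(ca.score(Xa, y), cb.score(Xb, y))
```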

    Bot-MGAT: A Transfer Learning Model Based on a Multi-View Graph Attention Network to Detect Social Bots

    Twitter, as a popular social network, has been targeted by different bot attacks. Detecting social bots is a challenging task, due to their evolving capacity to avoid detection. Extensive research efforts have proposed different techniques and approaches to solving this problem. Due to the scarcity of recently updated labeled data, the performance of detection systems degrades when exposed to a new dataset. Semi-supervised learning (SSL) techniques can therefore improve performance by using both labeled and unlabeled examples. In this paper, we propose a framework based on a multi-view graph attention mechanism with a transfer learning (TL) approach to predict social bots. We call the framework 'Bot-MGAT', which stands for bot multi-view graph attention network. The framework uses both labeled and unlabeled data, and relies on profile features to reduce the overhead of feature engineering. We conducted our experiments on a recent benchmark dataset that includes representative samples of social bots with graph structural information and profile features only, and applied cross-validation to reduce uncertainty in the model's performance. Bot-MGAT was evaluated against graph SSL techniques: single graph attention networks (GAT), graph convolutional networks (GCN), and relational graph convolutional networks (RGCN), and compared to related work in the field of bot detection. Bot-MGAT with TL outperformed these baselines, with an accuracy of 97.8%, an F1 score of 0.9842, and an MCC score of 0.9481.
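    As a schematic of multi-view graph attention fusion (an assumption about the general shape, not Bot-MGAT's published architecture), one GAT branch per relation view can run over shared profile features, with node embeddings mean-fused before classification; PyTorch Geometric's GATConv keeps this short.

```python
# Schematic multi-view GAT: one attention branch per view (e.g. followers,
# friends) over shared profile features, fused by averaging.
import torch
from torch import nn
from torch_geometric.nn import GATConv

class MultiViewGAT(nn.Module):
    def __init__(self, in_dim, hid, n_views, n_classes=2, heads=4):
        super().__init__()
        self.branches = nn.ModuleList(
            [GATConv(in_dim, hid, heads=heads) for _ in range(n_views)])
        self.out = nn.Linear(hid * heads, n_classes)

    def forward(self, x, edge_indices):
        # One attention pass per view, then mean-fuse the node embeddings.
        hs = [torch.relu(g(x, ei)) for g, ei in zip(self.branches, edge_indices)]
        return self.out(torch.stack(hs).mean(dim=0))

x = torch.randn(6, 8)                             # 6 accounts, 8 profile feats
followers = torch.tensor([[0, 1, 2], [1, 2, 3]])  # toy edge lists per view
friends = torch.tensor([[3, 4], [4, 5]])
logits = MultiViewGAT(8, 16, n_views=2)(x, [followers, friends])
print(logits.shape)                               # (6, 2) bot/human logits
```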

    Bandit learning for sequential decision making: a practical way to address the trade-off between exploration and exploitation

    University of Technology Sydney. Faculty of Engineering and Information Technology.
    Sequential decision making involves actively acquiring information and then making decisions over a large set of uncertain options, as in recommendation systems and on the Internet. It is challenging because the feedback is often only partially observed. In this thesis we propose new "bandit learning" algorithms, whose basic idea is to address the fundamental trade-off between exploration and exploitation in sequence; their goal is to maximize some objective when making decisions. We study several novel methodologies for different scenarios, such as social networks, multi-view learning, multi-task learning, repeated labeling and active learning. We formalize these adaptive problems as sequential decision making for different real applications, present new insights into these popular problems from the bandit perspective, and address the exploration-exploitation trade-off within a bandit framework.
    In particular, we introduce "networked bandits" to model multi-armed bandits with correlations, which exist in social networks. The networked bandit is a new model that considers a set of interrelated arms varying over time, where selecting an arm also invokes the related arms; the objective is still to obtain the best cumulative payoffs. We propose a method that considers both the selected arm and its relationships to the other arms, choosing an arm according to integrated confidence sets constructed from historical data.
    We study the problem of view selection in stream-based multi-view learning, where each view is obtained from a feature generator or source and is embedded in a reproducing kernel Hilbert space (RKHS). We propose an algorithm that selects a near-optimal subset of m views out of n views and then makes predictions based on that subset. To analyze this problem, we define the multi-view simple regret and derive an upper bound on the expected regret of our algorithm, which relies on the Rademacher complexity of the co-regularized kernel classes.
    We address an active learning scenario in the multi-task learning problem. Since labeling effective instances across different tasks may improve the generalization error of all tasks, we propose a new active multi-task learning algorithm based on multi-armed bandits for effectively selecting instances. The proposed algorithm balances the trade-off between exploration and exploitation by considering both the risk of the multi-task learner and the corresponding confidence bounds.
    We study a popular annotation problem in crowdsourcing systems: repeated labeling. We introduce a new framework that actively selects labeling tasks when facing a large number of them, with the objective of identifying the best labeling tasks among the noisy ones. We formalize the selection of repeated labeling tasks in a bandit framework, treating a labeling task as an arm and the quality of a labeling task as the payoff. We introduce the definition of an ε-optimal labeling task and use it to identify the optimal labeling task. Taking the expected labeling quality into account, we provide a simple repeated labeling strategy, and then extend it to identify the best m labeling tasks by indexing the labeling tasks using their expected labeling quality.
    Finally, we study active learning from a bandit perspective, building a bridge between active learning and multi-armed bandits. Active learning aims to learn a classifier by actively acquiring data points whose labels are initially hidden and incur a querying cost; the multi-armed bandit problem is a framework that adapts its decisions in sequence based on the rewards observed so far. Inspired by multi-armed bandits, we cast active learning as identifying the best hypothesis in an optimal candidate set of hypotheses while querying the labels of as few points as possible. Our algorithms maintain the candidate set of hypotheses using the error, or the corresponding general lower and upper error bounds, to select or eliminate hypotheses: in the realizable PAC setting we use the error directly, and in the agnostic setting we use the lower and upper error bounds of the hypotheses. To label the data points, we use an uncertainty strategy based on the candidate set of hypotheses.
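    For orientation, the textbook instance of the exploration-exploitation trade-off that this line of work builds on is UCB1, sketched below; none of the thesis's specific algorithms (networked bandits, view selection, active multi-task learning) are reproduced here.

```python
# UCB1: pull the arm maximizing empirical mean plus a confidence bonus.
import math
import random

def ucb1(pull, n_arms, horizon):
    counts, sums = [0] * n_arms, [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:                      # initialize: pull every arm once
            a = t - 1
        else:                                # mean payoff + exploration bonus
            a = max(range(n_arms),
                    key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        r = pull(a)
        counts[a] += 1
        sums[a] += r
    return sums, counts

probs = [0.2, 0.5, 0.8]                      # toy Bernoulli arms
sums, counts = ucb1(lambda a: float(random.random() < probs[a]), 3, 2000)
print(counts)                                # the best arm dominates the pulls
```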

    Signal processing and analytics of multimodal biosignals

    Ph.D. Thesis. Biosignals have been extensively studied by researchers for applications in diagnosis, therapy, and monitoring. Because these signals are complex, they have to be crafted into features for machine learning to work. This raises the question of how to extract features that are relevant yet invariant to uncontrolled extraneous factors. In the last decade or so, deep learning has been used to extract features from raw signals automatically. Furthermore, with the proliferation of sensors, more raw signals are now available, making it possible to use multi-view learning to improve the predictive performance of deep learning. The purpose of this work is to develop an effective deep learning model of biosignals and to make use of the multi-view information in the sequential data. This thesis describes two proposed methods: (1) the use of a deep temporal convolution network to provide the temporal context of the signals to the deeper layers of a deep belief net, and (2) the use of multi-view spectral embedding to blend complementary data in an ensemble. This work uses several annotated biosignal datasets that are available in the open domain. They are non-stationary, noisy and non-linear signals; using them in raw form without feature engineering yields poor results with traditional machine learning techniques. By passing more useful abstractions through the deep belief net and blending the complementary data in an ensemble, performance improves in terms of accuracy and variance, as shown by the results of 10-fold validations. Nanyang Polytechnic.
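    A rough sketch of the multi-view spectral embedding idea under stated assumptions: build an RBF affinity graph per view, average the normalized Laplacians, and embed with the bottom non-trivial eigenvectors. The thesis's exact blending and its use inside the ensemble may differ.

```python
# Multi-view spectral embedding sketch: average the views' normalized
# Laplacians, then embed samples with the smallest non-trivial eigenvectors.
import numpy as np

def view_laplacian(X, gamma=1.0):
    """Normalized Laplacian of an RBF affinity graph for one view."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-gamma * sq)
    np.fill_diagonal(W, 0.0)
    d = W.sum(1)
    Dinv = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    return np.eye(len(X)) - Dinv @ W @ Dinv

def multiview_embedding(views, dim=2):
    L = sum(view_laplacian(X) for X in views) / len(views)
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:dim + 1]   # skip the trivial constant eigenvector

rng = np.random.default_rng(0)
views = [rng.normal(size=(30, 5)) for _ in range(3)]  # stand-in feature views
print(multiview_embedding(views).shape)               # (30, 2)
```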