
    Sparse Social Domains Based Scalable Learning of Collective Behaviour

    Abstract: Social networking is the process by which many people connect with one another and share their views and images. It has become globally important, and most individuals today hold an account on at least one platform; Facebook, for example, has gained considerable prominence compared with other sites, and many other domains such as Twitter and LinkedIn are also available. While social networks are useful and engaging, they are also insecure: accounts are frequently hacked, so every user should log out properly on shared systems and avoid sharing account details, which can lead to legal problems. In this paper we perform scalable learning over a particular user's social network usage and produce a report describing the main purposes for which that user employed the site. Beyond scalable learning, we also examine access control in social networks, allowing a user to share views, images, or videos privately with a specific group or with selected friends. Because users are increasingly eager to attract likes to their posts, it is also important to stop fake accounts and to detect Sybil users in the network. In total, this paper addresses three tasks: scalable learning, sharing access rights, and detection of fake accounts.

    Semantic Concept Co-Occurrence Patterns for Image Annotation and Retrieval.

    Describing visual image contents by semantic concepts is an effective and straightforward way to facilitate various high-level applications. Inferring semantic concepts from low-level pictorial feature analysis is challenging due to the semantic gap problem, while manually labeling concepts is impractical given the large number of images in both online and offline collections. In this paper, we present a novel approach that automatically generates intermediate image descriptors by exploiting concept co-occurrence patterns in a pre-labeled training set, which makes it possible to depict complex scene images semantically. Our work is motivated by the fact that multiple concepts that frequently co-occur across images form patterns which can provide contextual cues for individual concept inference. We discover the co-occurrence patterns as hierarchical communities by graph modularity maximization in a network whose nodes and edges represent concepts and co-occurrence relationships, respectively. A random walk process working on the inferred concept probabilities with the discovered co-occurrence patterns is applied to acquire the refined concept signature representation. Through experiments in automatic image annotation and semantic image retrieval on several challenging datasets, we demonstrate the effectiveness of the proposed concept co-occurrence patterns as well as the concept signature representation in comparison with state-of-the-art approaches.
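
    To make the pipeline above concrete, the following Python sketch illustrates the general idea: build a concept co-occurrence graph from a pre-labeled training set, discover communities by modularity maximization, and refine per-image concept probabilities with a random-walk step. The greedy modularity routine, the damping factor, and the matrix shapes are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: concept co-occurrence communities + random-walk refinement.
# The damping factor and iteration count are illustrative assumptions.
import numpy as np
import networkx as nx

def cooccurrence_graph(labels):
    """labels: (n_images, n_concepts) binary matrix of training annotations."""
    co = labels.T @ labels                      # concept-by-concept co-occurrence counts
    np.fill_diagonal(co, 0)
    G = nx.Graph()
    G.add_nodes_from(range(co.shape[0]))
    for i, j in zip(*np.nonzero(np.triu(co))):
        G.add_edge(i, j, weight=float(co[i, j]))
    return G, co

def concept_communities(G):
    """Greedy modularity maximization as a stand-in for the paper's hierarchical method."""
    return list(nx.algorithms.community.greedy_modularity_communities(G, weight="weight"))

def random_walk_refine(p0, co, alpha=0.85, iters=50):
    """Refine one image's concept probabilities p0 via a walk on the co-occurrence graph."""
    W = co / np.maximum(co.sum(axis=1, keepdims=True), 1e-12)   # row-stochastic transitions
    p = p0.copy()
    for _ in range(iters):
        p = alpha * (W.T @ p) + (1 - alpha) * p0                # personalized-PageRank-style update
    return p / p.sum()
```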

    Multi-Label Classification Based on the Improved Probabilistic Neural Network

    This paper aims to overcome the defects of existing multi-label classification methods, such as the insufficient use of label correlation and class information. For this purpose, an improved probabilistic neural network for multi-label classification (ML-IPNN) was developed through the following steps. Firstly, the traditional PNN was structurally improved to fit multi-label data. Secondly, a weight matrix was introduced to represent the label correlation and synthesize the information between classes, and the ML-IPNN was trained with the backpropagation mechanism. Finally, the classification results of the ML-IPNN on three common datasets were compared with those of the seven most popular multi-label classification algorithms. The results show that the ML-IPNN outperformed all of the compared algorithms. These findings shed new light on multi-label classification and the application of artificial neural networks (ANNs).
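
    The following minimal Python sketch conveys the flavor of a PNN-style multi-label classifier combined with a label-correlation weight matrix. The Gaussian kernel width, the row-normalized co-occurrence weighting, and the 0.5 decision threshold are assumptions for illustration; the actual ML-IPNN is trained with backpropagation and differs in detail.

```python
# Hedged sketch of a PNN-style multi-label scorer with a label-correlation weight
# matrix; this is NOT the paper's ML-IPNN specification.
import numpy as np

class MultiLabelPNN:
    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def fit(self, X, Y):
        """X: (n, d) features; Y: (n, q) binary label matrix."""
        self.X, self.Y = X, Y
        C = Y.T @ Y                                                    # label co-occurrence counts
        self.W = C / np.maximum(C.sum(axis=1, keepdims=True), 1e-12)   # row-normalized correlations
        return self

    def predict_proba(self, Xq):
        d2 = ((Xq[:, None, :] - self.X[None, :, :]) ** 2).sum(-1)          # distances to patterns
        K = np.exp(-d2 / (2 * self.sigma ** 2))                             # Gaussian pattern layer
        scores = K @ self.Y / np.maximum(K.sum(1, keepdims=True), 1e-12)    # per-label summation layer
        return scores @ self.W.T                        # mix label scores through their correlations

    def predict(self, Xq, threshold=0.5):
        return (self.predict_proba(Xq) >= threshold).astype(int)
```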

    Multi-Instance Multi-Label Learning

    In this paper, we propose the MIML (Multi-Instance Multi-Label learning) framework, where an example is described by multiple instances and associated with multiple class labels. Compared to traditional learning frameworks, the MIML framework is more convenient and natural for representing complicated objects which have multiple semantic meanings. To learn from MIML examples, we propose the MimlBoost and MimlSvm algorithms based on a simple degeneration strategy, and experiments show that solving problems involving complicated objects with multiple semantic meanings in the MIML framework can lead to good performance. Considering that the degeneration process may lose information, we propose the D-MimlSvm algorithm, which tackles MIML problems directly in a regularization framework. Moreover, we show that even when we do not have access to the real objects and thus cannot capture more information from them by using the MIML representation, MIML is still useful. We propose the InsDif and SubCod algorithms. InsDif works by transforming single instances into the MIML representation for learning, while SubCod works by transforming single-label examples into the MIML representation for learning. Experiments show that in some tasks they are able to achieve better performance than learning from the single-instance or single-label examples directly. Comment: 64 pages, 10 figures; Artificial Intelligence, 201
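
    As an illustration of the degeneration strategy, the sketch below maps each bag of instances to a single feature vector (distances to a set of reference bags) and then trains one-vs-rest SVMs, in the spirit of MimlSvm. The Hausdorff bag distance, the random choice of reference bags, and the RBF kernel are simplifying assumptions rather than the algorithm as published.

```python
# Hedged sketch: degenerate MIML bags to single-instance vectors, then one-vs-rest SVMs.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two bags (arrays of instance vectors)."""
    D = cdist(A, B)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def bags_to_vectors(bags, reference_bags):
    """Degenerate each MIML bag into a vector of distances to reference bags."""
    return np.array([[hausdorff(b, r) for r in reference_bags] for b in bags])

def train_miml_svm(bags, Y, k=20, seed=0):
    """bags: list of (n_i, d) arrays; Y: (n_bags, n_labels) binary label matrix."""
    rng = np.random.default_rng(seed)
    refs = [bags[i] for i in rng.choice(len(bags), size=min(k, len(bags)), replace=False)]
    X = bags_to_vectors(bags, refs)                        # single-instance representation
    clf = OneVsRestClassifier(SVC(kernel="rbf", probability=True)).fit(X, Y)
    return clf, refs
```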

    Semi-supervised Multi-label Learning by Constrained Non-negative Matrix Factorization

    We present a novel framework for multi-label learning that explicitly addresses the challenge arising from a large number of classes and a small training set. The key assumption behind this work is that two examples tend to have large overlap in their assigned class memberships if they share high similarity in their input patterns. We capitalize on this assumption by first computing two sets of similarities, one based on the input patterns of examples, and the other based on the class memberships of the examples. We then search for the optimal assignment of class memberships to the unlabeled data that minimizes the difference between these two sets of similarities. The optimization problem is formulated as a constrained Non-negative Matrix Factorization (NMF) problem, and an algorithm is presented to efficiently find the solution. Compared to existing approaches for multi-label learning, the proposed approach is advantageous in that it is able to explore both the unlabeled data and the correlation among different classes simultaneously. Experiments with text categorization show that our approach performs significantly better than several state-of-the-art classification techniques when the number of classes is large and the training set is small.
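
    A minimal sketch of the underlying idea, assuming an RBF input similarity and a damped multiplicative update: search for non-negative class memberships whose induced similarity matches the input-pattern similarity, while clamping the rows of the labeled examples. This is not the paper's exact constrained-NMF formulation.

```python
# Hedged sketch: match input similarity S with membership similarity F F^T under
# non-negativity, keeping labeled rows fixed. Hyperparameters are assumptions.
import numpy as np

def semi_supervised_nmf(X, Y_labeled, labeled_idx, n_classes, gamma=1.0,
                        iters=200, damping=0.5, seed=0):
    """X: (n, d) inputs; Y_labeled: (n_l, n_classes) labels for rows in labeled_idx."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    S = np.exp(-gamma * d2)                            # input-pattern similarity
    rng = np.random.default_rng(seed)
    F = rng.random((n, n_classes)) + 1e-3              # non-negative membership matrix
    F[labeled_idx] = Y_labeled                         # clamp known memberships
    for _ in range(iters):
        num = S @ F
        den = F @ (F.T @ F) + 1e-12
        F *= (1 - damping) + damping * (num / den)     # multiplicative symmetric-NMF update
        F[labeled_idx] = Y_labeled                     # re-impose label constraints
    return F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)
```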

    The Emerging Trends of Multi-Label Learning

    Exabytes of data are generated daily by humans, leading to a growing need for new efforts to deal with the grand challenges that big data brings to multi-label learning. For example, extreme multi-label classification is an active and rapidly growing research area that deals with classification tasks involving an extremely large number of classes or labels; utilizing massive data with limited supervision to build a multi-label classification model also becomes valuable for practical applications. Besides these, there are tremendous efforts to harvest the strong learning capability of deep learning to better capture label dependencies in multi-label learning, which is key for deep learning to address real-world classification tasks. However, there has been a lack of systematic studies that focus explicitly on analyzing the emerging trends and new challenges of multi-label learning in the era of big data. It is imperative to call for a comprehensive survey to fulfill this mission and delineate future research directions and new applications. Comment: Accepted to TPAMI 202

    Learning from text and images: generative and discriminative models for partially labeled data

    Image annotation is the challenging task of assigning keywords to an image given its content. It has a variety of applications in multimedia data mining and computer vision. Traditional machine learning approaches to image annotation require large amounts of labeled data. This requirement is often unrealistic, as obtaining labeled data is, in general, expensive and time consuming. However, large amounts of weakly labeled data and tagged images are readily available, in particular in the web and social network communities. In this thesis we address the problem of image annotation using weak supervision. In particular, we formulate image annotation as a multiple instance multiple label learning problem and propose generative and discriminative models to tackle it. Multiple instance multiple label learning is a generalization of supervised learning in which the training examples are bags of instances and each bag is labeled with a set of labels. We explore two learning frameworks, generative and discriminative, and propose models within each framework to address the problem of assigning text keywords to images. The first approach, the generative model, attempts to describe the process according to which the data was generated and then learns its parameters from the data. This model is a non-parametric generalization of the mixture model used in the past. We extend this model to a Hierarchical Dirichlet Process, which allows for countably infinite mixture components. Our experimental evaluation shows that the performance of this model does not depend on the number of mixture components, unlike the standard mixture model, which suffers from over-fitting for a large number of mixture components. The second approach is a discriminative model, which, unlike the generative model, answers the following question: given the input bag of instances, what is the most likely assignment of labels to the bag? We address this problem by learning as many classifiers as there are possible labels and forcing the classifiers to share weights using trace-norm regularization. We show that the performance of this model is comparable to state-of-the-art multiple instance multiple label classifiers and that, unlike some state-of-the-art models, it is scalable and practical for datasets with a large number of training instances and possible labels. Finally, we generalize the discriminative model to a semi-supervised setting that allows the model to take advantage of both labeled and unlabeled data. We do so by assuming that the data lies on a low-dimensional manifold and introducing a penalty that forces the classifiers to assign similar labels to indirectly similar instances (i.e., instances that are nearby in the manifold space). The manifold is learned by constructing a similarity neighborhood graph over bags, and the graph Laplacian is then used to compute the penalty term.
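
    The discriminative model described above trains one linear classifier per label and couples their weights through trace-norm (nuclear-norm) regularization. The sketch below shows a generic proximal-gradient solver for that kind of objective, using a squared loss and an illustrative regularization weight; the thesis's loss, solver, and hyperparameters may differ.

```python
# Hedged sketch: trace-norm-regularized multi-label linear classifiers via
# proximal gradient descent (singular value soft-thresholding).
import numpy as np

def svd_soft_threshold(W, tau):
    """Proximal operator of the trace (nuclear) norm: shrink singular values by tau."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def trace_norm_multilabel(X, Y, lam=0.1, step=None, iters=300):
    """X: (n, d) features; Y: (n, q) labels in {-1, +1}; returns W: (d, q)."""
    n, d = X.shape
    q = Y.shape[1]
    if step is None:
        step = n / (np.linalg.norm(X, 2) ** 2)         # 1 / Lipschitz constant of the loss gradient
    W = np.zeros((d, q))
    for _ in range(iters):
        grad = X.T @ (X @ W - Y) / n                   # gradient of the mean squared loss
        W = svd_soft_threshold(W - step * grad, step * lam)  # proximal gradient step
    return W
```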

    Simulation and Inference of Neural Networks Using Artificial Intelligence

    Complex network analysis is a powerful tool for the study of dynamical systems. It is often used in neuroscience to study the brain. However, extracting complete connectomes, i.e., the list of all neurons and connections, is still a challenge for large brains. This is the case for the brain of the zebrafish larva, which contains almost a hundred thousand neurons. Since direct observation of synapses is still intractable for a brain of this size, connections between neurons must be inferred from their activity. It is indeed possible to extract time series of activity for all neurons, by making them fluorescent upon activation through genetic engineering and by leveraging the zebrafish's transparency during the larval stage. So-called functional inference methods, drawn from statistics and information theory, can then be used to predict the connectivity of neurons from the time series of their activity. Recent breakthroughs in artificial intelligence have opened the door to new methods of inference. The goal of the master's project described in this thesis is to compare the performance of these new methods to that of well-established ones. Since the ground-truth connectivity must be known for such a comparison, a simulator is used to generate synthetic time series of activity from real connectivities extracted from recordings of activity. It is shown that tuning such a simulator to produce realistic, zebrafish-like activity is not a trivial task. A deep learning approach is therefore designed to predict the optimal simulation parameters from global activity metrics. Using the generated time series, it is then shown that a convolutional neural network performs significantly better than the other inference methods on 86% of the simulations. However, when supervised training is impossible, as in the zebrafish's case, the classical Transfer Entropy method performs better than an unsupervised deep learning model on 78% of the simulations.
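
    As a reference point for the Transfer Entropy baseline mentioned above, the sketch below computes pairwise lag-1 transfer entropy on binarized activity and uses it to score candidate connections. The median-based binarization and the single-step history are simplifying assumptions, not the thesis's exact pipeline.

```python
# Hedged sketch: pairwise lag-1 transfer entropy on binarized neural activity;
# a large TE(i -> j) suggests a putative connection from neuron i to neuron j.
import numpy as np

def transfer_entropy(x, y):
    """Lag-1 transfer entropy TE(x -> y) in bits, for binary sequences of equal length."""
    xt, yt, y1 = x[:-1], y[:-1], y[1:]
    te = 0.0
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                p_abc = np.mean((y1 == a) & (yt == b) & (xt == c))   # p(y_{t+1}, y_t, x_t)
                if p_abc == 0:
                    continue
                p_bc = np.mean((yt == b) & (xt == c))                # p(y_t, x_t)
                p_ab = np.mean((y1 == a) & (yt == b))                # p(y_{t+1}, y_t)
                p_b = np.mean(yt == b)                               # p(y_t)
                te += p_abc * np.log2((p_abc / p_bc) / (p_ab / p_b))
    return te

def infer_connectivity(activity):
    """activity: (n_neurons, T) array; returns a matrix of pairwise TE scores."""
    binary = (activity > np.median(activity, axis=1, keepdims=True)).astype(int)
    n = binary.shape[0]
    scores = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                scores[i, j] = transfer_entropy(binary[i], binary[j])
    return scores
```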