11 research outputs found

    RICH AND EFFICIENT VISUAL DATA REPRESENTATION

    Increasing the size of the training data in many computer vision tasks has been shown to be very effective. Using large-scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers), one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet, and the number increases every day. Dealing with large-scale image sets is demanding in itself: they take so much memory that processing them with complex algorithms on single-CPU machines becomes impractical. Finding an efficient image representation can be key to attacking this problem. Efficiency alone, however, is not enough for image understanding; the representation should also be comprehensive and rich in semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective, and show how they can speed up large-scale image classification. We present techniques for learning the binary features from supervised image sets (with different types of semantic supervision: class labels, textual descriptions). We also propose several problems that are important in finding and using efficient image representations.
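    The abstract does not give implementation details, but the efficiency argument for binary codes can be sketched in a few lines of plain Python (all names here are hypothetical, for illustration only): comparing two codes costs one XOR plus a popcount, which is what makes search over very large image sets tractable.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary codes stored as Python ints:
    the number of bit positions in which they differ (XOR + popcount)."""
    return bin(a ^ b).count("1")

def nearest(query: int, codes, k: int = 1):
    """Indices of the k database codes closest to `query` in Hamming space.
    With compact codes, this scan is cheap compared to comparing raw images."""
    order = sorted(range(len(codes)), key=lambda i: hamming(query, codes[i]))
    return order[:k]

# Toy database of 16-bit codes; in the proposal the codes would be learned
# from images, which this sketch does not attempt.
codes = [0b1010101011110000, 0b1010101011110001, 0x0000, 0xFFFF]
query = 0b1010101011110000
```

In practice the codes would be much longer (e.g. 256 bits), but the per-comparison cost stays a handful of machine instructions.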

    Content-Based Visual Landmark Search via Multimodal Hypergraph Learning

    Formerly IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

    Deep knowledge transfer for generalization across tasks and domains under data scarcity

    Over the last decade, deep learning approaches have achieved tremendous performance in a wide variety of fields, e.g., computer vision and natural language understanding, and across several sectors such as healthcare, industrial manufacturing, and driverless mobility. Most deep learning successes were accomplished in learning scenarios fulfilling the following two requirements. First, large amounts of data are available for training the deep learning model, with no access restrictions to the data. Second, the data used for training and testing is independent and identically distributed (i.i.d.). However, many real-world applications violate at least one of these requirements, which results in challenging learning problems. The present thesis comprises four contributions addressing four such learning problems. In each contribution, we propose a novel method and empirically demonstrate its effectiveness for the corresponding problem setting. The first part addresses the underexplored intersection of the few-shot learning and one-class classification problems. In this learning scenario, the model has to learn a new task using only a few examples from only the majority class, without overfitting to the few examples or to the majority class. This learning scenario arises in real-world anomaly detection applications where data is scarce. We propose an episode sampling technique to adapt meta-learning algorithms designed for class-balanced few-shot classification to the addressed few-shot one-class classification problem. This is done by optimizing for a model initialization tailored to the addressed scenario. In addition, we provide theoretical and empirical analyses to investigate the need for second-order derivatives to learn such parameter initializations. Our experiments on 8 image and time-series datasets, including a real-world dataset of industrial sensor readings, demonstrate the effectiveness of our method.
The second part tackles the intersection of the continual learning and anomaly detection problems, which, to the best of our knowledge, we are the first to explore. In this learning scenario, the model is exposed to a stream of anomaly detection tasks that it has to learn sequentially, i.e., tasks in which only examples from the normal class are available. Such problem settings are encountered in anomaly detection applications where the data distribution continuously changes. We propose a meta-learning approach that learns parameter-specific initializations and learning rates suitable for continual anomaly detection. Our empirical evaluations show that a model trained with our algorithm is able to learn up to 100 anomaly detection tasks sequentially with minimal catastrophic forgetting and overfitting to the majority class. In the third part, we address the domain generalization problem, in which a model trained on several source domains is expected to generalize well to data from a previously unseen target domain, without any modification or exposure to its data. This challenging learning scenario is present in applications involving domain shift, e.g., different clinical centers using different MRI scanners or data acquisition protocols. We assume that learning to extract a richer set of features improves the transfer to a wider set of unknown domains. Motivated by this, we propose an algorithm that identifies the already learned features and corrupts them, hence enforcing new feature discovery. We leverage methods from the explainable machine learning literature to identify the features, and apply the targeted corruption on multiple representation levels, including input data and high-level embeddings. Our extensive empirical evaluation shows that our approach outperforms 18 domain generalization algorithms on multiple benchmark datasets.
The last part of the thesis addresses the intersection of domain generalization and data-free learning methods, which, to the best of our knowledge, we are the first to explore. Here, we address the learning scenario where a model robust to domain shift is needed and only models trained on the same task but different domains are available instead of the original datasets. This learning scenario is relevant for any domain generalization application where access to the data of the source domains is restricted, e.g., due to data privacy concerns or intellectual property infringement. We develop an approach that extracts and fuses domain-specific knowledge from the available teacher models into a student model robust to domain shift, by generating synthetic cross-domain data. Our empirical evaluation demonstrates the effectiveness of our method, which outperforms ensemble and data-free knowledge distillation baselines. Most importantly, the proposed approach substantially reduces the gap between the best data-free baseline and the upper-bound baseline that uses the original private data.
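The episode-sampling setting from the first part can be illustrated with a toy sketch. This is not the thesis's actual algorithm; the function and parameter names are hypothetical. It only shows the structure of a few-shot one-class episode: a support set drawn exclusively from the normal (majority) class, and a mixed query set so the learner is still evaluated on detecting anomalies.

```python
import random

def sample_one_class_episode(data_by_class, normal_class,
                             k_support=5, m_query=10, seed=None):
    """Sample one few-shot one-class episode.

    Support: only `normal_class` examples (the only data available at
    adaptation time). Query: half normal, half anomalous, with labels
    aligned to the query list (1 = anomaly). No shuffling, so labels
    stay aligned with query items.
    """
    rng = random.Random(seed)
    normal = data_by_class[normal_class]
    anomalies = [x for c, xs in data_by_class.items()
                 if c != normal_class for x in xs]
    support = rng.sample(normal, k_support)
    query = rng.sample(normal, m_query // 2) + rng.sample(anomalies, m_query // 2)
    labels = [0] * (m_query // 2) + [1] * (m_query // 2)
    return support, query, labels
```

A meta-learner would be trained over many such episodes so that its initialization adapts well from majority-class-only support sets.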

    Convergence of Intelligent Data Acquisition and Advanced Computing Systems

    This book is a collection of published articles from the Sensors Special Issue on "Convergence of Intelligent Data Acquisition and Advanced Computing Systems". It includes extended versions of the conference contributions from the 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS’2019), Metz, France, as well as external contributions.

    Communication-efficient artificial intelligence at the mobile network edge

    Artificial intelligence (AI) and Edge computing (EC) have enabled various applications ranging from smart homes to intelligent manufacturing and smart cities.
This progress was fueled mainly by the availability of more data, the abundance of computing power, and progress in several compression techniques. However, the main advances concern the deployment of cloud-trained machine learning (ML) models on edge devices. This premise requires that all data generated by end devices be sent to a centralized server, raising several privacy concerns and creating significant communication overhead. Accordingly, paving the last mile of AI on EC requires pushing the training of ML models to the edge of the network. Federated learning (FL) has emerged as a promising technique for the collaborative training of ML models on edge devices. The devices train a globally shared model on their locally stored data and only share the resulting parameters with a centralized entity. However, to enable FL in wireless edge networks, several challenges inherited from both AI and EC need to be addressed. In particular, challenges related to the statistical heterogeneity of the data across the devices, alongside the scarcity and heterogeneity of resources, require particular attention. The goal of this thesis is to propose ways to address these challenges and to evaluate the potential of FL in future smart-city applications. In the first part of this thesis, the focus is on incorporating data properties into the handling of device participation and resource allocation in FL. We start by identifying data diversity measures that allow us to evaluate the richness of local datasets in different applications. Then, we design a diversity indicator that gives more priority to clients with more informative data. An iterative algorithm is then proposed to jointly select clients and allocate communication resources. This algorithm accelerates the training and reduces the overall time and energy needed.
Furthermore, the proposed diversity indicator is reinforced with a reputation system to avoid malicious clients, thus enhancing its robustness against data poisoning attacks. In the second part of this thesis, we explore ways to tackle further challenges related to client mobility and concept shift in data distributions. Such challenges require new measures to be handled. Accordingly, we design a cluster-based process for FL for the particular case of vehicular networks. The proposed process is based on careful cluster formation to bypass the communication bottleneck and is able to handle different models in parallel. In the last part of this thesis, we demonstrate the potential of FL in a real use case involving short-term forecasting of electrical power in a smart grid. We propose an architecture empowered with FL to encourage collaboration among community members, and we show through numerical results its importance for both training and the judicious use of communication resources.
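The diversity-plus-reputation idea from the first part can be illustrated with a toy sketch. Label-distribution entropy as the diversity measure and a multiplicative reputation weight are assumptions for illustration only, not the thesis's exact indicator or selection algorithm, and all names are hypothetical.

```python
import math

def label_entropy(labels):
    """Shannon entropy of a client's label distribution: a simple proxy
    for how informative (diverse) its local dataset is."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def select_clients(client_labels, reputation, k):
    """Rank clients by diversity weighted by reputation, keep the top k.
    A low reputation (e.g. a suspected poisoner) demotes an otherwise
    diverse client."""
    scores = {cid: label_entropy(ys) * reputation.get(cid, 1.0)
              for cid, ys in client_labels.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

In a full system, the selected clients would then be matched to communication resources by the joint iterative algorithm the abstract describes.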

    Combining simulated and real images in deep learning

    To train a deep learning (DL) model, considerable amounts of data are required to generalize successfully to unseen cases. Furthermore, such data is often manually labeled, making the annotation process costly and time-consuming. We propose the use of simulated data, obtained from simulators, as a way to circumvent the increasing need for annotated data. Although simulated environments represent an unlimited and cost-effective supply of automatically annotated data, the data is still synthetic: it differs in representation and distribution from real-world data. The field that addresses the problem of merging the useful features from each of these domains is called domain adaptation (DA), a branch of transfer learning. Several advances have been made in this field, from fine-tuning existing networks to sample-reconstruction approaches. Adversarial DA methods, which make use of Generative Adversarial Networks (GANs), are the state of the art and the most widely used. In previous approaches, training data was sourced from existing datasets, and the use of simulators as a means of obtaining new observations was an alternative not fully explored. We aim to survey possible DA techniques and apply them in this context of obtaining simulated data for training DL models. Stemming from a previous project aimed at automating quality control at the end of a vehicle's production line, a proof of concept will be developed. Previously, a DL model that identified vehicle parts was trained using only data obtained through a simulator. By using DA techniques to combine simulated and real images, a new model will be trained to apply to the real world more effectively. The model's performance using both types of data will be compared to its performance when using exclusively one of the two.
We believe this can be expanded to new areas where, until now, the use of DL was not feasible due to the constraints imposed by data collection.
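One simple way to combine abundant simulated data with scarce real data is to fix a per-batch mixing ratio, sketched below. This is a hypothetical baseline for illustration; the project's actual DA techniques (adversarial methods, sample reconstruction) are more involved.

```python
import random

def mixed_batches(sim_data, real_data, batch_size=8, real_fraction=0.25, seed=0):
    """Yield training batches that mix abundant simulated samples with a
    smaller, fixed fraction of scarce real samples, so every gradient step
    sees both domains."""
    rng = random.Random(seed)
    n_real = max(1, int(batch_size * real_fraction))
    n_sim = batch_size - n_real
    while True:
        batch = rng.sample(sim_data, n_sim) + rng.sample(real_data, n_real)
        rng.shuffle(batch)
        yield batch
```

The `real_fraction` knob makes the comparison in the abstract easy to run: 0.0 and 1.0 recover the single-domain baselines, intermediate values give the combined model.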

    Unsupervised learning of relation detection patterns

    Information extraction is the natural language processing area whose goal is to obtain structured data from the relevant information contained in textual fragments.
Information extraction requires a significant amount of linguistic knowledge. The specificity of such knowledge poses a drawback for the portability of systems, as a change of language, domain or style demands a costly human effort. Machine learning techniques have been applied for decades to overcome this portability bottleneck, progressively reducing the amount of human supervision involved. However, as the availability of large document collections increases, completely unsupervised approaches become necessary in order to mine the knowledge contained in them. The proposal of this thesis is to incorporate clustering techniques into pattern learning for information extraction, in order to further reduce the elements of supervision involved in the process. In particular, the work focuses on the problem of relation detection. Achieving this ultimate goal has required, first, considering the different strategies in which this combination could be carried out; second, developing or adapting clustering algorithms suited to our needs; and third, devising pattern learning procedures that incorporate clustering information. By the end of this thesis, we had been able to develop and implement an approach for learning relation detection patterns which, using clustering techniques and minimal human supervision, is competitive with and even outperforms comparable approaches in the state of the art.
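Clustering applied to extraction patterns can be sketched with a toy example. Representing patterns as feature sets (e.g. lemmas and entity types on the path between two mentions) and grouping them greedily by Jaccard similarity are illustrative assumptions, not the thesis's algorithm.

```python
def jaccard(a, b):
    """Set overlap between two patterns viewed as feature sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def cluster_patterns(patterns, threshold=0.5):
    """Greedy single-pass clustering: attach each pattern to the first
    cluster whose representative (its first member) is similar enough,
    otherwise start a new cluster."""
    clusters = []
    for p in patterns:
        for cl in clusters:
            if jaccard(p, cl[0]) >= threshold:
                cl.append(p)
                break
        else:
            clusters.append([p])
    return clusters
```

Clusters of near-duplicate patterns can then seed relation detectors with far less human labeling than annotating each pattern individually.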

    Cyber defensive capacity and capability: A perspective from the financial sector of a small state

    This thesis explores ways in which the financial sectors of small states are able to defend themselves against ever-growing cyber threats, as well as ways these states can improve their cyber defense capability in order to withstand current and future attacks. To date, the context of small states in general is understudied. This study presents the challenges faced by financial sectors in small states with regard to withstanding cyberattacks. It applies a mixed-methods approach through the use of various surveys, brainstorming sessions with financial sector focus groups, interviews with critical infrastructure stakeholders, a literature review, a comparative analysis of secondary data, and a theoretical narrative review. The findings suggest that, for the Aruban financial sector, compliance is important, as precautionary behavior is significant even with minimal drivers. Countermeasures in the form of formal, informal, and technical controls need to be in place. This study indicates that defending a small state such as Aruba is challenging, yet enough economic indicators suggest it is not outside the realm of possibility. On a theoretical level, this thesis proposes a conceptual “whole-of-cyber” model inspired by military science and the Viable Systems Model (VSM). The concepts of fighting power components and the governance S4 function form cyber defensive capacity’s shield and capability. The “whole-of-cyber” approach may be a good way to compensate for the lack of resources of small states. Collaboration may be the only way out, as the fastest-growing need will be for advanced IT skill sets.