830 research outputs found

    Deep learning in a bilateral brain with hemispheric specialization

    The brains of all bilaterally symmetric animals on Earth are divided into left and right hemispheres. The anatomy and functionality of the two hemispheres overlap to a large degree, yet each specializes in different attributes: the left hemisphere is believed to specialize in specificity and routine, the right in generalities and novelty. In this study, we propose an artificial neural network that imitates this bilateral architecture, using two convolutional neural networks with different training objectives, and test it on an image classification task. The bilateral architecture outperforms architectures of similar representational capacity that do not exploit differential specialization. This demonstrates the efficacy of bilateralism, establishes a principle that could be incorporated into other computational neuroscience models, and offers an inductive bias for designing new ML systems. An analysis of the model can also help us understand the human brain. Comment: 11 pages, 10 figures
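As a toy illustration of the bilateral idea, one way to fuse two differently trained specialist networks is to average their class probabilities. The paper does not specify its fusion scheme, so the logits, batch size, and 50/50 weighting below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical logits over 10 classes for a batch of 4 images, from a
# "left" network trained on a fine-grained objective and a "right"
# network trained on a broader one.
left_logits = rng.normal(size=(4, 10))
right_logits = rng.normal(size=(4, 10))

# Simple fusion: average the two hemispheres' class probabilities.
p = 0.5 * softmax(left_logits) + 0.5 * softmax(right_logits)
pred = p.argmax(axis=1)
```

Any convex combination of the two probability vectors keeps `p` a valid distribution, so the fused predictor degrades gracefully if one specialist is uninformative.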

    Knowledge Distillation for Federated Learning: a Practical Guide

    Federated Learning (FL) enables the training of Deep Learning models without centrally collecting possibly sensitive raw data. This paves the way for stronger privacy guarantees when building predictive models. The most widely used algorithms for FL are parameter-averaging schemes (e.g., Federated Averaging) that, however, have well-known limits: (i) clients must implement the same model architecture; (ii) transmitting model weights and model updates implies a high communication cost, which scales with the number of model parameters; (iii) in the presence of non-IID data distributions, parameter-averaging aggregation schemes perform poorly due to client model drift. Federated adaptations of regular Knowledge Distillation (KD) can solve or mitigate the weaknesses of parameter-averaging FL algorithms while possibly introducing other trade-offs. In this article, we provide a review of KD-based algorithms tailored to specific FL issues. Comment: 9 pages, 1 figure
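For context, the KD objective that federated adaptations build on is a temperature-softened KL divergence between teacher and student outputs. A minimal sketch follows; the temperature, the T-squared scaling, and how any particular FL-KD algorithm weights or aggregates this loss are assumptions here:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    the usual KD objective; a client or server can minimize this
    instead of exchanging and averaging raw model weights."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    # T^2 rescaling keeps gradient magnitudes comparable across temperatures.
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

loss_same = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_diff = distillation_loss([0.1, 0.1, 0.1], [2.0, 0.5, -1.0])
```

Note that only logits (or soft labels) cross the network, which is why KD-based FL can sidestep limits (i) and (ii): clients may run different architectures, and the payload size is independent of the parameter count.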

    Collaborative Learning in Computer Vision

    The science of designing machines to extract meaningful information from digital images, videos, and other visual inputs is known as Computer Vision (CV). Deep learning algorithms tackle CV problems by automatically learning task-specific features. In particular, Deep Neural Networks (DNNs) have become an essential component of CV solutions due to their ability to encode large amounts of data and their capacity to manipulate billions of model parameters. Unlike machines, humans learn by rapidly constructing abstract models, undoubtedly because good teachers supply their students with much more than just the correct answer: they also provide intuitive comments, comparisons, and explanations. In deep learning, the availability of such auxiliary information at training time (but not at test time) is referred to as learning with Privileged Information (PI). Typically, predictions (e.g., soft labels) produced by a bigger and better teacher network are used as structured knowledge to supervise the training of a smaller student network, helping the student generalize better than one trained from scratch. This dissertation focuses on the category of deep learning systems known as Collaborative Learning, where one DNN model helps other models, or several models help each other, during training to achieve strong generalization and thus high performance. The question we address is the following: how can we take advantage of PI for training a deep learning model, knowing that, at test time, such PI might be missing? In this context, we introduce new methods to tackle several challenging real-world computer vision problems. First, we propose a method for model compression that leverages PI in a teacher-student framework along with customizable block-wise optimization for learning a target-specific lightweight structure of the neural network.
In particular, the proposed resource-aware optimization is employed on suitable parts of the student network while respecting the expected resource budget (e.g., floating-point operations per inference and model parameters). In addition, soft predictions produced by the teacher network are leveraged as a source of PI, forcing the student to preserve baseline performance during network structure optimization. Second, we propose a multiple-model learning method for action recognition, specifically devised for challenging video footage in which actions are not explicitly shown but only implicitly referred to. We use such videos as stimuli and involve a large sample of subjects to collect a high-definition EEG and video dataset. Next, we employ collaborative learning in a multi-modal setting, i.e., the EEG (teacher) model helps the video (student) model by distilling knowledge (the implicit meaning of the visual stimuli) to it, sharply boosting recognition performance. The goal of Unsupervised Domain Adaptation (UDA) methods is to use the labeled source domain together with unlabeled target domain data to train a model that generalizes well on the target domain. In contrast, we cast UDA as a pseudo-label refinement problem in the challenging source-free scenario, i.e., where the source domain data is inaccessible during training. We propose the Negative Ensemble Learning (NEL) technique, a unified method for adaptive noise filtering and progressive pseudo-label refinement. In particular, the ensemble members collaboratively learn with a Disjoint Set of Residual Labels, an outcome of the output prediction consensus, to refine the challenging noise associated with the inferred pseudo-labels. A single model trained with the refined pseudo-labels achieves superior performance on the target domain without using source data samples at all.
We conclude this dissertation with a method extending our previous study by incorporating Continual Learning into Source-Free UDA. Our new method comprises two stages: a Source-Free UDA pipeline based on pseudo-label refinement, and a procedure for extracting class-conditioned source-style images by leveraging the pre-trained source model. While stage 1 retains the same collaborative character, in stage 2 the collaboration is indirect, i.e., it is the source model that provides the only means of generating source-style synthetic images, which eventually helps the final model preserve good performance on both the source and target domains. In each study, we consider heterogeneous CV tasks. Nevertheless, with an extensive pool of experiments on various benchmarks carrying diverse complexities and challenges, we show that the collaborative learning framework outperforms the related state-of-the-art methods by a considerable margin.
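The consensus-based pseudo-label refinement at the heart of the ensemble idea can be caricatured as follows. This sketch keeps a pseudo-label only where all ensemble members agree with sufficient confidence; it is a simplified stand-in, not NEL's actual residual-label and negative-learning mechanism, and the threshold is an assumption:

```python
import numpy as np

def refine_pseudo_labels(member_preds, conf, threshold=0.6):
    """Keep a pseudo-label only where the ensemble members agree and
    mean confidence clears a threshold; disagreeing or low-confidence
    samples are flagged as noisy (-1) for exclusion or re-labeling.
    member_preds: (n_members, n_samples) predicted class ids
    conf:         (n_members, n_samples) confidences in [0, 1]
    """
    member_preds = np.asarray(member_preds)
    agree = (member_preds == member_preds[0]).all(axis=0)
    confident = np.asarray(conf).mean(axis=0) >= threshold
    return np.where(agree & confident, member_preds[0], -1)

preds = [[0, 1, 2, 2],
         [0, 1, 1, 2],
         [0, 1, 2, 2]]
conf = [[0.9, 0.4, 0.8, 0.9],
        [0.8, 0.5, 0.7, 0.9],
        [0.9, 0.3, 0.9, 0.8]]
labels = refine_pseudo_labels(preds, conf)  # samples 1 and 2 flagged noisy
```

Iterating this filter while retraining the members is what makes the refinement progressive: each round's cleaner labels produce a better ensemble, which in turn filters more noise.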

    End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

    End-to-end approaches to autonomous driving commonly rely on expert demonstrations. Although humans are good drivers, they are not good coaches for end-to-end algorithms that demand dense on-policy supervision. In contrast, automated experts that leverage privileged information can efficiently generate large-scale on-policy and off-policy demonstrations. However, existing automated experts for urban driving make heavy use of hand-crafted rules and perform suboptimally even on driving simulators, where ground-truth information is available. To address these issues, we train a reinforcement learning expert that maps bird's-eye-view images to continuous low-level actions. While setting a new performance upper bound on CARLA, our expert is also a better coach that provides informative supervision signals for imitation learning agents to learn from. Supervised by our reinforcement learning coach, a baseline end-to-end agent with monocular camera input achieves expert-level performance. Our end-to-end agent achieves a 78% success rate while generalizing to a new town and new weather on the NoCrash-dense benchmark, and state-of-the-art performance on the more challenging CARLA Leaderboard.
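The coach-student arrangement amounts to on-policy imitation: roll out the student, query the privileged expert for the action it would have taken in each visited state, and regress onto it. A deliberately tiny 1-D sketch of that loop, with an invented coach and a linear student standing in for the real networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def coach_action(state):
    # Stand-in for the privileged coach: maps a (toy, scalar) state to a
    # continuous low-level action, saturated like a steering command.
    return np.clip(-0.5 * state, -1.0, 1.0)

# Linear student trained with dense on-policy supervision: visit a state,
# ask the coach what it would have done there, take a regression step.
w = 0.0
for _ in range(200):
    state = rng.uniform(-2, 2)           # state visited by the student
    target = coach_action(state)         # on-policy label from the coach
    pred = w * state
    w -= 0.05 * (pred - target) * state  # SGD step on squared error
```

Because the labels come from states the student itself visits, the supervision stays dense where the student actually drives; this is exactly what a human demonstrator cannot provide.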

    TehisnĂ€rvivĂ”rgud bioloogiliste andmete analĂŒĂŒsimiseks (Artificial Neural Networks for Analyzing Biological Data)

    (The electronic version of the thesis does not contain the publications.) Artificial neural networks (ANNs) are a machine learning algorithm that has gained popularity in recent years. Different subtypes of ANNs are used in various fields of computer science: convolutional networks are useful in object and face recognition systems, whereas recurrent neural networks are effective in speech recognition and natural language processing. These, however, are not the only possible applications of neural nets; in this thesis we demonstrated the benefits of ANNs in analyzing two biological datasets. First, we investigated whether, based only on the information contained within a DNA snippet, it is possible to predict that the snippet originates from a viral genome rather than from some other type of organism. Through two publications we demonstrated that machine learning algorithms can make this prediction, with convolutional neural networks (CNNs) proving the most accurate. The resulting tool allows virologists to identify as-yet-unknown viral species, which may have important effects on human health. The second biological dataset analyzed originates from neuroscience. The mammalian hippocampus contains so-called place cells, which activate only when the animal is in a specific location in space. We showed that recurrent neural networks (RNNs) can predict a rat's location with roughly 10 cm precision based on the activity of only a few dozen place cells. RNNs proved more effective than the Bayesian methods most commonly used in neuroscience: these networks use past neuronal activity as context that helps fine-tune the location predictions. In many other neural datasets, too, prior brain activity may reflect context that carries important information about current behaviour, so RNNs might turn out to be very useful in making sense of brain signals. Similarly, CNNs are likely to prove more efficient than currently used methods on many other bioinformatics datasets. We hope this thesis encourages more scientists to try neural networks on their own datasets.
    https://www.ester.ee/record=b536839
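The Bayesian baseline that the RNNs are compared against can be sketched as a memoryless maximum-a-posteriori decoder under independent Poisson firing; the tuning curves, bin layout, and window length below are invented for illustration, not taken from the thesis:

```python
import numpy as np

def bayes_decode(spike_counts, tuning, dt=0.2):
    """Memoryless Bayesian decoder with Poisson firing.
    tuning[c, x] is cell c's expected firing rate (Hz) at position bin x;
    spike_counts[c] is the count observed in one dt-second window.
    Returns the MAP position bin (flat prior; x-independent terms dropped)."""
    rates = tuning * dt  # expected spike counts per position bin
    log_post = (spike_counts[:, None] * np.log(rates) - rates).sum(axis=0)
    return int(np.argmax(log_post))

# Two toy place cells with firing-rate bumps at opposite position bins.
tuning = np.array([[20.0, 5.0, 1.0],
                   [1.0, 5.0, 20.0]])
bin_a = bayes_decode(np.array([4, 0]), tuning)  # bursting cell 0
bin_b = bayes_decode(np.array([0, 4]), tuning)  # bursting cell 1
```

The key contrast with an RNN decoder is visible in the signature: this decoder sees only one window of counts, whereas an RNN carries the preceding activity forward as context to sharpen its estimate.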

    Sequence specificity despite intrinsic disorder: How a disease-associated Val/Met polymorphism rearranges tertiary interactions in a long disordered protein

    The role of electrostatic interactions, and of mutations that change charge states, in intrinsically disordered proteins (IDPs) is well established, but many disease-associated mutations in IDPs are charge-neutral. The Val66Met single nucleotide polymorphism (SNP) in precursor brain-derived neurotrophic factor (BDNF) was one of the earliest SNPs to be associated with neuropsychiatric disorders, and the underlying molecular mechanism is unknown. Here we report on over 250 ÎŒs of fully atomistic, explicit-solvent, temperature replica-exchange molecular dynamics (MD) simulations of the 91-residue BDNF prodomain, for both the V66 and M66 sequences. The simulations correctly reproduced the location of both local and non-local secondary structure changes due to the Val66Met mutation, when compared with NMR spectroscopy. We find that the change in local structure is mediated via entropic and sequence-specific effects. We developed a hierarchical sequence-based framework for analysis and conceptualization, which first identifies blobs of 4-15 residues representing local globular regions or linkers. We use this framework within a novel test for enrichment of higher-order (tertiary) structure in disordered proteins: the size and shape of each blob are extracted from MD simulation of the real protein (RP) and used to parameterize a self-avoiding heterogeneous polymer (SAHP). The SAHP version of the BDNF prodomain suggested a protein segmented into three regions, with a central, long, highly disordered polyampholyte linker separating two globular regions. This effective segmentation was also observed in full simulations of the RP, but the Val66Met substitution significantly increased interactions across the linker, as well as the number of participating residues. The Val66Met substitution replaces ÎČ-bridging between V66 and V94 (on either side of the linker) with specific side-chain interactions between M66 and M95.
The protein backbone in the vicinity of M95 is then free to form ÎČ-bridges with residues 31-41 near the N-terminus, which condenses the protein. A significant role for Met/Met interactions is consistent with previously observed non-local effects of the Val66Met SNP, as well as with established interactions between the Met66 sequence and a Met-rich receptor that initiates neuronal growth cone retraction.
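The blob decomposition can be caricatured as segmenting the sequence into globular versus linker stretches from a per-residue score, splitting any stretch that exceeds the maximum blob length. The cutoff, the score, and the splitting rule below are assumptions for illustration, not the framework's actual definitions:

```python
def segment_blobs(score, cutoff=0.4, max_len=15):
    """Toy stand-in for the blob decomposition: group consecutive
    residues on the same side of a per-residue score cutoff into
    'globular' vs 'linker' stretches, capping blob length at max_len.
    Returns (start, end, kind) tuples with half-open residue ranges."""
    blobs, start = [], 0
    for i in range(1, len(score) + 1):
        boundary = i == len(score) or (
            (score[i] > cutoff) != (score[start] > cutoff))
        if boundary or i - start == max_len:
            kind = "globular" if score[start] > cutoff else "linker"
            blobs.append((start, i, kind))
            start = i
    return blobs

# A toy 15-residue profile: two compact stretches flanking a linker,
# echoing the globule-linker-globule segmentation seen for the prodomain.
h = [0.8] * 6 + [0.1] * 5 + [0.9] * 4
blobs = segment_blobs(h)
```

Once each blob's extent is known, its size and shape statistics from the MD trajectory are what parameterize the SAHP null model.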

    Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review

    In this paper, a critical bibliometric analysis is conducted, coupled with an extensive literature survey of recent developments and associated applications in machine learning research from an African perspective. The bibliometric analysis covers 2761 machine learning-related documents, of which 98% were articles with at least 482 citations, published in 903 journals during the past 30 years. The collated documents were retrieved from the Science Citation Index EXPANDED and comprise research publications from 54 African countries between 1993 and 2021. The study visualizes the current landscape and future trends in machine learning research and its applications, to facilitate future collaborative research and knowledge exchange among authors from different research institutions scattered across the African continent.

    Combining Synthesis of Cardiorespiratory Signals and Artifacts with Deep Learning for Robust Vital Sign Estimation

    Healthcare has been remarkably transformed by Big Data. While Machine Learning (ML) consolidates its place in simpler clinical chores, more complex Deep Learning (DL) algorithms have struggled to keep up, despite their superior capabilities. This is mainly attributed to the need for large amounts of training data, which the scientific community is unable to satisfy. The number of promising DL algorithms is considerable, but solutions directly targeting the data shortage are lacking. Currently, dynamical generative models are the best bet, but they focus on single, classical modalities and tend to grow significantly more complicated with the number of physiological effects they can simulate. This thesis aims to provide and validate a framework specifically addressing the data deficit in the scope of cardiorespiratory signals. First, a multimodal statistical synthesizer was designed to generate large, annotated artificial signals. By expressing data through coefficients of pre-defined, fitted functions and describing their dependence with Gaussian copulas, inter- and intra-modality associations were learned; new coefficients are then sampled to generate artificial, multimodal signals with the original physiological dynamics. Moreover, normal and pathological beats, along with artifacts, were included by employing Markov models. Second, a convolutional neural network (CNN) was conceived with a novel sensor-fusion architecture and trained with synthesized data under real-world experimental conditions to evaluate how its performance is affected. Both the synthesizer and the CNN not only performed at the state-of-the-art level but also innovated, with multiple types of generated data and detection-error improvements, respectively.
Cardiorespiratory data augmentation corrected performance drops when not enough data is available and enhanced the CNN's ability to perform on noisy signals and to carry out new tasks when introduced to otherwise unavailable types of data. Ultimately, the framework was successfully validated, showing potential to carry future DL research on cardiology into clinical standards.
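The copula step of such a synthesizer can be sketched as follows: fit a Gaussian copula to observed coefficient sets, then resample new sets that preserve both the marginals and the dependence structure. The toy "coefficients" and the Spearman-to-Pearson conversion are illustrative assumptions, not the thesis's actual fitting procedure:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def copula_resample(data, n_new):
    """Fit a Gaussian copula to observed coefficient sets and sample new,
    statistically consistent ones.
    data: (n, d) observed coefficients; returns (n_new, d) synthetic sets."""
    n, d = data.shape
    # Spearman rank correlation -> Gaussian-copula correlation matrix.
    ranks = np.argsort(np.argsort(data, axis=0), axis=0)
    rho_s = np.corrcoef(ranks, rowvar=False)
    corr = 2.0 * np.sin(np.pi * rho_s / 6.0)
    # Sample correlated normals, map to uniforms via the normal CDF,
    # then back to the empirical marginals of each coefficient.
    z = rng.multivariate_normal(np.zeros(d), corr, size=n_new)
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return np.column_stack(
        [np.quantile(data[:, j], u[:, j]) for j in range(d)])

# Two strongly coupled toy "coefficients" of a fitted signal model.
x = rng.normal(size=500)
data = np.column_stack([x, 2.0 * x + 0.1 * rng.normal(size=500)])
synth = copula_resample(data, 1000)
```

Because the dependence lives entirely in the copula, each synthetic coefficient keeps its original marginal distribution while the joint structure (here, the strong coupling between the two columns) is reproduced.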
    • 
