1,409 research outputs found

    Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

    In lifelong learning systems, especially those based on artificial neural networks, one of the biggest obstacles is the severe inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this article, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the immensely popular back-propagation of errors. Grounded in the neurocognitive theory of predictive processing, our model adapts its synapses in a biologically plausible fashion, while another, complementary neural system rapidly learns to direct and control this cortex-like structure by mimicking the task-executive control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting than standard neural models and outperforms a wide swath of previously proposed methods even though it is trained across task datasets in a stream-like fashion. The promising performance of our complementary system on benchmarks, e.g., Split MNIST, Split Fashion MNIST, and Split NotMNIST, offers evidence that by incorporating mechanisms prominent in real neuronal systems, such as competition, sparse activation patterns, and iterative input processing, a new possibility for tackling the grand challenge of lifelong machine learning opens up.
    Comment: Key updates including results on standard benchmarks, e.g., split mnist/fmnist/not-mnist. Task selection/basal ganglia model has been integrated
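The "Split" benchmarks named in this abstract are usually built by partitioning a dataset's classes into disjoint groups and presenting the groups one after another. The sketch below illustrates that construction only; it is not the authors' data pipeline, and the function name `split_into_tasks` and the default of two classes per task are assumptions made for illustration.

```python
def split_into_tasks(labels, classes_per_task=2):
    """Group sample indices into an ordered list of tasks by class label.

    Hypothetical helper: each task holds the indices of all samples whose
    label falls in that task's group of classes.
    """
    classes = sorted(set(labels))
    tasks = []
    for start in range(0, len(classes), classes_per_task):
        task_classes = classes[start:start + classes_per_task]
        indices = [i for i, y in enumerate(labels) if y in task_classes]
        tasks.append((task_classes, indices))
    return tasks


# Toy label vector standing in for a real dataset's labels.
labels = [0, 1, 2, 3, 0, 1, 2, 3, 4, 5]
stream = split_into_tasks(labels)
for task_classes, indices in stream:
    print(task_classes, indices)
```

A learner evaluated for forgetting would be trained on the tasks in `stream` in order and re-tested on earlier tasks after each new one.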

    Scalable Profiling and Visualization for Characterizing Microbiomes

    Metagenomics is the study of the combined genetic material found in microbiome samples, and it serves as an instrument for studying microbial communities, their biodiversities, and the relationships to their host environments. Creating, interpreting, and understanding microbial community profiles produced from microbiome samples is a challenging task as it requires large computational resources along with innovative techniques to process and analyze datasets that can contain terabytes of information. The community profiles are critical because they provide information about what microorganisms are present in the sample, and in what proportions. This is particularly important as many human diseases and environmental disasters are linked to changes in microbiome compositions. In this work we propose novel approaches for the creation and interpretation of microbial community profiles. This includes: (a) a cloud-based, distributed computational system that generates detailed community profiles by processing large DNA sequencing datasets against large reference genome collections, (b) the creation of Microbiome Maps: interpretable, high-resolution visualizations of community profiles, and (c) a machine learning framework for characterizing microbiomes from the Microbiome Maps that delivers deep insights into microbial communities. The proposed approaches have been implemented in three software solutions: Flint, a large scale profiling framework for commercial cloud systems that can process millions of DNA sequencing fragments and produces microbial community profiles at a very low cost; Jasper, a novel method for creating Microbiome Maps, which visualizes the abundance profiles based on the Hilbert curve; and Amber, a machine learning framework for characterizing microbiomes using the Microbiome Maps generated by Jasper with high accuracy. 
Results show that Flint scales well for reference genome collections that are an order of magnitude larger than those used by competing tools, while using less than a minute to profile a million reads on the cloud with 65 commodity processors. Microbiome Maps produced by Jasper are compact, scalable representations of extremely complex microbial community profiles with numerous demonstrable advantages, including the ability to display latent relationships that are hard to elicit. Finally, experiments show that by using images as input instead of unstructured tabular input, the carefully engineered software, Amber, can outperform other sophisticated machine learning tools available for classification of microbiomes.
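The abstract states that Jasper lays abundance profiles out along a Hilbert curve. A minimal sketch of that general idea follows, using the standard index-to-coordinate conversion for a Hilbert curve. How Jasper orders taxa, scales values, or colors cells is not described here, so `abundance_map` and its input format are hypothetical.

```python
def hilbert_d2xy(side, d):
    """Convert position d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def abundance_map(abundances, order):
    """Place a 1-D abundance profile on a 2^order x 2^order grid (assumed layout)."""
    side = 2 ** order
    if len(abundances) > side * side:
        raise ValueError("profile does not fit on the grid")
    grid = [[0.0] * side for _ in range(side)]
    for d, value in enumerate(abundances):
        x, y = hilbert_d2xy(side, d)
        grid[y][x] = value
    return grid


# Four made-up relative abundances on a 2 x 2 grid.
grid = abundance_map([0.4, 0.3, 0.2, 0.1], order=1)
print(grid)
```

The appeal of this layout is locality: entries that are neighbors in the 1-D profile stay neighbors in the image, so related taxa placed next to each other in the input form contiguous regions.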

    Specific-to-General Learning for Temporal Events with Application to Learning Event Definitions from Video

    We develop, analyze, and evaluate a novel, supervised, specific-to-general learner for a simple temporal logic and use the resulting algorithm to learn visual event definitions from video sequences. First, we introduce a simple, propositional, temporal, event-description language called AMA that is sufficiently expressive to represent many events yet sufficiently restrictive to support learning. We then give algorithms, along with lower and upper complexity bounds, for the subsumption and generalization problems for AMA formulas. We present a positive-examples-only specific-to-general learning method based on these algorithms. We also present a polynomial-time-computable "syntactic" subsumption test that implies semantic subsumption without being equivalent to it. A generalization algorithm based on syntactic subsumption can be used in place of semantic generalization to improve the asymptotic complexity of the resulting learning algorithm. Finally, we apply this algorithm to the task of learning relational event definitions from video and show that it yields definitions that are competitive with hand-coded ones.
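The specific-to-general pattern described above can be illustrated on a much simpler hypothesis language than AMA. In the sketch below a hypothesis is a plain conjunction of propositions with no temporal structure at all. The learner starts from the first positive example and generalizes only as far as each further positive example forces it to. The proposition names and the functions `generalize`, `subsumes`, and `learn` are hypothetical and are not the paper's algorithms.

```python
def generalize(hypothesis, example):
    """Least general generalization of two conjunctions: keep shared propositions."""
    return hypothesis & example


def subsumes(general, specific):
    """A conjunction covers another if every proposition it requires appears there."""
    return general <= specific


def learn(positive_examples):
    """Specific-to-general learning from positive examples only."""
    examples = [frozenset(e) for e in positive_examples]
    if not examples:
        raise ValueError("need at least one positive example")
    hypothesis = examples[0]  # start with the most specific hypothesis
    for example in examples[1:]:
        if not subsumes(hypothesis, example):
            hypothesis = generalize(hypothesis, example)
    return hypothesis


# Made-up observations of a "pick up" event.
observations = [
    {"hand_touches_block", "block_supported", "block_red"},
    {"hand_touches_block", "block_supported", "block_blue"},
]
definition = learn(observations)
print(sorted(definition))
```

For set-valued conjunctions, subsumption is a subset test and generalization is an intersection, which is why both are cheap here. The abstract's complexity bounds concern the far richer temporal formulas of AMA.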

    Representation Learning for Biological Sequence Data

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2021.8. Sungroh Yoon.
As we are living in the era of big data, the biomedical domain is no exception. With the advent of technologies such as next-generation sequencing, developing methods to capitalize on the explosion of biomedical data is one of the major challenges in bioinformatics. Representation learning, in particular deep learning, has made significant advances in diverse fields where the artificial intelligence community has struggled for many years. However, although representation learning has also shown great promise in bioinformatics, it is not a silver bullet. Off-the-shelf applications of representation learning cannot always provide successful results for biological sequence data, and many challenges and opportunities remain to be explored. This dissertation presents a set of representation learning methods to address three issues in biological sequence data analysis. First, we propose a two-stage training strategy to address throughput and information trade-offs within wet-lab CRISPR-Cpf1 activity experiments. Second, we propose an encoding scheme to model interaction between two sequences for functional microRNA target prediction. Third, we propose a self-supervised pre-training method to bridge the exponentially growing gap between the numbers of unlabeled and labeled protein sequences. In summary, this dissertation proposes a set of representation learning methods that can derive invaluable information from biological sequence data.
Contents:
1 Introduction
  1.1 Motivation
  1.2 Contents of Dissertation
2 Background
  2.1 Representation Learning
  2.2 Deep Neural Networks
    2.2.1 Multi-layer Perceptrons
    2.2.2 Convolutional Neural Networks
    2.2.3 Recurrent Neural Networks
    2.2.4 Transformers
  2.3 Training of Deep Neural Networks
  2.4 Representation Learning in Bioinformatics
  2.5 Biological Sequence Data Analyses
  2.6 Evaluation Metrics
3 CRISPR-Cpf1 Activity Prediction
  3.1 Methods
    3.1.1 Model Architecture
    3.1.2 Training of Seq-deepCpf1 and DeepCpf1
  3.2 Experiment Results
    3.2.1 Datasets
    3.2.2 Baselines
    3.2.3 Evaluation of Seq-deepCpf1
    3.2.4 Evaluation of DeepCpf1
  3.3 Summary
4 Functional microRNA Target Prediction
  4.1 Methods
    4.1.1 Candidate Target Site Selection
    4.1.2 Input Encoding
    4.1.3 Residual Network
    4.1.4 Post-processing
  4.2 Experiment Results
    4.2.1 Datasets
    4.2.2 Classification of Functional and Non-functional Targets
    4.2.3 Distinguishing High-functional Targets
    4.2.4 Ablation Studies
  4.3 Summary
5 Self-supervised Learning of Protein Representations
  5.1 Methods
    5.1.1 Pre-training Procedure
    5.1.2 Fine-tuning Procedure
    5.1.3 Model Architecture
  5.2 Experiment Results
    5.2.1 Experiment Setup
    5.2.2 Pre-training Results
    5.2.3 Fine-tuning Results
    5.2.4 Comparison with Larger Protein Language Models
    5.2.5 Ablation Studies
    5.2.6 Qualitative Interpretation Analyses
  5.3 Summary
6 Discussion
  6.1 Challenges and Opportunities
7 Conclusion
Bibliography
Abstract in Korean

    Towards Efficient Lifelong Machine Learning in Deep Neural Networks

    Humans continually learn and adapt to new knowledge and environments throughout their lifetimes. Rarely does learning new information cause humans to catastrophically forget previous knowledge. While deep neural networks (DNNs) now rival human performance on several supervised machine perception tasks, when updated on changing data distributions, they catastrophically forget previous knowledge. Enabling DNNs to learn new information over time opens the door for new applications such as self-driving cars that adapt to seasonal changes or smartphones that adapt to changing user preferences. In this dissertation, we propose new methods and experimental paradigms for efficiently training continual DNNs without forgetting. We then apply these methods to several visual and multi-modal perception tasks including image classification, visual question answering, analogical reasoning, and attribute and relationship prediction in visual scenes

    Discriminatively Trained Latent Ordinal Model for Video Classification

    We study the problem of video classification for facial analysis and human action recognition. We propose a novel weakly supervised learning method that models the video as a sequence of automatically mined, discriminative sub-events (e.g., onset and offset phase for "smile", running and jumping for "highjump"). The proposed model is inspired by recent work on Multiple Instance Learning and latent SVM/HCRF; it extends such frameworks to approximately model the ordinal aspect of videos. We obtain consistent improvements over relevant competitive baselines on four challenging and publicly available video-based facial analysis datasets for prediction of expression, clinical pain, and intent in dyadic conversations, and on three challenging human action datasets. We also validate the method with qualitative results and show that they largely support the intuitions behind the method.
    Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text overlap with arXiv:1604.0150
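One way to picture "a sequence of ordered sub-events" is as an assignment of each sub-event template to a frame, with the frames required to appear in order. The sketch below finds the highest-scoring such assignment by dynamic programming, given a precomputed score table. It covers only this inference step under a strict ordering assumption. The paper describes its ordinal modeling as approximate and learns the templates discriminatively, neither of which is shown here. The name `best_ordered_assignment` and the toy scores are hypothetical.

```python
def best_ordered_assignment(scores):
    """Pick one frame per sub-event, in increasing frame order, maximizing total score.

    scores[j][t] is the (assumed, precomputed) score of sub-event j at frame t.
    Returns (total_score, chosen_frames).
    """
    num_events = len(scores)
    num_frames = len(scores[0])
    if num_events > num_frames:
        raise ValueError("need at least one frame per sub-event")
    neg_inf = float("-inf")
    best = [[neg_inf] * num_frames for _ in range(num_events)]
    back = [[-1] * num_frames for _ in range(num_events)]
    best[0] = list(scores[0])
    for j in range(1, num_events):
        run_best, run_arg = neg_inf, -1
        for t in range(num_frames):
            # run_best is the best score of sub-event j-1 over frames before t.
            if run_arg >= 0:
                best[j][t] = scores[j][t] + run_best
                back[j][t] = run_arg
            if best[j - 1][t] > run_best:
                run_best, run_arg = best[j - 1][t], t
    last = max(range(num_frames), key=lambda t: best[-1][t])
    total = best[-1][last]
    frames = [last]
    for j in range(num_events - 1, 0, -1):
        last = back[j][last]
        frames.append(last)
    frames.reverse()
    return total, frames


# Two sub-events (e.g., onset then offset) scored on four frames.
toy_scores = [
    [1, 5, 2, 0],
    [4, 0, 3, 1],
]
total, frames = best_ordered_assignment(toy_scores)
print(total, frames)
```

In the toy table the best unordered choice would pair frame 1 for the first sub-event with frame 0 for the second. The ordering constraint rules that out, which is the behavior an ordinal model is meant to capture.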

    Toward adaptive heuristic video frames capturing and correction in real-time

    Multimedia devices are widely used in professional applications as well as for personal purposes. The use of computer vision systems enables detection and extraction of important features exposed in images. However, the constantly increasing demand for high-quality video of this kind requires simple yet reliable methods. The objective of the presented research is to investigate the applicability of a heuristic method for real-time video frame capture and correction.