
    Empiricism without Magic: Transformational Abstraction in Deep Convolutional Neural Networks

    In artificial intelligence, recent research has demonstrated the remarkable potential of Deep Convolutional Neural Networks (DCNNs), which seem to exceed state-of-the-art performance in new domains weekly, especially on the sorts of very difficult perceptual discrimination tasks that skeptics thought would remain beyond the reach of artificial intelligence. However, it has proven difficult to explain why DCNNs perform so well. In philosophy of mind, empiricists have long suggested that complex cognition is based on information derived from sensory experience, often appealing to a faculty of abstraction. Rationalists have frequently complained, however, that empiricists never adequately explained how this faculty of abstraction actually works. In this paper, I tie these two questions together, to the mutual benefit of both disciplines. I argue that the architectural features that distinguish DCNNs from earlier neural networks allow them to implement a form of hierarchical processing that I call "transformational abstraction". Transformational abstraction iteratively converts sensory-based representations of category exemplars into new formats that are increasingly tolerant to "nuisance variation" in input. Reflecting upon the way that DCNNs leverage a combination of linear and non-linear processing to efficiently accomplish this feat allows us to understand how the brain is capable of bi-directional travel between exemplars and abstractions, addressing longstanding problems in empiricist philosophy of mind. I end by considering the prospects for future research on DCNNs, arguing that transformational abstraction counts as a qualitatively distinct form of processing rather than simply an implementation of 80s connectionism with more brute-force computation. It is ripe with philosophical and psychological significance because it is significantly better suited to depict the generic mechanism responsible for this important kind of psychological processing in the brain.
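    The interplay of linear and non-linear processing described here can be illustrated with a minimal numpy sketch (an illustration of the general mechanism, not code from the paper): stacking convolution, ReLU, and max-pooling makes the representations of a pattern and of its shifted copy progressively more similar, i.e. increasingly tolerant to translation, one simple kind of nuisance variation.

```python
import numpy as np

def conv1d(x, w):
    """Valid 1-D convolution: the linear step."""
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)])

def relu(x):
    """Elementwise rectification: the non-linear step."""
    return np.maximum(x, 0.0)

def maxpool(x, k=2):
    """Downsampling that keeps the strongest response and discards
    its exact position within each window."""
    return np.array([x[i:i + k].max() for i in range(0, len(x) - k + 1, k)])

def layer(x, w):
    return maxpool(relu(conv1d(x, w)))

# A toy "edge" pattern and a shifted copy (nuisance variation: translation).
x  = np.array([0., 0., 1., 1., 0., 0., 0., 0.])
xs = np.roll(x, 1)          # same pattern, shifted by one position
w  = np.array([1., -1.])    # an edge-detecting filter

# After each conv -> ReLU -> pool layer, the two representations
# become more alike: the deep code abstracts away the shift.
r,  rs  = layer(x, w),  layer(xs, w)
r2, rs2 = layer(r, w), layer(rs, w)
print(np.abs(r - rs).sum(), np.abs(r2 - rs2).sum())
```

    The pooling step is what buys the tolerance: after two layers the two inputs map to the same code even though their shallow representations differ.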

    Short-term bitcoin market prediction via machine learning

    We analyze the predictability of the bitcoin market across prediction horizons ranging from 1 to 60 minutes. In doing so, we test various machine learning models and find that, while all models outperform a random classifier, recurrent neural networks and gradient boosting classifiers are especially well-suited for the examined prediction tasks. We use a comprehensive feature set, including technical, blockchain-based, sentiment-/interest-based, and asset-based features. Our results show that technical features remain most relevant for most methods, followed by selected blockchain-based and sentiment-/interest-based features. Additionally, we find that predictability increases for longer prediction horizons. Although a quantile-based long-short trading strategy generates monthly returns of up to 39% before transaction costs, it leads to negative returns after taking transaction costs into account, due to the particularly short holding periods.
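    The quantile-based long-short logic, and why transaction costs flip its sign at short holding periods, can be sketched as follows (synthetic data and cost level are illustrative assumptions, not figures from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model scores and realized 1-minute returns: the scores
# are weakly informative about the returns, mimicking a classifier
# that beats random but only slightly.
n = 10_000
signal  = rng.normal(size=n)
returns = 0.0002 * signal + rng.normal(scale=0.002, size=n)

def long_short(scores, rets, q=0.1, cost=0.0):
    """Quantile-based long-short strategy: go long the top quantile of
    scores, short the bottom quantile, and pay a per-trade cost."""
    lo, hi = np.quantile(scores, [q, 1 - q])
    pos = np.where(scores >= hi, 1.0, np.where(scores <= lo, -1.0, 0.0))
    traded = np.abs(pos) > 0
    return float(np.sum(pos * rets - traded * cost))

gross = long_short(signal, returns, cost=0.0)
net   = long_short(signal, returns, cost=0.001)   # assumed 10 bp per trade
print(gross, net)   # positive gross, negative net
```

    Because every 1-minute position is a fresh trade, the cost term scales with the number of trades while the edge per trade stays tiny, which is exactly the mechanism the abstract reports.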

    Knowledge Elicitation in Deep Learning Models

    Though a buzzword in modern problem-solving across various domains, deep learning presents a significant challenge: interpretability. This thesis journeys through a landscape of knowledge elicitation in deep learning models, shedding light on feature visualization, saliency maps, and model distillation techniques. These techniques were applied to two deep learning architectures: convolutional neural networks (CNNs) and a black-box package model (Google Vision). Our investigation provided valuable insights into their effectiveness in eliciting and interpreting the encoded knowledge. While they demonstrated potential, limitations were also observed, suggesting room for further development in this field. This work does not just highlight the need for more transparent, more explainable deep learning models; it gives a gentle nudge to developing innovative techniques to extract knowledge. It is all about ensuring responsible deployment and emphasizing the importance of transparency and comprehension in machine learning. In addition to evaluating existing methods, this thesis also explores the potential of combining multiple techniques to enhance the interpretability of deep learning models. A blend of feature visualization, saliency maps, and model distillation was used in a complementary manner to extract and interpret the knowledge from our chosen architectures. Experimental results highlight the utility of this combined approach, revealing a more comprehensive understanding of the models' decision-making processes.
    Furthermore, we propose a novel framework for systematic knowledge elicitation in deep learning, which cohesively integrates these methods. This framework showcases the value of a holistic approach toward model interpretability rather than relying on a single method. Lastly, we discuss the ethical implications of our work. As deep learning models continue to permeate various sectors, from healthcare to finance, ensuring their decisions are explainable and justified becomes increasingly crucial. Our research underscores this importance, laying the groundwork for creating more transparent, accountable AI systems in the future.
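    Of the techniques surveyed, saliency maps are the simplest to sketch: attribute the model's output to each input feature via the gradient. The toy model and weights below are illustrative stand-ins (not the thesis's CNN or Google Vision), and the gradient is approximated by central finite differences so the sketch stays self-contained:

```python
import numpy as np

# A tiny fixed "model": a linear scorer standing in for a trained
# network (weights are illustrative, not learned).
W = np.array([0.0, 2.0, -1.0, 0.5])

def model(x):
    return float(W @ x)

def saliency(f, x, eps=1e-6):
    """Finite-difference saliency map: |df/dx_i| per input feature,
    approximating the gradient-based saliency used for deep models."""
    grads = np.zeros_like(x)
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        grads[i] = (f(xp) - f(xm)) / (2 * eps)
    return np.abs(grads)

x = np.array([1.0, 1.0, 1.0, 1.0])
s = saliency(model, x)
print(s)   # largest where the model is most sensitive to the input
```

    For a real network one would use automatic differentiation rather than finite differences, but the interpretation is the same: the map ranks input features by how strongly they sway the prediction.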

    Modern Approaches to Dynamic Portfolio Optimization

    Although appealing from a theoretical point of view, empirical assessments of dynamic portfolio optimization in a mean-variance framework often fail to reach the high expectations set forth by analytical evaluations. A major reason for this shortfall is the imprecise estimation of asset moments, in particular the expected return. This work leverages recent advancements in the field of machine learning and employs three types of artificial neural networks in an attempt to improve the accuracy of asset return estimation and the associated expected portfolio performance. After an introduction to the dynamic portfolio optimization framework and the artificial neural networks, their suitability for the considered application is analyzed in a two-asset universe of a market asset and a risk-free asset. A comparison is subsequently drawn between the corresponding risk-return characteristics and those achieved using a more traditional exponentially weighted moving average estimator. While the artificial neural networks outperform for both daily and monthly estimated returns, significance can only be established in the latter case, especially in light of trading costs. Multiple robustness checks are performed before an outlook on subsequent research opportunities is given. Keywords: portfolio optimization; machine learning; multilayer perceptron; convolutional neural network; long short-term memory.
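    The traditional baseline and the two-asset allocation step can be sketched in a few lines (the return series, smoothing factor, risk aversion, and variance below are illustrative assumptions, not the thesis's data): an EWMA estimate of the expected return is plugged into the closed-form mean-variance weight on the risky asset.

```python
import numpy as np

def ewma(returns, lam=0.94):
    """Exponentially weighted moving average return estimate: the
    traditional estimator the neural networks are benchmarked against."""
    est = returns[0]
    for r in returns[1:]:
        est = lam * est + (1 - lam) * r
    return est

def risky_weight(mu, rf, sigma2, gamma=5.0):
    """Mean-variance optimal weight on the market asset in a two-asset
    universe (market + risk-free): w* = (mu - rf) / (gamma * sigma^2),
    with risk-aversion parameter gamma."""
    return (mu - rf) / (gamma * sigma2)

rets = np.array([0.01, -0.005, 0.02, 0.003, 0.007])  # illustrative returns
mu = ewma(rets)
w  = risky_weight(mu, rf=0.001, sigma2=0.0025)       # assumed 5% volatility
print(mu, w)
```

    The machine learning estimators in the thesis replace only the `mu` step; the allocation rule stays the same, which is why estimation accuracy drives the portfolio results.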

    Online failure prediction in air traffic control systems

    This thesis introduces a novel approach to online failure prediction for mission-critical distributed systems whose distinctive features are that it is black-box, non-intrusive, and online. The approach combines Complex Event Processing (CEP) and Hidden Markov Models (HMM) to analyze symptoms of impending failures, which manifest as anomalous conditions in performance metrics identified for this purpose. The thesis presents an architecture named CASPER, based on CEP and HMM, that relies solely on information sniffed from the communication network of a mission-critical system to predict anomalies that can lead to software failures. An instance of CASPER has been implemented, trained, and tuned to monitor a real Air Traffic Control (ATC) system developed by Selex ES, a Finmeccanica company. An extensive experimental evaluation of CASPER is presented. The obtained results show (i) a very low percentage of false positives under both normal and stress conditions, and (ii) a sufficiently long failure prediction time that allows the system to apply appropriate recovery procedures.
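    The HMM half of such a pipeline can be sketched with the forward algorithm: the CEP stage would emit discretized symptoms, and the filter below tracks the posterior probability of a failure-prone hidden state. All states, observations, and probabilities here are illustrative assumptions, not the values learned for the ATC system.

```python
import numpy as np

# A two-state HMM standing in for a failure model: hidden states
# "safe" (0) and "failure-prone" (1); observations are discretized
# symptoms (0 = normal metrics, 1 = anomalous metrics).
A  = np.array([[0.95, 0.05],    # state transition probabilities
               [0.10, 0.90]])
B  = np.array([[0.9, 0.1],      # P(observation | state)
               [0.2, 0.8]])
pi = np.array([0.99, 0.01])     # initial state distribution

def forward_filter(obs):
    """Normalized forward algorithm: P(hidden state | observations so
    far), the quantity a predictor thresholds to raise a failure alarm."""
    alpha = pi * B[:, obs[0]]
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        alpha /= alpha.sum()
    return alpha

# A run of mostly anomalous symptoms pushes belief toward "failure-prone",
# giving lead time before the failure itself occurs.
belief = forward_filter([0, 1, 1, 1])
print(belief)
```

    Thresholding `belief[1]` trades false positives against prediction lead time, the two quantities the evaluation in the thesis reports.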


    Literacy development: evidence review

    Literacy includes the word-level skills of word reading and spelling and the text-level skills of reading comprehension and writing composition. These skills are involved in virtually all everyday activities; as a result, poor literacy impacts every aspect of life. Word reading, spelling, reading comprehension, and writing composition are supported by similar language and cognitive skills, as well as by affective and environmental factors. Learning to be literate builds upon existing knowledge of language from speech, and becoming literate then enables children to learn more about language. However, literacy is unlikely to be achieved without explicit and prolonged instruction. This review provides an evidence base for decision-making in literacy education. We identify key skills that must be in place to enable children to reach their optimum potential and highlight where weaknesses can suggest a need for extra support. We begin by discussing models of literacy development, as these models provide a framework within which to present the evidence base for the rest of the review. We then consider the underlying skills in greater depth, beginning with the proximal factors that underpin word-level and text-level reading and writing, before turning to distal child-based and wider environmental factors that indirectly impact literacy development.