
    An Architecture-Altering and Training Methodology for Neural Logic Networks: Application in the Banking Sector

    Artificial neural networks are widely acknowledged for their ability to construct forecasting and classification systems. Among their desirable features has always been the interpretability of their structure, which aims to provide further knowledge to domain experts, and a number of methodologies have been developed for this purpose. One such paradigm is the neural logic network. Neural logic networks are designed so that their structure can be interpreted as a set of simple logical rules, and they can be seen as a network representation of a logical rule base. Although powerful by definition in this context, neural logic networks have performed poorly when trained from data. Standard training methods, such as back-propagation, require altering the network's synapse weights, which destroys the network's interpretability. The methodology in this paper overcomes these problems by proposing an architecture-altering technique that enables the production of highly competitive solutions while preserving all weight-related information. The implementation uses genetic programming with a grammar-guided training approach in order to produce arbitrarily large and connected neural logic networks. The methodology is tested on a problem from the banking sector with encouraging results.
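
    The core idea of architecture-altering search can be illustrated with a minimal sketch: candidate rule networks are represented as expression trees over logical primitives and only their structure is mutated, so no weight information is touched. The sketch below is a toy illustration under assumed banking features and a simple (1+1) evolutionary loop; it is not the paper's grammar-guided genetic programming system, and all names are hypothetical.

        import random

        FEATURES = ["high_income", "has_collateral", "late_payments"]   # hypothetical banking features

        def random_rule(depth=2):
            """Grow a random logical expression tree over the input features."""
            if depth <= 0 or random.random() < 0.3:
                return random.choice(FEATURES)
            op = random.choice(["AND", "OR", "NOT"])
            if op == "NOT":
                return ("NOT", random_rule(depth - 1))
            return (op, random_rule(depth - 1), random_rule(depth - 1))

        def evaluate(rule, sample):
            """Evaluate the expression tree on one boolean sample (dict: feature -> bool)."""
            if isinstance(rule, str):
                return sample[rule]
            op = rule[0]
            if op == "NOT":
                return not evaluate(rule[1], sample)
            vals = [evaluate(r, sample) for r in rule[1:]]
            return all(vals) if op == "AND" else any(vals)

        def mutate(rule, depth=2):
            """Architecture-altering mutation: replace a random subtree; weights untouched."""
            if isinstance(rule, str) or random.random() < 0.4:
                return random_rule(depth)
            return (rule[0],) + tuple(mutate(r, depth - 1) for r in rule[1:])

        def fitness(rule, data):
            return sum(evaluate(rule, x) == y for x, y in data) / len(data)

        # Toy dataset: approve credit iff collateral is present and no late payments (hypothetical rule).
        samples = [{f: random.random() < 0.5 for f in FEATURES} for _ in range(200)]
        data = [(s, s["has_collateral"] and not s["late_payments"]) for s in samples]

        best = random_rule()
        for _ in range(500):                      # simple (1+1) evolutionary loop
            cand = mutate(best)
            if fitness(cand, data) >= fitness(best, data):
                best = cand
        print(best, fitness(best, data))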

    Causal Discovery from Temporal Data: An Overview and New Perspectives

    Temporal data, representing chronological observations of complex systems, is a common data structure generated in many domains, such as industry, medicine and finance. Analyzing this type of data is extremely valuable for many applications, and various temporal data analysis tasks, e.g., classification, clustering and prediction, have been proposed over the past decades. Among them, causal discovery, learning the causal relations from temporal data, is considered an interesting yet critical task and has attracted much research attention. Existing causal discovery works can be divided into two highly correlated categories according to whether the temporal data is calibrated, i.e., multivariate time series causal discovery and event sequence causal discovery. However, most previous surveys focus only on time series causal discovery and ignore the second category. In this paper, we specify the correlation between the two categories and provide a systematic overview of existing solutions. Furthermore, we provide public datasets, evaluation metrics and new perspectives for temporal data causal discovery. Comment: 52 pages, 6 figures.
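
    One classical baseline in the time-series branch of this literature is pairwise Granger causality. A minimal sketch, assuming synthetic data and the statsmodels package, is shown below; it tests whether lagged values of one series improve the prediction of another, which is only one of the many approaches such a survey covers.

        import numpy as np
        from statsmodels.tsa.stattools import grangercausalitytests

        rng = np.random.default_rng(0)
        n = 500
        x = rng.normal(size=n)
        y = np.zeros(n)
        for t in range(1, n):                     # y is driven by lagged x, so x -> y by construction
            y[t] = 0.8 * x[t - 1] + 0.1 * rng.normal()

        # grangercausalitytests checks whether the second column helps predict the first.
        res = grangercausalitytests(np.column_stack([y, x]), maxlag=2)
        p_value = res[1][0]["ssr_ftest"][1]       # p-value of the F-test at lag 1
        print(f"p-value for 'x Granger-causes y' at lag 1: {p_value:.4f}")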

    Modelling causality in law

    The machine learning community's interest in causality has significantly increased in recent years. This trend has not yet become popular in AI & Law, but it should, because the current associative ML approach reveals certain limitations that causal analysis may overcome. This research aims to discover whether formal causal frameworks can be used in AI & Law. We proceed with a brief account of scholarship on reasoning and causality in science and in law. Traditionally, the normative frameworks for reasoning have been logic and rationality, but dual-process theory has shown that human decision-making depends on many factors that defy rationality. As such, statistics and probability were called upon to improve the prediction of decisional outcomes. In law, causal frameworks have been defined by landmark decisions, but most AI & Law models today do not involve causal analysis. We provide a brief summary of these models and then apply Judea Pearl's structural language and the Halpern-Pearl definitions of actual causality to model a few Canadian legal decisions that involve causality. The results suggest that it is not only possible to use formal causal models to describe legal decisions, but also useful, because a uniform schema eliminates ambiguity. Causal frameworks are also helpful in promoting accountability and minimizing biases.
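
    To illustrate the flavour of such modelling, a minimal sketch follows: a toy legal scenario written as structural equations in the spirit of Pearl's structural language, with a but-for (counterfactual) check performed by flipping the candidate cause while holding everything else fixed. The scenario and variable names are hypothetical, not one of the Canadian decisions analysed in the thesis.

        def model(negligent, third_party_act):
            """Structural equations for a toy scenario: a hazard arises if either the
            defendant was negligent or a third party acted; harm follows from the hazard."""
            hazard = negligent or third_party_act
            harm = hazard
            return harm

        # Actual world: the defendant was negligent, no third-party act, and harm occurred.
        print("harm in the actual world:", model(negligent=True, third_party_act=False))   # True

        # But-for test: keep the rest of the world fixed and remove only the negligence.
        print("harm had there been no negligence:",
              model(negligent=False, third_party_act=False))                               # False -> but-for cause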

    A Construction Kit for Efficient Low Power Neural Network Accelerator Designs

    Implementing embedded neural network processing at the edge requires efficient hardware acceleration that couples high computational performance with low power consumption. Driven by the rapid evolution of network architectures and their algorithmic features, accelerator designs are constantly updated and improved. To evaluate and compare hardware design choices, designers can refer to a myriad of accelerator implementations in the literature. Surveys provide an overview of these works but are often limited to system-level and benchmark-specific performance metrics, making it difficult to quantitatively compare the individual effect of each optimization technique. This complicates the evaluation of optimizations for new accelerator designs and slows down research progress. This work surveys the neural network accelerator optimization approaches used in recent works and reports their individual effects on edge processing performance. It presents the list of optimizations and their quantitative effects as a construction kit, allowing designers to assess the design choices for each building block separately. Reported optimizations range from memory savings of up to 10,000x to energy reductions of 33x, giving chip designers an overview of design choices for implementing efficient low-power neural network accelerators.
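
    As a small worked example of how one such building block translates into savings, the arithmetic below shows the memory footprint of a weight tensor at different precisions. The weight count and precisions are illustrative assumptions, not figures from the survey.

        # Illustrative arithmetic only: memory footprint of a weight tensor at different precisions.
        num_weights = 5_000_000                   # hypothetical CNN weight count
        for bits in (32, 8, 1):                   # FP32 baseline, INT8, binary weights
            megabytes = num_weights * bits / 8 / 1e6
            print(f"{bits:>2}-bit weights: {megabytes:6.2f} MB "
                  f"({32 / bits:.0f}x smaller than FP32)")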

    Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

    Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute- and memory-intensive, which makes them unsuitable for mW-devices such as IoT end-nodes. Aggressive quantization of these networks dramatically reduces their computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding the I/O bandwidth and system-level efficiency that are crucial for deployment of accelerators in ultra-low power devices. We present Hyperdrive: a BWN accelerator that dramatically reduces I/O bandwidth by exploiting a novel binary-weight streaming approach. It supports arbitrarily sized convolutional neural network architectures and input resolutions by exploiting the natural scalability of its compute units at both chip and system level, arranging Hyperdrive chips systolically in a 2D mesh that processes the entire feature map in parallel. Hyperdrive achieves a system-level efficiency of 4.3 TOp/s/W (i.e., including I/Os), 3.1x higher than state-of-the-art BWN accelerators, even though its core uses resource-intensive FP16 arithmetic for increased robustness.
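
    The binary-weight idea that such accelerators exploit can be sketched in a few lines: real-valued filters are replaced by their sign times a per-filter scaling factor, so each multiply degenerates to a sign flip and an add. The NumPy sketch below illustrates the general BWN scheme with made-up tensor shapes; it is not Hyperdrive's hardware implementation.

        import numpy as np

        rng = np.random.default_rng(0)
        w = rng.normal(size=(16, 3, 3, 3))          # hypothetical filters: (out_ch, in_ch, kH, kW)
        alpha = np.mean(np.abs(w), axis=(1, 2, 3), keepdims=True)   # per-filter scaling factor
        w_bin = alpha * np.sign(w)                  # each weight now costs 1 bit plus one shared scale

        x = rng.normal(size=(3, 8, 8))              # a 3-channel input feature map
        patch = x[:, 0:3, 0:3]                      # receptive field of one output pixel
        out = np.sum(w_bin[0] * patch)              # multiplications reduce to sign flips and adds
        print("binary-weight output:", float(out))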

    The Use of Artificial Intelligence in the Supply Chain Management in Finnish Large Enterprises

    Artificial intelligence (AI) offers considerable potential for the development of supply chain management (SCM). AI can operate at the strategic, tactical, and operational decision-making levels. Operational activities such as forecasting, production, and warehouse operations are the most common areas where AI is applied. The aim of increasing SCM value creation through AI is to achieve nearly perfect forecast accuracy and to reduce production costs. Customer needs can be met with sophisticated tools, and as markets become more competitive, companies must continuously reconsider their strategy. The purpose of artificial intelligence technology is to create solutions to problems and to fill human-made gaps. The supply chain is one of the most critical business functions and must operate smoothly and reliably. The aim of this research is to map out possible AI applications and to determine the maturity level of AI in SCM in large Finnish enterprises. The research is based on quantitative and qualitative analyses that illustrate the use of artificial intelligence in supply chain management. By measuring the maturity and automation levels of artificial intelligence, the research shows that both are lower than expected. The study identified which solutions large enterprises use in their supply chain management; the results show that they focus on demand forecasting, optimisation, and preparation. The research also found that companies are postponing the implementation of sophisticated artificial intelligence solutions until their big data maturity is sufficient. As a point of discussion, companies' fear of losing competitive advantage and their low maturity level led to limited data collection. A suggestion for future research is to examine this subject area once companies have confidently implemented advanced AI capabilities in their SCM and AI itself has become more intelligent.

    Counterfactual (Non-)identifiability of Learned Structural Causal Models

    Recent advances in probabilistic generative modeling have motivated learning Structural Causal Models (SCMs) from observational datasets using deep conditional generative models, also known as Deep Structural Causal Models (DSCMs). If successful, DSCMs can be utilized for causal estimation tasks, e.g., for answering counterfactual queries. In this work, we warn practitioners about the non-identifiability of counterfactual inference from observational data, even in the absence of unobserved confounding and assuming known causal structure. We prove counterfactual identifiability of monotonic generation mechanisms with single-dimensional exogenous variables. For general generation mechanisms with multi-dimensional exogenous variables, we provide an impossibility result for counterfactual identifiability, motivating the need for parametric assumptions. As a practical approach, we propose a method for estimating worst-case errors of learned DSCMs' counterfactual predictions. The size of this error can be an essential metric for deciding whether DSCMs are a viable approach for counterfactual inference in a specific problem setting. In our evaluation, the method confirms negligible counterfactual errors for an identifiable SCM from prior work, and also provides informative bounds on the counterfactual errors of a non-identifiable synthetic SCM.
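
    The identifiable case mentioned above can be made concrete with the standard abduction-action-prediction recipe: when the mechanism is monotonic (here, additive and hence invertible) in a one-dimensional exogenous variable, the noise is recovered exactly from the factual observation and pushed through the intervened model. The toy numbers and mechanism below are assumptions for illustration, not the paper's estimator.

        # Toy counterfactual query for a monotonic mechanism y = f(x) + u.
        def f(x):                         # structural mechanism for Y given its parent X
            return 2.0 * x

        x_obs, y_obs = 1.5, 3.7           # observed factual data point

        # Abduction: with an additive, invertible mechanism the exogenous noise is recovered exactly.
        u = y_obs - f(x_obs)

        # Action + prediction: intervene do(X = 0.5) and push the recovered noise forward.
        x_cf = 0.5
        y_cf = f(x_cf) + u
        print("counterfactual Y under do(X=0.5):", y_cf)   # 1.0 + 0.7 = 1.7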

    Discovering Causal Relations and Equations from Data

    Physics has traditionally used the scientific method to answer questions about why natural phenomena occur and to build testable models that explain them. Discovering equations, laws and principles that are invariant, robust and causal explanations of the world has been fundamental to the physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventional studies in the system under study. With the advent of big data and data-driven methods, the fields of causal and equation discovery have grown and made progress in computer science, physics, statistics, philosophy, and many applied fields. All these domains are intertwined and can be used to discover causal relations, physical laws, and equations from observational data. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for observational causal and equation discovery, point out connections, and showcase a complete set of case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is being revolutionised by the efficient exploitation of observational data, modern machine learning algorithms and interaction with domain knowledge. Exciting times are ahead, with many challenges and opportunities to improve our understanding of complex systems. Comment: 137 pages.
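
    One widely used family of data-driven equation-discovery methods fits a sparse combination of candidate terms to observed dynamics. The sketch below is a SINDy-style illustration on synthetic data with an assumed ground-truth law and an assumed term library; it stands in for the general idea rather than any specific method from the review.

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.uniform(-2, 2, size=200)
        dxdt = 0.5 * x - 1.2 * x**3 + 0.01 * rng.normal(size=x.size)   # hidden ground-truth law plus noise

        # Library of candidate terms, least-squares fit, then hard thresholding to enforce sparsity.
        library = np.column_stack([np.ones_like(x), x, x**2, x**3])
        names = ["1", "x", "x^2", "x^3"]
        coef, *_ = np.linalg.lstsq(library, dxdt, rcond=None)
        coef[np.abs(coef) < 0.1] = 0.0
        print("discovered model: dx/dt = " +
              " + ".join(f"{c:.2f}*{n}" for c, n in zip(coef, names) if c != 0.0))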