An Architecture-Altering and Training Methodology for Neural Logic Networks: Application in the Banking Sector
Artificial neural networks are widely acknowledged for their ability to construct forecasting and classification systems. Among their desirable features has always been the interpretability of their structure, which aims to provide further knowledge to domain experts, and a number of methodologies have been developed to this end. One such paradigm is the neural logic network. Neural logic networks are designed so that their structure can be interpreted as a set of simple logical rules; they can be seen as a network representation of a logical rule base. Although powerful by definition in this context, neural logic networks have performed poorly in approaches that require training from data. Standard training methods, such as back-propagation, alter the network's synapse weights, which destroys the network's interpretability. The methodology proposed in this paper overcomes these problems with an architecture-altering technique that produces highly competitive solutions without relying on weight alteration, thus preserving interpretability. The implementation uses grammar-guided genetic programming to produce arbitrarily large and connected neural logic networks. The methodology is tested on a problem from the banking sector with encouraging results.
Causal Discovery from Temporal Data: An Overview and New Perspectives
Temporal data, representing chronological observations of complex systems,
is a typical data structure widely generated in many domains, such as
industry, medicine and finance. Analyzing this type of data is extremely
valuable for various applications. Thus, different temporal data
analysis tasks, e.g., classification, clustering and prediction, have been
proposed in the past decades. Among them, causal discovery, learning the causal
relations from temporal data, is considered an interesting yet critical task
and has attracted much research attention. Existing causal discovery works can
be divided into two highly correlated categories according to whether the
temporal data is calibrated: multivariate time series causal discovery, and
event sequence causal discovery. However, most previous surveys focus only
on time series causal discovery and ignore the second category. In
this paper, we specify the correlation between the two categories and provide a
systematic overview of existing solutions. Furthermore, we provide public
datasets, evaluation metrics and new perspectives for temporal data causal
discovery.
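To make the time-series branch of this task concrete, here is a minimal sketch of a Granger-style predictive causality check (an illustration of the general idea, not code from the survey; the function names, the lag order `p`, and the simulated two-variable setup are assumptions for the sketch):

```python
import numpy as np

def lagged_design(x, y, p):
    # Build a regression matrix of p lags of y and p lags of x
    # for predicting y[t].
    T = len(y)
    rows, targets = [], []
    for t in range(p, T):
        rows.append(np.concatenate([y[t - p:t], x[t - p:t]]))
        targets.append(y[t])
    return np.array(rows), np.array(targets)

def granger_score(x, y, p=2):
    # Compare the residual sum of squares of predicting y from its own
    # lags versus its own lags plus lags of x. A large ratio suggests
    # x "Granger-causes" y: a predictive, not interventional, notion.
    X_full, t = lagged_design(x, y, p)
    X_restricted = X_full[:, :p]  # y's own lags only

    def rss(A, b):
        coef, *_ = np.linalg.lstsq(A, b, rcond=None)
        r = b - A @ coef
        return float(r @ r)

    return rss(X_restricted, t) / rss(X_full, t)

rng = np.random.default_rng(0)
cause = rng.normal(size=500)
effect = np.roll(cause, 1) * 0.9 + rng.normal(scale=0.1, size=500)
# The x -> y direction should score far higher than y -> x.
assert granger_score(cause, effect) > granger_score(effect, cause)
```

Event-sequence causal discovery, the survey's second category, replaces this regularly-sampled regression setup with point-process models, since events arrive at irregular times.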
Modelling causality in law
The machine learning community's interest in causality has significantly increased in recent years.
This trend has not yet become popular in AI & Law, but it should, because the current
associative ML approach reveals certain limitations that causal analysis may overcome. This
research aims to discover whether formal causal frameworks can be used in AI & Law.
We proceed with a brief account of scholarship on reasoning and causality in science and in law.
Traditionally, the normative frameworks for reasoning have been logic and rationality, but
dual-process theory has shown that human decision-making depends on many factors that defy
rationality. As such, statistics and probability were called upon to improve the prediction of
decisional outcomes. In law, causal frameworks have been defined by landmark decisions, but
most AI & Law models today do not involve causal analysis. We provide a brief summary of
these models and then apply Judea Pearl's structural language and the Halpern-Pearl
definitions of actual causality to model a few Canadian legal decisions that involve causality.
The results suggest that it is not only possible to use formal causal models to describe legal
decisions, but also useful, because a uniform schema eliminates ambiguity. Causal frameworks
are also helpful in promoting accountability and minimizing bias.
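To convey the flavour of such definitions, here is a minimal, hypothetical sketch of the simple "but-for" test in a two-variable structural model (the Halpern-Pearl definitions refine this test with contingencies precisely because, as the last line shows, plain but-for reasoning fails under preemption):

```python
# But-for test sketch: in a structural model, C is a "but-for" cause of E
# if flipping C, holding everything else fixed, flips E. This is only the
# simplest ingredient of the Halpern-Pearl definitions, which extend it
# to handle cases like preemption, where the plain but-for test fails
# even though C intuitively caused E.

def effect(c, backup):
    # E occurs if C occurs or an independent backup mechanism fires.
    return c or backup

def but_for(c, backup):
    # Counterfactual dependence: does flipping C change E?
    return effect(c, backup) != effect(not c, backup)

assert but_for(True, backup=False)     # no backup: C is a but-for cause of E
assert not but_for(True, backup=True)  # preemption: the but-for test fails
```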
A Construction Kit for Efficient Low Power Neural Network Accelerator Designs
Implementing embedded neural network processing at the edge requires
efficient hardware acceleration that couples high computational performance
with low power consumption. Driven by the rapid evolution of network
architectures and their algorithmic features, accelerator designs are
constantly updated and improved. To evaluate and compare hardware design
choices, designers can refer to a myriad of accelerator implementations in the
literature. Surveys provide an overview of these works but are often limited to
system-level and benchmark-specific performance metrics, making it difficult to
quantitatively compare the individual effect of each utilized optimization
technique. This complicates the evaluation of optimizations for new accelerator
designs, slowing down research progress. This work provides a survey of
neural network accelerator optimization approaches used in recent works
and reports their individual effects on edge processing performance. It
presents the list of optimizations and their quantitative effects as a
construction kit, allowing designers to assess the design choices for each
building block separately. Reported optimizations range from up to 10,000x
memory savings to 33x energy reductions, giving chip designers an overview
of design choices for implementing efficient low-power neural network
accelerators.
Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine
Deep neural networks have achieved impressive results in computer vision and
machine learning. Unfortunately, state-of-the-art networks are extremely
compute and memory intensive which makes them unsuitable for mW-devices such as
IoT end-nodes. Aggressive quantization of these networks dramatically reduces
the computation and memory footprint. Binary-weight neural networks (BWNs)
follow this trend, pushing weight quantization to the limit. Hardware
accelerators for BWNs presented up to now have focused on core efficiency,
disregarding I/O bandwidth and system-level efficiency that are crucial for
deployment of accelerators in ultra-low-power devices. We present Hyperdrive, a
BWN accelerator that dramatically reduces I/O bandwidth by exploiting a novel
binary-weight streaming approach. It can be used for arbitrarily sized
convolutional neural network architectures and input resolutions thanks to the
natural scalability of its compute units at both chip and system level:
Hyperdrive chips are arranged systolically in a 2D mesh and process the entire
feature map together in parallel. Hyperdrive achieves 4.3 TOp/s/W system-level
efficiency (i.e., including I/Os), 3.1x higher than state-of-the-art BWN
accelerators, even though its core uses resource-intensive FP16 arithmetic for
increased robustness.
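As a rough sketch of what binary-weight quantization means at the tensor level (a generic XNOR-Net-style scheme for illustration, not Hyperdrive's actual datapath; the per-row scaling choice is an assumption):

```python
import numpy as np

def binarize_weights(w):
    # Approximate w by alpha * sign(w), where alpha is the mean absolute
    # value per row (e.g., per output filter). Each row then needs only
    # 1 bit per weight plus a single scale factor, instead of a 32-bit
    # float per weight.
    alpha = np.mean(np.abs(w), axis=1, keepdims=True)
    return alpha * np.sign(w)

rng = np.random.default_rng(1)
w = rng.normal(size=(4, 16))
wb = binarize_weights(w)
# Every row of the binarized matrix takes only two values, +alpha and -alpha.
for row in wb:
    assert len(np.unique(np.abs(row))) == 1
```

Streaming such 1-bit weights, instead of full-precision ones, is what lets an accelerator like Hyperdrive cut its I/O bandwidth so aggressively.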
The Use of Artificial Intelligence in the Supply Chain Management in Finnish Large Enterprises
Artificial intelligence (AI) offers considerable potential for the development of supply chain management (SCM). AI can operate on the strategic, tactical, and operational decision-making levels; operational tasks such as forecasting, production and warehouse actions are the most common fields where AI is applied. The aim of increasing SCM value creation via AI is to reach near-perfect forecast accuracy and to decrease production costs. Customer needs can be met with sophisticated tools, and companies must continually reconsider their game plan as markets become more competitive. The purpose of AI technology is to create solutions for problems and to fill gaps left by human error. The supply chain is one of the most critical business functions and must operate smoothly and reliably. The aim of this research is to map out possible AI applications and to determine the maturity level of AI in SCM in large Finnish enterprises. The research is based on quantitative and qualitative analyses that illustrate the use of AI in SCM. By measuring the maturity and automation levels of AI, the research shows that both are lower than expected. The study examined which solutions large enterprises use in their SCM; the results show that they focus on demand forecasting, optimisation, and preparation. The research found that companies postpone implementing sophisticated AI solutions until their big data is mature enough. The companies' uncertainty about losing competitive advantage, together with the low maturity level, led to limited data collection. A suggestion for future research is to revisit this subject once companies have confidently implemented advanced AI technologies in their SCM and AI has become more intelligent.
Counterfactual (Non-)identifiability of Learned Structural Causal Models
Recent advances in probabilistic generative modeling have motivated learning
Structural Causal Models (SCM) from observational datasets using deep
conditional generative models, also known as Deep Structural Causal Models
(DSCM). If successful, DSCMs can be utilized for causal estimation tasks, e.g.,
for answering counterfactual queries. In this work, we warn practitioners about
non-identifiability of counterfactual inference from observational data, even
in the absence of unobserved confounding and assuming known causal structure.
We prove counterfactual identifiability for monotonic generation mechanisms with
one-dimensional exogenous variables. For general generation mechanisms with
multi-dimensional exogenous variables, we provide an impossibility result for
counterfactual identifiability, motivating the need for parametric assumptions.
As a practical approach, we propose a method for estimating worst-case errors
of learned DSCMs' counterfactual predictions. The size of this error can be an
essential metric for deciding whether or not DSCMs are a viable approach for
counterfactual inference in a specific problem setting. In evaluation, our
method confirms negligible counterfactual errors for an identifiable SCM from
prior work, and also provides informative error bounds on counterfactual errors
for a non-identifiable synthetic SCM.
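To illustrate the abduction-action-prediction recipe that underlies such counterfactual queries, here is a toy additive (hence monotonic) mechanism with a scalar exogenous variable, the identifiable case; the mechanism and values are invented for the sketch, not taken from the paper:

```python
import numpy as np

# A toy SCM: X := U_x, Y := 2*X + U_y. For monotonic (here additive)
# mechanisms with a one-dimensional exogenous variable, the counterfactual
# is identifiable: the observation pins down U_y exactly, so we can
# replay it under an intervention on X.

def abduct(x_obs, y_obs):
    # Abduction: invert the mechanism to recover the exogenous noise U_y.
    return y_obs - 2.0 * x_obs

def counterfactual_y(x_obs, y_obs, x_new):
    # Action + prediction: set X to x_new, reuse the abducted noise.
    u_y = abduct(x_obs, y_obs)
    return 2.0 * x_new + u_y

# Observed unit: X = 1, Y = 2.5, so U_y = 0.5.
y_cf = counterfactual_y(x_obs=1.0, y_obs=2.5, x_new=3.0)
assert np.isclose(y_cf, 6.5)  # 2*3 + 0.5
```

With a multi-dimensional U_y, many different noise values can produce the same observation, which is exactly the non-identifiability the paper warns about.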
Discovering Causal Relations and Equations from Data
Physics is a field of science that has traditionally used the scientific
method to answer questions about why natural phenomena occur and to make
testable models that explain the phenomena. Discovering equations, laws and
principles that are invariant, robust and causal explanations of the world has
been fundamental in physical sciences throughout the centuries. Discoveries
emerge from observing the world and, when possible, performing interventional
studies in the system under study. With the advent of big data and the use of
data-driven methods, causal and equation discovery fields have grown and made
progress in computer science, physics, statistics, philosophy, and many applied
fields. All these domains are intertwined and can be used to discover causal
relations, physical laws, and equations from observational data. This paper
reviews the concepts, methods, and relevant works on causal and equation
discovery in the broad field of Physics and outlines the most important
challenges and promising future lines of research. We also provide a taxonomy
for observational causal and equation discovery, point out connections, and
showcase a complete set of case studies in Earth and climate sciences, fluid
dynamics and mechanics, and the neurosciences. This review demonstrates that
discovering fundamental laws and causal relations by observing natural
phenomena is being revolutionised with the efficient exploitation of
observational data, modern machine learning algorithms and the interaction with
domain knowledge. Exciting times are ahead with many challenges and
opportunities to improve our understanding of complex systems.
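A minimal sketch of the data-driven equation-discovery idea: a SINDy-style sparse regression over a small candidate library (the library, the planted law dx/dt = -2x + 3y, and the threshold value are all assumptions made for this illustration):

```python
import numpy as np

# Recover a governing equation from data: regress the measured
# derivative onto a library of candidate terms, then threshold small
# coefficients so that a sparse, human-readable equation remains.

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = rng.normal(size=200)
dxdt = -2.0 * x + 3.0 * y  # "measured" derivative, noise-free here

library = np.column_stack([np.ones_like(x), x, y, x * y, x**2])
names = ["1", "x", "y", "x*y", "x^2"]

coef, *_ = np.linalg.lstsq(library, dxdt, rcond=None)
coef[np.abs(coef) < 0.1] = 0.0  # hard threshold enforces sparsity

terms = [f"{c:.2f}*{n}" for c, n in zip(coef, names) if c != 0.0]
print(" + ".join(terms))  # prints -2.00*x + 3.00*y
```

Real equation discovery must also contend with noisy derivatives, library choice, and confounding, which is where the causal-discovery machinery reviewed here comes in.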