A mathematical theory of semantic development in deep neural networks
An extensive body of empirical research has revealed remarkable regularities
in the acquisition, organization, deployment, and neural representation of
human semantic knowledge, thereby raising a fundamental conceptual question:
what are the theoretical principles governing the ability of neural networks to
acquire, organize, and deploy abstract knowledge by integrating across many
individual experiences? We address this question by mathematically analyzing
the nonlinear dynamics of learning in deep linear networks. We find exact
solutions to this learning dynamics that yield a conceptual explanation for the
prevalence of many disparate phenomena in semantic cognition, including the
hierarchical differentiation of concepts through rapid developmental
transitions, the ubiquity of semantic illusions between such transitions, the
emergence of item typicality and category coherence as factors controlling the
speed of semantic processing, changing patterns of inductive projection over
development, and the conservation of semantic similarity in neural
representations across species. Thus, surprisingly, our simple neural model
qualitatively recapitulates many diverse regularities underlying semantic
development, while providing analytic insight into how the statistical
structure of an environment can interact with nonlinear deep learning dynamics
to give rise to these regularities.
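A minimal sketch of the kind of exact solution the abstract refers to, under stated assumptions: for a single input-output mode of a linear network with one hidden layer and small balanced initial weights, the mode strength a(t) is commonly described by the logistic equation tau * da/dt = 2a(s - a), where s is the singular value of that mode in the training data. The function names, the time constant `tau`, and the initial value `a0` are illustrative choices, not taken from the listing above. The sketch compares the closed-form sigmoid with a simple Euler integration and shows that stronger modes make their transition earlier.

```python
import math


def mode_strength_closed_form(t, s, a0, tau=1.0):
    """Closed-form solution of tau * da/dt = 2 a (s - a) with a(0) = a0."""
    e = math.exp(2.0 * s * t / tau)
    return s * e / (e - 1.0 + s / a0)


def mode_strength_euler(t_end, s, a0, tau=1.0, dt=1e-4):
    """Forward-Euler integration of the same ODE, for comparison only."""
    a = a0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        a += dt * 2.0 * a * (s - a) / tau
    return a


def half_time(s, a0, tau=1.0):
    """Time at which a(t) reaches s / 2 (requires s > 2 * a0)."""
    return tau / (2.0 * s) * math.log(s / a0 - 1.0)


if __name__ == "__main__":
    a0 = 0.01
    for s in (1.0, 2.0, 4.0):
        print(f"s = {s}: half-time = {half_time(s, a0):.3f}")
```

Under these assumptions each mode stays near its small initial value for a while and then rises quickly to s, which is one way to picture the stage-like transitions the abstract mentions.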
Physical Primitive Decomposition
Objects are made of parts, each with distinct geometry, physics,
functionality, and affordances. Developing such a distributed, physical,
interpretable representation of objects will facilitate intelligent agents to
better explore and interact with the world. In this paper, we study physical
primitive decomposition---understanding an object through its components, each
with physical and geometric attributes. As annotated data for object parts and
physics are rare, we propose a novel formulation that learns physical
primitives by explaining both an object's appearance and its behaviors in
physical events. Our model performs well on block towers and tools in both
synthetic and real scenarios; we also demonstrate that visual and physical
observations often provide complementary signals. We further present ablation
and behavioral studies to better understand our model and contrast it with
human performance.
Comment: ECCV 2018. Project page: http://ppd.csail.mit.edu
Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations
Post-hoc explanations of machine learning models are crucial for people to
understand and act on algorithmic predictions. An intriguing class of
explanations is through counterfactuals, hypothetical examples that show people
how to obtain a different prediction. We posit that effective counterfactual
explanations should satisfy two properties: feasibility of the counterfactual
actions given user context and constraints, and diversity among the
counterfactuals presented. To this end, we propose a framework for generating
and evaluating a diverse set of counterfactual explanations based on
determinantal point processes. To evaluate the actionability of
counterfactuals, we provide metrics that enable comparison of
counterfactual-based methods to other local explanation methods. We further
address necessary tradeoffs and point to causal implications in optimizing for
counterfactuals. Our experiments on four real-world datasets show that our
framework can generate a set of counterfactuals that are diverse and well
approximate local decision boundaries, outperforming prior approaches to
generating diverse counterfactuals. We provide an implementation of the
framework at https://github.com/microsoft/DiCE.
Comment: 13 page
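A minimal sketch of how a determinantal-point-process style diversity score can be computed for a set of candidate counterfactuals. This is an illustration, not the DiCE implementation: the kernel 1 / (1 + distance), the L1 distance, and the function names are assumptions made for this example. The score is the determinant of the kernel matrix, which approaches 0 when candidates coincide and approaches 1 when they are far apart.

```python
def l1_distance(a, b):
    """Sum of absolute feature differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def dpp_diversity(points):
    """Determinant of the similarity kernel K[i][j] = 1 / (1 + dist(i, j))."""
    kernel = [[1.0 / (1.0 + l1_distance(p, q)) for q in points] for p in points]
    return determinant(kernel)


if __name__ == "__main__":
    near = [[0.0, 0.0], [0.1, 0.0]]
    spread = [[0.0, 0.0], [10.0, 10.0]]
    print("near:", dpp_diversity(near))
    print("spread:", dpp_diversity(spread))
```

In a full method such a score would typically be traded off against proximity to the original input and feasibility constraints; that weighting is left out of this sketch.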
CompILE: Compositional Imitation Learning and Execution
We introduce Compositional Imitation Learning and Execution (CompILE): a
framework for learning reusable, variable-length segments of
hierarchically-structured behavior from demonstration data. CompILE uses a
novel unsupervised, fully-differentiable sequence segmentation module to learn
latent encodings of sequential data that can be re-composed and executed to
perform new tasks. Once trained, our model generalizes to sequences of longer
length and from environment instances not seen during training. We evaluate
CompILE in a challenging 2D multi-task environment and a continuous control
task, and show that it can find correct task boundaries and event encodings in
an unsupervised manner. Latent codes and associated behavior policies
discovered by CompILE can be used by a hierarchical agent, where the high-level
policy selects actions in the latent code space, and the low-level,
task-specific policies are simply the learned decoders. We found that our
CompILE-based agent could learn given only sparse rewards, where agents without
task-specific policies struggle.
Comment: ICML (2019
Unmasking Clever Hans Predictors and Assessing What Machines Really Learn
Current learning machines have successfully solved hard application problems,
reaching high accuracy and displaying seemingly "intelligent" behavior. Here we
apply recent techniques for explaining decisions of state-of-the-art learning
machines and analyze various tasks from computer vision and arcade games. This
showcases a spectrum of problem-solving behaviors ranging from naive and
short-sighted, to well-informed and strategic. We observe that standard
performance evaluation metrics can be oblivious to distinguishing these diverse
problem solving behaviors. Furthermore, we propose our semi-automated Spectral
Relevance Analysis that provides a practically effective way of characterizing
and validating the behavior of nonlinear learning machines. This helps to
assess whether a learned model indeed delivers reliably for the problem that it
was conceived for. Furthermore, our work intends to add a voice of caution to
the ongoing excitement about machine intelligence and pledges to evaluate and
judge some of these recent successes in a more nuanced manner.
Comment: Accepted for publication in Nature Communication
Non-classical measurement theory: a framework for behavioral sciences
Instances of non-commutativity are pervasive in human behavior. In this paper, we suggest that psychological properties such as attitudes, values, preferences and beliefs may be suitably described in terms of the mathematical formalism of quantum mechanics. We expose the foundations of non-classical measurement theory building on a simple notion of orthospace and ortholattice (logic). Two axioms are formulated and the characteristic state-property duality is derived. A last axiom concerned with the impact of measurements on the state takes us with a leap toward the Hilbert space model of Quantum Mechanics. An application to behavioral sciences is proposed. First, we suggest an interpretation of the axioms and basic properties for human behavior. Then we explore an application to decision theory in an example of preference reversal. We conclude by formulating basic ingredients of a theory of actualized preferences based on non-classical measurement theory.
Keywords: non-classical measurement ; orthospace ; state ; properties ; non-commutativity
More cat than cute? Interpretable Prediction of Adjective-Noun Pairs
The increasing availability of affect-rich multimedia resources has bolstered
interest in understanding sentiment and emotions in and from visual content.
Adjective-noun pairs (ANP) are a popular mid-level semantic construct for
capturing affect via visually detectable concepts such as "cute dog" or
"beautiful landscape". Current state-of-the-art methods approach ANP prediction
by considering each of these compound concepts as individual tokens, ignoring
the underlying relationships in ANPs. This work aims at disentangling the
contributions of the `adjectives' and `nouns' in the visual prediction of ANPs.
Two specialised classifiers, one trained for detecting adjectives and another
for nouns, are fused to predict 553 different ANPs. The resulting ANP
prediction model is more interpretable as it allows us to study contributions
of the adjective and noun components. Source code and models are available at
https://imatge-upc.github.io/affective-2017-musa2/ .
Comment: Oral paper at ACM Multimedia 2017 Workshop on Multimodal
Understanding of Social, Affective and Subjective Attributes (MUSA2
- …