Learning Hierarchical Visual Representations in Deep Neural Networks Using Hierarchical Linguistic Labels
Modern convolutional neural networks (CNNs) are able to achieve human-level
object classification accuracy on specific tasks, and currently outperform
competing models in explaining complex human visual representations. However,
the categorization problem is posed differently for these networks than for
humans: the accuracy of these networks is evaluated by their ability to
identify single labels assigned to each image. These labels often cut
arbitrarily across natural psychological taxonomies (e.g., dogs are separated
into breeds, but never jointly categorized as "dogs"), and bias the resulting
representations. By contrast, it is common for children to hear both "dog" and
"Dalmatian" to describe the same stimulus, helping to group perceptually
disparate objects (e.g., breeds) into a common mental class. In this work, we
train CNN classifiers with multiple labels for each image that correspond to
different levels of abstraction, and use this framework to reproduce classic
patterns that appear in human generalization behavior.

Comment: 6 pages, 4 figures, 1 table. Accepted as a paper to the 40th Annual Meeting of the Cognitive Science Society (CogSci 2018).
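To make the training setup concrete, below is a minimal PyTorch sketch of the multi-label idea: one shared convolutional trunk with a classification head per taxonomy level, and a loss summed across levels so each image is supervised at every level of abstraction it is labeled with. The architecture, label counts, and names (`MultiLevelCNN`, `fine_head`, `coarse_head`) are illustrative assumptions, not the authors' configuration.

```python
# Sketch only: a CNN trained with labels at two levels of a hypothetical
# taxonomy (e.g., "Dalmatian" at the fine level, "dog" at the coarse level).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLevelCNN(nn.Module):
    def __init__(self, n_fine=120, n_coarse=10):
        super().__init__()
        self.features = nn.Sequential(          # toy convolutional trunk
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fine_head = nn.Linear(64, n_fine)      # subordinate labels (breeds)
        self.coarse_head = nn.Linear(64, n_coarse)  # basic-level labels ("dog")

    def forward(self, x):
        h = self.features(x)
        return self.fine_head(h), self.coarse_head(h)

model = MultiLevelCNN()
x = torch.randn(8, 3, 64, 64)               # dummy batch of images
y_fine = torch.randint(0, 120, (8,))        # fine-grained labels
y_coarse = torch.randint(0, 10, (8,))       # superordinate labels
logits_fine, logits_coarse = model(x)
# Each image contributes a loss term at every level of abstraction.
loss = (F.cross_entropy(logits_fine, y_fine)
        + F.cross_entropy(logits_coarse, y_coarse))
loss.backward()
```

Because both heads share one representation, gradients from the coarse label pull perceptually disparate breeds toward a common region of feature space, which is the mechanism the abstract describes.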
Discovering the Compositional Structure of Vector Representations with Role Learning Networks
How can neural networks perform so well on compositional tasks even though
they lack explicit compositional representations? We use a novel analysis
technique called ROLE to show that recurrent neural networks perform well on
such tasks by converging to solutions which implicitly represent symbolic
structure. This method uncovers a symbolic structure which, when properly
embedded in vector space, closely approximates the encodings of a standard
seq2seq network trained to perform the compositional SCAN task. We verify the
causal importance of the discovered symbolic structure by showing that, when we
systematically manipulate hidden embeddings based on this symbolic structure,
the model's output is changed in the way predicted by our analysis.
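As a rough illustration of the kind of symbolic structure involved, the sketch below builds a tensor-product representation: each symbol (filler) is bound to a role by an outer product, the bindings are summed, and a linear map projects the result into the encoder's hidden space. All dimensions, embeddings, and the read-out map `W` are hypothetical stand-ins, not values from the paper, and the role assignments here are fixed rather than learned as ROLE does.

```python
# Sketch only: approximating a sequence encoding as a sum of
# filler (x) role tensor products, mapped linearly into encoder space.
import torch

d_fill, d_role, d_enc = 16, 8, 64
n_symbols, n_roles = 20, 8

filler_emb = torch.randn(n_symbols, d_fill)   # one vector per symbol
role_emb = torch.randn(n_roles, d_role)       # one vector per structural role
W = torch.randn(d_enc, d_fill * d_role)       # TPR space -> encoder space

def tpr_encoding(symbols, roles):
    """Sum of outer products filler_i (x) role_i, flattened and projected."""
    tpr = torch.zeros(d_fill, d_role)
    for s, r in zip(symbols, roles):
        tpr += torch.outer(filler_emb[s], role_emb[r])
    return W @ tpr.flatten()

# If the analysis is right, this approximation sits close to the RNN's actual
# hidden state; the causal test then edits the TPR (e.g., swaps a filler) and
# checks that the seq2seq decoder's output changes as the symbols predict.
approx = tpr_encoding(symbols=[3, 7, 1], roles=[0, 1, 2])
print(approx.shape)  # torch.Size([64])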
Differentiable Tree Operations Promote Compositional Generalization
In the context of structure-to-structure transformation tasks, learning
sequences of discrete symbolic operations poses significant challenges due to
their non-differentiability. To facilitate the learning of these symbolic
sequences, we introduce a differentiable tree interpreter that compiles
high-level symbolic tree operations into subsymbolic matrix operations on
tensors. We present a novel Differentiable Tree Machine (DTM) architecture that
integrates our interpreter with an external memory and an agent that learns to
sequentially select tree operations to execute the target transformation in an
end-to-end manner. With respect to out-of-distribution compositional
generalization on synthetic semantic parsing and language generation tasks, DTM
achieves 100% accuracy, while existing baselines such as Transformer, Tree Transformer, LSTM, and Tree2Tree LSTM achieve less than 30%. In addition to this perfect performance, DTM remains highly interpretable.

Comment: ICML 2023. Code available at https://github.com/psoulos/dt
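The core trick, compiling symbolic tree operations into matrix algebra so they become differentiable, can be sketched for a binary tree truncated at depth 2 with one-hot positional roles. The Lisp-style names car/cdr/cons follow the paper's framing; the truncation depth, dimensions, and one-hot role scheme are simplifying assumptions made here, not the DTM implementation.

```python
# Sketch only: car/cdr/cons on tensor-product tree representations,
# each realized as a fixed linear map (a routing matrix over positions).
import torch

d_fill, n_pos = 8, 7   # positions: 0=root, 1=L, 2=R, 3=LL, 4=LR, 5=RL, 6=RR
LEFT = [(0, 1), (1, 3), (2, 4)]    # (position in subtree, position in parent)
RIGHT = [(0, 2), (1, 5), (2, 6)]

def routing_matrix(pairs):
    M = torch.zeros(n_pos, n_pos)
    for sub, full in pairs:
        M[sub, full] = 1.0
    return M

D_CAR, D_CDR = routing_matrix(LEFT), routing_matrix(RIGHT)

def car(tree):   # extract left subtree: pure matrix multiplication
    return tree @ D_CAR.T

def cdr(tree):   # extract right subtree
    return tree @ D_CDR.T

def cons(left, right, root_filler):
    # attach two subtrees under a new root, again with only linear ops
    tree = left @ D_CAR + right @ D_CDR
    tree[:, 0] = root_filler
    return tree

# A leaf is a filler in the root position; cons builds, car/cdr take apart.
a, b = torch.randn(d_fill), torch.randn(d_fill)
leaf = lambda f: torch.cat([f.unsqueeze(1), torch.zeros(d_fill, n_pos - 1)], dim=1)
t = cons(leaf(a), leaf(b), root_filler=torch.zeros(d_fill))
assert torch.allclose(car(t)[:, 0], a) and torch.allclose(cdr(t)[:, 0], b)
```

Because every operation is a matrix product, an agent can select among them with soft attention weights and still receive gradients end to end, which is what makes the symbolic sequence learnable.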
Disentangled deep generative models reveal coding principles of the human face processing network.
Despite decades of research, much is still unknown about the computations carried out in the human face processing network. Recently, deep networks have been proposed as a computational account of human visual processing, but while they provide a good match to neural data throughout visual cortex, they lack interpretability. We introduce a method for interpreting brain activity using a new class of deep generative models, disentangled representation learning models, which learn a low-dimensional latent space that "disentangles" different semantically meaningful dimensions of faces, such as rotation, lighting, or hairstyle, in an unsupervised manner by enforcing statistical independence between dimensions. We find that the majority of our model's learned latent dimensions are interpretable by human raters. Further, these latent dimensions serve as a good encoding model for human fMRI data. We next investigate the representation of different latent dimensions across face-selective voxels. We find that low- and high-level face features are represented in posterior and anterior face-selective regions, respectively, corroborating prior models of human face recognition. Interestingly, though, we find both identity-relevant and identity-irrelevant face features across the face processing network. Finally, we provide new insight into the few "entangled" (uninterpretable) dimensions in our model by showing that they match responses in the ventral stream and carry information about facial identity. Disentangled face encoding models provide an exciting alternative to standard "black box" deep learning approaches for modeling and interpreting human brain data.
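A minimal sketch of the encoding-model step described above: latent codes of the stimuli are regressed onto voxel responses with cross-validated ridge regression, and per-voxel prediction accuracy can then be compared across regions. The data below are synthetic placeholders standing in for the model's latents and the fMRI responses; only the analysis pattern mirrors the abstract.

```python
# Sketch only: latent dimensions of a disentangled generative model
# used as regressors for voxelwise fMRI responses.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_latents, n_voxels = 200, 32, 500
Z = rng.normal(size=(n_images, n_latents))   # latent codes of face stimuli
Y = (Z @ rng.normal(size=(n_latents, n_voxels))
     + rng.normal(scale=0.5, size=(n_images, n_voxels)))
# ^ synthetic voxel responses; in practice Y comes from face-selective ROIs

Z_tr, Z_te, Y_tr, Y_te = train_test_split(Z, Y, test_size=0.25, random_state=0)
model = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(Z_tr, Y_tr)
pred = model.predict(Z_te)

# Per-voxel prediction accuracy; mapping these scores across posterior vs.
# anterior face regions is how low- vs. high-level feature claims get tested.
r = [np.corrcoef(pred[:, v], Y_te[:, v])[0, 1] for v in range(n_voxels)]
print(f"median voxel correlation: {np.median(r):.2f}")
```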