144 research outputs found
Compositionality and Concepts in Linguistics and Psychology
cognitive science; semantics; language
Complex Politics: A Quantitative Semantic and Topological Analysis of UK House of Commons Debates
This study is a first, exploratory attempt to use quantitative semantics
techniques and topological analysis to analyze systemic patterns arising in a
complex political system. In particular, we use a rich data set covering all
speeches and debates in the UK House of Commons between 1975 and 2014. Using
dynamic topic modeling (DTM) and topological data analysis (TDA), we show
that both members and parties feature specific roles within the system,
consistent over time, and extract global patterns indicating levels of
political cohesion. Our results provide a wide array of novel hypotheses about
the complex dynamics of political systems, with valuable policy applications.
Computer-aided biomimetics : semi-open relation extraction from scientific biological texts
Engineering inspired by biology – recently termed biom* – has led to various ground-breaking technological developments. Example areas of application include aerospace
engineering and robotics. However, biom* is not always successful and is only sporadically applied in industry. The reason is that a systematic approach to biom* remains
elusive, despite the existence of a plethora of methods and design tools. In recent
years computational tools have been proposed as well, which can potentially support
a systematic integration of relevant biological knowledge during biom*. However,
these so-called Computer-Aided Biom* (CAB) tools have not been able to fill all
the gaps in the biom* process. This thesis investigates why existing CAB tools
fail, proposes a novel approach – based on Information Extraction – and develops a
proof-of-concept for a CAB tool that does enable a systematic approach to biom*.
Key contributions include: 1) a disquisition of existing tools that guides the selection of a strategy for systematic CAB, 2) a dataset of 1,500 manually annotated
sentences, 3) a novel Information Extraction approach that combines the outputs
from a supervised Relation Extraction system and an existing Open Information
Extraction system. The implemented exploratory approach indicates that it is possible to extract a focused selection of relations from scientific texts with reasonable
accuracy, without imposing limitations on the types of information extracted. Furthermore, the tool developed in this thesis is shown to i) speed up trade-off analyses
by domain experts, and ii) improve access to biological information for non-experts.
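The combination step described above, merging the output of a supervised Relation Extraction system with that of an Open Information Extraction system, is not specified in detail in the abstract. A minimal hedged sketch of one plausible merge policy (prefer the supervised system, fall back to Open IE for argument pairs it did not cover; all triples below are invented examples, not the thesis's data):

```python
# Hedged sketch: keep every high-precision supervised triple; add Open IE
# triples only for (subject, object) pairs the supervised system missed.
def merge_relations(supervised, open_ie):
    covered = {(s, o) for s, _, o in supervised}
    merged = list(supervised)
    merged += [t for t in open_ie if (t[0], t[2]) not in covered]
    return merged

# Invented example triples.
supervised = [("gecko feet", "adhere_to", "smooth surfaces")]
open_ie = [
    ("gecko feet", "stick to", "smooth surfaces"),   # pair already covered: dropped
    ("setae", "generate", "van der Waals forces"),   # new pair: kept
]

merged = merge_relations(supervised, open_ie)
print(merged)
```

This preserves the precision of the supervised extractor while letting Open IE contribute relation types the supervised schema does not anticipate, which matches the "semi-open" framing of the title.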
Grounded Semantic Composition for Visual Scenes
We present a visually-grounded language understanding model based on a study
of how people verbally describe objects in scenes. The emphasis of the model is
on the combination of individual word meanings to produce meanings for complex
referring expressions. The model has been implemented, and it is able to
understand a broad range of spatial referring expressions. We describe our
implementation of word level visually-grounded semantics and their embedding in
a compositional parsing framework. The implemented system selects the correct
referents in response to natural language expressions for a large percentage of
test cases. In an analysis of the system's successes and failures we reveal how
visual context influences the semantics of utterances and propose future
extensions to the model that take such context into account.
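The abstract's core idea, composing word-level grounded semantics to resolve referring expressions, can be illustrated with a minimal hedged sketch (not the authors' implementation): each word denotes a filter over scene objects, and an expression is interpreted by composing those filters.

```python
# Hedged sketch of grounded semantic composition: the scene, lexicon, and
# expression are invented; the authors' parser and lexicon are far richer.
scene = [
    {"id": 1, "color": "red",  "x": 0},
    {"id": 2, "color": "blue", "x": 5},
    {"id": 3, "color": "red",  "x": 9},
]

# Each word denotes a function from candidate objects to a subset of them.
lexicon = {
    "red":      lambda objs: [o for o in objs if o["color"] == "red"],
    "blue":     lambda objs: [o for o in objs if o["color"] == "blue"],
    "leftmost": lambda objs: [min(objs, key=lambda o: o["x"])],
}

def interpret(words, scene):
    candidates = list(scene)
    for w in reversed(words):   # compose right-to-left: restrict before selecting
        candidates = lexicon[w](candidates)
    return candidates

# "leftmost red": first restrict to red objects, then take the leftmost one.
print(interpret(["leftmost", "red"], scene))
```

Even this caricature shows why composition order matters: "leftmost red" must restrict to red objects before selecting, which is the kind of interaction between word meanings the model is built to capture.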
A Defense of Pure Connectionism
Connectionism is an approach to neural-networks-based cognitive modeling that encompasses the recent deep learning movement in artificial intelligence. It came of age in the 1980s, with its roots in cybernetics and earlier attempts to model the brain as a system of simple parallel processors. Connectionist models center on statistical inference within neural networks with empirically learnable parameters, which can be represented as graphical models. More recent approaches focus on learning and inference within hierarchical generative models. Contra influential and ongoing critiques, I argue in this dissertation that the connectionist approach to cognitive science possesses in principle (and, as is becoming increasingly clear, in practice) the resources to model even the most rich and distinctly human cognitive capacities, such as abstract, conceptual thought and natural language comprehension and production.
Consonant with much previous philosophical work on connectionism, I argue that a core principle—that proximal representations in a vector space have similar semantic values—is the key to a successful connectionist account of the systematicity and productivity of thought, language, and other core cognitive phenomena. My work here differs from preceding work in philosophy in several respects: (1) I compare a wide variety of connectionist responses to the systematicity challenge and isolate two main strands that are both historically important and reflected in ongoing work today: (a) vector symbolic architectures and (b) (compositional) vector space semantic models; (2) I consider very recent applications of these approaches, including their deployment on large-scale machine learning tasks such as machine translation; (3) I argue, again on the basis mostly of recent developments, for a continuity in representation and processing across natural language, image processing and other domains; (4) I explicitly link broad, abstract features of connectionist representation to recent proposals in cognitive science similar in spirit, such as hierarchical Bayesian and free energy minimization approaches, and offer a single rebuttal of criticisms of these related paradigms; (5) I critique recent alternative proposals that argue for a hybrid Classical (i.e. serial symbolic)/statistical model of mind; (6) I argue that defending the most plausible form of a connectionist cognitive architecture requires rethinking certain distinctions that have figured prominently in the history of the philosophy of mind and language, such as that between word- and phrase-level semantic content, and between inference and association.
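Strand (a), vector symbolic architectures, can be made concrete. The following hedged sketch shows standard holographic reduced representations (not code from the dissertation): a role vector is bound to a filler vector by circular convolution, and an approximate inverse recovers the filler from the bound trace.

```python
# Hedged sketch of a vector symbolic architecture (holographic reduced
# representations): bind role and filler with circular convolution; unbinding
# with the role's approximate inverse recovers a noisy copy of the filler.
import numpy as np

rng = np.random.default_rng(0)
n = 2048

def rand_vec():
    # Elements drawn N(0, 1/n), the standard HRR convention.
    return rng.normal(0, 1 / np.sqrt(n), n)

def bind(a, b):
    # Circular convolution, computed via the FFT.
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n)

def unbind(trace, cue):
    # Approximate inverse (involution): reverse all elements but the first.
    inv = np.concatenate(([cue[0]], cue[:0:-1]))
    return bind(trace, inv)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

role, filler = rand_vec(), rand_vec()
trace = bind(role, filler)
recovered = unbind(trace, role)
print(cosine(recovered, filler))   # high similarity to the original filler
```

The recovered vector is noisy, which is why VSA systems pair unbinding with a clean-up memory; the demonstration nonetheless shows how role-filler structure (the crux of the systematicity debate) can live inside a fixed-width vector space.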
Context, cognition and communication in language
Questions pertaining to the unique structure and organisation of language have a
long history in the field of linguistics. In recent years, researchers have explored
cultural evolutionary explanations, showing how language structure emerges from
weak biases amplified over repeated patterns of learning and use. One outstanding
issue in these frameworks is accounting for the role of context. In particular,
many linguistic phenomena are said to be context-dependent; interpretation
does not take place in a void, and requires enrichment from the current state
of the conversation, the physical situation, and common knowledge about the
world. Modelling the relationship between language structure and context is
therefore crucial for developing a cultural evolutionary approach to language.
One approach is to use statistical analyses to investigate large-scale, cross-cultural
datasets. However, due to the inherent limitations of statistical analyses, especially
their inadequacy for testing hypotheses about causal relationships, I argue
that experiments are better suited to address questions pertaining
to language structure and context. From here, I present a series
of artificial language experiments, with the central aim being to test how
manipulations to context influence the structure and organisation of language.
Experiment 1 builds upon previous work in iterated learning and communication
games through demonstrating that the emergence of optimal communication systems
is contingent on the contexts in which languages are learned and used. The
results show that language systems gradually evolve to only encode information
that is informative for conveying the intended meaning of the speaker - resulting
in markedly different systems of communication. Whereas Experiment 1 focused
on how context influences the emergence of structure, Experiments 2 and 3 investigate
under what circumstances manipulations to context result in the loss
of structure. While the results are inconclusive across these two experiments,
there is tentative evidence that manipulations to context can disrupt structure,
but only when interacting with other factors. Lastly, Experiment 4 investigates
whether the degree of signal autonomy (the capacity for a signal to be interpreted without recourse to contextual information) is shaped by manipulations
to contextual predictability: the extent to which a speaker can estimate and exploit
contextual information a hearer uses in interpreting an utterance. When
the context is predictable, speakers organise languages to be less autonomous
(more context-dependent) through combining linguistic signals with contextual
information to reduce effort in production and minimise uncertainty in comprehension.
By decreasing contextual predictability, speakers increasingly rely on
strategies that promote more autonomous signals, as these signals depend less on
contextual information to discriminate between possible meanings. Overall, these
experiments provide proof-of-concept for investigating the relationship between
language structure and context, showing that the organisational principles underpinning
language are the result of competing pressures from context, cognition,
and communication.
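Experiment 4's trade-off between signal autonomy and contextual predictability can be caricatured in a few lines. In this hedged toy model (invented signals and meanings, not the experimental materials), a speaker chooses the cheapest signal whose compatible meanings, intersected with what the context leaves possible, uniquely identify the intended meaning:

```python
# Hedged toy model: predictable context shrinks the set of plausible meanings,
# licensing short, context-dependent signals; unpredictable context forces the
# speaker back onto costlier, more autonomous signals.
signals = {  # signal -> (meanings it is compatible with, production cost)
    "zop": ({"red-circle"}, 3),                 # autonomous but costly
    "zo":  ({"red-circle", "red-square"}, 2),   # ambiguous but cheap
}

def choose(intended, context_set):
    """Cheapest signal that, given the context, picks out only the intended meaning."""
    viable = [
        (cost, s) for s, (meanings, cost) in signals.items()
        if meanings & context_set == {intended}
    ]
    return min(viable)[1]

# Predictable context: only one red referent is plausible, so "zo" suffices.
print(choose("red-circle", {"red-circle", "blue-square"}))   # "zo"
# Unpredictable context: both red referents are plausible, forcing "zop".
print(choose("red-circle", {"red-circle", "red-square"}))    # "zop"
```

The model reproduces the qualitative finding: as contextual predictability drops, reliance shifts from cheap context-dependent signals to autonomous ones.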
Semi-supervised learning for big social data analysis
In an era of social media and connectivity, web users are becoming increasingly enthusiastic about interacting, sharing, and working together through online collaborative media. More recently, this collective intelligence has spread to many different areas, with a growing impact on everyday life, such as in education, health, commerce and tourism, leading to an exponential growth in the size of the social Web. However, the distillation of knowledge from such unstructured Big data is an extremely challenging task. Consequently, the semantic and multimodal contents of today's Web, whilst well suited for human use, remain barely accessible to machines. In this work, we explore the potential of a novel semi-supervised learning model based on the combined use of random projection scaling as part of a vector space model, and support vector machines to perform reasoning on a knowledge base. The latter is developed by merging a graph representation of commonsense with a linguistic resource for the lexical representation of affect. Comparative simulation results show a significant improvement in tasks such as emotion recognition and polarity detection, and pave the way for the development of future semi-supervised learning approaches to big social data analytics.
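The pipeline's overall shape, random projection of a vector-space representation followed by a support vector machine, can be sketched in a hedged way with synthetic data standing in for the commonsense-plus-affect knowledge base the paper actually merges (scikit-learn is an assumed dependency; this is not the paper's model):

```python
# Hedged sketch of the pipeline shape only: Gaussian random projection reduces
# a high-dimensional vector-space representation, then an SVM classifies.
from sklearn.datasets import make_classification
from sklearn.random_projection import GaussianRandomProjection
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, not the paper's knowledge base.
X, y = make_classification(n_samples=400, n_features=100, n_informative=20,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = make_pipeline(
    GaussianRandomProjection(n_components=30, random_state=0),
    SVC(kernel="rbf"),
)
model.fit(X_tr, y_tr)
print(model.score(X_te, y_te))
```

By the Johnson-Lindenstrauss lemma, random projection approximately preserves pairwise distances, which is why an SVM trained in the projected space can remain accurate at a fraction of the dimensionality.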