Representation Learning beyond Semantic Similarity: Character-aware and Function-specific Approaches
Representation learning is a research area within machine learning and natural language processing (NLP) concerned with building machine-understandable representations of discrete units of text. Continuous representations are at the core of modern machine learning applications, and representation learning has thereby become one of the central research areas in NLP. The induction of text representations is typically based on the distributional hypothesis, and consequently encodes general information about word similarity. Words or phrases with similar meaning obtain similar representations in a vector space constructed for this purpose. This established methodology excels for morphologically simple languages such as English, and in data-rich settings. However, several useful lexical relations, such as entailment or selectional preference, are not captured or get conflated with other relations. Another challenge is dealing with low-data regimes for morphologically complex and under-resourced languages.
In this thesis we construct novel representation learning methods that go beyond the limitations of the distributional hypothesis and investigate solutions that induce vector spaces with diverse properties. In particular, we look at how the vector space induction process influences the contained information, and how the information manifests in a number of core NLP tasks: semantic similarity, lexical entailment, selectional preference, and language modeling. We contribute novel evaluations of state-of-the-art models highlighting their current capabilities and limitations. An analysis of language modeling in 50 typologically diverse languages demonstrates that representations can indeed pose a performance bottleneck. We introduce a novel approach to leveraging subword-level information in word representations: our solution lifts this bottleneck in low-resource scenarios. Finally, we introduce a novel paradigm of function-specific representation learning that aims to integrate fine-grained semantic relations and real-world knowledge into the word vector spaces. We hope this thesis can serve as a valuable overview of word representations, and inspire future work in modeling semantic similarity and beyond.
ERC Consolidator Grant LEXICAL (648909
Directional adposition use in English, Swedish and Finnish
Directional adpositions such as to the left of describe where a Figure is in relation to a Ground. English and Swedish directional adpositions refer to the location of a Figure in relation to a Ground, whether both are static or in motion. In contrast, the Finnish directional adpositions edellä (in front of) and jäljessä (behind) solely describe the location of a moving Figure in relation to a moving Ground (Nikanne, 2003).
When directional adpositions are used, a frame of reference must be assumed in order to interpret their meaning. For example, the meaning of to the left of in English can be based on a relative (speaker or listener based) reference frame or an intrinsic (object based) reference frame (Levinson, 1996). When a Figure and a Ground are both in motion, it is possible for a Figure to be described as being behind or in front of the Ground, even if neither has intrinsic features. As shown by Walker (in preparation), there are good reasons to assume that in the latter case a motion based reference frame is involved. This means that if Finnish speakers use edellä (in front of) and jäljessä (behind) more frequently in situations where both the Figure and Ground are in motion, a difference in reference frame use between Finnish on the one hand and English and Swedish on the other could be expected.
We asked native English, Swedish and Finnish speakers to select adpositions from a language-specific list to describe the location of a Figure relative to a Ground when both were shown to be moving on a computer screen. We were interested in any differences between Finnish, English and Swedish speakers.
All languages showed a predominant use of directional spatial adpositions referring to the lexical concepts TO THE LEFT OF, TO THE RIGHT OF, ABOVE and BELOW. There were no differences between the languages in directional adposition use or reference frame use, including reference frame use based on motion.
We conclude that despite differences in the grammars of the languages involved, and potential differences in reference frame system use, the three languages investigated encode Figure location in relation to Ground location in a similar way when both are in motion.
Levinson, S. C. (1996). Frames of reference and Molyneux's question: Crosslinguistic evidence. In P. Bloom, M. A. Peterson, L. Nadel & M. F. Garrett (Eds.), Language and Space (pp. 109-170). Massachusetts: MIT Press.
Nikanne, U. (2003). How Finnish postpositions see the axis system. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space. Oxford, UK: Oxford University Press.
Walker, C. (in preparation). Motion encoding in language, the use of spatial locatives in a motion context. Unpublished doctoral dissertation, University of Lincoln, Lincoln. United Kingdo
Formalized Conceptual Spaces with a Geometric Representation of Correlations
The highly influential framework of conceptual spaces provides a geometric
way of representing knowledge. Instances are represented by points in a
similarity space and concepts are represented by convex regions in this space.
After pointing out a problem with the convexity requirement, we propose a
formalization of conceptual spaces based on fuzzy star-shaped sets. Our
formalization uses a parametric definition of concepts and extends the original
framework by adding means to represent correlations between different domains
in a geometric way. Moreover, we define various operations for our
formalization, both for creating new concepts from old ones and for measuring
relations between concepts. We present an illustrative toy-example and sketch a
research project on concept formation that is based on both our formalization
and its implementation.
Comment: Published in the edited volume "Conceptual Spaces: Elaborations and
Applications". arXiv admin note: text overlap with arXiv:1706.06366,
arXiv:1707.02292, arXiv:1707.0516
An End-to-end Neural Natural Language Interface for Databases
The ability to extract insights from new data sets is critical for decision
making. Visual interactive tools play an important role in data exploration
since they provide non-technical users with an effective way to visually
compose queries and comprehend the results. Natural language has recently
gained traction as an alternative query interface to databases with the
potential to enable non-expert users to formulate complex questions and
information needs efficiently and effectively. However, understanding natural
language questions and translating them accurately to SQL is a challenging
task, and thus Natural Language Interfaces for Databases (NLIDBs) have not yet
made their way into practical tools and commercial products.
In this paper, we present DBPal, a novel data exploration tool with a natural
language interface. DBPal leverages recent advances in deep models to make
query understanding more robust in the following ways: First, DBPal uses a deep
model to translate natural language statements to SQL, making the translation
process more robust to paraphrasing and other linguistic variations. Second, to
support the users in phrasing questions without knowing the database schema and
the query features, DBPal provides a learned auto-completion model that
suggests partial query extensions to users during query formulation and thus
helps to write complex queries
An aesthetics of touch: investigating the language of design relating to form
How well can designers communicate qualities of touch?
This paper presents evidence that they have some capability to do so, much of which appears to have been learned, but at present make limited use of such language. Interviews with graduate designer-makers suggest that they are aware of and value the importance of touch and materiality in their work, but lack a vocabulary to fully relate to their detailed explanations of other aspects such as their intent or selection of materials. We believe that more attention should be paid to the verbal dialogue that happens in the design process, particularly as other researchers show that even making-based learning has a strong verbal element to it. However, verbal language alone does not appear to be adequate for a comprehensive language of touch. Graduate designer-makers' descriptive practices combined non-verbal manipulation within verbal accounts. We thus argue that haptic vocabularies do not simply describe material qualities, but rather are situated competences that physically demonstrate the presence of haptic qualities. Such competences are more important than groups of verbal vocabularies in isolation. Design support for developing and extending haptic competences must take this wide range of considerations into account to comprehensively improve designers' capabilities