Mitigating problems in analogy-based EBMT with SMT and vice versa: a case study with named entity transliteration
Five years ago, a number of papers reported an experimental implementation of an Example-Based Machine Translation (EBMT) system using proportional analogy. This approach, a type of analogical learning, was attractive because of its simplicity, and the papers reported considerable success with the method across various language pairs. In this paper, we describe our attempt to use this approach for English–Hindi Named Entity (NE) transliteration. We have implemented our own EBMT system using proportional analogy and have found that the analogy-based system on its own has low precision but high recall because a large number of names are left untransliterated by the approach. However, mitigating problems in analogy-based EBMT with SMT, and vice versa, has shown considerable improvement over either approach on its own.
Ranking relations using analogies in biological and information networks
Analogical reasoning depends fundamentally on the ability to learn and
generalize about relations between objects. We develop an approach to
relational learning which, given a set of pairs of objects
$\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B^{(N)}\}$,
measures how well other pairs A:B fit in with the set $\mathbf{S}$. Our work
addresses the following question: is the relation between objects A and B
analogous to those relations found in $\mathbf{S}$? Such questions are
particularly relevant in information retrieval, where an investigator might
want to search for analogous pairs of objects that match the query set of
interest. There are many ways in which objects can be related, making the task
of measuring analogies very challenging. Our approach combines a similarity
measure on function spaces with Bayesian analysis to produce a ranking. It
requires data containing features of the objects of interest and a link matrix
specifying which relationships exist; no further attributes of such
relationships are necessary. We illustrate the potential of our method on text
analysis and information networks. An application on discovering functional
interactions between pairs of proteins is discussed in detail, where we show
that our approach can work in practice even if a small set of protein pairs is
provided. Comment: Published at http://dx.doi.org/10.1214/09-AOAS321 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Distributed Representations of Words and Phrases and their Compositionality
The recently introduced continuous Skip-gram model is an efficient method for
learning high-quality distributed vector representations that capture a large
number of precise syntactic and semantic word relationships. In this paper we
present several extensions that improve both the quality of the vectors and the
training speed. By subsampling of the frequent words we obtain significant
speedup and also learn more regular word representations. We also describe a
simple alternative to the hierarchical softmax called negative sampling. An
inherent limitation of word representations is their indifference to word order
and their inability to represent idiomatic phrases. For example, the meanings
of "Canada" and "Air" cannot be easily combined to obtain "Air Canada".
Motivated by this example, we present a simple method for finding phrases in
text, and show that learning good vector representations for millions of
phrases is possible.
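The abstract names two techniques without giving their formulas: subsampling of frequent words and a simple method for finding phrases. The following is a minimal sketch of both. It assumes the formulas as recalled from the body of the paper, not from the abstract: a discard probability of 1 - sqrt(t / f(w)) and a bigram score of (count(a b) - delta) / (count(a) * count(b)). The function names, the threshold `t`, and the discount `delta` defaults are illustrative choices, and this is not the authors' implementation.

```python
import math
from collections import Counter


def discard_probability(word_freq: float, t: float = 1e-5) -> float:
    """Probability of dropping a word whose relative corpus frequency is word_freq.

    Assumed form: 1 - sqrt(t / f(w)), clipped at 0 so rare words are always kept.
    """
    return max(0.0, 1.0 - math.sqrt(t / word_freq))


def phrase_scores(tokens, delta: float = 1.0):
    """Score each adjacent word pair; higher scores suggest a phrase.

    Assumed form: (count(a b) - delta) / (count(a) * count(b)).
    The discount delta penalizes pairs that occur only a few times.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {
        pair: (n - delta) / (unigrams[pair[0]] * unigrams[pair[1]])
        for pair, n in bigrams.items()
    }


# Hypothetical toy corpus, for illustration only.
tokens = "air canada flies air canada lands the air".split()
scores = phrase_scores(tokens)
best_pair = max(scores, key=scores.get)
```

On this toy corpus the repeated pair ("air", "canada") is the only bigram with a positive score, so a threshold on the score would merge it into a single token. On real text the threshold and discount would need tuning.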
Accelerating Innovation Through Analogy Mining
The availability of large idea repositories (e.g., the U.S. patent database)
could significantly accelerate innovation and discovery by providing people
with inspiration from solutions to analogous problems. However, finding useful
analogies in these large, messy, real-world repositories remains a persistent
challenge for either human or automated methods. Previous approaches include
costly hand-created databases that have high relational structure (e.g.,
predicate calculus representations) but are very sparse. Simpler
machine-learning/information-retrieval similarity metrics can scale to large,
natural-language datasets, but struggle to account for structural similarity,
which is central to analogy. In this paper we explore the viability and value
of learning simpler structural representations, specifically, "problem
schemas", which specify the purpose of a product and the mechanisms by which it
achieves that purpose. Our approach combines crowdsourcing and recurrent neural
networks to extract purpose and mechanism vector representations from product
descriptions. We demonstrate that these learned vectors allow us to find
analogies with higher precision and recall than traditional
information-retrieval methods. In an ideation experiment, analogies retrieved
by our models significantly increased people's likelihood of generating
creative ideas compared to analogies retrieved by traditional methods. Our
results suggest a promising approach to enabling computational analogy at scale
is to learn and leverage weaker structural representations. Comment: KDD 201
Design thinking support: information systems versus reasoning
Numerous attempts have been made to conceive and implement appropriate information systems to support architectural designers in their creative design thinking processes. These information systems aim at providing support in very diverse ways: enabling designers to make diverse kinds of visual representations of a design, enabling them to make complex calculations and simulations which take into account numerous relevant parameters in the design context, providing them with loads of information and knowledge from all over the world, and so forth. Notwithstanding the continued efforts to develop these information systems, they still fail to provide essential support in the core creative activities of architectural designers. In order to understand why appropriately effective support from information systems is so hard to realize, we started to look into the nature of design thinking and into how reasoning processes are at play in this design thinking. This investigation suggests that creative designing rests on a cyclic combination of abductive, deductive and inductive reasoning processes. Because traditional information systems typically target only one of these reasoning processes at a time, this could explain the limited applicability and usefulness of these systems. As research in information technology is increasingly targeting the combination of these reasoning modes, improvements may be within reach for design thinking support by information systems.
The Latent Relation Mapping Engine: Algorithm and Experiments
Many AI researchers and cognitive scientists have argued that analogy is the
core of cognition. The most influential work on computational modeling of
analogy-making is Structure Mapping Theory (SMT) and its implementation in the
Structure Mapping Engine (SME). A limitation of SME is the requirement for
complex hand-coded representations. We introduce the Latent Relation Mapping
Engine (LRME), which combines ideas from SME and Latent Relational Analysis
(LRA) in order to remove the requirement for hand-coded representations. LRME
builds analogical mappings between lists of words, using a large corpus of raw
text to automatically discover the semantic relations among the words. We
evaluate LRME on a set of twenty analogical mapping problems, ten based on
scientific analogies and ten based on common metaphors. LRME achieves
human-level performance on the twenty problems. We compare LRME with a variety
of alternative approaches and find that they are not able to reach the same
level of performance. Comment: related work available at http://purl.org/peter.turney
On empirical methodology, constraints, and hierarchy in artificial grammar learning
This paper considers the AGL literature from a psycholinguistic perspective. It first presents a taxonomy of the experimental familiarization test procedures used, which is followed by a consideration of shortcomings and potential improvements of the empirical methodology. It then turns to reconsidering the issue of grammar learning from the point of view of acquiring constraints, instead of the traditional AGL approach in terms of acquiring sets of rewrite rules. This is, in particular, a natural way of handling long‐distance dependences. The final section addresses an underdeveloped issue in the AGL literature, namely how to detect latent hierarchical structure in AGL response patterns.
Motion as manipulation: Implementation of motion and force analogies by event-file binding and action planning
Tool improvisation analogies are a special case of motion and force analogies that appear to be implemented pre-conceptually, in many species, by event-file binding and action planning. A detailed reconstruction of the analogical reasoning steps involved in Rutherford's and Bohr's development of the first quantized-orbit model of atomic structure is used to show that human motion and force analogies generally can be implemented by the event-file binding and action planning mechanism. Predictions that distinguish this model from competing concept-level models of analogy are discussed, available data pertaining to them are reviewed, and further experimental tests are proposed.
Proof-Pattern Recognition and Lemma Discovery in ACL2
We present a novel technique for combining statistical machine learning for
proof-pattern recognition with symbolic methods for lemma discovery. The
resulting tool, ACL2(ml), gathers proof statistics and uses statistical
pattern-recognition to pre-process data from libraries, and then suggests
auxiliary lemmas in new proofs by analogy with already seen examples. This
paper presents the implementation of ACL2(ml) alongside theoretical
descriptions of the proof-pattern recognition and lemma discovery methods
involved in it.