Human-Level Performance on Word Analogy Questions by Latent Relational Analysis
This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason/stone is analogous to the pair carpenter/wood; the relations between mason and stone are highly similar to the relations between carpenter and wood. Past work on semantic similarity measures has mainly been concerned with attributional similarity. For instance, Latent Semantic Analysis (LSA) can measure the degree of similarity between two words, but not between two relations. Recently the Vector Space Model (VSM) of information retrieval has been adapted to the task of measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus (they are not predefined), (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data (it is also used this way in LSA), and (3) automatically generated synonyms are used to explore reformulations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying noun-modifier relations, LRA achieves similar gains over the VSM, while using a smaller corpus
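The VSM baseline described above can be illustrated with a minimal sketch: each word pair is represented by a vector of pattern frequencies, and an analogy question is answered by picking the choice whose vector has the highest cosine with the stem pair. The pattern list, the frequencies, the log scaling, and the function names below are all made-up assumptions for illustration; they are not taken from the paper. The three LRA extensions (automatic pattern derivation, SVD smoothing, synonym reformulations) are not shown.

```python
import math

# Hypothetical predefined joining patterns (illustrative only).
PATTERNS = ["X cuts Y", "X works with Y", "X of Y", "X eats Y"]

# Made-up corpus frequencies of each pattern for each word pair.
FREQ = {
    ("mason", "stone"): {"X cuts Y": 40, "X works with Y": 25, "X of Y": 3},
    ("carpenter", "wood"): {"X cuts Y": 35, "X works with Y": 30, "X of Y": 2},
    ("teacher", "chalk"): {"X works with Y": 5, "X of Y": 1},
    ("bird", "worm"): {"X eats Y": 50, "X of Y": 1},
}


def pattern_vector(counts):
    """Map {pattern: frequency} to a log-scaled vector (scaling is an assumption)."""
    return [math.log(1 + counts.get(p, 0)) for p in PATTERNS]


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_analogy(stem, choices):
    """Return the choice pair whose relation vector is closest to the stem's."""
    stem_vec = pattern_vector(FREQ[stem])
    return max(choices, key=lambda c: cosine(stem_vec, pattern_vector(FREQ[c])))


choices = [("teacher", "chalk"), ("carpenter", "wood"), ("bird", "worm")]
answer = best_analogy(("mason", "stone"), choices)
print(answer)
```

With these toy counts, the pair sharing the most pattern mass with mason/stone is selected; real systems would gather the frequencies from a large corpus.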
Escaping the Trap of too Precise Topic Queries
At the very center of digital mathematics libraries lie controlled
vocabularies which qualify the topic of the documents. These topics are
used when submitting a document to a digital mathematics library and when
performing searches in a library. Searches are refined by the use of these
topics, as they allow a precise classification of the mathematics area a
document addresses. However, there is a major risk that users employ too
precise topics to specify their queries: they may employ a topic that is only
"close-by" and so fail to match the right resource. We call this the topic trap.
Indeed, since 2009, this issue has appeared frequently on the i2geo.net
platform. Other mathematics portals experience the same phenomenon. One approach
to solving this issue is to introduce tolerance in the way queries are
understood. In particular, one can include fuzzy matches, but this
introduces noise which may prevent the user from understanding the function of
the search engine.
In this paper, we propose a way to escape the topic trap by employing the
navigation between related topics and the count of search results for each
topic. This supports the user in that a search for close-by topics is a click
away from a previous search. This approach was realized with the i2geo search
engine and is described in detail; the relation of being related is
computed by textual analysis of the definitions of the concepts
fetched from the Wikipedia encyclopedia.
Comment: 12 pages, Conference on Intelligent Computer Mathematics 2013 Bath,
U
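The idea of offering related topics together with their result counts can be sketched as follows. This is only an illustration under stated assumptions: the topic definitions, the result counts, the stopword list, the Jaccard overlap measure, and the threshold are all hypothetical, and the abstract does not say which textual analysis the i2geo search engine actually uses.

```python
import re

# Hypothetical topic definitions (stand-ins for text fetched from an encyclopedia).
DEFINITIONS = {
    "isosceles triangle": "a triangle that has two sides of equal length",
    "equilateral triangle": "a triangle in which all three sides have equal length",
    "circle": "the set of points in a plane at a given distance from a centre",
}

# Hypothetical number of search results per topic.
RESULT_COUNTS = {"isosceles triangle": 2, "equilateral triangle": 17, "circle": 120}

STOPWORDS = {"a", "the", "of", "in", "that", "which", "has", "have", "at", "from", "all"}


def tokens(text):
    """Lowercased content words of a definition."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def relatedness(topic_a, topic_b):
    """Jaccard overlap of the two definitions' word sets (an assumed measure)."""
    words_a = tokens(DEFINITIONS[topic_a])
    words_b = tokens(DEFINITIONS[topic_b])
    return len(words_a & words_b) / len(words_a | words_b)


def related_topics(topic, threshold=0.3):
    """Related topics, most similar first, each paired with its result count."""
    scored = [(relatedness(topic, other), other) for other in DEFINITIONS if other != topic]
    return [
        (other, RESULT_COUNTS[other])
        for score, other in sorted(scored, reverse=True)
        if score >= threshold
    ]


suggestions = related_topics("isosceles triangle")
print(suggestions)
```

A user whose query on a narrow topic returns few hits would then see the close-by topic and its larger result count one click away.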
The relationship between IR and multimedia databases
Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient.
Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval.
Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database.
First, we introduce a concept layer to enable reasoning over low-level concepts in the database.
Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer.
Third, we add the functionality to process the users' relevance feedback.
We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing.
We conclude with an outline for implementation of miRRor on top of the Monet extensible database system.
Techniques for organizational memory information systems
The KnowMore project aims at providing active support to humans working on knowledge-intensive tasks. To this end the knowledge available in the modeled business processes or their incarnations in specific workflows shall be used to improve information handling. We present a representation formalism for knowledge-intensive tasks and the specification of its object-oriented realization. An operational semantics is sketched by specifying the basic functionality of the Knowledge Agent which works on the knowledge intensive task representation.
The Knowledge Agent uses a meta-level description of all information sources available in the Organizational Memory. We discuss the main dimensions that such a description scheme must be designed along, namely information content, structure, and context. On top of relational database management systems, we basically realize deductive object-oriented modeling with a comfortable annotation facility. The concrete knowledge descriptions are obtained by configuring the generic formalism with ontologies which describe the required modeling dimensions.
To support the access to documents, data, and formal knowledge in an Organizational Memory an integrated domain ontology and thesaurus is proposed which can be constructed semi-automatically by combining document-analysis and knowledge engineering methods. Thereby the costs for up-front knowledge engineering and the need to consult domain experts can be considerably reduced. We present an automatic thesaurus generation tool and show how it can be applied to build and enhance an integrated ontology/thesaurus. A first evaluation shows that the proposed method does indeed facilitate knowledge acquisition and maintenance of an organizational memory.
Design-by-analogy: experimental evaluation of a functional analogy search methodology for concept generation improvement
Design-by-analogy is a growing field of study and practice, due to its power to augment and extend traditional concept generation methods by expanding the set of generated ideas using similarity relationships from solutions to analogous problems. This paper presents the results of experimentally testing a new method for extracting functional analogies from general data sources, such as patent databases, to assist designers in systematically seeking and identifying analogies. In summary, the approach produces significantly improved results on the novelty of solutions generated and no significant change in the total quantity of solutions generated. Computationally, this design-by-analogy facilitation methodology uses a novel functional vector space representation to quantify the functional similarity between represented design problems and, in this case, patent descriptions of products. The mapping of the patents into the functional analogous words enables the generation of functionally relevant novel ideas that can be customized in various ways. Overall, this approach provides functionally relevant novel sources of design-by-analogy inspiration to designers and design teams.
SUTD-MIT International Design Centre (IDC); National Science Foundation (U.S.) (Grant Numbers CMMI-0855326, CMMI-0855510, and CMMI-08552930