
    Soft Seeded SSL Graphs for Unsupervised Semantic Similarity-based Retrieval

    Semantic similarity-based retrieval is playing an increasingly important role in many IR systems, such as modern web search, question answering, and similar document retrieval. Improvements in the retrieval of semantically similar content are highly significant for applications like Quora, Stack Overflow, and Siri. We propose a novel unsupervised model for semantic similarity-based content retrieval, where we construct semantic flow graphs for each query and introduce the concept of "soft seeding" in graph-based semi-supervised learning (SSL) to convert this into an unsupervised model. We demonstrate the effectiveness of our model on an equivalent question retrieval problem on the Stack Exchange QA dataset, where our unsupervised approach significantly outperforms the state-of-the-art unsupervised models and produces results comparable to the best supervised models. Our research provides a method to tackle semantic similarity-based retrieval without any training data, and allows seamless extension to QA communities in different domains, as well as to other semantic equivalence tasks. Comment: Published in Proceedings of the 2017 ACM Conference on Information and Knowledge Management (CIKM '17)

    Precise Modelling of Compensating Business Transactions and its Application to BPEL

    We describe the StAC language, which can be used to specify the orchestration of activities in long-running business transactions. Long-running business transactions use compensation to cope with exceptions. StAC supports sequential and parallel behaviour as well as exception and compensation handling. We also show how the B notation may be combined with StAC to specify the data aspects of transactions. The combination of StAC and B provides a rich formal notation which allows for succinct and precise specification of business transactions. BPEL is an industry-standard language for specifying business transactions and includes compensation constructs. We show how a substantial subset of BPEL can be mapped to StAC, thus demonstrating the expressiveness of StAC and providing a formal semantics for BPEL.

    mARC: Memory by Association and Reinforcement of Contexts

    This paper introduces Memory by Association and Reinforcement of Contexts (mARC). mARC is a novel data modeling technology rooted in the second quantization formulation of quantum mechanics. It is an all-purpose, incremental, and unsupervised data storage and retrieval system which can be applied to all types of signal or data, structured or unstructured, textual or not. mARC can be applied to a wide range of information classification and retrieval problems, such as e-Discovery or contextual navigation. It can also be formulated in the artificial life framework, a.k.a. Conway's "Game of Life" theory. In contrast to Conway's approach, the objects evolve in a massively multidimensional space. In order to start evaluating the potential of mARC, we have built a mARC-based Internet search engine demonstrator with contextual functionality. We compare the behavior of the mARC demonstrator with Google search in terms of both performance and relevance. In the study, we find that the mARC search engine demonstrator outperforms Google search by an order of magnitude in response time while providing more relevant results for some classes of queries.

    A New Geometric Approach to Latent Topic Modeling and Discovery

    A new geometrically-motivated algorithm for nonnegative matrix factorization is developed and applied to the discovery of latent "topics" for text and image "document" corpora. The algorithm is based on robustly finding and clustering extreme points of empirical cross-document word-frequencies that correspond to novel "words" unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets. Comment: This paper was submitted to the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2013 on November 30, 201

    Semantic model-driven development of web service architectures.

    Building service-based architectures has become a major area of interest since the advent of Web services. Modelling these architectures is a central activity. Model-driven development is a recent approach to developing software systems based on the idea of making models the central artefacts for design representation, analysis, and code generation. We propose an ontology-based engineering methodology for semantic model-driven composition and transformation of Web service architectures. Ontology technology as a logic-based knowledge representation and reasoning framework can provide answers to the needs of sharable and reusable semantic models and descriptions needed for service engineering. Based on modelling, composition and code generation techniques for service architectures, our approach provides a methodological framework for ontology-based semantic service architecture