177,877 research outputs found

    Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach

    Get PDF
    A significant amount of search queries originate from some real world information need or tasks. In order to improve the search experience of the end users, it is important to have accurate representations of tasks. As a result, significant amount of research has been devoted to extracting proper representations of tasks in order to enable search systems to help users complete their tasks, as well as providing the end user with better query suggestions, for better recommendations, for satisfaction prediction, and for improved personalization in terms of tasks. Most existing task extraction methodologies focus on representing tasks as flat structures. However, tasks often tend to have multiple subtasks associated with them and a more naturalistic representation of tasks would be in terms of a hierarchy, where each task can be composed of multiple (sub)tasks. To this end, we propose an efficient Bayesian nonparametric model for extracting hierarchies of such tasks \& subtasks. We evaluate our method based on real world query log data both through quantitative and crowdsourced experiments and highlight the importance of considering task/subtask hierarchies.Comment: 10 pages. Accepted at SIGIR 2017 as a full pape

    A Factor Graph Approach to Automated Design of Bayesian Signal Processing Algorithms

    Get PDF
    The benefits of automating design cycles for Bayesian inference-based algorithms are becoming increasingly recognized by the machine learning community. As a result, interest in probabilistic programming frameworks has much increased over the past few years. This paper explores a specific probabilistic programming paradigm, namely message passing in Forney-style factor graphs (FFGs), in the context of automated design of efficient Bayesian signal processing algorithms. To this end, we developed "ForneyLab" (https://github.com/biaslab/ForneyLab.jl) as a Julia toolbox for message passing-based inference in FFGs. We show by example how ForneyLab enables automatic derivation of Bayesian signal processing algorithms, including algorithms for parameter estimation and model comparison. Crucially, due to the modular makeup of the FFG framework, both the model specification and inference methods are readily extensible in ForneyLab. In order to test this framework, we compared variational message passing as implemented by ForneyLab with automatic differentiation variational inference (ADVI) and Monte Carlo methods as implemented by state-of-the-art tools "Edward" and "Stan". In terms of performance, extensibility and stability issues, ForneyLab appears to enjoy an edge relative to its competitors for automated inference in state-space models.Comment: Accepted for publication in the International Journal of Approximate Reasonin

    Development and evaluation of a web-based learning system based on learning object design and generative learning to improve higher-order thinking skills and learning

    Get PDF
    This research aims to design, develop and evaluate the effectiveness of a Webbased learning system prototype called Generative Object Oriented Design (GOOD) learning system. Result from the preliminary study conducted showed most of the students were at lower order thinking skills (LOTS) compared to higher order thinking skills (HOTS) based on Bloom’s Taxonomy. Based on such concern, GOOD learning system was designed and developed based on learning object design and generative learning to improve HOTS and learning. A conceptual model design of GOOD learning system, called Generative Learning Object Organizer and Thinking Tasks (GLOOTT) model, has been proposed from the theoretical framework of this research. The topic selected for this research was Computer System (CS) which focused on the hardware concepts from the first year Diploma of Computer Science subjects. GOOD learning system acts as a mindtool to improve HOTS and learning in CS. A pre-experimental research design of one group pretest and posttest was used in this research. The samples of this research were 30 students and 12 lecturers. Data was collected from the pretest, posttest, portfolio, interview and Web-based learning system evaluation form. The paired-samples T test analysis was used to analyze the achievement of the pretest and posttest and the result showed that there was significance difference between the mean scores of pretest and posttest at the significant level a = 0.05 (p=0.000). In addition, the paired-samples T test analysis of the cognitive operations from Bloom’s Taxonomy showed that there was significance difference for each of the cognitive operation of the students before and after using GOOD learning system. Results from the study showed improvement of HOTS and learning among the students. Besides, analysis of portfolio showed that the students engaged HOTS during the use of the system. Most of the students and lecturers gave positive comments about the effectiveness of the system in improving HOTS and learning in CS. From the findings in this research, GOOD learning system has the potential to improve students’ HOTS and learning

    Aerospace Manufacturing Industry: A Simulation-Based Decision Support Framework for the Scheduling of Complex Hoist Lines

    Get PDF
    The hoist scheduling problem is a critical issue in the design and control of Automated Manufacturing Systems. To deal with the major complexities appearing in such problem, this work introduces an advanced simulation model to represent the short-term scheduling of complex hoist lines. The aim is to find the best jobs schedule that minimizing the makespan while maximizing throughput with no defective outputs. Several hard constraints are considered in the model: single shared hoist, heterogeneous recipes, eventual recycles flows, and no buffers between workstations. Different heuristic-based strategies are incorporated into the computer model in order to improve the solutions generated over time. The alternative solutions can be quickly evaluated by using a graphical user interface developed together with the simulation model.Fil: Basán, Natalia Paola. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Desarrollo Tecnológico para la Industria Química. Universidad Nacional del Litoral. Instituto de Desarrollo Tecnológico para la Industria Química; ArgentinaFil: Pulido, Raul. Universidad Politécnica de Madrid; EspañaFil: Coccola, Mariana Evangelina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Desarrollo Tecnológico para la Industria Química. Universidad Nacional del Litoral. Instituto de Desarrollo Tecnológico para la Industria Química; ArgentinaFil: Mendez, Carlos Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Desarrollo Tecnológico para la Industria Química. Universidad Nacional del Litoral. Instituto de Desarrollo Tecnológico para la Industria Química; Argentin

    Finding Non-overlapping Clusters for Generalized Inference Over Graphical Models

    Full text link
    Graphical models use graphs to compactly capture stochastic dependencies amongst a collection of random variables. Inference over graphical models corresponds to finding marginal probability distributions given joint probability distributions. In general, this is computationally intractable, which has led to a quest for finding efficient approximate inference algorithms. We propose a framework for generalized inference over graphical models that can be used as a wrapper for improving the estimates of approximate inference algorithms. Instead of applying an inference algorithm to the original graph, we apply the inference algorithm to a block-graph, defined as a graph in which the nodes are non-overlapping clusters of nodes from the original graph. This results in marginal estimates of a cluster of nodes, which we further marginalize to get the marginal estimates of each node. Our proposed block-graph construction algorithm is simple, efficient, and motivated by the observation that approximate inference is more accurate on graphs with longer cycles. We present extensive numerical simulations that illustrate our block-graph framework with a variety of inference algorithms (e.g., those in the libDAI software package). These simulations show the improvements provided by our framework.Comment: Extended the previous version to include extensive numerical simulations. See http://www.ima.umn.edu/~dvats/GeneralizedInference.html for code and dat

    Efficient Multi-way Theta-Join Processing Using MapReduce

    Full text link
    Multi-way Theta-join queries are powerful in describing complex relations and therefore widely employed in real practices. However, existing solutions from traditional distributed and parallel databases for multi-way Theta-join queries cannot be easily extended to fit a shared-nothing distributed computing paradigm, which is proven to be able to support OLAP applications over immense data volumes. In this work, we study the problem of efficient processing of multi-way Theta-join queries using MapReduce from a cost-effective perspective. Although there have been some works using the (key,value) pair-based programming model to support join operations, efficient processing of multi-way Theta-join queries has never been fully explored. The substantial challenge lies in, given a number of processing units (that can run Map or Reduce tasks), mapping a multi-way Theta-join query to a number of MapReduce jobs and having them executed in a well scheduled sequence, such that the total processing time span is minimized. Our solution mainly includes two parts: 1) cost metrics for both single MapReduce job and a number of MapReduce jobs executed in a certain order; 2) the efficient execution of a chain-typed Theta-join with only one MapReduce job. Comparing with the query evaluation strategy proposed in [23] and the widely adopted Pig Latin and Hive SQL solutions, our method achieves significant improvement of the join processing efficiency.Comment: VLDB201

    Description of the Chinese-to-Spanish rule-based machine translation system developed with a hybrid combination of human annotation and statistical techniques

    Get PDF
    Two of the most popular Machine Translation (MT) paradigms are rule based (RBMT) and corpus based, which include the statistical systems (SMT). When scarce parallel corpus is available, RBMT becomes particularly attractive. This is the case of the Chinese--Spanish language pair. This article presents the first RBMT system for Chinese to Spanish. We describe a hybrid method for constructing this system taking advantage of available resources such as parallel corpora that are used to extract dictionaries and lexical and structural transfer rules. The final system is freely available online and open source. Although performance lags behind standard SMT systems for an in-domain test set, the results show that the RBMT’s coverage is competitive and it outperforms the SMT system in an out-of-domain test set. This RBMT system is available to the general public, it can be further enhanced, and it opens up the possibility of creating future hybrid MT systems.Peer ReviewedPostprint (author's final draft

    A Survey on Compiler Autotuning using Machine Learning

    Full text link
    Since the mid-1990s, researchers have been trying to use machine-learning based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order of applying optimizations). The compiler optimization space continues to grow due to the advancement of applications, increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for the compiler optimization field, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, the fine-grain classification among different approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated quarterly here (Send me your new published papers to be added in the subsequent version) History: Received November 2016; Revised August 2017; Revised February 2018; Accepted March 2018

    Discovery and Selection of Certified Web Services Through Registry-Based Testing and Verification

    Get PDF
    Reliability and trust are fundamental prerequisites for the establishment of functional relationships among peers in a Collaborative Networked Organisation (CNO), especially in the context of Virtual Enterprises where economic benefits can be directly at stake. This paper presents a novel approach towards effective service discovery and selection that is no longer based on informal, ambiguous and potentially unreliable service descriptions, but on formal specifications that can be used to verify and certify the actual Web service implementations. We propose the use of Stream X-machines (SXMs) as a powerful modelling formalism for constructing the behavioural specification of a Web service, for performing verification through the generation of exhaustive test cases, and for performing validation through animation or model checking during service selection
    • …
    corecore