329 research outputs found

    Transforming Graph Representations for Statistical Relational Learning

    Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since the choice of relational data representation for the nodes, links, and features can dramatically affect the capabilities of SRL algorithms, we survey approaches and opportunities for relational representation transformation designed to improve the performance of these algorithms. This leads us to introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. In particular, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey and compare competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.

    Finding the pitfalls in query performance

    Despite their popularity, database benchmarks only highlight a small part of the capabilities of any given system. They do not necessarily highlight problematic components encountered in real life or provide hints for further research and engineering. In this paper we introduce discriminative performance benchmarking, which aids in exploring a larger search space to find performance outliers and their underlying cause. The approach is based on deriving a domain-specific language from a sample query to identify a query workload. SQLscalpel subsequently explores the space using query morphing and simulated annealing to find performance outliers and the query components responsible. To speed up the exploration for often time-consuming experiments, SQLscalpel has been designed to run asynchronously on a large cluster of machines.
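
    The exploration idea in the abstract above can be illustrated with a minimal sketch. This is not SQLscalpel's implementation: the component names, the synthetic runtime function, and the cooling schedule are all assumptions made for illustration. A real harness would execute each morphed query and measure its runtime.

```python
import math
import random

# Hypothetical query components; a "query" is the subset that is switched on.
COMPONENTS = ["join_orders", "filter_date", "group_by", "order_by", "like_pattern"]


def synthetic_runtime(query):
    """Stand-in for a measured runtime (assumption, not a real benchmark)."""
    cost = 1.0 + 0.5 * len(query)
    if "join_orders" in query and "like_pattern" in query:
        cost += 10.0  # planted outlier: an expensive component interaction
    return cost


def morph(query, rng):
    """Query morphing, reduced here to toggling one component on or off."""
    return frozenset(query ^ {rng.choice(COMPONENTS)})


def anneal(steps=500, start_temp=5.0, seed=0):
    """Simulated annealing that searches for the slowest query variant."""
    rng = random.Random(seed)
    current = frozenset()
    best = current
    for step in range(steps):
        # Linear cooling; the small constant avoids division by zero.
        temp = start_temp * (1.0 - step / steps) + 1e-9
        candidate = morph(current, rng)
        delta = synthetic_runtime(candidate) - synthetic_runtime(current)
        # Maximising runtime: always accept slower variants, and accept
        # faster ones with a probability that shrinks as temp falls.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
        if synthetic_runtime(current) > synthetic_runtime(best):
            best = current
    return best, synthetic_runtime(best)


if __name__ == "__main__":
    outlier, runtime = anneal()
    print(sorted(outlier), runtime)
```

    The components present in the returned variant point at the interaction responsible for the outlier, which mirrors the abstract's goal of identifying the query components responsible.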

    Influence modelling and learning between dynamic bayesian networks using score-based structure learning

    A Ph.D. thesis submitted to the Faculty of Science, University of the Witwatersrand, in fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, May 2018. Although partially observable stochastic processes are ubiquitous in many fields of science, little work has been devoted to discovering and analysing the means by which several such processes may interact to influence each other. In this thesis we extend probabilistic structure learning between random variables to the context of temporal models which represent partially observable stochastic processes. Learning an influence structure and distribution between processes can be useful for density estimation and knowledge discovery. A common approach to structure learning, in observable data, is score-based structure learning, where we search for the most suitable structure by using a scoring metric to value structural configurations relative to the data. Most popular structure scores are variations on the likelihood score, which calculates the probability of the data given a potential structure. In observable data, the decomposability of the likelihood score, which is the ability to represent the score as a sum of family scores, allows for efficient learning procedures and significant computational saving. However, in incomplete data (either by latent variables or missing samples), the likelihood score is not decomposable and we have to perform inference to evaluate it. This forces us to use non-linear optimisation techniques to optimise the likelihood function. Furthermore, local changes to the network can affect other parts of the network, which makes learning with incomplete data all the more difficult. We define two general types of influence scenarios, direct influence and delayed influence, which can be used to define influence around richly structured spaces consisting of multiple processes that are interrelated in various ways.
We will see that although it is possible to capture both types of influence in a single complex model by using a setting of the parameters, complex representations run into fragmentation issues. This is handled by extending the language of dynamic Bayesian networks to allow us to construct single compact models that capture the properties of a system’s dynamics, and produce influence distributions dynamically. The novelty and intuition of our approach is to learn the optimal influence structure in layers. We first learn a set of independent temporal models, and thereafter optimise a structure score over possible structural configurations between these temporal models. Since the search for the optimal structure is done using complete data, we can take advantage of efficient learning procedures from the structure learning literature. We provide the following contributions: we (a) introduce the notion of influence between temporal models; (b) extend traditional structure scores for random variables to structure scores for temporal models; (c) provide a complete algorithm to recover the influence structure between temporal models; (d) provide a notion of structural assembles to relate temporal models for types of influence; and finally, (e) provide empirical evidence for the effectiveness of our method with respect to generative ground-truth distributions. The presented results emphasise the trade-off between the likelihood of an influence structure relative to the ground truth and the computational complexity of expressing it. Depending on the availability of samples, we might choose different learning methods to express influence relations between processes. On the one hand, when given too few samples, we may choose to learn a sparse structure using tree-based structure learning or even use no influence structure at all.
On the other hand, when given an abundant number of samples, we can use penalty-based procedures that achieve rich, meaningful representations using local search techniques. Once we consider high-level representations of dynamic influence between temporal models, we open the door to very rich and expressive representations which emphasise the importance of knowledge discovery and density estimation in the temporal setting.
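
    The decomposability property mentioned in the abstract above can be written out as follows. The notation is an assumption borrowed from common textbook treatments of Bayesian network structure learning, not taken from the thesis itself: a graph structure G over variables X_1, ..., X_n, a complete dataset D, and the parent set of X_i in G.

```latex
% Likelihood score of structure G on complete data D, written as a sum of
% per-family terms (one term per variable and its parents in G).
\[
  \operatorname{score}_L(\mathcal{G} : \mathcal{D})
  \;=\;
  \sum_{i=1}^{n}
  \operatorname{FamScore}\!\left(X_i \,\middle|\, \mathrm{Pa}^{\mathcal{G}}_{X_i} : \mathcal{D}\right)
\]
```

    Because each term depends only on one variable and its parents, a local change such as adding or removing a single edge requires re-evaluating only the affected family term. With latent variables or missing samples this sum no longer holds, which is the difficulty the abstract describes.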

    A simulation modelling approach to improve the OEE of a bottling line

    This dissertation presents a simulation approach to improve the efficiency, measured as overall equipment effectiveness (OEE), of an automated bottling line. A simulation model of the system is built in the AnyLogic software and used to solve the case. Two problems are addressed: a sequencing problem concerning the order in which bottle formats are processed, and a buffer sizing problem. Both theoretical aspects of OEE, job sequencing, and simulation and practical aspects are presented.
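
    For readers unfamiliar with the metric, the sketch below computes OEE using the commonly used definition as the product of availability, performance, and quality. The function name, parameter names, and shift figures are illustrative assumptions and do not come from the dissertation.

```python
def oee(planned_time_min, downtime_min, ideal_cycle_time_min, total_count, good_count):
    """Overall equipment effectiveness from one shift's figures.

    All parameter names and units (minutes, bottles) are assumptions
    chosen for this illustration.
    """
    run_time = planned_time_min - downtime_min
    # Share of planned time in which the line was actually running.
    availability = run_time / planned_time_min
    # Actual output speed relative to the ideal cycle time.
    performance = (ideal_cycle_time_min * total_count) / run_time
    # Share of produced bottles that passed quality checks.
    quality = good_count / total_count
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }


if __name__ == "__main__":
    # Hypothetical shift: 480 planned minutes, 60 minutes of stoppages,
    # an ideal cycle of 0.01 minutes per bottle, 36000 bottles filled,
    # of which 35000 were good.
    print(oee(480, 60, 0.01, 36000, 35000))
```

    The three factors cancel so that OEE equals good output at the ideal rate divided by planned time, which is why changeover time from job sequencing and blocking or starving from buffer sizing both show up in the same figure.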

    Optimización del diseño estructural de pavimentos asfálticos para calles y carreteras (Optimization of the structural design of asphalt pavements for streets and highways)

    The construction of asphalt pavements in streets and highways is an activity that requires optimizing the consumption of significant economic and natural resources. Pavement design optimization meets contradictory objectives according to the availability of resources and users’ needs. This dissertation explores the application of metaheuristics to optimize the design of asphalt pavements using an incremental design based on the prediction of damage and vehicle operating costs (VOC). The costs are proportional to energy and resource consumption and polluting emissions. The evolution of asphalt pavement design and metaheuristic optimization techniques on this topic were reviewed. Four computer programs were developed: (1) UNLEA, a program for the structural analysis of multilayer systems. (2) PSO-UNLEA, a program that uses the particle swarm optimization (PSO) metaheuristic for the backcalculation of pavement moduli. (3) UNPAVE, an incremental pavement design program that is based on the equations of the North American MEPDG and includes the computation of vehicle operating costs based on IRI. (4) PSO-PAVE, a PSO program to search for thicknesses that optimize the design considering construction and vehicle operating costs. The case studies show that the backcalculation and structural design of pavements can be optimized by PSO considering restrictions in the thickness and the selection of materials. Future developments should reduce the computational cost and calibrate the pavement performance and VOC models. Doctoral thesis, Doctor en Ingeniería - Ingeniería Automática.
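
    As a rough illustration of how particle swarm optimization can search layer thicknesses under bounds, here is a minimal sketch. It is not PSO-PAVE: the two-layer toy cost function, its coefficients, the thickness bounds, and the swarm parameters are all assumptions invented for this example, standing in for the damage and cost models the dissertation describes.

```python
import random


def toy_cost(thicknesses):
    """Hypothetical cost: construction cost plus a penalty for weak designs."""
    asphalt_cm, base_cm = thicknesses
    construction = 3.0 * asphalt_cm + 1.0 * base_cm
    capacity = 2.0 * asphalt_cm + 1.0 * base_cm  # made-up strength index
    shortfall = max(0.0, 40.0 - capacity)
    return construction + 100.0 * shortfall


# Assumed thickness limits in cm: (asphalt layer, granular base).
BOUNDS = [(5.0, 30.0), (10.0, 60.0)]


def pso(cost, bounds, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO that minimises cost inside box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus pulls toward personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the allowed thickness range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


if __name__ == "__main__":
    best, best_cost = pso(toy_cost, BOUNDS)
    print(best, best_cost)
```

    Clamping positions to the bounds is one simple way to respect thickness restrictions; material selection, which the abstract also mentions, would need a discrete encoding that this sketch leaves out.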