4 research outputs found

    Model Transformation Languages with Modular Information Hiding

    Get PDF
    Model transformations, together with models, form the principal artifacts in model-driven software development. Industrial practitioners report that transformations on larger models quickly get sufficiently large and complex themselves. To alleviate entailed maintenance efforts, this thesis presents a modularity concept with explicit interfaces, complemented by software visualization and clustering techniques. All three approaches are tailored to the specific needs of the transformation domain

    Métodos de clustering en datos de expresión génica

    Get PDF
    Clustering is an old data analysis problem that has been extensively studied during the last decades. However, there is not a single algorithm that provides a satisfactory result for every data set. Moreover, there exist some problems related to cluster analysis that also remain unsolved. In this monograph we study some of such problems as they commonly appear in practice, and test how they work when applied to gene expression data analysis, where clustering is widely used. Different clustering algorithms often lead to different results, and in order to make sense out of them it is important to understand how clusters from one analysis relate to those from a different one. A comparison method to find and visualize many-to-many relationships between two clusterings, either two flat clusterings or a flat and a hierarchical clustering, is presented. The similarities between clusters are represented by a weighted bipartite graph, where the nodes are the clusters and an edge weight shows the number of elements in common to the connected nodes. To visualize the relationships between clusterings the number of edge crossings is minimized. When applied to the case of comparing a hierarchical and a flat clustering we use a criterion based either on the graph layout aesthetics or in the mutual information, to decide where to cut the hierarchical tree. Since iterative methods are sensitive to the initial parameters, we have developed two refinement algorithms designed to improve this initial state, based on the notion of data depth. One of these algorithms looks for initial points in the same data space, while the second one, using the bootstrap technique, selects the initial seeds in a new space of bootstrap centroids. Also, this second approach allows to construct a soft (non-hard) clustering of the data, that assigns to each point a probability of belonging to each cluster, and thus a single point may partially belong to more than one cluster. On the other hand, the number of clusters underlying in a data set is usually unknown. Using ideas from the clustering comparison method previously proposed and from the data depth concept, we present three procedures to estimate the number of real groups. The first two methods consist basically in sampling pairs of clusterings from a population and successively performing comparisons between them to find a consensus in the number of clusters, and the third one looks for representative subsets of the clusters whose diameter is used to estimate the optimal number of real groups. The extensive study we carried out in simulated and real gene expression data shows that the techniques presented here are useful and e±cient. The results that we obtained with real data make sense not only from a statistical point of view, but they have proven to have a biological meaning. ______________________________________________El análisis cluster es un antiguo problema revivido en las últimas décadas. En el trabajo presentado abordamos algunos problemas que aparecen en la práctica. Para entender los distintos resultados producidos por diferentes algoritmos es importante estudiar la relación entre clusters procedentes de análisis diferentes, por lo que presentamos un método de comparación para visualizar relaciones entre clusterings jerárquicos o no-jerárquicos, basado en grafos, utilizando un criterio de estética o de información mutua para cortar los dendrogramas en el caso jerárquico. Desarrollamos dos algoritmos de refinamiento del estado inicial de métodos de clustering iterativos, utilizando el concepto de profundidad y bootstrap. Esto además permite desarrollar un algoritmo de clustering no rígido, asignando a los puntos probabilidades de pertenencia a los clusters. Para determinar el número de grupos de un conjunto (habitualmente desconocido) hemos utilizado ideas del método de comparación y el concepto de profundidad, desarrollando tres técnicas de estimación. Hemos realizado un estudio extensivo para todos los métodos propuestos en datos simulados y en datos de expresión génica, y hemos probado que las técnicas desarrolladas en este trabajo son útiles y eficientes, tanto desde un punto de vista estadístico como biológic

    Software restructuring: understanding longitudinal architectural changes and refactoring

    Get PDF
    The complexity of software systems increases as the systems evolve. As the degradation of the system's structure accumulates, maintenance effort and defect-proneness tend to increase. In addition, developers often opt to employ sub-optimal solutions in order to achieve short-time goals, in a phenomenon that has been recently called technical debt. In this context, software restructuring serves as a way to alleviate and/or prevent structural degradation. Restructuring of software is usually performed in either higher or lower levels of granularity, where the first indicates broader changes in the system's structural architecture and the latter indicates refactorings performed to fewer and localised code elements. Although tools to assist architectural changes and refactoring are available, there is still no evidence these approaches are widely adopted by practitioners. Hence, an understanding of how developers perform architectural changes and refactoring in their daily basis and in the context of the software development processes they adopt is necessary. Current software development is iterative and incremental with short cycles of development and release. Thus, tools and processes that enable this development model, such as continuous integration and code review, are widespread among software engineering practitioners. Hence, this thesis investigates how developers perform longitudinal and incremental architectural changes and refactoring during code review through a wide range of empirical studies that consider different moments of the development lifecycle, different approaches, different automated tools and different analysis mechanisms. Finally, the observations and conclusions drawn from these empirical investigations extend the existing knowledge on how developers restructure software systems, in a way that future studies can leverage this knowledge to propose new tools and approaches that better fit developers' working routines and development processes
    corecore