    (Di)graph products, labelings and related results

    Gallian's survey shows that there is a wide variety of graph labelings. By means of (di)graph products we can establish strong relations among some of them. Moreover, due to the freedom of one of the factors, we can also obtain enumerative results that provide lower bounds on the number of nonisomorphic labelings of a particular type. In this paper, we focus on three of the (di)graph products that have been used for these purposes: the ⊗h-product of digraphs, the weak tensor product of graphs and the weak ⊗h-product of graphs. Research supported by the Spanish Government under project MTM2014-60127-P and symbolically by the Catalan Research Council under grant 2014SGR1147.

    Practical programming for static average-case analysis: the MOQA investigation

    This work considers the static calculation of a program’s average-case time. The number of systems that currently tackle this research problem is quite small due to the difficulties inherent in average-case analysis. While each of these systems makes a pertinent contribution, and is individually discussed in this work, only one of them forms the basis of this research. That particular system is known as MOQA. The MOQA system consists of the MOQA language and the MOQA static analysis tool. Its technique for statically determining average-case behaviour centres on maintaining strict control over both the data structure type and the labeling distribution. This research develops and evaluates the MOQA language implementation, and adds to the functions already available in this language. Furthermore, the theory that backs MOQA is generalised and the range of data structures for which the MOQA static analysis tool can determine average-case behaviour is increased. Also, some of the MOQA applications and extensions suggested in other works are logically examined here. For example, the accuracy of classifying the MOQA language as reversible is investigated, along with the feasibility of incorporating duplicate labels into the MOQA theory. Finally, the analyses that take place during the course of this research reveal some of the MOQA strengths and weaknesses. This thesis aims to be pragmatic when evaluating the current MOQA theory, the advancements set forth in the following work and the benefits of MOQA when compared to similar systems. Succinctly, this work’s significant expansion of the MOQA theory is accompanied by a realistic assessment of MOQA’s accomplishments and a serious deliberation of the opportunities available to MOQA in the future.

    Clustering of RCE Workflow Graphs

    RCE is an integration environment that allows users to create automated workflows orchestrating multi-disciplinary simulation tools in a distributed manner. A workflow consists of components representing tools and connections between these components. Users can group the components within the GUI by creating colored labels. This requires specialist knowledge and is a fully manual task. We investigate the feasibility of automating this task by applying graph clustering methods to such workflows. To this end, we model workflows as graphs, taking components as vertices and connections as edges, and transferring connection properties to edge weights. We examine three different hierarchical clustering algorithms: edge betweenness, spectral bisection and agglomerative clustering. Additionally, we apply four different metrics to stop the algorithms when a cluster is found: cluster density, global clustering coefficient, average local clustering coefficient and modularity. We examine different mappings of edge weights in combination with the mentioned algorithms and metrics. As groups in workflows have no canonical definition, we evaluate our approach qualitatively. We consider 27 of the results from 1008 parameter combinations to be useful. The most expedient approach across multiple workflows is the edge betweenness algorithm combined with the modularity metric on an undirected graph representation. The scores for the metrics and the mapping vary across workflows and do not enable us to draw general conclusions. We show that our approach is feasible, but note that a quantitative study is necessary to validate our results in general.
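To make the modularity metric mentioned in this abstract concrete, here is a minimal sketch of the standard Newman modularity score for an undirected, weighted graph. It assumes the textbook definition; the abstract does not say which variant the authors use, and the function name `modularity`, the sample `EDGES` list and the vertex names are hypothetical illustrations, not taken from RCE.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected weighted graph.

    edges: iterable of (u, v, weight) tuples, each undirected edge listed once.
    community_of: dict mapping every vertex to its cluster label.
    """
    total_weight = 0.0               # m: sum of all edge weights
    internal = defaultdict(float)    # weight of edges inside each cluster
    degree_sum = defaultdict(float)  # summed weighted degree per cluster
    for u, v, w in edges:
        total_weight += w
        degree_sum[community_of[u]] += w
        degree_sum[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    if total_weight == 0:
        return 0.0
    # Q = sum over clusters of (internal share) - (expected share)
    return sum(
        internal[c] / total_weight - (degree_sum[c] / (2 * total_weight)) ** 2
        for c in degree_sum
    )


# Hypothetical workflow graph: two triangles of components joined by one connection.
EDGES = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]
TWO_GROUPS = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
ONE_GROUP = {v: 0 for v in TWO_GROUPS}

if __name__ == "__main__":
    print(modularity(EDGES, TWO_GROUPS), modularity(EDGES, ONE_GROUP))
```

In this toy graph, the partition into the two triangles scores higher than putting every component in one group, which is the kind of signal a stopping criterion based on modularity would rely on.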

    The Doors of Perception

    We investigate how a player’s strategic behavior is affected by the set of notions she uses in thinking about the game, i.e., the “frame”. To do so, we consider matching games where two players are presented with a set of objects, from which each player must privately choose one (with the goal of matching the counterpart’s choice). We propose a novel theory positing that different player types are aware of different attributes of the strategy options, and hence have different frames; we then rationalize why differences in players’ frames may lead to differences in choice behavior. Unlike previous theories of framing, our model features an epistemic structure allowing for the case in which an individual learns new frames, given some initial unawareness (of the fact that her perception of attributes may be incomplete). To test our model, we introduce an experimental design in which we bring about different frames by manipulating subjects’ awareness of various attributes. The experimental results provide strong support for our theory.

    Automated Gene Classification using Nonnegative Matrix Factorization on Biomedical Literature

    Understanding functional gene relationships is a challenging problem for biological applications. High-throughput technologies such as DNA microarrays have inundated biologists with a wealth of information; however, processing that information remains problematic. To help with this problem, researchers have begun applying text mining techniques to the biological literature. This work extends previous work based on Latent Semantic Indexing (LSI) by examining Nonnegative Matrix Factorization (NMF). Whereas LSI incorporates the singular value decomposition (SVD) to approximate data in a dense, mixed-sign space, NMF produces a parts-based factorization that is directly interpretable. This space can, in theory, be used to augment existing ontologies and annotations by identifying themes within the literature. Of course, performing NMF does not come without a price, namely the large number of parameters. This work attempts to analyze the effects of some of the NMF parameters on both convergence and labeling accuracy. Since there is a dearth of automated label evaluation techniques as well as “gold standard” hierarchies, a method to produce “correct” trees is proposed, as well as a technique to label trees and to evaluate those labels.
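As an illustration of the nonnegative, parts-based factorization described in this abstract, the following is a minimal pure-Python sketch of NMF using Lee and Seung's multiplicative updates for the Frobenius-norm objective. The abstract does not state which NMF algorithm or parameters the work uses, so the update rule, the function names (`nmf`, `frobenius_error`) and the toy matrix `V` are assumptions chosen for illustration only.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate nonnegative V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


# Hypothetical term-by-document matrix with two obvious "themes".
V = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

if __name__ == "__main__":
    W, H = nmf(V, rank=2)
    print(frobenius_error(V, W, H))
```

Because the updates only multiply nonnegative numbers, W and H stay nonnegative throughout, which is what makes the factors readable as additive "parts". The choices of `rank`, `iterations` and the random `seed` are examples of the parameters whose effect on convergence the abstract says must be studied.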

    On improving the performance of optimistic distributed simulations

    This report investigates means of improving the performance of optimistic distributed simulations without affecting the simulation accuracy. We argue that existing clustering algorithms are not adequate for application in distributed simulations, and outline some characteristics of an ideal algorithm that could be applied in this field. This report is structured as follows. We start by introducing the area of distributed simulation. Following a comparison of the dominant protocols used in distributed simulation, we elaborate on the current approaches to improving simulation performance: using computationally efficient techniques, exploiting the hardware configuration of processors, applying optimizations that can be derived from the simulation scenario, etc. We introduce the core characteristics of clustering approaches and argue that these cannot be applied in real-life distributed simulation problems. We present a typical distributed simulation setting and elaborate on the reasons that existing clustering approaches are not expected to improve the performance of a distributed simulation. We introduce a prototype distributed simulation platform that has been developed in the scope of this research, focusing on the area of emergency response and specifically building evacuation. We continue by outlining our current work on this issue, and finally, we end this report by outlining next steps that could be taken in this field.

    Evaluating Clusterings by Estimating Clarity

    In this thesis I examine clustering evaluation, with a particular focus on text clusterings. The principal work of this thesis is the development, analysis, and testing of a new internal clustering quality measure called informativeness. I begin by reviewing clustering in general. I then review current clustering quality measures, accompanying this with an in-depth discussion of many of the important properties one needs to understand about such measures. This is followed by extensive document clustering experiments that show problems with standard clustering evaluation practices. I then develop informativeness, my new internal clustering quality measure for estimating the clarity of clusterings. I show that informativeness, which uses classification accuracy as a proxy for human assessment of clusterings, is both theoretically sensible and works empirically. I present a generalization of informativeness that leverages external clustering quality measures. I also show its use in a realistic application: email spam filtering. I show that informativeness can be used to select clusterings which lead to superior spam filters when few true labels are available. I conclude this thesis with a discussion of clustering evaluation in general, informativeness, and the directions I believe clustering evaluation research should take in the future.

    An inter-domain supervision framework for collaborative clustering of data with mixed types

    We propose an Inter-Domain Supervision (IDS) clustering framework to discover clusters within diverse data formats, mixed-type attributes and different sources of data. This approach can be used for combined clustering of diverse representations of the data, in particular where data comes from different sources, some of which may be unreliable or uncertain, or for exploiting optional external concept set labels to guide the clustering of the main data set in its original domain. We additionally take into account possible incompatibilities in the data via an automated inter-domain compatibility analysis. Our results in clustering real data sets with mixed numerical, categorical, visual and text attributes show that the proposed IDS clustering framework gives improved clustering results compared to conventional methods, over a wide range of parameters. Thus the automatically extracted knowledge, in the form of seeds or constraints, obtained from clustering one domain, can provide additional knowledge to guide the clustering in another domain. Additional empirical evaluations further show that our approach, especially when using selective mutual guidance between domains, outperforms common baselines such as clustering either domain on its own or clustering all domains converted to a single target domain. Our approach also outperforms other specialized multiple clustering methods, such as the fully independent ensemble clustering and the tightly coupled multiview clustering, after they were adapted to the task of clustering mixed data. Finally, we present a real-life application of our IDS approach to the cluster-based automated image annotation problem and present evaluation results on a benchmark data set consisting of images described by their visual content along with noisy text descriptions generated by users on the social media sharing website Flickr.