1,460 research outputs found

    Combinatorial algorithm for counting small induced graphs and orbits

    Graphlet analysis is an approach to network analysis that is particularly popular in bioinformatics. We show how to set up a system of linear equations that relate the orbit counts and can be used in an algorithm that is significantly faster than the existing approaches based on direct enumeration of graphlets. The algorithm requires the existence of a vertex with certain properties; we show that such a vertex exists for graphlets of arbitrary size, except for complete graphs and $C_4$, which are treated separately. Empirical analysis of running time agrees with the theoretical results.

    Attribute Interactions in Medical Data Analysis

    There is much empirical evidence about the success of naive Bayesian classification (NBC) in medical applications of attribute-based machine learning. NBC assumes conditional independence between attributes. In classification, such classifiers sum up the pieces of class-related evidence from individual attributes, independently of other attributes. The performance, however, deteriorates significantly when the “interactions” between attributes become critical. We propose an approach to handling attribute interactions within the framework of “voting” classifiers, such as NBC. We propose an operational test for detecting interactions in learning data and a procedure that takes the detected interactions into account while learning. This approach induces a structuring of the domain of attributes; it may improve the classifier’s performance and may provide useful novel information for the domain expert when interpreting the results of learning. We report on its application in data analysis and model construction for the prediction of clinical outcome in hip arthroplasty.
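
The "voting" view of NBC described above can be sketched as follows: each attribute contributes an additive log-odds term, and the terms are summed independently of one another. This is a minimal illustration on invented toy data, not the authors' procedure; the function names, the Laplace smoothing constant `alpha`, and the XOR-style data are all assumptions. Merging two attributes into one joint attribute is shown only as one hypothetical way of accounting for an interaction.

```python
import math
from collections import Counter

def train_nbc(rows, labels, alpha=1.0):
    """Collect class counts and per-attribute conditional counts (classes are 0 and 1)."""
    prior = Counter(labels)
    cond = Counter()  # (attribute index, value, class) -> count
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            cond[(i, v, y)] += 1
    # Number of distinct values per attribute, needed for Laplace smoothing.
    n_values = [len({row[i] for row in rows}) for i in range(len(rows[0]))]
    return {"prior": prior, "cond": cond, "n_values": n_values, "alpha": alpha}

def votes(model, row):
    """Per-attribute log-odds votes for class 1 versus class 0."""
    a = model["alpha"]
    out = []
    for i, v in enumerate(row):
        k = model["n_values"][i]
        p1 = (model["cond"][(i, v, 1)] + a) / (model["prior"][1] + a * k)
        p0 = (model["cond"][(i, v, 0)] + a) / (model["prior"][0] + a * k)
        out.append(math.log(p1 / p0))
    return out

def predict(model, row):
    """Sum the prior vote and the attribute votes; predict class 1 if the total is positive."""
    prior_vote = math.log(model["prior"][1] / model["prior"][0])
    return 1 if prior_vote + sum(votes(model, row)) > 0 else 0

# Toy data (invented): the class is the XOR of two binary attributes,
# so neither attribute carries any evidence on its own.
rows = [(a, b) for a in (0, 1) for b in (0, 1)] * 5
xor_labels = [a ^ b for a, b in rows]

xor_model = train_nbc(rows, xor_labels)
plain_preds = [predict(xor_model, r) for r in rows]

# Hypothetical remedy: treat the interacting pair as a single joint attribute.
joined_rows = [((a, b),) for a, b in rows]
joined_model = train_nbc(joined_rows, xor_labels)
joined_preds = [predict(joined_model, r) for r in joined_rows]

plain_acc = sum(p == y for p, y in zip(plain_preds, xor_labels)) / len(rows)
joined_acc = sum(p == y for p, y in zip(joined_preds, xor_labels)) / len(rows)
print(plain_acc, joined_acc)
```

On this toy data every individual vote is zero, so the summed evidence cannot separate the classes, while the joint attribute separates them completely.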

    Sequential Symbolic Regression with Genetic Programming

    This chapter describes the Sequential Symbolic Regression (SSR) method, a new strategy for function approximation in symbolic regression. The SSR method is inspired by the sequential covering strategy from machine learning, but instead of sequentially reducing the size of the problem being solved, it sequentially transforms the original problem into potentially simpler problems. This transformation is performed according to the semantic distances between the desired and obtained outputs and a geometric semantic operator. The rationale behind SSR is that, after generating a suboptimal function f via symbolic regression, the output errors can be approximated by another function in a subsequent iteration. The method was tested on eight polynomial functions and compared with canonical genetic programming (GP) and geometric semantic genetic programming (SGP). Results showed that SSR significantly outperforms SGP and shows no statistically significant difference from GP. More importantly, they show the potential of the proposed strategy: an effective way of applying geometric semantic operators to combine different (partial) solutions, avoiding the exponential growth problem arising from the use of these operators.
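
The rationale stated above (fit a suboptimal function, then approximate its output errors in a later iteration) can be illustrated with a much simpler stand-in. The sketch below is not the SSR method: a least-squares fit of a single monomial replaces the GP run, and plain residual subtraction replaces the geometric semantic transformation. The target polynomial, the sample points, and all function names are assumptions made for illustration.

```python
def fit_best_monomial(xs, targets, max_degree=4):
    """Stand-in for one regression run: pick the term c * x**d with the least squared error."""
    best = None
    for d in range(max_degree + 1):
        basis = [x ** d for x in xs]
        denom = sum(b * b for b in basis)
        if denom == 0:
            continue
        # Closed-form least-squares coefficient for a single basis function.
        c = sum(b * t for b, t in zip(basis, targets)) / denom
        err = sum((t - c * b) ** 2 for b, t in zip(basis, targets))
        if best is None or err < best[0]:
            best = (err, c, d)
    return best[1], best[2]

def sequential_fit(xs, ys, iterations):
    """Repeatedly fit the current residual and accumulate the partial solutions."""
    terms = []
    residual = list(ys)
    for _ in range(iterations):
        c, d = fit_best_monomial(xs, residual)
        terms.append((c, d))
        # The next iteration only has to explain what is still unexplained.
        residual = [r - c * x ** d for r, x in zip(residual, xs)]
    return terms

def evaluate(terms, x):
    """The combined model is the sum of all partial solutions."""
    return sum(c * x ** d for c, d in terms)

def sse(terms, xs, ys):
    return sum((y - evaluate(terms, x)) ** 2 for x, y in zip(xs, ys))

# Invented target: y = x^3 + 2x + 1 sampled on [-1, 1].
xs = [i / 5 for i in range(-5, 6)]
ys = [x ** 3 + 2 * x + 1 for x in xs]

error_after_1 = sse(sequential_fit(xs, ys, 1), xs, ys)
error_after_3 = sse(sequential_fit(xs, ys, 3), xs, ys)
print(error_after_1, error_after_3)
```

Because each partial solution is a single term added to the running sum, the combined model grows by one term per iteration in this sketch.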

    Seed selection for information cascade in multilayer networks

    Information spreading is an interesting field in the domain of online social media. In this work, we investigate how different seed selection strategies affect the spreading processes simulated using the independent cascade model on eighteen multilayer social networks. Fifteen networks are built from user interaction data extracted from Facebook public pages, and three of them are multilayer networks downloaded from a public repository (two of them being Twitter networks). The results indicate that various state-of-the-art seed selection strategies for single-layer networks, like K-Shell or VoteRank, do not perform as well on multilayer networks and are outperformed by Degree Centrality.
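
A minimal sketch of the setup described above: seeds chosen by degree, then an independent cascade simulated over several layers. The abstract does not specify how degree or propagation is defined across layers, so the choices here are assumptions: degree is summed over layers, and an active node gets one activation attempt per neighbor per layer. The tiny two-layer network and all names are invented.

```python
import random

def total_degree(layers):
    """Degree of each node summed over all layers (an assumed multilayer degree)."""
    deg = {}
    for adj in layers:
        for node, nbrs in adj.items():
            deg[node] = deg.get(node, 0) + len(nbrs)
    return deg

def top_degree_seeds(layers, k):
    """Pick the k nodes with the highest total degree; ties are broken by node id."""
    deg = total_degree(layers)
    return sorted(deg, key=lambda n: (-deg[n], n))[:k]

def independent_cascade(layers, seeds, p, rng):
    """Each newly active node tries once to activate each inactive neighbor with probability p."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for adj in layers:
                for v in adj.get(u, ()):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        newly_active.append(v)
        frontier = newly_active
    return active

# Invented two-layer network: a star around node 0 and a path 3-4-5.
layer_a = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
layer_b = {3: [4], 4: [3, 5], 5: [4]}
layers = [layer_a, layer_b]

seeds = top_degree_seeds(layers, 1)
rng = random.Random(0)
# Average spread over repeated simulations, as the process is stochastic.
runs = [len(independent_cascade(layers, seeds, 0.5, rng)) for _ in range(200)]
print(seeds, sum(runs) / len(runs))
```

Other seed selection strategies can be compared by swapping `top_degree_seeds` for another function that returns a list of nodes.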

    Synthesis and Crystal Structure of Tetraethylammonium Di-μ-fluoro-bis[aquadifluoro-oxovanadate(IV)]

    The title complex, [NEt4]2[V2O2F6(H2O)2], has been isolated from an aqueous solution of VOF2 and [NEt4]F. The crystal structure has been determined from three-dimensional counter X-ray data. It crystallizes in the monoclinic space group P21/n with a = 708.8(1), b = 1316.6(2), c = 1362.4(2) pm, β = 97.58(1)° and Z = 2. Least-squares refinement of the structure based on 1523 observations led to final discrepancy indices of R = 0.058 and Rw = 0.067. The structure consists of discrete dinuclear units [V2O2F6(H2O)2]2- with a crystallographic centre of inversion. Dimeric units are linked into chains by hydrogen bonds [O-H...F 259.1(5) and 267.9(5) pm]. The geometry around vanadium is distorted octahedral with V-F distances from 192.0(4) to 220.9(3) pm, V-O 159.4(4) pm, V-OH2 208.3(4) pm, V-V 332.8(1) pm, and V-F-V 106.0(2)°.

    Automatic detection of potentially illegal online sales of elephant ivory via data mining

    In this work, we developed an automated system to detect potentially illegal elephant ivory items for sale on eBay. Two law enforcement experts, with specific knowledge of elephant ivory identification, manually classified items on sale in the Antiques section of eBay UK over an 8-week period. This set the “Gold Standard” that we aim to emulate using data mining. We achieved close to 93% accuracy with less data than the experts: we relied entirely on metadata and did not use item descriptions or associated images, which demonstrates the potential and generality of our approach. The reported accuracy may be improved with the addition of text mining techniques for the analysis of the item description, and by applying image classification for the detection of Schreger lines, indicative of elephant ivory. However, any solution relying on images or text descriptions could not be employed on other illegal wildlife markets where pictures can be missing or misleading and text absent (e.g., Instagram). In our setting, we gave human experts all available information while using only minimal information for our analysis. Despite this, we achieved very high accuracy. This work is an important first step in speeding up the laborious, tedious and expensive task of expert discovery of illegal trade over the internet. It will also allow for faster reporting to law enforcement and better accountability. We hope this will also contribute to reducing poaching, by making this illegal trade harder and riskier for those involved.

    Quantifying gaze and mouse interactions on spatial visual interfaces with a new movement analytics methodology

    This research was supported by the Royal Society International Exchange Programme (grant no. IE120643). Eye movements provide insights into what people pay attention to, and therefore are commonly included in a variety of human-computer interaction studies. Eye movement recording devices (eye trackers) produce gaze trajectories, that is, sequences of gaze locations on the screen. Despite recent technological developments that have made hardware more affordable, gaze data are still costly and time-consuming to collect, so some propose using mouse movements instead. These are easy to collect automatically and on a large scale. If and how these two movement types are linked, however, is less clear and highly debated. We address this problem in two ways. First, we introduce a new movement analytics methodology to quantify the level of dynamic interaction between the gaze and the mouse pointer on the screen. Our method uses a volumetric representation of movement, the space-time densities, which allows us to calculate interaction levels between two physically different types of movement. We describe the method and compare the results with existing dynamic interaction methods from movement ecology. The sensitivity to method parameters is evaluated on simulated trajectories where we can control interaction levels. Second, we perform an experiment with eye and mouse tracking to generate real data with real levels of interaction, to apply and test our new methodology on a real case. Further, as our experiment task mimics route-tracing when using a map, it is more than a data collection exercise: it simultaneously allows us to investigate the actual connection between the eye and the mouse. We find that there seems to be a natural coupling when the eyes are not under conscious control, but that this coupling breaks down when participants are instructed to move them intentionally.
Based on these observations, we tentatively suggest that for natural tracing tasks, mouse tracking could potentially provide similar information to eye tracking and therefore be used as a proxy for attention. However, more research is needed to confirm this.

    Computation of Graphlet Orbits for Nodes and Edges in Sparse Graphs

    Graphlet analysis is a useful tool for describing local network topology around individual nodes or edges. A node or an edge can be described by a vector containing the counts of different kinds of graphlets (small induced subgraphs) in which it appears, or the "roles" (orbits) it has within these graphlets. We implemented an R package with functions for fast computation of such counts on sparse graphs. Instead of enumerating all induced graphlets, our algorithm is based on the derived relations between the counts, which decreases the time complexity by an order of magnitude in comparison with past approaches.
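
The idea of deriving orbit counts from relations rather than enumeration can be illustrated for the smallest case, graphlets on two and three nodes. The sketch below is in Python and is not the R package's API or the paper's algorithm for larger graphlets; the function name and example graph are assumptions. For a node v with degree d and t triangles, the number of induced paths with v in the middle is C(d, 2) - t, and the number with v at an end is the sum of (deg(u) - 1) over neighbors u, minus 2t, so only triangles need to be counted directly.

```python
from itertools import combinations

def node_orbits(edges):
    """Return {node: [orbit0, orbit1, orbit2, orbit3]} for an undirected simple graph.

    orbit0: edges at the node (its degree)
    orbit1: induced 3-node paths with the node at an end
    orbit2: induced 3-node paths with the node in the middle
    orbit3: triangles containing the node
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    orbits = {}
    for v, nbrs in adj.items():
        d = len(nbrs)
        # Triangles are the only pattern counted directly: adjacent pairs of neighbors.
        triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        # Every pair of neighbors forms either a triangle or a path centered at v.
        path_middle = d * (d - 1) // 2 - triangles
        # Each neighbor u offers deg(u) - 1 continuations; each triangle removes two of them.
        path_end = sum(len(adj[u]) - 1 for u in nbrs) - 2 * triangles
        orbits[v] = [d, path_end, path_middle, triangles]
    return orbits

# Invented example: a triangle 0-1-2 with a pendant node 3 attached to node 2.
example = node_orbits([(0, 1), (1, 2), (0, 2), (2, 3)])
for node in sorted(example):
    print(node, example[node])
```

Each node's list is the kind of descriptive vector the abstract refers to, restricted here to four orbits.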
