1,460 research outputs found
Combinatorial algorithm for counting small induced graphs and orbits
Graphlet analysis is an approach to network analysis that is particularly
popular in bioinformatics. We show how to set up a system of linear equations
that relate the orbit counts and can be used in an algorithm that is
significantly faster than the existing approaches based on direct enumeration
of graphlets. The algorithm requires the existence of a vertex with certain
properties; we show that such a vertex exists for graphlets of arbitrary size,
except for complete graphs and , which are treated separately. Empirical
analysis of running time agrees with the theoretical results
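As a small illustration of the kind of linear relation between orbit counts that this abstract refers to (a sketch only, not the authors' algorithm), consider the three-node graphlets. For a vertex x of degree d, every pair of its neighbours forms with x either a triangle (orbit 3 in the commonly used numbering) or a path centred on x (orbit 2), so C(d, 2) = o2 + o3. The function names and the toy graph below are hypothetical and chosen only for this example.

```python
from itertools import combinations
from math import comb


def adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_at(adj, x):
    """Orbit 3: number of triangles that contain x."""
    return sum(1 for u, v in combinations(sorted(adj[x]), 2) if v in adj[u])


def path_centres_via_relation(adj, x):
    """Orbit 2 from the relation C(deg(x), 2) = o2 + o3."""
    return comb(len(adj[x]), 2) - triangles_at(adj, x)


def path_centres_by_enumeration(adj, x):
    """Orbit 2 by directly enumerating induced paths centred on x."""
    return sum(1 for u, v in combinations(sorted(adj[x]), 2) if v not in adj[u])


# Toy graph (assumed data): a triangle 0-1-2 with a tail 0-3-4.
EDGES = [(0, 1), (0, 2), (1, 2), (0, 3), (3, 4)]
ADJ = adjacency(EDGES)

if __name__ == "__main__":
    for node in sorted(ADJ):
        print(node, triangles_at(ADJ, node), path_centres_via_relation(ADJ, node))
```

Once the triangle count is known, the path-centre count follows from the degree alone; relations of this form are what allow some orbit counts to be derived rather than enumerated.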
Attribute Interactions in Medical Data Analysis
There is much empirical evidence about the success of naive Bayesian classification (NBC) in medical applications of attribute-based machine learning. NBC assumes conditional independence between attributes. In classification, such classifiers sum up the pieces of class-related evidence from individual attributes, independently of other attributes. The performance, however, deteriorates significantly when the “interactions” between attributes become critical. We propose an approach to handling attribute interactions within the framework of “voting” classifiers, such as NBC. We propose an operational test for detecting interactions in learning data and a procedure that takes the detected interactions into account while learning. This approach induces a structuring of the domain of attributes; it may improve the classifier’s performance and may provide useful novel information for the domain expert when interpreting the results of learning. We report on its application in data analysis and model construction for the prediction of clinical outcome in hip arthroplasty
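The "voting" behaviour described above can be sketched as follows: each attribute contributes a log-likelihood term per class, and the terms are simply summed. This is a minimal naive Bayes sketch with Laplace smoothing, not the interaction test or learning procedure proposed in the paper; the function names and the two toy datasets are assumptions made for illustration. On the XOR-like data the class depends only on the combination of the two attributes, so the summed votes cannot separate the classes.

```python
from collections import Counter
from math import log


def fit(rows, labels, alpha=1.0):
    """Estimate smoothed per-attribute log-likelihoods and class log-priors."""
    classes = sorted(set(labels))
    class_n = Counter(labels)
    n_attr = len(rows[0])
    values = [sorted({r[i] for r in rows}) for i in range(n_attr)]
    counts = Counter()
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            counts[(i, v, c)] += 1

    def log_lik(i, v, c):
        # Laplace-smoothed estimate of P(attribute i = v | class c).
        return log((counts[(i, v, c)] + alpha) / (class_n[c] + alpha * len(values[i])))

    prior = {c: log(class_n[c] / len(labels)) for c in classes}
    return classes, prior, log_lik


def scores(model, row):
    """Sum the independent 'votes' (log-likelihoods) of each attribute per class."""
    classes, prior, log_lik = model
    return {c: prior[c] + sum(log_lik(i, v, c) for i, v in enumerate(row)) for c in classes}


ROWS = [(0, 0), (0, 1), (1, 0), (1, 1)]
ADDITIVE_LABELS = [0, 0, 1, 1]   # class equals the first attribute
XOR_LABELS = [0, 1, 1, 0]        # class depends only on the attribute interaction

if __name__ == "__main__":
    print(scores(fit(ROWS, ADDITIVE_LABELS), (1, 0)))
    print(scores(fit(ROWS, XOR_LABELS), (1, 0)))
```

In the first dataset the votes favour the correct class; in the second both classes receive the same total, which is the kind of failure that motivates handling interactions explicitly.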
Sequential Symbolic Regression with Genetic Programming
This chapter describes the Sequential Symbolic Regression (SSR) method, a new strategy for function approximation in symbolic regression. The SSR method is inspired by the sequential covering strategy from machine learning, but instead of sequentially reducing the size of the problem being solved, it sequentially transforms the original problem into potentially simpler problems. This transformation is performed according to the semantic distances between the desired and obtained outputs and a geometric semantic operator. The rationale behind SSR is that, after generating a suboptimal function f via symbolic regression, the output errors can be approximated by another function in a subsequent iteration. The method was tested on eight polynomial functions and compared with canonical genetic programming (GP) and geometric semantic genetic programming (SGP). Results showed that SSR significantly outperforms SGP and shows no statistically significant difference from GP. More importantly, they show the potential of the proposed strategy: an effective way of applying geometric semantic operators to combine different (partial) solutions, avoiding the exponential growth problem arising from the use of these operators
Seed selection for information cascade in multilayer networks
Information spreading is an interesting field in the domain of online social
media. In this work, we investigate how different seed selection
strategies affect the spreading processes simulated using the independent cascade
model on eighteen multilayer social networks. Fifteen networks are built from
user interaction data extracted from Facebook public pages, and the remaining
three are multilayer networks downloaded from a public repository (two of them
being Twitter networks). The results indicate that various state-of-the-art
seed selection strategies for single-layer networks, such as K-Shell or VoteRank,
do not perform as well on multilayer networks and are outperformed by Degree
Centrality
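The independent cascade model with degree-based seeding mentioned above can be sketched as follows. This is a simplified single-layer version under stated assumptions (one uniform activation probability p, a hypothetical toy graph, hypothetical function names); the multilayer setup and the parameters used in the paper are not reproduced here.

```python
import random


def degree_seeds(adj, k):
    """Pick the k nodes with the highest degree (ties broken by node id)."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))[:k]


def independent_cascade(adj, seeds, p, rng):
    """Run one independent cascade and return the set of activated nodes.

    Each newly activated node gets exactly one chance to activate each of
    its still-inactive neighbours, succeeding with probability p.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in sorted(adj[u]):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


# Toy single-layer graph (assumed data): a hub 0 with a short tail 3-4.
ADJ = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}

if __name__ == "__main__":
    rng = random.Random(42)
    seeds = degree_seeds(ADJ, 1)
    runs = [len(independent_cascade(ADJ, seeds, 0.3, rng)) for _ in range(1000)]
    print("seeds:", seeds, "mean coverage:", sum(runs) / len(runs))
```

Because the process is stochastic, strategies are usually compared by averaging coverage over many runs, as the usage lines do.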
Synthesis and Crystal Structure of Tetraethylammonium Di-μ-fluoro-bis[aquadifluoro-oxovanadate(IV)]
The title complex, [NEt4]2[V2O2F6(H2O)2], has been isolated
from an aqueous solution of VOF2 and [NEt4]F. The crystal structure
has been determined from three-dimensional counter X-ray
data. It crystallizes in the monoclinic space group P21/n with
a = 708.8(1), b = 1316.6(2), c = 1362.4(2) pm, β = 97.58(1)° and Z = 2.
Least-squares refinement of the structure based on 1523 observations
led to final discrepancy indices of R = 0.058 and Rw = 0.067.
The structure consists of discrete dinuclear anions
[V2O2F6(H2O)2]2- with a crystallographic centre of inversion. Dimeric
units are linked into chains by hydrogen bonds [O-H...F 259.1(5)
and 267.9(5) pm]. The geometry around vanadium is distorted octahedral
with V-F distances from 192.0(4) to 220.9(3) pm, V-O
159.4(4) pm, V-OH2 208.3(4) pm, V-V 332.8(1) pm, and V-F-V
106.0(2)°
Automatic detection of potentially illegal online sales of elephant ivory via data mining
In this work, we developed an automated system to detect potentially illegal elephant ivory items for sale on eBay. Two law enforcement experts, with specific knowledge of elephant ivory identification, manually classified items on sale in the Antiques section of eBay UK over an 8-week period. This set the “Gold Standard” that we aim to emulate using data mining. We achieved close to 93% accuracy with less data than the experts, as we relied entirely on metadata and did not employ item descriptions or associated images, thus demonstrating the potential and generality of our approach. The reported accuracy may be improved with the addition of text mining techniques for the analysis of the item description, and by applying image classification for the detection of Schreger lines, indicative of elephant ivory. However, any solution relying on images or text descriptions could not be employed on other illegal wildlife markets where pictures can be missing or misleading and text absent (e.g., Instagram). In our setting, we gave human experts all available information while using only minimal information for our analysis. Despite this, we achieved very high accuracy. This work is an important first step in speeding up the laborious, tedious and expensive task of expert discovery of illegal trade over the internet. It will also allow for faster reporting to law enforcement and better accountability. We hope this will also contribute to reducing poaching, by making this illegal trade harder and riskier for those involved
Quantifying gaze and mouse interactions on spatial visual interfaces with a new movement analytics methodology
This research was supported by the Royal Society International Exchange Programme (grant no. IE120643). Eye movements provide insights into what people pay attention to, and therefore are commonly included in a variety of human-computer interaction studies. Eye movement recording devices (eye trackers) produce gaze trajectories, that is, sequences of gaze locations on the screen. Despite recent technological developments that enabled more affordable hardware, gaze data are still costly and time-consuming to collect; therefore, some propose using mouse movements instead. These are easy to collect automatically and on a large scale. If and how these two movement types are linked, however, is less clear and highly debated. We address this problem in two ways. First, we introduce a new movement analytics methodology to quantify the level of dynamic interaction between the gaze and the mouse pointer on the screen. Our method uses a volumetric representation of movement, the space-time densities, which allows us to calculate interaction levels between two physically different types of movement. We describe the method and compare the results with existing dynamic interaction methods from movement ecology. The sensitivity to method parameters is evaluated on simulated trajectories where we can control interaction levels. Second, we perform an experiment with eye and mouse tracking to generate real data with real levels of interaction, to apply and test our new methodology on a real case. Further, as our experimental task mimics route-tracing when using a map, it is more than a data collection exercise and it simultaneously allows us to investigate the actual connection between the eye and the mouse. We find that there seems to be a natural coupling when the eyes are not under conscious control, but that this coupling breaks down when participants are instructed to move them intentionally.
Based on these observations, we tentatively suggest that for natural tracing tasks, mouse tracking could potentially provide similar information to eye tracking and therefore be used as a proxy for attention. However, more research is needed to confirm this.
Computation of Graphlet Orbits for Nodes and Edges in Sparse Graphs
Graphlet analysis is a useful tool for describing local network topology around individual nodes or edges. A node or an edge can be described by a vector containing the counts of different kinds of graphlets (small induced subgraphs) in which it appears, or the "roles" (orbits) it has within these graphlets. We implemented an R package with functions for fast computation of such counts on sparse graphs. Instead of enumerating all induced graphlets, our algorithm is based on the derived relations between the counts, which decreases the time complexity by an order of magnitude in comparison with past approaches