Constrained speaker linking
In this paper we study speaker linking (a.k.a. partitioning) given
constraints on the distribution of speaker identities over speech recordings.
Specifically, we show that the intractable partitioning problem becomes
tractable when the constraints pre-partition the data into smaller cliques with
non-overlapping speakers. The surprisingly common case where speakers in
telephone conversations are known, but the assignment of channels to identities
is unspecified, is treated in a Bayesian way. We show that for the Dutch CGN
database, where this channel assignment task is at hand, a lightweight speaker
recognition system can quite effectively solve the channel assignment problem,
with 93% of the cliques solved. We further show that the posterior distribution
over channel assignment configurations is well calibrated.
Comment: Submitted to Interspeech 2014, some typos fixed
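To make the Bayesian treatment of channel assignment concrete, here is a minimal sketch, not the paper's system. It assumes that some speaker recognition model has already produced a log-likelihood for each channel under each known speaker; the function name `assignment_posterior` and the `loglik` layout are hypothetical. Each configuration is a permutation mapping channels to speakers, and the posterior is the normalised product of prior and likelihood.

```python
import math
from itertools import permutations


def assignment_posterior(loglik, prior=None):
    """Posterior over channel-to-speaker assignments within one clique.

    loglik[c][s] is an assumed log-likelihood of channel c's audio under
    known speaker s. Returns a dict mapping each permutation cfg (where
    cfg[c] is the speaker index assigned to channel c) to its posterior.
    """
    n = len(loglik)
    configs = list(permutations(range(n)))
    if prior is None:
        # Assumption: a uniform prior over configurations.
        prior = {cfg: 1.0 / len(configs) for cfg in configs}
    log_joint = {
        cfg: math.log(prior[cfg]) + sum(loglik[c][cfg[c]] for c in range(n))
        for cfg in configs
    }
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_joint.values())
    z = sum(math.exp(v - m) for v in log_joint.values())
    return {cfg: math.exp(v - m) / z for cfg, v in log_joint.items()}


# Two channels, two known speakers; channel 0 matches speaker 0 better.
posterior = assignment_posterior([[0.0, -2.0], [-2.0, 0.0]])
best = max(posterior, key=posterior.get)
```

Enumerating permutations is only practical for small cliques such as two-channel telephone conversations, which is the case the abstract describes.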
On combinatorial optimisation in analysis of protein-protein interaction and protein folding networks
Protein-protein interaction networks and protein folding networks represent prominent research topics at the intersection of bioinformatics and network science. In this paper, we present a study of these networks from a combinatorial optimisation point of view. Using a combination of classical heuristics and stochastic optimisation techniques, we were able to identify several interesting combinatorial properties of biological networks of the COSIN project. We obtained optimal or near-optimal solutions to the maximum clique and chromatic number problems for these networks. We also explore patterns of both non-overlapping and overlapping cliques in these networks. Optimal or near-optimal solutions to the partitioning of these networks into non-overlapping cliques and to the maximum independent set problem were discovered. Maximal cliques are explored by enumerative techniques. Domination in these networks is briefly studied, too. Applications and extensions of our findings are discussed.
On a registration-based approach to sensor network localization
We consider a registration-based approach for localizing sensor networks from
range measurements. This is based on the assumption that one can find
overlapping cliques spanning the network. That is, for each sensor, one can
identify geometric neighbors for which all inter-sensor ranges are known. Such
cliques can be efficiently localized using multidimensional scaling. However,
since each clique is localized in some local coordinate system, we are required
to register them in a global coordinate system. In other words, our approach is
based on transforming the localization problem into a problem of registration.
In this context, the main contributions are as follows. First, we describe an
efficient method for partitioning the network into overlapping cliques. Second,
we study the problem of registering the localized cliques, and formulate a
necessary rigidity condition for uniquely recovering the global sensor
coordinates. In particular, we present a method for efficiently testing
rigidity, and a proposal for augmenting the partitioned network to enforce
rigidity. A recently proposed semidefinite relaxation of global registration is
used for registering the cliques. We present simulation results on random and
structured sensor networks to demonstrate that the proposed method compares
favourably with state-of-the-art methods in terms of run-time, accuracy, and
scalability.
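The registration step can be illustrated with a much simpler stand-in than the semidefinite relaxation the abstract mentions. The sketch below assumes a 2D setting, a single clique, known point correspondences, and no reflection; under those assumptions the least-squares rotation and translation have a closed form. The function name `register_2d` is hypothetical.

```python
import math


def register_2d(local, target):
    """Least-squares rigid alignment of 2D points (rotation + translation).

    local and target are equal-length lists of (x, y) pairs with matching
    order. Assumption: no reflection between the two frames.
    Returns (theta, (tx, ty)) such that R(theta) @ p + t approximates q.
    """
    n = len(local)
    cpx = sum(p[0] for p in local) / n
    cpy = sum(p[1] for p in local) / n
    cqx = sum(q[0] for q in target) / n
    cqy = sum(q[1] for q in target) / n
    s_dot = 0.0    # sum of dot products of centred pairs
    s_cross = 0.0  # sum of 2D cross products of centred pairs
    for (px, py), (qx, qy) in zip(local, target):
        px, py, qx, qy = px - cpx, py - cpy, qx - cqx, qy - cqy
        s_dot += px * qx + py * qy
        s_cross += px * qy - py * qx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated local centroid onto the target centroid.
    tx = cqx - (c * cpx - s * cpy)
    ty = cqy - (s * cpx + c * cpy)
    return theta, (tx, ty)


# A clique localized in its own frame, and the same sensors in a global
# frame that is rotated by 90 degrees and shifted by (2, 3).
theta, (tx, ty) = register_2d([(0, 0), (1, 0), (0, 1)],
                              [(2, 3), (2, 4), (1, 3)])
```

Registering many overlapping cliques jointly, as the abstract describes, is a harder global problem; this pairwise alignment only shows what a single local-to-global transform looks like.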
Partitioning networks into cliques: a randomized heuristic approach
In the context of community detection in social networks, the term community can be defined strictly: everybody within the community should know everybody else. We consider the corresponding community detection problem. We search for a partitioning of a network into the minimum number of non-overlapping cliques, such that the cliques cover all vertices. This problem is called the clique covering problem (CCP) and is one of the classical NP-hard problems. For CCP, we propose a randomized heuristic approach. To construct a high quality solution to CCP, we present an iterated greedy (IG) algorithm. IG can also be combined with a heuristic used to determine how far the algorithm is from the optimum in the worst case. Randomized local search (RLS) for maximum independent set was proposed to find such a bound. The experimental results of IG and the bounds obtained by RLS indicate that IG is a very suitable technique for solving CCP in real-world graphs. In addition, we summarize our basic rigorous results, which were developed for analysis of IG and understanding of its behavior on several relevant graph classes.
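The abstract does not spell out the IG algorithm, so the following is only a sketch of one common iterated greedy scheme, offered under that assumption. A first-fit greedy pass places each vertex into the first clique it is fully adjacent to; each iteration then reorders the vertices block by block, using the previous cliques in shuffled order, and reruns the greedy pass. Feeding the greedy pass such a block order cannot produce more cliques than before. The function names and the adjacency-dict input format are hypothetical.

```python
import random


def greedy_clique_partition(adj, order):
    """First-fit greedy: put each vertex in the first clique it fully joins."""
    cliques = []
    for v in order:
        for clique in cliques:
            if all(u in adj[v] for u in clique):
                clique.append(v)
                break
        else:
            # No existing clique accepts v, so it starts a new one.
            cliques.append([v])
    return cliques


def iterated_greedy(adj, iterations=100, seed=0):
    """Repeat greedy passes, reordering vertices by shuffled previous cliques."""
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)
    current = greedy_clique_partition(adj, order)
    best = current
    for _ in range(iterations):
        blocks = [list(c) for c in current]
        rng.shuffle(blocks)
        order = [v for block in blocks for v in block]
        current = greedy_clique_partition(adj, order)
        if len(current) < len(best):
            best = current
    return best


# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
partition = iterated_greedy(adj)
```

On this small graph a single greedy pass can get stuck with three cliques when vertices 2 and 3 are grouped first; the reordering step is what lets the heuristic escape to the two-triangle partition.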
Robust Group Linkage
We study the problem of group linkage: linking records that refer to entities
in the same group. Applications for group linkage include finding businesses in
the same chain, finding conference attendees from the same affiliation, finding
players from the same team, etc. Group linkage faces challenges not present for
traditional record linkage. First, although different members in the same group
can share some similar global values of an attribute, they represent different
entities so can also have distinct local values for the same or different
attributes, requiring a high tolerance for value diversity. Second, groups can
be huge (with tens of thousands of records), requiring high scalability even
after using good blocking strategies.
We present a two-stage algorithm: the first stage identifies cores containing
records that are very likely to belong to the same group, while being robust to
possible erroneous values; the second stage collects strong evidence from the
cores and leverages it for merging more records into the same group, while
being tolerant to differences in local values of an attribute. Experimental
results show the high effectiveness and efficiency of our algorithm on various
real-world data sets.
Every property is testable on a natural class of scale-free multigraphs
In this paper, we introduce a natural class of multigraphs called
hierarchical-scale-free (HSF) multigraphs, and consider constant-time
testability on the class. We show that a very wide subclass of HSF, namely that
in which the power-law exponent is greater than two, is hyperfinite.
Based on this result, an algorithm for a deterministic partitioning oracle can
be constructed. We conclude by showing that every property is constant-time
testable on the above subclass of HSF. This algorithm utilizes findings by
Newman and Sohler of STOC'11. However, their algorithm is based on the
bounded-degree model, while it is known that actual scale-free networks usually
include hubs, which have a very large degree. HSF is based on scale-free
properties and includes such hubs. This is the first universal result of
constant-time testability on the general graph model, and it has the potential
to be applicable to a very wide range of scale-free networks.
Comment: 13 pages, one figure. Difference from ver. 1: Definitions of HSF and
SF become more general. Typos were fixed