Improved Theoretical and Practical Guarantees for Chromatic Correlation Clustering
We study a natural generalization of the correlation clustering problem to graphs in which the pairwise relations between objects are categorical instead of binary. This problem was recently introduced by Bonchi et al. under the name of chromatic correlation clustering, and is motivated by many real-world applications in data mining and social networks, including community detection, link classification, and entity de-duplication. Our main contribution is a fast and easy-to-implement constant approximation framework for the problem, which builds on a novel reduction of the problem to that of correlation clustering. This result significantly advances the current state of knowledge for the problem, improving on a previous result that only guaranteed linear approximation in the input size. We complement the above result by developing a linear programming-based algorithm that achieves an improved approximation ratio of 4. Although this algorithm cannot be considered practical, it further extends our theoretical understanding of chromatic correlation clustering. We also present a fast heuristic algorithm that is motivated by real-life scenarios in which there is a ground-truth clustering that is obscured by noisy observations. We test our algorithms on both synthetic and real datasets, such as social network data. Our experiments reinforce the theoretical findings by demonstrating that our algorithms generally outperform previous approaches, both in terms of solution cost and reconstruction of an underlying ground-truth clustering.
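To make the objective concrete, the following is a minimal sketch of how the cost of a colored clustering could be counted. It assumes one common formulation of chromatic correlation clustering: a pair inside a cluster counts as a disagreement unless it is joined by an edge of the cluster's color, and a pair split across clusters counts as a disagreement if it is joined by any edge. The function name `chromatic_cost` and the data layout are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def chromatic_cost(nodes, edge_color, cluster_of, cluster_color):
    """Count disagreements of a colored clustering (assumed formulation).

    edge_color:    dict mapping frozenset({u, v}) -> color of that edge
    cluster_of:    dict mapping node -> cluster id
    cluster_color: dict mapping cluster id -> color assigned to the cluster
    """
    cost = 0
    for u, v in combinations(nodes, 2):
        color = edge_color.get(frozenset((u, v)))  # None when there is no edge
        if cluster_of[u] == cluster_of[v]:
            # Same cluster: the pair should carry an edge of the cluster's color.
            if color != cluster_color[cluster_of[u]]:
                cost += 1
        elif color is not None:
            # Different clusters: any edge between them is a disagreement.
            cost += 1
    return cost


# Hypothetical toy instance: four nodes, two edge colors.
nodes = ["a", "b", "c", "d"]
edge_color = {
    frozenset(("a", "b")): "red",
    frozenset(("b", "c")): "red",
    frozenset(("a", "c")): "blue",
    frozenset(("c", "d")): "blue",
}
grouped = {"a": 0, "b": 0, "c": 0, "d": 1}
grouped_cost = chromatic_cost(nodes, edge_color, grouped, {0: "red", 1: "blue"})

singletons = {"a": 0, "b": 1, "c": 2, "d": 3}
singleton_cost = chromatic_cost(
    nodes, edge_color, singletons, {0: "red", 1: "red", 2: "blue", 3: "blue"}
)
```

In this toy instance the red cluster {a, b, c} pays for the blue edge a-c and for the cut edge c-d, while putting every node in its own cluster pays once for each of the four edges.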
Overlapping and Robust Edge-Colored Clustering in Hypergraphs
A recent trend in data mining has explored (hyper)graph clustering algorithms
for data with categorical relationship types. Such algorithms have applications
in the analysis of social, co-authorship, and protein interaction networks, to
name a few. Many such applications naturally have some overlap between
clusters, a nuance which is missing from current combinatorial models.
Additionally, existing models lack a mechanism for handling noise in datasets.
We address these concerns by generalizing Edge-Colored Clustering, a recent
framework for categorical clustering of hypergraphs. Our generalizations allow
for a budgeted number of either (a) overlapping cluster assignments or (b) node
deletions. For each new model we present a greedy algorithm which approximately
minimizes an edge mistake objective, as well as bicriteria approximations where
the second approximation factor is on the budget. Additionally, we address the
parameterized complexity of each problem, providing FPT algorithms and hardness
results.
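As an illustration of the edge mistake objective, here is a minimal sketch for the basic (non-overlapping, no-deletion) Edge-Colored Clustering setting. Each node takes the color that appears most often among its incident hyperedges, and a hyperedge counts as a mistake when some node in it received a different color. This majority-vote rule is a standard baseline and is not claimed to be the greedy algorithm of the paper; the names `greedy_color_assignment` and `edge_mistakes` are hypothetical.

```python
from collections import Counter, defaultdict


def greedy_color_assignment(hyperedges):
    """Assign each node the most frequent color among its incident hyperedges.

    hyperedges: list of (nodes, color) pairs, where nodes is an iterable.
    Ties are broken by whichever color was seen first for that node.
    """
    votes = defaultdict(Counter)
    for nodes, color in hyperedges:
        for v in nodes:
            votes[v][color] += 1
    return {v: counts.most_common(1)[0][0] for v, counts in votes.items()}


def edge_mistakes(hyperedges, assignment):
    """Count hyperedges containing at least one node of a different color."""
    return sum(
        1
        for nodes, color in hyperedges
        if any(assignment[v] != color for v in nodes)
    )


# Hypothetical toy hypergraph with three colors.
hyperedges = [
    ((1, 2, 3), "red"),
    ((1, 2), "red"),
    ((3, 4), "blue"),
    ((3, 4, 5), "blue"),
    ((2, 4), "green"),
]
assignment = greedy_color_assignment(hyperedges)
mistakes = edge_mistakes(hyperedges, assignment)
```

Here nodes 1 and 2 end up red and nodes 3, 4, and 5 end up blue, so the red hyperedge containing node 3 and the green hyperedge are the two mistakes.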
On the External Validity of Average-Case Analyses of Graph Algorithms
The number one criticism of average-case analysis is that we do not actually
know the probability distribution of real-world inputs. Thus, analyzing an
algorithm on some random model has no implications for practical performance.
At its core, this criticism doubts the existence of external validity, i.e., it
assumes that algorithmic behavior on the somewhat simple and clean models does
not translate beyond the models to practical performance on real-world inputs. With
this paper, we provide a first step towards studying the question of external
validity systematically. To this end, we evaluate the performance of six graph
algorithms on a collection of 2745 sparse real-world networks depending on two
properties: heterogeneity (variance in the degree distribution) and
locality (tendency of edges to connect vertices that are already close). We
compare this with the performance on generated networks with varying locality
and heterogeneity. We find that the performance in the idealized setting of
network models translates surprisingly well to real-world networks. Moreover,
heterogeneity and locality appear to be the core properties impacting the
performance of many graph algorithms. Comment: 42 pages, 19 figures, preprint (full version
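The abstract describes heterogeneity as variance in the degree distribution. A minimal sketch of that quantity, assuming a simple undirected graph given as an edge list, is shown below. The paper may use a normalized or otherwise different measure, so `degree_variance` is only an illustrative stand-in.

```python
from collections import Counter


def degree_variance(num_nodes, edges):
    """Population variance of vertex degrees in a simple undirected graph.

    num_nodes: number of vertices, labeled 0 .. num_nodes - 1
    edges:     iterable of (u, v) pairs, each undirected edge listed once
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    degrees = [degree[v] for v in range(num_nodes)]  # isolated vertices get 0
    mean = sum(degrees) / num_nodes
    return sum((d - mean) ** 2 for d in degrees) / num_nodes


# A star is heterogeneous (one hub); a cycle is perfectly homogeneous.
star = [(0, 1), (0, 2), (0, 3)]
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
star_var = degree_variance(4, star)
cycle_var = degree_variance(4, cycle)
```

The star has degrees 3, 1, 1, 1 and therefore a positive variance, whereas every vertex of the cycle has degree 2 and the variance is zero.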
Digital Color Imaging
This paper surveys current technology and research in the area of digital
color imaging. In order to establish the background and lay down terminology,
fundamental concepts of color perception and measurement are first presented
using vector-space notation and terminology. Present-day color recording and
reproduction systems are reviewed along with the common mathematical models
used for representing these devices. Algorithms for processing color images for
display and communication are surveyed, and a forecast of research trends is
attempted. An extensive bibliography is provided.
Deep learning systems as complex networks
Thanks to the availability of large scale digital datasets and massive
amounts of computational power, deep learning algorithms can learn
representations of data by exploiting multiple levels of abstraction. These
machine learning methods have greatly improved the state-of-the-art in many
challenging cognitive tasks, such as visual object recognition, speech
processing, natural language understanding and automatic translation. In
particular, one class of deep learning models, known as deep belief networks,
can discover intricate statistical structure in large data sets in a completely
unsupervised fashion, by learning a generative model of the data using
Hebbian-like learning mechanisms. Although these self-organizing systems can be
conveniently formalized within the framework of statistical mechanics, their
internal functioning remains opaque, because their emergent dynamics cannot be
solved analytically. In this article we propose to study deep belief networks
using techniques commonly employed in the study of complex networks, in order
to gain some insights into the structural and functional properties of the
computational graph resulting from the learning process. Comment: 20 pages, 9 figures
A Tutorial on Clique Problems in Communications and Signal Processing
Since its first use by Euler on the problem of the seven bridges of
Königsberg, graph theory has shown excellent abilities in solving and
unveiling the properties of multiple discrete optimization problems. The study
of the structure of some integer programs reveals equivalence with graph theory
problems, making a large body of the literature readily available for solving
and characterizing the complexity of these problems. This tutorial presents a
framework for utilizing a particular graph theory problem, known as the clique
problem, for solving communications and signal processing problems. In
particular, the paper aims to illustrate the structural properties of integer
programs that can be formulated as clique problems through multiple examples in
communications and signal processing. To that end, the first part of the
tutorial provides various optimal and heuristic solutions for the maximum
clique, maximum weight clique, and k-clique problems. The tutorial further
illustrates the use of the clique formulation through numerous contemporary
examples in communications and signal processing, mainly in maximum access for
non-orthogonal multiple access networks, throughput maximization using index
and instantly decodable network coding, collision-free radio frequency
identification networks, and resource allocation in cloud-radio access
networks. Finally, the tutorial sheds light on the recent advances of such
applications, and provides technical insights on ways of dealing with mixed
discrete-continuous optimization problems.
NCUWM Talk Abstracts 2010
Dr. Bryna Kra, Northwestern University
“From Ramsey Theory to Dynamical Systems and Back”
Dr. Karen Vogtmann, Cornell University
“Ping-Pong in Outer Space”
Lindsay Baun, College of St. Benedict
Danica Belanus, University of North Dakota
Hayley Belli, University of Oregon
Tiffany Bradford, Saint Francis University
Kathryn Bryant, Northern Arizona University
Laura Buggy, College of St. Benedict
Katharina Carella, Ithaca College
Kathleen Carroll, Wheaton College
Elizabeth Collins-Wildman, Carleton College
Rebecca Dorff, Brigham Young University
Melisa Emory, University of Nebraska at Omaha
Avis Foster, George Mason University
Xiaojing Fu, Clarkson University
Jennifer Garbett, Kenyon College
Nicki Gaswick, University of Nebraska-Lincoln
Rita Gnizak, Fort Hays State University
Kailee Gray, University of South Dakota
Samantha Hilker, Sam Houston State University
Ruthi Hortsch, University of Michigan
Jennifer Iglesias, Harvey Mudd College
Laura Janssen, University of Nebraska-Lincoln
Laney Kuenzel, Stanford University
Ellen Le, Pomona College
Thu Le, University of the South
Shauna Leonard, Arkansas State University
Tova Lindberg, Bethany Lutheran College
Lisa Moats, Concordia College
Kaitlyn McConville, Westminster College
Jillian Neeley, Ithaca College
Marlene Ouayoro, George Mason University
Kelsey Quarton, Bradley University
Brooke Quisenberry, Hope College
Hannah Ross, Kenyon College
Karla Schommer, College of St. Benedict
Rebecca Scofield, University of Iowa
April Scudere, Westminster College
Natalie Sheils, Seattle University
Kaitlin Speer, Baylor University
Meredith Stevenson, Murray State University
Kiri Sunde, University of North Carolina
Kaylee Sutton, John Carroll University
Frances Tirado, University of Florida
Anna Tracy, University of the South
Kelsey Uherka, Morningside College
Danielle Wheeler, Coe College
Lindsay Willett, Grove City College
Heather Williamson, Rice University
Chengcheng Yang, Rice University
Jie Zeng, Michigan Technological University
The Maximum Clique Problem: Algorithms, Applications, and Implementations
Computationally hard problems are routinely encountered during the course of solving practical problems. This is commonly dealt with by settling for less than optimal solutions, through the use of heuristics or approximation algorithms. This dissertation examines the alternate possibility of solving such problems exactly, through a detailed study of one particular problem, the maximum clique problem. It discusses algorithms, implementations, and the application of maximum clique results to real-world problems. First, the theoretical roots of the algorithmic method employed are discussed. Then a practical approach is described, which separates out important algorithmic decisions so that the algorithm can be easily tuned for different types of input data. This general and modifiable approach is also meant as a tool for research so that different strategies can easily be tried for different situations. Next, a specific implementation is described. The program is tuned, by use of experiments, to work best for two different graph types, real-world biological data and a suite of synthetic graphs. A parallel implementation is then briefly discussed and tested. After considering implementation, an example of applying these clique-finding tools to a specific case of real-world biological data is presented. Results are analyzed using both statistical and biological metrics. Then the development of practical algorithms based on clique-finding tools is explored in greater detail. New algorithms are introduced and preliminary experiments are performed. Next, some relaxations of clique are discussed along with the possibility of developing new practical algorithms from these variations. Finally, conclusions and future research directions are given
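For readers unfamiliar with exact clique finding, the following is a minimal sketch of one textbook approach: Bron-Kerbosch enumeration with pivoting, plus a simple size bound that abandons branches unable to beat the best clique found so far. It is offered only as an illustration of solving the problem exactly. The dissertation's own algorithms and tuning decisions are not reproduced here, and the helper names `build_adjacency` and `max_clique` are hypothetical.

```python
def build_adjacency(edges):
    """Build a dict node -> set of neighbors from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def max_clique(adj):
    """Return one maximum clique of the graph as a set of nodes."""
    best = set()

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = set(r)
            return
        if len(r) + len(p) <= len(best):
            return  # bound: this branch cannot beat the best clique so far
        # Pivot on the vertex covering the most candidates to cut branching.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best


# Toy graph: a triangle {1, 2, 3} sharing vertex 3 with a 4-clique {3, 4, 5, 6}.
edges = [
    (1, 2), (1, 3), (2, 3),
    (3, 4), (3, 5), (3, 6),
    (4, 5), (4, 6), (5, 6),
]
clique = max_clique(build_adjacency(edges))
```

The worst-case running time is still exponential, which is consistent with the problem being computationally hard; the pivot and the bound only reduce the number of branches explored in practice.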