Search CORE

1,602 research outputs found

Learning from networked examples

Author: Guo Zheng-Chu
Ramon Jan
Wang Yuyi
Publication venue
Publication date: 03/06/2017
Field of study

Many machine learning algorithms are based on the assumption that training examples are drawn independently. However, this assumption does not hold anymore when learning from a networked sample because two or more training examples may share some common objects, and hence share the features of these shared objects. We show that the classic approach of ignoring this problem potentially can have a harmful effect on the accuracy of statistics, and then consider alternatives. One of these is to only use independent examples, discarding other information. However, this is clearly suboptimal. We analyze sample error bounds in this networked setting, providing significantly improved results. An important component of our approach is formed by efficient sample weighting schemes, which leads to novel concentration inequalities

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Algorithms to Approximate Column-Sparse Packing Problems

Author: Brubach Brian
Sankararaman Karthik Abinav
Srinivasan Aravind
Xu Pan
Publication venue
Publication date: 05/08/2019
Field of study

Column-sparse packing problems arise in several contexts in both deterministic and stochastic discrete optimization. We present two unifying ideas, (non-uniform) attenuation and multiple-chance algorithms, to obtain improved approximation algorithms for some well-known families of such problems. As three main examples, we attain the integrality gap, up to lower-order terms, for known LP relaxations for k-column sparse packing integer programs (Bansal et al., Theory of Computing, 2012) and stochastic k-set packing (Bansal et al., Algorithmica, 2012), and go "half the remaining distance" to optimal for a major integrality-gap conjecture of Furedi, Kahn and Seymour on hypergraph matching (Combinatorica, 1993).Comment: Extended abstract appeared in SODA 2018. Full version in ACM Transactions of Algorithm

arXiv.org e-Print Archive

Crossref

An introduction to Graph Data Management

Author: A Dries
A Gutiérrez
A Iosup
A Morari
A Poulovassilis
AD Zhu
AO Mendelzon
B Amann
B Elser
C Berge
C Vicknair
C Watters
C Weiss
CS Chang
D Conte
D Dominguez-Sal
D Theodoratos
DC Faye
DW Shipman
EF Codd
FW Tompa
G Malewicz
GM Kuper
H He
HS Kunii
IF Cruz
IF Cruz
J Hidders
J Paredaens
J Peckham
J. Hidders
Jonathan Hayes
K Zeng
L Kowalik
L Zou
M Atre
M Ciglan
M Consens
M Gemis
M Gyssens
M Han
M Levene
M Levene
M Levene
M Mainguenaud
M Schmidt
M Yannakakis
MA Bornea
MA Rodriguez
MA Rodriguez
Marc Andries
MP Consens
MP Consens
N Kiesel
N Roussopoulos
O Erling
P Barceló Baeza
P Buneman
P Yuan
Philippe Cudré-Mauroux
PPS Chen
PT Wood
PT Wood
R Agrawal
R Angles
R Angles
R Brijder
R Ronen
RH Güting
RS Xin
S Abiteboul
S Abiteboul
T Neumann
W Fan
W Kim
Y Guo
Y Low
Y Papakonstantinou
Y Tian
Y Zhao
YA Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/12/2017
Field of study

A graph database is a database where the data structures for the schema and/or instances are modeled as a (labeled)(directed) graph or generalizations of it, and where querying is expressed by graph-oriented operations and type constructors. In this article we present the basic notions of graph databases, give an historical overview of its main development, and study the main current systems that implement them

arXiv.org e-Print Archive

Crossref

Nonnegative k-sums, fractional covers, and probability of small deviations

Author: Alon Noga
Huang Hao
Sudakov Benny
Publication venue
Publication date: 07/08/2011
Field of study

More than twenty years ago, Manickam, Mikl\'{o}s, and Singhi conjectured that for any integers

n, k

satisfying

n \geq 4k

, every set of

n

real numbers with nonnegative sum has at least

\binom{n-1}{k-1}

k

-element subsets whose sum is also nonnegative. In this paper we discuss the connection of this problem with matchings and fractional covers of hypergraphs, and with the question of estimating the probability that the sum of nonnegative independent random variables exceeds its expectation by a given amount. Using these connections together with some probabilistic techniques, we verify the conjecture for

n \geq 33k^2

. This substantially improves the best previously known exponential lower bound

n \geq e^{ck \log\log k}

. In addition we prove a tight stability result showing that for every

k

and all sufficiently large

n

, every set of

n

reals with a nonnegative sum that does not contain a member whose sum with any other

k-1

members is nonnegative, contains at least

\binom{n-1}{k-1}+\binom{n-k-1}{k-1}-1

subsets of cardinality

k

with nonnegative sum.Comment: 15 pages, a section of Hilton-Milner type result adde

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Asymmetric Lee Distance Codes for DNA-Based Storage

Author: Gabrys Ryan
Kiah Han Mao
Milenkovic Olgica
Publication venue
Publication date: 14/12/2016
Field of study

We consider a new family of codes, termed asymmetric Lee distance codes, that arise in the design and implementation of DNA-based storage systems and systems with parallel string transmission protocols. The codewords are defined over a quaternary alphabet, although the results carry over to other alphabet sizes; furthermore, symbol confusability is dictated by their underlying binary representation. Our contributions are two-fold. First, we demonstrate that the new distance represents a linear combination of the Lee and Hamming distance and derive upper bounds on the size of the codes under this metric based on linear programming techniques. Second, we propose a number of code constructions which imply lower bounds

arXiv.org e-Print Archive

CiteSeerX

On complexity of optimized crossover for binary representations

Author: Eremeev Anton
Publication venue
Publication date: 01/01/2006
Field of study

We consider the computational complexity of producing the best possible offspring in a crossover, given two solutions of the parents. The crossover operators are studied on the class of Boolean linear programming problems, where the Boolean vector of variables is used as the solution representation. By means of efficient reductions of the optimized gene transmitting crossover problems (OGTC) we show the polynomial solvability of the OGTC for the maximum weight set packing problem, the minimum weight set partition problem and for one of the versions of the simple plant location problem. We study a connection between the OGTC for linear Boolean programming problem and the maximum weight independent set problem on 2-colorable hypergraph and prove the NP-hardness of several special cases of the OGTC problem in Boolean linear programming.Comment: Dagstuhl Seminar 06061 "Theory of Evolutionary Algorithms", 200

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server