2,618 research outputs found

Flat-containing and shift-blocking sets in $F_2^r$

Authors: Aart Blokhuis, Vsevolod F. Lev
Publication date: 01/01/2013

For non-negative integers $r\ge d$, how small can a subset $C\subset F_2^r$ be, given that for any $v\in F_2^r$ there is a $d$-flat passing through $v$ and contained in $C\cup\{v\}$? Equivalently, how large can a subset $B\subset F_2^r$ be, given that for any $v\in F_2^r$ there is a linear $d$-subspace not blocked non-trivially by the translate $B+v$? A number of lower and upper bounds are obtained.
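The flat-containing condition can be checked by brute force for very small $r$. The following is a minimal sketch, not taken from the paper: vectors of $F_2^r$ are encoded as integers with XOR as addition, and the function names `has_flat_through` and `is_flat_containing` are hypothetical. It uses the fact that a $d$-flat through $v$ is $v+U$ for a linear $d$-subspace $U$, so the flat lies in $C\cup\{v\}$ exactly when $v+u\in C$ for every non-zero $u\in U$.

```python
def has_flat_through(v, C, r, d):
    """Return True if some d-flat through v is contained in C ∪ {v}.

    Vectors of F_2^r are integers in range(2**r); addition is XOR.
    """
    # Non-zero directions u for which v + u lands in C.
    allowed = {u for u in range(1, 1 << r) if (v ^ u) in C}

    def extend(span, dim):
        # span is a linear subspace (as a set) whose non-zero elements
        # are all allowed directions; try to grow it to dimension d.
        if dim == d:
            return True
        for u in allowed:
            if u in span:
                continue
            coset = {s ^ u for s in span}  # new elements added by u
            if coset <= allowed and extend(span | coset, dim + 1):
                return True
        return False

    return extend({0}, 0)


def is_flat_containing(C, r, d):
    """Check the condition from the abstract for every v in F_2^r."""
    C = set(C)
    return all(has_flat_through(v, C, r, d) for v in range(1 << r))
```

For example, in $F_2^2$ with $d=1$ any two-element set passes, while a one-element set fails at its own point. The search is exponential in $r$ and is only meant to make the definition concrete.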

arXiv.org e-Print Archive

Repository TU/e

Centroidal bases in graphs

Authors: Florent Foucaud, Ralf Klasing, Peter J. Slater
Publication venue: 'Wiley'
Publication date: 01/01/2014

We introduce the notion of a centroidal locating set of a graph $G$, that is, a set $L$ of vertices such that all vertices in $G$ are uniquely determined by their relative distances to the vertices of $L$. A centroidal locating set of $G$ of minimum size is called a centroidal basis, and its size is the centroidal dimension $CD(G)$. This notion, which is related to previous concepts, gives a new way of identifying the vertices of a graph. The centroidal dimension of a graph $G$ is lower- and upper-bounded by the metric dimension and twice the location-domination number of $G$, respectively. The latter two parameters are standard and well-studied notions in the field of graph identification. We show that for any graph $G$ with $n$ vertices and maximum degree at least 2, $(1+o(1))\frac{\ln n}{\ln\ln n}\leq CD(G) \leq n-1$. We discuss the tightness of these bounds and in particular, we characterize the set of graphs reaching the upper bound. We then show that for graphs in which every pair of vertices is connected via a bounded number of paths, $CD(G)=\Omega\left(\sqrt{|E(G)|}\right)$, the bound being tight for paths and cycles. We finally investigate the computational complexity of determining $CD(G)$ for an input graph $G$, showing that the problem is hard and cannot even be approximated efficiently up to a factor of $o(\log n)$. We also give an $O\left(\sqrt{n\ln n}\right)$-approximation algorithm.
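To make the definition concrete, here is a minimal sketch of a checker for candidate sets $L$. It rests on an assumption not spelled out in the abstract: "relative distances" is read as the pattern of pairwise comparisons (closer, equidistant, farther) between the landmarks of $L$ as seen from each vertex. The graph is assumed connected and undirected, given as an adjacency dictionary; `bfs_distances` and `is_centroidal_locating` are hypothetical names.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_centroidal_locating(adj, landmarks):
    """True if every vertex has a distinct comparison signature.

    Assumption: the signature of v records, for each pair (a, b) of
    landmarks, whether a is closer to v (-1), tied (0) or farther (+1).
    """
    landmarks = list(landmarks)
    dist = {l: bfs_distances(adj, l) for l in landmarks}
    seen = set()
    for v in adj:
        signature = tuple(
            (dist[a][v] > dist[b][v]) - (dist[a][v] < dist[b][v])
            for a, b in combinations(landmarks, 2)
        )
        if signature in seen:
            return False
        seen.add(signature)
    return True
```

On the path with three vertices the two endpoints form such a set, whereas on the path with four vertices the two endpoints do not, because the two vertices nearer the first endpoint yield the same comparison.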

arXiv.org e-Print Archive

CiteSeerX

Rank Minimization over Finite Fields: Fundamental Limits and Coding-Theoretic Interpretations

Authors: Laura Balzano, Stark C. Draper, Vincent Y. F. Tan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2011

This paper establishes information-theoretic limits in estimating a finite field low-rank matrix given random linear measurements of it. These linear measurements are obtained by taking inner products of the low-rank matrix with random sensing matrices. Necessary and sufficient conditions on the number of measurements required are provided. It is shown that these conditions are sharp and the minimum-rank decoder is asymptotically optimal. The reliability function of this decoder is also derived by appealing to de Caen's lower bound on the probability of a union. The sufficient condition also holds when the sensing matrices are sparse, a scenario that may be amenable to efficient decoding. More precisely, it is shown that if the $n\times n$ sensing matrices contain, on average, $\Omega(n\log n)$ entries, the number of measurements required is the same as that when the sensing matrices are dense and contain entries drawn uniformly at random from the field. Analogies are drawn between the above results and rank-metric codes in the coding theory literature. In fact, we are also strongly motivated by understanding when minimum rank distance decoding of random rank-metric codes succeeds. To this end, we derive distance properties of equiprobable and sparse rank-metric codes. These distance properties provide a precise geometric interpretation of the fact that the sparse ensemble requires as few measurements as the dense one. Finally, we provide a non-exhaustive procedure to search for the unknown low-rank matrix. Comment: Accepted to the IEEE Transactions on Information Theory; Presented at IEEE International Symposium on Information Theory (ISIT) 201
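The measurement model and the minimum-rank decoder can be illustrated on a toy scale. The sketch below is an assumption-laden illustration over $F_2$ only, and it is exhaustive, unlike the non-exhaustive search procedure the abstract mentions. An $n\times n$ binary matrix is packed into an integer bitmask (row $i$ occupies bits $in$ to $in+n-1$), and the names `gf2_rank`, `inner` and `min_rank_decode` are hypothetical.

```python
def gf2_rank(rows):
    """Rank over F_2 of a matrix whose rows are integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit serves as pivot column
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def inner(h, x):
    """Inner product <H, X> over F_2 for bitmask-encoded matrices."""
    return bin(h & x).count("1") & 1


def min_rank_decode(sensing, y, n):
    """Exhaustively find all minimum-rank matrices consistent with y.

    sensing: list of bitmask-encoded n x n sensing matrices
    y:       list of measurements, y[a] = <sensing[a], X> over F_2
    Returns (rank, list of candidate bitmasks).
    """
    mask = (1 << n) - 1
    best_rank, best = None, []
    for x in range(1 << (n * n)):
        if all(inner(h, x) == ya for h, ya in zip(sensing, y)):
            rank = gf2_rank((x >> (i * n)) & mask for i in range(n))
            if best_rank is None or rank < best_rank:
                best_rank, best = rank, [x]
            elif rank == best_rank:
                best.append(x)
    return best_rank, best
```

Decoding succeeds when the returned candidate list contains exactly the true matrix. The loop visits all $2^{n^2}$ matrices, so this is usable only for tiny $n$.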

arXiv.org e-Print Archive

CiteSeerX

ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks

Authors: A Criminisi, A Krause, AL Yuille, BK Sriperumbudur, G Huang, G Nemhauser, GA Watson, J Masci, J Shotton, JR Quinlan, K He, L Breiman, M Aharon, ME Hellman, Q Qiu, Y Bengio, Y Lecun
Publication date: 27/07/2018

Hash codes are efficient data representations for coping with the ever growing amounts of data. In this paper, we introduce a random forest semantic hashing scheme that embeds tiny convolutional neural networks (CNN) into shallow random forests, with near-optimal information-theoretic code aggregation among trees. We start with a simple hashing scheme, where random trees in a forest act as hashing functions by setting '1' for the visited tree leaf, and '0' for the rest. We show that traditional random forests fail to generate hashes that preserve the underlying similarity between the trees, rendering the random forests approach to hashing challenging. To address this, we propose to first randomly group arriving classes at each tree split node into two groups, obtaining a significantly simplified two-class classification problem, which can be handled using a light-weight CNN weak learner. Such a random class grouping scheme enables code uniqueness by enforcing each class to share its code with different classes in different trees. A non-conventional low-rank loss is further adopted for the CNN weak learners to encourage code consistency by minimizing intra-class variations and maximizing inter-class distance for the two random class groups. Finally, we introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. The proposed approach significantly outperforms state-of-the-art hashing methods for image retrieval tasks on large-scale public datasets, and performs at the level of other state-of-the-art image classification techniques while using a more compact, efficient, and scalable representation. This work proposes a principled and robust procedure to train and deploy in parallel an ensemble of light-weight CNNs, instead of simply going deeper. Comment: Accepted to ECCV 201
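The "simple hashing scheme" described at the start of the abstract can be sketched as follows. This is only an illustration of the one-hot leaf encoding, under the assumption that every tree has the same number of leaves and that the per-tree codes are concatenated; the CNN weak learners, class grouping and information-theoretic aggregation are not modeled. `leaf_code` and `forest_hash` are hypothetical names.

```python
def leaf_code(leaf_index, n_leaves):
    """One-hot code for a single tree: 1 at the visited leaf, 0 elsewhere."""
    if not 0 <= leaf_index < n_leaves:
        raise ValueError("leaf_index out of range")
    return [1 if i == leaf_index else 0 for i in range(n_leaves)]


def forest_hash(visited_leaves, n_leaves):
    """Concatenate the per-tree one-hot codes into a single bit list.

    visited_leaves[t] is the index of the leaf reached in tree t
    (assumed to be supplied by already-trained trees).
    """
    code = []
    for leaf_index in visited_leaves:
        code.extend(leaf_code(leaf_index, n_leaves))
    return code


def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))
```

Under this encoding, two samples that reach the same leaf in every tree get identical codes, and each tree in which they diverge contributes 2 to the Hamming distance.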

arXiv.org e-Print Archive