
    Flat-containing and shift-blocking sets in $F_2^r$

    For non-negative integers $r\ge d$, how small can a subset $C\subset F_2^r$ be, given that for any $v\in F_2^r$ there is a $d$-flat passing through $v$ and contained in $C\cup\{v\}$? Equivalently, how large can a subset $B\subset F_2^r$ be, given that for any $v\in F_2^r$ there is a linear $d$-subspace not blocked non-trivially by the translate $B+v$? A number of lower and upper bounds are obtained.
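
    To make the defining property concrete, here is a small brute-force check (an illustration, not taken from the paper), specialized to $d=2$, with the vectors of $F_2^r$ encoded as $r$-bit integers and XOR playing the role of vector addition:

```python
from itertools import combinations

def is_flat_containing(C, r):
    """Check the defining property for d = 2: for every v in F_2^r there
    must exist a 2-flat through v contained in C ∪ {v}.  Vectors are r-bit
    integers and XOR is vector addition; over F_2, any two distinct nonzero
    vectors u1, u2 are independent, so they always span a 2-subspace."""
    Cset = set(C)
    for v in range(2 ** r):
        if not any({v ^ u1, v ^ u2, v ^ u1 ^ u2} <= Cset
                   for u1, u2 in combinations(range(1, 2 ** r), 2)):
            return False
    return True

# Example: all nonzero vectors of F_2^3 form a flat-containing set for d = 2.
r = 3
print(is_flat_containing([x for x in range(2 ** r) if x], r))  # True
```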

    Centroidal bases in graphs

    We introduce the notion of a centroidal locating set of a graph $G$, that is, a set $L$ of vertices such that all vertices in $G$ are uniquely determined by their relative distances to the vertices of $L$. A centroidal locating set of $G$ of minimum size is called a centroidal basis, and its size is the centroidal dimension $CD(G)$. This notion, which is related to previous concepts, gives a new way of identifying the vertices of a graph. The centroidal dimension of a graph $G$ is lower- and upper-bounded by the metric dimension and twice the location-domination number of $G$, respectively. The latter two parameters are standard and well-studied notions in the field of graph identification. We show that for any graph $G$ with $n$ vertices and maximum degree at least 2, $(1+o(1))\frac{\ln n}{\ln\ln n}\leq CD(G)\leq n-1$. We discuss the tightness of these bounds and, in particular, characterize the set of graphs reaching the upper bound. We then show that for graphs in which every pair of vertices is connected via a bounded number of paths, $CD(G)=\Omega\left(\sqrt{|E(G)|}\right)$, the bound being tight for paths and cycles. We finally investigate the computational complexity of determining $CD(G)$ for an input graph $G$, showing that the problem is hard and cannot even be approximated efficiently up to a factor of $o(\log n)$. We also give an $O\left(\sqrt{n\ln n}\right)$-approximation algorithm.
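
    One plausible reading of "uniquely determined by their relative distances" is that each vertex is identified by the order type of its distance vector to $L$, i.e., by the pairwise comparisons of its distances to the landmarks. The sketch below (illustrative only, not the authors' code, and the paper's exact signature may differ) tests a candidate set under that reading, using networkx:

```python
import networkx as nx

def is_centroidal_locating(G, L):
    """Test whether every vertex of G gets a distinct signature, where the
    signature of v is the tuple of pairwise comparisons of its distances to
    the vertices of L (the assumed meaning of 'relative distances')."""
    L = list(L)
    dist = dict(nx.all_pairs_shortest_path_length(G))

    def signature(v):
        d = [dist[v][x] for x in L]
        return tuple((di > dj) - (di < dj)      # -1, 0, or +1 per pair
                     for i, di in enumerate(d) for dj in d[i + 1:])

    sigs = {signature(v) for v in G}
    return len(sigs) == G.number_of_nodes()

# On the path P4, {0, 3} fails (vertices 0 and 1 get the same order type),
# while {0, 1, 3} separates all four vertices.
G = nx.path_graph(4)
print(is_centroidal_locating(G, [0, 3]), is_centroidal_locating(G, [0, 1, 3]))
```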

    Rank Minimization over Finite Fields: Fundamental Limits and Coding-Theoretic Interpretations

    This paper establishes information-theoretic limits in estimating a finite field low-rank matrix given random linear measurements of it. These linear measurements are obtained by taking inner products of the low-rank matrix with random sensing matrices. Necessary and sufficient conditions on the number of measurements required are provided. It is shown that these conditions are sharp and that the minimum-rank decoder is asymptotically optimal. The reliability function of this decoder is also derived by appealing to de Caen's lower bound on the probability of a union. The sufficient condition also holds when the sensing matrices are sparse, a scenario that may be amenable to efficient decoding. More precisely, it is shown that if the $n\times n$ sensing matrices contain, on average, $\Omega(n\log n)$ entries, the number of measurements required is the same as when the sensing matrices are dense and contain entries drawn uniformly at random from the field. Analogies are drawn between the above results and rank-metric codes in the coding theory literature. In fact, we are also strongly motivated by understanding when minimum rank distance decoding of random rank-metric codes succeeds. To this end, we derive distance properties of equiprobable and sparse rank-metric codes. These distance properties provide a precise geometric interpretation of the fact that the sparse ensemble requires as few measurements as the dense one. Finally, we provide a non-exhaustive procedure to search for the unknown low-rank matrix.
    Comment: Accepted to the IEEE Transactions on Information Theory; presented at the IEEE International Symposium on Information Theory (ISIT) 201
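
    As a toy illustration of minimum-rank decoding (a brute-force sketch over $F_2$ with a tiny $2\times 2$ matrix, not the paper's procedure or parameters): measurements are inner products $y_i=\langle A_i, X\rangle$ over the field, and the decoder returns a minimum-rank matrix consistent with them.

```python
import random

N = 2                          # matrix side; the search covers 2**(N*N) candidates
MASK = (1 << N) - 1

def rows(M):
    """Split a flattened N*N-bit matrix into N row bitmasks."""
    return [(M >> (i * N)) & MASK for i in range(N)]

def gf2_rank(M):
    """Rank over F_2 via Gaussian elimination on row bitmasks."""
    basis, rank = [0] * N, 0
    for r in rows(M):
        for i in reversed(range(N)):
            if not (r >> i) & 1:
                continue
            if basis[i]:
                r ^= basis[i]          # eliminate the leading bit
            else:
                basis[i], rank = r, rank + 1
                break
    return rank

def measure(A, X):
    """Inner product <A, X> over F_2 on flattened matrices."""
    return bin(A & X).count("1") & 1

random.seed(0)
X = 0b0101                     # rows (01, 01): a rank-1 ground truth
A = [random.randrange(1 << N * N) for _ in range(5)]  # dense sensing matrices
y = [measure(a, X) for a in A]

# Minimum-rank decoding: among all matrices consistent with the measurements,
# return one of minimum rank (ties broken arbitrarily; with enough measurements
# the feasible set shrinks until the low-rank solution is recovered).
feasible = [M for M in range(1 << N * N)
            if all(measure(a, M) == yi for a, yi in zip(A, y))]
Xhat = min(feasible, key=gf2_rank)
print(gf2_rank(Xhat), Xhat == X)
```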

    ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks

    Hash codes are efficient data representations for coping with the ever-growing amounts of data. In this paper, we introduce a random forest semantic hashing scheme that embeds tiny convolutional neural networks (CNNs) into shallow random forests, with near-optimal information-theoretic code aggregation among trees. We start with a simple hashing scheme, where random trees in a forest act as hashing functions by setting '1' for the visited tree leaf, and '0' for the rest. We show that traditional random forests fail to generate hashes that preserve the underlying similarity between the trees, rendering the random forests approach to hashing challenging. To address this, we propose to first randomly group arriving classes at each tree split node into two groups, obtaining a significantly simplified two-class classification problem which can be handled by a light-weight CNN weak learner. Such a random class grouping scheme enables code uniqueness by enforcing each class to share its code with different classes in different trees. A non-conventional low-rank loss is further adopted for the CNN weak learners to encourage code consistency, minimizing intra-class variations and maximizing inter-class distance for the two random class groups. Finally, we introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. The proposed approach significantly outperforms state-of-the-art hashing methods for image retrieval tasks on large-scale public datasets, while performing at the level of other state-of-the-art image classification techniques and utilizing a more compact and efficient scalable representation. This work proposes a principled and robust procedure to train and deploy in parallel an ensemble of light-weight CNNs, instead of simply going deeper.
    Comment: Accepted to ECCV 201
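
    The baseline leaf-indicator scheme described in the abstract is easy to reproduce with an off-the-shelf random forest. The sketch below (illustrative only, using scikit-learn's standard trees rather than the paper's CNN-augmented ones) marks the visited leaf of each tree with '1' and concatenates the indicators across trees:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

X, y = load_digits(return_X_y=True)
forest = RandomForestClassifier(n_estimators=4, max_depth=4, random_state=0)
forest.fit(X, y)

def leaf_hash(forest, X):
    """Per tree, set '1' for the visited leaf and '0' for every other leaf,
    then concatenate the per-tree indicator vectors into one code."""
    leaves = forest.apply(X)                     # (n_samples, n_trees) leaf ids
    codes = []
    for t, est in enumerate(forest.estimators_):
        leaf_ids = np.flatnonzero(est.tree_.children_left == -1)  # leaf nodes
        codes.append((leaves[:, [t]] == leaf_ids).astype(np.uint8))
    return np.hstack(codes)

H = leaf_hash(forest, X[:5])
print(H.shape, H.sum(axis=1))    # exactly one '1' per tree in every code
```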