543 research outputs found

    Graph-based techniques for compression and reconstruction of sparse sources

    Get PDF
    The main goal of this thesis is to develop lossless compression schemes for analog and binary sources. All the considered compression schemes have as common feature that the encoder can be represented by a graph, so they can be studied employing tools from modern coding theory. In particular, this thesis is focused on two compression problems: the group testing and the noiseless compressed sensing problems. Although both problems may seem unrelated, in the thesis they are shown to be very close. Furthermore, group testing has the same mathematical formulation as non-linear binary source compression schemes that use the OR operator. In this thesis, the similarities between these problems are exploited. The group testing problem is aimed at identifying the defective subjects of a population with as few tests as possible. Group testing schemes can be divided into two groups: adaptive and non-adaptive group testing schemes. The former schemes generate tests sequentially and exploit the partial decoding results to attempt to reduce the overall number of tests required to label all members of the population, whereas non-adaptive schemes perform all the test in parallel and attempt to label as many subjects as possible. Our contributions to the group testing problem are both theoretical and practical. We propose a novel adaptive scheme aimed to efficiently perform the testing process. Furthermore, we develop tools to predict the performance of both adaptive and non-adaptive schemes when the number of subjects to be tested is large. These tools allow to characterize the performance of adaptive and non-adaptive group testing schemes without simulating them. The goal of the noiseless compressed sensing problem is to retrieve a signal from its lineal projection version in a lower-dimensional space. This can be done only whenever the amount of null components of the original signal is large enough. Compressed sensing deals with the design of sampling schemes and reconstruction algorithms that manage to reconstruct the original signal vector with as few samples as possible. In this thesis we pose the compressed sensing problem within a probabilistic framework, as opposed to the classical compression sensing formulation. Recent results in the state of the art show that this approach is more efficient than the classical one. Our contributions to noiseless compressed sensing are both theoretical and practical. We deduce a necessary and sufficient matrix design condition to guarantee that the reconstruction is lossless. Regarding the design of practical schemes, we propose two novel reconstruction algorithms based on message passing over the sparse representation of the matrix, one of them with very low computational complexity.El objetivo principal de la tesis es el desarrollo de esquemas de compresión sin pérdidas para fuentes analógicas y binarias. Los esquemas analizados tienen en común la representación del compresor mediante un grafo; esto ha permitido emplear en su estudio las herramientas de codificación modernas. Más concretamente la tesis estudia dos problemas de compresión en particular: el diseño de experimentos de testeo comprimido de poblaciones (de sangre, de presencia de elementos contaminantes, secuenciado de ADN, etcétera) y el muestreo comprimido de señales reales en ausencia de ruido. A pesar de que a primera vista parezcan problemas totalmente diferentes, en la tesis mostramos que están muy relacionados. Adicionalmente, el problema de testeo comprimido de poblaciones tiene una formulación matemática idéntica a los códigos de compresión binarios no lineales basados en puertas OR. En la tesis se explotan las similitudes entre todos estos problemas. Existen dos aproximaciones al testeo de poblaciones: el testeo adaptativo y el no adaptativo. El primero realiza los test de forma secuencial y explota los resultados parciales de estos para intentar reducir el número total de test necesarios, mientras que el segundo hace todos los test en bloque e intenta extraer el máximo de datos posibles de los test. Nuestras contribuciones al problema de testeo comprimido han sido tanto teóricas como prácticas. Hemos propuesto un nuevo esquema adaptativo para realizar eficientemente el proceso de testeo. Además hemos desarrollado herramientas que permiten predecir el comportamiento tanto de los esquemas adaptativos como de los esquemas no adaptativos cuando el número de sujetos a testear es elevado. Estas herramientas permiten anticipar las prestaciones de los esquemas de testeo sin necesidad de simularlos. El objetivo del muestreo comprimido es recuperar una señal a partir de su proyección lineal en un espacio de menor dimensión. Esto sólo es posible si se asume que la señal original tiene muchas componentes que son cero. El problema versa sobre el diseño de matrices y algoritmos de reconstrucción que permitan implementar esquemas de muestreo y reconstrucción con un número mínimo de muestras. A diferencia de la formulación clásica de muestreo comprimido, en esta tesis se ha empleado un modelado probabilístico de la señal. Referencias recientes en la literatura demuestran que este enfoque permite conseguir esquemas de compresión y descompresión más eficientes. Nuestras contribuciones en el campo de muestreo comprimido de fuentes analógicas dispersas han sido también teóricas y prácticas. Por un lado, la deducción de la condición necesaria y suficiente que debe garantizar la matriz de muestreo para garantizar que se puede reconstruir unívocamente la secuencia de fuente. Por otro lado, hemos propuesto dos algoritmos, uno de ellos de baja complejidad computacional, que permiten reconstruir la señal original basados en paso de mensajes entre los nodos de la representación gráfica de la matriz de proyección.Postprint (published version

    READUP BUILDUP. Thync - instant α-readings

    Get PDF

    Regularity Properties and Pathologies of Position-Space Renormalization-Group Transformations

    Full text link
    We reconsider the conceptual foundations of the renormalization-group (RG) formalism, and prove some rigorous theorems on the regularity properties and possible pathologies of the RG map. Regarding regularity, we show that the RG map, defined on a suitable space of interactions (= formal Hamiltonians), is always single-valued and Lipschitz continuous on its domain of definition. This rules out a recently proposed scenario for the RG description of first-order phase transitions. On the pathological side, we make rigorous some arguments of Griffiths, Pearce and Israel, and prove in several cases that the renormalized measure is not a Gibbs measure for any reasonable interaction. This means that the RG map is ill-defined, and that the conventional RG description of first-order phase transitions is not universally valid. For decimation or Kadanoff transformations applied to the Ising model in dimension d3d \ge 3, these pathologies occur in a full neighborhood {β>β0,h<ϵ(β)}\{ \beta > \beta_0 ,\, |h| < \epsilon(\beta) \} of the low-temperature part of the first-order phase-transition surface. For block-averaging transformations applied to the Ising model in dimension d2d \ge 2, the pathologies occur at low temperatures for arbitrary magnetic-field strength. Pathologies may also occur in the critical region for Ising models in dimension d4d \ge 4. We discuss in detail the distinction between Gibbsian and non-Gibbsian measures, and give a rather complete catalogue of the known examples. Finally, we discuss the heuristic and numerical evidence on RG pathologies in the light of our rigorous theorems.Comment: 273 pages including 14 figures, Postscript, See also ftp.scri.fsu.edu:hep-lat/papers/9210/9210032.ps.

    Kernel methods in machine learning

    Full text link
    We review machine learning methods employing positive definite kernels. These methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of functions defined on the data domain, expanded in terms of a kernel. Working in linear spaces of function has the benefit of facilitating the construction and analysis of learning algorithms while at the same time allowing large classes of functions. The latter include nonlinear functions as well as functions defined on nonvectorial data. We cover a wide range of methods, ranging from binary classifiers to sophisticated methods for estimation with structured data.Comment: Published in at http://dx.doi.org/10.1214/009053607000000677 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    A precise bare simulation approach to the minimization of some distances. Foundations

    Full text link
    In information theory -- as well as in the adjacent fields of statistics, machine learning, artificial intelligence, signal processing and pattern recognition -- many flexibilizations of the omnipresent Kullback-Leibler information distance (relative entropy) and of the closely related Shannon entropy have become frequently used tools. To tackle corresponding constrained minimization (respectively maximization) problems by a newly developed dimension-free bare (pure) simulation method, is the main goal of this paper. Almost no assumptions (like convexity) on the set of constraints are needed, within our discrete setup of arbitrary dimension, and our method is precise (i.e., converges in the limit). As a side effect, we also derive an innovative way of constructing new useful distances/divergences. To illustrate the core of our approach, we present numerous examples. The potential for widespread applicability is indicated, too; in particular, we deliver many recent references for uses of the involved distances/divergences and entropies in various different research fields (which may also serve as an interdisciplinary interface)

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications

    Get PDF
    B

    Distributional Robust Batch Contextual Bandits

    Full text link
    Policy learning using historical observational data is an important problem that has found widespread applications. Examples include selecting offers, prices, advertisements to send to customers, as well as selecting which medication to prescribe to a patient. However, existing literature rests on the crucial assumption that the future environment where the learned policy will be deployed is the same as the past environment that has generated the data--an assumption that is often false or too coarse an approximation. In this paper, we lift this assumption and aim to learn a distributional robust policy with incomplete (bandit) observational data. We propose a novel learning algorithm that is able to learn a robust policy to adversarial perturbations and unknown covariate shifts. We first present a policy evaluation procedure in the ambiguous environment and then give a performance guarantee based on the theory of uniform convergence. Additionally, we also give a heuristic algorithm to solve the distributional robust policy learning problems efficiently.Comment: The short version has been accepted in ICML 202

    Machine Learning

    Get PDF
    Machine Learning can be defined in various ways related to a scientific domain concerned with the design and development of theoretical and implementation tools that allow building systems with some Human Like intelligent behavior. Machine learning addresses more specifically the ability to improve automatically through experience
    corecore