
    Optimal Renormalization Group Transformation from Information Theory

    Recently a novel real-space RG algorithm was introduced, identifying the relevant degrees of freedom of a system by maximizing an information-theoretic quantity, the real-space mutual information (RSMI), with machine learning methods. Motivated by this, we investigate the information-theoretic properties of coarse-graining procedures, for both translationally invariant and disordered systems. We prove that a perfect RSMI coarse-graining does not increase the range of interactions in the renormalized Hamiltonian, and, for disordered systems, suppresses generation of correlations in the renormalized disorder distribution, being in this sense optimal. We empirically verify the decay of those measures of complexity, as a function of the information retained by the RG, on the examples of arbitrary coarse-grainings of the clean and random Ising chain. The results establish a direct and quantifiable connection between properties of RG viewed as a compression scheme and those of physical objects, i.e. Hamiltonians and disorder distributions. We also study the effect of constraints on the number and type of coarse-grained degrees of freedom on a generic RG procedure.
    Comment: Updated manuscript with new results on disordered system

    Interactive Machine Learning with Applications in Health Informatics

    Recent years have witnessed unprecedented growth of health data, including millions of biomedical research publications, electronic health records, patient discussions on health forums and social media, fitness tracker trajectories, and genome sequences. Information retrieval and machine learning techniques are powerful tools to unlock invaluable knowledge in these data, yet they need to be guided by human experts. Unlike training machine learning models in other domains, labeling and analyzing health data requires highly specialized expertise, and the time of medical experts is extremely limited. How can we mine big health data with little expert effort? In this dissertation, I develop state-of-the-art interactive machine learning algorithms that bring together human intelligence and machine intelligence in health data mining tasks. By making efficient use of human experts' domain knowledge, we can achieve high-quality solutions with minimal manual effort. I first introduce a high-recall information retrieval framework that helps human users efficiently harvest not just one but as many relevant documents as possible from a searchable corpus. This is a common need in professional search scenarios such as medical search and literature review. Then I develop two interactive machine learning algorithms that leverage human experts' domain knowledge to combat the curse of "cold start" in active learning, with applications in clinical natural language processing. A consistent empirical observation is that the overall learning process can be reliably accelerated by a knowledge-driven "warm start", followed by machine-initiated active learning. As a theoretical contribution, I propose a general framework for interactive machine learning. Under this framework, a unified optimization objective explains many existing algorithms used in practice, and inspires the design of new algorithms.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/147518/1/raywang_1.pd