46 research outputs found

    Intelligent Sensor Networks

    Get PDF
    In the last decade, wireless or wired sensor networks have attracted much attention. However, most designs target general sensor network issues including protocol stack (routing, MAC, etc.) and security issues. This book focuses on the close integration of sensing, networking, and smart signal processing via machine learning. Based on their world-class research, the authors present the fundamentals of intelligent sensor networks. They cover sensing and sampling, distributed signal processing, and intelligent signal learning. In addition, they present cutting-edge research results from leading experts

    Exploiting Latent Features of Text and Graphs

    Get PDF
    As the size and scope of online data continues to grow, new machine learning techniques become necessary to best capitalize on the wealth of available information. However, the models that help convert data into knowledge require nontrivial processes to make sense of large collections of text and massive online graphs. In both scenarios, modern machine learning pipelines produce embeddings --- semantically rich vectors of latent features --- to convert human constructs for machine understanding. In this dissertation we focus on information available within biomedical science, including human-written abstracts of scientific papers, as well as machine-generated graphs of biomedical entity relationships. We present the Moliere system, and our method for identifying new discoveries through the use of natural language processing and graph mining algorithms. We propose heuristically-based ranking criteria to augment Moliere, and leverage this ranking to identify a new gene-treatment target for HIV-associated Neurodegenerative Disorders. We additionally focus on the latent features of graphs, and propose a new bipartite graph embedding technique. Using our graph embedding, we advance the state-of-the-art in hypergraph partitioning quality. Having newfound intuition of graph embeddings, we present Agatha, a deep-learning approach to hypothesis generation. This system learns a data-driven ranking criteria derived from the embeddings of our large proposed biomedical semantic graph. To produce human-readable results, we additionally propose CBAG, a technique for conditional biomedical abstract generation

    Joint Discourse-aware Concept Disambiguation and Clustering

    Get PDF
    This thesis addresses the tasks of concept disambiguation and clustering. Concept disambiguation is the task of linking common nouns and proper names in a text – henceforth called mentions – to their corresponding concepts in a predefined inventory. Concept clustering is the task of clustering mentions, so that all mentions in one cluster denote the same concept. In this thesis, we investigate concept disambiguation and clustering from a discourse perspective and propose a discourse-aware approach for joint concept disambiguation and clustering in the framework of Markov logic. The contributions of this thesis are fourfold: Joint Concept Disambiguation and Clustering. In previous approaches, concept disambiguation and concept clustering have been considered as two separate tasks (Schütze, 1998; Ji & Grishman, 2011). We analyze the relationship between concept disambiguation and concept clustering and argue that these two tasks can mutually support each other. We propose the – to our knowledge – first joint approach for concept disambiguation and clustering. Discourse-Aware Concept Disambiguation. One of the determining factors for concept disambiguation and clustering is the context definition. Most previous approaches use the same context definition for all mentions (Milne & Witten, 2008b; Kulkarni et al., 2009; Ratinov et al., 2011, inter alia). We approach the question which context is relevant to disambiguate a mention from a discourse perspective and state that different mentions require different notions of contexts. We state that the context that is relevant to disambiguate a mention depends on its embedding into discourse. However, how a mention is embedded into discourse depends on its denoted concept. Hence, the identification of the denoted concept and the relevant concept mutually depend on each other. We propose a binwise approach with three different context definitions and model the selection of the context definition and the disambiguation jointly. Modeling Interdependencies with Markov Logic. To model the interdependencies between concept disambiguation and concept clustering as well as the interdependencies between the context definition and the disambiguation, we use Markov logic (Domingos & Lowd, 2009). Markov logic combines first order logic with probabilities and allows us to concisely formalize these interdependencies. We investigate how we can balance between linguistic appropriateness and time efficiency and propose a hybrid approach that combines joint inference with aggregation techniques. Concept Disambiguation and Clustering beyond English: Multi- and Cross-linguality. Given the vast amount of texts written in different languages, the capability to extend an approach to cope with other languages than English is essential. We thus analyze how our approach copes with other languages than English and show that our approach largely scales across languages, even without retraining. Our approach is evaluated on multiple data sets originating from different sources (e.g. news, web) and across multiple languages. As an inventory, we use Wikipedia. We compare our approach to other approaches and show that it achieves state-of-the-art results. Furthermore, we show that joint concept disambiguating and clustering as well as joint context selection and disambiguation leads to significant improvements ceteris paribus

    Fundamentals

    Get PDF
    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters

    Fundamentals

    Get PDF
    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters

    Advances in knowledge discovery and data mining Part II

    Get PDF
    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II</p

    High-Quality Hypergraph Partitioning

    Get PDF
    This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer kk, partition its vertex set into kk disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric. Since the problem is computationally intractable, heuristics are used in practice - the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, refinement algorithms try to improve the current solution. With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) nn levels, where nn is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the nn-level paradigm, and developing lazy-evaluation techniques, caching mechanisms and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms. Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve kk-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine kk-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space. All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, already without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and equally fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa - both in terms of solution quality and running time

    The 1992 Goddard Conference on Space Applications of Artificial Intelligence

    Get PDF
    The purpose of this conference is to provide a forum in which current research and development directed at space applications of artificial intelligence can be presented and discussed. The papers fall into the following areas: planning and scheduling, control, fault monitoring/diagnosis and recovery, information management, tools, neural networks, and miscellaneous applications
    corecore