1,715 research outputs found

    Advances in quantum machine learning

    Get PDF
    Here we discuss advances in the field of quantum machine learning. The following document offers a hybrid discussion; both reviewing the field as it is currently, and suggesting directions for further research. We include both algorithms and experimental implementations in the discussion. The field's outlook is generally positive, showing significant promise. However, we believe there are appreciable hurdles to overcome before one can claim that it is a primary application of quantum computation.Comment: 38 pages, 17 Figure

    A survey of statistical network models

    Full text link
    Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics.Comment: 96 pages, 14 figures, 333 reference

    Graph matching with a dual-step EM algorithm

    Get PDF
    This paper describes a new approach to matching geometric structure in 2D point-sets. The novel feature is to unify the tasks of estimating transformation geometry and identifying point-correspondence matches. Unification is realized by constructing a mixture model over the bipartite graph representing the correspondence match and by affecting optimization using the EM algorithm. According to our EM framework, the probabilities of structural correspondence gate contributions to the expected likelihood function used to estimate maximum likelihood transformation parameters. These gating probabilities measure the consistency of the matched neighborhoods in the graphs. The recovery of transformational geometry and hard correspondence matches are interleaved and are realized by applying coupled update operations to the expected log-likelihood function. In this way, the two processes bootstrap one another. This provides a means of rejecting structural outliers. We evaluate the technique on two real-world problems. The first involves the matching of different perspective views of 3.5-inch floppy discs. The second example is furnished by the matching of a digital map against aerial images that are subject to severe barrel distortion due to a line-scan sampling process. We complement these experiments with a sensitivity study based on synthetic data

    Data based identification and prediction of nonlinear and complex dynamical systems

    Get PDF
    We thank Dr. R. Yang (formerly at ASU), Dr. R.-Q. Su (formerly at ASU), and Mr. Zhesi Shen for their contributions to a number of original papers on which this Review is partly based. This work was supported by ARO under Grant No. W911NF-14-1-0504. W.-X. Wang was also supported by NSFC under Grants No. 61573064 and No. 61074116, as well as by the Fundamental Research Funds for the Central Universities, Beijing Nova Programme.Peer reviewedPostprin

    Algorithmic Techniques in Gene Expression Processing. From Imputation to Visualization

    Get PDF
    The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable to produce massive amounts of biomedical data in a single experiment. As the amount of the data is rapidly growing there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially that obtained from gene expression microarray experiments. First, we will study the ways to improve the quality of microarray data by replacing (imputing) the missing data entries with the estimated values for these entries. Missing value imputation is a method which is commonly used to make the original incomplete data complete, thus making it easier to be analyzed with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Secondly, we studied the effect of missing value imputation on the downstream data analysis methods like clustering. We compared multiple recent imputation algorithms against 8 publicly available microarray data sets. It was observed that the missing value imputation indeed is a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but there were also needs for more advanced imputation methods, such as Bayesian Principal Component Algorithm (BPCA). Finally, we studied the visualization of biological network data. Biological interaction networks are examples of the outcome of multiple biological experiments such as using the gene microarray techniques. Such networks are typically very large and highly connected, thus there is a need for fast algorithms for producing visually pleasant layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force directed graph layout algorithm.Siirretty Doriast

    Fuzzy approach for Arabic character recognition

    Get PDF
    Pattern recognition/classification is increasingly drawing the attention of scientific research because of its important roll in automation and human-machine communication. Even though many models have been introduced to deal with classification, because of the inherited imprecision and ambiguity, these models did not tackle the problem in an efficient way. Traditional models deal only with statistical uncertainty (randomness) but not with the non-statistical uncertainty (vagueness). Fuzzy set theory allows us to better understand imprecision in both of its categories: vagueness and randomness. The incorporation of fuzzy set theory in existing algorithms helped in many cases to improve the performance and increase the efficiency of those algorithms. This thesis will explore fuzzy logic as it pertains to pattern recognition. In order to demonstrate fuzzy logic, the problem of recognizing the Arabic alphabet is discussed. In this problem moments and central moments were used as discriminating features. A fuzzy classifier was designed in a way that incorporated some statistical knowledge of the problem in hand. Performance of this classifier was compared to a Bayesian classifier and a neural network classifier. Performance, evaluation, and advantages and disadvantages of each classifier is reported and discussed

    Energy-based Self-attentive Learning of Abstractive Communities for Spoken Language Understanding

    Full text link
    Abstractive community detection is an important spoken language understanding task, whose goal is to group utterances in a conversation according to whether they can be jointly summarized by a common abstractive sentence. This paper provides a novel approach to this task. We first introduce a neural contextual utterance encoder featuring three types of self-attention mechanisms. We then train it using the siamese and triplet energy-based meta-architectures. Experiments on the AMI corpus show that our system outperforms multiple energy-based and non-energy based baselines from the state-of-the-art. Code and data are publicly available.Comment: Update baseline

    The Bayesian Spatial Bradley--Terry Model: Urban Deprivation Modeling in Tanzania

    Full text link
    Identifying the most deprived regions of any country or city is key if policy makers are to design successful interventions. However, locating areas with the greatest need is often surprisingly challenging in developing countries. Due to the logistical challenges of traditional household surveying, official statistics can be slow to be updated; estimates that exist can be coarse, a consequence of prohibitive costs and poor infrastructures; and mass urbanisation can render manually surveyed figures rapidly out-of-date. Comparative judgement models, such as the Bradley--Terry model, offer a promising solution. Leveraging local knowledge, elicited via comparisons of different areas' affluence, such models can both simplify logistics and circumvent biases inherent to house-hold surveys. Yet widespread adoption remains limited, due to the large amount of data existing approaches still require. We address this via development of a novel Bayesian Spatial Bradley--Terry model, which substantially decreases the amount of data comparisons required for effective inference. This model integrates a network representation of the city or country, along with assumptions of spatial smoothness that allow deprivation in one area to be informed by neighbouring areas. We demonstrate the practical effectiveness of this method, through a novel comparative judgement data set collected in Dar es Salaam, Tanzania.Comment: 23 pages, 7 figures, to be published in the journal of the Royal Statistical Society: Series

    Dynamic temporary blood facility location-allocation during and post-disaster periods

    Get PDF
    The key objective of this study is to develop a tool (hybridization or integration of different techniques) for locating the temporary blood banks during and post-disaster conditions that could serve the hospitals with minimum response time. We have used temporary blood centers, which must be located in such a way that it is able to serve the demand of hospitals in nearby region within a shorter duration. We are locating the temporary blood centres for which we are minimizing the maximum distance with hospitals. We have used Tabu search heuristic method to calculate the optimal number of temporary blood centres considering cost components. In addition, we employ Bayesian belief network to prioritize the factors for locating the temporary blood facilities. Workability of our model and methodology is illustrated using a case study including blood centres and hospitals surrounding Jamshedpur city. Our results shows that at-least 6 temporary blood facilities are required to satisfy the demand of blood during and post-disaster periods in Jamshedpur. The results also show that that past disaster conditions, response time and convenience for access are the most important factors for locating the temporary blood facilities during and post-disaster periods
    • …
    corecore