58,053 research outputs found

    A study on model selection of binary and non-Gaussian factor analysis.

    An, Yujia. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 71-76). Abstracts in English and Chinese.
    Contents:
    - Abstract
    - Acknowledgement
    - Chapter 1: Introduction
      - 1.1 Background
        - 1.1.1 Review on BFA
        - 1.1.2 Review on NFA
        - 1.1.3 Typical model selection criteria
        - 1.1.4 New model selection criterion and automatic model selection
      - 1.2 Our contributions
      - 1.3 Thesis outline
    - Chapter 2: Combination of B and BI architectures for BFA with automatic model selection
      - 2.1 Implementation of BFA using BYY harmony learning with automatic model selection
        - 2.1.1 Basic issues of BFA
        - 2.1.2 B-architecture for BFA with automatic model selection
        - 2.1.3 BI-architecture for BFA with automatic model selection
      - 2.2 Local minima in B-architecture and BI-architecture
        - 2.2.1 Local minima in B-architecture
        - 2.2.2 One unstable result in BI-architecture
      - 2.3 Combination of B- and BI-architecture for BFA with automatic model selection
        - 2.3.1 Combining B-architecture and BI-architecture
        - 2.3.2 Limitations of BI-architecture
      - 2.4 Experiments
        - 2.4.1 Frequency of local minima occurring in B-architecture
        - 2.4.2 Performance comparison of several methods in B-architecture
        - 2.4.3 Comparison of local minima in B-architecture and BI-architecture
        - 2.4.4 Frequency of unstable cases occurring in BI-architecture
        - 2.4.5 Comparison of the performance of three strategies
        - 2.4.6 Limitations of BI-architecture
      - 2.5 Summary
    - Chapter 3: A Comparative Investigation on Model Selection in Binary Factor Analysis
      - 3.1 Binary Factor Analysis and ML Learning
      - 3.2 Hidden Factor Number Determination
        - 3.2.1 Using Typical Model Selection Criteria
        - 3.2.2 Using BYY Harmony Learning
      - 3.3 Empirical Comparative Studies
        - 3.3.1 Effects of Sample Size
        - 3.3.2 Effects of Data Dimension
        - 3.3.3 Effects of Noise Variance
        - 3.3.4 Effects of Hidden Factor Number
        - 3.3.5 Computing Costs
      - 3.4 Summary
    - Chapter 4: A Comparative Investigation on Model Selection in Non-Gaussian Factor Analysis
      - 4.1 Non-Gaussian Factor Analysis and ML Learning
      - 4.2 Hidden Factor Determination
        - 4.2.1 Using Typical Model Selection Criteria
        - 4.2.2 Using BYY Harmony Learning
      - 4.3 Empirical Comparative Studies
        - 4.3.1 Effects of Sample Size on Model Selection Criteria
        - 4.3.2 Effects of Data Dimension on Model Selection Criteria
        - 4.3.3 Effects of Noise Variance on Model Selection Criteria
        - 4.3.4 Discussion on Computational Cost
      - 4.4 Summary
    - Chapter 5: Conclusions
    - Bibliography
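    Chapters 3 and 4 of the thesis hinge on choosing the number of hidden factors, comparing typical model selection criteria against BYY harmony learning. As a rough illustration of the "typical criteria" route only, the sketch below selects a factor count by BIC on synthetic data; it uses scikit-learn's Gaussian FactorAnalysis as a stand-in for the binary and non-Gaussian models the thesis actually studies, and all names and constants are illustrative assumptions.

        # Sketch: pick the hidden factor number by BIC over candidate models.
        # Gaussian FactorAnalysis stands in for the binary/non-Gaussian case.
        import numpy as np
        from sklearn.decomposition import FactorAnalysis

        rng = np.random.default_rng(0)
        n, d, true_k = 400, 10, 3
        X = (rng.normal(size=(n, true_k)) @ rng.normal(size=(true_k, d))
             + 0.3 * rng.normal(size=(n, d)))    # synthetic data, 3 hidden factors

        def bic(model, X, k):
            ll = model.score(X) * len(X)              # total log-likelihood
            n_params = X.shape[1] * k + X.shape[1]    # loadings + noise variances
            return -2.0 * ll + n_params * np.log(len(X))

        scores = {k: bic(FactorAnalysis(n_components=k).fit(X), X, k)
                  for k in range(1, 7)}
        print(min(scores, key=scores.get))            # smallest BIC; typically 3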

    Detecting Family Resemblance: Automated Genre Classification.

    This paper presents results in automated genre classification of digital documents in PDF format. It describes genre classification as an important ingredient in contextualising scientific data and in retrieving targeted material for improving research. The paper compares the roles of visual layout, stylistic features, and language model features in clustering documents, and presents results in retrieving five selected genres (Scientific Article, Thesis, Periodicals, Business Report, and Form) from a pool populated with documents of the nineteen most popular genres found in our experimental data set.
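    As a concrete, heavily simplified illustration of one of the three feature families compared above, the sketch below classifies genres from a few cheap stylistic cues. The feature choices, the toy corpus, and the logistic-regression classifier are assumptions made for illustration; the paper's system also draws on visual layout and language model features.

        # Sketch: genre classification from a handful of stylistic cues.
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        GENRES = ["Scientific Article", "Thesis", "Periodicals",
                  "Business Report", "Form"]

        def stylistic_features(text):
            """A few cheap stylistic cues; a real system would add layout
            and language model features."""
            words = text.split()
            n_words = max(len(words), 1)
            return np.array([
                len(text) / 1000.0,                          # document length
                sum(w.isupper() for w in words) / n_words,   # all-caps ratio
                sum(any(c.isdigit() for c in w) for w in words) / n_words,
                text.count(":") / n_words,                   # form-like punctuation
            ])

        # Toy stand-ins for a labeled corpus of PDFs converted to text.
        docs = ["Abstract. We present a method for...",
                "NAME: ________  DATE: ________",
                "Chapter 1 Introduction"]
        labels = [0, 4, 1]                                   # indices into GENRES

        X = np.stack([stylistic_features(doc) for doc in docs])
        clf = LogisticRegression(max_iter=1000).fit(X, labels)
        print(GENRES[clf.predict(X[:1])[0]])                 # likely "Scientific Article"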

    Anomaly Detection Based on Indicators Aggregation

    Automatic anomaly detection is a major issue in various areas. Beyond mere detection, identifying the source of the problem that produced the anomaly is also essential. This is particularly the case in aircraft engine health monitoring, where detecting early signs of failure (anomalies) and helping the engine owner to efficiently implement the appropriate maintenance operations (fixing the source of the anomaly) are of crucial importance to reduce the costs attached to unscheduled maintenance. This paper introduces a general methodology that aims at classifying monitoring signals into normal ones and several classes of abnormal ones. The main idea is to leverage expert knowledge by generating a very large number of binary indicators. Each indicator corresponds to a fully parametrized anomaly detector built from parametric anomaly scores designed by experts. A feature selection method is used to keep only the most discriminant indicators, which are used as inputs to a Naive Bayes classifier. This gives an interpretable classifier based on interpretable anomaly detectors whose parameters have been optimized indirectly by the selection process. The proposed methodology is evaluated on simulated data designed to reproduce some of the anomaly types observed in real-world engines.
    Comment: International Joint Conference on Neural Networks (IJCNN 2014), Beijing, China (2014). arXiv admin note: substantial text overlap with arXiv:1407.088
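    A minimal sketch of that pipeline, under assumed synthetic data: expert-style anomaly scores are thresholded at several levels to produce a large bank of binary indicators, a feature selection step keeps the most discriminant ones, and a Bernoulli Naive Bayes classifier takes them as inputs. The score generator, thresholds, and selection settings below are illustrative, not the paper's actual detectors.

        # Sketch: binary indicators from thresholded anomaly scores, feature
        # selection, then a Naive Bayes classifier, as in the methodology above.
        import numpy as np
        from sklearn.feature_selection import SelectKBest, mutual_info_classif
        from sklearn.naive_bayes import BernoulliNB
        from sklearn.pipeline import Pipeline

        rng = np.random.default_rng(0)

        # Stand-in for expert-designed parametric anomaly scores (one row per signal).
        n_signals, n_scores = 500, 20
        scores = rng.normal(size=(n_signals, n_scores))
        labels = rng.integers(0, 3, size=n_signals)  # 0 = normal, 1-2 = anomaly classes

        # Thresholding every score at several levels yields a large bank of
        # fully parametrized binary indicators.
        thresholds = np.quantile(scores, [0.90, 0.95, 0.99], axis=0)  # (3, n_scores)
        indicators = scores[:, None, :] > thresholds[None, :, :]      # (n, 3, n_scores)
        X = indicators.reshape(n_signals, -1).astype(int)

        # Keep only the most discriminant indicators, then classify.
        clf = Pipeline([
            ("select", SelectKBest(mutual_info_classif, k=15)),
            ("nb", BernoulliNB()),
        ])
        clf.fit(X, labels)
        print(clf.predict(X[:5]))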

    Learning Determinantal Point Processes

    Determinantal point processes (DPPs), which arise in random matrix theory and quantum physics, are natural models for subset selection problems where diversity is preferred. Among many remarkable properties, DPPs offer tractable algorithms for exact inference, including computing marginal probabilities and sampling; however, an important open question has been how to learn a DPP from labeled training data. In this paper we propose a natural feature-based parameterization of conditional DPPs and show how it leads to a convex and efficient learning formulation. We analyze the relationship between our model and binary Markov random fields with repulsive potentials, which are qualitatively similar but computationally intractable. Finally, we apply our approach to the task of extractive summarization, where the goal is to choose a small subset of sentences conveying the most important information from a set of documents. In this task there is a fundamental tradeoff between sentences that are highly relevant to the collection as a whole and sentences that are diverse and not repetitive. Our parameterization allows us to naturally balance these two characteristics. We evaluate our system on data from the DUC 2003/04 multi-document summarization task, achieving state-of-the-art results.
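    The tractable inference mentioned above is easy to make concrete for an L-ensemble DPP: the probability of a subset is a ratio of determinants, and per-item inclusion marginals come from the kernel K = L(L + I)^(-1). The quality-times-similarity kernel in the sketch below is a generic construction for illustration, not the paper's feature-based conditional parameterization.

        # Sketch: exact DPP inference for an L-ensemble.
        import numpy as np

        rng = np.random.default_rng(0)
        n, d = 6, 3
        F = rng.normal(size=(n, d))        # a feature vector per item (e.g., sentence)
        q = np.exp(rng.normal(size=n))     # per-item quality (relevance) weights
        L = (q[:, None] * q[None, :]) * (F @ F.T)   # quality x similarity kernel

        # Inclusion marginals: P(i in Y) = K_ii with K = L (L + I)^{-1}.
        K = L @ np.linalg.inv(L + np.eye(n))
        print("inclusion marginals:", np.round(np.diag(K), 3))

        # Exact subset probability: P(Y = S) = det(L_S) / det(L + I).
        S = [0, 2]
        p_S = np.linalg.det(L[np.ix_(S, S)]) / np.linalg.det(L + np.eye(n))
        print("P(Y = {0, 2}) =", round(p_S, 4))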