451 research outputs found

    Robust Model-free Variable Screening, Double-parallel Monte Carlo and Average Bayesian Information Criterion

    Big data analysis and high-dimensional data analysis are two popular and challenging topics in current statistical research. They bring many opportunities as well as many challenges. For big data, traditional methods are generally not efficient enough in either time or space. For high-dimensional data, most traditional methods cannot be implemented, let alone maintain desirable properties such as consistency. In this dissertation, three new strategies are proposed to address these issues. HZSIS is a robust model-free variable screening method that possesses the sure screening property under the ultrahigh-dimensional setting. It is based on the nonparanormal transformation and the Henze-Zirkler test. The numerical results indicate that, compared to existing methods, the proposed method is more robust to data generated from heavy-tailed distributions and/or complex models with interaction variables. Double Parallel Monte Carlo is a simple, practical and efficient MCMC algorithm for Bayesian analysis of big data. The proposed algorithm divides the big dataset into smaller subsets and provides a simple method to aggregate the subset posteriors to approximate the full-data posterior. To further speed up computation, the proposed algorithm employs the population stochastic approximation Monte Carlo (Pop-SAMC) algorithm, a parallel MCMC algorithm, to simulate from each subset posterior. Since the proposed algorithm involves two levels of parallelism, data parallel and simulation parallel, it is named “Double Parallel Monte Carlo”. The validity of the proposed algorithm is justified both mathematically and numerically. The Average Bayesian Information Criterion (ABIC) and its high-dimensional variant, the Average Extended Bayesian Information Criterion (AEBIC), lead to an innovative way to use posterior samples to conduct model selection. The consistency of this method is established for the high-dimensional generalized linear model under some sparsity and regularity conditions. The numerical results also indicate that, when the sample size is large enough, this method can accurately select the smallest true model with high probability.
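The data-parallel idea in the abstract above (split the data, sample each subset posterior, aggregate) can be illustrated with a toy sketch. Everything specific here is an assumption for illustration and not taken from the dissertation: the model is a normal mean with known `sigma` and a flat prior, exact sampling stands in for Pop-SAMC, the subset likelihood is raised to the power `k`, and aggregation recenters each subset's draws on the average of the subset means before pooling.

```python
import random
import statistics

def subset_posterior_draws(subset, sigma, k, n_draws, rng):
    """Draw from one subset posterior of a normal mean (flat prior, known sigma).

    Assumption: the subset likelihood is raised to the power k, so the subset
    posterior is N(mean(subset), sigma^2 / (k * len(subset))) and its spread
    matches that of the full-data posterior.
    """
    center = statistics.fmean(subset)
    scale = sigma / (k * len(subset)) ** 0.5
    return [rng.gauss(center, scale) for _ in range(n_draws)]

def aggregate(draws_per_subset):
    """Recenter each subset's draws on the average subset mean, then pool them."""
    subset_means = [statistics.fmean(d) for d in draws_per_subset]
    overall = statistics.fmean(subset_means)
    pooled = []
    for draws, m in zip(draws_per_subset, subset_means):
        pooled.extend(x - m + overall for x in draws)
    return pooled

def run_demo(n=10_000, k=10, sigma=2.0, n_draws=2_000, seed=0):
    """Simulate data, split it into k subsets, and aggregate the subset draws."""
    rng = random.Random(seed)
    data = [rng.gauss(1.5, sigma) for _ in range(n)]
    subsets = [data[j::k] for j in range(k)]  # k equal-sized subsets
    # Each iteration is independent, so this loop could run on k workers.
    draws = [subset_posterior_draws(s, sigma, k, n_draws, rng) for s in subsets]
    pooled = aggregate(draws)
    # Return the exact full-data posterior mean and sd for comparison.
    return statistics.fmean(data), sigma / n ** 0.5, pooled
```

In this conjugate setting the full-data posterior is known in closed form, so the pooled draws from `run_demo()` can be checked directly against the returned mean and standard deviation.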

    Tensor-based regression models and applications

    Honor roll of the Faculté des études supérieures et postdoctorales, 2017-2018. With the advancement of modern technologies, high-order tensors are widespread in a broad range of applications such as computational neuroscience, computer vision and signal processing. The primary reason that classical regression methods fail to handle high-order tensors appropriately is that such data contain multiway structural information that cannot be directly captured by conventional vector-based or matrix-based regression models, causing substantial information loss during the regression. Furthermore, the ultrahigh dimensionality of tensorial input produces a huge number of parameters, which breaks the theoretical guarantees of classical regression approaches. Additionally, classical regression models have been shown to be limited by difficulty of interpretation, sensitivity to noise and absence of uniqueness. To deal with these challenges, we investigate a novel class of regression models, called tensor-variate regression models, where the independent predictors and/or dependent responses take the form of high-order tensorial representations. We also apply them in numerous real-world applications to verify their efficiency and effectiveness. Concretely, we first introduce hierarchical Tucker tensor regression, a generalized linear tensor regression model that is able to handle tensor input of potentially much higher order. Then, we work on an online local Gaussian process for tensor-variate regression, an efficient nonlinear GP-based approach that can process large data sets sequentially in constant time. Next, we present a computationally efficient online tensor regression algorithm with general tensorial input and output, called incremental higher-order partial least squares, for the setting of infinite time-dependent tensor streams. Thereafter, we propose a super-fast sequential tensor regression framework for general tensor sequences, namely recursive higher-order partial least squares, which addresses the limited storage space and short processing time allowed by dynamic environments. Finally, we introduce kernel-based multiblock tensor partial least squares, a new generalized nonlinear framework that is capable of predicting a set of tensor blocks by merging a set of tensor blocks from different sources with boosted predictive power.
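To make the parameter explosion mentioned in the abstract above concrete, the following sketch compares the number of coefficients in an unstructured linear model on a tensor predictor with the number in a low-rank Tucker-structured coefficient tensor. This is plain Tucker counting used as an illustrative assumption; the thesis works with hierarchical Tucker, whose parameter count differs, and the function names are hypothetical.

```python
from math import prod

def full_param_count(shape):
    """Coefficients needed when the tensor predictor is simply vectorized."""
    return prod(shape)

def tucker_param_count(shape, ranks):
    """Coefficients in a Tucker-structured tensor: one core plus one factor per mode."""
    core = prod(ranks)
    factors = sum(d * r for d, r in zip(shape, ranks))
    return core + factors

# A 64 x 64 x 64 predictor with Tucker ranks (3, 3, 3):
print(full_param_count((64, 64, 64)))               # 262144
print(tucker_param_count((64, 64, 64), (3, 3, 3)))  # 603
```

The gap widens with each additional mode, since the unstructured count is a product of the mode sizes while the factor term grows only as a sum.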

    Aeronautical engineering: A continuing bibliography with indexes (supplement 256)

    This bibliography lists 426 reports, articles, and other documents introduced into the NASA scientific and technical information system in August 1990. Subject coverage includes: design, construction and testing of aircraft and aircraft engines; aircraft components, equipment and systems; ground support systems; and theoretical and applied aspects of aerodynamics and general fluid dynamics.

    Extending Structural Learning Paradigms for High-Dimensional Machine Learning and Analysis

    Structure-based machine-learning techniques are frequently used in extensions of supervised learning, such as active, semi-supervised, multi-modal, and multi-task learning. A common step in many successful methods is a structure-discovery process that is made possible through the addition of new information, which can be user feedback, unlabeled data, data from similar tasks, alternate views of the problem, etc. Learning paradigms developed in these fields have led to some extremely flexible, scalable, and successful multivariate analysis approaches. This success and flexibility offer opportunities to expand the use of machine learning paradigms to more complex analyses. In particular, while information is often readily available concerning complex problems, the relationships among the pieces of information rarely follow the simple labeled-example setup on which supervised learning is based. Even when it is possible to incorporate additional data in such forms, the result is often an explosion in the dimensionality of the input space, such that both sample complexity and computational complexity can limit real-world success. In this work, we review many of the latest structural learning approaches for dealing with sample complexity. We expand their use to generate new paradigms for combining some of these learning strategies to address more complex problem spaces. We survey extreme-scale data analysis problems where sample complexity is a much more limiting factor than computational complexity, and outline new structural-learning approaches for dealing jointly with both. We develop and demonstrate a method for dealing with sample complexity in complex systems that leads to a more scalable algorithm than other approaches to large-scale multivariate analysis. This new approach reflects the underlying problem structure more accurately by using interdependence to address sample complexity, rather than ignoring it for the sake of tractability.

    Analog Photonics Computing for Information Processing, Inference and Optimisation

    This review presents an overview of the current state-of-the-art in photonics computing, which leverages photons, photons coupled with matter, and optics-related technologies for effective and efficient computational purposes. It covers the history and development of photonics computing and modern analogue computing platforms and architectures, focusing on optimization tasks and neural network implementations. The authors examine special-purpose optimizers, mathematical descriptions of photonics optimizers, and their various interconnections. Disparate applications are discussed, including direct encoding, logistics, finance, phase retrieval, machine learning, neural networks, probabilistic graphical models, and image processing, among many others. The main directions of technological advancement and associated challenges in photonics computing are explored, along with an assessment of its efficiency. Finally, the paper discusses prospects and the field of optical quantum computing, providing insights into the potential applications of this technology. Comment: Invited submission by Journal of Advanced Quantum Technologies; accepted version 5/06/202

    Laboratory directed research and development. FY 1995 progress report
