21 research outputs found

    Tensor-based regression models and applications

    Get PDF
    Tableau d’honneur de la Faculté des études supérieures et postdoctorales, 2017-2018.
    With the advancement of modern technologies, high-order tensors are quite widespread and abound in a broad range of applications such as computational neuroscience, computer vision, and signal processing. The primary reason that classical regression methods fail to handle high-order tensors appropriately is that such data contain multiway structural information which cannot be captured directly by conventional vector-based or matrix-based regression models, causing substantial information loss during the regression. Furthermore, the ultrahigh dimensionality of tensorial input produces a huge number of parameters, which breaks the theoretical guarantees of classical regression approaches. Additionally, classical regression models have also been shown to be limited in terms of difficulty of interpretation, sensitivity to noise, and absence of uniqueness. To deal with these challenges, we investigate a novel class of regression models, called tensor-variate regression models, where the independent predictors and/or dependent responses take the form of high-order tensorial representations. We also apply them in numerous real-world applications to verify their efficiency and effectiveness. Concretely, we first introduce hierarchical Tucker tensor regression, a generalized linear tensor regression model that is able to handle potentially much higher-order tensor input. Then, we develop an online local Gaussian process for tensor-variate regression, an efficient nonlinear GP-based approach that can process large data sets in constant time in a sequential way. Next, we present a computationally efficient online tensor regression algorithm with general tensorial input and output, called incremental higher-order partial least squares, for the setting of infinite time-dependent tensor streams. Thereafter, we propose a super-fast sequential tensor regression framework for general tensor sequences, namely recursive higher-order partial least squares, which addresses the limited storage space and fast processing time required by dynamic environments. Finally, we introduce kernel-based multiblock tensor partial least squares, a new generalized nonlinear framework that is capable of predicting a set of tensor blocks by merging a set of tensor blocks from different sources with boosted predictive power.
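    The simplest member of this model family is the bilinear (order-2) case, where a scalar response depends on a matrix predictor through a low-rank coefficient matrix. The following numpy-only sketch fits a rank-1 bilinear model by alternating least squares; all names and the synthetic data are my own illustration, not the thesis code:

```python
import numpy as np

def rank1_tensor_regression(X, y, n_iter=50, seed=0):
    """Fit y_i ~ u^T X_i v (a rank-1 bilinear model) by alternating least squares."""
    rng = np.random.default_rng(seed)
    n, p, q = X.shape
    v = rng.normal(size=q)
    u = np.zeros(p)
    for _ in range(n_iter):
        # With v fixed, each sample collapses to the vector X_i v; solve OLS for u.
        u, *_ = np.linalg.lstsq(X @ v, y, rcond=None)
        # With u fixed, the symmetric step updates v.
        v, *_ = np.linalg.lstsq(np.einsum('ipq,p->iq', X, u), y, rcond=None)
    return u, v

# Synthetic check: noiseless data generated from a true rank-1 coefficient matrix.
rng = np.random.default_rng(1)
u_true, v_true = rng.normal(size=5), rng.normal(size=4)
X = rng.normal(size=(200, 5, 4))
y = np.einsum('ipq,p,q->i', X, u_true, v_true)
u, v = rank1_tensor_regression(X, y)
rel_err = np.linalg.norm(np.einsum('ipq,p,q->i', X, u, v) - y) / np.linalg.norm(y)
print(rel_err)  # small once ALS has converged
```

    The point of the low-rank parameterization is visible in the parameter count: the rank-1 model estimates p + q numbers instead of the p * q required by vectorized linear regression, which is what restores the usual theoretical guarantees for high-order inputs.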

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Full text link
    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. Comment: 232 pages.
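    The tensor train (TT) format highlighted above factorizes a d-way array into a chain of 3-way cores. A minimal numpy sketch of the standard TT-SVD construction (sequential truncated SVDs of unfoldings; the function names are my own, illustrative only):

```python
import numpy as np

def tt_decompose(T, max_rank):
    """Tensor-train (TT-SVD) decomposition via sequential truncated SVDs of unfoldings."""
    dims, cores, r_prev = T.shape, [], 1
    M = T.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))   # core k: (r_{k-1}, n_k, r_k)
        M = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))             # last core closes the chain
    return cores

def tt_reconstruct(cores):
    """Contract the chain of 3-way cores back into a full array."""
    out = cores[0]
    for G in cores[1:]:
        out = np.tensordot(out, G, axes=(-1, 0))
    return out[0, ..., 0]   # drop the boundary ranks of size 1

# A 4x5x6 tensor built with exact TT-rank 2 is recovered exactly.
rng = np.random.default_rng(0)
G1 = rng.normal(size=(1, 4, 2))
G2 = rng.normal(size=(2, 5, 2))
G3 = rng.normal(size=(2, 6, 1))
T = tt_reconstruct([G1, G2, G3])
cores = tt_decompose(T, max_rank=2)
print(np.allclose(tt_reconstruct(cores), T))  # True
```

    The compression argument of the monograph shows up directly here: the full tensor stores 4*5*6 = 120 numbers, while the cores store only 8 + 20 + 12 = 40, and the gap widens exponentially with the order d at fixed ranks.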

    A Review of Kernel Methods for Feature Extraction in Nonlinear Process Monitoring

    Get PDF
    Kernel methods are a class of learning machines for the fast recognition of nonlinear patterns in any data set. In this paper, applications of kernel methods for feature extraction in industrial process monitoring are systematically reviewed. First, we describe the reasons for using kernel methods and contextualize them among other machine learning tools. Second, by reviewing a total of 230 papers, this work identifies 12 major issues surrounding the use of kernel methods for nonlinear feature extraction. Each issue is discussed in terms of why it is important and how it has been addressed over the years by many researchers. We also present a breakdown of the commonly used kernel functions, parameter selection routes, and case studies. Lastly, this review provides an outlook into the future of kernel-based process monitoring, which can hopefully instigate more advanced yet practical solutions in the process industries.
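    The workhorse behind most kernel feature extractors in this literature is kernel PCA: eigendecompose the centered Gram matrix instead of the covariance matrix. A minimal numpy sketch assuming an RBF kernel (my own illustration, not code from the review):

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=0.5):
    """Kernel PCA scores from the eigendecomposition of the double-centered RBF Gram matrix."""
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))  # RBF Gram matrix
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n                             # centering matrix
    w, V = np.linalg.eigh(J @ K @ J)                                # center in feature space
    idx = np.argsort(w)[::-1][:n_components]                        # leading eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))               # training-sample scores

# Two concentric rings: linearly inseparable in input space, yet the leading
# kernel PCs reflect the radius, the kind of nonlinear pattern the review targets.
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, size=200)
r = np.r_[np.ones(100), 3 * np.ones(100)]
X = np.c_[r * np.cos(t), r * np.sin(t)] + 0.05 * rng.normal(size=(200, 2))
scores = kernel_pca(X, n_components=2, gamma=0.5)
```

    The kernel width gamma is one of the parameter-selection issues the review catalogues; the value here is an arbitrary illustrative choice, not a recommendation.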

    Networkmetrics: Multivariate Big Data Analysis in the Context of the Internet

    Get PDF
    Multivariate problems are found in all areas of knowledge. In chemistry and related disciplines, the chemometric community developed in a joint effort to understand and solve problems mainly from a multivariate and exploratory perspective. This perspective is, indeed, of broader applicability, even in areas of knowledge far from chemistry. In this paper, we focus on the Internet: the net of devices that allows an interconnected world where all types of data can be shared and unprecedented communication services can be provided. Problems in the Internet, or in networking in general, are not very different from chemometric problems. Building on this parallelism, we review four classes of problems in networking: estimation, anomaly detection, optimization, and classification. We present an illustrative set of problems and show how a multivariate perspective may lead to significant improvements over state-of-the-art techniques. In the absence of a better name, we call the approach of treating these problems from that multivariate perspective networkmetrics. Networkmetric problems have their own specificities, mainly their typical Big Data nature and the presence of unstructured data. We argue that multivariate analysis is, indeed, useful for tackling these specificities.
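    Anomaly detection is a good example of the chemometrics-to-networking transfer: PCA residual monitoring (the SPE or Q-statistic) scores a sample by its distance from the subspace spanned by the leading principal components of normal traffic. A hypothetical numpy sketch, not taken from the paper:

```python
import numpy as np

def pca_spe(X_train, X_new, n_components=2):
    """SPE / Q-statistic anomaly score: squared distance of each new sample
    from the PCA subspace fitted on normal training data."""
    mu = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
    P = Vt[:n_components].T                       # loadings (d, k)
    Xc = X_new - mu
    R = Xc - Xc @ P @ P.T                         # residual after projection
    return np.sum(R**2, axis=1)                   # one score per sample

# Normal "traffic" lives in a 2-D subspace of 5 features; anomalies leave it.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 5))
X_train = rng.normal(size=(500, 2)) @ W + 0.01 * rng.normal(size=(500, 5))
normal = rng.normal(size=(10, 2)) @ W
anomaly = normal + rng.normal(size=(10, 5))       # off-subspace perturbation
spe_n = pca_spe(X_train, normal)
spe_a = pca_spe(X_train, anomaly)
print(spe_a.mean() > spe_n.mean())  # True: anomalies score higher
```

    In practice a control limit for the SPE would be calibrated from the training scores; the threshold-free comparison above is just to show the mechanism.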

    Cancer Subtyping Detection using Biomarker Discovery in Multi-Omics Tensor Datasets

    Get PDF
    This thesis begins with a thorough review of research trends from 2015 to 2022, examining the challenges and issues related to biomarker discovery in multi-omics datasets. The review covers areas of application, proposed methodologies, evaluation criteria used to assess performance, as well as limitations and drawbacks that require further investigation and improvement. This comprehensive overview serves to provide a deeper understanding of the current state of research in this field and the opportunities for future research. It will be particularly useful for those who are interested in this area of study and seeking to expand their knowledge. In the second part of this thesis, a novel methodology is proposed for the identification of significant biomarkers in a multi-omics colon cancer dataset. The integration of clinical features with biomarker discovery has the potential to facilitate the early identification of mortality risk and the development of personalized therapies for a range of diseases, including cancer and stroke. Recent advancements in "omics" technologies have opened up new avenues for researchers to identify disease biomarkers through system-level analysis. Machine learning methods, particularly those based on tensor decomposition techniques, have gained popularity due to the challenges associated with integrative analysis of multi-omics data owing to the complexity of biological systems. Despite extensive efforts towards discovering disease-associated biomolecules by analyzing data from various "omics" experiments, such as genomics, transcriptomics, and metabolomics, the poor integration of diverse forms of "omics" data has made the integrative analysis of multi-omics data a daunting task. Our research includes ANOVA simultaneous component analysis (ASCA) and Tucker3 modeling to analyze a multivariate dataset with an underlying experimental design.
    By comparing the spaces spanned by different model components, we showed how the two methods can be used for confirmatory analysis and provide complementary information. We demonstrated the novel use of ASCA to analyze the residuals of Tucker3 models in order to find the optimal one: increasing the model complexity to more factors removed the last remaining ASCA-detectable structure in the residuals. Bootstrap analysis of the core matrix values of the Tucker3 models was used to check whether additional triads of eigenvectors were needed to describe the remaining structure in the residuals. We also developed a simple, novel strategy for aligning Tucker3 bootstrap models with the Tucker3 model of the original data, so that the eigenvectors of the three modes, the order of the values in the core matrix, and their algebraic signs match the original Tucker3 model without the need for complicated bookkeeping strategies or rotational transformations. Additionally, to avoid an overparameterized Tucker3 model, we used the bootstrap method to determine 95% confidence intervals of the loadings and core values, and important variables for classification were identified by inspection of the loading confidence intervals. The experimental results obtained using the NIH colon cancer dataset demonstrate that our proposed methodology is effective in improving the performance of biomarker discovery in a multi-omics cancer dataset, and its successful application to cancer subtype classification provides a foundation for further investigation into its utility in other disease areas. Overall, our study highlights the potential of integrating multi-omics data with machine learning methods to gain deeper insights into the complex biological mechanisms underlying cancer and other diseases.
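    The ASCA building block used above can be sketched compactly: estimate a design factor's effect matrix (level means of the grand-mean-centered data) and summarize it with an SVD. This is a minimal, assumed one-factor illustration in numpy, not the thesis implementation:

```python
import numpy as np

def asca_effect(X, factor):
    """One ASCA step: the effect matrix of a single design factor and its
    dominant component (scores and loadings) from an SVD."""
    Xc = X - X.mean(axis=0)                       # remove the grand mean
    effect = np.zeros_like(Xc)
    for level in np.unique(factor):
        mask = factor == level
        effect[mask] = Xc[mask].mean(axis=0)      # rows replaced by level means
    U, s, Vt = np.linalg.svd(effect, full_matrices=False)
    return effect, U[:, 0] * s[0], Vt[0]          # effect matrix, scores, loadings

# Two treatment groups whose means differ along a known direction.
rng = np.random.default_rng(0)
factor = np.repeat([0, 1], 30)
shift = np.array([2.0, 0.0, 0.0, -1.0, 0.0])
X = rng.normal(size=(60, 5)) + factor[:, None] * shift
effect, scores, loadings = asca_effect(X, factor)
```

    The first loading vector should align with the true between-group direction, which is how ASCA points at the variables (here, biomarkers) driving a design factor; the residual analysis described in the thesis then applies the same machinery to what a Tucker3 model leaves unexplained.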

    SIS 2017. Statistics and Data Science: new challenges, new generations

    Get PDF
    The 2017 SIS Conference aims to highlight the crucial role of statistics in data science. In this new domain of ‘meaning’ extracted from data, the increasing amount of data produced and stored in databases has brought new challenges. These involve different fields: statistics, machine learning, information and computer science, optimization, and pattern recognition, which together contribute considerably to the analysis of ‘Big data’, open data, and relational and complex data, both structured and unstructured. The aim is to collect contributions from the different domains of statistics on high-dimensional data quality validation, sample extraction, dimensionality reduction, pattern selection, data modelling, hypothesis testing, and confirming conclusions drawn from the data.

    Common and Discriminative Subspace Kernel-Based Multiblock Tensor Partial Least Squares Regression

    No full text
    In this work, we introduce a new generalized nonlinear tensor regression framework called kernel-based multiblock tensor partial least squares (KMTPLS) for predicting a set of dependent tensor blocks from a set of independent tensor blocks through the extraction of a small number of common and discriminative latent components. By considering both common and discriminative features, KMTPLS effectively fuses the information from multiple tensorial data sources and unifies the single and multiblock tensor regression scenarios into one general model. Moreover, in contrast to multilinear models, KMTPLS successfully addresses the nonlinear dependencies between multiple response and predictor tensor blocks by combining kernel machines with joint Tucker decomposition, resulting in a significant performance gain in terms of predictability. An efficient learning algorithm for KMTPLS based on sequentially extracting common and discriminative latent vectors is also presented. Finally, to show the effectiveness and advantages of our approach, we test it on a real-life regression task in computer vision, i.e., reconstruction of human pose from multiview video sequences.
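    KMTPLS itself combines kernels with a joint Tucker decomposition; as a stand-in for just its kernel-machine ingredient, the sketch below runs a plain multi-output kernel ridge regression on flattened tensor blocks (my own illustrative code, not the KMTPLS algorithm):

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    """RBF Gram matrix between the rows of A and the rows of B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def kernel_ridge(X, Y, X_new, gamma=0.1, lam=1e-3):
    """Multi-output kernel ridge regression: alpha = (K + lam*I)^{-1} Y."""
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), Y)
    return rbf_kernel(X_new, X, gamma) @ alpha

# Tensor blocks enter the kernel after flattening, as in kernelized tensor methods.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3, 4)).reshape(300, -1)   # 300 flattened 3x4 blocks
Y = np.sin(X[:, :2])                                # a nonlinear two-output target
Y_hat = kernel_ridge(X, Y, X)
train_mse = np.mean((Y_hat - Y) ** 2)
print(train_mse)  # small: the kernel captures the nonlinearity
```

    This captures the nonlinear-dependency point of the abstract; what KMTPLS adds on top is the extraction of shared common and discriminative latent components across multiple such blocks, which a single kernel regression does not provide.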