106 research outputs found
Non-linear ICA based on Cramer-Wold metric
Non-linear source separation is a challenging open problem with many
applications. We extend a recently proposed Adversarial Non-linear ICA (ANICA)
model, and introduce Cramer-Wold ICA (CW-ICA). In contrast to ANICA we use a
simple, closed-form optimization target instead of a discriminator-based
independence measure. Our results show that CW-ICA achieves comparable results
to ANICA, while foregoing the need for adversarial training.
Classification and Separation of Audio and Music Signals
This chapter addresses the classification and separation of audio and music signals, an important and challenging research area. Classifying a stream of sounds matters when building two separate libraries, a speech library and a music library. Separation, in turn, is sometimes needed in a cocktail-party problem to separate speech from music and remove the undesired one. The chapter presents and discusses existing algorithms for both classification and separation. The classification algorithms are divided into three categories: the first covers most of the time-domain approaches, the second most of the frequency-domain approaches, and the third some approaches based on time-frequency distributions. The time-domain approaches discussed are the short-time energy (STE), the zero-crossing rate (ZCR), a modified version of the ZCR and the STE with positive derivative, neural networks, and the roll-off variance. The frequency-domain approaches are the spectral roll-off, the spectral centroid and its variance, the spectral flux and its variance, the cepstral residual, and the delta pitch. The time-frequency approaches have not yet been tested thoroughly for classifying and separating audio and music signals; therefore, the spectrogram and the evolutionary spectrum are introduced and discussed. In addition, some algorithms for the separation and segregation of music and audio signals are introduced, such as independent component analysis, pitch cancellation, and artificial neural networks.
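As a rough illustration of three of the features named in this abstract (STE, ZCR, and the spectral centroid), the sketch below computes them per frame with NumPy. The function names, the framing scheme, and the choice to treat exact zeros as positive are assumptions made for this example; the chapter's own definitions may differ.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D array into overlapping frames; trailing samples are dropped."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # assumption: count exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def spectral_centroid(frames, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of each frame."""
    mag = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    return (mag @ freqs) / np.maximum(mag.sum(axis=1), 1e-12)
```

A pure tone should give a low ZCR and a centroid near its frequency, while white noise should give a ZCR near 0.5, which is the kind of contrast a speech/music classifier could exploit.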
On-line quality control in polymer processing using hyperspectral imaging
The use of plastic composite materials has been increasing in recent years in order to reduce the amount of material used and/or to use cheaper raw materials, without compromising the properties of the product. The impressive adaptability of these composite materials comes from the fact that the manufacturer can choose the raw materials, the proportion in which they are blended, and the processing conditions. However, these materials tend to suffer from heterogeneous compositions and structures, which lead to mechanical weaknesses. Product quality is generally measured in the laboratory, using destructive tests that often require extensive sample preparation. On-line quality control would allow near-immediate feedback on the operating conditions and could be used directly for quality control in an industrial production context. The proposed research consists of developing an on-line quality control tool adaptable to plastic materials of all types. A number of near-infrared and ultrasound probes presently exist for on-line composition estimation, but they provide only single-point values at each acquisition. These methods are therefore poorly suited to identifying the spatial distribution of a sample's surface characteristics (e.g. homogeneity, orientation, dispersion). In order to achieve this objective, a hyperspectral imaging system is proposed. Using this tool, it is possible to scan the surface of a sample and obtain a hyperspectral image, that is to say an image in which each pixel captures the light intensity at hundreds of wavelengths. Chemometric methods can then be applied to this image in order to extract the relevant spatial and spectral features. Finally, multivariate regression methods are used to build a model between these features and the properties of the sample. This mathematical model forms the backbone of an on-line quality assessment tool used to predict and optimize the operating conditions under which the samples are processed.
Advances in independent component analysis and nonnegative matrix factorization
A fundamental problem in machine learning research, as well as in many other disciplines, is finding a suitable representation of multivariate data, i.e. random vectors. For reasons of computational and conceptual simplicity, the representation is often sought as a linear transformation of the original data. In other words, each component of the representation is a linear combination of the original variables. Well-known linear transformation methods include principal component analysis (PCA), factor analysis, and projection pursuit. In this thesis, we consider two popular and widely used techniques: independent component analysis (ICA) and nonnegative matrix factorization (NMF).
ICA is a statistical method in which the goal is to find a linear representation of nongaussian data so that the components are statistically independent, or as independent as possible. Such a representation seems to capture the essential structure of the data in many applications, including feature extraction and signal separation. Starting from ICA, several methods of estimating the latent structure in different problem settings are derived and presented in this thesis. FastICA, one of the most efficient and popular ICA algorithms, is reviewed and discussed, and its local and global convergence and statistical behavior are studied further. A nonnegative FastICA algorithm is also given in this thesis.
Nonnegative matrix factorization is a recently developed technique for finding parts-based, linear representations of nonnegative data. It is a method for dimensionality reduction that respects the nonnegativity of the input data while constructing a low-dimensional approximation. The nonnegativity constraints make the representation purely additive (allowing no subtractions), in contrast to many other linear representations such as principal component analysis and independent component analysis. A literature survey of nonnegative matrix factorization is given in this thesis, and a novel method called Projective Nonnegative Matrix Factorization (P-NMF) and its applications are provided.
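To make the "purely additive" representation concrete, here is a minimal sketch of standard NMF using the well-known multiplicative updates for the squared Frobenius error. This is not the thesis's P-NMF method; the function name `nmf`, the random initialization, and the small constant `eps` are assumptions for this illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix V (n x m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative, because each
        # step multiplies by a ratio of nonnegative quantities.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Because the updates only rescale entries by nonnegative factors, no subtraction ever enters the factors, which is the property the abstract contrasts with PCA and ICA.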
Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows
Sliced-Wasserstein Flow (SWF) is a promising approach to nonparametric
generative modeling but has not been widely adopted due to its suboptimal
generative quality and lack of conditional modeling capabilities. In this work,
we make two major contributions to bridging this gap. First, based on a
pleasant observation that (under certain conditions) the SWF of joint
distributions coincides with that of conditional distributions, we propose
Conditional Sliced-Wasserstein Flow (CSWF), a simple yet effective extension of
SWF that enables nonparametric conditional modeling. Second, we introduce
appropriate inductive biases of images into SWF with two techniques inspired by
local connectivity and multiscale representation in vision research, which
greatly improve the efficiency and quality of modeling images. With all the
improvements, we achieve generative performance comparable with many deep
parametric generative models on both conditional and unconditional tasks in a
purely nonparametric fashion, demonstrating its great potential. Comment: ICML 202
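For readers unfamiliar with the quantity underlying SWF, the sketch below gives a Monte Carlo estimate of the squared sliced 2-Wasserstein distance between two equally sized point clouds. It illustrates only the distance, not the flow or the paper's CSWF method; the function name, the number of projections, and the equal-sample-size restriction are assumptions for this example.

```python
import numpy as np

def sliced_wasserstein2(X, Y, n_projections=256, seed=0):
    """Estimate squared SW2 between samples X and Y, both of shape (n, d)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Draw random directions uniformly on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # In 1-D, optimal transport between equal-sized samples matches
    # sorted values, so sort each projection and compare elementwise.
    px = np.sort(X @ theta.T, axis=0)
    py = np.sort(Y @ theta.T, axis=0)
    return np.mean((px - py) ** 2)
```

For a pure translation by a vector s in d dimensions, the estimate should approach |s|^2 / d, which gives a simple sanity check.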
Statistical Methods to Enhance Clinical Prediction with High-Dimensional Data and Ordinal Response
Technological progress now makes it possible to study the molecular
configuration of single cells or whole tissue samples. Such
high-dimensional omics data from molecular biology, produced in large
quantities, can be generated at ever lower cost and are therefore
used more and more often to address clinical questions.
Personalized diagnosis, or the prediction of treatment success, based
on such high-throughput data is a modern application of machine
learning techniques.
In practice, clinical parameters such as health status or the side
effects of a therapy are often recorded on an ordinal scale (for
example good, normal, bad).
It is common to treat classification problems with an ordinal
endpoint as general multi-class problems and thus to ignore the
information contained in the ordering of the classes. However,
neglecting this information can reduce classification performance or
even produce an unfavorable unordered classification.
Classical approaches that model an ordinal endpoint directly, such as
the cumulative link model, typically cannot be applied to
high-dimensional data.
In this work we present hierarchical twoing (hi2), an algorithm for
classifying high-dimensional data into ordinal categories. hi2 uses
the power of the very well understood binary classification to
classify into ordinal categories as well. An open-source
implementation of hi2 is available online.
In a comparison study on the classification of real and simulated
data with an ordinal endpoint, established methods designed
specifically for ordered categories do not generally produce better
results than state-of-the-art non-ordinal classifiers. An algorithm's
ability to handle high-dimensional data dominates classification
performance. We show that our algorithm hi2 achieves consistently
good results and in many cases outperforms the other methods.
- …