Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds
In this paper we address the problems of modeling the acoustic space
generated by a full-spectrum sound source and of using the learned model for
the localization and separation of multiple sources that simultaneously emit
sparse-spectrum sounds. We lay theoretical and methodological grounds in order
to introduce the binaural manifold paradigm. We perform an in-depth study of
the latent low-dimensional structure of the high-dimensional interaural
spectral data, based on a corpus recorded with a human-like audiomotor robot
head. A non-linear dimensionality reduction technique is used to show that
these data lie on a two-dimensional (2D) smooth manifold parameterized by the
motor states of the listener, or equivalently, the sound source directions. We
propose a probabilistic piecewise affine mapping model (PPAM) specifically
designed to deal with high-dimensional data exhibiting an intrinsic piecewise
linear structure. We derive a closed-form expectation-maximization (EM)
procedure for estimating the model parameters, followed by Bayes inversion for
obtaining the full posterior density function of a sound source direction. We
extend this solution to deal with missing data and redundancy in real world
spectrograms, and hence for 2D localization of natural sound sources such as
speech. We further generalize the model to the challenging case of multiple
sound sources and we propose a variational EM framework. The associated
algorithm, referred to as variational EM for source separation and localization
(VESSL) yields a Bayesian estimation of the 2D locations and time-frequency
masks of all the sources. Comparisons of the proposed approach with several
existing methods reveal that the combination of acoustic-space learning with
Bayesian inference enables our method to outperform state-of-the-art methods.
Comment: 19 pages, 9 figures, 3 tables
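As a hedged illustration of the piecewise affine mapping idea (a 1-D toy, not the paper's PPAM or VESSL implementation), the following EM loop fits a two-component mixture of local affine maps, with each component gated by a Gaussian over the input; the data, component count, and initialisation are all synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data on a piecewise-linear "manifold":
# y = 2x on [0, 0.5), y = 2 - 2x on [0.5, 1], plus noise.
N = 400
x = rng.uniform(0.0, 1.0, N)
y = np.where(x < 0.5, 2.0 * x, 2.0 - 2.0 * x) + 0.05 * rng.normal(size=N)

K = 2
# Deterministic initialisation: split the input range in half.
masks = [x < 0.5, x >= 0.5]
pi = np.array([m.mean() for m in masks])
mu = np.array([x[m].mean() for m in masks])     # component centres in x
s2x = np.array([x[m].var() for m in masks])     # component variances in x
a = np.zeros(K); b = np.zeros(K); s2y = np.ones(K)
for k, m in enumerate(masks):
    a[k], b[k] = np.polyfit(x[m], y[m], 1)
    s2y[k] = np.mean((y[m] - a[k] * x[m] - b[k]) ** 2)

def norm_pdf(v, mean, var):
    return np.exp(-0.5 * (v - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

for _ in range(50):
    # E-step: responsibilities use both the gate on x and the affine fit on y.
    r = np.stack([pi[k] * norm_pdf(x, mu[k], s2x[k])
                  * norm_pdf(y, a[k] * x + b[k], s2y[k]) for k in range(K)])
    r /= r.sum(axis=0, keepdims=True)
    # M-step: closed-form weighted updates per component.
    for k in range(K):
        w = r[k]; W = w.sum()
        pi[k] = W / N
        mu[k] = (w * x).sum() / W
        s2x[k] = (w * (x - mu[k]) ** 2).sum() / W
        a[k], b[k] = np.polyfit(x, y, 1, w=np.sqrt(w))  # weighted least squares
        s2y[k] = (w * (y - a[k] * x - b[k]) ** 2).sum() / W

order = np.argsort(mu)
print(np.round(a[order], 2))  # slopes of the two recovered affine pieces
```

Each M-step update is closed-form, which is the practical appeal of this family of models over generic non-linear regressors.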
High-Dimensional Regression with Gaussian Mixtures and Partially-Latent Response Variables
In this work we address the problem of approximating high-dimensional data
with a low-dimensional representation. We make the following contributions. We
propose an inverse regression method which exchanges the roles of input and
response, such that the low-dimensional variable becomes the regressor, and
which is tractable. We introduce a mixture of locally-linear probabilistic
mapping model that starts with estimating the parameters of inverse regression,
and follows with inferring closed-form solutions for the forward parameters of
the high-dimensional regression problem of interest. Moreover, we introduce a
partially-latent paradigm, such that the vector-valued response variable is
composed of both observed and latent entries, thus being able to deal with data
contaminated by experimental artifacts that cannot be explained with noise
models. The proposed probabilistic formulation could be viewed as a
latent-variable augmentation of regression. We devise expectation-maximization
(EM) procedures based on a data augmentation strategy which facilitates the
maximum-likelihood search over the model parameters. We propose two
augmentation schemes and we describe in detail the associated EM inference
procedures that may well be viewed as generalizations of a number of EM
regression, dimension reduction, and factor analysis algorithms. The proposed
framework is validated with both synthetic and real data. We provide
experimental evidence that our method outperforms several existing regression
techniques.
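The inverse-then-forward trick can be illustrated with a single affine component (the actual model uses a mixture of such components): the low-to-high inverse map is cheap to fit by least squares, and the forward posterior then follows in closed form by Gaussian conditioning. The dimensions, noise levels, and diagonal noise model below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

L, D, N = 2, 20, 500          # low-dim response, high-dim input, samples
A = rng.normal(size=(D, L))   # ground-truth low-to-high (inverse) map
b = rng.normal(size=D)
X = rng.normal(size=(N, L))   # low-dimensional variable
Y = X @ A.T + b + 0.1 * rng.normal(size=(N, D))

# Step 1: fit the *inverse* regression y ~ A x + b (cheap: only D*L slopes).
Xd = np.hstack([X, np.ones((N, 1))])
coef, *_ = np.linalg.lstsq(Xd, Y, rcond=None)
A_hat, b_hat = coef[:L].T, coef[L]
Sigma = np.diag(((Y - Xd @ coef) ** 2).mean(axis=0))  # diagonal noise model

# Step 2: closed-form *forward* posterior p(x | y) by Gaussian conditioning,
# with a Gaussian prior on x estimated from the training data.
c, Gamma = X.mean(axis=0), np.cov(X.T)
Si = np.linalg.inv(Sigma)
S = np.linalg.inv(np.linalg.inv(Gamma) + A_hat.T @ Si @ A_hat)  # posterior cov

def predict_x(y):
    return S @ (np.linalg.inv(Gamma) @ c + A_hat.T @ Si @ (y - b_hat))

x_true = rng.normal(size=L)
y_obs = A @ x_true + b + 0.1 * rng.normal(size=D)
x_hat = predict_x(y_obs)
err = np.abs(x_hat - x_true).max()
```

Exchanging the roles of input and response keeps the parameter count linear in the high dimension D, which is what makes the approach tractable.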
Econometrics meets sentiment: an overview of methodology and applications
The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software.
Deep learning architectures applied to wind time series multi-step forecasting
Forecasting is a critical task for the integration of wind-generated energy into electricity grids. Numerical weather models applied to wind prediction work with grid sizes too large to reproduce all the local features that influence wind, making time series of past observations a necessary tool for wind forecasting. This research work applies deep neural networks to multi-step forecasting using multivariate time series as input, to forecast wind speed 12 hours ahead.
Wind time series are sequences of meteorological observations such as wind speed, temperature, pressure, humidity, and direction. Wind series have two statistically relevant properties, non-linearity and non-stationarity, which make modelling with traditional statistical tools very inaccurate.
In this thesis we design, test and validate novel deep learning models for the wind energy prediction task, applying new deep architectures to the largest open wind data repository available, from the US National Renewable Energy Laboratory (NREL), with 126,692 wind sites evenly distributed across the US. The heterogeneity of the series, obtained from several data origins, allows us to draw conclusions about how well each model fits time series ranging from highly stationary locations to variable sites in complex areas.
We propose multi-layer, convolutional, and recurrent networks as basic building blocks, which are then combined into heterogeneous architectures with different variants and trained with optimisation strategies such as dropout, skip connections, early stopping, adaptive learning rates, and filters and kernels of different sizes, among others. The architectures are optimised using structured hyper-parameter selection strategies to obtain the best performing model across the whole dataset.
The learning capabilities of the architectures applied to the various sites reveal relationships between site characteristics (terrain complexity, wind variability, geographical location) and model accuracy, establishing novel measures of site predictability that relate model fit to indexes from spectral or stationarity analysis of the time series. The designed methods offer new, and superior, alternatives to traditional methods.
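The multi-step set-up described above can be sketched as a data-shaping step plus a naive baseline. The variable names, sampling rate, and 24-hour lookback below are illustrative assumptions, not the thesis configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy multivariate meteorological series: wind speed, temperature, pressure,
# sampled hourly (names and horizon are illustrative, not the thesis set-up).
T = 1000
t = np.arange(T)
wind = 5 + 2 * np.sin(2 * np.pi * t / 24) + 0.3 * rng.normal(size=T)
temp = 15 + 8 * np.sin(2 * np.pi * (t - 6) / 24) + 0.5 * rng.normal(size=T)
pres = 1013 + 0.1 * rng.normal(size=T)
series = np.stack([wind, temp, pres], axis=1)        # shape (T, 3)

def make_windows(data, target_col, lookback, horizon):
    """Slice a (T, F) series into supervised multi-step samples:
    X[i] = data[i : i+lookback], y[i] = target over the next `horizon` steps."""
    X, y = [], []
    for i in range(len(data) - lookback - horizon + 1):
        X.append(data[i:i + lookback])
        y.append(data[i + lookback:i + lookback + horizon, target_col])
    return np.array(X), np.array(y)

lookback, horizon = 24, 12            # 24 h of context -> next 12 h of wind
X, y = make_windows(series, target_col=0, lookback=lookback, horizon=horizon)
print(X.shape, y.shape)               # (n_samples, 24, 3), (n_samples, 12)

# Persistence baseline: repeat the last observed wind speed for all 12 steps.
y_persist = np.repeat(X[:, -1, 0:1], horizon, axis=1)
rmse = np.sqrt(np.mean((y_persist - y) ** 2))
```

The resulting (samples, lookback, features) tensors are exactly the input shape the convolutional and recurrent architectures consume; the persistence RMSE is the floor any learned model must beat.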
Enhancing Prediction Efficacy with High-Dimensional Input Via Structural Mixture Modeling of Local Linear Mappings
Regression is a widely used statistical tool to discover associations between variables. Estimated relationships can be further utilized for predicting new observations. Obtaining reliable prediction outcomes is a challenging task. When building a regression model, several difficulties such as high dimensionality in predictors, non-linearity of the associations, and outliers can reduce the quality of results. Furthermore, the prediction error increases if the newly acquired data is not processed carefully. In this dissertation, we aim at improving prediction performance by enhancing model robustness at the training stage and duly handling the query data at the testing stage. We propose two methods to build robust models. One focuses on adopting a parsimonious model to limit the number of parameters and a refinement technique to enhance model robustness. We design the procedure to be carried out on parallel systems and further extend its ability to handle complex and large-scale datasets. The other method restricts the parameter space to avoid the singularity issue and employs trimming techniques to limit the influence of outlying observations. We build both approaches using the mixture-modeling principle to accommodate data heterogeneity without uncontrollably increasing model complexity. The proposed procedures for suitably choosing tuning parameters further enhance the ability to determine the sizes of the models according to the richness of the available data. Both methods show their ability to improve prediction performance, compared to existing approaches, in applications such as magnetic resonance vascular fingerprinting and source separation in single-channel polyphonic music, among others. To evaluate model robustness, we develop an efficient approach to generating adversarial samples, which can induce large prediction errors yet are difficult to detect visually.
Finally, we propose a preprocessing system to detect and repair different kinds of abnormal testing samples for prediction efficacy, when testing samples are either corrupted or adversarially perturbed.
PhD dissertation, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/149938/1/timtu_1.pd
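A minimal sketch of the adversarial-sample idea for regression (an FGSM-style construction on a linear model, not the dissertation's method): for a fixed perturbation norm, stepping along the model's input gradient shifts the prediction more than a random direction of the same size. All data here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

# Fit an ordinary least-squares regressor on synthetic data.
N, D = 200, 10
X = rng.normal(size=(N, D))
w_true = rng.normal(size=D)
y = X @ w_true + 0.1 * rng.normal(size=N)
Xd = np.hstack([X, np.ones((N, 1))])
w = np.linalg.lstsq(Xd, y, rcond=None)[0]

def predict(x):
    return np.append(x, 1.0) @ w

# Adversarial perturbation: for a linear model the prediction gradient w.r.t.
# the input is the weight vector, so a small step along it maximally shifts
# the output among all perturbations of the same norm.
x0 = rng.normal(size=D)
eps = 0.05                                    # small, hard-to-notice budget
delta = eps * w[:D] / np.linalg.norm(w[:D])
shift = abs(predict(x0 + delta) - predict(x0))

# Compare with a random direction of identical norm.
rand = rng.normal(size=D)
rand *= eps / np.linalg.norm(rand)
shift_rand = abs(predict(x0 + rand) - predict(x0))
```

By Cauchy-Schwarz, the gradient direction attains the largest possible prediction shift for the given budget, so `shift >= shift_rand` always holds here.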
Classification of Animal Sound Using Convolutional Neural Network
Recently, labeling of acoustic events has emerged as an active topic covering a wide range of applications. High-level semantic inference can be conducted based on main audio effects to facilitate various content-based applications for analysis, efficient retrieval and content management. This paper proposes a flexible convolutional neural network (CNN)-based framework for animal audio classification. The work takes inspiration from various deep neural networks recently developed for multimedia classification. The model is driven by the idea of identifying the animal sound in the audio file by forcing the network to pay attention to the core audio effect present in the Mel-spectrogram generated from the audio. The designed framework achieves an accuracy of 98% when classifying animal audio on weakly labelled datasets. A key contribution of this research is a framework that can run even on a basic machine and does not require high-end devices for classification.
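The Mel-spectrogram front end mentioned above can be sketched with a hand-rolled mel filterbank; the sample rate, FFT size, mel count, and the 440 Hz test tone are illustrative assumptions (in practice a library such as librosa would typically be used).

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters mapping an FFT magnitude spectrum onto a mel scale."""
    # Mel-spaced band edges between 0 Hz and Nyquist.
    edges_hz = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2),
                                     n_mels + 2))
    bins = np.floor((n_fft + 1) * edges_hz / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, ctr, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, ctr):
            fb[m - 1, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fb[m - 1, k] = (hi - k) / max(hi - ctr, 1)
    return fb

# Apply to one frame of a toy signal: FFT power -> mel energies -> log.
sr, n_fft, n_mels = 16000, 512, 40
t = np.arange(n_fft) / sr
frame = np.sin(2 * np.pi * 440.0 * t)          # a 440 Hz tone as stand-in audio
spectrum = np.abs(np.fft.rfft(frame)) ** 2
fb = mel_filterbank(n_mels, n_fft, sr)
log_mel = np.log(fb @ spectrum + 1e-10)        # one column of a Mel-spectrogram
```

Stacking such columns over successive frames yields the 2-D image the CNN classifies.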
Spatial statistical modelling of epigenomic variability
Each cell in our body carries the same genetic information encoded in the DNA, yet the human
organism contains hundreds of cell types which differ substantially in physiology and functionality.
This variability stems from the existence of regulatory mechanisms that control gene expression,
and hence phenotype. The field of epigenetics studies how changes in biochemical factors, other
than the DNA sequence itself, might affect gene regulation. The advent of high throughput
sequencing platforms has enabled the profiling of different epigenetic marks on a genome-wide
scale; however, bespoke computational methods are required to interpret these high-dimensional
data and investigate the coupling between the epigenome and transcriptome.
This thesis contributes to the development of statistical models to capture spatial correlations
of epigenetic marks, with the main focus being DNA methylation. To this end, we developed
BPRMeth (Bayesian Probit Regression for Methylation), a probabilistic model for extracting
higher order methylation features that precisely quantify the spatial variability of bulk DNA
methylation patterns. Using such features, we constructed an accurate machine learning
predictor of gene expression from DNA methylation and identified prototypical methylation
profiles that explain most of the variability across promoter regions. The BPRMeth model, and
its algorithmic implementation, were subsequently substantially extended both to accommodate
different data types, and to improve the scalability of the algorithm.
Bulk experiments have paved the way for mapping the epigenetic landscape; nonetheless,
they fall short of explaining the epigenetic heterogeneity and quantifying its dynamics, which
inherently occur at the single cell level. Single cell bisulfite sequencing protocols have
recently been developed; however, due to intrinsic limitations of the technology they result in
extremely sparse coverage of CpG sites, effectively limiting the analysis repertoire to a semi-quantitative
level. To overcome these difficulties we developed Melissa (MEthyLation Inference
for Single cell Analysis), a Bayesian hierarchical model that leverages local correlations between
neighbouring CpGs and similarity between individual cells to jointly impute missing methylation
states, and cluster cells based on their genome-wide methylation profiles.
A recent experimental innovation enables the parallel profiling of DNA methylation, transcription
and chromatin accessibility (scNMT-seq), making it possible to link transcriptional
and epigenetic heterogeneity at the single cell resolution. For the scNMT-seq study, we applied
the extended BPRMeth model to quantify cell-to-cell chromatin accessibility heterogeneity
around promoter regions and subsequently link it to transcript abundance. This revealed that
genes with conserved accessibility profiles are associated with higher average expression levels.
In summary, this thesis proposes statistical methods to model and interpret epigenomic data
generated from high throughput sequencing experiments. Due to their statistical power and
flexibility we anticipate that these methods will be applicable to future sequencing technologies
and become widespread tools in the high throughput bioinformatics workbench for performing
biomedical data analysis.
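A toy sketch of the profile-extraction idea behind BPRMeth: binary methylation calls along a region are fitted with a basis-function GLM whose fitted curve serves as the "methylation profile". Here a logistic link stands in for the probit link for simplicity, and the CpG positions, basis settings, and promoter-dip shape are all synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic region: CpG positions scaled to [-1, 1], with low methylation
# around the centre (e.g. an unmethylated promoter core) and high flanks.
n_cpg = 200
pos = np.sort(rng.uniform(-1, 1, n_cpg))
true_prob = 1 / (1 + np.exp(-6 * (np.abs(pos) - 0.4)))
meth = (rng.uniform(size=n_cpg) < true_prob).astype(float)

def rbf_design(x, centres, gamma=10.0):
    """RBF basis over genomic position, plus a bias column."""
    Phi = np.exp(-gamma * (x[:, None] - centres[None, :]) ** 2)
    return np.hstack([np.ones((len(x), 1)), Phi])

centres = np.linspace(-1, 1, 5)
Phi = rbf_design(pos, centres)

# Fit by Newton's method (IRLS) on the L2-penalised Bernoulli log-likelihood;
# the logistic link is a stand-in for BPRMeth's probit link.
w = np.zeros(Phi.shape[1])
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-Phi @ w))
    Wd = p * (1.0 - p)                                 # IRLS working weights
    H = Phi.T @ (Phi * Wd[:, None]) + 0.1 * np.eye(len(w))
    grad = Phi.T @ (meth - p) - 0.1 * w
    w += np.linalg.solve(H, grad)

profile = 1.0 / (1.0 + np.exp(-Phi @ w))
# The fitted profile should recover the dip: lower centre than flanks.
centre_level = profile[np.abs(pos) < 0.2].mean()
flank_level = profile[np.abs(pos) > 0.6].mean()
```

The fitted weight vector `w` is a fixed-length summary of the region's spatial methylation pattern, which is what makes such profiles usable as features for downstream expression prediction or clustering.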