42 research outputs found
A Geometric Deep Learning Approach to Sound Source Localization and Tracking
The localization and tracking of sound sources using microphone arrays is a problem that, although it has attracted attention from the signal processing research community for decades, remains open. In recent years, deep learning models have surpassed the state of the art that had been established by classic signal processing techniques, but these models still struggle in rooms with strong reverberation or when tracking multiple sources that dynamically appear and disappear, especially when we cannot apply any criteria to classify or order them. In this thesis, we follow the ideas of the Geometric Deep Learning framework to propose new models and techniques that advance the state of the art in the aforementioned scenarios. As the input of our models, we use acoustic power maps computed using the SRP-PHAT algorithm, a classic signal processing technique that allows us to estimate the acoustic energy received from any direction of the space and, therefore, compute arbitrary-shaped power maps. In addition, we also propose a new technique to analytically cancel a source from the generalized cross-correlations used to compute the SRP-PHAT maps. Based on previous narrowband cancellation techniques, we prove that we can project the cross-correlation functions of the signals captured by a microphone array into a space orthogonal to a given direction by just computing a linear combination of time-shifted versions of the original cross-correlations. 
The proposed cancellation technique can be used to design iterative multi-source localization systems where, after having found the strongest source in the generalized cross-correlation functions or in the SRP-PHAT maps, we can cancel it and find new sources that were previously masked by the first source. Before being able to train deep learning models we need data, which, in the case of following a supervised learning approach, means a dataset of multichannel recordings with the position of the sources accurately labeled. Although some datasets like this exist, they are not large enough to train a neural network and the acoustic environments they include are not diverse enough. To overcome this lack of real data, we present a technique to simulate acoustic scenes with one or several moving sound sources and, to be able to perform these simulations as they are needed during the training, we present what is, to the best of our knowledge, the first free and open source room acoustics simulation library with GPU acceleration. As we prove in this thesis, the presented library is more than two orders of magnitude faster than other state-of-the-art CPU libraries. The main idea of the Geometric Deep Learning philosophy is that the models should fit the symmetries (i.e., the invariances and equivariances) of the data and the problem we want to solve. For single-source direction of arrival estimation, the use of SRP-PHAT maps as inputs of our models makes the rotational equivariance of the problem clear and, after a first approach using 3D convolutional neural networks, we present a model using icosahedral convolutions that approximate the equivariance to the continuous group of spherical rotations by the equivariance to the discrete group of the 60 icosahedral symmetries. 
We prove that the SRP-PHAT maps are a much more robust input feature than the spectrograms typically used in many state-of-the-art models and that the use of the icosahedral convolutions, combined with a new soft-argmax function that obtains a regression output from the output of the convolutional neural network by interpreting it as a probability distribution and computing its expected value, allows us to dramatically reduce the number of trainable parameters of the models without losing accuracy in their estimations. When we want to track multiple moving sources and we cannot use any criteria to order or classify them, the problem becomes invariant to the permutations of the estimates, so we cannot directly compare them with the ground truth labels since we cannot expect them to be in the same order. Models of this kind have typically been trained using permutation invariant training strategies, but these strategies usually do not penalize identity switches, so the models trained with them do not keep the identity of every source consistent during the tracking. To solve this issue, we propose a new training strategy, which we call sliding permutation invariant training (sPIT), that is able to optimize all the features that we could expect from a multi-source tracking system: the precision of the direction of arrival estimates, the accuracy of the source detections, and the consistency of the assigned identities. Finally, we propose a new kind of recursive neural network that, instead of using vectors as its input and its state, uses sets of vectors and is invariant to the permutations of the elements of the input set and equivariant to the permutations of the elements of the state set. 
We show how this is the behavior that we should expect from a tracking model which takes as inputs the estimates of a multi-source localization model and compare these permutation-invariant recursive neural networks with the conventional gated recurrent units for sound source tracking applications.
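The soft-argmax idea described in the abstract (read the network output as a probability distribution and take its expected value) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `soft_argmax`, the flat list of grid cells, and the sharpness parameter `beta` are assumptions made for this example, and the sketch ignores the wrap-around of angular coordinates.

```python
import math


def soft_argmax(scores, coords, beta=1.0):
    """Return the expected coordinate under softmax(beta * scores).

    scores: one float per grid cell (e.g. the output map of a network).
    coords: one tuple per grid cell holding that cell's coordinates.
    beta:   hypothetical sharpness parameter; larger values approach a hard argmax.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (s - peak)) for s in scores]
    total = sum(weights)
    probs = [w / total for w in weights]  # softmax: non-negative, sums to 1
    dims = len(coords[0])
    # Expected value of each coordinate under the softmax distribution.
    return tuple(
        sum(p * c[d] for p, c in zip(probs, coords)) for d in range(dims)
    )


# Usage: three cells on a line, with the highest score in the middle cell.
estimate = soft_argmax([0.0, 1.0, 0.0], [(0.0,), (1.0,), (2.0,)])
```

Because the output is a weighted average of cell coordinates, it is differentiable with respect to the scores and is not restricted to the grid positions, which is what makes it usable as a regression output.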
Crystallization kinetics of a commercial poly(lactic acid) based on characteristic crystallization time and optimal crystallization temperature
This version of the article has been accepted for publication, after peer review, and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s10973-020-10081-7
[Abstract]: A model is proposed to fit differential scanning calorimetry (DSC) isothermal crystallization curves obtained from the molten state at different temperatures. A commercial 3D printing polylactic acid (PLA) sample is used to test the method. All DSC curves are fitted by a mixture of two simultaneous functions, one of them being a time-derivative generalized logistic accounting for the exothermic effect and the other, a generalized logistic, accounting for the baseline. There is a rate parameter, which is allowed to vary across different temperatures. The rate parameter values obtained at different temperatures were jointly explained as a result of three crystallization processes, each one defined by a characteristic crystallization time, a characteristic temperature, and a dispersion or width factor. Apart from the very good fits obtained at all temperatures, the results agree with the existence of a few crystal forms of PLA, which were demonstrated by other authors. Thus, the main significance of this work consists in providing a new approach to mathematically describe the isothermal crystallization kinetics of a polymer from the melt. Such a kinetic description is needed in order to predict the extent of a crystallization process as a function of time at any isothermal temperature. The approach used here allows us to understand the overall crystallization of the PLA used in this work as the sum of three crystallization processes, each of them corresponding to a different crystal form. Each experimental crystallization exotherm, which may include more than one crystal form, can be reproduced by a generalized logistic function. The overall rate factor at a given temperature is the weighted sum of the rate factors of the different crystal structures at that temperature. The rate factor of each of these three processes is described by a Gaussian function whose parameters are a crystallization time, a characteristic temperature and a temperature dispersion factor. Therefore, the crystallization rate for each crystal form can be interpreted as a relative likelihood to crystallize at a given temperature. On the other hand, the characteristic crystallization time parameter refers to the time needed for a given crystal structure to be formed at the temperature at which the relative likelihood to crystallize of that form is highest.
This research has been supported by the Spanish Ministry of Science and Innovation, MINECO Grant MTM2017–82724-
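The weighted sum of Gaussian rate factors described in the abstract can be sketched as follows. The exact parametrization is not given in the abstract, so the form used here (peak height equal to the reciprocal of the characteristic time) is an assumption, as are the function names and the numerical values, which are placeholders and not fitted results.

```python
import math


def gaussian_rate(temp, t_char, temp_char, width):
    """Rate factor of one crystal form at temperature `temp`.

    Assumed form: (1 / t_char) * exp(-(temp - temp_char)^2 / (2 * width^2)),
    so the rate peaks at temp_char with height 1 / t_char.
    """
    return (1.0 / t_char) * math.exp(-((temp - temp_char) ** 2) / (2.0 * width ** 2))


def overall_rate(temp, forms):
    """Weighted sum of the rate factors of all crystal forms.

    forms: list of (weight, t_char, temp_char, width) tuples, one per form.
    """
    return sum(w * gaussian_rate(temp, t, tc, s) for w, t, tc, s in forms)


# Placeholder parameters for three hypothetical crystal forms
# (weight, characteristic time, characteristic temperature, width).
example_forms = [
    (0.5, 2.0, 95.0, 8.0),
    (0.3, 4.0, 110.0, 6.0),
    (0.2, 8.0, 125.0, 5.0),
]
rates = [overall_rate(temp, example_forms) for temp in (90.0, 100.0, 110.0, 120.0)]
```

Under this assumed form, each Gaussian acts as the relative likelihood of its crystal form crystallizing at a given temperature, and the characteristic time sets how fast that form crystallizes at its most favorable temperature.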
Fillers and methods to improve the effective (out-plane) thermal conductivity of polymeric thermal interface materials – A review
[Abstract]: The internet of things and the growing demand for smaller and more advanced devices have created
the problem of high heat production in electronic equipment, which greatly reduces the
performance and life of electronic instruments. A thermal interface material (TIM) is placed
between the heat-generating microchip and the heat dissipater to conduct the heat produced to
the heat sink. The development of suitable TIMs with excellent thermal conductivity (TC) in both
in-plane and through-plane directions is a very important need at present. For efficient thermal
management, polymer composites are potential candidates. But in general, their thermal conductivity
is low compared to that of metals. The filler integration into the polymer matrix is one of
the two approaches used to increase the thermal conductivity of polymer composites and is also
easy to scale up for industrial production. Another way to achieve this is to change the structure
of the polymer chains, which falls outside the scope of this work. In this review, considering the first
approach, the authors have summarized recent developments in many types of fillers with
different scenarios by providing multiple cases with successful strategies to improve through-plane
thermal conductivity (TPTC) (k⊥). For a better understanding of TC, a comprehensive
background is presented. Several methods to improve the effective (out-plane) thermal conductivity
of polymer composites and different theoretical models for the calculation of TC are also
discussed. Finally, a detailed conclusion describes the drawbacks of some fillers,
multiple significant routes recommended by other researchers to build thermally conductive
polymer composites, and future directions, giving researchers a
guideline for designing an effective polymer-based thermal interface material.
This research was funded by the Ministry of Science and Technology of the People’s Republic of China, “Light Shipbuilding Fire-Resistant Sandwich Panels with Improved Balance of Acoustic Insulation, Mechanical and Environmentally-Friendly Properties”,
grant number 2019YFE0124000.
China. Ministry of Science and Technolog
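The abstract above mentions theoretical models for calculating the thermal conductivity of filled polymers without naming them. As a generic illustration only, and not necessarily the models covered in the review, the sketch below evaluates three textbook estimates: the parallel and series rules of mixtures and the Maxwell model for dilute spherical fillers. The function names and the example conductivities are assumptions chosen for this example.

```python
def parallel_model(k_matrix, k_filler, phi):
    """Parallel rule of mixtures (upper bound): heat flows along both phases."""
    return (1.0 - phi) * k_matrix + phi * k_filler


def series_model(k_matrix, k_filler, phi):
    """Series rule of mixtures (lower bound): heat crosses the phases in turn."""
    return 1.0 / ((1.0 - phi) / k_matrix + phi / k_filler)


def maxwell_model(k_matrix, k_filler, phi):
    """Maxwell estimate for dilute, non-interacting spherical filler particles."""
    num = k_filler + 2.0 * k_matrix + 2.0 * phi * (k_filler - k_matrix)
    den = k_filler + 2.0 * k_matrix - phi * (k_filler - k_matrix)
    return k_matrix * num / den


# Illustrative values in W/(m K): a polymer matrix and a ceramic-like filler
# at a filler volume fraction of 0.3 (placeholders, not data from the review).
k_m, k_f, phi = 0.2, 30.0, 0.3
estimates = {
    "series": series_model(k_m, k_f, phi),
    "maxwell": maxwell_model(k_m, k_f, phi),
    "parallel": parallel_model(k_m, k_f, phi),
}
```

The gap between the series and parallel bounds shows why filler connectivity matters: a composite whose filler particles stay isolated behaves closer to the Maxwell or series estimate, far below what the filler's own conductivity would suggest.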
Comparison by thermal analysis of Joule‑cured versus oven‑cured composites
Funded for open access publication: Universidade da Coruña/CISUG
[Abstract]: The current technology for curing high-performance composites, such as those used in the aeronautics and
automotive industries, is based on the use of autoclaves, where the material is cured by external heating, in large ovens.
This type of curing requires enormous amounts of energy, of which only a small part is invested in the actual curing of the
material, and the rest is mainly used for heating and maintaining the temperature of the autoclave. An alternative method that
entails a lower energy cost compared to the traditional methodology is curing through the Joule effect, in which an electric
current is passed through the material, so that it acquires temperature from the inside due to the passage of current through
the carbon fibres, triggering and accelerating the curing process of the composite. While Joule curing may provide a much
more efficient and faster curing, a control technology is needed to ensure that temperatures all throughout the composite
match the temperature programme. In this work, a procedure has been developed to control the Joule effect curing of carbon
fibre/epoxy composites in order to compare, by means of differential scanning calorimetry (DSC) and dynamic mechanical
analysis (DMA), the curing obtained by this method with that obtained by the traditional oven curing method.
We acknowledge the financial support provided
by the Ministerio de Ciencia e Innovación (Spain), under grant
PID2020-113578RB-100, and the Programa de Doutoramento Industrial
2022, funded by Xunta de Galicia, through the grant number
07_IN606D_2022_2695330.
Xunta de Galicia: 07_IN606D_2022_269533
Thermal and Rheological Properties of Fischer–Tropsch Wax/High-Flow LLDPE Blends
[Abstract]: Waxes find use as processing aids in filled compounds and
polyethylene-based masterbatches. In such applications, the thermal and
physical property changes they impart to the polymer matrix are important.
Therefore, this study details results obtained for blends prepared by mixing a
Fischer–Tropsch (F–T) wax with a high-flow linear low-density polyethylene
(LLDPE). The melting and crystallization behavior are studied using hot-stage
polarized optical microscopy (POM) and differential scanning calorimetry
(DSC). The calorimetry results are consistent with partial cocrystallization of
the two components. The melting endotherms and crystallization exotherms for
the wax- and LLDPE-rich phases remain separate. However, they change in
shape and shift toward higher and lower temperature ranges, respectively. It
is found that increasing the wax content delays the crystallization, decreases
the overall crystallinity, and reduces the size of the crystallites of the
polyethylene-rich phase. Rotational viscosity is measured at 170 °C in the
Newtonian shear-rate range. The variation of the zero-shear viscosity with
blend composition is consistent with the assumption of a homogeneous melt
in which the chains are in an entangled state. Therefore, it is concluded that
the wax and LLDPE are, in effect, miscible in the melt and partially compatible
in the solid state.
Generous financial support from Sasol is gratefully acknowledged. Sasol
research grant agreement SAP No. 126/20 G
A Logistic Approach for Kinetics of Isothermal Pyrolysis of Cellulose
[Abstract]: A kinetic model is proposed to fit isothermal thermogravimetric data obtained from cellulose in an inert atmosphere at different temperatures. The method used here to evaluate the model involves two steps: (1) fitting of single time-derivative thermogravimetric curves (DTG) obtained at different temperatures versus time, and (2) fitting of the rate parameter values obtained at different temperatures versus temperature. The first step makes use of derivatives of logistic functions. For the second step, the dependence of the rate factor on temperature is evaluated. Separating the curve fitting from the analysis of the rate factor proved to be very flexible, since it worked for previous crystallization studies and now for the thermal degradation of cellulose.
Ministerio de Asuntos Económicos y Transformación Digital; MTM2017-82724-R
Xunta de Galicia; ED431C-2020-14
Xunta de Galicia; ED431G 2019/0
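The first step of the method, fitting a DTG curve with the derivative of a logistic function, relies on a curve shape that can be sketched as below. The abstract does not give the exact parametrization, so the simple three-parameter logistic used here (amplitude `c`, rate parameter `b`, and midpoint time `t0`) and the function names are assumptions made for this example.

```python
import math


def logistic(t, c, b, t0):
    """Simple logistic step: rises from 0 to c, centered at time t0."""
    return c / (1.0 + math.exp(-b * (t - t0)))


def logistic_derivative(t, c, b, t0):
    """Time derivative of `logistic`: a bell-shaped peak centered at t0.

    This is the shape that would be fitted to a single DTG curve;
    its maximum value is c * b / 4 and its area is c.
    """
    e = math.exp(-b * (t - t0))
    return c * b * e / (1.0 + e) ** 2


# Placeholder parameters (not fitted values): the rate parameter b controls
# how narrow and tall the DTG peak is for a given amplitude c.
times = [float(i) for i in range(0, 21)]
dtg_curve = [logistic_derivative(t, c=0.8, b=0.5, t0=10.0) for t in times]
```

In the second step described in the abstract, the value of `b` obtained at each isothermal temperature would itself be fitted as a function of temperature; that dependence is not specified in the abstract and is not sketched here.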