32 research outputs found

    Timage -- A Robust Time Series Classification Pipeline

    Time series are series of values ordered by time. This kind of data can be found in many real-world settings. Classifying time series is a difficult task and an active area of research. This paper investigates the use of transfer learning in Deep Neural Networks and a 2D representation of time series known as Recurrence Plots. In order to utilize the research done in the area of image classification, where Deep Neural Networks have achieved very good results, we use a Residual Neural Network architecture known as ResNet. As preprocessing of time series is a major part of every time series classification pipeline, the proposed method simplifies this step and requires only a few parameters. For the first time we propose a method for multi-time-series classification: training a single network to classify all datasets in the archive at once. We are among the first to evaluate the method on the latest 2018 release of the UCR archive, a well-established time series classification benchmark. Comment: ICANN19, 28th International Conference on Artificial Neural Networks
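A recurrence plot encodes a 1D series as a 2D binary matrix by thresholding pairwise distances between time points, which is what lets an image network like ResNet consume time series. A minimal sketch (the threshold eps and the toy signal are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def recurrence_plot(x: np.ndarray, eps: float = 0.1) -> np.ndarray:
    """Binary recurrence plot: R[i, j] = 1 if |x[i] - x[j]| < eps."""
    d = np.abs(x[:, None] - x[None, :])   # pairwise distances between time points
    return (d < eps).astype(np.uint8)

# Toy usage: a sine wave becomes a textured 2D image a CNN can classify.
t = np.linspace(0, 4 * np.pi, 200)
img = recurrence_plot(np.sin(t), eps=0.2)
print(img.shape)  # (200, 200)
```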

    Scalable precision wide-field imaging in radio interferometry: I. uSARA validated on ASKAP data

    As Part I of a paper series showcasing a new imaging framework, we consider the recently proposed unconstrained Sparsity Averaging Reweighted Analysis (uSARA) optimisation algorithm for wide-field, high-resolution, high-dynamic-range, monochromatic intensity imaging. We reconstruct images from real radio-interferometric observations obtained with the Australian Square Kilometre Array Pathfinder (ASKAP) and present these results in comparison to the widely used, state-of-the-art imager WSClean. Selected fields come from the ASKAP Early Science and Evolutionary Map of the Universe (EMU) Pilot surveys and contain several complex radio sources: the merging cluster system Abell 3391-95, the merging cluster SPT-CL 2023-5535, and many extended, or bent-tail, radio galaxies, including the X-shaped radio galaxy PKS 2014-558 and the "dancing ghosts", known collectively as PKS 2130-538. The modern framework behind uSARA utilises parallelisation and automation to solve for the w-effect and efficiently compute the measurement operator, allowing for wide-field reconstruction over the full field-of-view of individual ASKAP beams (up to 3.3 deg each). The precision capability of uSARA produces images with both super-resolution and enhanced sensitivity to diffuse components, surpassing traditional CLEAN algorithms, which typically require a compromise between the two. Our resulting monochromatic uSARA-ASKAP images of the selected data highlight both extended, diffuse emission and compact, filamentary emission at very high resolution (up to 2.2 arcsec), revealing never-before-seen structure. Here we present a validation of our uSARA-ASKAP images by comparing the morphology of reconstructed sources, measurements of diffuse flux, and spectral index maps with those obtained from images made with WSClean. Comment: Accepted for publication in MNRAS
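uSARA belongs to the family of proximal optimisation algorithms with a reweighted sparsity prior. A highly simplified, single-dictionary sketch of the reweighting idea (the operator handles, weights, and parameters below are illustrative assumptions, not the uSARA implementation, which works with an averaged wavelet dictionary and a full interferometric measurement operator):

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def reweighted_l1(y, A, At, step, lam=0.05, outer=3, inner=100):
    """Toy reweighted-l1 reconstruction:
    min_x 0.5 * ||A x - y||^2 + lam * sum_i w_i |x_i|,
    with the weights w re-estimated from the current solution."""
    x = At(y)
    w = np.ones_like(x)                        # start with uniform weights
    for _ in range(outer):                     # reweighting loop
        for _ in range(inner):                 # forward-backward iterations
            x = soft_threshold(x - step * At(A(x) - y), step * lam * w)
        w = 1.0 / (np.abs(x) + 1e-3)           # down-weight strong coefficients
    return x

# Toy usage: a random matrix stands in for the measurement operator.
rng = np.random.default_rng(0)
M = rng.normal(size=(80, 120)) / np.sqrt(80)
x_true = np.zeros(120)
x_true[rng.choice(120, 5, replace=False)] = 1.0
step = 1.0 / np.linalg.norm(M, 2) ** 2         # 1 / Lipschitz constant
x_hat = reweighted_l1(M @ x_true, lambda v: M @ v, lambda v: M.T @ v, step)
```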

    Rethinking Attention Mechanism in Time Series Classification

    Attention-based models have been widely used in many areas, such as computer vision and natural language processing. However, relevant applications in time series classification (TSC) have not yet been explored in depth, so many TSC algorithms still suffer from general shortcomings of the attention mechanism, such as quadratic complexity. In this paper, we improve the efficiency and performance of the attention mechanism by proposing flexible multi-head linear attention (FMLA), which enhances locality awareness through layer-wise interactions with deformable convolutional blocks and online knowledge distillation. Moreover, we propose a simple but effective masking mechanism that reduces the influence of noise in time series and decreases the redundancy of FMLA by proportionally masking positions of each given series. To stabilize this mechanism, samples are forwarded through the model with random mask layers several times and their outputs are aggregated to teach the same model with regular mask layers. We conduct extensive experiments on 85 UCR2018 datasets to compare our algorithm with 11 well-known ones, and the results show that our algorithm has comparable performance in terms of top-1 accuracy. We also compare our model with three Transformer-based models with respect to floating-point operations per second and number of parameters, and find that our algorithm achieves significantly better efficiency with lower complexity.
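Linear attention avoids the quadratic cost of softmax attention by applying a kernel feature map to queries and keys, so that key/value statistics can be aggregated once in O(n). A minimal single-head sketch (the elu-based feature map is a common linear-attention choice and an assumption here, not necessarily the FMLA formulation):

```python
import numpy as np

def feature_map(x):
    """Positive feature map phi(x) = elu(x) + 1, a common linear-attention choice."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """O(n) attention: out_i = phi(q_i) S / (phi(q_i) z), where
    S = sum_j phi(k_j)^T v_j and z = sum_j phi(k_j)^T."""
    Qf, Kf = feature_map(Q), feature_map(K)   # (n, d)
    S = Kf.T @ V                              # (d, d_v) key/value summary
    z = Kf.sum(axis=0)                        # (d,) normaliser
    return (Qf @ S) / (Qf @ z)[:, None]

# Toy usage on a length-512 series embedded in 64 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(512, 64))
out = linear_attention(X, X, X)
print(out.shape)  # (512, 64)
```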

    A comprehensive framework for evaluation of high pacing frequency and arrhythmic optical mapping signals

    Introduction: High pacing frequency or irregular activity due to arrhythmia produces complex optical mapping signals and challenges for processing. The objective is to establish an automated activation-time-based analytical framework applicable to optical mapping images of complex electrical behavior. Methods: Optical mapping signals with varying complexity from sheep (N = 7) ventricular preparations were examined. Windows of activation centered on each action potential upstroke were derived using Hilbert transform phase. Upstroke morphology was evaluated for potential multiple activation components, and peaks of upstroke signal derivatives defined activation time (AT). Spatially and temporally clustered AT points were grouped into wave fronts for individual processing. Each AT point was evaluated for corresponding repolarization times. Each wave front was subsequently classified based on repetitive or non-repetitive events. Wave fronts were evaluated for AT minima defining sites of wave front origin. A visualization tool was further developed to dynamically probe the ensemble activation sequence. Results: Our framework facilitated activation time mapping during complex dynamic events, including transitions to rotor-like reentry and ventricular fibrillation. We showed that using fixed AT windows to extract AT maps can impair interpretation of the activation sequence, whereas phase windowing of action potential upstrokes enabled accurate recapitulation of repetitive behavior, providing spatially coherent activation patterns. We further demonstrate that grouping the spatio-temporal distribution of AT points into coherent wave fronts facilitated interpretation of isolated conduction events, such as conduction slowing, and the derivation of dynamic changes in repolarization properties. Detected origins precisely located sites of stimulation and breakthrough for individual wave fronts. Furthermore, dynamically probing activation time windows during reentry with the visualization tool revealed a critical single static line of conduction slowing associated with the core of the rotation. Conclusion: This comprehensive analytical framework enables detailed quantitative assessment and visualization of complex electrical behavior.
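The detection step described above can be approximated with standard signal tools: Hilbert-transform phase to open a window around each upstroke, then the peak of the first derivative inside the window as the activation time. A minimal single-pixel sketch (the phase-crossing threshold, window width, and toy trace are illustrative assumptions, not the paper's pipeline):

```python
import numpy as np
from scipy.signal import hilbert

def activation_times(v, fs):
    """Estimate activation times (s) from one optical action-potential trace.
    v: fluorescence trace (1D array), fs: sampling rate in Hz."""
    phase = np.angle(hilbert(v - v.mean()))       # instantaneous phase
    # Open a window where the phase crosses -pi/2 upward (illustrative choice).
    up = np.flatnonzero((phase[:-1] < -np.pi / 2) & (phase[1:] >= -np.pi / 2))
    dv = np.gradient(v) * fs                      # first derivative dV/dt
    times = []
    for c in up:
        w = slice(max(c - int(0.05 * fs), 0), min(c + int(0.05 * fs), len(v)))
        times.append((w.start + np.argmax(dv[w])) / fs)   # peak dV/dt = AT
    return np.array(times)

# Toy usage: two synthetic upstrokes sampled at 1 kHz.
fs = 1000.0
t = np.arange(0, 1.0, 1 / fs)
v = 1 / (1 + np.exp(-(t - 0.2) * 200)) + 1 / (1 + np.exp(-(t - 0.7) * 200))
print(activation_times(v, fs))
```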

    Robust learning algorithms for spiking and rate-based neural networks

    Inspired by the remarkable properties of the human brain, the fields of machine learning, computational neuroscience and neuromorphic engineering have achieved significant synergistic progress in the last decade. Powerful neural network models rooted in machine learning have been proposed as models for neuroscience and for applications in neuromorphic engineering. However, the aspect of robustness is often neglected in these models. Both biological and engineered substrates show diverse imperfections that deteriorate the performance of computational models or even prohibit their implementation. This thesis describes three projects aimed at implementing robust learning with local plasticity rules in neural networks. First, we demonstrate the advantages of neuromorphic computation in a pilot study on a prototype chip. We quantify the speed and energy consumption of the system compared to a software simulation and show how on-chip learning contributes to the robustness of learning. Second, we present an implementation of spike-based Bayesian inference on accelerated neuromorphic hardware. The model copes, via learning, with the disruptive effects of the imperfect substrate and benefits from the acceleration. Finally, we present a robust model of deep reinforcement learning using local learning rules. It shows how backpropagation combined with neuromodulation could be implemented in a biologically plausible framework. The results contribute to the pursuit of robust and powerful learning networks for biological and neuromorphic substrates.
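Local plasticity rules of the kind referred to above update each weight using only quantities available at the synapse, optionally gated by a global neuromodulatory signal. A minimal three-factor sketch (the exact rule here is an illustrative textbook form, not one of the thesis models):

```python
import numpy as np

def three_factor_update(w, pre, post, reward, lr=0.01):
    """Local weight update: a Hebbian eligibility term (pre x post activity)
    gated by a scalar neuromodulatory signal such as a reward prediction error."""
    eligibility = np.outer(post, pre)   # purely local pre/post correlation
    return w + lr * reward * eligibility

# Toy usage: a 4 -> 3 layer; positive reward strengthens co-active synapses.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(3, 4))
pre, post = rng.random(4), rng.random(3)
w = three_factor_update(w, pre, post, reward=+1.0)
```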

    Advanced algorithms for audio and image processing

    The objective of the thesis is the development of a set of innovative algorithms around the topic of beamforming in the fields of acoustic imaging, audio and image processing, aimed at significantly improving the performance of devices that exploit these computational approaches. The context is therefore the improvement of devices (ultrasound machines and video/audio devices) already on the market, or the development of new ones which, through the proposed studies, can be introduced to new markets with the launch of innovative high-tech start-ups. This is the motivation and leitmotiv behind the doctoral work carried out. In the first part of the work, an innovative image reconstruction algorithm in the field of ultrasound biomedical imaging is presented, connected to the development of equipment that exploits the computing power currently offered at low cost by GPUs (Moore's law). The proposed target is a new image reconstruction pipeline that abandons the traditional hardware-based architecture in favour of a software-based device, using image reconstruction algorithms processed in the frequency domain. An innovative beamforming algorithm based on seismic migration is presented, in which a transformation of the RF data is carried out and the reconstruction algorithm can evaluate a masking of the k-space of the data, speeding up the reconstruction process and reducing the computational burden. The analysis and development of the algorithms was approached from a feasibility standpoint in an off-line context on the Matlab platform, processing both synthetic simulated data and real RF data; the subsequent development of these algorithms within future ultrasound biomedical equipment will exploit a high-performance computing framework capable of processing customized kernel pipelines (henceforth called 'filters') on CPU/GPU. The filters implemented concern Plane Wave Imaging (PWI), an alternative method of acquiring the ultrasound image compared to the traditional standard B-mode, which exploits a sequential series of insonifications of the sample under examination through focused beams transmitted by the probe channels. The PWI mode is interesting and opens up new scenarios compared to the usual signal acquisition and processing techniques, with the aim of making signal processing in general, and image reconstruction in particular, faster and more flexible; the resulting large increase in frame rate opens up and improves clinical applications. The innovative idea is to introduce into an offline seismic reconstruction algorithm for ultrasound imaging a further filter, named the masking matrix. The masking matrices can be computed offline from the system parameters, since they do not depend on the acquired data. Moreover, they can be pre-multiplied with the propagation matrices without affecting the overall computational load.
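The masking idea above multiplies the spectral representation of the RF data by a matrix precomputed offline, so that only the relevant region of k-space propagates through the migration step. A minimal sketch (the mask geometry and data shapes are illustrative assumptions, not the thesis implementation):

```python
import numpy as np

def apply_kspace_mask(rf, mask):
    """Mask the 2D spectrum of RF data; the mask is precomputed offline from
    system parameters, so it does not depend on the acquired data."""
    spectrum = np.fft.fft2(rf)
    return np.fft.ifft2(spectrum * mask).real

# Toy usage: keep only low axial/lateral spatial frequencies.
rf = np.random.default_rng(0).normal(size=(256, 128))   # samples x channels
kz = np.fft.fftfreq(256)[:, None]
kx = np.fft.fftfreq(128)[None, :]
mask = (np.abs(kz) < 0.2) & (np.abs(kx) < 0.2)
filtered = apply_kspace_mask(rf, mask)
```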
Subsequently, the thesis addresses the topic of beamforming in audio processing on superdirective linear arrays of microphones. The aim is an in-depth analysis of two main families of data-independent approaches and algorithms in the literature, comparing their performance and the trade-off between directivity and frequency invariance, which is not yet characterized in the state of the art. The goal is to identify the best algorithm and, from an implementation perspective, to experimentally verify its performance, correlating it with sensor characteristics and error statistics. Frequency-invariant beam patterns are often required by systems that use an array of sensors to process broadband signals. In some experimental conditions, the array's spatial aperture is shorter than the involved wavelengths; in these conditions, superdirective beamforming is essential for an efficient system. I present a comparison between two methods for designing a data-independent beamformer based on a filter-and-sum structure. Both methods (the first numerical, the second analytic) formulate a convex minimization problem in which the variables to be optimized are the filter coefficients or frequency responses. In the described simulations, I chose a geometry and a set of parameters that allow a fair comparison between the performance of the two design methods. In particular, I addressed a small linear array for audio capture with different purposes (hearing aids, audio surveillance, video-conferencing, multimedia devices, etc.). The research activity carried out has been used for the launch of a high-tech device through an innovative start-up in the field of glasses/audio devices (https://acoesis.com/en/). The proposed algorithm has been shown to achieve higher performance than similar state-of-the-art algorithms, while also relating directivity (more precisely, generalized directivity) to the statistics of the sensors' phase and gain errors, which is extremely important for superdirective arrays in real, industrial implementations. The method selected by the comparison is therefore innovative because it quantitatively links the physical construction characteristics of the array to measurable, experimentally verifiable quantities, making the real implementation process controllable.
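A filter-and-sum beamformer applies one filter per microphone and sums the outputs; a superdirective design can be obtained, frequency by frequency, from a regularized solve against the diffuse-noise coherence matrix, where the regularization trades directivity against robustness to sensor gain/phase errors. A minimal narrowband sketch (the endfire geometry, diagonal loading, and parameters are illustrative assumptions, not the thesis design):

```python
import numpy as np

def superdirective_weights(f, d, n_mics, c=343.0, mu=1e-2):
    """Narrowband superdirective weights for an endfire linear array:
    w = Gamma^-1 v / (v^H Gamma^-1 v), with diagonal loading mu for
    robustness against sensor gain/phase errors."""
    pos = np.arange(n_mics) * d                      # mic positions [m]
    v = np.exp(-2j * np.pi * f * pos / c)            # endfire steering vector
    # Diffuse-field coherence between mics i, j: sinc(2 f |x_i - x_j| / c).
    Gamma = np.sinc(2 * f * np.abs(pos[:, None] - pos[None, :]) / c)
    w = np.linalg.solve(Gamma + mu * np.eye(n_mics), v)
    return w / (v.conj() @ w)                        # distortionless at endfire

# Toy usage: 4 mics, 2 cm spacing, 1 kHz (aperture much less than wavelength).
w = superdirective_weights(1000.0, 0.02, 4)
```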
The third topic faced is the reconstruction of the Room Impulse Response (RIR) using blind audio processing methods. Given an unknown audio source, the estimation of time-differences-of-arrival (TDOAs) can be solved efficiently and robustly using blind channel identification and exploiting the cross-correlation identity (CCI). Prior blind works have improved the estimate of TDOAs by means of different algorithmic solutions and optimization strategies, while always sticking to the case of N = 2 microphones. But what if a direct improvement in performance can be obtained by simply increasing N? The fourth chapter investigates this direction, showing that, despite its arguable simplicity, it sharply improves upon state-of-the-art CCI-based blind channel identification methods without modifying the computational pipeline. Inspired by these results, we seek to motivate the community and practitioners by paving the way (with two concrete, yet preliminary, examples) towards joint approaches in which advances in optimization are combined with an increased number of microphones to achieve further improvements. Sound source localisation applications can be tackled by inferring the TDOAs between a sound-emitting source and a set of microphones; such applications include room-aware sound reproduction, room geometry estimation, and speech enhancement. While a broad spectrum of prior works estimates TDOAs from a known audio source, even when the signal emitted from the acoustic source is unknown, TDOAs can be inferred by comparing the signals received at two (or more) spatially separated microphones, using the notion of the cross-correlation identity (CCI). This is the key theoretical tool, not only to make the ordering of microphones irrelevant during the acquisition stage, but also to cast the problem as blind channel identification, robustly and reliably inferring TDOAs from an unknown audio source. However, in natural environments such "mutual agreement" between microphones can be tampered with by a variety of audio ambiguities such as ambient noise. Furthermore, each observed signal may contain multiple distorted or delayed replicas of the emitted source due to reflections or generic boundary effects related to the (closed) environment. Robustly estimating TDOAs is therefore a challenging problem, and CCI-based approaches cast it as single-input/multi-output blind channel identification. Such methods promote robustness of the estimate from the methodological standpoint, using energy-based regularization, sparsity, or positivity constraints, while also pre-conditioning the solution space.
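As a simple point of reference for the TDOA task above, a minimal (non-blind) baseline estimates the delay between two microphones with generalized cross-correlation and PHAT weighting; the CCI-based blind channel identification studied in the thesis is considerably more involved. A sketch (sampling rate and signals are illustrative):

```python
import numpy as np

def gcc_phat_tdoa(x1, x2, fs):
    """Estimate the TDOA (seconds) between two microphone signals via GCC-PHAT."""
    n = len(x1) + len(x2)
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    cross = X1 * np.conj(X2)
    cc = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n)     # PHAT whitening
    cc = np.concatenate((cc[-(n // 2):], cc[: n // 2 + 1]))   # center zero lag
    return (np.argmax(cc) - n // 2) / fs

# Toy usage: mic 1 receives the same noise as mic 2, delayed by 25 samples.
rng = np.random.default_rng(0)
s = rng.normal(size=4096)
print(gcc_phat_tdoa(np.roll(s, 25), s, fs=16000.0))  # ~ 25 / 16000 s
```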
Last but not least, acoustic imaging is a modality that exploits the propagation of acoustic waves in a medium to recover the spatial distribution and intensity of sound sources in a given region. Well-known and widespread acoustic imaging applications are, for example, sonar and ultrasound. There are active and passive imaging devices: in the context of this thesis I consider a passive imaging system called Dual Cam, which does not emit any sound but acquires it from the environment. In an acoustic image, each pixel corresponds to the sound intensity of a source whose position is described by a particular pair of angles and, when the beamformer can work in near-field as in our case, by the distance on which the system is focused. In the last part of this work I propose the use of a new modality characterized by a richer information content, namely acoustic images, for the sake of audio-visual scene understanding. Each pixel in such images is characterized by a spectral signature, associated to a specific direction in space and obtained by processing the audio signals coming from an array of microphones. By coupling such an array with a video camera, we obtain spatio-temporal alignment of acoustic images and video frames. This constitutes a powerful source of self-supervision, which can be exploited in the proposed learning pipeline without resorting to expensive data annotations. However, since 2D planar arrays are cumbersome and not as widespread as ordinary microphones, we propose that the richer information content of acoustic images can be distilled, through a self-supervised learning scheme, into more powerful audio and visual feature representations. The learnt feature representations can then be employed for downstream tasks such as classification and cross-modal retrieval, without the need of a microphone array. To prove this, we introduce a novel multimodal dataset consisting of RGB videos, raw audio signals and acoustic images, aligned in space and synchronized in time. Experimental results demonstrate the validity of our hypothesis and the effectiveness of the proposed pipeline, also when tested on tasks and datasets different from those used for training. Chapter 6 closes the thesis, presenting the development of a new Dual Cam proof of concept as the basis for a spin-off, assuming an application to an innovation project for hi-tech start-ups (such as an SME Instrument H2020) for a 50k-euro grant, following the idea of technology transfer. A deep analysis of the reference market, technologies and commercial competitors, business model, and freedom to operate (FTO) of the intellectual property is then conducted. Finally, following the latest technological trends (https://www.flir.eu/products/si124/), a new version of the device (a planar audio array) with reduced dimensions and improved technical characteristics is simulated, simpler and easier to use than the current one, opening up new and interesting possibilities of development, not only technical and scientific but also in terms of business fallout.

    Scene representation and matching for visual localization in hybrid camera scenarios

    Scene representation and matching are crucial steps in a variety of tasks ranging from 3D reconstruction to virtual/augmented/mixed reality applications, to robotics, and others. While approaches exist that tackle these tasks, they mostly overlook the issue of efficiency in the scene representation, which is fundamental in resource-constrained systems and for increasing computing speed. Also, they normally assume the use of projective cameras, while performance on systems based on other camera geometries remains suboptimal. This dissertation contributes a new efficient scene representation method that dramatically reduces the number of 3D points. The approach sets up an optimization problem for the automated selection of the most relevant points to retain. This leads to a constrained quadratic program, which is solved optimally with a newly introduced variant of the sequential minimal optimization method. In addition, a new initialization approach is introduced for fast convergence of the method. Extensive experimentation on public benchmark datasets demonstrates that the approach produces a compressed scene representation quickly while delivering accurate pose estimates. The dissertation also contributes new methods for scene matching that go beyond the use of projective cameras. Alternative camera geometries, like fisheye cameras, produce images with very high distortion, making current image feature point detectors and descriptors less effective, since they were designed for projective cameras. New methods based on deep learning are introduced to address this problem, with feature detectors and descriptors that overcome distortion effects and more effectively perform feature matching between pairs of fisheye images, as well as between hybrid pairs of fisheye and perspective images. Due to the limited availability of fisheye-perspective image datasets, three datasets were collected for training and testing the methods. The results demonstrate increased detection and matching rates that outperform current state-of-the-art methods.
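The point-selection step described above can be phrased generically as a small constrained quadratic program. A minimal sketch with an assumed objective (the redundancy matrix Q and coverage weights b are illustrative, not the dissertation's exact formulation), solved here with a general-purpose scipy solver rather than the SMO variant the dissertation introduces:

```python
import numpy as np
from scipy.optimize import minimize

def select_points(Q, b, k):
    """Relaxed QP: min x^T Q x - b^T x  s.t.  sum(x) = k, 0 <= x <= 1.
    Q penalizes selecting redundant (co-visible) points; b rewards coverage."""
    n = len(b)
    res = minimize(
        lambda x: x @ Q @ x - b @ x,
        x0=np.full(n, k / n),                 # feasible uniform start
        jac=lambda x: 2 * Q @ x - b,
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda x: x.sum() - k}],
    )
    return np.argsort(res.x)[-k:]             # keep the k highest scores

# Toy usage: 50 candidate 3D points, retain 10.
rng = np.random.default_rng(0)
A = rng.random((50, 50))
Q = A @ A.T / 50                              # PSD redundancy matrix
keep = select_points(Q, rng.random(50), k=10)
```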