
    Multitrack Detection for Magnetic Recording

    This thesis develops advanced signal processing algorithms for magnetic recording to increase areal density. The exploding demand for cloud storage is motivating a push for higher areal densities, with narrower track pitches and shorter bit lengths. The resulting increase in interference and media noise requires improvements in read channel signal processing to keep pace. The thesis proposes the multitrack pattern-dependent noise-prediction algorithm as a solution to the joint maximum-likelihood multitrack detection problem in the face of pattern-dependent autoregressive Gaussian noise. The magnetic recording read channel has numerous parameters that must be carefully tuned for best performance, including not only the equalizer coefficients but also any parameters inside the detector. The thesis proposes two new tuning strategies: one minimizes the bit-error rate after detection, and the other minimizes the frame-error rate after error-control decoding. Furthermore, the thesis designs a neural network read channel architecture and compares its performance and complexity with those of the traditional signal processing techniques.
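The bit-error-rate tuning strategy described above can be illustrated with a toy sketch: grid-search a single hypothetical detector parameter (a decision threshold `theta`) to minimize simulated BER. The channel model, the parameter, and the function names below are invented for illustration and are not taken from the thesis.

```python
import random

random.seed(0)

def ber_for_threshold(theta, n_bits=5000, noise=0.4):
    # Hypothetical toy channel: bipolar bits plus Gaussian noise, detected
    # by comparing the received sample against a tunable threshold `theta`.
    errors = 0
    for _ in range(n_bits):
        bit = random.choice([0, 1])
        rx = (1.0 if bit else -1.0) + random.gauss(0.0, noise)
        detected = 1 if rx > theta else 0
        errors += (detected != bit)
    return errors / n_bits

# BER-driven tuning: sweep the detector parameter and keep the minimizer.
grid = [i / 10 - 1.0 for i in range(21)]  # thresholds in [-1.0, 1.0]
best_theta = min(grid, key=ber_for_threshold)
```

For this symmetric toy channel the search settles near a zero threshold; a real read channel would tune many coupled parameters (equalizer taps, detector internals) against measured BER or post-decoding FER instead.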

    Separation and Estimation of the Number of Audio Signal Sources with Overlap in Time and Frequency

    Everyday audio recordings involve mixture signals: music contains a mixture of instruments; in a meeting or conference, there is a mixture of human voices. For these mixtures, automatically separating or estimating the number of sources is a challenging task. A common assumption when processing mixtures in the time-frequency domain is that sources are not fully overlapped. In this work, however, we consider cases where the overlap is severe, for instance when instruments play the same note (unison) or when many people speak concurrently ("cocktail party"), highlighting the need for new representations and more powerful models. To address the problems of source separation and count estimation, we use conventional signal processing techniques as well as deep neural networks (DNNs). We first address the source separation problem for unison instrument mixtures, studying the distinct spectro-temporal modulations caused by vibrato. To exploit these modulations, we developed a method based on time warping, informed by an estimate of the fundamental frequency. For cases where such estimates are not available, we present an unsupervised model, inspired by the way humans group time-varying sources (common fate). This contribution comes with a novel representation that not only improves separation for overlapped and modulated sources in unison mixtures but also improves vocal and accompaniment separation when used as input to a DNN model. We then focus on estimating the number of sources in a mixture, which is important for real-world scenarios. Our work on count estimation was motivated by a study of how humans address this task, which led us to conduct listening experiments confirming that humans can correctly estimate the number of sources only up to four. To answer the question of whether machines can perform similarly, we present a DNN architecture trained to estimate the number of concurrent speakers.
    Our results show improvements compared to other methods, and the model even outperformed humans on the same task. In both the source separation and the source count estimation tasks, the key contribution of this thesis is the concept of "modulation", which is important for computationally mimicking human performance. Our proposed Common Fate Transform is an adequate representation to disentangle overlapping signals for separation, and an inspection of our DNN count estimation model revealed that it finds modulation-like intermediate features.
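The modulation cue underlying the common-fate idea can be sketched numerically: a Fourier transform taken along the time axis of each spectrogram bin exposes how that bin is modulated, and bins sharing the same modulation pattern tend to belong to one source. The helper names and the toy spectrogram below are illustrative assumptions, not code from the thesis.

```python
import cmath
import math

def dft(xs):
    # Naive DFT, adequate for a short modulation envelope.
    n = len(xs)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(xs)) for k in range(n)]

def modulation_spectrum(spectrogram):
    # spectrogram: one list per frequency bin, holding that bin's magnitude
    # over time. A DFT along time exposes how each bin is modulated, which
    # is the cue a common-fate grouping exploits.
    return [[abs(c) for c in dft(row)] for row in spectrogram]

# A bin modulated at 1 cycle per 8 frames concentrates its energy in
# modulation bin 1 (plus the DC offset in bin 0).
row = [1.0 + math.cos(2 * math.pi * t / 8) for t in range(8)]
spec = modulation_spectrum([row])
```

The Common Fate Transform proper applies a second 2-D transform over spectrogram patches; this per-bin version only conveys the core intuition.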

    Tracking tracer motion in a 4-D electrical resistivity tomography experiment

    A new framework for automatically tracking subsurface tracers in electrical resistivity tomography (ERT) monitoring images is presented. Using computer vision and Bayesian inference techniques, in the form of a Kalman filter, the trajectory of a subsurface tracer is monitored by predicting and updating a state model representing its movements. Observations for the Kalman filter are gathered using the maximally stable volumes algorithm, which dynamically thresholds local regions of an ERT image sequence to detect the tracer at each time step. The application of the framework to 2-D and 3-D tracer monitoring experiments shows that the proposed method is effective for detecting and tracking tracer plumes in ERT images in the presence of noise, without intermediate manual intervention.
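The predict/update cycle of the Kalman filter described above can be sketched in scalar form. This is a deliberately simplified 1-D constant-velocity model with assumed noise parameters; the actual framework operates on 2-D/3-D ERT image sequences with full state and covariance matrices.

```python
def kalman_step(x, v, p, z, q=0.01, r=0.25, dt=1.0):
    # Predict: advance position x by velocity v, inflate variance p by
    # process noise q.
    x_pred = x + v * dt
    p_pred = p + q
    # Update: blend the prediction with observation z via the Kalman gain k
    # (r is the observation-noise variance).
    k = p_pred / (p_pred + r)
    innovation = z - x_pred
    x_new = x_pred + k * innovation
    v_new = v + k * innovation / dt
    p_new = (1.0 - k) * p_pred
    return x_new, v_new, p_new

# Track a tracer front drifting one unit per step, observed without noise.
x, v, p = 0.0, 1.0, 1.0
for step in range(1, 6):
    x, v, p = kalman_step(x, v, p, z=float(step))
```

With a correct motion model the innovation stays at zero, the estimate follows the front exactly, and the variance shrinks; with noisy detections the gain trades off prediction against observation.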

    ATC Trajectory Reconstruction for Automated Evaluation of Sensor and Tracker Performance

    Currently, most air traffic controller decisions are based on the information provided by ground support tools in automation systems, which rely on a network of surveillance sensors and an associated tracker. To guarantee surveillance integrity, performance assessments of the different elements of the surveillance system are necessary. The surveillance processing chain has grown considerably more complex in the recent past through the integration of new sensor types (e.g., automatic dependent surveillance-broadcast [ADS-B], Mode S radars, and wide area multilateration [WAM]), data link applications, and networking technologies. With these new sensors, there is a need for system-level performance evaluations as well as methods for assessing each component of the tracking chain.

    This work was funded by EUROCONTROL's TRES contract, by the Spanish Ministry of Economy and Competitiveness under grants CICYT TEC2008-06732/TEC and CICYT TEC2011-28626, and by the Government of Madrid under grant S2009/TIC-1485 (CONTEXTS).
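One elementary metric of the kind such an assessment needs is the RMS horizontal position error between a tracker's output and a reconstructed reference trajectory, sampled at common timestamps. The function below is a hypothetical sketch of that idea, not the evaluation methodology used in this work.

```python
import math

def position_rmse(tracked, reference):
    # RMS horizontal error between tracker output and a reconstructed
    # reference trajectory; both are lists of (x, y) positions sampled
    # at the same timestamps.
    assert len(tracked) == len(reference)
    sq = [(tx - rx) ** 2 + (ty - ry) ** 2
          for (tx, ty), (rx, ry) in zip(tracked, reference)]
    return math.sqrt(sum(sq) / len(sq))

# A tracker output with a constant (0.3, 0.4) offset from the reference
# has an RMS error of 0.5 per sample.
ref = [(float(t), 0.0) for t in range(5)]
trk = [(t + 0.3, 0.4) for t in range(5)]
err = position_rmse(trk, ref)
```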

    Investigating Proton Spin Structure: A Measurement of G_2^p at Low Q^2

    The g2p collaboration performed the first measurement of the reaction p⃗(e⃗, e′)X in the kinematic range 0.02 < Q² < 0.2 GeV² in the resonance region. Experiment E08-027 took place in Hall A at the Thomas Jefferson National Accelerator Facility from March to May of 2012. Data were taken with a longitudinally polarized electron beam, using an NH₃ target polarized in both parallel and perpendicular configurations. Very preliminary results for g₁ᵖ and g₂ᵖ are shown in this thesis. To extract the spin structure functions, asymmetries are calculated from data taken with a 2.2 GeV electron beam and a 5 T target field, and combined with the Bosted model proton cross section. Preliminary dilution factors and preliminary radiative corrections are included in the asymmetry analysis. Sum rules and chiral perturbation theory (χPT) allow us to test the Burkhardt-Cottingham (BC) sum rule and obtain the spin polarizability quantities γ₀ and ÎŽ_LT. The BC sum rule, valid for all values of Q², states that the integral of g₂ over all Bjorken x vanishes. The very preliminary result presented here shows the contribution to the integral from the measured kinematic region. Although the contribution from the resonance region is not consistent with the expected result of zero, an extrapolation to high and low x must be included to test whether the BC sum rule is satisfied. The difficulty in χPT calculations of γ₀ and ÎŽ_LT is how to include the resonance contributions, particularly the dominant Δ resonance. Recent developments have found better agreement with neutron experimental results; however, there is little proton data to compare with the calculations, particularly at low Q². The very preliminary results shown here do not agree with any of the current χPT predictions. However, as this is only the contribution from the measured kinematic region, the extrapolation outside the resonance region must be included before a stronger conclusion can be drawn. Further analysis is ongoing, and preliminary results, including a cross section extracted from data instead of a model prediction, are expected within the next year.
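The BC sum rule referred to above can be stated compactly as:

```latex
\int_0^1 g_2(x, Q^2)\, \mathrm{d}x = 0 \qquad \text{for all } Q^2 ,
```

which is why the measured resonance-region contribution alone cannot confirm or refute the rule: the unmeasured high- and low-x regions must be extrapolated and added before the full integral can be compared with zero.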

    Final Research Report on Auto-Tagging of Music

    Deliverable D4.7 reports the work achieved by IRCAM up to M36 on the "auto-tagging of music". The deliverable is a research report. The software libraries resulting from the research have been integrated into the Fincons/HearDis! Music Library Manager or are used by TU Berlin; the final software libraries are described in D4.5. The research work on auto-tagging has concentrated on four aspects. 1) Further improving IRCAM's machine-learning system, ircamclass, by developing the new MASSS audio features and by adding audio augmentation and audio segmentation to ircamclass. The system has then been applied to train the HearDis! "soft" features (Vocals-1, Vocals-2, Pop-Appeal, Intensity, Instrumentation, Timbre, Genre, Style). This is described in Part 3. 2) Developing two sets of "hard" features (i.e., features related to musical or musicological concepts) as specified by HearDis! (for integration into the Fincons/HearDis! Music Library Manager) and TU Berlin (as input for the prediction model of the GMBI attributes). Such features are either derived from previously estimated higher-level concepts (such as structure, key, or chord succession) or obtained by developing new signal processing algorithms (such as harmonic/percussive source separation, HPSS, or main melody estimation). This is described in Part 4. 3) Developing audio features that characterize the audio quality of a music track. The goal is to describe the quality of the audio independently of its apparent encoding; this is then used to estimate audio degradation or the decade of a recording, and to ensure that playlists contain tracks of similar audio quality. This is described in Part 5. 4) Developing innovative algorithms to extract specific audio features to improve music mixes.
    So far, innovative techniques (based on various blind audio source separation algorithms and convolutional neural networks) have been developed for singing voice separation, singing voice segmentation, music structure boundary estimation, and DJ cue-region estimation. This is described in Part 6.

    EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC D
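The harmonic/percussive separation (HPSS) mentioned above is classically done with median filtering: harmonic energy is smooth along time, percussive energy is smooth along frequency (a Fitzgerald-style approach). The toy implementation below is an illustrative sketch on a plain list-of-lists spectrogram, not IRCAM's actual algorithm.

```python
from statistics import median

def hpss_harmonic_mask(spec, k=3):
    # spec: magnitude spectrogram as a list of frequency-bin rows, each a
    # list over time frames. Returns a binary mask that is True where the
    # harmonic (time-smooth) component dominates the percussive
    # (frequency-smooth) one.
    n_bins, n_frames = len(spec), len(spec[0])

    def med_time(f, t):  # median over a time window in one frequency bin
        lo, hi = max(0, t - k), min(n_frames, t + k + 1)
        return median(spec[f][lo:hi])

    def med_freq(f, t):  # median over a frequency window in one frame
        lo, hi = max(0, f - k), min(n_bins, f + k + 1)
        return median(spec[b][t] for b in range(lo, hi))

    return [[med_time(f, t) >= med_freq(f, t) for t in range(n_frames)]
            for f in range(n_bins)]

# Toy spectrogram: a horizontal ridge (harmonic tone) crossing a vertical
# spike (percussive hit).
spec = [[0.0] * 7 for _ in range(7)]
for t in range(7):
    spec[3][t] = 1.0
for f in range(7):
    spec[f][3] = 1.0
mask = hpss_harmonic_mask(spec)
```

Applying the mask (and its complement) to the complex spectrogram before inverting yields the two separated signals; production systems use soft Wiener-style masks rather than this hard threshold.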

    Geometric deep learning reveals the spatiotemporal fingerprint of microscopic motion

    The characterization of dynamical processes in living systems provides important clues for their mechanistic interpretation and their link to biological functions. Thanks to recent advances in microscopy techniques, it is now possible to routinely record the motion of cells, organelles, and individual molecules at multiple spatiotemporal scales under physiological conditions. However, the automated analysis of dynamics occurring in crowded and complex environments still lags behind the acquisition of microscopic image sequences. Here, we present a framework based on geometric deep learning that achieves accurate estimation of dynamical properties in various biologically relevant scenarios. This deep-learning approach relies on a graph neural network enhanced by attention-based components. By processing object features with geometric priors, the network is capable of performing multiple tasks, from linking coordinates into trajectories to inferring local and global dynamic properties. We demonstrate the flexibility and reliability of this approach by applying it to real and simulated data corresponding to a broad range of biological experiments.
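A classical baseline for the linking task that the graph neural network learns end-to-end is greedy frame-to-frame nearest-neighbour association. The sketch below is illustrative only (function and variable names are assumptions), and it fails in exactly the crowded scenarios the paper targets, which is what motivates a learned approach.

```python
import math

def link_frames(detections):
    # detections: a list of frames, each a list of (x, y) coordinates.
    # Greedily extends each track with the nearest unclaimed detection
    # in the next frame.
    tracks = [[p] for p in detections[0]]
    for frame in detections[1:]:
        unused = list(frame)
        for track in tracks:
            if not unused:
                break
            last = track[-1]
            nearest = min(unused, key=lambda p: math.dist(p, last))
            track.append(nearest)
            unused.remove(nearest)
    return tracks

# Two well-separated particles moving right and left: greedy linking
# keeps their identities consistent across frames.
frames = [[(0.0, 0.0), (10.0, 0.0)],
          [(1.0, 0.0), (9.0, 0.0)],
          [(2.0, 0.0), (8.0, 0.0)]]
tracks = link_frames(frames)
```

When trajectories cross or detections are dense, greedy matching swaps identities; the attention-based graph network instead scores candidate links jointly over the whole sequence.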
