15 research outputs found
Tool wear monitoring in milling using aZIBO shape descriptor
In this paper, a tool wear monitoring process is carried out to determine the wear condition and to ensure the optimal use of tools before their replacement during metal machining operations. The dataset consists of 53 cutting inserts. All of them were preprocessed and the edge wear was segmented, yielding 212 established edges. To describe the wear shape, the aZIBO shape descriptor was used and its results were compared with two classical descriptors, Hu and Flusser. Classification was carried out using kNN with 1, 3, 5, 7, 9 and 11 neighbors and six distances: Cosine, Euclidean, IntersectDist, ChiSquare, SqDist and Cityblock.
Two classifications were carried out: one with three different classes (low, medium and high wear: L, M and H, respectively) and the other with only two classes: low (L) and high (H). The aZIBO descriptor delivers better results than the classical ones, with hit rates of 60.84% and 81.13% using the L-M-H and L-H labels, respectively.
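As a rough illustration of the evaluation protocol, the sketch below runs kNN over the listed neighborhood sizes and distances using scikit-learn. The descriptor matrix and labels are random placeholders, and the definitions of IntersectDist and SqDist are assumptions (histogram intersection turned into a distance, and squared Euclidean); this is not the authors' code.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def chi_square(a, b):
    # Chi-square distance between non-negative descriptor vectors.
    return 0.5 * np.sum((a - b) ** 2 / (a + b + 1e-12))

def intersect_dist(a, b):
    # Histogram intersection turned into a distance (assumed definition).
    return 1.0 - np.sum(np.minimum(a, b)) / max(np.sum(a), 1e-12)

rng = np.random.default_rng(0)
X = rng.random((212, 32))              # placeholder: 212 wear-edge descriptors
y = rng.choice(list("LMH"), size=212)  # placeholder: L/M/H wear labels

metrics = ["cosine", "euclidean", "cityblock", "sqeuclidean",  # SqDist stand-in
           chi_square, intersect_dist]
for k in (1, 3, 5, 7, 9, 11):
    for metric in metrics:
        clf = KNeighborsClassifier(n_neighbors=k, metric=metric, algorithm="brute")
        acc = cross_val_score(clf, X, y, cv=5).mean()
        name = metric if isinstance(metric, str) else metric.__name__
        print(f"k={k:2d}  {name:>14}: {acc:.3f}")
```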
Image segmentation and reconstruction of 3D surfaces from carotid ultrasound images
Doctoral thesis. Electrical and Computer Engineering. Faculdade de Engenharia, Universidade do Porto. 200
Progressive Wasserstein Barycenters of Persistence Diagrams
This paper presents an efficient algorithm for the progressive approximation
of Wasserstein barycenters of persistence diagrams, with applications to the
visual analysis of ensemble data. Given a set of scalar fields, our approach
enables the computation of a persistence diagram which is representative of the
set, and which visually conveys the number, data ranges and saliences of the
main features of interest found in the set. Such representative diagrams are
obtained by computing explicitly the discrete Wasserstein barycenter of the set
of persistence diagrams, a notoriously computationally intensive task. In
particular, we revisit efficient algorithms for Wasserstein distance
approximation [12,51] to extend previous work on barycenter estimation [94]. We
present a new fast algorithm, which progressively approximates the barycenter
by iteratively increasing the computation accuracy as well as the number of
persistent features in the output diagram. Such progressivity drastically
improves convergence in practice and allows the design of an interruptible
algorithm capable of respecting computation time constraints. This enables the
approximation of Wasserstein barycenters within interactive times. We present
an application to ensemble clustering where we revisit the k-means algorithm to
exploit our barycenters and compute, within execution time constraints,
meaningful clusters of ensemble data along with their barycenter diagram.
Extensive experiments on synthetic and real-life data sets report that our
algorithm converges to barycenters that are qualitatively meaningful with
regard to the applications, and quantitatively comparable to previous
techniques, while offering an order of magnitude speedup when run until
convergence (without time constraint). Our algorithm can be trivially
parallelized to provide additional speedups in practice on standard
workstations. [...]
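The building block of such a computation is the Wasserstein distance between two persistence diagrams, which reduces to an optimal assignment problem once each point may also be matched to its diagonal projection. Below is a minimal sketch of the exact (non-progressive) 2-Wasserstein distance between small diagrams given as (birth, death) arrays; the paper's algorithm replaces this exact solve with faster progressive approximations [12,51].

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def w2(D1, D2):
    """Exact 2-Wasserstein distance between two persistence diagrams,
    each given as an (n, 2) array of (birth, death) pairs."""
    n, m = len(D1), len(D2)
    def sq(a, b):
        return ((a - b) ** 2).sum(-1)
    # Diagonal projections: (b, d) -> ((b+d)/2, (b+d)/2).
    p1 = np.stack([D1.mean(1)] * 2, axis=1)
    p2 = np.stack([D2.mean(1)] * 2, axis=1)
    # Augmented cost matrix: a point matches a point of the other diagram
    # or its own diagonal projection; diagonal-to-diagonal is free.
    C = np.full((n + m, n + m), np.inf)
    C[:n, :m] = sq(D1[:, None, :], D2[None, :, :])
    C[np.arange(n), m + np.arange(n)] = sq(D1, p1)
    C[n + np.arange(m), np.arange(m)] = sq(D2, p2)
    C[n:, m:] = 0.0
    row, col = linear_sum_assignment(C)
    return np.sqrt(C[row, col].sum())

D1 = np.array([[0.0, 1.0], [0.2, 0.5]])
D2 = np.array([[0.1, 0.9]])
print(w2(D1, D2))
```

The barycenter of a set of diagrams is then the diagram minimizing the sum of squared such distances to the set, which is what the progressive algorithm approximates under a time budget.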
From heuristics-based to data-driven audio melody extraction
The identification of the melody from a music recording is a relatively easy task for humans, but very challenging for computational systems. This task is known as "audio melody extraction", more formally defined as the automatic estimation of the pitch sequence of the melody directly from the audio signal of a polyphonic music recording. This thesis investigates the benefits of exploiting knowledge automatically derived from data for audio melody extraction, by combining digital signal processing and machine learning methods. We extend the scope of melody extraction research by working with a varied dataset and multiple definitions of melody. We first present an overview of the state of the art, and perform an evaluation focused on a novel symphonic music dataset. We then propose melody extraction methods based on a source-filter model and pitch contour characterisation and evaluate them on a wide range of music genres. Finally, we explore novel timbre, tonal and spatial features for contour characterisation, and propose a method for estimating multiple melodic lines. The combination of supervised and unsupervised approaches leads to advancements in melody extraction and shows a promising path for future research and applications.
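For context, the monophonic case that the field considers essentially solved can be sketched in a few lines with pYIN as implemented in librosa; the polyphonic melody-extraction methods proposed in the thesis go well beyond this, and the file path below is a placeholder.

```python
import librosa

y, sr = librosa.load("recording.wav")  # placeholder path
f0, voiced_flag, voiced_prob = librosa.pyin(
    y,
    fmin=librosa.note_to_hz("C2"),
    fmax=librosa.note_to_hz("C6"),
    sr=sr,
)
times = librosa.times_like(f0, sr=sr)
# Print the start of the estimated pitch sequence.
for t, f, v in zip(times[:10], f0[:10], voiced_flag[:10]):
    print(f"{t:6.3f}s  " + (f"{f:7.1f} Hz" if v else "unvoiced"))
```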
2D and 3D digital shape modelling strategies
Image segmentation of organs in medical images using model-based approaches requires a priori information which is often given by manually tagging landmarks on a training set of shapes. This is a tedious, time-consuming, and error-prone task. To overcome some of these drawbacks, several automatic methods were devised. Identification of the same homologous set of points in a training set of object shapes is the most crucial step in Active Shape Modelling, and it has encountered several challenges. The most important among these are: (C1) defining and characterizing landmarks; (C2) obtaining landmarks at the desired level of detail; (C3) ensuring homology; (C4) generalizing to n>2 dimensions; (C5) achieving practical computations. This thesis proposes several novel modelling techniques attempting to meet C1-C5. In this process, this thesis makes the following key contributions: the concept of local scale for shapes; the idea of allowing level of detail for selecting landmarks; the concept of equalization of shape variance for selecting landmarks; the idea of recursively subdividing shapes and letting the sub-shapes guide landmark selection, which is a very general n-dimensional strategy; the idea of virtual landmarks, which may be situated anywhere relative to, not necessarily on, the shape boundary; a new compactness measure that considers both the number of landmarks and the number of modes selected as independent variables.
The first of three methods uses the c-scale shape descriptor, based on the new concept of curvature-scale, to automatically locate mathematical landmarks on the mean of the training shapes. The landmarks are propagated to the training shapes to establish correspondence among shapes. Since all shapes of the same family do not necessarily present exactly the same shape features, another novel method was devised that takes into account the real shape variability existing in the training set and that is guided by the strategy of equalization of the variance observed in the training set for selecting landmarks. By incorporating the above basic concepts into modelling, a third family of methods with numerous possibilities was developed, taking into account shape features and the variability among shapes, while being easily generalized to the 3D space. Its output is multi-resolutional, allowing landmark selection at any lower resolution trivially as a subset of those found at a higher resolution. The best strategy to use within the family will have to be determined according to the clinical application at hand.
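To make the curvature-guided landmarking idea concrete, here is a simplified sketch that places landmark candidates at local maxima of contour curvature; it illustrates the general principle only and is not the thesis's c-scale descriptor or its variance-equalization strategy.

```python
import numpy as np

def curvature_landmarks(contour, n_landmarks=8, sigma=2.0):
    """Pick landmark candidates at local maxima of |curvature| on a closed
    2D contour given as an (N, 2) array. Illustrative simplification of
    curvature-guided landmarking, not the c-scale method."""
    # Smooth the contour circularly to suppress pixel-level noise.
    k = np.exp(-0.5 * (np.arange(-10, 11) / sigma) ** 2)
    k /= k.sum()
    x = np.convolve(np.r_[contour[-10:, 0], contour[:, 0], contour[:10, 0]], k, "same")[10:-10]
    y = np.convolve(np.r_[contour[-10:, 1], contour[:, 1], contour[:10, 1]], k, "same")[10:-10]
    # Curvature of a parametric curve: (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2).
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    kappa = (dx * ddy - dy * ddx) / np.maximum((dx**2 + dy**2) ** 1.5, 1e-12)
    # Keep the n strongest local maxima of |curvature|.
    mag = np.abs(kappa)
    local_max = (mag > np.roll(mag, 1)) & (mag >= np.roll(mag, -1))
    idx = np.nonzero(local_max)[0]
    idx = idx[np.argsort(mag[idx])[::-1][:n_landmarks]]
    return np.sort(idx)
```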
All methods were evaluated in terms of compactness on two data sets - 40 CT images of the liver and 40 MR images of the talus bone of the foot. Further, numerous artificial shapes with known salient points were also used for testing the accuracy of the proposed methods. The results show that, for the same number of landmarks, the proposed methods are more compact than manual and equally spaced annotations. Moreover, the accuracy (in terms of false positives and negatives and the location of landmarks) of the proposed shape descriptor on artificial shapes is considerably superior to a state-of-the-art scale-space approach to finding salient points on shapes.
Model-Based Environmental Visual Perception for Humanoid Robots
The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling.
Automatic transcription of polyphonic music exploiting temporal evolution
Automatic music transcription is the process of converting an audio recording
into a symbolic representation using musical notation. It has numerous applications
in music information retrieval, computational musicology, and the
creation of interactive systems. Even for expert musicians, transcribing polyphonic
pieces of music is not a trivial task, and while the problem of automatic
pitch estimation for monophonic signals is considered to be solved, the creation
of an automated system able to transcribe polyphonic music without setting
restrictions on the degree of polyphony and the instrument type still remains
open.
In this thesis, research on automatic transcription is performed by explicitly
incorporating information on the temporal evolution of sounds. First efforts address
the problem by focusing on signal processing techniques and by proposing
audio features utilising temporal characteristics. Techniques for note onset and
offset detection are also utilised for improving transcription performance. Subsequent
approaches propose transcription models based on shift-invariant probabilistic
latent component analysis (SI-PLCA), modeling the temporal evolution
of notes in a multiple-instrument case and supporting frequency modulations in
produced notes. Datasets and annotations for transcription research have also
been created during this work. Proposed systems have been privately as well as
publicly evaluated within the Music Information Retrieval Evaluation eXchange
(MIREX) framework, and have been shown to outperform several
state-of-the-art transcription approaches.
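A minimal relative of these spectrogram-factorisation models is plain NMF with a KL objective on a constant-Q spectrogram, sketched below; unlike SI-PLCA it has no shift invariance or temporal modelling, and the recording path and component count are placeholders.

```python
import librosa
import numpy as np
from sklearn.decomposition import NMF

y, sr = librosa.load("piano.wav")        # placeholder recording
C = np.abs(librosa.cqt(y, sr=sr))        # constant-Q magnitude spectrogram
model = NMF(n_components=12, beta_loss="kullback-leibler",
            solver="mu", max_iter=400)   # KL divergence, as in PLCA-style models
W = model.fit_transform(C)               # spectral templates (one per "note")
H = model.components_                    # per-template activations over time
# Activations above a threshold indicate when each template is active.
active = H > 0.1 * H.max()
print("template activity matrix:", active.shape)
```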
Developed techniques have also been employed for other tasks related to music
technology, such as for key modulation detection, temperament estimation,
and automatic piano tutoring. Finally, proposed music transcription models
have also been utilized in a wider context, namely for modeling acoustic scenes.
Coding of multivariate stimuli and contextual interactions in the visual cortex
The primary visual cortex (V1) has long been considered the main low level visual
analysis area of the brain. The classical view is of a feedforward system functioning
as an edge detector, in which each cell has a receptive field (RF) and a preferred orientation.
Whilst intuitive, this view is not the whole story. Although stimuli outside
a neuron’s RF do not result in an increased response by themselves, they do modulate
a neuron’s response to what is inside its RF. We will refer to such extra-RF effects
as contextual modulation. Contextual modulation is thought to underlie several
perceptual phenomena, such as various orientation illusions and saliency of specific
features (such as a contour or differing element). This gives a view of V1 as more
than a collection of edge detectors, with neurons collectively extracting information
beyond their RFs. However, many of the accounts linking psychophysics and physiology
explain only a small subset of the illusions and saliency effects: we would
like to find a common principle. First, then, we assume that the contextual modulations
experienced by V1 neurons are determined by the elastica model, which describes the
shape of the smoothest curve between two points. This single assumption gives rise
to a wide range of known contextual modulation and psychophysical effects. Next,
we consider the more general problem of encoding and decoding multi-variate stimuli
(such as center surround gratings) in neurons, and how well the stimuli can be decoded
under substantial noise levels with a maximum likelihood decoder. Although the maximum
likelihood decoder is widely considered optimal and unbiased in the limit of no
noise, under higher noise levels it is poorly understood. We show how higher noise
levels lead to highly complex decoding distributions even for simple encoding models,
which provides several psychophysical predictions. We next incorporate more recent
experimental knowledge of contextual modulations. Perhaps the most common form of
contextual modulations is center surround modulation. Here, the response to a center
grating in the RF is modulated by the presence of a surrounding grating (the surround).
Classically this modulation is considered strongest when the surround is aligned with
the preferred orientation, but several studies have shown how many neurons instead
experience strongest modulation whenever center and surround are aligned. We show
how the latter type of modulation gives rise to stronger saliency effects and unbiased
encoding of the center. Finally, we take an experimental perspective. Recently, both
the presence and the underlying mechanisms of contextual modulations have been increasingly
studied in mice using calcium imaging. However, cell signals extracted
with calcium imaging are often highly contaminated by other sources. As contextual
effects beyond center surround modulation can be subtle, a method is needed to remove
the contamination. We present an analysis toolbox to de-contaminate calcium
signals with blind source separation. This thesis thus expands our understanding of
contextual modulation, predicts several new experimental results, and presents a toolbox
to extract signals from calcium imaging data, which should allow for more in-depth
studies of contextual modulation.
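As a toy version of the decoding analysis, the sketch below decodes an orientation by maximum likelihood from Poisson spike counts of a population with circular-Gaussian tuning curves; the tuning model, gain, and noise level are illustrative assumptions rather than the thesis's encoding model.

```python
import numpy as np

rng = np.random.default_rng(0)
prefs = np.linspace(0, np.pi, 12, endpoint=False)  # preferred orientations

def rates(theta, gain=20.0, width=0.4):
    # Circular-Gaussian tuning on orientation (period pi), baseline 1 spike.
    d = np.angle(np.exp(2j * (theta - prefs))) / 2
    return gain * np.exp(-0.5 * (d / width) ** 2) + 1.0

def ml_decode(counts, grid=np.linspace(0, np.pi, 361)):
    # Poisson log-likelihood of each candidate orientation on a grid.
    ll = np.array([np.sum(counts * np.log(rates(g)) - rates(g)) for g in grid])
    return grid[np.argmax(ll)]

theta_true = 0.7
counts = rng.poisson(rates(theta_true))  # noisy spike counts
print(f"true {theta_true:.3f} rad, decoded {ml_decode(counts):.3f} rad")
```

Repeating the decode over many noise draws maps out the decoding distribution; at high noise this distribution can become complex and biased even for such a simple encoder, which is the phenomenon the thesis studies.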
A clustering approach based on the divide-and-conquer technique and optimum-path forest
Advisor: Alexandre Xavier Falcão. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação.
Data clustering is one of the main challenges when solving Data Science problems. Despite its progress over almost one century of research, clustering algorithms still fail in identifying groups naturally related to the semantics of the problem. Moreover, the advances in data acquisition, communication, and storage technologies add crucial challenges with a considerable increase in data, which most techniques do not handle.
We address these issues by proposing a divide-and-conquer approach to a clustering technique which is unique in finding one group per dome of the probability density function of the data: the Optimum-Path Forest (OPF) clustering algorithm. In the OPF-clustering technique, samples are taken as nodes of a graph whose arcs connect the k-nearest neighbors in the feature space. The nodes are weighted by their probability density values and a connectivity map is maximized such that each maximum of the probability density function becomes the root of an optimum-path tree (cluster). The best value of k is estimated by optimization within an application-specific interval of values. The problem with this method is that a high number of samples makes the algorithm prohibitive, due to the memory required to store the graph and the computational time to obtain the clusters for the best value of k. Since the existing solutions lead to ineffective results, we revisit the problem by proposing a two-level divide-and-conquer approach. At the first level, the dataset is divided into smaller subsets (blocks) and the samples belonging to each block are grouped by the OPF algorithm. Then, the representative samples (more specifically the roots of the optimum-path forest) are taken to a second level, where they are clustered again. Finally, the group labels obtained in the second level are transferred to all samples of the dataset through their representatives from the first level. With this approach, we can use all samples, or at least many of them, in the unsupervised learning process without affecting the clustering performance and, therefore, the procedure is less likely to lose information relevant to the grouping. We show that our proposal obtains satisfactory results in two scenarios, image segmentation and general data clustering, in comparison with popular baselines. In the first scenario, our technique achieves better results than the others on all tested image databases. In the second scenario, it obtains outcomes similar to an optimized version of the traditional OPF-clustering algorithm.
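The two-level scheme itself is easy to outline. In the sketch below, scikit-learn's MeanShift stands in for OPF clustering (which has no scikit-learn implementation); the block division, re-clustering of per-cluster representatives, and label propagation follow the structure described in the abstract, but the clusterer does not.

```python
import numpy as np
from sklearn.cluster import MeanShift

def two_level_cluster(X, n_blocks=8, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    blocks = np.array_split(idx, n_blocks)
    reps, rep_of = [], np.empty(len(X), dtype=int)
    # Level 1: cluster each block, keep one representative per cluster
    # (standing in for the optimum-path forest roots).
    for block in blocks:
        ms = MeanShift().fit(X[block])
        for c, center in enumerate(ms.cluster_centers_):
            reps.append(center)
            rep_of[block[ms.labels_ == c]] = len(reps) - 1
    # Level 2: cluster the representatives themselves.
    top = MeanShift().fit(np.asarray(reps))
    # Propagate second-level labels back to every sample.
    return top.labels_[rep_of]

X = np.random.default_rng(1).normal(size=(2000, 2))  # placeholder data
labels = two_level_cluster(X)
print("clusters found:", len(set(labels)))
```

Only the representatives ever reach the second level, so the memory and time costs of the full-graph clustering are confined to the small per-block problems, which is the point of the divide-and-conquer design.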