3 research outputs found

    Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach

    In this paper we propose and evaluate a speaker attribution system using a complete-linkage clustering method. Speaker attribution refers to the annotation of a collection of spoken audio based on speaker identities. This can be achieved using diarization and speaker linking. The main challenge associated with attribution is achieving computational efficiency when dealing with large audio archives. Traditional agglomerative clustering methods with model merging and retraining are not feasible for this purpose. This has motivated the use of linkage clustering methods without retraining. We first propose a diarization system using complete-linkage clustering and show that it outperforms traditional agglomerative and single-linkage clustering based diarization systems with a relative improvement of 40% and 68%, respectively. We then propose a complete-linkage speaker linking system to achieve attribution and demonstrate a 26% relative improvement in attribution error rate (AER) over the single-linkage speaker linking approach.
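As a rough illustration of linkage clustering without retraining, the sketch below runs complete-linkage clustering directly on a precomputed pairwise distance matrix. It is not the system from the paper: the function name `complete_linkage`, the stopping `threshold`, and the toy distance values are all assumptions made for this example, and the abstract does not say how distances between speech segments are scored.

```python
def complete_linkage(dist, threshold):
    """Cluster items from a symmetric pairwise distance matrix (hypothetical helper).

    Clusters are merged greedily; merging stops once the closest pair of
    clusters is farther apart than `threshold`. No models are retrained:
    only the original pairwise distances are ever consulted.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: cluster distance is the LARGEST pairwise distance.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Toy distances between four segments (invented values, not from the paper).
dist = [
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
]
print(complete_linkage(dist, threshold=0.5))  # [[0, 1], [2, 3]]
```

Replacing `max` with `min` in `linkage` would give single-linkage clustering, the baseline the abstract compares against.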

    Agrupamento de faces em vídeos digitais (Face clustering in digital videos)

    Human faces are among the most important entities frequently encountered in videos. Because of the currently high volume of digital video production and consumption (both personal videos and those from the communication and entertainment industries), automatic extraction of relevant information from such videos has become an active research topic. Many efforts in this area have focused on the use of face recognition and clustering to aid the process of automatically annotating faces in videos. However, current face clustering algorithms are still not robust to the variations in appearance of the same face under typical acquisition conditions. Hence, this thesis addresses the problem of face clustering in digital videos and proposes a novel approach that achieves superior performance (in terms of clustering quality and computational cost) compared with the state of the art, using reference video databases from the literature. The approach was developed on the basis of a systematic literature review and experimental evaluations, and consists of the following modules: preprocessing, face detection, tracking, feature extraction, clustering, temporal similarity analysis, and spatial reclustering. The proposed face clustering approach achieved the planned objectives, obtaining better results (according to different metrics) than the methods evaluated on the YouTube Celebrities video dataset (KIM et al., 2008) and the SAIVT-Bnews video dataset (GHAEMMAGHAMI, DEAN and SRIDHARAN, 2013).

    Theoretical Analysis of Hierarchical Clustering and the Shadow Vertex Algorithm

    Agglomerative clustering (AC) is a very popular greedy method for computing hierarchical clusterings in practice, yet its theoretical properties have been studied relatively little. We consider AC with respect to the most popular objective functions, especially the diameter function, the radius function and the k-means function. Given a finite set P of points in R^d, AC starts with each point from P in a cluster of its own and then iteratively merges two clusters from the current clustering that minimize the respective objective function when merged into a single cluster. We study the problem of partitioning P into k clusters such that the largest diameter of the clusters is minimized and we prove that AC computes an O(1)-approximation for this problem for any metric that is induced by a norm, assuming that the dimension d is a constant. This improves the best previously known bound of O(log k) due to Ackermann et al. Our bound also carries over to the k-center and the continuous k-center problem. Moreover, we study the behavior of agglomerative clustering for the hierarchical k-means problem. We show that AC computes a 2-approximation with respect to the k-means objective function if the optimal k-clustering is well separated. If additionally the optimal clustering also satisfies a balance condition, then AC fully recovers the optimum solution. These results hold in arbitrary dimension. We accompany our positive results with a lower bound of Ω((3/2)^d) for data sets in R^d that holds if no separation is guaranteed, and with lower bounds when the guaranteed separation is not sufficiently strong. Finally, we show that AC produces an O(1)-approximate clustering for one-dimensional data sets. Apart from AC, we provide improved and in some cases new general upper and lower bounds on the existence of hierarchical clusterings. For the objective function discrete radius we provide a new lower bound of 2 and improve the upper bound of 4. For the k-means objective function we state a lower bound of 32 on the existence of hierarchical clusterings. This improves the best previously known bound of 576. The simplex algorithm is probably the most popular algorithm for solving linear programs in practice. It is determined by so-called pivot rules. The shadow vertex simplex algorithm is a popular pivot rule which has gained attention in recent years because it was shown to have polynomial running time in the model of smoothed complexity. In the second part of the dissertation we show that the shadow vertex simplex algorithm can be used to solve linear programs in strongly polynomial time with respect to the number n of variables, the number m of constraints, and 1/δ, where δ is a parameter that measures the flatness of the vertices of the polyhedron. This extends a previous result that the shadow vertex algorithm finds paths of polynomial length (w.r.t. n, m, and 1/δ) between two given vertices of a polyhedron. Our result also complements a result due to Eisenbrand and Vempala, who have shown that a certain version of the random edge pivot rule solves linear programs with a running time that is strongly polynomial in the number of variables n and 1/δ, but independent of the number m of constraints. Even though the running time of our algorithm depends on m, it is significantly faster for the important special case of totally unimodular linear programs, for which 1/δ is at most n and which have only O(n^2) constraints.
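The greedy procedure described in this abstract (start with singleton clusters, then repeatedly merge the two clusters whose union minimizes the objective) can be sketched for the diameter objective as follows. This is a naive illustration under stated assumptions, not code from the dissertation: it assumes the Euclidean norm, recomputes diameters from scratch at every step, and the names `diameter` and `agglomerative_diameter` as well as the sample points are hypothetical.

```python
import math
from itertools import combinations


def diameter(cluster):
    """Largest Euclidean distance between two points of a cluster (0 for a singleton)."""
    return max((math.dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)


def agglomerative_diameter(points, k):
    """Greedy AC for the diameter objective, stopped when k clusters remain.

    Each step merges the pair of clusters whose union has the smallest
    diameter. Ties are broken by the order in which pairs are enumerated.
    """
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: diameter(clusters[ij[0]] + clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i stays valid
        del clusters[j]
    return clusters


# Sample points in R^2 (invented for illustration).
points = [(0, 0), (0, 1), (5, 0), (5, 2), (12, 0)]
for k in (3, 2):
    result = agglomerative_diameter(points, k)
    print(k, result, max(diameter(c) for c in result))
```

Because every merge only combines existing clusters, the clusterings obtained for successive values of k are nested, which is what makes the output hierarchical. The approximation guarantees stated in the abstract concern how far the largest diameter of such a greedy k-clustering can be from the optimum.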