498 research outputs found
A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity
Cross-lingual word embeddings encode the meaning of words from different
languages into a shared low-dimensional space. An important requirement for
many downstream tasks is that word similarity should be independent of language
- i.e., word vectors within one language should not be more similar to each
other than to words in another language. We measure this characteristic using
modularity, a network measurement that measures the strength of clusters in a
graph. Modularity has a moderate to strong correlation with three downstream
tasks, even though modularity is based only on the structure of embeddings and
does not require any external resources. We show through experiments that
modularity can serve as an intrinsic validation metric to improve unsupervised
cross-lingual word embeddings, particularly on distant language pairs in
low-resource settings.Comment: Accepted to ACL 2019, camera-read
Computational physics of the mind
In the XIX century and earlier such physicists as Newton, Mayer, Hooke, Helmholtz and Mach were actively engaged in the research on psychophysics, trying to relate psychological sensations to intensities of physical stimuli. Computational physics allows to simulate complex neural processes giving a chance to answer not only the original psychophysical questions but also to create models of mind. In this paper several approaches relevant to modeling of mind are outlined. Since direct modeling of the brain functions is rather limited due to the complexity of such models a number of approximations is introduced. The path from the brain, or computational neurosciences, to the mind, or cognitive sciences, is sketched, with emphasis on higher cognitive functions such as memory and consciousness. No fundamental problems in understanding of the mind seem to arise. From computational point of view realistic models require massively parallel architectures
Generic object classification for autonomous robots
Un dels principals problemes de la interacció dels robots autònoms és el coneixement de l'escena. El reconeixement és fonamental per a solucionar aquest problema i permetre als robots interactuar en un escenari no controlat. En aquest document presentem una aplicació pràctica de la captura d'objectes, de la normalització i de la classificació de senyals triangulars i circulars. El sistema s'introdueix en el robot Aibo de Sony per a millorar-ne la interacció. La metodologia presentada s'ha comprobat en simulacions i problemes de categorització reals, com ara la classificació de senyals de trànsit, amb resultats molt prometedors.Uno de los principales problemas de la interacción de los robots autónomos es el conocimiento de la escena. El reconocimiento es fundamental para solventar este problema y permitir a los robots interactuar en un escenario no controlado. En este documento, presentamos una aplicación práctica de captura del objeto, normalización y clasificación de señales triangulares y circulares. El sistema es introducido en el robot Aibo de Sony para mejorar el comportamiento de la interacción del robot. La metodología presentada ha sido testeada en simulaciones y problemas de categorización reales, como es la clasificación de señales de tráfico, con resultados muy prometedores.One of the main problems of autonomous robots interaction is the scene knowledge. Recognition is concerned to deal with this problem and to allow robots to interact in uncontrolled environments. In this paper, we present a practical application for object fitting, normalization and classification of triangular and circular signs. The system is introduced in the Aibo robot of Sony to increase the robot interaction behaviour. The presented methodology has been tested in real simulations and categorization problems, as the traffic signs classification, with very promising results.Nota: Aquest document conté originàriament altre material i/o programari només consultable a la Biblioteca de Ciència i Tecnologia
Probabilistic Image Models and their Massively Parallel Architectures : A Seamless Simulation- and VLSI Design-Framework Approach
Algorithmic robustness in real-world scenarios and real-time processing capabilities are the two essential and at the same time contradictory requirements modern image-processing systems have to fulfill to go significantly beyond state-of-the-art systems. Without suitable image processing and analysis systems at hand, which comply with the before mentioned contradictory requirements, solutions and devices for the application scenarios of the next generation will not become reality. This issue would eventually lead to a serious restraint of innovation for various branches of industry. This thesis presents a coherent approach to the above mentioned problem. The thesis at first describes a massively parallel architecture template and secondly a seamless simulation- and semiconductor-technology-independent design framework for a class of probabilistic image models, which are formulated on a regular Markovian processing grid. The architecture template is composed of different building blocks, which are rigorously derived from Markov Random Field theory with respect to the constraints of \it massively parallel processing \rm and \it technology independence\rm. This systematic derivation procedure leads to many benefits: it decouples the architecture characteristics from constraints of one specific semiconductor technology; it guarantees that the derived massively parallel architecture is in conformity with theory; and it finally guarantees that the derived architecture will be suitable for VLSI implementations. The simulation-framework addresses the unique hardware-relevant simulation needs of MRF based processing architectures. Furthermore the framework ensures a qualified representation for simulation of the image models and their massively parallel architectures by means of their specific simulation modules. This allows for systematic studies with respect to the combination of numerical, architectural, timing and massively parallel processing constraints to disclose novel insights into MRF models and their hardware architectures. The design-framework rests upon a graph theoretical approach, which offers unique capabilities to fulfill the VLSI demands of massively parallel MRF architectures: the semiconductor technology independence guarantees a technology uncommitted architecture for several design steps without restricting the design space too early; the design entry by means of behavioral descriptions allows for a functional representation without determining the architecture at the outset; and the topology-synthesis simplifies and separates the data- and control-path synthesis. Detailed results discussed in the particular chapters together with several additional results collected in the appendix will further substantiate the claims made in this thesis
Neuroengineering of Clustering Algorithms
Cluster analysis can be broadly divided into multivariate data visualization, clustering algorithms, and cluster validation. This dissertation contributes neural network-based techniques to perform all three unsupervised learning tasks. Particularly, the first paper provides a comprehensive review on adaptive resonance theory (ART) models for engineering applications and provides context for the four subsequent papers. These papers are devoted to enhancements of ART-based clustering algorithms from (a) a practical perspective by exploiting the visual assessment of cluster tendency (VAT) sorting algorithm as a preprocessor for ART offline training, thus mitigating ordering effects; and (b) an engineering perspective by designing a family of multi-criteria ART models: dual vigilance fuzzy ART and distributed dual vigilance fuzzy ART (both of which are capable of detecting complex cluster structures), merge ART (aggregates partitions and lessens ordering effects in online learning), and cluster validity index vigilance in fuzzy ART (features a robust vigilance parameter selection and alleviates ordering effects in offline learning). The sixth paper consists of enhancements to data visualization using self-organizing maps (SOMs) by depicting in the reduced dimension and topology-preserving SOM grid information-theoretic similarity measures between neighboring neurons. This visualization\u27s parameters are estimated using samples selected via a single-linkage procedure, thereby generating heatmaps that portray more homogeneous within-cluster similarities and crisper between-cluster boundaries. The seventh paper presents incremental cluster validity indices (iCVIs) realized by (a) incorporating existing formulations of online computations for clusters\u27 descriptors, or (b) modifying an existing ART-based model and incrementally updating local density counts between prototypes. Moreover, this last paper provides the first comprehensive comparison of iCVIs in the computational intelligence literature --Abstract, page iv
Recommended from our members
A High-Performance Domain-Specific Language and Code Generator for General N-body Problems
General N-body problems are a set of problems in which an update to a single element in the system depends on every other element. N-body problems are ubiquitous, with applications in various domains ranging from scientific computing simulations in molecular dynamics, astrophysics, acoustics, and fluid dynamics all the way to computer vision, data mining and machine learning problems. Different N-body algorithms have been designed and implemented in these various fields. However, there is a big gap between the algorithm one designs on paper and the code that runs efficiently on a parallel system. It is time-consuming to write fast, parallel, and scalable code for these problems. On the other hand, the sheer scale and growth of modern scientific datasets necessitate exploiting the power of both parallel and approximation algorithms where there is a potential to trade-off accuracy for performance. The main problem that we are tackling in this thesis is how to automatically generate asymptotically optimal N-body algorithms from the high-level specification of the problem. We combine the body of work in performance optimizations, compilers and the domain of N-body problems to build a unified system where domain scientists can write programs at the high level while attaining performance of code written by an expert at the low level.In order to generate a high-performance, scalable code for this group of problems, we take the following steps in this thesis; first, we propose a unified algorithmic framework named PASCAL in order to address the challenge of designing a general algorithmic template to represent the class of N-body problems. PASCAL utilizes space-partitioning trees and user-controlled pruning/approximations to reduce the asymptotic runtime complexity from linear to logarithmic in the number of data points. In PASCAL, we design an algorithm that automatically generates conditions for pruning or approximation of an N-body problem considering the problem's definition. In order to evaluate PASCAL, we developed tree-based algorithms for six well-known problems: k-nearest neighbors, range search, minimum spanning tree, kernel density estimation, expectation maximization, and Hausdorff distance. We show that applying domain-specific optimizations and parallelization to the algorithms written in PASCAL achieves 10x to 230x speedup compared to state-of-the-art libraries on a dual-socket Intel Xeon processor with 16 cores on real-world datasets. Second, we extend the PASCAL framework to build PASCAL-X that adds support for NUMA-aware parallelization. PASCAL-X also presents insights on the influence of tuning parameters. Tuning parameters such as leaf size (influences the shape of the tree) and cut-off level (controls the granularity of tasks) of the space-partitioning trees result in performance improvement of up to 4.6x. A key goal is to generate scalable and high-performance code automatically without sacrificing productivity. That implies minimizing the effort the users have to put in to generate the desired high-performance code. Another critical factor is the adaptivity, which indicates the amount of effort that is required to extend the high-performance code generation to new N-body problems. Finally, we consider these factors and develop a domain-specific language and code generator named Portal, which is built on top of PASCAL-X. Portal's language design is inspired by the mathematical representation of N-body problems, resulting in an intuitive language for rapid implementation of a variety of problems. Portal's back-end is designed and implemented to generate optimized, parallel, and scalable implementations for multi-core systems. We demonstrate that the performance achieved by using Portal is comparable to that of expert hand-optimized code while providing productivity for domain scientists. For instance, using Portal for the k-nearest neighbors problem gains performance that is similar to the hand-optimized code, while reducing the lines of code by 68x. To the best of our knowledge, there are no known libraries or frameworks that implement parallel asymptotically optimal algorithms for the class of general N-body problems and this thesis primarily aims to fill this gap. Finally, we present a case study of Portal for the real-world problem of face clustering. In this case study, we show that Portal not only provides a fast solution for the face clustering problem with similar accuracy as the state-of-the-art algorithm, but also it provides productivity by implementing the face clustering algorithm in only 14 lines of Portal code
- …