
929 research outputs found

Minimalistic Unsupervised Learning with the Sparse Manifold Transform

Author: Chen Yubei
LeCun Yann
Ma Yi
Olshausen Bruno
Yun Zeyu
Publication date: 30/09/2022

We describe a minimalistic and interpretable method for unsupervised learning, without resorting to data augmentation, hyperparameter tuning, or other engineering designs, that achieves performance close to the SOTA SSL methods. Our approach leverages the sparse manifold transform, which unifies sparse coding, manifold learning, and slow feature analysis. With a one-layer deterministic sparse manifold transform, one can achieve 99.3% KNN top-1 accuracy on MNIST, 81.1% KNN top-1 accuracy on CIFAR-10 and 53.2% on CIFAR-100. With a simple gray-scale augmentation, the model gets 83.2% KNN top-1 accuracy on CIFAR-10 and 57% on CIFAR-100. These results significantly close the gap between simplistic "white-box" methods and the SOTA methods. Additionally, we provide visualization to explain how an unsupervised representation transform is formed. The proposed method is closely connected to latent-embedding self-supervised methods and can be treated as the simplest form of VICReg. Though there remains a small performance gap between our simple constructive model and SOTA methods, the evidence points to this as a promising direction for achieving a principled and white-box approach to unsupervised learning.
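The KNN top-1 accuracy figures quoted above can be read as follows: each test embedding is labeled by a vote among its k nearest training embeddings, and the score is the fraction of test points labeled correctly. The sketch below is a minimal illustration only. The function name `knn_top1_accuracy`, the squared Euclidean distance, and the default `k` are assumptions for illustration and are not taken from the abstract, which does not state the metric or the value of k.

```python
from collections import Counter


def knn_top1_accuracy(train_x, train_y, test_x, test_y, k=1):
    """Fraction of test points whose k-NN majority label matches the true label.

    Assumption: squared Euclidean distance between feature vectors; the
    abstract does not say which distance the authors use.
    """
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    correct = 0
    for x, y in zip(test_x, test_y):
        # Indices of the k training points closest to x.
        nearest = sorted(range(len(train_x)),
                         key=lambda i: sq_dist(train_x[i], x))[:k]
        votes = Counter(train_y[i] for i in nearest)
        predicted = votes.most_common(1)[0][0]
        correct += predicted == y
    return correct / len(test_x)
```

In practice the inputs would be the learned representations of the training and test images rather than raw pixels; a brute-force search like this is only suitable for small examples.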

arXiv.org e-Print Archive

Multimodal Short Video Rumor Detection System Based on Contrastive Learning

Author: Min Xiangyu
Wang Haizhou
Wang Pengchao
Wang Siyi
Yang Yuxing
Zhao Junhao
Publication date: 18/04/2023

With short video platforms becoming an important channel for news sharing, major short video platforms in China have gradually become new breeding grounds for fake news. However, short video rumors are hard to distinguish because short videos contain a large amount of information and many features, and because features are highly homogeneous and similar across videos. To mitigate the spread of short video rumors, our group detects them by constructing multimodal feature fusion and introducing external knowledge, after considering the advantages and disadvantages of each algorithm. The detection approach is as follows: (1) dataset creation: we build a short video dataset with multiple features; (2) multimodal rumor detection model: first, we use the TSN (Temporal Segment Networks) video coding model to extract video features; then, we use OCR (Optical Character Recognition) and ASR (Automatic Speech Recognition) together to extract text, and use the BERT model to fuse text features with video features; (3) finally, we use contrastive learning to achieve distinction: we first crawl external knowledge, then use a vector database to introduce the external knowledge and produce the final classification output. Our research process is always oriented to practical needs, and the related results will play an important role in many practical scenarios such as short video rumor identification and social opinion control.
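The abstract does not say which contrastive objective is used. As a hedged illustration of the general idea, the sketch below computes an InfoNCE-style loss, a common choice that is assumed here and not confirmed by the source. It pulls an anchor embedding toward one positive and away from a set of negatives. The names `cosine` and `info_nce` and the `temperature` default are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: -log softmax probability of the positive pair.

    Assumption: cosine similarity scaled by a temperature; the abstract
    does not specify the similarity function or the loss.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]
```

The loss is small when the anchor is much closer to the positive than to any negative, and grows when a negative is more similar than the positive.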

arXiv.org e-Print Archive

Toward human-like pathfinding with hierarchical approaches and the GPS of the brain theory

Author: Rahmani Vahid
Publication venue: Universitat Politècnica de Catalunya
Publication date: 18/11/2020

Pathfinding for autonomous agents and robots has traditionally been driven by finding optimal paths, where optimality typically means finding the shortest path between two points in a given environment. However, optimality may not always be strictly necessary. For example, in video games, paths for non-player characters (NPCs) must often be computed under strict time constraints to guarantee real-time simulation. In those cases, performance is more important than finding the shortest path, especially because a sub-optimal path can often be just as convincing from the player's point of view. When simulating virtual humanoids, pathfinding has also been used with the same goal: finding the shortest path. However, humans very rarely follow precise shortest paths, and thus there are other aspects of human decision making and path planning strategies that should be incorporated in current simulation models. In this thesis we first focus on improving performance to handle as many virtual agents as possible, and then introduce neuroscience research to propose pathfinding algorithms that attempt to mimic humans in a more realistic manner. In the case of simulating NPCs for video games, one of the main challenges is to compute paths as efficiently as possible for groups of agents. As both the size of the environments and the number of autonomous agents increase, it becomes harder to obtain results in real time under the constraints of memory and computing resources. For this purpose we explored hierarchical approaches for two reasons: (1) they have shown important performance improvements for regular grids and other abstract problems, and (2) humans also tend to plan trajectories following a top-down abstraction, focusing first on high-level locations and then refining the path as they move between those high-level locations. 
Therefore, we believe that hierarchical approaches combine the best of our two goals: improving performance for multi-agent pathfinding and achieving more human-like pathfinding. Hierarchical approaches, such as HNA* (Hierarchical A* for Navigation Meshes), can compute paths more efficiently, although only for certain configurations of the hierarchy. For other configurations, the method suffers from a bottleneck in the step that connects the Start and Goal positions with the hierarchy. This bottleneck can reduce performance drastically. In this thesis we present different approaches to solve the HNA* bottleneck and thus obtain a performance boost for all hierarchical configurations. The first method relies on additional memory storage, and the second one uses parallelism on the GPU. Our comparative evaluation shows that both approaches offer speed-ups as high as 9x over A*, and show no limitations based on hierarchical configuration. We then further exploit the potential of CUDA parallelism to extend our implementation of HNA* to multi-agent pathfinding. Our method can now compute paths for over 500K agents simultaneously in real time, with speed-ups above 15x over a parallel multi-agent implementation using A*. We then focus on studying neuroscience research to learn about the way that humans build mental maps, in order to propose novel algorithms that take those findings into account when simulating virtual humans. We propose a novel algorithm for pathfinding that is inspired by neuroscience research on how the brain learns and builds cognitive maps. Our method represents the space as a hexagonal grid, based on the GPS of the brain theory, and fires memory cells as counters. 
Our path finder then combines a method for exploring unknown environments while building such a cognitive map, with an A* search using a modified heuristic that takes into account the GPS of the brain cognitive map.

The pathfinding problem for autonomous agents or robots has traditionally consisted of finding an optimal path, where optimal means the path of minimum distance between two positions in an environment. However, finding an optimal solution may not always be strictly necessary. To simulate crowds of autonomous agents moving in real time, their trajectories must be computed under strict computation-time constraints, but the solutions need not be minimum-distance ones, since the observer will often not notice the difference between an optimal and a sub-optimal path. It is therefore usually enough for the solution found to be visually believable to the observer. When virtual humanoids are simulated in virtual reality applications that require avatars simulating human behavior, the same algorithms as in video games tend to be used, with the goal of finding minimum-distance paths. But if we really want avatars to imitate human behavior, we must take into account that, in the real world, humans rarely follow exactly the shortest path, and therefore other aspects that influence human decision making and route selection in the real world must be considered. In this thesis we first focus on improving pathfinding performance in order to simulate large numbers of autonomous virtual humanoids, and then introduce concepts from neuroscience research with mammals to propose solutions to the pathfinding problem that attempt to imitate more realistically the way humans navigate their surroundings. 
As both the size of virtual environments and the number of autonomous agents increase, it becomes harder to obtain solutions in real time, due to limits on memory and computing resources. To solve this problem, in this thesis we explore hierarchical approaches because we consider that they combine two fundamental goals: improving pathfinding performance for crowds of agents and achieving human-like pathfinding. The first method presented in this thesis improves the performance of the HNA* (Hierarchical A* for Navigation Meshes) algorithm by storing more data in memory, and the second uses parallelism to improve performance drastically. The quantitative evaluation carried out in this thesis shows that both approaches offer speed-ups of up to 9 times over the A* algorithm and show no limitations due to the hierarchical configuration. Next, we further exploit the potential of the parallelism offered by CUDA to extend our HNA* implementation to multi-agent systems. Our method can compute paths simultaneously and in real time for more than 500,000 agents, with a speed-up above 15 times over a parallel implementation of the A* algorithm. Finally, in this thesis we have studied the latest advances in neuroscience, to understand the way humans build mental maps and thus propose new algorithms that imitate more realistically the way humans navigate. Our method represents space as a hexagonal grid, based on the distribution of "place cells" in the brain, and imitates neuronal activations as counters in those cells. 
Our path finder combines a method for exploring unknown environments while building a hexagonal cognitive map with an A* search that uses a new heuristic adapted to the brain's cognitive map and its counters.
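The abstract does not give the form of the modified heuristic, so the following is only a sketch of how an A* search on a hexagonal grid could take per-cell counters into account. Everything specific here is an assumption: the axial coordinate scheme, the names `a_star_hex`, `counters`, and `familiarity_weight`, and the choice to add a penalty that shrinks as a cell's counter grows. With a non-zero weight the heuristic is no longer admissible, so the returned path may be longer than the shortest one, in line with the stated goal of human-like rather than optimal paths.

```python
import heapq

# Six neighbor offsets in axial (q, r) hexagonal coordinates.
HEX_DIRS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_distance(a, b):
    """Number of hex steps between two axial coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def a_star_hex(start, goal, walkable, counters=None, familiarity_weight=0.0):
    """A* over a set of walkable hex cells; returns a list of cells or None.

    Hypothetical heuristic: hex distance plus a penalty that is largest for
    cells with a zero counter (never visited) and shrinks for familiar cells.
    """
    counters = counters or {}

    def h(cell):
        penalty = familiarity_weight / (1 + counters.get(cell, 0))
        return hex_distance(cell, goal) + penalty

    open_heap = [(h(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry
        for dq, dr in HEX_DIRS:
            nxt = (cell[0] + dq, cell[1] + dr)
            if nxt not in walkable:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None
```

With `familiarity_weight=0.0` this reduces to plain A* with the hex-distance heuristic, which returns a shortest path on the grid.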

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

The effect of social context and social scale on the perception of relationships in monk parakeets

Author: Avery Michael L.
Hobson Elizabeth A.
John Darlene J.
McIntosh Tiffany L.
Wright Timothy F.
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2015

Social relationships formed within a network of interacting group members can have a profound impact on an individual’s behavior and fitness. However, we have little understanding of how individuals perceive their relationships and how this perception relates to our external measures of interactions. We investigated the perception of affiliative and agonistic relationships at both the dyadic and emergent social levels in two captive groups of monk parakeets (Myiopsitta monachus, n = 21 and 19) using social network analysis and playback experiments. At the dyadic social scale, individuals directed less aggression towards their strong affiliative partners and more aggression towards non-partner neighbors. At the emergent social scale, there was no association between relationships in different social contexts, and an individual’s dominance rank did not correlate with its popularity rank. Playback response patterns were mainly driven by relationships in affiliative social contexts at the dyadic scale. In both groups, individual responses to playback experiments were significantly affected by strong affiliative relationships at the dyadic social scale, albeit in different directions in the two groups. Response patterns were also affected by affiliative relationships at the emergent social scale, but only in one of the two groups. Within affiliative relationships, those at the dyadic social scale were perceived by individuals in both groups, but those at the emergent social scale only affected responses in one group. These results provide preliminary evidence that relationships in affiliative social contexts may be perceived as more important than agonistic relationships in captive monk parakeet groups. Our approach could be used in a wide range of social species, and comparative analyses could provide important insight into how individuals perceive relationships across social contexts and social scales [Current Zoology 61 (1): 55–69, 2015].

DigitalCommons@University of Nebraska

A Robotic System for Learning Visually-Driven Grasp Planning (Dissertation Proposal)

Author: Salganicoff Marcos
Publication venue: ScholarlyCommons
Publication date: 22/03/1992

We use findings in machine learning, developmental psychology, and neurophysiology to guide a robotic learning system's level of representation both for actions and for percepts. Visually-driven grasping is chosen as the experimental task since it has general applicability and it has been extensively researched from several perspectives. An implementation of a robotic system with a gripper, compliant instrumented wrist, arm and vision is used to test these ideas. Several sensorimotor primitives (vision segmentation and manipulatory reflexes) are implemented in this system and may be thought of as the innate perceptual and motor abilities of the system. Applying empirical learning techniques to real situations brings up such important issues as observation sparsity in high-dimensional spaces, arbitrary underlying functional forms of the reinforcement distribution and robustness to noise in exemplars. The well-established technique of non-parametric projection pursuit regression (PPR) is used to accomplish reinforcement learning by searching for projections of high-dimensional data sets that capture task invariants. We also pursue the following problem: how can we use human expertise and insight into grasping to train a system to select both appropriate hand preshapes and approaches for a wide variety of objects, and then have it verify and refine its skills through trial and error? To accomplish this learning we propose a new class of Density Adaptive reinforcement learning algorithms. These algorithms use statistical tests to identify possibly interesting regions of the attribute space in which the dynamics of the task change. They automatically concentrate the building of high resolution descriptions of the reinforcement in those areas, and build low resolution representations in regions that are either not populated in the given task or are highly uniform in outcome. Additionally, the use of any learning process generally implies failures along the way. 
Therefore, the mechanics of the untrained robotic system must be able to tolerate mistakes during learning and not damage itself. We address this by the use of an instrumented, compliant robot wrist that controls impact forces.

ScholarlyCommons@Penn