Search CORE

982 research outputs found

Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work

Author: Wang Qiangchang
Yin Yilong
Publication venue
Publication date: 02/06/2023
Field of study

Inspired by the fact that human brains can emphasize discriminative parts of the input and suppress irrelevant ones, substantial local mechanisms have been designed to boost the development of computer vision. They can not only focus on target parts to learn discriminative local representations, but also process information selectively to improve the efficiency. In terms of application scenarios and paradigms, local mechanisms have different characteristics. In this survey, we provide a systematic review of local mechanisms for various computer vision tasks and approaches, including fine-grained visual recognition, person re-identification, few-/zero-shot learning, multi-modal learning, self-supervised learning, Vision Transformers, and so on. Categorization of local mechanisms in each field is summarized. Then, advantages and disadvantages for every category are analyzed deeply, leaving room for exploration. Finally, future research directions about local mechanisms have also been discussed that may benefit future works. To the best our knowledge, this is the first survey about local mechanisms on computer vision. We hope that this survey can shed light on future research in the computer vision field

arXiv.org e-Print Archive

HTM approach to image classification, sound recognition and time series forecasting

Author: Lima Tiago Azevedo
Publication venue
Publication date: 08/04/2021
Field of study

Dissertação de mestrado em Biomedical EngineeringThe introduction of Machine Learning (ML) on the orbit of the resolution of problems typically associated within the human behaviour has brought great expectations to the future. In fact, the possible development of machines capable of learning, in a similar way as of the humans, could bring grand perspectives to diverse areas like healthcare, the banking sector, retail, and any other area in which we could avoid the constant attention of a person dedicated to the solving of a problem; furthermore, there are those problems that are still not at the hands of humans to solve - these are now at the disposal of intelligent machines, bringing new possibilities to the humankind development. ML algorithms, specifically Deep Learning (DL) methods, lack a bigger acceptance by part of the community, even though they are present in various systems in our daily basis. This lack of confidence, mandatory to let systems make big, important decisions with great impact in the everyday life is due to the difficulty on understanding the learning mechanisms and previsions that result by the same - some algorithms represent themselves as ”black boxes”, translating an input into an output, while not being totally transparent to the outside. Another complication rises, when it is taken into account that the same algorithms are trained to a specific task and in accordance to the training cases found on their development, being more susceptible to error in a real environment - one can argue that they do not constitute a true Artificial Intelligence (AI). Following this line of thought, this dissertation aims at studying a new theory, Hierarchical Temporal Memory (HTM), that can be placed in the area of Machine Intelligence (MI), an area that studies the capacity of how the software systems can learn, in an identical way to the learning of a human being. The HTM is still a fresh theory, that lays on the present perception of the functioning of the human neocortex and assumes itself as under constant development; at the moment, the theory dictates that the neocortex zones are organized in an hierarchical structure, being a memory system, capable of recognizing spatial and temporal patterns. In the course of this project, an analysis was made to the functioning of the theory and its applicability to the various tasks typically solved with ML algorithms, like image classification, sound recognition and time series forecasting. At the end of this dissertation, after the evaluation of the different results obtained in various approaches, it was possible to conclude that even though these results were positive, the theory still needs to mature, not only in its theoretical basis but also in the development of libraries and frameworks of software, to capture the attention of the AI community.A introdução de ML na órbita da resolução de problemas tipicamente dedicados ao foro humano trouxe grandes expectativas para o futuro. De facto, o possível desenvolvimento de máquinas capazes de aprender, de forma semelhante aos humanos, poderia trazer grandes perspetivas para diversas áreas como a saúde, o setor bancário, retalho, e qualquer outra área em que se poderia evitar o constante alerta de uma pessoa dedicada a um problema; para além disso, problemas sem resolução humana passavam a estar a mercê destas máquinas, levando a novas possibilidades no desenvolvimento da humanidade. Apesar de se encontrar em vários sistemas no nosso dia-a-dia, estes algoritmos de ML, especificamente de DL, carecem ainda de maior aceitação por parte da comunidade, devido a dificuldade de perceber as aprendizagens e previsões resultantes, feitas pelos mesmos - alguns algoritmos apresentam-se como ”caixas negras”, traduzindo um input num output, não sendo totalmente transparente para o exterior - é necessária confiança nos sistemas que possam tomar decisões importantes e com grandes impactos no quotidiano; por outro lado, os mesmos algoritmos encontram-se treinados para uma tarefa específica e de acordo com os casos encontrados no desenvolvimento do seu treino, sendo mais suscetíveis a erros em ambientes reais, podendo se discutir que não constituem, por isso, uma verdadeira Inteligência Artificial. Seguindo este segmento, a presente dissertação procura estudar uma nova teoria, HTM, inserida na área de MI, que pretende dar a capacidade aos sistemas de software de aprenderem de uma forma idêntica a do ser humano. Esta recente teoria, assenta na atual perceção do funcionamento do neocórtex, estando por isso em constante desenvolvimento; no momento, e assumida como uma teoria que dita a hierarquização estrutural das zonas do neocórtex, sendo um sistema de memória, reconhecedor de padrões espaciais e temporais. Ao longo deste projeto, foi feita uma análise ao funcionamento da teoria, e a sua aplicabilidade a várias tarefas tipicamente resolvidas com algoritmos de ML, como classificação de imagem, reconhecimento de som e previsão de series temporais. No final desta dissertação, após uma avaliação dos diferentes resultados obtidos em várias abordagens, foi possível concluir que apesar dos resultadospositivos, a teoria precisa ainda de maturar, não só a nível teórico como a nível prático, no desenvolvimento de bibliotecas e frameworks de software, de forma a capturar a atenção da comunidade de Inteligência Artificial

Universidade do Minho: RepositoriUM