Ensuring Access to Safe and Nutritious Food for All Through the Transformation of Food Systems
Kurcuma: a kitchen utensil recognition collection for unsupervised domain adaptation
The use of deep learning makes it possible to achieve extraordinary results in all kinds of tasks related to computer vision. However, this performance is strongly related to the availability of training data and its relationship with the distribution in the eventual application scenario. This question is of vital importance in areas such as robotics, where the targeted environment data are barely available in advance. In this context, domain adaptation (DA) techniques are especially important for building models that deal with new data for which the corresponding label is not available. To promote further research in DA techniques applied to robotics, this work presents Kurcuma (Kitchen Utensil Recognition Collection for Unsupervised doMain Adaptation), an assortment of seven datasets for the classification of kitchen utensils, a task of relevance in home-assistance robotics and a suitable showcase for DA. Along with the data, we provide a broad description of the main characteristics of the dataset, as well as a baseline using the well-known domain-adversarial training of neural networks approach. The results show the challenge posed by DA on these types of tasks, pointing to the need for new approaches in future work.
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work was supported by the I+D+i project TED2021-132103A-I00 (DOREMI), funded by MCIN/AEI/10.13039/501100011033. Some of the computing resources were provided by the Generalitat Valenciana and the European Union through the FEDER funding program (IDIFEDER/2020/003). The second author is supported by grant APOSTD/2020/256 from “Programa I+D+i de la Generalitat Valenciana”.
Neural Architecture Search: Insights from 1000 Papers
In the past decade, advances in deep learning have resulted in breakthroughs
in a variety of areas, including computer vision, natural language
understanding, speech recognition, and reinforcement learning. Specialized,
high-performing neural architectures are crucial to the success of deep
learning in these areas. Neural architecture search (NAS), the process of
automating the design of neural architectures for a given task, is an
inevitable next step in automating machine learning and has already outpaced
the best human-designed architectures on many tasks. In the past few years,
research in NAS has been progressing rapidly, with over 1000 papers released
since 2020 (Deng and Lindauer, 2021). In this survey, we provide an organized
and comprehensive guide to neural architecture search. We give a taxonomy of
search spaces, algorithms, and speedup techniques, and we discuss resources
such as benchmarks, best practices, other surveys, and open-source libraries.
Self-Ordering Point Clouds
In this paper we address the task of finding representative subsets of points
in a 3D point cloud by means of a point-wise ordering. Only a few works have
tried to address this challenging vision problem, all with the help of
hard-to-obtain point and cloud labels. Different from these works, we introduce the
task of point-wise ordering in 3D point clouds through self-supervision, which
we call self-ordering. We further contribute the first end-to-end trainable
network that learns a point-wise ordering in a self-supervised fashion. It
utilizes a novel differentiable point scoring-sorting strategy and it
constructs a hierarchical contrastive scheme to obtain self-supervision
signals. We extensively ablate the method and show its scalability and superior
performance even compared to supervised ordering methods on multiple datasets
and tasks including zero-shot ordering of point clouds from unseen categories.
Transverse Velocity Field Measurement in High-Resolution Solar Images Based on Deep Learning
To address the problem of the low accuracy of transverse velocity field
measurements for small targets in high-resolution solar images, we proposed a
novel velocity field measurement method for high-resolution solar images based
on PWCNet. This method transforms the transverse velocity field measurements
into an optical flow field prediction problem. We evaluated the performance of
the proposed method using the Hα and TiO datasets obtained from New Vacuum
Solar Telescope (NVST) observations. The experimental results show that our
method predicts the optical flow of small targets in images more effectively
than several typical machine- and deep-learning methods. On the Hα
dataset, the proposed method improves the image structure similarity from
0.9182 to 0.9587 and reduces the mean of residuals from 24.9931 to 15.2818; on
the TiO dataset, the proposed method improves the image structure similarity
from 0.9289 to 0.9628 and reduces the mean of residuals from 25.9908 to
17.0194. The optical flow predicted using the proposed method can provide
accurate data for the atmospheric motion information of solar images. The code
implementing the proposed method is available on
https://github.com/lygmsy123/transverse-velocity-field-measurement.
Comment: 14 pages, 10 figures, 4 tables. Accepted for publication in Research
in Astronomy and Astrophysics.
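The before-and-after figures quoted in this abstract can be turned into relative changes with a few lines of arithmetic. The sketch below is only an illustration: the function and variable names are hypothetical, and the abstract does not say which baseline the "from" values belong to, so the percentages describe the quoted pairs and nothing more.

```python
def relative_change(before, after):
    """Signed relative change from `before` to `after`, as a fraction."""
    return (after - before) / before


# (before, after) pairs copied from the abstract; the dictionary layout
# and the key names are assumptions made for this illustration only.
RESULTS = {
    "Ha": {"ssim": (0.9182, 0.9587), "residual": (24.9931, 15.2818)},
    "TiO": {"ssim": (0.9289, 0.9628), "residual": (25.9908, 17.0194)},
}


def summarize(results):
    """Return the percentage change of every metric, rounded to 0.1."""
    summary = {}
    for dataset, metrics in results.items():
        summary[dataset] = {
            name: round(100 * relative_change(before, after), 1)
            for name, (before, after) in metrics.items()
        }
    return summary


if __name__ == "__main__":
    for dataset, changes in summarize(RESULTS).items():
        print(dataset, changes)
```

On these inputs the structural similarity rises by roughly 4% on both datasets, while the mean of residuals falls by roughly 39% (Ha) and 35% (TiO).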
Generalized Relation Modeling for Transformer Tracking
Compared with previous two-stream trackers, the recent one-stream tracking
pipeline, which allows earlier interaction between the template and search
region, has achieved a remarkable performance gain. However, existing
one-stream trackers always let the template interact with all parts inside the
search region throughout all the encoder layers. This could potentially lead to
target-background confusion when the extracted feature representations are not
sufficiently discriminative. To alleviate this issue, we propose a generalized
relation modeling method based on adaptive token division. The proposed method
is a generalized formulation of attention-based relation modeling for
Transformer tracking, which inherits the merits of both previous two-stream and
one-stream pipelines whilst enabling more flexible relation modeling by
selecting appropriate search tokens to interact with template tokens. An
attention masking strategy and the Gumbel-Softmax technique are introduced to
facilitate the parallel computation and end-to-end learning of the token
division module. Extensive experiments show that our method is superior to the
two-stream and one-stream pipelines and achieves state-of-the-art performance
on six challenging benchmarks with a real-time running speed.
Comment: Accepted by CVPR 2023. Code and models are publicly available at
https://github.com/Little-Podi/GR
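For readers unfamiliar with the Gumbel-Softmax technique mentioned in this abstract, the following is a minimal, generic sketch of how one relaxed categorical sample is drawn. It is not the authors' token division module: the function names, the temperature `tau`, and the example logits are all assumptions chosen for illustration, and a real tracker would use an autograd framework rather than plain Python lists.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one relaxed one-hot sample from unnormalized logits.

    Lower `tau` pushes the sample closer to a one-hot vector.
    """
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def hard_assignment(probs):
    """Index of the largest probability (the discrete choice)."""
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    random.seed(0)
    sample = gumbel_softmax([2.0, 0.5, -1.0], tau=0.5)
    print(sample, hard_assignment(sample))
```

The soft sample stays differentiable with respect to the logits when written in an autograd framework, which is what makes end-to-end learning of a discrete selection possible; the hard index is what would be used to route a token to one group.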
Scientific Yearbook of the Escola Superior de Tecnologia da Saúde de Lisboa - 2021
It is with great pleasure that we present the latest edition (the 11th) of the Scientific Yearbook of the Escola Superior de Tecnologia da Saúde de Lisboa. As a higher education institution, we are committed to promoting and encouraging scientific research in all areas of knowledge covered by our mission. This publication aims to disseminate all the scientific output produced by the professors, researchers, students, and non-teaching staff of ESTeSL during 2021. The Yearbook thus reflects the hard and dedicated work of our community, which committed itself to producing high-quality scientific content and sharing it with society in the form of books, book chapters, articles published in national and international journals, abstracts of oral communications and posters, as well as the results of first- and second-cycle work. Accordingly, the content of this publication covers a wide variety of topics, from more fundamental themes to applied studies in specific health contexts, reflecting the plurality and diversity of the areas that define ESTeSL and make it unique. We believe that scientific research is a fundamental axis for the development of society, which is why we encourage our students to engage in research and evidence-based practice from the very beginning of their studies at ESTeSL. This publication is an example of the success of those efforts: it is the largest ever, and we are very proud to share the results and findings of our researchers with the scientific community and the general public. We hope that this Yearbook inspires and motivates other students, health professionals, professors, and other collaborators to keep exploring new ideas and contributing to the advancement of science and technology within the body of knowledge specific to the areas that make up ESTeSL.
We thank everyone involved in producing this yearbook and wish you an inspiring and enjoyable read.
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
The robustness of 3D perception systems under natural corruptions from
environments and sensors is pivotal for safety-critical applications. Existing
large-scale 3D perception datasets often contain data that are meticulously
cleaned. Such configurations, however, cannot reflect the reliability of
perception models during the deployment stage. In this work, we present Robo3D,
the first comprehensive benchmark heading toward probing the robustness of 3D
detectors and segmentors under out-of-distribution scenarios against natural
corruptions that occur in real-world environments. Specifically, we consider
eight corruption types stemming from adverse weather conditions, external
disturbances, and internal sensor failure. We uncover that, although promising
results have been progressively achieved on standard benchmarks,
state-of-the-art 3D perception models are at risk of being vulnerable to
corruptions. We draw key observations on the use of data representations,
augmentation schemes, and training strategies that could severely affect the
model's performance. To pursue better robustness, we propose a
density-insensitive training framework along with a simple flexible
voxelization strategy to enhance the model resiliency. We hope our benchmark
and approach could inspire future research in designing more robust and
reliable 3D perception models. Our robustness benchmark suite is publicly
available.
Comment: 33 pages, 26 figures, 26 tables; code at
https://github.com/ldkong1205/Robo3D; project page at
https://ldkong.com/Robo3
Deciphering multiple sclerosis disability with deep learning attention maps on clinical MRI
Keywords: Deep learning; Disability; Structural MRI
The application of convolutional neural networks (CNNs) to MRI data has emerged as a promising approach to achieving unprecedented levels of accuracy when predicting the course of neurological conditions, including multiple sclerosis, by means of extracting image features not detectable through conventional methods. Additionally, the study of CNN-derived attention maps, which indicate the most relevant anatomical features for CNN-based decisions, has the potential to uncover key disease mechanisms leading to disability accumulation.
From a cohort of patients prospectively followed up after a first demyelinating attack, we selected those with T1-weighted and T2-FLAIR brain MRI sequences available for image analysis and a clinical assessment performed within the following six months (N = 319). Patients were divided into two groups according to expanded disability status scale (EDSS) score: ≥3.0 and < 3.0. A 3D-CNN model predicted the class using whole-brain MRI scans as input. A comparison with a logistic regression (LR) model using volumetric measurements as explanatory variables and a validation of the CNN model on an independent dataset with similar characteristics (N = 440) were also performed. The layer-wise relevance propagation method was used to obtain individual attention maps.
The CNN model achieved a mean accuracy of 79% and proved to be superior to the equivalent LR model (77%). Additionally, the model was successfully validated in the independent external cohort without any re-training (accuracy = 71%). Attention-map analyses revealed the predominant role of frontotemporal cortex and cerebellum for CNN decisions, suggesting that the mechanisms leading to disability accrual exceed the mere presence of brain lesions or atrophy and probably involve how damage is distributed in the central nervous system.
MS PATHS is funded by Biogen. This study has been possible thanks to a Junior Leader La Caixa Fellowship awarded to C. Tur (fellowship code is LCF/BQ/PI20/11760008) by “la Caixa” Foundation (ID 100010434). The salaries of C. Tur and Ll. Coll are covered by this award.