14,701 research outputs found
Rehabilitation Exercise Repetition Segmentation and Counting using Skeletal Body Joints
Physical exercise is an essential component of rehabilitation programs that
improve quality of life and reduce mortality and re-hospitalization rates. In
AI-driven virtual rehabilitation programs, patients complete their exercises
independently at home, while AI algorithms analyze the exercise data to provide
feedback to patients and report their progress to clinicians. To analyze
exercise data, the first step is to segment it into consecutive repetitions.
There has been a significant amount of research performed on segmenting and
counting the repetitive activities of healthy individuals using raw video data,
which raises concerns regarding privacy and is computationally intensive.
Previous research on patients' rehabilitation exercise segmentation relied on
data collected by multiple wearable sensors, which are difficult to use at home
by rehabilitation patients. Compared to healthy individuals, segmenting and
counting exercise repetitions in patients is more challenging because of the
irregular repetition duration and the variation between repetitions. This paper
presents a novel approach for segmenting and counting the repetitions of
rehabilitation exercises performed by patients, based on their skeletal body
joints. Skeletal body joints can be acquired through depth cameras or computer
vision techniques applied to RGB videos of patients. Various sequential neural
networks are designed to analyze the sequences of skeletal body joints and
perform repetition segmentation and counting. Extensive experiments on three
publicly available rehabilitation exercise datasets, KIMORE, UI-PRMD, and
IntelliRehabDS, demonstrate the superiority of the proposed method compared to
previous methods. The proposed method enables accurate exercise analysis while
preserving privacy, facilitating the effective delivery of virtual
rehabilitation programs.Comment: 8 pages, 1 figure, 2 table
Multi-modal Facial Affective Analysis based on Masked Autoencoder
Human affective behavior analysis focuses on analyzing human expressions or
other behaviors to enhance the understanding of human psychology. The CVPR 2023
Competition on Affective Behavior Analysis in-the-wild (ABAW) is dedicated to
providing high-quality and large-scale Aff-wild2 for the recognition of
commonly used emotion representations, such as Action Units (AU), basic
expression categories(EXPR), and Valence-Arousal (VA). The competition is
committed to making significant strides in improving the accuracy and
practicality of affective analysis research in real-world scenarios. In this
paper, we introduce our submission to the CVPR 2023: ABAW5. Our approach
involves several key components. First, we utilize the visual information from
a Masked Autoencoder(MAE) model that has been pre-trained on a large-scale face
image dataset in a self-supervised manner. Next, we finetune the MAE encoder on
the image frames from the Aff-wild2 for AU, EXPR and VA tasks, which can be
regarded as a static and uni-modal training. Additionally, we leverage the
multi-modal and temporal information from the videos and implement a
transformer-based framework to fuse the multi-modal features. Our approach
achieves impressive results in the ABAW5 competition, with an average F1 score
of 55.49\% and 41.21\% in the AU and EXPR tracks, respectively, and an average
CCC of 0.6372 in the VA track. Our approach ranks first in the EXPR and AU
tracks, and second in the VA track. Extensive quantitative experiments and
ablation studies demonstrate the effectiveness of our proposed method
Enhancing Low-resolution Face Recognition with Feature Similarity Knowledge Distillation
In this study, we introduce a feature knowledge distillation framework to
improve low-resolution (LR) face recognition performance using knowledge
obtained from high-resolution (HR) images. The proposed framework transfers
informative features from an HR-trained network to an LR-trained network by
reducing the distance between them. A cosine similarity measure was employed as
a distance metric to effectively align the HR and LR features. This approach
differs from conventional knowledge distillation frameworks, which use the L_p
distance metrics and offer the advantage of converging well when reducing the
distance between features of different resolutions. Our framework achieved a 3%
improvement over the previous state-of-the-art method on the AgeDB-30 benchmark
without bells and whistles, while maintaining a strong performance on HR
images. The effectiveness of cosine similarity as a distance metric was
validated through statistical analysis, making our approach a promising
solution for real-world applications in which LR images are frequently
encountered. The code and pretrained models are publicly available on
https://github.com/gist-ailab/feature-similarity-KD
BotMoE: Twitter Bot Detection with Community-Aware Mixtures of Modal-Specific Experts
Twitter bot detection has become a crucial task in efforts to combat online
misinformation, mitigate election interference, and curb malicious propaganda.
However, advanced Twitter bots often attempt to mimic the characteristics of
genuine users through feature manipulation and disguise themselves to fit in
diverse user communities, posing challenges for existing Twitter bot detection
models. To this end, we propose BotMoE, a Twitter bot detection framework that
jointly utilizes multiple user information modalities (metadata, textual
content, network structure) to improve the detection of deceptive bots.
Furthermore, BotMoE incorporates a community-aware Mixture-of-Experts (MoE)
layer to improve domain generalization and adapt to different Twitter
communities. Specifically, BotMoE constructs modal-specific encoders for
metadata features, textual content, and graphical structure, which jointly
model Twitter users from three modal-specific perspectives. We then employ a
community-aware MoE layer to automatically assign users to different
communities and leverage the corresponding expert networks. Finally, user
representations from metadata, text, and graph perspectives are fused with an
expert fusion layer, combining all three modalities while measuring the
consistency of user information. Extensive experiments demonstrate that BotMoE
significantly advances the state-of-the-art on three Twitter bot detection
benchmarks. Studies also confirm that BotMoE captures advanced and evasive
bots, alleviates the reliance on training data, and better generalizes to new
and previously unseen user communities.Comment: Accepted at SIGIR 202
Human Semantic Segmentation using Millimeter-Wave Radar Sparse Point Clouds
This paper presents a framework for semantic segmentation on sparse
sequential point clouds of millimeter-wave radar. Compared with cameras and
lidars, millimeter-wave radars have the advantage of not revealing privacy,
having a strong anti-interference ability, and having long detection distance.
The sparsity and capturing temporal-topological features of mmWave data is
still a problem. However, the issue of capturing the temporal-topological
coupling features under the human semantic segmentation task prevents previous
advanced segmentation methods (e.g PointNet, PointCNN, Point Transformer) from
being well utilized in practical scenarios. To address the challenge caused by
the sparsity and temporal-topological feature of the data, we (i) introduce
graph structure and topological features to the point cloud, (ii) propose a
semantic segmentation framework including a global feature-extracting module
and a sequential feature-extracting module. In addition, we design an efficient
and more fitting loss function for a better training process and segmentation
results based on graph clustering. Experimentally, we deploy representative
semantic segmentation algorithms (Transformer, GCNN, etc.) on a custom dataset.
Experimental results indicate that our model achieves mean accuracy on the
custom dataset by and outperforms the state-of-the-art
algorithms. Moreover, to validate the model's robustness, we deploy our model
on the well-known S3DIS dataset. On the S3DIS dataset, our model achieves mean
accuracy by , outperforming baseline algorithms
Copy-paste data augmentation for domain transfer on traffic signs
City streets carry a lot of information that can be exploited to improve the quality of the services the citizens receive. For example, autonomous vehicles need to act accordingly to all the element that are nearby the vehicle itself, like pedestrians, traffic signs and other vehicles. It is also possible to use such information for smart city applications, for example to predict and analyze the traffic or pedestrian flows.
Among all the objects that it is possible to find in a street, traffic signs are very important because of the information they carry. This information can in fact be exploited both for autonomous driving and for smart city applications. Deep learning and, more generally, machine learning models however need huge quantities to learn. Even though modern models are very good at gener- alizing, the more samples the model has, the better it can generalize between different samples.
Creating these datasets organically, namely with real pictures, is a very tedious task because of the wide variety of signs available in the whole world and especially because of all the possible light, orientation conditions and con- ditions in general in which they can appear. In addition to that, it may not be easy to collect enough samples for all the possible traffic signs available, cause some of them may be very rare to find.
Instead of collecting pictures manually, it is possible to exploit data aug- mentation techniques to create synthetic datasets containing the signs that are needed. Creating this data synthetically allows to control the distribution and the conditions of the signs in the datasets, improving the quality and quantity of training data that is going to be used. This thesis work is about using copy-paste data augmentation to create synthetic data for the traffic sign recognition task
Kurcuma: a kitchen utensil recognition collection for unsupervised domain adaptation
The use of deep learning makes it possible to achieve extraordinary results in all kinds of tasks related to computer vision. However, this performance is strongly related to the availability of training data and its relationship with the distribution in the eventual application scenario. This question is of vital importance in areas such as robotics, where the targeted environment data are barely available in advance. In this context, domain adaptation (DA) techniques are especially important to building models that deal with new data for which the corresponding label is not available. To promote further research in DA techniques applied to robotics, this work presents Kurcuma (Kitchen Utensil Recognition Collection for Unsupervised doMain Adaptation), an assortment of seven datasets for the classification of kitchen utensils—a task of relevance in home-assistance robotics and a suitable showcase for DA. Along with the data, we provide a broad description of the main characteristics of the dataset, as well as a baseline using the well-known domain-adversarial training of neural networks approach. The results show the challenge posed by DA on these types of tasks, pointing to the need for new approaches in future work.Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work was supported by the I+D+i project TED2021-132103A-I00 (DOREMI), funded by MCIN/AEI/10.13039/501100011033. Some of the computing resources were provided by the Generalitat Valenciana and the European Union through the FEDER funding program (IDIFEDER/2020/003). The second author is supported by grant APOSTD/2020/256 from “Programa I+D+i de la Generalitat Valenciana”
Exploring the Training Factors that Influence the Role of Teaching Assistants to Teach to Students With SEND in a Mainstream Classroom in England
With the implementation of inclusive education having become increasingly valued over the years, the training of Teaching Assistants (TAs) is now more important than ever, given that they work alongside pupils with special educational needs and disabilities (hereinafter SEND) in mainstream education classrooms. The current study explored the training factors that influence the role of TAs when it comes to teaching SEND students in mainstream classrooms in England during their one-year training period. This work aimed to increase understanding of how the training of TAs is seen to influence the development of their personal knowledge and professional skills. The study has significance for our comprehension of the connection between the TAs’ training and the quality of education in the classroom. In addition, this work investigated whether there existed a correlation between the teaching experience of TAs and their background information, such as their gender, age, grade level taught, years of teaching experience, and qualification level.
A critical realist theoretical approach was adopted for this two-phased study, which involved the mixing of adaptive and grounded theories respectively. The multi-method project featured 13 case studies, each of which involved a trainee TA, his/her college tutor, and the classroom teacher who was supervising the trainee TA. The analysis was based on using semi-structured interviews, various questionnaires, and non-participant observation methods for each of these case studies during the TA’s one-year training period. The primary analysis of the research was completed by comparing the various kinds of data collected from the participants in the first and second data collection stages of each case. Further analysis involved cross-case analysis using a grounded theory approach, which made it possible to draw conclusions and put forth several core propositions. Compared with previous research, the findings of the current study reveal many implications for the training and deployment conditions of TAs, while they also challenge the prevailing approaches in many aspects, in addition to offering more diversified, enriched, and comprehensive explanations of the critical pedagogical issues
Shuffled ATG8 interacting motifs form an ancestral bridge between UFMylation and autophagy
UFMylation involves the covalent modification of substrate proteins with UFM1 (Ubiquitin‐fold modifier 1) and is important for maintaining ER homeostasis. Stalled translation triggers the UFMylation of ER‐bound ribosomes and activates C53‐mediated autophagy to clear toxic polypeptides. C53 contains noncanonical shuffled ATG8‐interacting motifs (sAIMs) that are essential for ATG8 interaction and autophagy initiation. However, the mechanistic basis of sAIM‐mediated ATG8 interaction remains unknown. Here, we show that C53 and sAIMs are conserved across eukaryotes but secondarily lost in fungi and various algal lineages. Biochemical assays showed that the unicellular alga Chlamydomonas reinhardtii has a functional UFMylation pathway, refuting the assumption that UFMylation is linked to multicellularity. Comparative structural analyses revealed that both UFM1 and ATG8 bind sAIMs in C53, but in a distinct way. Conversion of sAIMs into canonical AIMs impaired binding of C53 to UFM1, while strengthening ATG8 binding. Increased ATG8 binding led to the autoactivation of the C53 pathway and sensitization of Arabidopsis thaliana to ER stress. Altogether, our findings reveal an ancestral role of sAIMs in UFMylation‐dependent fine‐tuning of C53‐mediated autophagy activation
Countermeasures for the majority attack in blockchain distributed systems
La tecnología Blockchain es considerada como uno de los paradigmas informáticos más importantes posterior al Internet; en función a sus características únicas que la hacen ideal para registrar, verificar y administrar información de diferentes transacciones. A pesar de esto, Blockchain se enfrenta a diferentes problemas de seguridad, siendo el ataque del 51% o ataque mayoritario uno de los más importantes. Este consiste en que uno o más mineros tomen el control de al menos el 51% del Hash extraído o del cómputo en una red; de modo que un minero puede manipular y modificar arbitrariamente la información registrada en esta tecnología. Este trabajo se enfocó en diseñar e implementar estrategias de detección y mitigación de ataques mayoritarios (51% de ataque) en un sistema distribuido Blockchain, a partir de la caracterización del comportamiento de los mineros. Para lograr esto, se analizó y evaluó el Hash Rate / Share de los mineros de Bitcoin y Crypto Ethereum, seguido del diseño e implementación de un protocolo de consenso para controlar el poder de cómputo de los mineros. Posteriormente, se realizó la exploración y evaluación de modelos de Machine Learning para detectar software malicioso de tipo Cryptojacking.DoctoradoDoctor en Ingeniería de Sistemas y Computació
- …