Rehabilitation Exercise Repetition Segmentation and Counting using Skeletal Body Joints
Physical exercise is an essential component of rehabilitation programs that
improve quality of life and reduce mortality and re-hospitalization rates. In
AI-driven virtual rehabilitation programs, patients complete their exercises
independently at home, while AI algorithms analyze the exercise data to provide
feedback to patients and report their progress to clinicians. To analyze
exercise data, the first step is to segment it into consecutive repetitions.
There has been a significant amount of research performed on segmenting and
counting the repetitive activities of healthy individuals using raw video data,
which raises concerns regarding privacy and is computationally intensive.
Previous research on patients' rehabilitation exercise segmentation relied on
data collected by multiple wearable sensors, which are difficult to use at home
by rehabilitation patients. Compared to healthy individuals, segmenting and
counting exercise repetitions in patients is more challenging because of the
irregular repetition duration and the variation between repetitions. This paper
presents a novel approach for segmenting and counting the repetitions of
rehabilitation exercises performed by patients, based on their skeletal body
joints. Skeletal body joints can be acquired through depth cameras or computer
vision techniques applied to RGB videos of patients. Various sequential neural
networks are designed to analyze the sequences of skeletal body joints and
perform repetition segmentation and counting. Extensive experiments on three
publicly available rehabilitation exercise datasets, KIMORE, UI-PRMD, and
IntelliRehabDS, demonstrate the superiority of the proposed method compared to
previous methods. The proposed method enables accurate exercise analysis while
preserving privacy, facilitating the effective delivery of virtual
rehabilitation programs. Comment: 8 pages, 1 figure, 2 tables
Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification
Generalizable person re-identification (Re-ID) is an active research topic
in machine learning and computer vision, and plays a significant role in
realistic scenarios owing to its applications in public security and
video surveillance. However, previous methods mainly focus on visual
representation learning and neglect the potential of semantic
features during training, which easily leads to poor generalization
when adapted to a new domain. In this paper, we propose a Multi-Modal
Equivalent Transformer called MMET for more robust visual-semantic embedding
learning on visual, textual and visual-textual tasks respectively. To further
enhance robust feature learning in the transformer setting, a dynamic
masking mechanism called the Masked Multimodal Modeling strategy (MMM) is
introduced to mask both the image patches and the text tokens; it can
work jointly on multimodal or unimodal data and significantly boosts the
performance of generalizable person Re-ID. Extensive experiments on benchmark
datasets demonstrate the competitive performance of our method compared with
previous approaches. We hope this method can advance research on
visual-semantic representation learning. Our source code is also publicly
available at https://github.com/JeremyXSC/MMET
CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities
Sleep abnormalities can have severe health consequences. Automated sleep
staging, i.e. labelling the sequence of sleep stages from the patient's
physiological recordings, could simplify the diagnostic process. Previous work
on automated sleep staging has achieved great results, mainly relying on the
EEG signal. However, often multiple sources of information are available beyond
EEG. This can be particularly beneficial when the EEG recordings are noisy or
even missing completely. In this paper, we propose CoRe-Sleep, a Coordinated
Representation multimodal fusion network that is particularly focused on
improving the robustness of signal analysis on imperfect data. We demonstrate
how appropriately handling multimodal information can be the key to achieving
such robustness. CoRe-Sleep tolerates noisy or missing modality segments,
allowing training on incomplete data. Additionally, it shows state-of-the-art
performance when testing on both multimodal and unimodal data using a single
model on SHHS-1, the largest publicly available study that includes sleep stage
labels. The results indicate that training the model on multimodal data
positively influences performance when tested on unimodal data. This work aims
at bridging the gap between automated analysis tools and their clinical
utility. Comment: 10 pages, 4 figures, 2 tables, journa
Copy-paste data augmentation for domain transfer on traffic signs
City streets carry a lot of information that can be exploited to improve the quality of the services citizens receive. For example, autonomous vehicles need to react appropriately to all the elements near the vehicle, such as pedestrians, traffic signs and other vehicles. It is also possible to use such information for smart city applications, for example to predict and analyze traffic or pedestrian flows.
Among all the objects that can be found in a street, traffic signs are very important because of the information they carry. This information can be exploited both for autonomous driving and for smart city applications. Deep learning and, more generally, machine learning models, however, need huge quantities of data to learn. Even though modern models are very good at generalizing, the more samples the model has, the better it can generalize between different samples.
Creating these datasets organically, namely with real pictures, is a very tedious task because of the wide variety of signs in use around the world and especially because of the many lighting, orientation and other conditions in which they can appear. In addition, it may not be easy to collect enough samples for every traffic sign, because some of them are very rare.
Instead of collecting pictures manually, it is possible to exploit data augmentation techniques to create synthetic datasets containing the signs that are needed. Creating this data synthetically makes it possible to control the distribution and the conditions of the signs in the datasets, improving the quality and quantity of the training data. This thesis work is about using copy-paste data augmentation to create synthetic data for the traffic sign recognition task
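As a rough illustration of the copy-paste idea described in this abstract, the sketch below pastes a sign crop onto a background at a random position and returns the bounding box that would serve as the detection label. It is a minimal sketch under stated assumptions, not the thesis pipeline: the grayscale nested-list image format, the `paste_sign` name and the `(left, top, width, height)` box layout are all hypothetical choices made for the example.

```python
import random
from typing import List, Tuple

# Assumption: a grayscale image is a list of rows of integer pixel values.
Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, width, height), hypothetical layout


def paste_sign(background: Image, sign: Image, rng: random.Random) -> Tuple[Image, Box]:
    """Paste `sign` onto a copy of `background` at a random position.

    Returns the augmented image and the bounding box of the pasted sign.
    """
    bg_h, bg_w = len(background), len(background[0])
    s_h, s_w = len(sign), len(sign[0])
    if s_h > bg_h or s_w > bg_w:
        raise ValueError("sign does not fit inside the background")

    # Sample the top-left corner so that the sign stays fully inside the image.
    top = rng.randint(0, bg_h - s_h)
    left = rng.randint(0, bg_w - s_w)

    # Copy every row so the caller's background is left untouched.
    augmented = [row[:] for row in background]
    for dy in range(s_h):
        augmented[top + dy][left:left + s_w] = sign[dy]

    return augmented, (left, top, s_w, s_h)
```

Calling `paste_sign(background, sign, random.Random(0))` repeatedly with different sign crops and backgrounds would give control over which sign classes appear and how often, which is the property the abstract highlights. A realistic pipeline would presumably also vary scale, lighting and blending, which this sketch leaves out.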
LMDA-Net: A lightweight multi-dimensional attention network for general EEG-based brain-computer interface paradigms and interpretability
EEG-based recognition of activities and states involves the use of prior
neuroscience knowledge to generate quantitative EEG features, which may limit
BCI performance. Although neural network-based methods can effectively extract
features, they often encounter issues such as poor generalization across
datasets, high predicting volatility, and low model interpretability. Hence, we
propose a novel lightweight multi-dimensional attention network, called
LMDA-Net. By incorporating two novel attention modules designed specifically
for EEG signals, the channel attention module and the depth attention module,
LMDA-Net can effectively integrate features from multiple dimensions, resulting
in improved classification performance across various BCI tasks. LMDA-Net was
evaluated on four high-impact public datasets, including motor imagery (MI) and
P300-Speller paradigms, and was compared with other representative models. The
experimental results demonstrate that LMDA-Net outperforms other representative
methods in terms of classification accuracy and predicting volatility,
achieving the highest accuracy in all datasets within 300 training epochs.
Ablation experiments further confirm the effectiveness of the channel attention
module and the depth attention module. To facilitate an in-depth understanding
of the features extracted by LMDA-Net, we propose class-specific neural network
feature interpretability algorithms that are suitable for event-related
potentials (ERPs) and event-related desynchronization/synchronization
(ERD/ERS). By mapping the output of the specific layer of LMDA-Net to the time
or spatial domain through class activation maps, the resulting feature
visualizations can provide interpretable analysis and establish connections
with EEG time-spatial analysis in neuroscience. In summary, LMDA-Net shows
great potential as a general online decoding model for various EEG tasks. Comment: 20 pages, 7 figures
Examples of works to practice staccato technique in clarinet instrument
The stages of strengthening staccato technique on the clarinet were put into practice through work on
musical pieces. Rhythm and nuance exercises that speed up staccato passages are included. The most
important aim of the study is to address not only staccato practice but also the precision of
simultaneous finger-tongue coordination. To make staccato practice more efficient, étude work is
also included within the work on pieces. Meticulous attention to the exercises, together with the
inspiring effect of staccato practice, added a new dimension to musical identity. Each stage of the
eight original pieces is described. Each stage is designed to strengthen the performance and
technique of the next. The study reports the areas in which staccato technique is used and the
results that were obtained. It plans how the notes are shaped through finger and tongue coordination
and what kind of practice discipline this requires. Reed, note, diaphragm, finger, tongue, nuance
and discipline were found to form an inseparable whole in staccato technique. A literature review
was conducted to survey studies on staccato. The review found that piece-based staccato studies used
in clarinet technique are scarce. A review of method books found that étude studies are more
numerous. Accordingly, exercises for speeding up and strengthening clarinet staccato technique are
presented. It was observed that interspersing piece work among staccato études relaxes the mind and
increases motivation. The choice of a suitable reed for staccato practice is also discussed. A
suitable reed was found to increase tonguing speed when practicing staccato technique correctly. The
right choice of reed depends on the reed producing sound easily. If the reed does not provide enough
tonguing power, the need to choose a more suitable reed is emphasized. Performing a piece from
beginning to end in staccato practice can be difficult. In this respect, the study shows that
following the given musical nuances eases tonguing performance. Passing the acquired knowledge and
experience on to future generations, and developing it further, is encouraged. The study explains
how forthcoming pieces can be worked out and how staccato technique can be mastered. The aim is to
resolve staccato technique in a shorter time. It is as important to commit the exercises to memory
as it is to teach the fingers their positions. The work that results from the determination and
patience shown will raise success to even higher levels
Boosting the Cycle Counting Power of Graph Neural Networks with I-GNNs
Message Passing Neural Networks (MPNNs) are a widely used class of Graph
Neural Networks (GNNs). The limited representational power of MPNNs inspires
the study of provably powerful GNN architectures. However, knowing one model is
more powerful than another gives little insight about what functions they can
or cannot express. It is still unclear whether these models are able to
approximate specific functions such as counting certain graph substructures,
which is essential for applications in biology, chemistry and social network
analysis. Motivated by this, we propose to study the counting power of Subgraph
MPNNs, a recent and popular class of powerful GNN models that extract rooted
subgraphs for each node, assign the root node a unique identifier and encode
the root node's representation within its rooted subgraph. Specifically, we
prove that Subgraph MPNNs fail to count more-than-4-cycles at node level,
implying that node representations cannot correctly encode the surrounding
substructures like ring systems with more than four atoms. To overcome this
limitation, we propose I-GNNs to extend Subgraph MPNNs by assigning
different identifiers for the root node and its neighbors in each subgraph.
I-GNNs' discriminative power is shown to be strictly stronger than Subgraph
MPNNs and partially stronger than the 3-WL test. More importantly, I-GNNs
are proven capable of counting all 3, 4, 5 and 6-cycles, covering common
substructures like benzene rings in organic chemistry, while still keeping
linear complexity. To the best of our knowledge, it is the first linear-time
GNN model that can count 6-cycles with theoretical guarantees. We validate its
counting power in cycle counting tasks and demonstrate its competitive
performance in molecular prediction benchmarks
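To make the node-level counting target in this abstract concrete, the sketch below computes, by brute force, how many simple cycles of a given length pass through a node. It is only an illustration of the quantity being counted, for example as ground-truth labels in a cycle counting task. It is not the proposed GNN, it is exponential in the worst case, and the names `make_graph` and `count_cycles_through` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[int, Set[int]]  # undirected graph as an adjacency-set dictionary


def make_graph(edges: Iterable[Tuple[int, int]]) -> Graph:
    """Build an undirected adjacency-set graph from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def count_cycles_through(graph: Graph, root: int, length: int) -> int:
    """Count simple cycles with exactly `length` edges that contain `root`."""
    if length < 3:
        raise ValueError("a simple cycle needs at least 3 edges")

    def extend(node: int, visited: frozenset, remaining: int) -> int:
        # `remaining` is the number of nodes still to add before closing the cycle.
        if remaining == 0:
            return 1 if root in graph[node] else 0
        return sum(
            extend(nxt, visited | {nxt}, remaining - 1)
            for nxt in graph[node]
            if nxt not in visited
        )

    # Every cycle is found twice, once per traversal direction.
    return extend(root, frozenset({root}), length - 1) // 2
```

On a six-node ring such as the carbon skeleton of benzene, built with `make_graph([(i, (i + 1) % 6) for i in range(6)])`, every node lies on exactly one 6-cycle and on no 3-, 4- or 5-cycle. Those are the per-node labels a model with the counting power described above would need to reproduce.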
Bridging technology and educational psychology: an exploration of individual differences in technology-assisted language learning within an Algerian EFL setting
The implementation of technology in language learning and teaching has a great influence on the teaching and learning process as a whole, and its impact on the learners’ psychological state seems of paramount significance, since it could be either an aid or a barrier to students’ academic performance. This thesis therefore explores individual learner differences in technology-assisted language learning (TALL) and when using educational technologies in higher education within an Algerian English as a Foreign Language (EFL) setting.
Although I initially intended to investigate the relationship between TALL and certain affective variables, mainly motivation, anxiety, self-confidence, and learning styles inside the classroom, the collection and analysis of data shifted my focus to a holistic view of individual learner differences in TALL environments and when using educational technologies within and beyond the classroom. In an attempt to bridge technology and educational psychology, this ethnographic case study considers the nature of the impact of technology integration in language teaching and learning on the psychology of individual language learners inside and outside the classroom. The study considers the reality constructed by participants and reveals multiple and distinctive views about the relationship between the use of educational technologies in higher education and individual learner differences. It took place in a university in the north-west of Algeria and involved 27 main and secondary student and teacher participants. It consisted of focus-group discussions, follow-up discussions, teachers’ interviews, learners’ diaries, observation, and field notes. It was initially conducted within the classroom but gradually expanded to other settings outside the classroom depending on the availability of participants, their actions, and activities.
The study indicates that the impact of technology integration in EFL learning on individual learner differences is both complex and dynamic. It is complex in the sense that it is shown in multiple aspects and reflected on the students and their differences. In addition to various positive and negative influences of different technology uses and the different psychological reactions among students to the same technology scenario, the study reveals the unrecognised different manifestations of similar psychological traits in the same ELT technology scenario. It is also dynamic since it is characterised by constant change according to contextual approaches to and practical realities of technology integration in language teaching and learning in the setting, including discrepancies between students’ attitudes and teachers’ actions, mismatches between technological experiences inside and outside the classroom, local concerns and generalised beliefs about TALL in the context, and the rapid and unplanned shift to online educational delivery during the Covid-19 pandemic.
The study may therefore be of interest, not only to Algerian teachers and students, but also to academics and institutions in other contexts through considering the complex and dynamic impact of TALL and technology integration in higher education on individual differences, and to academics in similar low-resource contexts by undertaking a context approach to technology integration
A Decision Support System for Economic Viability and Environmental Impact Assessment of Vertical Farms
Vertical farming (VF) is the practice of growing crops or animals using the vertical dimension via multi-tier racks or vertically inclined surfaces. In this thesis, I focus on the emerging industry of plant-specific VF. Vertical plant farming (VPF) is a promising and relatively novel practice that can be conducted in buildings with environmental control and artificial lighting. However, the nascent sector has experienced challenges in economic viability, standardisation, and environmental sustainability. Practitioners and academics call for a comprehensive financial analysis of VPF, but efforts are stifled by a lack of valid and available data.
A review of economic estimation and horticultural software identifies a need for a decision support system (DSS) that facilitates risk-empowered business planning for vertical farmers. This thesis proposes an open-source DSS framework to evaluate business sustainability through financial risk and environmental impact assessments. Data from the literature, alongside lessons learned from industry practitioners, would be centralised in the proposed DSS using imprecise data techniques. These techniques have been applied in engineering but are seldom used in financial forecasting. This could benefit complex sectors which only have scarce data to predict business viability.
To begin the execution of the DSS framework, VPF practitioners were interviewed using a mixed-methods approach. Learnings from over 19 shuttered and operational VPF projects provide insights into the barriers inhibiting scalability and identify risks to form a risk taxonomy. Labour was the most commonly reported top challenge. Therefore, research was conducted to explore lean principles to improve productivity.
A probabilistic model representing a spectrum of variables and their associated uncertainty was built according to the DSS framework to evaluate the financial risk for VF projects. This enabled flexible computation without precise production or financial data to improve economic estimation accuracy. The model assessed two VPF cases (one in the UK and another in Japan), demonstrating the first risk and uncertainty quantification of VPF business models in the literature. The results highlighted measures to improve the economic viability of the UK and Japan cases.
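One generic way to propagate uncertain inputs into a financial risk estimate is Monte Carlo simulation, sketched below. This is a swapped-in illustration and not the thesis model, which the abstract says relies on imprecise data techniques. The triangular distributions, the `loss_probability` name and every figure in the usage note are assumptions made for the example.

```python
import random
from typing import Tuple

# Assumption: each uncertain quantity is given as (low, most likely, high).
Triangular = Tuple[float, float, float]


def loss_probability(
    revenue: Triangular,
    cost: Triangular,
    trials: int = 10_000,
    seed: int = 0,
) -> float:
    """Estimate the probability that annual cost exceeds annual revenue."""
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        # random.triangular takes its arguments in the order (low, high, mode).
        sampled_revenue = rng.triangular(revenue[0], revenue[2], revenue[1])
        sampled_cost = rng.triangular(cost[0], cost[2], cost[1])
        if sampled_revenue < sampled_cost:
            losses += 1
    return losses / trials
```

For instance, `loss_probability((8.0, 10.0, 12.0), (9.0, 10.0, 11.0))` estimates how often a hypothetical farm with overlapping revenue and cost ranges would run at a loss. Only rough ranges are needed as inputs, which is in the spirit of computing without precise production or financial data.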
The environmental impact assessment model was developed, allowing VPF operators to evaluate their carbon footprint compared to traditional agriculture using life-cycle assessment. I explore strategies for net-zero carbon production through sensitivity analysis. Renewable energies, especially solar, geothermal, and tidal power, show promise for reducing the carbon emissions of indoor VPF. Results show that renewably-powered VPF can reduce carbon emissions compared to field-based agriculture when considering the land-use change.
The drivers for DSS adoption have been researched, showing a pathway of compliance and design thinking to overcome the ‘problem of implementation’ and enable commercialisation. Further work is suggested to standardise VF equipment, collect benchmarking data, and characterise risks. This work will reduce risk and uncertainty and accelerate the sector’s emergence