10,215 research outputs found

    Rehabilitation Exercise Repetition Segmentation and Counting using Skeletal Body Joints

    Full text link
    Physical exercise is an essential component of rehabilitation programs that improve quality of life and reduce mortality and re-hospitalization rates. In AI-driven virtual rehabilitation programs, patients complete their exercises independently at home, while AI algorithms analyze the exercise data to provide feedback to patients and report their progress to clinicians. To analyze exercise data, the first step is to segment it into consecutive repetitions. There has been a significant amount of research performed on segmenting and counting the repetitive activities of healthy individuals using raw video data, which raises concerns regarding privacy and is computationally intensive. Previous research on patients' rehabilitation exercise segmentation relied on data collected by multiple wearable sensors, which are difficult to use at home by rehabilitation patients. Compared to healthy individuals, segmenting and counting exercise repetitions in patients is more challenging because of the irregular repetition duration and the variation between repetitions. This paper presents a novel approach for segmenting and counting the repetitions of rehabilitation exercises performed by patients, based on their skeletal body joints. Skeletal body joints can be acquired through depth cameras or computer vision techniques applied to RGB videos of patients. Various sequential neural networks are designed to analyze the sequences of skeletal body joints and perform repetition segmentation and counting. Extensive experiments on three publicly available rehabilitation exercise datasets, KIMORE, UI-PRMD, and IntelliRehabDS, demonstrate the superiority of the proposed method compared to previous methods. The proposed method enables accurate exercise analysis while preserving privacy, facilitating the effective delivery of virtual rehabilitation programs.Comment: 8 pages, 1 figure, 2 table

    Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification

    Full text link
    Generalizable person re-identification (Re-ID) is a very hot research topic in machine learning and computer vision, which plays a significant role in realistic scenarios due to its various applications in public security and video surveillance. However, previous methods mainly focus on the visual representation learning, while neglect to explore the potential of semantic features during training, which easily leads to poor generalization capability when adapted to the new domain. In this paper, we propose a Multi-Modal Equivalent Transformer called MMET for more robust visual-semantic embedding learning on visual, textual and visual-textual tasks respectively. To further enhance the robust feature learning in the context of transformer, a dynamic masking mechanism called Masked Multimodal Modeling strategy (MMM) is introduced to mask both the image patches and the text tokens, which can jointly works on multimodal or unimodal data and significantly boost the performance of generalizable person Re-ID. Extensive experiments on benchmark datasets demonstrate the competitive performance of our method over previous approaches. We hope this method could advance the research towards visual-semantic representation learning. Our source code is also publicly available at https://github.com/JeremyXSC/MMET

    CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities

    Full text link
    Sleep abnormalities can have severe health consequences. Automated sleep staging, i.e. labelling the sequence of sleep stages from the patient's physiological recordings, could simplify the diagnostic process. Previous work on automated sleep staging has achieved great results, mainly relying on the EEG signal. However, often multiple sources of information are available beyond EEG. This can be particularly beneficial when the EEG recordings are noisy or even missing completely. In this paper, we propose CoRe-Sleep, a Coordinated Representation multimodal fusion network that is particularly focused on improving the robustness of signal analysis on imperfect data. We demonstrate how appropriately handling multimodal information can be the key to achieving such robustness. CoRe-Sleep tolerates noisy or missing modalities segments, allowing training on incomplete data. Additionally, it shows state-of-the-art performance when testing on both multimodal and unimodal data using a single model on SHHS-1, the largest publicly available study that includes sleep stage labels. The results indicate that training the model on multimodal data does positively influence performance when tested on unimodal data. This work aims at bridging the gap between automated analysis tools and their clinical utility.Comment: 10 pages, 4 figures, 2 tables, journa

    Copy-paste data augmentation for domain transfer on traffic signs

    Get PDF
    City streets carry a lot of information that can be exploited to improve the quality of the services the citizens receive. For example, autonomous vehicles need to act accordingly to all the element that are nearby the vehicle itself, like pedestrians, traffic signs and other vehicles. It is also possible to use such information for smart city applications, for example to predict and analyze the traffic or pedestrian flows. Among all the objects that it is possible to find in a street, traffic signs are very important because of the information they carry. This information can in fact be exploited both for autonomous driving and for smart city applications. Deep learning and, more generally, machine learning models however need huge quantities to learn. Even though modern models are very good at gener- alizing, the more samples the model has, the better it can generalize between different samples. Creating these datasets organically, namely with real pictures, is a very tedious task because of the wide variety of signs available in the whole world and especially because of all the possible light, orientation conditions and con- ditions in general in which they can appear. In addition to that, it may not be easy to collect enough samples for all the possible traffic signs available, cause some of them may be very rare to find. Instead of collecting pictures manually, it is possible to exploit data aug- mentation techniques to create synthetic datasets containing the signs that are needed. Creating this data synthetically allows to control the distribution and the conditions of the signs in the datasets, improving the quality and quantity of training data that is going to be used. This thesis work is about using copy-paste data augmentation to create synthetic data for the traffic sign recognition task

    LMDA-Net:A lightweight multi-dimensional attention network for general EEG-based brain-computer interface paradigms and interpretability

    Full text link
    EEG-based recognition of activities and states involves the use of prior neuroscience knowledge to generate quantitative EEG features, which may limit BCI performance. Although neural network-based methods can effectively extract features, they often encounter issues such as poor generalization across datasets, high predicting volatility, and low model interpretability. Hence, we propose a novel lightweight multi-dimensional attention network, called LMDA-Net. By incorporating two novel attention modules designed specifically for EEG signals, the channel attention module and the depth attention module, LMDA-Net can effectively integrate features from multiple dimensions, resulting in improved classification performance across various BCI tasks. LMDA-Net was evaluated on four high-impact public datasets, including motor imagery (MI) and P300-Speller paradigms, and was compared with other representative models. The experimental results demonstrate that LMDA-Net outperforms other representative methods in terms of classification accuracy and predicting volatility, achieving the highest accuracy in all datasets within 300 training epochs. Ablation experiments further confirm the effectiveness of the channel attention module and the depth attention module. To facilitate an in-depth understanding of the features extracted by LMDA-Net, we propose class-specific neural network feature interpretability algorithms that are suitable for event-related potentials (ERPs) and event-related desynchronization/synchronization (ERD/ERS). By mapping the output of the specific layer of LMDA-Net to the time or spatial domain through class activation maps, the resulting feature visualizations can provide interpretable analysis and establish connections with EEG time-spatial analysis in neuroscience. In summary, LMDA-Net shows great potential as a general online decoding model for various EEG tasks.Comment: 20 pages, 7 Figure

    Examples of works to practice staccato technique in clarinet instrument

    Get PDF
    Klarnetin staccato tekniğini güçlendirme aşamaları eser çalışmalarıyla uygulanmıştır. Staccato geçişlerini hızlandıracak ritim ve nüans çalışmalarına yer verilmiştir. Çalışmanın en önemli amacı sadece staccato çalışması değil parmak-dilin eş zamanlı uyumunun hassasiyeti üzerinde de durulmasıdır. Staccato çalışmalarını daha verimli hale getirmek için eser çalışmasının içinde etüt çalışmasına da yer verilmiştir. Çalışmaların üzerinde titizlikle durulması staccato çalışmasının ilham verici etkisi ile müzikal kimliğe yeni bir boyut kazandırmıştır. Sekiz özgün eser çalışmasının her aşaması anlatılmıştır. Her aşamanın bir sonraki performans ve tekniği güçlendirmesi esas alınmıştır. Bu çalışmada staccato tekniğinin hangi alanlarda kullanıldığı, nasıl sonuçlar elde edildiği bilgisine yer verilmiştir. Notaların parmak ve dil uyumu ile nasıl şekilleneceği ve nasıl bir çalışma disiplini içinde gerçekleşeceği planlanmıştır. Kamış-nota-diyafram-parmak-dil-nüans ve disiplin kavramlarının staccato tekniğinde ayrılmaz bir bütün olduğu saptanmıştır. Araştırmada literatür taraması yapılarak staccato ile ilgili çalışmalar taranmıştır. Tarama sonucunda klarnet tekniğin de kullanılan staccato eser çalışmasının az olduğu tespit edilmiştir. Metot taramasında da etüt çalışmasının daha çok olduğu saptanmıştır. Böylelikle klarnetin staccato tekniğini hızlandırma ve güçlendirme çalışmaları sunulmuştur. Staccato etüt çalışmaları yapılırken, araya eser çalışmasının girmesi beyni rahatlattığı ve istekliliği daha arttırdığı gözlemlenmiştir. Staccato çalışmasını yaparken doğru bir kamış seçimi üzerinde de durulmuştur. Staccato tekniğini doğru çalışmak için doğru bir kamışın dil hızını arttırdığı saptanmıştır. Doğru bir kamış seçimi kamıştan rahat ses çıkmasına bağlıdır. Kamış, dil atma gücünü vermiyorsa daha doğru bir kamış seçiminin yapılması gerekliliği vurgulanmıştır. Staccato çalışmalarında baştan sona bir eseri yorumlamak zor olabilir. Bu açıdan çalışma, verilen müzikal nüanslara uymanın, dil atış performansını rahatlattığını ortaya koymuştur. Gelecek nesillere edinilen bilgi ve birikimlerin aktarılması ve geliştirici olması teşvik edilmiştir. Çıkacak eserlerin nasıl çözüleceği, staccato tekniğinin nasıl üstesinden gelinebileceği anlatılmıştır. Staccato tekniğinin daha kısa sürede çözüme kavuşturulması amaç edinilmiştir. Parmakların yerlerini öğrettiğimiz kadar belleğimize de çalışmaların kaydedilmesi önemlidir. Gösterilen azmin ve sabrın sonucu olarak ortaya çıkan yapıt başarıyı daha da yukarı seviyelere çıkaracaktır

    Boosting the Cycle Counting Power of Graph Neural Networks with I2^2-GNNs

    Full text link
    Message Passing Neural Networks (MPNNs) are a widely used class of Graph Neural Networks (GNNs). The limited representational power of MPNNs inspires the study of provably powerful GNN architectures. However, knowing one model is more powerful than another gives little insight about what functions they can or cannot express. It is still unclear whether these models are able to approximate specific functions such as counting certain graph substructures, which is essential for applications in biology, chemistry and social network analysis. Motivated by this, we propose to study the counting power of Subgraph MPNNs, a recent and popular class of powerful GNN models that extract rooted subgraphs for each node, assign the root node a unique identifier and encode the root node's representation within its rooted subgraph. Specifically, we prove that Subgraph MPNNs fail to count more-than-4-cycles at node level, implying that node representations cannot correctly encode the surrounding substructures like ring systems with more than four atoms. To overcome this limitation, we propose I2^2-GNNs to extend Subgraph MPNNs by assigning different identifiers for the root node and its neighbors in each subgraph. I2^2-GNNs' discriminative power is shown to be strictly stronger than Subgraph MPNNs and partially stronger than the 3-WL test. More importantly, I2^2-GNNs are proven capable of counting all 3, 4, 5 and 6-cycles, covering common substructures like benzene rings in organic chemistry, while still keeping linear complexity. To the best of our knowledge, it is the first linear-time GNN model that can count 6-cycles with theoretical guarantees. We validate its counting power in cycle counting tasks and demonstrate its competitive performance in molecular prediction benchmarks

    Bridging technology and educational psychology: an exploration of individual differences in technology-assisted language learning within an Algerian EFL setting

    Get PDF
    The implementation of technology in language learning and teaching has a great influence onthe teaching and learning process as a whole and its impact on the learners’ psychological state seems of paramount significance, since it could be either an aid or a barrier to students’ academic performance. This thesis therefore explores individual learner differences in technology-assisted language learning (TALL) and when using educational technologies in higher education within an Algerian English as a Foreign Language (EFL) setting. Although I initially intended to investigate the relationship between TALL and certain affective variables mainly motivation, anxiety, self-confidence, and learning styles inside the classroom, the collection and analysis of data shifted my focus to a holistic view of individual learner differences in TALL environments and when using educational technologies within and beyond the classroom. In an attempt to bridge technology and educational psychology, this ethnographic case study considers the nature of the impact of technology integration in language teaching and learning on the psychology of individual language learners inside and outside the classroom. The study considers the reality constructed by participants and reveals multiple and distinctive views about the relationship between the use of educational technologies in higher education and individual learner differences. It took place in a university in the north-west of Algeria and involved 27 main and secondary student and teacher participants. It consisted of focus-group discussions, follow-up discussions, teachers’ interviews, learners’ diaries, observation, and field notes. It was initially conducted within the classroom but gradually expanded to other settings outside the classroom depending on the availability of participants, their actions, and activities. The study indicates that the impact of technology integration in EFL learning on individual learner differences is both complex and dynamic. It is complex in the sense that it is shown in multiple aspects and reflected on the students and their differences. In addition to various positive and different negative influences of different technology uses and the different psychological reactions among students to the same technology scenario, the study reveals the unrecognised different manifestations of similar psychological traits in the same ELT technology scenario. It is also dynamic since it is characterised by constant change according to contextual approaches to and practical realities of technology integration in language teaching and learning in the setting, including discrepancies between students’ attitudes and teacher’ actions, mismatches between technological experiences inside and outside the classroom, local concerns and generalised beliefs about TALL in the context, and the rapid and unplanned shift to online educational delivery during the Covid-19 pandemic situation. The study may therefore be of interest, not only to Algerian teachers and students, but also to academics and institutions in other contexts through considering the complex and dynamic impact of TALL and technology integration at higher education on individual differences, and to academics in similar low-resource contexts by undertaking a context approach to technology integration

    A Decision Support System for Economic Viability and Environmental Impact Assessment of Vertical Farms

    Get PDF
    Vertical farming (VF) is the practice of growing crops or animals using the vertical dimension via multi-tier racks or vertically inclined surfaces. In this thesis, I focus on the emerging industry of plant-specific VF. Vertical plant farming (VPF) is a promising and relatively novel practice that can be conducted in buildings with environmental control and artificial lighting. However, the nascent sector has experienced challenges in economic viability, standardisation, and environmental sustainability. Practitioners and academics call for a comprehensive financial analysis of VPF, but efforts are stifled by a lack of valid and available data. A review of economic estimation and horticultural software identifies a need for a decision support system (DSS) that facilitates risk-empowered business planning for vertical farmers. This thesis proposes an open-source DSS framework to evaluate business sustainability through financial risk and environmental impact assessments. Data from the literature, alongside lessons learned from industry practitioners, would be centralised in the proposed DSS using imprecise data techniques. These techniques have been applied in engineering but are seldom used in financial forecasting. This could benefit complex sectors which only have scarce data to predict business viability. To begin the execution of the DSS framework, VPF practitioners were interviewed using a mixed-methods approach. Learnings from over 19 shuttered and operational VPF projects provide insights into the barriers inhibiting scalability and identifying risks to form a risk taxonomy. Labour was the most commonly reported top challenge. Therefore, research was conducted to explore lean principles to improve productivity. A probabilistic model representing a spectrum of variables and their associated uncertainty was built according to the DSS framework to evaluate the financial risk for VF projects. This enabled flexible computation without precise production or financial data to improve economic estimation accuracy. The model assessed two VPF cases (one in the UK and another in Japan), demonstrating the first risk and uncertainty quantification of VPF business models in the literature. The results highlighted measures to improve economic viability and the viability of the UK and Japan case. The environmental impact assessment model was developed, allowing VPF operators to evaluate their carbon footprint compared to traditional agriculture using life-cycle assessment. I explore strategies for net-zero carbon production through sensitivity analysis. Renewable energies, especially solar, geothermal, and tidal power, show promise for reducing the carbon emissions of indoor VPF. Results show that renewably-powered VPF can reduce carbon emissions compared to field-based agriculture when considering the land-use change. The drivers for DSS adoption have been researched, showing a pathway of compliance and design thinking to overcome the ‘problem of implementation’ and enable commercialisation. Further work is suggested to standardise VF equipment, collect benchmarking data, and characterise risks. This work will reduce risk and uncertainty and accelerate the sector’s emergence
    corecore