3,990 research outputs found

    Generalized Low-Rank Update: Model Parameter Bounds for Low-Rank Training Data Modifications

    In this study, we have developed an incremental machine learning (ML) method that efficiently obtains the optimal model when a small number of instances or features are added or removed. This problem holds practical importance in model selection, such as cross-validation (CV) and feature selection. Among the class of ML methods known as linear estimators, there exists an efficient model update framework called the low-rank update that can effectively handle changes in a small number of rows and columns within the data matrix. However, for ML methods beyond linear estimators, there is currently no comprehensive framework available to obtain knowledge about the updated solution within a specific computational complexity. In light of this, our study introduces a method called the Generalized Low-Rank Update (GLRU) which extends the low-rank update framework of linear estimators to ML methods formulated as a certain class of regularized empirical risk minimization, including commonly used methods such as SVM and logistic regression. The proposed GLRU method not only expands the range of its applicability but also provides information about the updated solutions with a computational complexity proportional to the amount of dataset changes. To demonstrate the effectiveness of the GLRU method, we conduct experiments showcasing its efficiency in performing cross-validation and feature selection compared to other baseline methods
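
    The classical low-rank update for linear estimators mentioned in this abstract can be illustrated with a rank-one (Sherman-Morrison) downdate of a ridge-regression solution when one training row is removed, as in leave-one-out CV. This is a minimal sketch of the linear-estimator case only, not the GLRU method itself; the data, the two-feature restriction, and all function names are hypothetical choices made for illustration.

```python
# Sketch: rank-one downdate of ridge regression after removing one row.
# Pure Python, two features; all names and data are illustrative assumptions.

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inv2(M):
    """Invert a 2x2 matrix directly."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def gram(X, lam):
    """Return A = X^T X + lam * I for a data matrix with two columns."""
    A = [[lam, 0.0], [0.0, lam]]
    for x in X:
        for i in range(2):
            for j in range(2):
                A[i][j] += x[i] * x[j]
    return A

def xty(X, y):
    """Return the vector X^T y."""
    return [sum(x[i] * t for x, t in zip(X, y)) for i in range(2)]

def downdate_inverse(A_inv, x):
    """Sherman-Morrison: (A - x x^T)^{-1} = A^{-1} + u u^T / (1 - x^T u),
    where u = A^{-1} x and A is symmetric."""
    u = mat_vec(A_inv, x)
    denom = 1.0 - sum(xi * ui for xi, ui in zip(x, u))
    return [[A_inv[i][j] + u[i] * u[j] / denom for j in range(2)]
            for i in range(2)]

# Hypothetical toy data: four instances, two features.
X = [[1.0, 2.0], [2.0, 0.5], [3.0, 1.0], [0.5, 4.0]]
y = [1.0, 2.0, 3.0, 4.0]
lam = 0.1

A_inv = inv2(gram(X, lam))   # computed once on the full data
rhs = xty(X, y)

k = 2                        # index of the held-out instance
x_k, y_k = X[k], y[k]
A_inv_minus = downdate_inverse(A_inv, x_k)
rhs_minus = [r - xi * y_k for r, xi in zip(rhs, x_k)]
w_updated = mat_vec(A_inv_minus, rhs_minus)

# Reference: refit from scratch without row k.
X_rest, y_rest = X[:k] + X[k + 1:], y[:k] + y[k + 1:]
w_refit = mat_vec(inv2(gram(X_rest, lam)), xty(X_rest, y_rest))
max_gap = max(abs(u - r) for u, r in zip(w_updated, w_refit))
```

    The updated weights agree with a full refit up to floating-point error, while the update itself costs a number of operations that depends on the number of features rather than on the number of training rows.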

    Automatic recognition of gait patterns in human motor disorders using machine learning: A review

    Background: automatic recognition of human movement is an effective strategy to assess abnormal gait patterns. Machine learning approaches are mainly applied due to their ability to work with multidimensional nonlinear features. Purpose: to compare several machine learning algorithms employed for gait pattern recognition in motor disorders using discriminant features extracted from gait dynamics. Additionally, this work highlights procedures that improve gait recognition performance. Methods: we conducted an electronic literature search on Web of Science, IEEE, and Scopus, using "human recognition", "gait patterns", and "feature selection methods" as relevant keywords. Results: analysis of the literature showed that kernel principal component analysis and genetic algorithms are efficient at reducing feature dimensionality due to their ability to process nonlinear data and converge to a global optimum. Comparative analysis of machine learning performance showed that support vector machines (SVMs) exhibited higher accuracy and proper generalization for new instances. Conclusions: automatic recognition that combines dimensionality reduction, cross-validation and normalization techniques with SVMs may offer an objective and rapid tool for investigating the subject's clinical status. Future directions comprise the real-time application of these tools to drive powered assistive devices in free-living conditions. This work was supported by the FCT - Fundação para a Ciência e Tecnologia - with the reference scholarship SFRH/BD/108309/2015 and the reference project UID/EEA/04436/2013, and by FEDER funds through the COMPETE 2020 - Programa Operacional Competitividade e Internacionalização (POCI) - with the reference project POCI-01-0145-FEDER-006941. This work was also partially supported by grant RYC-2014-16613 from the Spanish Ministry of Economy and Competitiveness.

    Geometric data understanding: deriving case-specific features

    There exists a tradition of precise geometric modeling, in which uncertainties in data are treated as noise. Another tradition relies on the statistical nature of vast quantities of data, where geometric regularity is intrinsic to the data and statistical models usually grasp it only indirectly. This work focuses on point cloud data of natural resources and silhouette recognition from video input as two real-world examples of problems whose geometric content is intangible in the raw data representation. This content could be discovered and modeled to some degree by machine learning (ML) approaches such as deep learning, but either direct coverage of the geometry in the samples or the addition of a special geometry-invariant layer is necessary. Geometric content is central when there is a need for direct observations of spatial variables, or when one needs a mapping to a geometrically consistent data representation in which, e.g., outliers or noise can be easily discerned. In this thesis we consider the transformation of original input data to a geometric feature space in two example problems. The first example is the curvature of surfaces, which has met renewed interest since the introduction of ubiquitous point cloud data and the maturation of discrete differential geometry. Curvature spectra can characterize a spatial sample rather well and provide useful features for ML purposes. The second example involves projective methods applied to stereo video signal analysis in swimming analytics. The aim is to find meaningful local geometric representations for feature generation, which also facilitate additional analysis based on geometric understanding of the model. First, the features are associated directly with some geometric quantity, which makes it easier to express the geometric constraints in a natural way, as shown in the thesis. Second, visualization and further feature generation become much easier. Third, the approach provides sound baseline methods for comparison with more traditional ML approaches, e.g. neural network methods. Fourth, most ML methods can utilize the geometric features presented in this work as additional features.

    Fear Classification using Affective Computing with Physiological Information and Smart-Wearables

    International Mention in the doctoral degree. Among the 17 Sustainable Development Goals (SDGs) proposed within the 2030 Agenda and adopted by all of the United Nations member states, the fifth SDG is a call for action to effectively turn gender equality into a fundamental human right and an essential foundation for a better world. It includes the eradication of all types of violence against women. From the technological perspective, the range of available solutions intended to prevent this social problem is very limited. Moreover, most of the solutions are based on a panic button approach, leaving aside the usage and integration of current state-of-the-art technologies, such as the Internet of Things (IoT), affective computing, cyber-physical systems, and smart sensors. Thus, the main purpose of this research is to provide new insight into the design and development of tools to prevent and combat risky situations, and even aggressions, related to Gender-based Violence from a technological perspective, without leaving aside the different sociological considerations directly related to the problem. To achieve this objective, we rely on the application of affective computing from a realistic point of view, i.e. targeting the generation of systems and tools capable of being implemented and used nowadays or within an achievable time-frame. This pragmatic vision is channelled through: 1) an exhaustive study of the existing technological tools and mechanisms oriented to fighting Gender-based Violence, 2) the proposal of a new smart-wearable system intended to deal with some of the technological limitations currently encountered, 3) a novel fear-related emotion classification approach to disentangle the relation between emotions and physiology, and 4) the definition and release of a new multi-modal dataset for emotion recognition in women. Firstly, different fear classification systems using a reduced set of physiological signals are explored and designed. 
This is done by employing open datasets together with a combination of time, frequency and non-linear domain techniques. This design process involves trade-offs between physiological considerations and embedded capabilities. The latter are of paramount importance due to the edge-computing focus of this research. Two results are highlighted in this first task: the fear classification system designed with the DEAP dataset, which achieved an AUC of 81.60% and a Gmean of 81.55% on average for a subject-independent approach using only two physiological signals; and the fear classification system designed with the MAHNOB dataset, which achieved an AUC of 86.00% and a Gmean of 73.78% on average for a subject-independent approach using only three physiological signals and a Leave-One-Subject-Out configuration. A detailed comparison with other emotion recognition systems proposed in the literature is presented, showing that the obtained metrics are in line with the state of the art. Secondly, Bindi is presented. This is an end-to-end autonomous multimodal system leveraging affective IoT through auditory and physiological commercial off-the-shelf smart sensors, hierarchical multisensorial fusion, and a secured server architecture to combat Gender-based Violence by automatically detecting risky situations with a multimodal intelligence engine and then triggering a protection protocol. Specifically, this research focuses on the hardware and software design of one of the two edge-computing devices within Bindi. This is a bracelet integrating three physiological sensors, actuators, power monitoring integrated chips, and a System-on-Chip with wireless capabilities. Within this context, different embedded design space explorations are presented: embedded filtering evaluation, online physiological signal quality assessment, feature extraction, and power consumption analysis. 
The results reported for all these processes are successfully validated and, for some of them, even compared against standard physiological measurement equipment. Among the results obtained for the embedded design and implementation within the Bindi bracelet, it should be highlighted that its low power consumption yields a battery life of approximately 40 hours when using a 500 mAh battery. Finally, the particularities of our use case and the scarcity of open multimodal datasets dealing with emotional immersive technology, a labelling methodology that considers the gender perspective, a balanced stimuli distribution regarding the target emotions, and recovery processes based on the physiological signals of the volunteers to quantify and isolate the emotional activation between stimuli led us to the definition and elaboration of the Women and Emotion Multi-modal Affective Computing (WEMAC) dataset. This is a multimodal dataset in which 104 women who had never experienced Gender-based Violence performed different emotion-related stimuli visualisations in a laboratory environment. The previous binary fear classification systems were improved and applied to this novel multimodal dataset. For instance, the proposed multimodal fear recognition system using this dataset reports up to 60.20% and 67.59% for ACC and F1-score, respectively. These values represent a competitive result in comparison with state-of-the-art approaches that deal with similar multi-modal use cases. In general, this PhD thesis has opened a new research line within the research group under which it has been developed. 
Moreover, this work has established a solid base from which to expand knowledge and continue research targeting the generation of both mechanisms to help vulnerable groups and socially oriented technology. Doctoral Programme in Electrical, Electronic and Automation Engineering, Universidad Carlos III de Madrid. President: David Atienza Alonso. Secretary: Susana Patón Álvarez. Member: Eduardo de la Torre Arnan.
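
    The abstract above reports Gmean values without defining the metric. Assuming the common definition for imbalanced binary classification, the geometric mean of sensitivity and specificity, it can be computed from a confusion matrix as sketched below. The function name and the counts in the example are hypothetical and are not taken from the thesis.

```python
import math

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity (assumed definition).

    tp, fn: true positives and false negatives for the positive (fear) class
    tn, fp: true negatives and false positives for the negative class
    """
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    return math.sqrt(sensitivity * specificity)

# Hypothetical confusion-matrix counts, for illustration only.
example_score = g_mean(tp=80, fn=20, tn=90, fp=10)
```

    Because the two recalls are multiplied, a classifier that ignores one class scores zero regardless of its accuracy on the other, which is one reason this metric is often reported alongside AUC.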

    Compression under pressure: physiological and methodological factors influencing the effect of compression garments on running economy

    Evidence for the effects of compression garments on sports performance and physiological responses to dynamic exercise remains equivocal. Contradictory findings within the sporting literature are confounded by methodological heterogeneity in terms of intensity and modality of exercise, type of garment worn, and the interface pressure produced by the garment. The interface pressure applied by compression clothing is an important measure in evaluating the bio-physical impact of compression. Interface pressure values obtained in vivo with two portable pressure devices (PicoPress and Kikuhime) were compared against a reference standard (HOSY). The PicoPress satisfied the a priori thresholds for acceptable validity at the posterior and lateral orientations with calf stockings and tights, supporting its future use to assess interface pressure. A small, likely beneficial improvement in running economy was observed with correctly fitted (95%:5%:0%; η² = 0.55) but not oversized compression tights, indicating that a certain level of interface pressure is required. Compression tights improved running economy only at higher relative exercise intensities (77.7 - 91.5% V̇O2max). The absence of any improvement at lower intensities (67.1 - 77.6% V̇O2max) suggests that changes in running economy from compression are dependent on relative exercise intensity when V̇O2max (%) is used as an anchor of exercise intensity. Comparing measures from two portable, wireless near-infrared spectroscopy (NIRS) devices (PortaMon and MOXY), we found that the low-cost and light-weight MOXY device gave tissue oxygen saturation values at rest and during exercise that were physiologically credible and suitable for future research. Compression tights did affect ground contact time but not tissue oxygen saturation, cardiovascular or other kinematic parameters during running at intensities equivalent to long-distance race speed. 
Compression tights can produce small improvements in running economy, but effects are restricted to higher intensity exercise and appear dependent on garment interface pressure. It remains unlikely that this small positive effect on running economy, in very specific conditions, is enough to result in a meaningful impact on running performance.

    Heuristic ensembles of filters for accurate and reliable feature selection

    Feature selection has become increasingly important in data mining in recent years. However, the accuracy and stability of feature selection methods vary considerably when used individually, and no rule yet exists to indicate which one should be used for a particular dataset. Thus, an ensemble method that combines the outputs of several individual feature selection methods appears to be a promising approach to address the issue and hence is investigated in this research. This research aims to develop an effective ensemble that can improve the accuracy and stability of feature selection. We propose a novel heuristic ensemble of filters (HEF). It combines two types of filters, subset filters and ranking filters, with a heuristic consensus algorithm in order to utilise the strength of each type. The ensemble is tested on ten benchmark datasets and its performance is evaluated by two stability measures and three classifiers. The experimental results demonstrate that HEF improves the stability and accuracy of the selected features and in most cases outperforms the other ensemble algorithms, individual filters and the full feature set. The research on the HEF algorithm is extended in several dimensions, including more filter members, three novel schemes of mean rank aggregation with partial lists, and three novel schemes for a weighted heuristic ensemble of filters. However, the experimental results demonstrate that adding weights to filters in HEF does not achieve the expected improvement in accuracy, but increases time and space complexity and clearly decreases stability. Therefore, the core ensemble algorithm (HEF) is demonstrated to be not just simpler but also more reliable and consistent than the later, more complicated weighted ensembles. In addition, we investigated how to use data in feature selection, using ALL or PART of it. Systematic experiments with thirty-five synthetic and benchmark real-world datasets were carried out.
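
    The idea of combining several ranking filters, mentioned in this abstract, can be illustrated with plain mean rank aggregation over complete rankings. This is a generic sketch and not the HEF consensus algorithm or its partial-list schemes, which the abstract does not specify; the filter outputs, feature names, and function name are hypothetical.

```python
def mean_rank_aggregate(rankings):
    """Combine several complete feature rankings by their mean rank.

    rankings: list of lists; each inner list orders the same feature
              names from best (position 1) to worst.
    Returns (consensus_order, mean_rank_by_feature).
    """
    features = rankings[0]
    totals = {f: 0.0 for f in features}
    for ranking in rankings:
        for position, feature in enumerate(ranking, start=1):
            totals[feature] += position
    mean_rank = {f: totals[f] / len(rankings) for f in features}
    # Lower mean rank is better; ties are broken by feature name.
    consensus = sorted(features, key=lambda f: (mean_rank[f], f))
    return consensus, mean_rank

# Hypothetical outputs of three ranking filters over four features.
filter_a = ["f2", "f1", "f3", "f4"]
filter_b = ["f1", "f2", "f4", "f3"]
filter_c = ["f2", "f3", "f1", "f4"]

consensus, mean_rank = mean_rank_aggregate([filter_a, filter_b, filter_c])
top_two = consensus[:2]
```

    Averaging ranks dampens the influence of any single filter, which is one plausible route to the stability gains that ensembles of filters aim for.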

    19th SC@RUG 2022 proceedings 2021-2022
